OK, now I want to "hear" one of these trippy neural network things using a speech/language recognition neural network! Or perhaps there's a music-based one? I'd love to hear a computer's auditory hallucinations!!!
That YouTube one is a good approximation of what I hear if I leave the radio on in the background playing music but I'm not paying attention to it and am focused on other stuff.
So if it were possible to read out a single neuron in the brain and optimize the input against its signal, would it be similarly possible to extract what triggers that neuron from such a setup?
There are definitely selective responses in single-unit recordings, even well beyond the sensory periphery in "higher" areas; look for articles on the "Jennifer Aniston cell" and on so-called mirror neurons. The question is, how come you can fool a DNN into classifying noise as something with high probability (as they explain in the article), but you can't fool the brain?
Have you ever mistaken "color TV static" for a king penguin? If not, then your built-in DNN does a good job of discriminating between them. There are optical illusions that mess with our visual system, of course. You could, I guess, do something like raise a brain in an environment whose statistics differ from the natural world and then ask how that affects discrimination. Is that what you're getting at with "a gradient ascent method"? Because AFAIK we also don't have any proof the brain uses a gradient ascent algorithm, so I'm not sure why you'd ask an in silico brain to carry one out.
Gradient ascent is what you use to find the image that tricks the DNN. If you could run repeated experiments on a brain in exactly the same state over and over, then you could perform gradient ascent on a brain as well. Whether the result of that hypothetical would be static that tricks the brain is unknown, but I don't see any reason to assume one way or the other. An easier experiment to help the discussion would be to calculate the probability that a random image of static fools a DNN, rather than a specially designed image that merely looks like noise. If that probability is not vanishingly small, then there is indeed something fundamentally different at a functional level between brains and DNNs. If not, then we have to work harder to answer the question.
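For what it's worth, both halves of that are cheap to try on a pretrained classifier. Below is a rough sketch (mine, not the article's code), assuming PyTorch/torchvision with a pretrained ResNet-18; the 0.99 confidence threshold, the trial and step counts, and the "king penguin" class index are arbitrary choices, and proper ImageNet input normalization is skipped for brevity:

```python
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
for p in model.parameters():
    p.requires_grad_(False)

# Part 1: Monte Carlo estimate of how often *random* static is classified
# with high confidence.
n_trials, threshold, fooled = 1000, 0.99, 0
with torch.no_grad():
    for _ in range(n_trials):
        noise = torch.rand(1, 3, 224, 224)          # uniform "TV static"
        probs = F.softmax(model(noise), dim=1)
        if probs.max().item() > threshold:
            fooled += 1
print(f"random static crossed the threshold {fooled}/{n_trials} times")

# Part 2: gradient ascent on the input to *construct* a noise image that
# the network scores highly for a chosen class.
target_class = 145                                   # assumed ImageNet index for "king penguin"
x = torch.rand(1, 3, 224, 224, requires_grad=True)
opt = torch.optim.Adam([x], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    loss = -model(x)[0, target_class]                # ascend the class score
    loss.backward()
    opt.step()
    x.data.clamp_(0, 1)                              # keep pixels in a valid range
print("optimized-noise confidence:",
      F.softmax(model(x), dim=1)[0, target_class].item())
```

My guess is part 1 comes out near zero while part 2 reaches arbitrarily high confidence, which would put designed adversarial static and random static in very different categories.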
Yes, aka the "grandmother cell" hypothesis. I think those studies are interesting, but pretty limited given the number of neurons in a brain and how many you can (ethically) measure...
There are some non-invasive things like fMRI, fNIRS, and EEG that can measure activity to some extent, but those are all at far coarser resolution than single units.
Wonder what happens if you use a regularization constraint in Fourier space, optimizing the 2D FT of the image, with the penalty increasing with frequency...
When you penalise the L2 norm of the convolution of the image with a filter (a gradient or edge detector, for example), you are effectively doing this. The spectrum of the filter determines how much different frequency components are penalised.
I think (although they're a little handwavey about it) that their "Gaussian blur" prior must be of this form. They certainly talk about it penalising high frequency components.
The total variation method they mention is a generalisation of this too.
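To make that concrete, here's a quick numerical check (my own sketch, nothing from the paper), using only NumPy: the L2 penalty on an image circularly convolved with a simple gradient filter is identical to a frequency-weighted L2 penalty on the image's 2D FT, with the weight given by the filter's power spectrum |K(f)|^2:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 32
x = rng.standard_normal((N, N))                    # toy "image"

# Spatial domain: circular convolution with the filter k = [1, -1]
# (a horizontal finite difference), then the L2 penalty.
conv = x - np.roll(x, 1, axis=1)
penalty_spatial = np.sum(conv ** 2)

# Frequency domain: weight each Fourier coefficient of x by |K(f)|^2.
k = np.zeros((N, N))
k[0, 0], k[0, 1] = 1.0, -1.0
X, K = np.fft.fft2(x), np.fft.fft2(k)
penalty_freq = np.sum(np.abs(K) ** 2 * np.abs(X) ** 2) / (N * N)   # Parseval

print(penalty_spatial, penalty_freq)               # agree to machine precision
```

For this filter the weight works out to |K(f)|^2 = 2 - 2*cos(2*pi*f/N), which grows with horizontal frequency up to Nyquist, i.e. exactly the "penalty increasing with frequency" idea suggested above.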
Speculatively (pun intended): we see in saccades (rapid eye movements) rather than in a static analysis of scenery, so we're bound to understand the frequency of, say, edges over the fixed-period saccade rhythm better than we understand distances over a plain 2D-field representation. This is why we're often surprised by perspective in photography (well, also because of stereo vision): as we move our eyes, the relative positions of lines at different distances shift imperceptibly, giving us a cue to 3D space.
When I was younger I took some drawing lessons because I hoped to become an architect, and the first thing we learned was precisely how to undo this instinct and see the world as a flat thing -- this is why artists are stereotypically pictured extending their arm and looking at their brush with one eye: they're using it to measure the distances between points in their visual field as a static field, as contrasted with the dynamic field that can't be put on paper.
Is this implemented in any of the deep learning libraries? It looks very useful for debugging, but it also looks like a lot of work to implement, so it would make sense to have it directly integrated into the libraries.
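I'm not aware of it being a built-in one-liner in the major frameworks, but with autodiff it really is only a handful of lines by hand. A hedged sketch (assuming PyTorch/torchvision; the VGG layer index and channel number are arbitrary placeholders, and the blur/TV prior from the article is left out) that does activation maximization for one channel of an intermediate layer via a forward hook:

```python
import torch
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).eval()
for p in model.parameters():
    p.requires_grad_(False)

# Capture the output of one intermediate layer with a forward hook.
activation = {}
def hook(module, inputs, output):
    activation["feat"] = output

handle = model.features[17].register_forward_hook(hook)   # some mid-level conv layer

channel = 42                                               # arbitrary unit to visualize
x = torch.rand(1, 3, 224, 224, requires_grad=True)
opt = torch.optim.Adam([x], lr=0.05)

for _ in range(200):
    opt.zero_grad()
    model(x)                                               # fills activation["feat"]
    loss = -activation["feat"][0, channel].mean()          # ascend the channel's mean activation
    loss.backward()
    opt.step()
    x.data.clamp_(0, 1)

handle.remove()
# x is now an image that strongly drives that unit; adding something like the
# Gaussian-blur or total-variation prior discussed above at each step should
# give cleaner, less noisy visualizations.
```

So the optimization loop itself is trivial; presumably most of the work is in good regularizers and in the tooling around browsing many units.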
It's like you're seeing into a machine's imagination. Look at the 8th-layer images of the pitcher, or the gorillas with improved prior, for instance. They're very close to layout sketches an artist might use to block out a painting or photograph before beginning the work.
Unreal. Strong AI is not as far off as we think. 15 years. Maybe 20.