Don't forget "Intriguing properties of neural networks", otherwise known as "Does Deep Learning have deep flaws?".
In sum, you can teach a network to say "that's a dog" when presented with a picture of a dog, but you'll also be able to (1) find an imperceptibly modified version of the image where it'll say "that's garbage" and (2) intelligently generate an image of noise that also gets recognised as a dog.
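Case (1) can be sketched with a toy model. This is just an illustrative FGSM-style perturbation on a made-up linear "classifier" (the weights, image, and `eps` are all hypothetical stand-ins, not anything from the papers): because each pixel only moves by a tiny amount in the worst-case direction, the change is imperceptible, yet the score flips.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "classifier" standing in for a trained network:
# score > 0 -> "dog", score <= 0 -> "not dog". Weights are hypothetical.
w = rng.normal(size=784)

x = rng.uniform(size=784)   # a "dog" image, pixels in [0, 1]
b = 1.0 - w @ x             # bias chosen so x scores exactly +1.0 ("dog")

def label(img):
    return "dog" if w @ img + b > 0 else "not dog"

# FGSM-style attack: nudge every pixel by eps against the gradient.
# For a linear model, the gradient of the score w.r.t. the image is just w,
# so the worst-case imperceptible step is -eps * sign(w) on each pixel.
eps = 0.01  # 1% of the pixel range per pixel
x_adv = np.clip(x - eps * np.sign(w), 0.0, 1.0)

print(label(x))      # "dog"
print(label(x_adv))  # "not dog"
```

The point is that the per-pixel change is bounded by `eps`, but the effect on the score adds up across all 784 dimensions, which is roughly why high-dimensional inputs make this so easy.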
In (2)'s paper [2], the images are really neat. Staring at them for a few moments, I feel like I can see where the computer is coming from. A bit.
The robin, armadillo, centipede, peacock, and bubble all actually have a little swirl of features that - to me at least - resemble the labels provided. But only from afar, and you've basically got to damp the noise. I did this by taking my glasses off and leaning about 8 inches from the screen (with the grid about 3 inches wide). I've got about -7.5 diopters of near-sightedness, so that cleaned the images right up. The armadillo, at least, I would have guessed: it's absolutely a little critter walking to the lower right, with the semi-circular body and long face. And it might make sense: it was told to make an armadillo, so it did. And nothing else.
(I also think the baseball was super clever - if this were abstract art, I would totally have fallen for that classification, as well as a few of the others. There's something cool going on there.)
I'm printing and reading the rest of the paper now (about a quarter done). To call it fascinating really understates it. Really neat stuff!
You're going to get most of these from a simple Google search. If you're going to build a list of what to read, you should at least put some effort into it. Currently, this list is missing a lot of the history behind deep learning - only 3 papers listed!
If you want a good set of papers that runs from perceptrons and Hebbian learning to multi-layered neural nets and the emergence of what we now refer to as deep networks, check out http://deeplearning.cs.cmu.edu/
I would add "Practical recommendations for gradient-based training of deep architectures" to the list for those who already have a feel for training multi-layer neural nets. It provides a good overview for those who want to learn more about gradient descent, hyperparameter tuning, and other practical considerations involved in training deep architectures.
Automatic Speech Recognition: A Deep Learning Approach contains an excellent section on deep learning, as well as more content about ASR and hybrid deep learning methods.
(1) http://arxiv.org/abs/1312.6199 (2) http://www.newscientist.com/article/dn26691-optical-illusion...