http://deeplearning.net/software/theano/ is a good place to start. It's open source, in Python, and has a few tutorials that lead you toward some rather state-of-the-art methods.
There's no secret sauce for how to choose the number of layers and the activation functions (and anyone who tells you otherwise is lying). It's all application-dependent, and the best way to do it is by cross-validation.
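To make that concrete, here's a minimal sketch of cross-validating over depth and activation, using scikit-learn's MLPClassifier instead of a Theano model just to keep it short; the layer sizes and grid values are illustrative, not recommendations:

    from sklearn.datasets import load_digits
    from sklearn.model_selection import GridSearchCV
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)

    # Treat depth and activation as just more hyper-parameters and
    # let 5-fold cross-validation pick the winner.
    grid = {
        "hidden_layer_sizes": [(64,), (64, 64), (64, 64, 64)],
        "activation": ["logistic", "tanh", "relu"],
    }
    search = GridSearchCV(MLPClassifier(max_iter=500), grid, cv=5)
    search.fit(X, y)
    print(search.best_params_, search.best_score_)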
I can answer questions about this topic in conjunction with mr. bravura (I'm working with the code referenced in that paper by Dean et al. here at Google, and did grad school at the same place bravura did his postdoc).
I'm a xoogler, and stuff like this makes me wish I were back there :) I want to play a bit with this stuff and have some fun, so thanks for the Theano reference, seems cool!
WRT the DNN parameters, is it possible to try all the possibilities (within reason) and find the best one using only cross-validation, or are there just too many choices, so you have to use intuition? (From your comment I can't tell whether cross-validation is enough to find the optimal number of layers etc. or whether you have to be "smart".)
For things like the number of layers, the total number of options is relatively small -- usually people try between 1 and 6-7. For most of the other parameters you have to be smarter than that, especially since a lot of them are real-valued, so you can't really explore them all.
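When a full grid is out of the question, one common fallback is to sample random configurations under a fixed evaluation budget. A sketch along those lines (again with scikit-learn for brevity; the distributions and ranges are illustrative assumptions):

    from scipy.stats import loguniform
    from sklearn.datasets import load_digits
    from sklearn.model_selection import RandomizedSearchCV
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)

    # Real-valued parameters get sampled from distributions; discrete
    # ones (like depth) are listed. n_iter caps the total budget.
    dists = {
        "learning_rate_init": loguniform(1e-4, 1e-1),
        "alpha": loguniform(1e-6, 1e-1),  # L2 penalty
        "hidden_layer_sizes": [(64,) * d for d in range(1, 8)],  # 1-7 layers
    }
    search = RandomizedSearchCV(MLPClassifier(max_iter=300), dists,
                                n_iter=30, cv=3, random_state=0)
    search.fit(X, y)
    print(search.best_params_)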
One of the trends these days is automatic hyper-parameter tuning, especially for cases where a full exploration of hyper-parameters via grid search would mean a combinatorial explosion of possibilities (for neural networks you can conceivably explore dozens of hyper-parameters). A friend of mine just got a paper published at NIPS (same conference) on using Bayesian optimization/Gaussian processes to optimize the hyper-parameters of a model -- http://www.dmi.usherb.ca/~larocheh/publications/gpopt_nips.p.... They get better-than-state-of-the-art results on a couple of benchmarks, which is neat.
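The core idea, very roughly: fit a Gaussian process to the (hyper-parameter, score) pairs you've already evaluated, then pick the next point to try by maximizing expected improvement. A minimal one-dimensional sketch, assuming scikit-learn and scipy are available; train_and_score is a hypothetical stand-in for your real training-plus-validation run:

    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern

    def train_and_score(log_lr):
        # Stand-in objective; replace with actual training + validation.
        return -(log_lr + 3.0) ** 2  # pretend log10(lr) = -3 is optimal

    bounds = (-6.0, 0.0)                    # search over log10(learning rate)
    X = np.random.uniform(*bounds, size=3)  # a few random points to start
    y = np.array([train_and_score(x) for x in X])

    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(20):
        gp.fit(X.reshape(-1, 1), y)
        # Expected improvement over a dense grid of candidate points.
        cand = np.linspace(*bounds, 500).reshape(-1, 1)
        mu, sigma = gp.predict(cand, return_std=True)
        z = (mu - y.max()) / np.maximum(sigma, 1e-9)
        ei = (mu - y.max()) * norm.cdf(z) + sigma * norm.pdf(z)
        x_next = cand[np.argmax(ei), 0]     # most promising candidate
        X = np.append(X, x_next)
        y = np.append(y, train_and_score(x_next))

    print("best log10(lr) found:", X[np.argmax(y)])

In practice this runs over many hyper-parameters at once, which is exactly where it beats grid search.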
Btw, Geoff Hinton is teaching an introductory neural nets class on Coursera these days -- you should check it out, he's a great teacher. Also, you can always come back to Google, we're doing cool stuff with this :)
Ask me again in two weeks when my online course in advanced machine learning for extreme beginners kicks off. We won't start with deep learning immediately, but it's in the pipeline for not long after launch.
It'll be on a new site for teaching complex things in non-complicated ways. The goal is to allow everyone from clever middle school students through retired people to understand the coming changes to the world. There's a huge on-site community focus too. We don't want there to be 100,000 anonymous people just going through the motions. There will be plenty of interaction between course material and community feedback. It's kinda awesome.
Topics will be presented in multiple ways (simple and intermediate) so you can have plenty of different views on the same material. The material works both as a zero-knowledge intro to the topics and as a quick refresher if you haven't seen the material in a while (quick -- what's an eigenvector?!).
The launch courses will be 1.) real-world applications of probability and statistics (signal extractions), 2.) linear algebra for computer science, and 3.) wildcard (a random assortment of whatever the heck we think is important or entertaining to know). Future courses are: introduction to neural networks, introduction to computational neuroscience, introduction to deep learning, advanced deep learning, how to take over the world with a few dozen GPUs, avenues by which google will become irrelevant, and robotics for fun and evil.
This is phase zero of a four-phase plan. I'll get some pre-launch material together to shove down HN shortly, then it'll launch a few days later. Hopefully you'll hear about the project again.
Do you know of any tutorial that might guide a beginner using DNNs? I have no idea how to choose the number of hidden layers or the activation functions.
Thanks!