
http://deeplearning.net/software/theano/ is a good place to start. It's open source, written in Python, and has a few tutorials that lead you towards some rather state-of-the-art methods.
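If you want a quick feel for what working with it looks like, here's a tiny logistic-regression sketch along the lines of the Theano tutorials (the data and dimensions are made up; it's just enough to show symbolic variables, T.grad and a compiled training function):

    import numpy as np
    import theano
    import theano.tensor as T

    rng = np.random.RandomState(0)
    X_train = rng.randn(400, 784).astype(theano.config.floatX)   # fake data
    y_train = rng.randint(0, 2, 400).astype(theano.config.floatX)

    x = T.matrix('x')
    y = T.vector('y')
    w = theano.shared(np.zeros(784, dtype=theano.config.floatX), name='w')
    b = theano.shared(np.asarray(0., dtype=theano.config.floatX), name='b')

    p = T.nnet.sigmoid(T.dot(x, w) + b)                     # P(y = 1 | x)
    loss = -T.mean(y * T.log(p) + (1 - y) * T.log(1 - p))   # cross-entropy
    gw, gb = T.grad(loss, [w, b])                            # symbolic gradients

    train = theano.function(inputs=[x, y], outputs=loss,
                            updates=[(w, w - 0.1 * gw), (b, b - 0.1 * gb)])

    for epoch in range(100):
        print(train(X_train, y_train))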

There's no secret sauce for how to choose the number of layers and the activation functions (and anyone that tells you otherwise is lying). It's all application-dependent and the best way to do it is by cross-validation.
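To make that concrete, here's a rough sketch of what "pick the depth and activation by cross-validation" can look like. scikit-learn's MLPClassifier is just a convenient stand-in here (nothing to do with Theano or the paper), and the ranges are made up:

    from sklearn.datasets import load_digits
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)

    best = None
    for n_layers in (1, 2, 3):
        for activation in ("logistic", "tanh", "relu"):
            clf = MLPClassifier(hidden_layer_sizes=(64,) * n_layers,
                                activation=activation, max_iter=300)
            score = cross_val_score(clf, X, y, cv=5).mean()   # 5-fold CV accuracy
            if best is None or score > best[0]:
                best = (score, n_layers, activation)

    print("best (score, n_layers, activation):", best)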

Can answer questions about this topic in conjunction with mr. bravura (I'm working with the code referenced in that paper by Dean et al., here at Google, and did grad school at the same place as bravura did his postdoc).




I'm a xoogler, stuff like this makes me wish I was back there :) I want to play a bit with this stuff and have some fun, so thanks for the Theano reference, seems cool!

WRT the DNN parameters, is it possible to try all the possibilities (within reason) and find the best one using only cross-validation, or are there just too many choices and you have to use intuition? (From your comment I can't tell whether cross-validation is enough to find the optimal number of layers etc., or whether you have to be "smart".)

Thanks for replying!


For things like the number of layers, the total number of options is relatively small -- usually people try somewhere between 1 and 6 or 7. For most of the other parameters you have to be smarter than that, especially since a lot of them are real-valued, so you can't really explore them all.
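For the real-valued knobs, a common "smarter" baseline is plain random search: sample a fixed budget of configurations from sensible ranges instead of trying a full grid. A toy sketch (the ranges and the dummy scoring function are made up, not something from the thread):

    import numpy as np

    rng = np.random.RandomState(0)

    def cv_error(cfg):
        # Hypothetical stand-in: in practice, train a net with cfg and return
        # its cross-validated error.
        return rng.rand()

    budget = 30
    configs = [{
        "n_layers": rng.randint(1, 8),               # small discrete range: 1..7
        "learning_rate": 10 ** rng.uniform(-5, -1),  # log-uniform over [1e-5, 1e-1]
        "l2": 10 ** rng.uniform(-6, -2),
    } for _ in range(budget)]

    best = min(configs, key=cv_error)
    print(best)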

One of the trends these days is to perform automatic hyper-parameter tuning, especially for cases where a full exploration of hyper-parameters via grid search would mean a combinatorial explosion of possibilities (and for neural networks you can conceivably explore dozens of hyper-parameters). A friend of mine just got a paper published at NIPS (same conference) on using Bayesian optimization/Gaussian processes for optimizing the hyper-parameters of a model -- http://www.dmi.usherb.ca/~larocheh/publications/gpopt_nips.p.... They get better-than-state-of-the-art results on a couple of benchmarks, which is neat.

The code is public -- http://www.cs.toronto.edu/~jasper/software.html -- and in Python, so you could potentially try it out (it runs on EC2, too).
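If you just want the idea without pulling in their code, here's a toy sketch of GP-based search with an expected-improvement rule (this is NOT the released code; the objective, kernel choice and ranges are all made up for illustration):

    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern

    def validation_error(log_lr):
        # Hypothetical stand-in for "train a net with this learning rate and
        # report validation error".
        return (log_lr + 2.5) ** 2 + 0.1 * np.random.randn()

    bounds = (-5.0, 0.0)                            # search over log10(learning rate)
    X = np.random.uniform(*bounds, size=(3, 1))     # a few random initial points
    y = np.array([validation_error(x[0]) for x in X])

    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

    for _ in range(20):
        gp.fit(X, y)
        cand = np.random.uniform(*bounds, size=(1000, 1))      # candidate points
        mu, sigma = gp.predict(cand, return_std=True)
        best = y.min()
        z = (best - mu) / np.maximum(sigma, 1e-9)
        ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)   # expected improvement
        x_next = cand[np.argmax(ei)]
        X = np.vstack([X, x_next])
        y = np.append(y, validation_error(x_next[0]))

    print("best log10(learning rate):", X[np.argmin(y), 0])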

Btw, Geoff Hinton is teaching an introductory neural nets class on Coursera these days, you should check it out, he's a great teacher. Also, you can always come back to Google, we're doing cool stuff with this :)



