
Well, yes, mostly, but there have also been genuine discoveries in the last 10 years. We can now train deep networks because we learned how to regularize - before it was impossible because of vanishing gradients. We can even have 1000-layer deep nets, which would have been unthinkable. There are also some interesting approaches to unsupervised learning, like GANs and VAEs. We learned how to embed words - and not only words, but many other classes of things - into vectors. Another one is the progress in reinforcement learning, with the stunning success of AlphaGo and of agents playing over 50 Atari games. The current crop of neural nets does Bayesian statistics, not just classification. And we are experimenting with attention and memory mechanisms to achieve much more powerful results.
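
To make the embedding point concrete, here is a minimal sketch (assuming PyTorch; the vocabulary size, dimensions, and token ids are made up for illustration) of a learned lookup table that maps discrete tokens to dense vectors:

    # Minimal illustration of an embedding layer: a trainable lookup table
    # mapping integer token ids to dense vectors. Sizes are arbitrary.
    import torch
    import torch.nn as nn

    vocab_size, embed_dim = 10_000, 300
    embedding = nn.Embedding(vocab_size, embed_dim)

    # A toy "sentence" of token ids (hypothetical ids, not from a real tokenizer).
    token_ids = torch.tensor([42, 7, 1337])
    vectors = embedding(token_ids)
    print(vectors.shape)  # torch.Size([3, 300])

    # After training (e.g. with a skip-gram or language-model objective),
    # similar words end up with nearby vectors, typically compared by
    # cosine similarity.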



>We can now train deep networks because we learned how to regularize - before it was impossible because of vanishing gradients.

Those are two different things. Vanishing gradients were ameliorated by switching from sigmoidal activation functions to rectified linear units or tanh activations, and also by dramatically reducing the number of edges through which gradients propagate. The latter was accomplished through massive regularization that shrinks the effective parameter space: convolutional layers and dropout.
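
A rough numerical sketch of why the activation choice matters (plain NumPy, toy numbers, not anyone's actual training setup): the derivative of a sigmoid is at most 0.25, so backpropagating through many sigmoid layers multiplies the gradient by a factor of at most 0.25 per layer, while a ReLU passes the gradient through unchanged along its active path.

    # Toy demonstration of vanishing gradients: multiply together the
    # per-layer derivative factors for a 50-layer chain of scalar units.
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    rng = np.random.default_rng(0)
    pre_activations = rng.normal(size=50)   # arbitrary pre-activation values

    # Sigmoid derivative: s * (1 - s), never larger than 0.25 per layer.
    s = sigmoid(pre_activations)
    sigmoid_factor = np.prod(s * (1.0 - s))

    # ReLU derivative: 1 where the unit is active, 0 where it is not.
    relu_derivs = (pre_activations > 0).astype(float)
    # Product over the active units only, i.e. along a path that stays active.
    relu_factor = np.prod(relu_derivs[relu_derivs > 0])

    print(f"sigmoid chain gradient factor: {sigmoid_factor:.3e}")  # vanishingly small
    print(f"ReLU chain gradient factor (active path): {relu_factor:.1f}")  # 1.0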

Stuff like residual connections is still being invented in order to further banish gradient instability.
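
For what it's worth, a minimal residual block looks something like this (a sketch assuming PyTorch; the layer sizes and the block name are made up): the identity skip gives the gradient a path that is never multiplied by small layer Jacobians.

    # Sketch of a residual block: the output is x + F(x), so during
    # backprop the gradient always has an identity path around F.
    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):          # hypothetical name, for illustration
        def __init__(self, dim: int = 256):
            super().__init__()
            self.body = nn.Sequential(
                nn.Linear(dim, dim),
                nn.ReLU(),
                nn.Linear(dim, dim),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return x + self.body(x)          # identity skip + learned residual

    x = torch.randn(8, 256)
    block = ResidualBlock()
    print(block(x).shape)                    # torch.Size([8, 256])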



