
I don't think the revolution was about hardware improvement. I did some neural network research (and published a few papers) in the 1990s and switched to other research disciplines afterwards. So, I'm not really familiar with the recent developments. But to my knowledge, there was indeed a revolution in neural network research. It was about how to train a DEEP neural network.

Traditionally, neural networks were trained by backpropagation, but most implementations had only one or two hidden layers, because training a network with many layers (it wasn't called a "deep" NN back then) was not only hard but often gave poorer results. Hochreiter identified the reason in his 1991 thesis: the vanishing gradient problem. The culprit had been identified, but the solution had yet to be found.
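For anyone who wants to see it in numbers, here is a toy sketch (my own illustration in Python/NumPy, with arbitrary depth and layer widths) of the effect: push an input through a stack of sigmoid layers, then watch the gradient norm collapse as it is backpropagated toward the first layer.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    depth, width = 20, 64
    Ws = [rng.normal(0.0, 1.0 / np.sqrt(width), (width, width)) for _ in range(depth)]

    # Forward pass, caching pre-activations for the backward pass.
    a, zs = rng.normal(size=width), []
    for W in Ws:
        z = W @ a
        zs.append(z)
        a = sigmoid(z)

    # Backward pass: start from an arbitrary gradient at the output and watch
    # its norm shrink, since each step multiplies by sigmoid'(z) <= 0.25.
    grad = np.ones(width)
    for layer in reversed(range(depth)):
        s = sigmoid(zs[layer])
        grad = Ws[layer].T @ (grad * s * (1.0 - s))  # chain rule through one layer
        print(f"layer {layer:2d}: ||grad|| = {np.linalg.norm(grad):.3e}")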

My impression is that there weren't any breakthroughs until several years later. Since I'd left the field, I don't know exactly what these breakthroughs were. Apparently, the invention of LSTM networks, CNNs, and the replacement of sigmoids by ReLUs were some of the important contributions. But in any case, the revolution was more about algorithmic improvements than the use of GPUs.
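To illustrate the sigmoid-to-ReLU point (again my own sketch, not anything specific from those papers): the sigmoid's derivative is at most 0.25, so a deep product of per-layer gradients shrinks geometrically, while the ReLU's derivative is exactly 1 wherever the unit is active.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    z = np.linspace(-5.0, 5.0, 11)

    # Sigmoid squashes gradients: its derivative never exceeds 0.25.
    sigmoid_grad = sigmoid(z) * (1.0 - sigmoid(z))

    # ReLU, i.e. max(0, z), passes gradients through unchanged wherever z > 0.
    relu_grad = (z > 0).astype(float)

    print("sigmoid'(z):", np.round(sigmoid_grad, 3))
    print("relu'(z):   ", relu_grad)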




The things that have most improved neural network training since you left are: 1. smart (Xavier/He) initialization, 2. ReLU activations, 3. batch normalization, and 4. residual connections.
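For concreteness, here is a toy NumPy sketch (my own, with arbitrary layer widths) of two of those items: He initialization draws weights with variance 2/fan_in so activation scale roughly survives ReLU layers, and a residual connection adds the input back to the layer's output so there is always an identity path for the gradient.

    import numpy as np

    rng = np.random.default_rng(0)

    def he_init(fan_in, fan_out):
        # Variance 2/fan_in keeps post-ReLU activations from shrinking or blowing up.
        return rng.normal(0.0, np.sqrt(2.0 / fan_in), (fan_out, fan_in))

    def relu(z):
        return np.maximum(0.0, z)

    def residual_block(x, W1, W2):
        # Output is x + F(x); the "+ x" term carries gradients straight through.
        return x + W2 @ relu(W1 @ x)

    width = 64
    W1, W2 = he_init(width, width), he_init(width, width)
    x = rng.normal(size=width)
    print("block output norm:", np.linalg.norm(residual_block(x, W1, W2)))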

The GPUs and dataset size were definitely very important though.


Thanks for the info. I may try to catch up on the recent advances some day.



