Deep learning is just a rebranding of "neural networks". When neural nets became unpopular in the 90s and early 2000s, people talked about "multilayer networks" (dropping the "neural") since it wasn't really useful to think about this approach from the neuro perspective (since it's such a cartoonish model of real neural networks anyway).
Now that very deep networks have become possible, and various graphical models and Bayesian approaches have also been folded under "deep learning" (for example, using back-propagation to learn complicated posterior distributions in variational Bayes), deep learning is no longer just about vanilla feedforward nets.
>…since it wasn't really useful from the neuro perspective (since it's such a cartoonish model of real neural networks anyway).
It still isn't, and masses of people still go on to think these "neural" networks work the same way neurons in a body do… while neuroscientists are still trying to understand how real neuronal networks operate, armed with a bunch of pet phenomenological theories that mostly ignore physics despite the tools used lol
Well, yes, mostly, but there have also been genuine discoveries in the last 10 years. We can now train deep networks because we learned how to regularize - before it was impossible because of vanishing gradients. We can even have 1000-layer-deep nets, which would have been unthinkable. There are also some interesting approaches to unsupervised learning, like GANs and VAEs. We learned how to embed words (and not only words, but many other kinds of objects) into vectors. Another one would be the progress in reinforcement learning, with the stunning success of AlphaGo and of agents playing over 50 Atari games. The current crop of neural nets does Bayesian statistics, not just classification. We are playing with attention and memory mechanisms in order to achieve much more powerful results.
>We can now train deep networks because we learned how to regularize - before it was impossible because of vanishing gradients.
Those are two different things. Vanishing gradients were ameliorated by switching from logistic sigmoid activations to rectified linear units or tanh, and also by dramatically reducing the number of edges through which gradients propagate. The latter was accomplished through massive regularization that shrinks the parameter space: convolutional layers and dropout.
Techniques like residual connections are still being invented to further banish gradient instability.
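The vanishing-gradient point is easy to see numerically: back-propagated gradients are (roughly) products of per-layer local derivatives, and the logistic sigmoid's derivative never exceeds 0.25, so the product shrinks geometrically with depth. A toy numpy sketch (the 50-layer chain and the pre-activation values are made up for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of the logistic sigmoid; its maximum value is 0.25.
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    # Derivative of ReLU: exactly 1 wherever the unit is active.
    return (x > 0).astype(float)

rng = np.random.default_rng(0)
pre_activations = rng.normal(size=50)  # one pre-activation per layer

# Product of local derivatives along a 50-layer chain:
sig_product = np.prod(sigmoid_grad(pre_activations))
# Best case for ReLU (all units active, forced via abs): every factor is 1.
relu_product = np.prod(relu_grad(np.abs(pre_activations)))

print(f"sigmoid gradient product over 50 layers: {sig_product:.3e}")
print(f"ReLU gradient product over 50 layers:    {relu_product:.3e}")
```

The sigmoid product underflows toward zero long before 50 layers, while the ReLU chain passes the gradient through unchanged on its active path - which is why the switch mattered.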
Yeah, I've always heard that expert systems are good because you can reason about the solution: for instance, diagnosing patients based on rules contributed by doctors. You can easily trace the steps the algorithm takes.
But for a neural net, you cannot say why this particular net should be trusted, as you don't know how it arrives at a solution. Therefore it's "scary" to use.
While I don't agree, that explains why neural nets have been unpopular.
Random Decision Forests provide sort of a middle ground.
For training, a large set of decision trees is built randomly based on the input features.
To classify an input with one tree, each node examines a feature value of the input and chooses a branch. Leaves correspond to classifications, so when a leaf is reached, the tree has classified the given input.
By having a large set of trees, and picking e.g. the most common resulting class (majority vote), we increase accuracy.
However, each individual tree can actually be reasoned about: you can see the analysis (nodes) leading to each class (leaves).
I've had some success with RDFs in the past, and highly recommend them!
They are very easy to implement, very efficient to train and query, and they seem to work really well for classifying "discrete" input (i.e. where the feature values are binary or drawn from relatively small sets).
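The whole pipeline above - train many randomized trees, classify by majority vote, then inspect any single tree's node-by-node rules - can be sketched with scikit-learn (assuming it is available; the iris dataset and parameter choices here are just placeholders):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_text

# Train an ensemble of randomized decision trees.
iris = load_iris()
X, y = iris.data, iris.target
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Classification is a majority vote over all trees in the forest.
pred = forest.predict(X[:1])
print("majority-vote prediction:", pred[0])

# Each individual tree remains inspectable: dump its decision rules
# (the nodes leading to each leaf/class) as plain text.
print(export_text(forest.estimators_[0], feature_names=iris.feature_names))
```

`export_text` is what makes the "middle ground" argument concrete: the ensemble as a whole is opaque-ish, but any constituent tree's path from root to leaf reads as a chain of explicit feature thresholds.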
Just curious, what prevents somebody from debugging a deep network in the same way?
You can potentially check which features contributed to the activation of each "neuron"…
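For a single unit that idea is straightforward: the pre-activation is a weighted sum, so each feature's contribution is just its weight times its value. A toy numpy sketch (the weights, bias, and input here are random stand-ins, not a real trained network):

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=4)   # weights of one hidden "neuron"
b = 0.1                  # its bias
x = rng.normal(size=4)   # one input example

# Per-feature contribution to the pre-activation z = w @ x + b.
contributions = w * x
z = contributions.sum() + b

for i, c in enumerate(contributions):
    print(f"feature {i}: contributes {c:+.3f}")
print(f"pre-activation: {z:+.3f}")
```

The difficulty with deep nets is not this per-unit arithmetic but the composition: after a few layers of nonlinearities, the "features" feeding a unit are themselves learned activations with no obvious human-readable meaning, which is what makes the tree-style trace hard to reproduce.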