The OP is lumping a lot of different concepts under the neural network umbrella. Things like restricted Boltzmann machines and hierarchical temporal memory are technically neural networks, but many computer scientists would consider them different enough in approach to think of them separately. That is, you wouldn't say "let's use a type of neural network to solve this problem"; you would probably say "let's use a restricted Boltzmann machine".
It is true that these things are becoming more popular. I've found in practice that a modern computer scientist is still more likely to solve a simple learning problem with some form of regression, if only because it's faster than training a NN.
Restricted Boltzmann machines are as bona fide a neural network (NN) as you can get, and they have been around since the golden age of neural networks. You have the same layered structure, feed-forward connections, and the same "squashing function". The only "restriction" is that the unknown and the known nodes must live on different layers, so that connections run only between a known and an unknown node. They have been called different names, and the theory explaining them has had different names too, for example Smolensky's "Harmony Theory".
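For anyone who hasn't seen one, here is a minimal sketch of that bipartite structure trained with one step of contrastive divergence (my own toy illustration in NumPy; the layer sizes, learning rate, and random data are arbitrary, not taken from any particular paper):

    import numpy as np

    def sigmoid(z):
        # the same "squashing function" as a classic feed-forward net
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    n_visible, n_hidden, lr = 6, 3, 0.1
    W = rng.normal(0.0, 0.01, (n_visible, n_hidden))  # visible<->hidden only:
    b_v = np.zeros(n_visible)                         # no edges within a layer,
    b_h = np.zeros(n_hidden)                          # hence "restricted"

    def cd1_step(v0, W, b_v, b_h):
        # contrastive divergence, one Gibbs step: nudge the weights toward
        # the data statistics and away from the model's reconstruction
        p_h0 = sigmoid(v0 @ W + b_h)
        h0 = (rng.random(n_hidden) < p_h0).astype(float)
        p_v1 = sigmoid(h0 @ W.T + b_v)      # reconstruct the visibles
        p_h1 = sigmoid(p_v1 @ W + b_h)
        W += lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
        b_v += lr * (v0 - p_v1)
        b_h += lr * (p_h0 - p_h1)

    for _ in range(1000):
        cd1_step(rng.integers(0, 2, n_visible).astype(float), W, b_v, b_h)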
I think NN is a broad enough category that no matter what you want to use or describe, you will have to qualify your "let's use blah" statements with a particular kind of neural network. Similar in spirit to "let's use a parser" vs. "let's use an LALR parser".
But back to the topic of the newfound interest in NNs: part of the reason is that there have been new developments in training algorithms which work significantly better than those used traditionally. With these methods NNs require far less babysitting, something they traditionally needed a huge amount of.
The other reason is the sheer scale of the data sets that are available now, which has forced machine learners to move from powerful but batch optimization algorithms (quadratic programming, for instance) to the simple, online, gradient-based algorithms that have been the forte of the NN community all along.
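To make the batch-vs-online contrast concrete, here's a toy least-squares problem solved both ways (my own sketch with made-up data): the batch solver needs the whole matrix in memory at once, while the online version takes one cheap gradient step per example, which is what scales.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(10000, 5))
    y = X @ np.array([1., -2., 0.5, 3., 0.]) + 0.1 * rng.normal(size=10000)

    # batch: one global solve over the full data set
    w_batch = np.linalg.lstsq(X, y, rcond=None)[0]

    # online: stream the examples, one gradient step each
    w, lr = np.zeros(5), 0.01
    for xi, yi in zip(X, y):
        w += lr * (yi - xi @ w) * xi

    print(w_batch, w)  # both should land near the true coefficients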
Training a NN is no different from regression. It is another name/technique for (somewhat systematically) building a tower of increasingly complex regression functions. If the simplest (linear) one works, it's imperative that one use the simplest one, in the interest of good predictive accuracy on unseen data. Bundled together with the low training time the parent mentioned, it's a win-win.
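A concrete version of "start at the bottom of the tower" might look like this (a sketch using scikit-learn with placeholder data; the 0.9 cutoff and the MLP settings are arbitrary choices of mine):

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.neural_network import MLPRegressor
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 4))
    y = X @ np.array([2., -1., 0., 0.5]) + 0.1 * rng.normal(size=500)

    # bottom rung of the tower: a plain linear fit
    linear_score = cross_val_score(LinearRegression(), X, y).mean()

    # only climb to a NN if the linear model clearly underfits
    if linear_score < 0.9:
        nn_score = cross_val_score(
            MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000), X, y).mean()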
In my experience, the reason is less about speed and more about the perception that training algorithms for things like RBMs still involve a certain amount of "black magic" in tuning parameters, deciding when training has converged, etc., in contrast to linear/logistic regression or support vector machines, where you can basically turn a crank and get an answer out.
I have the impression that there are many more startups popping up that operate generally under the AI umbrella. Companies are doing things related to medicine, finance, etc. There are even a couple in the May Who's Hiring post here on HN. Quite an exciting time. It seems like the AI Spring is in progress and we're coming up on Summer.
I don't think this is significant at all, because of the information boom. As the Internet and storage have boomed and become cheaper and more available, more content and data is hosted on the web, and more and more of it is indexed by Google. Maybe we need to adjust stats graphs that are based on Google Scholar for the inflation of data over the last decade.
Maybe I missed something, but I don't understand this criticism. The author states that 'The number of hits for each year was divided by the number of hits for "machine learning".' Wouldn't that control for the inflation of data in exactly the way you are proposing?
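With some hypothetical numbers (mine, not the author's), the point is that raw hits grow with the indexed corpus, but the ratio only moves if "neural network" grows faster than "machine learning" overall:

    # made-up counts for illustration
    ratio_2000 = 1_000 / 10_000   # 0.10
    ratio_2010 = 5_000 / 40_000   # 0.125 -> a genuine relative increase,
                                  # even though both raw counts exploded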
Is there much commercialization of neural networks going on? I own NeuralNetwork.com but am not suited to turn it into a business. If anyone is in the space and is interested in using the category killer name, I'd love to talk.
I work with NNs in a research context and am working on a startup commercializing some research-related NN technology. What price range were you thinking?
My email is jlehman at eecs dot cs dot ucf dot edu. The captcha on your blog wouldn't work for me: I see a reCAPTCHA in the HTML, but it wasn't displaying in either Firefox or Chrome.
I also find it surprising to see in which countries the term is most popular according to Google Trends.
None of the top 10 is part of the G8, and English is only the third most frequent language. I suppose those countries are lagging behind, but I might be wrong.
At my uni, the only course with some focus on NNs felt pretty outdated.
Might this be due to the explosion in neuroscience publications that are not related to artificial neural networks? There haven't been major breakthroughs in ANNs lately.
In machine learning, the "big" thing about neural networks at the moment is deep learning via some sort of unsupervised pre-training. There's been a lot of work published on this recently.
(Note that this is quite a one-sided view of things, but it does convey the excitement of deep-learning researchers and the potential of what might be possible.)
Although I'm not in deep learning myself (I'm a computer vision researcher), here's a TL;DR as I understand it: rather than having people in specific domains such as computer vision or speech processing create their own features, the idea is to take raw inputs (pixels in the case of images) and train multi-layer neural net architectures that "learn" the relevant higher-level features in an unsupervised way (i.e., without labeled training data). Some of these seem to be pulling out interesting features and perform competitively on a few benchmarks in vision and other fields.
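Here's a minimal sketch of the greedy layer-wise idea, using plain autoencoders in NumPy (my own simplification; the published work uses RBMs, denoising autoencoders, etc., and real pixel data rather than the random stand-in here). Each layer learns, with no labels, to reconstruct its input through a bottleneck, and the next layer is then trained on the features the previous one produced:

    import numpy as np

    rng = np.random.default_rng(0)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    def train_autoencoder(data, n_hidden, lr=0.1, epochs=20):
        # unsupervised: reconstruct the input through a bottleneck
        n_in = data.shape[1]
        W1 = rng.normal(0, 0.1, (n_in, n_hidden))   # encoder
        W2 = rng.normal(0, 0.1, (n_hidden, n_in))   # decoder
        for _ in range(epochs):
            for x in data:
                h = sigmoid(x @ W1)                 # learned "features"
                x_hat = sigmoid(h @ W2)             # reconstruction
                d_out = (x_hat - x) * x_hat * (1 - x_hat)
                d_h = (d_out @ W2.T) * h * (1 - h)  # chain rule, one layer down
                W2 -= lr * np.outer(h, d_out)
                W1 -= lr * np.outer(x, d_h)
        return W1

    raw = rng.random((200, 16))          # stand-in for raw pixel data
    W_a = train_autoencoder(raw, 8)      # layer 1 learns features of pixels
    layer1 = sigmoid(raw @ W_a)
    W_b = train_autoencoder(layer1, 4)   # layer 2 learns features of features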
I'm not sold on this yet, because it seems like the complexity of designing features has merely been traded in for the complexity of designing different learning architectures, but it's certainly becoming quite popular these days (mostly led by Geoff Hinton of Toronto, Yoshua Bengio of Montreal, and Yann LeCun of NYU).
No, non-parametric learning is an unrelated topic, which seeks to estimate probability distributions without using "parametric" (= having known structure) models. The canonical example of a non-parametric approach is using histograms, or their generalization, called Parzen Windows.
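Concretely, a Parzen-window estimate is just an average of kernel "bumps" centered on the observed samples. A quick sketch (Gaussian kernel; the bandwidth h=0.5 and the bimodal toy data are arbitrary choices of mine):

    import numpy as np

    rng = np.random.default_rng(0)

    def parzen_density(x, samples, h=0.5):
        # average a Gaussian bump of width h centered on each sample;
        # nothing about the distribution's shape is assumed up front
        z = (x - samples) / h
        return (np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)).mean() / h

    # bimodal data that no single parametric Gaussian could capture
    samples = np.concatenate([rng.normal(-2, 1, 500), rng.normal(3, 0.5, 500)])
    print(parzen_density(0.0, samples), parzen_density(3.0, samples))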
That's an orthogonal distinction. The methods thus far have typically been parametric, in that there's a fixed network topology and the learning algorithm adjusts the (fixed set of) weights on the edges. There's no reason, though, why you couldn't have a nonparametric version that adaptively chose the number of hidden nodes in the networks and the connectivity structure.
Also, take a look at some of the work by J. Schmidhuber (http://www.idsia.ch/~juergen/). For example, the LSTM architecture is quite powerful, and he has very successfully applied those RNN models to a wide range of areas (handwriting recognition, robotics, ...).
Are you sure about this? I'm not an expert by any means, but when I participated in the Netflix Prize a few years ago, there seemed to be lots of investigation into different variations of neural networks.
The backpropagation neural network itself is a variation on many nonlinear, additive statistical models. The field emerged as an independent entity largely because it had "cool" neurological connotations and was largely ignored by statisticians. The Netflix Prize was ultimately won using linear algebra with sensible starting values (the KDD paper that describes the winning method is remarkably easy to read).
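For the curious, the core of that "linear algebra" is factoring the ratings matrix into low-rank user and item factors fit by SGD. A bare-bones sketch with toy data (hyperparameters and data are mine, and of course this is not the actual winning code, which blended many models):

    import numpy as np

    rng = np.random.default_rng(0)
    n_users, n_items, k = 100, 50, 5
    # toy (user, item, rating) triples standing in for the Netflix data
    ratings = [(rng.integers(n_users), rng.integers(n_items),
                rng.integers(1, 6)) for _ in range(2000)]

    P = rng.normal(0, 0.1, (n_users, k))   # user factors ("sensible" small
    Q = rng.normal(0, 0.1, (n_items, k))   # random starting values)
    lr, reg = 0.01, 0.05

    for _ in range(20):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]          # prediction is just a dot product
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])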
Backpropagation is just the chain rule of differentiation, and each layer of an ANN is a logistic regression, which statisticians have been and continue to be interested in.
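To spell that out: one sigmoid unit trained on cross-entropy is literally logistic regression, and its gradient, (p - y) times the input, is exactly the term backprop chains through the layers. A tiny illustration (learning AND; nothing here is specific to any library):

    import numpy as np

    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    # one "neuron" = logistic regression, trained by gradient descent
    w, b, lr = np.zeros(2), 0.0, 0.1
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = np.array([0., 0., 0., 1.])

    for _ in range(2000):
        p = sigmoid(X @ w + b)
        # cross-entropy gradient wrt (w, b) is (p - y) times the input:
        # the same expression backprop propagates layer by layer
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * (p - y).mean()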
I'm not sure your caricature of the Netflix winning solution is correct: I believe it was a blend of around 25 different models (including, I think, the RBM someone pointed out above), each quite different from the others. That is typically how these challenges are won.
Not an expert either, and indeed, the Hinton papers on RBMs are the last thing I remember reading about. But their invention dates back to the '80s. I'm just noticing that if you search "neural network" on Google Scholar for recent years, you will find many neuroscientific results. Of course I don't have the data to back this up.
Well, but he's conditioning on "machine learning" already (the little caption on the histogram says 'P(neural network | machine learning)'), so that most likely filters out the straight neuroscience papers.