A few useful things to know about machine learning (2012) [pdf] (washington.edu)
99 points by chaitanyav on Aug 27, 2014 | 7 comments



There are a few criticisms that can be made of this paper: it tries to cover a lot of ground in a small space, its language is rather informal, and possibly more. But these can be forgiven, as it's a generally informative and entertaining piece.

For me, the largest omission is the lack of any reference to the theoretical limits of machine learning: that is, what can't be achieved even if you assume infinite resources and algorithmic complexity. This matters to me because the paper otherwise appears to be a damn good stab at a comprehensive review of why machine learning projects fail, except for missing this critical point. The idea is best explored in What Computers Can't Do (H. Dreyfus, 1972), recounted in What Computers Still Can't Do (H. Dreyfus, 1992), and well summarized in A History of First Step Fallacies (H. Dreyfus, 2012) [1].

Finally, any paper that's freely distributed, can be enjoyed over lunch and includes the phrase "most of the volume of a high-dimensional orange is in the skin, not the pulp" is fine in my book.

[1] - http://link.springer.com/article/10.1007%2Fs11023-012-9276-0 [PDF]


The abstract was intriguing. It's a shame you have to pay to read the rest of the paper.


8. FEATURE ENGINEERING IS THE KEY

Feature engineering is more difficult because it's domain-specific, while learners can be largely general-purpose ... one of the holy grails of machine learning is to automate more and more of the feature engineering process.

This is the goal of deep learning, and more generally, representation learning: automatic discovery of explanatory features from large amounts of data. I'm surprised it wasn't mentioned.


Deep learning needs feature engineering too.

You still need to transform your context into a vector of boolean or real values, somehow. And that transform is going to encode assumptions about what information is relevant to the problem, and what's not.

Let's say you're trying to predict house prices. There's no end of geo-tagged data you might pull in. And if you have a cleverer idea than the next guy, your model will be more accurate. And, probably, if the next guy's at least competent, it'll be your feature ideas that set you apart.

In a linear model, you need to come up with a clever set of conjunction features that balances bias and variance. You don't need to do that for a deep learning model, and that's a big advantage. But that's not the same as saying there's no feature engineering.
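To make that concrete, here's a minimal sketch (synthetic data, hypothetical columns, plain numpy/scikit-learn, not taken from the paper) of the kind of hand-built conjunction feature you'd feed a linear model in the house-price case:

    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)
    n = 1000

    # Hypothetical raw inputs per house (synthetic here): square footage and a
    # 0/1 flag for "within 1 km of a subway stop", extracted from geo-tagged data.
    sqft = rng.uniform(40, 400, n)
    near_subway = rng.integers(0, 2, n)
    price = (2000 * sqft + 50000 * near_subway
             + 500 * sqft * near_subway + rng.normal(0, 10000, n))

    # Hand-engineered features for a linear model: log-scale the size and add a
    # conjunction (interaction) term "big AND near transit" that a linear model
    # over the raw columns cannot represent by itself.
    X = np.column_stack([
        np.log(sqft),
        near_subway,
        sqft * near_subway,   # the conjunction feature
    ])
    model = Ridge().fit(X, price)
    print(model.score(X, price))

A deep model could drop that third column and learn the interaction itself, but someone still had to decide that square footage and subway proximity were worth pulling out of the raw geo-tagged data in the first place.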


That is the goal; however, deep learning models still have a long way to go before feature engineering goes away, I'm afraid.

Once we get to that point, the "black box machine learning as a service" that many startups seem to be selling nowadays will be replacing data scientists.


This is a paper published in 2012. Deep learning wasn't mainstream at the time it was written (probably 2011-2012).


Machine learning has been full of methods for learning latent feature representations since long before deep learning was trendy, from simple things like PCA to more sophisticated Bayesian models. Deep learning refers specifically to using multi-layer neural models, and while neat, it is only one way of doing it; it certainly didn't invent the whole concept of learning feature representations as recently as 2012!
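For instance, a plain PCA fit already gives you a learned latent representation with no neural network involved. A minimal sketch on synthetic data (scikit-learn, not from the paper):

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    # Synthetic 50-dimensional data that actually lives near a 3-dimensional subspace.
    latent = rng.normal(size=(500, 3))
    X = latent @ rng.normal(size=(3, 50)) + 0.1 * rng.normal(size=(500, 50))

    # PCA learns a 3-dimensional latent feature representation of the raw inputs.
    pca = PCA(n_components=3).fit(X)
    Z = pca.transform(X)                         # the learned features
    print(pca.explained_variance_ratio_.sum())   # close to 1: most variance captured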



