There are so many issues with this post, let me enumerate: 1. Straw man tweet by...

sbov · on May 31, 2017

> 3.

How much of this is students doing research where they already have access to big data, which makes sense if your goal is to do deep learning research, vs being given a problem a business wants to solve? Can you make the same statement for the average problem at your average small-medium sized business? Can you really get big data that is relevant to the local, non-chain coffee shop down the street?

If you can, it seems like an amazing business opportunity: bringing Google-level insights to businesses that don't directly have Google-level data.

PeterisP · on May 31, 2017

The issue of whether some business like "the local, non-chain coffee shop down the street" has any reason to use machine learning whatsoever seems to be orthogonal to the problem discussed in the article, which is the choice of approaches if you're going to do some machine learning.

There's a classic quote from Tukey: "The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data." Yes, it's quite likely that an average small-medium sized business has no problems where the possible benefit of ML-driven insights would match the costs required to analyze whatever data they have.

However, if a small-medium business has some problem with a large enough likely payback to justify building some ML system, it is quite likely that deep learning may be applicable to their data.

A big factor here is transfer learning: in many domains, while you may only have a small amount of data, you'd want a system that has learned to generalize from a huge quantity of similar external data and is then just tuned on your data. For example, if a cookie bakery needs analysis of cookie pictures or reviews of cookies, and has limited data samples, it would be reasonable to bring in e.g. ImageNet data or an Amazon review corpus. You'd "teach" the system how pictures/internet reviews/English language/whatever else works on the biggest data available, and just retrain/adapt it to your particular problem afterwards.
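To make the "pretrain big, adapt small" idea concrete, here's a toy sketch in plain Python. Everything in it is an assumption for illustration: the synthetic `make_data` tasks stand in for "ImageNet" and "cookie pictures", the one-layer tanh network stands in for a deep model, and the sizes and learning rates are arbitrary. In practice you'd start from a real pretrained network rather than train one yourself.

```python
import math
import random

def make_data(n, w_true, rng):
    """Synthetic binary task: label is 1 when w_true . x > 0 (stand-in for real data)."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in w_true]
        y = 1.0 if sum(w * xi for w, xi in zip(w_true, x)) > 0 else 0.0
        data.append((x, y))
    return data

def features(W, x):
    """Shared feature extractor: a single tanh layer."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

def predict(W, head, x):
    """Return (probability of class 1, hidden features)."""
    h = features(W, x)
    z = sum(a * hi for a, hi in zip(head, h))
    return 1.0 / (1.0 + math.exp(-z)), h

def train(W, head, data, epochs, lr, freeze_features):
    """Plain SGD on log loss; with freeze_features=True only the head is updated."""
    for _ in range(epochs):
        for x, y in data:
            p, h = predict(W, head, x)
            err = p - y  # gradient of the log loss with respect to the logit
            if not freeze_features:
                for j, row in enumerate(W):
                    g = err * head[j] * (1.0 - h[j] ** 2)
                    for i in range(len(row)):
                        row[i] -= lr * g * x[i]
            for j in range(len(head)):
                head[j] -= lr * err * h[j]

def accuracy(W, head, data):
    hits = sum((predict(W, head, x)[0] > 0.5) == (y == 1.0) for x, y in data)
    return hits / len(data)

rng = random.Random(0)
n_in, n_hidden = 4, 6
W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]

# 1) "Teach" the features on the big, generic source task.
source = make_data(2000, [1.0, -1.0, 0.5, 0.0], rng)
source_head = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
train(W, source_head, source, epochs=5, lr=0.1, freeze_features=False)

# 2) Adapt to the small, related target task: keep the features, fit a new head.
target_train = make_data(20, [1.0, -0.8, 0.6, 0.1], rng)
target_test = make_data(200, [1.0, -0.8, 0.6, 0.1], rng)
W_before = [row[:] for row in W]
target_head = [0.0] * n_hidden
train(W, target_head, target_train, epochs=50, lr=0.1, freeze_features=True)

target_acc = accuracy(W, target_head, target_test)
print("target test accuracy:", target_acc)
```

The point is only the shape of the workflow: the expensive part (learning features) happens once on the large dataset, and the 20 target examples only have to fit a handful of head weights. Whether this helps on a real problem depends on how similar the source and target data actually are.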