
What's the bottleneck? Bandwidth for downloading images? Understanding of an algorithm that would do the job?



Bandwidth, and the general cumbersomeness of dealing with large amounts of data on starving-startup resources. I actually spent about a decade of my career focused on image processing, and while I love its power, I know how much of an engineering challenge it can be at massive scale. I need to do a blog post about this, since I know my choice is a bit surprising and needs explanation.


Difficulty in creating an algorithm.

There are ways to get it done algorithmically; the challenge is in getting enough data. 30,000 images is too few. You would need a few million, and then even simple machine learning algorithms would work.

The latest machine learning techniques, such as unsupervised deep learning, might work with millions of unlabeled images plus the 30,000 labeled ones. See the sketch below.
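For anyone curious what that pretrain-then-fine-tune idea looks like in practice, here's a minimal PyTorch sketch: pretrain an autoencoder on the unlabeled images, then reuse its encoder with a small classification head fine-tuned on the labeled set. The architecture, image size (64x64 grayscale), class count, and loader names are all illustrative assumptions, not anything specific to this dataset.

  import torch
  import torch.nn as nn

  class Encoder(nn.Module):
      """Maps a 64x64 grayscale image to a 128-dim feature vector."""
      def __init__(self):
          super().__init__()
          self.net = nn.Sequential(
              nn.Conv2d(1, 16, 3, stride=2, padding=1),   # 64 -> 32
              nn.ReLU(),
              nn.Conv2d(16, 32, 3, stride=2, padding=1),  # 32 -> 16
              nn.ReLU(),
              nn.Flatten(),
              nn.Linear(32 * 16 * 16, 128),
          )
      def forward(self, x):
          return self.net(x)

  class Decoder(nn.Module):
      """Reconstructs the image from the 128-dim code (pretraining only)."""
      def __init__(self):
          super().__init__()
          self.fc = nn.Linear(128, 32 * 16 * 16)
          self.net = nn.Sequential(
              nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1),  # 16 -> 32
              nn.ReLU(),
              nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),   # 32 -> 64
              nn.Sigmoid(),
          )
      def forward(self, z):
          return self.net(self.fc(z).view(-1, 32, 16, 16))

  encoder, decoder = Encoder(), Decoder()
  num_classes = 10  # placeholder

  # Dummy stand-ins for the real data loaders (hypothetical names);
  # in practice these would stream batches of real images from storage.
  unlabeled_loader = [torch.rand(8, 1, 64, 64) for _ in range(4)]
  labeled_loader = [(torch.rand(8, 1, 64, 64),
                     torch.randint(0, num_classes, (8,))) for _ in range(4)]

  # Phase 1: unsupervised pretraining on the (millions of) unlabeled
  # images, minimizing reconstruction error -- no labels required.
  recon_loss = nn.MSELoss()
  pretrain_opt = torch.optim.Adam(
      list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
  for images in unlabeled_loader:
      pretrain_opt.zero_grad()
      loss = recon_loss(decoder(encoder(images)), images)
      loss.backward()
      pretrain_opt.step()

  # Phase 2: supervised fine-tuning on the ~30,000 labeled images,
  # reusing the pretrained encoder with a small classification head.
  classifier = nn.Linear(128, num_classes)
  clf_loss = nn.CrossEntropyLoss()
  finetune_opt = torch.optim.Adam(
      list(encoder.parameters()) + list(classifier.parameters()), lr=1e-4)
  for images, labels in labeled_loader:
      finetune_opt.zero_grad()
      loss = clf_loss(classifier(encoder(images)), labels)
      loss.backward()
      finetune_opt.step()

The point of the two phases is that the encoder learns general image features from the cheap unlabeled data, so the labeled 30,000 only have to teach the final classification layer and lightly adjust the features, rather than learn everything from scratch.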



