
What's the bottleneck? Bandwidth for downloading images? Understanding of an algorithm that would do the job?



Bandwidth, and the general cumbersomeness of dealing with large amounts of data on starving-startup resources. I actually spent about a decade of my career focused on image processing, and while I love its power, I know how much of an engineering challenge it can be at massive scale. I need to do a blog post about this, since I know my choice is a bit surprising and needs explanation.


Difficulty in creating an algorithm.

There are ways to get it done algorithmically; the challenge is in getting enough data. 30,000 images is too few. You would need a few million, and then even simple machine learning algorithms would work.

The latest machine learning techniques, such as unsupervised deep learning, might work with millions of unlabeled images plus the 30,000 labeled ones. See the sketch below.
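For anyone curious what that pretrain-then-fine-tune idea looks like in practice, here's a minimal PyTorch sketch: pretrain an autoencoder on the unlabeled images, then reuse its encoder with a small classification head fine-tuned on the labeled set. The architecture, image size (64x64 grayscale), class count, and loader names are all illustrative assumptions, not anything specific to this dataset.

  import torch
  import torch.nn as nn

  class Encoder(nn.Module):
      """Maps a 64x64 grayscale image to a 128-dim feature vector."""
      def __init__(self):
          super().__init__()
          self.net = nn.Sequential(
              nn.Conv2d(1, 16, 3, stride=2, padding=1),   # 64 -> 32
              nn.ReLU(),
              nn.Conv2d(16, 32, 3, stride=2, padding=1),  # 32 -> 16
              nn.ReLU(),
              nn.Flatten(),
              nn.Linear(32 * 16 * 16, 128),
          )
      def forward(self, x):
          return self.net(x)

  class Decoder(nn.Module):
      """Reconstructs the image from the 128-dim code (pretraining only)."""
      def __init__(self):
          super().__init__()
          self.fc = nn.Linear(128, 32 * 16 * 16)
          self.net = nn.Sequential(
              nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1),  # 16 -> 32
              nn.ReLU(),
              nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),   # 32 -> 64
              nn.Sigmoid(),
          )
      def forward(self, z):
          return self.net(self.fc(z).view(-1, 32, 16, 16))

  encoder, decoder = Encoder(), Decoder()
  num_classes = 10  # placeholder

  # Dummy stand-ins for the real data loaders (hypothetical names);
  # in practice these would stream batches of real images from storage.
  unlabeled_loader = [torch.rand(8, 1, 64, 64) for _ in range(4)]
  labeled_loader = [(torch.rand(8, 1, 64, 64),
                     torch.randint(0, num_classes, (8,))) for _ in range(4)]

  # Phase 1: unsupervised pretraining on the (millions of) unlabeled
  # images, minimizing reconstruction error -- no labels required.
  recon_loss = nn.MSELoss()
  pretrain_opt = torch.optim.Adam(
      list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
  for images in unlabeled_loader:
      pretrain_opt.zero_grad()
      loss = recon_loss(decoder(encoder(images)), images)
      loss.backward()
      pretrain_opt.step()

  # Phase 2: supervised fine-tuning on the ~30,000 labeled images,
  # reusing the pretrained encoder with a small classification head.
  classifier = nn.Linear(128, num_classes)
  clf_loss = nn.CrossEntropyLoss()
  finetune_opt = torch.optim.Adam(
      list(encoder.parameters()) + list(classifier.parameters()), lr=1e-4)
  for images, labels in labeled_loader:
      finetune_opt.zero_grad()
      loss = clf_loss(classifier(encoder(images)), labels)
      loss.backward()
      finetune_opt.step()

The point of the two phases is that the encoder learns general image features from the cheap unlabeled data, so the labeled 30,000 only have to teach the final classification layer and lightly adjust the features, rather than learn everything from scratch.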



