
Is 30,000 photos a large enough dataset for an ML exercise like this?

It might just be journalistic simplification, but I wonder if they overfit their data, given how specific the "good" and "bad" words are:

Among the best words: Peru, Cambodia, Michigan, tombs, trails and boats. What photo captions are the most likely to signify a bad photo for a travel magazine? San Jose, mommy, graduation and CEO, Warden says.
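
To make the overfitting worry concrete, here is a rough simulation sketch. It is not Warden's data or method: the vocabulary size, the Zipf-like word frequencies, the five words per caption, and the function names are all my own assumptions. The labels are pure coin flips, so no word carries any real signal, yet rare words can still end up looking strongly "good" or "bad" by chance.

```python
import itertools
import random


def spurious_word_scores(n_photos=30_000, vocab_size=5_000,
                         words_per_caption=5, seed=0):
    """Simulate captions whose words are independent of the photo label.

    Returns (word_id, count, good_rate) for every word that appears at
    least once. All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Zipf-like frequencies: the word at rank r is drawn with weight 1/r.
    cum_weights = list(itertools.accumulate(
        1.0 / (rank + 1) for rank in range(vocab_size)))
    word_ids = range(vocab_size)
    counts = [0] * vocab_size
    good = [0] * vocab_size
    for _ in range(n_photos):
        is_good = rng.random() < 0.5  # label is a coin flip: no real signal
        caption = set(rng.choices(word_ids, cum_weights=cum_weights,
                                  k=words_per_caption))
        for w in caption:
            counts[w] += 1
            good[w] += is_good
    return [(w, counts[w], good[w] / counts[w])
            for w in word_ids if counts[w] > 0]


def max_deviation(scores, min_count, max_count):
    """Largest distance from the true 50% good rate among words whose
    caption count lies in [min_count, max_count]."""
    return max(abs(rate - 0.5)
               for _, count, rate in scores
               if min_count <= count <= max_count)


if __name__ == "__main__":
    scores = spurious_word_scores()
    print("rare words (5-20 captions):  ", max_deviation(scores, 5, 20))
    print("common words (500+ captions):", max_deviation(scores, 500, 10**9))
```

In this toy setup the common words stay close to 50%, while some words that appear in only a handful of captions drift far from it. If place names like the ones quoted above are similarly rare in the real 30,000 captions, they could be noise of the same kind, which is why I'd want to see per-word counts or held-out results before trusting the list.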



