
Yeah, it's amazingly accurate for now, but only because the training data covers a high percentage of the actual data set, so it's probably classifying based on those insignificant differences. We'll see how it holds up...
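
To make the overlap point concrete, here's a rough sketch in Python. Everything in it is made up for illustration: the "model" is a hypothetical lookup table that memorizes its training examples, and the posts and labels are synthetic, not the classifier or data discussed here. It only shows why accuracy measured on examples the model has already seen says little about how it holds up on new ones.

```python
import random

def train_memorizer(examples):
    """Hypothetical 'model': memorizes (text, label) pairs it has seen."""
    table = dict(examples)
    labels = [label for _, label in examples]
    # Unseen text falls back to the most common training label.
    # sorted() keeps tie-breaking deterministic.
    fallback = max(sorted(set(labels)), key=labels.count)
    return lambda text: table.get(text, fallback)

def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(model(text) == label for text, label in examples) / len(examples)

def split(examples, held_out_fraction=0.2, seed=0):
    """Shuffle and split so the held-out items are never seen in training."""
    items = list(examples)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * (1 - held_out_fraction))
    return items[:cut], items[cut:]

# Synthetic data: 100 unique "posts" with alternating labels (an assumption).
data = [(f"post {i}", "good" if i % 2 == 0 else "bad") for i in range(100)]

train, held_out = split(data)
model = train_memorizer(train)

train_acc = accuracy(model, train)        # scored on data it memorized
held_out_acc = accuracy(model, held_out)  # scored on data it never saw

print(f"accuracy on training data: {train_acc:.2f}")
print(f"accuracy on held-out data: {held_out_acc:.2f}")
```

The memorizer looks perfect on its own training set and does no better than guessing on the held-out posts, which is the kind of gap you'd want to rule out before trusting the reported accuracy.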



I think you're training it on the submission titles - I wonder if the text of the websites themselves might be more accurate. Certainly richer. But it's quite possible that the submission titles are more accurate.


Says above he's using body text.



