
In my experience Naive Bayes works pretty well on text classification tasks like this. You might want to give it a try.
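
For anyone who wants to try it, here is a minimal sketch of a multinomial Naive Bayes text classifier in plain Python. It assumes whitespace tokenization and add-one (Laplace) smoothing; the function names `train_nb` and `predict_nb` and the toy spam/ham data are made up for illustration and are not from any particular library.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Count class and per-class word frequencies (hypothetical helper)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()  # assumption: whitespace tokenization
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log-posterior for `text`."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        # Log prior: fraction of training documents in this class.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing so unseen (word, class) pairs are not zero.
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, invented for illustration only.
docs = [
    "cheap pills buy now",
    "buy cheap watches now",
    "meeting agenda for monday",
    "notes from the monday meeting",
]
labels = ["spam", "spam", "ham", "ham"]
model = train_nb(docs, labels)
print(predict_nb(model, "buy cheap pills"))
```

Working in log space avoids underflow when multiplying many small probabilities. On real data you would likely want a better tokenizer and a held-out test set.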



Vowpal Wabbit is also pretty easy to use and fast. I train and test on 100K text examples in under 1 minute. For me it works better than random forests and the other methods I've tried.


The choice of algorithm depends entirely on the type of classification task at hand. Naive Bayes is a good fit for certain kinds of classification problems, but there may be better options for other kinds of text classification problems.


Naive Bayes does well on small data sets, but in general it does poorly on larger text sets because of its independence assumption (which is horribly wrong for language).
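
To spell out the assumption being referred to: in the standard formulation, Naive Bayes scores a class $c$ for a document with words $w_1, \dots, w_n$ by treating the words as conditionally independent given the class. The symbols here are my own notation, not from the thread.

```latex
P(c \mid w_1, \dots, w_n) \;\propto\; P(c) \prod_{i=1}^{n} P(w_i \mid c)
```

The product form is what ignores word order and co-occurrence: each word contributes its own factor regardless of which other words appear alongside it.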



