
I am currently using this toolkit and I must say that I really like it.

The main advantages of Mallet over Weka (the main Java toolkit used in academic machine learning) for Natural Language Processing are:

- No need to map words and features to positions in a feature vector yourself.

- Instance preprocessing can be defined in pipes that are saved along with the models, so there is no need to remember how the data was preprocessed for each experiment.

- It contains algorithms for structured learning (CRFs, HMMs and general graphical models).
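
To make the first point concrete, here is roughly the bookkeeping you otherwise end up writing by hand. This is a plain Python sketch and not Mallet's API: the `Alphabet` class and the `vectorize` function are made-up names for illustration, and whitespace tokenization is an assumption.

```python
from collections import Counter


class Alphabet:
    """Hypothetical helper: maps each word to a feature-vector column."""

    def __init__(self):
        self.index = {}

    def lookup(self, word, add=True):
        # Assign the next free column to unseen words while training;
        # return -1 for unknown words once the mapping is frozen.
        if word not in self.index:
            if not add:
                return -1
            self.index[word] = len(self.index)
        return self.index[word]


def vectorize(text, alphabet, add=True):
    """Turn a text into a sparse {column: count} bag-of-words vector."""
    counts = Counter(text.lower().split())
    vector = {}
    for word, n in counts.items():
        column = alphabet.lookup(word, add)
        if column >= 0:
            vector[column] = n
    return vector


alphabet = Alphabet()
train = vectorize("the cat sat on the mat", alphabet)
# At test time the mapping must not grow, so unseen words are dropped.
test = vectorize("the dog sat", alphabet, add=False)
```

The same mapping has to be stored and reused at test time, which is exactly the kind of state that is easy to lose track of between experiments.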

On the other hand, Mallet implements fewer algorithms (e.g. no Support Vector Machines, to my knowledge).

In short, it is a nice toolkit to be aware of if you are planning to do Natural Language Processing.




For anyone wanting to use it in a commercial setting, it's worth noting that Weka is licensed under the GPL and Mallet under the CPL.

http://en.wikipedia.org/wiki/Common_Public_License



