
For unstructured message analysis: it's not a silver bullet, but you may want to try BERT or another recent embedding model to turn each message into a vector.

(you can do this on your laptop, you don't need to upload the data anywhere)

Then use something like sklearn's OPTICS to cluster the embeddings (it's a clustering method that doesn't require you to know the number of clusters in advance).
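
A rough sketch of those two steps, in case it helps. Assumptions on my part: the sentence-transformers package and the "all-MiniLM-L6-v2" model are just one example of a BERT-style encoder (the model weights get downloaded once, but your messages stay on your machine), and the function names, cosine metric, and min_samples value are illustrative choices you'd want to tune.

```python
import numpy as np
from sklearn.cluster import OPTICS


def embed_messages(messages, model_name="all-MiniLM-L6-v2"):
    """Embed messages locally with a BERT-style sentence encoder.

    Assumption: sentence-transformers is installed. The model name is an
    example choice, not a recommendation from the comment above.
    """
    # Imported lazily so the clustering helper works without this package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return np.asarray(model.encode(list(messages), normalize_embeddings=True))


def cluster_embeddings(embeddings, min_samples=5):
    """Cluster embedding vectors with OPTICS.

    Returns one integer label per row; sklearn uses -1 for points that
    were not assigned to any cluster. The number of clusters is not
    specified anywhere; min_samples is a hypothetical starting value.
    """
    clusterer = OPTICS(min_samples=min_samples, metric="cosine")
    return clusterer.fit_predict(embeddings)
```

Usage would be roughly labels = cluster_embeddings(embed_messages(my_messages)), where my_messages is your own list of strings. OPTICS can be slow on large datasets, so it may be worth trying on a sample first.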

If there are obvious clusters, you may be able to label a few useful categories. If you get lucky and most entries fall into a cluster, you can potentially label the remaining outliers by hand.
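
For the labeling step, something like this could help you eyeball the results. It's a minimal sketch that assumes you already have one integer cluster label per message, with -1 marking unclustered points (sklearn's convention); the helper names are made up for illustration.

```python
from collections import defaultdict


def group_by_cluster(messages, labels, noise_label=-1):
    """Split messages into per-cluster lists plus a list of outliers."""
    clusters = defaultdict(list)
    outliers = []
    for message, label in zip(messages, labels):
        if label == noise_label:
            outliers.append(message)  # candidates for manual labeling
        else:
            clusters[int(label)].append(message)
    return dict(clusters), outliers


def sample_for_labeling(clusters, per_cluster=3):
    """Take the first few messages of each cluster to help name a category."""
    return {label: members[:per_cluster] for label, members in clusters.items()}
```

Reading a handful of messages per cluster is usually enough to decide whether a cluster maps to a useful category, and the outliers list is what you'd go through by hand.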



