
It's brilliant, but horrible, since you need somewhere between 300 and 1,000 dimensions of floating-point word-vector semantic space to get good separation. Maybe it's shrunk since I last checked.

Even with less granularity, that's still like 256 bytes per grapheme.
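To make the arithmetic concrete, here's a rough back-of-envelope sketch (my own illustrative numbers, not from any particular embedding model): at the low end of that range, 300 float32 dimensions cost 1,200 bytes per vector, and even quantizing to one byte per dimension at 256 dimensions still leaves ~256 bytes each.

```python
# Back-of-envelope storage cost for one dense embedding vector.

def bytes_per_vector(dims: int, bytes_per_dim: int) -> int:
    """Storage for a single embedding of `dims` dimensions."""
    return dims * bytes_per_dim

# Low end of the 300-1000 range at float32 (4 bytes/dim):
print(bytes_per_vector(300, 4))  # 1200 bytes

# Quantized to int8 (1 byte/dim) at 256 dims -- still ~256 bytes:
print(bytes_per_vector(256, 1))  # 256 bytes
```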

I think this would be great for machine translation though.
