It's brilliant, but horrible, since you need somewhere between 300 and 1000 dimensions of floating-point word-vector semantic space to get good separation. Maybe that's shrunk since I last checked.
Even with less granularity, that's still like 256 bytes per grapheme.
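The storage math works out like this, assuming float32 (4 bytes per dimension) — the dimension counts are just illustrative, not from any particular model:

```python
# Rough storage cost of dense word vectors, assuming 4-byte float32
# components. Dimension counts below are illustrative examples.
BYTES_PER_FLOAT32 = 4

def bytes_per_vector(dims: int, bytes_per_dim: int = BYTES_PER_FLOAT32) -> int:
    """Memory needed for one embedding vector."""
    return dims * bytes_per_dim

# At the low end of the 300-1000 dimension range:
print(bytes_per_vector(300))  # 1200 bytes per grapheme/token
# A reduced 64-dim embedding lands at the ~256 bytes figure:
print(bytes_per_vector(64))   # 256 bytes
```

So even a heavily shrunk 64-dimensional embedding still costs 256 bytes per grapheme, versus one to a few bytes for the grapheme itself.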
I think this would be great for machine translation though.