
I tried clustering similar embeddings, but it did extremely poorly (~0%). The groupings are often deceptive: words in a group may be connected in only one small way, and there are lots of spurious groups to throw you off. Looking for groups with high similarity on only a subset of embedding dimensions might help, but I didn't have much time to play either :) A notebook to get you going if you do want to play: https://colab.research.google.com/drive/1KJeSB9Q5XzSeT9ONUJ_...



I definitely think finding similarity on a variable subset of dimensions is required. Fingers crossed I get the time to try soon.
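
For anyone who wants to poke at the subset-of-dimensions idea, here is a rough sketch. It is not the notebook's code, and everything in it is an assumption for illustration: the names `subset_score` and `best_group`, the choice of scoring a group by its `k` most-agreeing dimensions, and the brute-force search over all combinations (only practical for a small word list). It expects an `(n_words, d)` NumPy array of embeddings from whatever model you use.

```python
import itertools

import numpy as np


def subset_score(group, k):
    """Score a candidate group using only its k most-agreeing dimensions.

    group: (group_size, d) array, standardized per dimension beforehand.
    A dimension "agrees" when every member sits on the same side of zero;
    its agreement value is the weakest member's magnitude on that side.
    """
    direction = np.sign(group.mean(axis=0))
    # Negative wherever at least one member points the opposite way.
    agreement = (group * direction).min(axis=0)
    top_k = np.sort(agreement)[-k:]
    return float(top_k.mean())


def best_group(embeddings, group_size=4, k=8):
    """Brute-force the highest-scoring group of word indices (hypothetical helper)."""
    # Standardize each dimension across all words so scales are comparable.
    z = (embeddings - embeddings.mean(axis=0)) / (embeddings.std(axis=0) + 1e-9)
    candidates = itertools.combinations(range(len(z)), group_size)
    return max(candidates, key=lambda idx: subset_score(z[list(idx)], k))
```

Calling `best_group(embeddings, group_size=4, k=8)` returns a tuple of row indices. Whether this does any better than plain clustering on real puzzles is untested; `k` in particular would need tuning, and a fixed `k` is a simplification of the "variable subset" idea.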



