Also, you hear that example over and over again because you can't get other ones to work reliably with Word2Vec; you'd have thought that, if it really worked, you could train a good classifier for color words or nouns or something like that, but in practice you can't.
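Roughly the kind of experiment I mean, sketched with gensim's pretrained GoogleNews vectors and a plain logistic regression (the word lists here are just toy examples, not what I actually tested):

    import gensim.downloader as api
    from sklearn.linear_model import LogisticRegression

    # Pretrained static word vectors (300-d, trained on GoogleNews)
    kv = api.load("word2vec-google-news-300")

    color_words = ["red", "green", "blue", "yellow", "purple", "orange"]
    other_words = ["table", "run", "quickly", "idea", "seven", "river"]

    words = color_words + other_words
    labels = [1] * len(color_words) + [0] * len(other_words)

    # Fit a linear classifier directly on the word vectors
    X = [kv[w] for w in words]
    clf = LogisticRegression(max_iter=1000).fit(X, labels)

    # The point of the complaint above: on a larger held-out word list
    # this kind of classifier does much worse than the analogy demos suggest
    for w in ["pink", "cloud"]:
        print(w, clf.predict([kv[w]])[0])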

Because it couldn't tell the difference between word senses, I think Word2Vec introduced about as many false positives as true positives. BERT was the revolution we needed.
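To make the word-sense point concrete: a contextual model gives the same surface word a different vector in each sentence, which a static word2vec table by construction cannot do. A quick sketch with Hugging Face transformers and bert-base-uncased, using "bank" as the ambiguous word:

    import torch
    from transformers import AutoTokenizer, AutoModel

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    def bank_vector(sentence):
        # Return BERT's contextual vector for the token "bank" in this sentence
        inputs = tok(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state[0]
        tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
        return hidden[tokens.index("bank")]

    v_river = bank_vector("she sat on the bank of the river")
    v_money1 = bank_vector("he deposited the cheque at the bank")
    v_money2 = bank_vector("the bank raised its interest rates")

    cos = torch.nn.functional.cosine_similarity
    # The two financial uses should be closer to each other than to the river use
    print(cos(v_money1, v_money2, dim=0).item(), cos(v_river, v_money1, dim=0).item())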

I use similar embedding models for classification and it is great to see improvements in this space.
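The pipeline is just embed-then-classify; a minimal sketch of what I mean with sentence-transformers and scikit-learn (the model name and the tiny labelled set are just placeholders):

    from sentence_transformers import SentenceTransformer
    from sklearn.linear_model import LogisticRegression

    texts = [
        "refund my order please",        # label 1: billing
        "I was charged twice",           # label 1: billing
        "the app crashes on startup",    # label 0: technical
        "login button does nothing",     # label 0: technical
    ]
    labels = [1, 1, 0, 0]

    # Encode each text into a fixed-size vector, then fit a linear classifier on top
    model = SentenceTransformer("all-MiniLM-L6-v2")
    X = model.encode(texts)
    clf = LogisticRegression().fit(X, labels)

    print(clf.predict(model.encode(["why was my card billed again?"])))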




The other example that worked for me with Word2Vec was Germany + Paris - France = Berlin: https://simonwillison.net/2023/Oct/23/embeddings/#exploring-...
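With gensim's pretrained GoogleNews vectors that one is a one-liner; most_similar does the add/subtract and the cosine nearest-neighbour search (and excludes the query words from the results):

    import gensim.downloader as api

    kv = api.load("word2vec-google-news-300")

    # Germany + Paris - France: add the positives, subtract the negatives,
    # then take the nearest neighbours by cosine similarity
    print(kv.most_similar(positive=["Germany", "Paris"], negative=["France"], topn=3))
    # Berlin should come out on top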


There are a bunch of these things in a word2vec space. I had a blog post years ago on my group's blog that trained word2vec on a bunch of wikias so we could find out who the Han Solo of Doctor Who is (which, somewhat inexplicably I think, was Rory Williams). You have to implement word2vec carefully, and then the similarity search, but there are plenty of vaguely interesting things in there once you do.
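I don't remember the exact setup any more, but the rough shape is: tokenise the wikia text into sentences, train a Word2Vec model, then run the same analogy-style query. A sketch with gensim (the corpus file name is a placeholder, and the scraping/cleaning is where the real work went):

    from gensim.models import Word2Vec
    from gensim.utils import simple_preprocess

    # One sentence per line, pooled from the wikias; multi-word names are assumed
    # to have been pre-joined into single tokens like "han_solo" (e.g. via
    # gensim.models.phrases or a manual replace) before this step
    sentences = [simple_preprocess(line) for line in open("wikia_corpus.txt")]

    model = Word2Vec(sentences, vector_size=200, window=5, min_count=5, workers=4)

    # "Who is the Han Solo of Doctor Who?" as vector arithmetic:
    # han_solo - star_wars + doctor_who, nearest neighbours by cosine
    print(model.wv.most_similar(
        positive=["han_solo", "doctor_who"],
        negative=["star_wars"],
        topn=5,
    ))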


It's a good point about true and false positives though, which makes me wonder if anyone's taken a large database of expected outputs from such "equations" and used it to calculate validation scores for different models in terms of precision and recall.
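gensim can already do a crude version of this: KeyedVectors.evaluate_word_analogies takes a file of a : b :: c : d questions (the classic questions-words.txt set ships with its test data) and reports top-1 accuracy per section. Precision and recall in the sense above would also need a way for the model to abstain, e.g. a similarity threshold; a sketch of that idea (the threshold value is arbitrary):

    import gensim.downloader as api
    from gensim.test.utils import datapath

    kv = api.load("word2vec-google-news-300")

    # Top-1 accuracy on the standard Google analogy question set
    score, sections = kv.evaluate_word_analogies(datapath("questions-words.txt"))
    print("top-1 accuracy:", score)

    # Closer to precision/recall: only count a prediction when the model is
    # confident enough, treat everything below the threshold as an abstention.
    # questions is a list of (a, b, c, expected) tuples, all assumed in-vocabulary.
    def precision_recall(questions, threshold=0.6):
        tp = answered = 0
        for a, b, c, expected in questions:
            word, sim = kv.most_similar(positive=[b, c], negative=[a], topn=1)[0]
            if sim < threshold:
                continue  # abstain
            answered += 1
            if word == expected:
                tp += 1
        precision = tp / answered if answered else 0.0
        recall = tp / len(questions) if questions else 0.0
        return precision, recall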



