Switching to Word2Vec embeddings substantially improved my cosine similarity scores for text similarity, though granted I was measuring actual similarity, not relevance. I tried many different methods and initially got a lot of mediocre results.
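(For reference, the basic setup is cosine similarity over mean-pooled word vectors. A minimal sketch with toy hand-made vectors standing in for trained Word2Vec embeddings, not the repo's actual code:)

```python
import math

# Toy 3-d vectors standing in for trained Word2Vec embeddings;
# in practice these would come from something like gensim's KeyedVectors.
VECS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.9, 0.4],
}

def embed(text):
    """Mean-pool the vectors of the in-vocabulary tokens."""
    words = [w for w in text.lower().split() if w in VECS]
    dim = len(next(iter(VECS.values())))
    return [sum(VECS[w][i] for w in words) / len(words) for i in range(dim)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

print(cosine(embed("cat"), embed("dog")))  # high: related words point the same way
print(cosine(embed("cat"), embed("car")))  # much lower
```

The catch mentioned above applies: high cosine similarity here means "similar", which is not the same thing as "relevant to the query".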
Interesting, do you happen to have some quantitative results on this/additional insights/etc?
I've interpreted transformer vector similarity as "likelihood to be followed by the same thing", which is close to word2vec's "sum of likelihoods of all words to be replaced by the other set" (kinda), but also very different in some contexts.
code: https://github.com/jimmc414/document_intelligence/blob/main/... https://github.com/jimmc414/document_intelligence