
Er, is it possible that you are using `scipy.spatial.distance.cosine` to compute the similarity? If so, note that it computes the cosine distance, not the similarity; the cosine similarity is 1 minus the cosine distance.

I tried out your example using the following code:

  from sentence_transformers import SentenceTransformer
  import scipy.spatial as ssp

  # Embed both strings
  model = SentenceTransformer("all-mpnet-base-v2")
  A = model.encode(['chocolate chip cookies', 'PLS6;YJBXSRF&/'])

  # Note: this is the cosine *distance*, i.e. 1 - cosine similarity
  CosineDistance = ssp.distance.cosine(A[0], A[1])
Where `CosineDistance == 0.953`

This means the model is actually working quite well: the cosine similarity of the two strings is only about 0.047. Were they similar to each other, we'd expect CosineDistance to be much closer to 0.
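
To make the distance/similarity relationship concrete without downloading a model, here is a small sketch on toy vectors. The `cosine_similarity` helper and the vectors `a`, `b`, `c` are my own illustration, not part of either library; only `scipy.spatial.distance.cosine` is the real call under discussion.

```python
import numpy as np
from scipy.spatial import distance

def cosine_similarity(u, v):
    """Hypothetical helper: plain cosine similarity, dot(u, v) / (|u| * |v|)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy vectors standing in for embeddings (made up for illustration)
a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.5])   # points in nearly the same direction as a
c = np.array([-3.0, 0.0, 1.0])  # orthogonal to a: -3 + 0 + 3 = 0

dist_ab = distance.cosine(a, b)     # SciPy returns the cosine distance
sim_ab = cosine_similarity(a, b)    # similarity = 1 - distance

dist_ac = distance.cosine(a, c)
sim_ac = cosine_similarity(a, c)

print(f"a vs b: distance={dist_ab:.4f}, similarity={sim_ab:.4f}")
print(f"a vs c: distance={dist_ac:.4f}, similarity={sim_ac:.4f}")
```

Near-parallel vectors give a distance close to 0, while unrelated (orthogonal) ones give a distance close to 1, which is the situation with the 0.953 above.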

The other comments about such distances being useful for relative comparisons also apply: I've used SentenceTransformers quite successfully for nearest-neighbor searches.
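
For what a nearest-neighbor search over embeddings might look like, here is a minimal brute-force sketch. It is not my actual setup: `nearest_neighbors` is a hypothetical helper, and the 2-D `corpus` and `query` arrays are toy stand-ins for what `model.encode(...)` would return.

```python
import numpy as np

def nearest_neighbors(query, corpus, k=3):
    """Hypothetical helper: return (index, cosine similarity) for the k closest rows."""
    corpus = np.asarray(corpus, dtype=float)
    query = np.asarray(query, dtype=float)
    # Normalize to unit length so a dot product equals cosine similarity
    corpus_unit = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    query_unit = query / np.linalg.norm(query)
    sims = corpus_unit @ query_unit
    # Sort by descending similarity and keep the top k
    order = np.argsort(-sims)[:k]
    return [(int(i), float(sims[i])) for i in order]

# Toy 2-D "embeddings" (made up for illustration)
corpus = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [-1.0, 0.0],
])
query = np.array([1.0, 0.0])

top = nearest_neighbors(query, corpus, k=2)
for idx, sim in top:
    print(f"row {idx}: similarity={sim:.4f}")
```

Only the ranking matters here, which is why the absolute similarity values are less important than how they compare to each other.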



