Semi-related: Anybody know what's the current state of the art for calculating nearest neighbors in high-dimensional spaces (at least 10D, maybe 100s of D)?
Mostly various approaches based on LSH (locality-sensitive hashing) and random projections.
But "the best" really depends on your data profile, as the generic problem formulation, for completely generic vectors, is too… generic. See also the curse of dimensionality [0].
In particular, for something good AND open source:
- Leonid Boytsov's NMSLIB [1]
- Erik Bernhardsson's Annoy [2]
- Facebook's FAISS [3]
For a commercial engine focused more on practical use (index management, transactions, versioning, support), see ScaleText [4] (disclaimer: our product).
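To give a feel for the LSH / random projection idea mentioned above, here is a minimal sketch in plain Python. It is the textbook random-hyperplane scheme for cosine similarity, not how NMSLIB, Annoy or FAISS work internally, and all names in it (`LSHIndex`, `make_hyperplanes`, `signature`, `n_bits`, `n_tables`) are made up for illustration. Each vector is hashed by which side of a few random hyperplanes it falls on; vectors at a small angle tend to share a bucket, so a query only needs to be compared exactly against the vectors in its own buckets.

```python
import random


def make_hyperplanes(dim, n_bits, rng):
    """Draw n_bits random Gaussian hyperplane normals in dim dimensions."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    """Hash a vector to a bit tuple: one bit per hyperplane (which side it is on)."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0.0 else 0
        for plane in planes
    )


def cosine(a, b):
    """Exact cosine similarity, used to re-rank the candidates."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class LSHIndex:
    """Hypothetical toy index: several hash tables, each with its own hyperplanes."""

    def __init__(self, dim, n_bits=8, n_tables=10, seed=0):
        rng = random.Random(seed)
        # Each table is a pair (hyperplanes, {signature: [vector ids]}).
        self.tables = [
            (make_hyperplanes(dim, n_bits, rng), {}) for _ in range(n_tables)
        ]
        self.vectors = []

    def add(self, vec):
        idx = len(self.vectors)
        self.vectors.append(vec)
        for planes, buckets in self.tables:
            buckets.setdefault(signature(vec, planes), []).append(idx)
        return idx

    def query(self, vec, k=1):
        # Collect every vector sharing a bucket with the query in any table,
        # then rank only those candidates by exact cosine similarity.
        candidates = set()
        for planes, buckets in self.tables:
            candidates.update(buckets.get(signature(vec, planes), []))
        ranked = sorted(
            candidates, key=lambda i: cosine(vec, self.vectors[i]), reverse=True
        )
        return ranked[:k]


if __name__ == "__main__":
    rng = random.Random(123)
    dim = 50
    index = LSHIndex(dim)
    for _ in range(1000):
        index.add([rng.gauss(0.0, 1.0) for _ in range(dim)])
    # Query with a slightly perturbed copy of a stored vector.
    target = index.vectors[42]
    noisy = [x + rng.gauss(0.0, 0.01) for x in target]
    print(index.query(noisy, k=3))
```

The result is approximate: a true neighbor that never shares a bucket with the query is missed. More bits per table make buckets smaller (faster, lower recall), more tables raise recall at the cost of memory, and where the right tradeoff sits depends on your data, which is the "depends on your data profile" point above.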