
Anyone have insights on using Elasticsearch as a vector database as opposed to a specialised vector DB? We have a complex keyword search already, and would like to introduce knn query ability to find similar documents. ES offers that in recent versions, but I’m wondering whether that’s a good idea.



> I’m wondering whether that’s a good idea.

That depends... How fast do you need it to be, how many vectors will you be searching across, and how often will the index be updated? If the answers are anything resembling "very, many, and often" then you'll want to at least compare with a DB that's been purpose-built for vector search. I'm talking >10M vectors, <100ms, <hourly updates. If the workload is anything less than that, then just use whatever is most convenient -- which in your case could be Elastic.

(Disclosure: I'm from Pinecone. The above is based on independent testing.)


Do you have any links handy to those test results?


This video goes through some of the testing methodology and techniques: https://youtu.be/7E-eiUN9d6U?si=zSoiGH2QAlQoxxcr


I looked at this a few months ago. Elasticsearch has some limitations on vector length (1024 dimensions, I think), which rules out a lot of the popular off-the-shelf models. A key insight is that the performance of off-the-shelf models isn't great compared to a hand-tuned query and BM25 (the Lucene ranking algorithm). I've seen multiple people make the point that the built-in ranking is pretty hard to beat unless you specialize your models for your use case.

A key consideration behind the vector size limitation is that storing and working with large numbers of high-dimensional vectors gets expensive quickly; even just the raw storage adds up fast.
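To make "adds up fast" concrete, here's the back-of-envelope arithmetic for raw float32 embeddings (illustrative numbers only, ignoring index and replication overhead):

```python
# Back-of-envelope raw storage cost for dense float32 embeddings.
# Ignores HNSW graph overhead, replication, and any compression.

def embedding_storage_gb(num_vectors: int, dims: int, bytes_per_value: int = 4) -> float:
    """Raw bytes for num_vectors embeddings of the given dimensionality, in GB."""
    return num_vectors * dims * bytes_per_value / 1e9

# 10M vectors at 1024 dims, float32: roughly 41 GB before any overhead
print(embedding_storage_gb(10_000_000, 1024))
```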

And of course using knn with huge result sets is very expensive, especially if the vectors are large. Having the ability to filter down the result set with a regular query and then ranking the candidates with a vector query helps keep searches responsive and cost low.
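The filter-then-rank pattern described above maps onto the `knn` section of an Elasticsearch 8.x `_search` request, which accepts a `filter` clause so only matching documents become nearest-neighbour candidates. A sketch (the field names `embedding` and `category` are made up for illustration; adapt them to your own mapping):

```python
# Sketch of an Elasticsearch 8.x _search request body that combines
# kNN vector search with a regular metadata filter. Field names here
# ("embedding", "category") are hypothetical.

def knn_search_body(query_vector, k=10, num_candidates=100):
    return {
        "knn": {
            "field": "embedding",
            "query_vector": query_vector,
            "k": k,
            "num_candidates": num_candidates,
            # The filter is applied during the kNN search itself, so the
            # expensive vector comparison only runs over matching docs.
            "filter": {"term": {"category": "news"}},
        },
        "_source": ["title", "category"],
    }

body = knn_search_body([0.1] * 1024)
```

You'd POST that body to `/<index>/_search` (e.g. via the official client's `search()` method).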

If you are interested in this, you might want to look at Opensearch as well. They implemented vector search independently from Elasticsearch. Their implementation supports a few additional vector storage options and querying options via native libraries. I haven't used any of that extensively but it looks interesting.


I believe the 1024 limit has been raised in recent versions of Elasticsearch:

https://github.com/elastic/elasticsearch/issues/92458


Ah, that’s neat, thank you for the input! We’re actually using a homegrown (word2vec-descendant) model to build document vectors. It works well on its own, but building an efficient search engine on top of it has proven futile.


It's been a minute since I've looked at embeddings in this context but isn't Johnson-Lindenstrauss going to be applicable here so that you can get away with 1024-long (or shorter) vectors?
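For anyone unfamiliar: the Johnson-Lindenstrauss idea is that a random linear projection to a lower dimension approximately preserves pairwise distances with high probability. A minimal numpy sketch (dimensions chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_projection(vectors: np.ndarray, target_dim: int) -> np.ndarray:
    """Project rows of `vectors` to target_dim with a Gaussian random matrix.

    Per Johnson-Lindenstrauss, pairwise distances (and norms) are
    approximately preserved with high probability for suitable target_dim.
    """
    src_dim = vectors.shape[1]
    proj = rng.normal(size=(src_dim, target_dim)) / np.sqrt(target_dim)
    return vectors @ proj

# e.g. squeeze hypothetical 4096-dim embeddings under a 1024-dim limit
x = rng.normal(size=(100, 4096))
y = random_projection(x, 1024)
print(y.shape)  # (100, 1024)
```

For a target of 1024 dims the typical relative distortion is on the order of a few percent, which is often tolerable for approximate nearest-neighbour retrieval.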


Jimmy Lin, a researcher at U Waterloo, recently published "Lucene is All You Need": https://arxiv.org/abs/2308.14963

FWIW, I know Elasticsearch has been investing a lot in this space lately. It's not perfect, but it's bound to get better.


That paper does a terrible job of making Lucene look useful, though. 10qps from a server with 1TB of RAM is not great (and I know Lucene HNSW can perform better than that in the real world, so I am somewhat mystified that this paper is being pushed by the community).



