As others have correctly pointed out, building a vector search or recommendation application requires a lot more than similarity alone. We have seen HNSW become commoditised and the real value lies elsewhere. Just because a database has vector functionality doesn't mean it will actually service anything beyond "hello world" type semantic search applications. IMHO these have questionable value, much like the simple Q&A RAG applications that have proliferated. The elephant in the room with these systems is that if you are relying on machine learning models to produce the vectors, you are going to need to invest heavily in the ML components of the system. Domain-specific models are a must if you want to be a serious contender to an existing search system, and all the usual considerations still apply regarding frequent retraining and monitoring of the models. Currently this is left as an exercise to the reader - and a very large one at that. We (https://github.com/marqo-ai/marqo, I am a co-founder) are investing heavily in making the ML production-worthy and in continuous learning from feedback as part of the system. There is a lot more to think about: how you represent documents with multiple vectors, multimodality, late interaction, the interplay between embedding quality and HNSW graph quality (i.e. recall), and much more.
In general I find they're incredibly good for rapidly building out search engines for things that would normally be difficult to do with plain text search.
The most obvious example is code search where you can describe the function's behavior and get a match. But you could also make a searchable list of recipes that would allow a user to search something like "a hearty beef dish for a cold fall night". Or searching support tickets where full text might not match, "all the cases where users had trouble signing on".
Interestingly, Q & A is ultimately an (imho fairly boring) implementation of this pattern.
The really nice part is that you can implement working demos of these projects in just a few lines of code once you have the vector DB set up. Once you start thinking in terms of semantic search rather than text matching, you realize you can build old-Google-style search engines for basically any text available to you.
One thing that is a bit odd about the space, from what I've experienced and heard, is that setup and performance on most of these products is not all that great. Given that you can implement the demo version of a vector DB in a few lines of numpy, you would hope that investing in a full vector DB product would get you an easily scalable solution.
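For reference, the "few lines of numpy" demo version looks roughly like this - a minimal brute-force sketch, where the array names and sizes are just illustrative and the embeddings would come from whatever model you use:

    import numpy as np

    # stand-in for real document embeddings: one row per document
    corpus_embeddings = np.random.rand(1000, 384).astype(np.float32)

    def search(query_embedding, k=5):
        # cosine similarity of the query against every document, take the top k
        docs = corpus_embeddings / np.linalg.norm(corpus_embeddings, axis=1, keepdims=True)
        q = query_embedding / np.linalg.norm(query_embedding)
        return np.argsort(-(docs @ q))[:k]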
Everyone I talk to who is building some vector db based thing sooner or later realizes they also care about the features of a full-text search engine.
They care about filtering, they care to some degree about direct lexical matches, they care about paging, getting groups / facet counts, etc.
Vectors, IMO, are just one feature that a regular search engine should have. Currently Vespa does the best job of this, though lately it seems the Lucene-based engines (Elasticsearch and OpenSearch) are really working hard to compete.
My company is using vector search with Elasticsearch. It’s working well so far. IMO Elastic will eat most vector-first/only products because of its strength at full-text search, plus all the other stuff it does.
I tend to agree - search, and particularly search-for-humans, is really a team sport - meaning, very rarely do you have a single search algo operating in isolation. You have multiple passes, you filter results through business logic.
Having said that, I think pgvector has a chance for less scale-intense needs - embedding as a column in your existing DB and a join away from your other models is where you want search.
I don’t get why you’d want to bolt RBAC onto these new vector dbs, unless it’s because they’ve caused this problem in the first place…
They have beef with ES since they took the software, made a bunch of cash on it, then never contributed back. ES called them out and it started a feud.
I'd go on ES over Amazon-built software any day. I worked on RDS and I've used RDS at several companies, it's a mess.
Longer story:
One day one of our tables went missing on Aurora; we couldn't figure out why, it was still in the schema, etc. Devops panicked and restarted the instance, and then another table was missing. We ended up creating 10 empty tables and restarting it until it hit one of those.
We contacted RDS support after that, and the conclusion of their 3 month investigation is: "Yeah, it's not supposed to do that."
There are some really smart people working at Amazon; unfortunately the incentive is to push new stuff out and get promoted ASAP. If you can do that better than others and before your house of cards falls, you're safe. If the house of cards crumbles after you're gone, it's someone else's problem.
>Longer story: One day one of our tables went missing on Aurora; we couldn't figure out why, it was still in the schema, etc. Devops panicked and restarted the instance, and then another table was missing. We ended up creating 10 empty tables and restarting it until it hit one of those.
Are there any reports of this? How come this is the first time I've heard of it? How can companies trust these kinds of managed DB services?
We worked with dedicated support on this, but I don't think they had enough knowledge to dig deep into it and just gave up. There is a huge backlog of critical issues at most AWS services. It looks great from the outside in, but the sausage making process is extremely messy.
Amazon forked ElasticSearch into OpenSearch. When deciding which platform to go with (we are an AWS customer) I decided to stick with the company whose future depends on their search product (Elastic), not the one that could lose interest and walk away and suffer almost no consequences (AWS). If OpenSearch is still around in 5 years, and keeping pace with ElasticSearch, then maybe I'd consider it the next time I'm making this choice.
Also there's a lot more to ElasticSearch than full-text search (aggregations, lifecycle management, Kibana). Doesn't seem like Kendra is going to be a replacement for our use case.
Until very recently, “dense retrieval” was not even as good as bm25, and still is not always better.
I think a lot of people use dense retrieval in applications where sparse retrieval is still adequate and much more flexible, because it has the hype behind it. Hybrid approaches also exist and can help balance the strengths and weaknesses of each.
Vectors can also work in other tasks, but largely people seem to be using them for retrieval only, rather than applying them to multiple tasks.
A lot of these things are use-case dependent. Even the characteristics of BM25 vary a lot depending on whether the query is over- or under-specified, the nature of the query, and so on.
I don't think there will ever be a single answer for the best way of doing information retrieval over a search-engine-scale corpus of documents that is superior for every type of query.
More commonly you use approximate KNN vector search with LLM-based embeddings, which can find many fitting documents that bm25 and similar would never manage to.
The tricky part is to properly combine the results.
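One simple way to combine them is reciprocal rank fusion; a minimal sketch, assuming you already have the two ranked lists of document ids (k=60 is just the commonly used constant, not tied to any particular engine):

    def reciprocal_rank_fusion(vector_results, bm25_results, k=60):
        # merge two ranked lists of doc ids into one fused ranking
        scores = {}
        for results in (vector_results, bm25_results):
            for rank, doc_id in enumerate(results, start=1):
                scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
        return sorted(scores, key=scores.get, reverse=True)

    # reciprocal_rank_fusion(["d3", "d1", "d7"], ["d1", "d9", "d3"]) -> ["d1", "d3", "d9", "d7"]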
Vector search is not exclusively in the domain of text search. There is always image/video search.
But pre-filtering is important, since you want to reduce the set of items to be matched on, and it feels like Elasticsearch/OpenSearch are faring better in this regard. Mixed scoring derived from both sparse and dense calculations is also important, which is another strength of ES/OS.
Much more mature and feature-rich than much of the competition listed in the article.
To some degree it's more a platform you can use to efficiently and flexibly build your own more complicated search system, which is both a benefit and a drawback.
some good parts:
- very flexible text search (bm25), more so than Elasticsearch (or at least easier to use/better documented when it comes to advanced features)
- fast, flexible-enough vector search, with good filtering capabilities
- built-in support for defining more complicated search pipelines, including multi-phase search (also known as reranking)
- quite a nice approach for finer control over which kinds of indices are built for which fields
- safety checks when doing schema changes to make sure you don't accidentally break anything, which you can override if you are sure that's what you want
- a ton of control in a cluster over where which search system resources get allocated (e.g. which schemas get stored on which storage clusters, which cluster nodes should act as storage nodes, which should e.g. only do preprocessing or post-processing steps in a search pipeline, and which should e.g. be used for calculating embeddings using some LLM or similar). Not something you need for demos but definitely something you need once your customers have enough data.
- child documents, and document references
- multiple vectors per document
- quite an interesting set of data types for fields and related ways you can use them in a search pipeline
- a flexible, reasonably easy to use system for plugins/extensions (though Java only)
- support for building search pipelines which have sub-searches in external, potentially non-Vespa systems
- really well documented
Though the main benefit *and drawback* is that it's not just a vector database, but a full-fledged search system platform.
generally if you have multiple embeddings for the same document you have two choices:
- create one document for each embedding and make sure non-embedding-specific attributes are the same across all of these document clones -- Vespa makes this more convenient by having child documents
- have a field with multiple vectors, i.e. there are multiple vectors in the HNSW index which point to the same document -- Vespa supports this too. It's what I meant.
Vespa is currently the only vector-search-enabled search system which supports both in a convenient way, but then there are so many "vector databases" popping up every month that I might have missed some.
Check out FeatureBase, when you get a chance. Vectors and super fast operations on sets. I'm using it for managing keyterms extracted from the text and stored along with the vectors.
I'm building a RAG for my personal use: Say I have a lot of notes on various topics I've compiled over the years. They're scattered over a lot of text files (and org nodes). I want to be able to ask questions in a natural language and have the system query my notes and give me an answer.
The approach I'm going for is to store those notes in a vector DB. When I ask my query, a search is performed and, say, the top 5 vectors are sent to GPT for parsing (along with my query). GPT will then come back with an answer.
I can build something like this, but I'm struggling in figuring out metrics for how good my system is. There are many variables (e.g. amount of content in a given vector, amount of overlap amongst vectors, number of vectors to send to GPT, and many more). I'd like to tweak them, but I also want some objective way to compare different setups. Right now all I do is ask a question, look at the answer, and try to subjectively gauge whether I think it did a good job.
Any tips on how people measure the performance/effectiveness for these types of problems?
For small personal projects it's kind of hard to build metrics like this because the volume of indexed content in the database tends to be pretty low. If you're indexing paragraphs you might consistently be able to fit all relevant paragraphs in the context itself.
What I can recommend is to take the coffee tasting approach. Don't try to test and evaluate individual responses; instead, lock the seed used in generation and use the same prompt for two different runs. Change one variable and do a relative comparison of the two outputs. The variables probably worth testing for you, off the top of my head:
* Choice of models and/or tunes
* System prompts
* Temperature of the model against your queries
* Threshold for similarity for document inclusions (you only want relevant documents from your RAG, set it too low and you'll get some extra distractions, too high and useful information might be left out of the context).
If you set up a system to track the comparisons, either automatically or by hand, that just indicates which side of the change worked better for your use case, and test that same change for a bunch of different prompts, you should be able to tally up whether the control or the change was preferred more often.
Keep those data points! The data points are your bench log and can be invaluable later on for anything you do with the system to see what changed in aggregate, what had the most outsized impact, etc and can guide you to build useful tooling for testing or finding existing solutions out there.
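A minimal sketch of what that tracking could look like (all names here are just illustrative; the judge can be you clicking a button or another model):

    from collections import Counter

    def compare_configs(prompts, run_control, run_variant, judge):
        # run_*: prompt -> output (fixed seed, same prompt for both)
        # judge: (a, b) -> "control", "variant" or "tie"
        tally, bench_log = Counter(), []
        for prompt in prompts:
            a, b = run_control(prompt), run_variant(prompt)
            verdict = judge(a, b)
            tally[verdict] += 1
            bench_log.append({"prompt": prompt, "control": a, "variant": b, "verdict": verdict})
        return tally, bench_log  # keep bench_log around: it's the bench log mentioned above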
I use lots and lots of domain-specific test cases at several layers, numbering in the hundreds or thousands. The score is the number of test cases that pass, so it requires a different approach than all-or-nothing tests. The layers depend on your RAG "architecture", but I test the RAG query generation and scoring (comparing ordered lists is the simplest, but I also include a lot of fuzzy comparisons), the LLM scoring the relevance of retrieved snippets before feeding into the final answering prompt, and the final answer. The most annoying part is the prompt to score the final answer, since it tends to come out looking like a CollegeBoard AP test scoring rubric.
This requires a lot of domain-specific work. For example, two of my test cases are "Is it [il]legal to build an atomic bomb" run against the entire USCode [1], so I have a list of sections that are relevant to the question that I've scored before eventually getting an answer of "it is illegal", followed by several prompts that evaluate nuance in the answer ("it's illegal except for…"). I have hundreds of these test cases, approaching a thousand. It's a slog.
[1] 42 U.S.C. 2122 is one of the “right” sections in case anyone is wondering. Another step tests whether 2121 is pulled in based on the mention in 2122
The main thing is that there's no "objective" way, but if you rank and label your own data then you can certainly get a ranking that's subjectively well performing according to you.
RAG in this case is essentially the same as a recommender system so you can approach it with the same metrics you would there.
You'll need to build a data set with known correct answers, but then it's basically standard IR evaluation: NDCG (Normalized Discounted Cumulative Gain) is a good place to start; MRR (Mean Reciprocal Rank) and MAP (Mean Average Precision) are other options. You could also just look at the accuracy of getting your result in the top k results for various thresholds of k (which can be interpreted as the probability of getting your result in the top k).
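For reference, minimal versions of a couple of these, assuming a labelled set of queries where you know which document ids are relevant (all names are just illustrative):

    import math

    def reciprocal_rank(ranked_ids, relevant_ids):
        # 1/rank of the first relevant result, 0 if none found; average over queries for MRR
        for rank, doc_id in enumerate(ranked_ids, start=1):
            if doc_id in relevant_ids:
                return 1.0 / rank
        return 0.0

    def ndcg_at_k(ranked_ids, relevant_ids, k):
        # binary-relevance NDCG@k: discounted gain of hits, normalized by the ideal ordering
        dcg = sum(1.0 / math.log2(rank + 1)
                  for rank, doc_id in enumerate(ranked_ids[:k], start=1)
                  if doc_id in relevant_ids)
        ideal = sum(1.0 / math.log2(rank + 1)
                    for rank in range(1, min(k, len(relevant_ids)) + 1))
        return dcg / ideal if ideal > 0 else 0.0

    def hit_at_k(ranked_ids, relevant_ids, k):
        # average this over your query set to get "probability of a hit in the top k"
        return 1.0 if any(d in relevant_ids for d in ranked_ids[:k]) else 0.0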
Included here is a bit of the old tried and true: NDCG/MRR/Precision @k - what you really want for measuring your information retrieval systems.
But we also talk through a bit of the "new", how to use Evals to generate the building blocks for those metrics above. You will want both hand labels and the automated Evals in the end to evaluate your system.
txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.
Embeddings databases are a union of vector indexes (sparse and dense), graph networks and relational databases. This enables vector search with SQL, topic modeling and retrieval augmented generation.
txtai adopts a local-first approach. A production-ready instance can be run locally within a single Python instance. It can also scale out when needed.
txtai can use Faiss, Hnswlib or Annoy as its vector index backend. This is relevant in terms of the ANN-Benchmarks scores.
Hence why I’d be interested to know more about the supporting details for the different categories. It may help uncover some inadvertent errors in the analysis, but also would just serve as a useful jumping-off point for people doing their own research as well.
Totally agree that the rubric is a puzzling assortment. PostgreSQL supports role-based access control (RBAC). Not to mention, with PostgreSQL and the pgvector extension, you have a whole list of languages ready to use it:
C++ pgvector-cpp
C# pgvector-dotnet
Crystal pgvector-crystal
Dart pgvector-dart
Elixir pgvector-elixir
Go pgvector-go
Haskell pgvector-haskell
Java, Scala pgvector-java
Julia pgvector-julia
Lua pgvector-lua
Node.js pgvector-node
Perl pgvector-perl
PHP pgvector-php
Python pgvector-python
R pgvector-r
Ruby pgvector-ruby, Neighbor
Rust pgvector-rust
Swift pgvector-swift
Wonder how many of those other Vector databases play nice.
That stood out to me as well. I've been playing with pgvector, and there's no reason you can't use row/table role-based security.
I think there's an unmentioned benefit to using something like pgvector also. You don't need a separate relational database! In fact you can have foreign keys to your vectors/embeddings which is super powerful to me.
Same for Developer experience. If you used Postgres or any other relational db (which I think covers a large % of devs), you could easily argue the dev experience is 3/3 for pgvector.
Not only 3/3 but also includes full text search built in. Tables look like:
    CREATE TABLE mything_embedding (
        id bigserial PRIMARY KEY,
        mything_id integer REFERENCES mything (id),  -- fkey to mything table
        embedding vector(1536),
        fulltext tsvector
    );
    CREATE INDEX ON mything_embedding USING gin (fulltext);
    CREATE INDEX ON mything_embedding USING hnsw (embedding vector_cosine_ops);
Then you can pull results that match either the tsvector AND/OR the similarity with a single query, and it's pretty performant. You can also choose at the query level whether you want exact matching or fuzzy.
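A rough sketch of such a query from Python, assuming the table above and plain psycopg2 (the 0.5 distance cutoff and ordering purely by vector distance are arbitrary choices, not a recommendation):

    import psycopg2

    conn = psycopg2.connect("dbname=mydb")  # assumed connection details
    emb = [0.01] * 1536                     # query embedding from your model
    emb_literal = "[" + ",".join(str(x) for x in emb) + "]"  # pgvector's text format

    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT mything_id,
                   ts_rank(fulltext, plainto_tsquery('english', %(q)s)) AS text_rank,
                   embedding <=> %(emb)s::vector AS cos_dist
            FROM mything_embedding
            WHERE fulltext @@ plainto_tsquery('english', %(q)s)
               OR embedding <=> %(emb)s::vector < 0.5
            ORDER BY embedding <=> %(emb)s::vector
            LIMIT 20
            """,
            {"q": "hearty beef dish", "emb": emb_literal},
        )
        rows = cur.fetchall()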
I made this table to compare vector databases in order to help me choose the best one for a new project. I spent quite a few hours on it, so I wanted to share it here too in hopes it might help others as well. My main criteria when choosing a vector DB were speed, scalability, DX, community and price. You'll find all of the comparison parameters in the article.
Happy to connect. The benchmark numbers are mostly from ANN Benchmarks. For my use case, the nytimes-256 dataset was most relevant so I used that for the QPS benchmark. I also took a look at the benchmarks you've made at https://qdrant.tech/benchmarks/ and there qdrant seems to be outperforming many others. If I've gotten something wrong here, I'm glad to update the article :)
I'd love to know how vector databases compare in their ability to do hybrid queries, vector similarity filtered by metadata values. For example, find the 100 items with the closest cosine similarity where genre = jazz and publication date between 1990 and 2000.
Can the vector index operate on a subset of records? Or when searching for 100 closest matches does the database have to find 1000 matches and then apply the metadata filter, and hope that doesn't reduce the result set down to zero and exclude relevant vectors?
It seems like measuring precision and recall for hybrid queries would be illuminating.
I can't speak to the others, but pgvector indices can "break" hybrid queries. For example, if you select using a where clause specifying metadata (where genre = jazz) and order by distance from a vector (embedding of sound clip); if the index doesn't have a lot (or any) vectors in the sphere of the query vector that also match the metadata it can return no results. I discuss this in a blog post here [1].
Curious about the lack of Vespa, especially given the thoroughness of the article and its long-time reputation. OpenSearch is also missing, but perhaps it can be considered lumped in with Elasticsearch since both are based on Lucene. The products are starting to diverge, though, so it would be nice to see, especially since it is open source.
For the performance-based columns, it would also be helpful to see which versions were tested. There is so much attention on vector databases lately that they are all making great strides forward. The Lucene updates are notable.
What advantage are vector databases providing above using an index in conjunction with a mature database? I’m not sold on this as a separate technology.
Vector search is useful, but I don’t understand why I would go out of my way when I could implement FAISS or HNSWlib as an adjunct to postgres or a document store.
Vector extensions to your current database or search engine makes far more sense than adding yet another dependency to manage and operate. The vector database folks will have to become a real database or full featured search engine to survive and compete with the incumbents that will all have good solutions for vector similarity search.
The thing is, if you need a vector _database_ there is no reason why it can't be a pg extension. And if your project is only small scale there is probably some HNSW pg extension library you could use.
But what is most often needed instead of a vector database is an efficient, fast, responsive approximate-KNN vector search system with fast attribute filtering, which overlaps with a fast and efficient text search system (e.g. bm25 based).
And if you then go to billion vector scale things become tricky performance wise.
And then you reach the same point at which companies do things like taking a warehouse approach, where you have a read-only, extremely read-optimized, mostly in-memory variant of the db that is accessed for searches only, and changes from the main db are streamed to the read-only search instance, potentially while losing snapshot views, transactions and similar.
You could say that approximate KNN vector search is the new must-have feature for unstructured fuzzy text search, and while you can have unstructured fuzzy text search in pg, it's also often not the go-to solution if your database exists just to provide that search.
Because any production use case I'm aware of sooner or later uses both kinds of search and combines the results.
E.g. vector search is fundamentally terrible at finding keywords, but keyword search is fundamentally terrible at finding equivalent things which use slightly different words.
Strongly disagree with PGVector's DX being worse than Chroma. Installing, configuring, and working with Chroma was infuriating -- it's alpha software and has the bugs and rough edges to prove it. The tools to support and interface with postgres are battle-tested and so much nicer by comparison; getting Chroma working took over a week, ripping it out and replacing with PGVector took a couple hours.
Also agree with this[0] article that vector search is only one type of search, and even for RAG isn't necessarily the one you want to start with.
Yeah, I had a similar experience with Chroma DB. On paper, it checked all my boxes. But yea, it's alpha software with the first non-prerelease version only coming out in July 2023 (so it's 3 months old).
I ran into some dumb issues during install like the SQLite version being incorrect, and there wasn't much guidance on how to fix these problems, so gave up after struggling for a few hours. Switched to PGVector which was much simpler to setup. I hope Chroma DB improves, but I wouldn't recommend it for now.
Thanks for your input, I've only tried Chroma a little bit so far and had a pretty good experience. What they also have going for them is a big community on discord that can be helpful.
Gonna add some information here since this isn't very descriptive.
milvus-lite is a bit like SQLite in that it runs in-process. Here are some scenarios you'd want to use it in:
- You want to use Milvus directly without having to install it using Milvus Operator, Helm, Docker Compose, etc.
- You do not want to launch any virtual machines or containers while you are using Milvus.
- You want to embed Milvus features in your Python applications.
I quickly took a look at the RediSearch ANN Benchmarks and they seem to stack up well against the others (more or less the same level as Milvus) when it comes to QPS and latency.
I'm currently in the market for a self-hosted DB for a personal project. The project is an app you can run on your own system that provides QA on your text files. So I'm looking for something lightweight, but I'm also looking for the best possible search, and ANN retrieval is just a single part of that.
Their definition of Hybrid Search is, I think, wrong.
Though these terms tend not to be consistently defined at all, so "wrong" is maybe the wrong word.
Their definition seems to be about filtering results during (approximate) KNN vector search.
But that is filtering, not hybrid search. Though it might sometimes be implemented as a form of hybrid search, that's an internal implementation detail, and you should probably hope it's not implemented that way.
Hybrid search is when you do both a vector search and a more classical text based search (e.g. bm25) and combine both results in a reasonable way.
The way you explain hybrid search aligns with my understanding. Pinecone has a good article about it here https://www.pinecone.io/learn/hybrid-search-intro/. From my understanding, all vector DBs support this.
This is interesting because it does not mention the vector database powered by Apache Cassandra or the hosted serverless version, DataStax Astra. Here is a write-up we did on 5 hard problems in vector search and how we solved them. https://thenewstack.io/5-hard-problems-in-vector-search-and-...
In full transparency: I work for DataStax and lead engineering for the vector database.
I don't think we need specialized databases for vectors. Relational databases can easily be extended with vector data types and operations. They will eventually catch up by supporting what was once a unique feature of the new systems: https://medium.com/@magda7817/two-things-to-keep-in-mind-bef...
Yeah, this is my sense too. They will be slower to add these new requirements but they should be able to add these vector capabilities within a year or so. It's then a question of ability of smaller vector db companies to mature and add regular db capabilities, while innovating.
Agreed on pgvector being simple and a great choice for POCs and low scale, especially if you're familiar with Postgres. Our team released something new last week built for folks looking to use PostgreSQL at scale as a vector store [0], featuring a DiskANN index type.
Quick question regarding the scalability and support of multiple vector databases under a single cloud service. Suppose an enterprise Saas product served multiple customers with each requiring a unique RAG vector knowledge-base for product and company info. Do any of these solutions allow for a large number (dozens or hundreds) of small distinct Knowledge bases? Do any offer easily integrated automated pipelines for documents to be parsed and ingested?
Postgres with PGVector is the best database, plus vectors.
All of the "Vector DBs" suffer horribly when trying to do basic things.
Want to include any field that matches a field in an array of keys? Easy in SQL. Requires an entire song and dance in Pinecone or Weaviate.
After implementing Chroma, Weaviate, Pinecone, Sqlite with HNSW indices and Qdrant-- I'm not impressed. Postgres is measurably faster since so much relies on pre-filtering, joins, etc.
Strongly disagree about the Pinecone developer experience. Not that they don't have SDKs, but last I checked they didn't have documentation on how to approach local dev environments.
The implication being that you spin up a separate index for $70/mo, and then you have to upsert any relevant data yourself. Sure that's not difficult, but why do you have to do it at all? Why doesn't Pinecone make it easy to replicate data to another index for use in dev/staging?
You might like the 'Which Search Engine?' panel I ran at Buzzwords earlier this year with some of the leading contenders (Vespa, Qdrant, Elastic, Solr, Weaviate) https://www.youtube.com/watch?v=iI40L4wMtyI - vector search was obviously part of the discussion
20M vectors @ 768 dims is about 62GB at 32-bit float, not even quantized. AWS RDS will put it at 83 USD/month (db.t4g.small, 2 vCPU, 2GB RAM). But that's without egress, backups, etc.
Seems acceptable at least for a POC?
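For what it's worth, the back-of-the-envelope arithmetic behind that 62GB figure (raw float32 vectors only, no index overhead):

    n_vectors, dims, bytes_per_float32 = 20_000_000, 768, 4
    raw_bytes = n_vectors * dims * bytes_per_float32
    print(raw_bytes / 1e9)  # ~61.4 GB, so "about 62GB" before any HNSW/graph overhead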
A better option if you already have the data in the same instance, but the developer experience being rated low scares me. Anyone tried it? How did it go?
I'm interested to try some of these others next time around, but I've used qdrant self-hosted in two projects and been pleased. Milvus was recommended so I gave that a try but found it over complicated. Pgvector seems like an obvious choice if you are already using postgres and if that performance is ok.
It was a while ago now so the details have faded, but for one all of the docker services it had to spin up vs the single container that qdrant runs. I'm sure there is a reason for this, but I haven't needed it.
Latency from embedding models is still going to be the performance bottleneck no matter how fast the DB is. Plus adding all the overhead of synthesising answers and summaries from an LLM is going to weigh you down.
If you are building a search engine or a QA bot, the embedding of the query still needs to be calculated. The results do depend on the quality of the model, and if you are using a large one it does take time.
We conducted benchmark tests on Elastic's queries per second (QPS) performance using datasets of 500,000 and 1 million vectors. The result was that Zilliz is 13x and 22x faster, respectively. https://zilliz.com/blog/elasticsearch-cloud-vs-zilliz
We also conducted a benchmark comparing Pgvector to both Milvus (open source) and Zilliz (managed, with a free tier option). When running the OSS Milvus on 2 CPUs and 8 GiB memory, Pgvector was found to be 5 times slower. You can check out the detailed performance charts at the bottom of this blog post:
https://zilliz.com/blog/getting-started-pgvector-guide-devel...
Feel free to explore our open-source benchmarking tool, which allows you to examine our methodology and even compare it with your vector database. https://github.com/zilliztech/VectorDBBench
Yeah, that's the difference we've seen according to the QPS for the ANN Benchmarks. The same story seems to be true for other datasets too. We're looking at a 0.9 recall.
Many of them are open source and you can host them yourself. That would make it more cost effective. Also someone mentioned https://turbopuffer.com/. That seems like a good alternative if you're looking for something economical.
Somehow I felt that at least part of the article was generated by an LLM. It's unfortunate to see that a new bias has started to creep in: whatever I read now, I second-guess and wonder whether it may be partially or fully generated by LLMs.