If you are thinking of SOTA as just recall, that's a bit different from thinking about the other performance considerations that come up with real-world datasets.
For example, IVF generally requires that you process the entire dataset in order to build the index, so it doesn't become searchable for some time.
HNSW and DiskANN can be built incrementally so you can start searching them immediately.
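To make that difference concrete, here's a rough stdlib-only sketch. Everything in it is made up for illustration: `ToyIVF` and `ToyGraphIndex` are hypothetical names, the graph is a simple single-layer neighbor graph and not real HNSW or DiskANN, and the insert path cheats with a brute-force scan. It only shows the lifecycle difference: the IVF-style index refuses to search until its centroids have been trained on the whole dataset, while the graph-style index answers queries after the first insert.

```python
import math
import random


class ToyIVF:
    """Batch-built index: centroids are trained on the full dataset first."""

    def __init__(self, n_lists, seed=0):
        self.n_lists = n_lists
        self.rng = random.Random(seed)
        self.centroids = None  # stays None until build() has run
        self.lists = None

    @staticmethod
    def _nearest(centroids, v):
        return min(range(len(centroids)), key=lambda i: math.dist(centroids[i], v))

    def build(self, vectors, iters=5):
        # A few rounds of plain k-means over *all* vectors.
        centroids = self.rng.sample(vectors, self.n_lists)
        for _ in range(iters):
            buckets = [[] for _ in centroids]
            for v in vectors:
                buckets[self._nearest(centroids, v)].append(v)
            centroids = [
                tuple(sum(xs) / len(b) for xs in zip(*b)) if b else c
                for b, c in zip(buckets, centroids)
            ]
        # Final assignment of every vector to its closest centroid's list.
        self.lists = [[] for _ in centroids]
        for v in vectors:
            self.lists[self._nearest(centroids, v)].append(v)
        self.centroids = centroids

    def search(self, q, n_probe=1):
        if self.centroids is None:
            raise RuntimeError("IVF index is not searchable until build() finishes")
        # Scan only the n_probe lists whose centroids are closest to the query.
        order = sorted(range(len(self.centroids)),
                       key=lambda i: math.dist(self.centroids[i], q))
        candidates = [v for i in order[:n_probe] for v in self.lists[i]]
        return min(candidates, key=lambda v: math.dist(v, q), default=None)


class ToyGraphIndex:
    """Incrementally built neighbor graph: searchable after every add()."""

    def __init__(self, m=4):
        self.m = m            # links created per inserted node
        self.vectors = []
        self.neighbors = []   # neighbors[i] is a set of node ids

    def add(self, v):
        idx = len(self.vectors)
        # Toy shortcut: brute-force scan for the m closest existing nodes.
        # Real graph indexes find them with a graph search instead.
        nearest = sorted(range(idx),
                         key=lambda i: math.dist(self.vectors[i], v))[:self.m]
        self.vectors.append(v)
        self.neighbors.append(set(nearest))
        for i in nearest:
            self.neighbors[i].add(idx)  # keep links bidirectional

    def search(self, q):
        if not self.vectors:
            return None
        current = 0  # always enter at the first inserted node
        while True:
            best = min(self.neighbors[current],
                       key=lambda i: math.dist(self.vectors[i], q),
                       default=current)
            # Stop once no neighbor is strictly closer (approximate result).
            if math.dist(self.vectors[best], q) >= math.dist(self.vectors[current], q):
                return self.vectors[current]
            current = best
```

With this sketch, calling `ToyGraphIndex().add(v)` in a loop lets you call `search` between inserts, whereas `ToyIVF.search` raises until `build(data)` has seen every vector. The greedy walk can stop at a local minimum, so treat its answer as approximate.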
Here's a short blog post that talks about some hard problems in vector search and how to solve them (link at the end).
As many have mentioned, DiskANN often outperforms HNSW, but HNSW seems to be the standard that most vector DBs are going with.
https://thenewstack.io/5-hard-problems-in-vector-search-and-...