Amusingly, vector search is conceptually nearly as old as keyword search. Most o...

PaulHoule · on Sept 5, 2023

I was interested in dense vector search in 2004 and a bit depressed at how the indexing algorithms weren’t that good. Around 2013 I was involved with an actual product, a search engine for patents.

The thing about vector search today is that it is an absolute gold rush. Pinecone had the right idea but the wrong product (sorry, SaaS is toxic); many of the latecomers are… late, and will find that the whole startup and VC process costs precious months when customers want to build products right now.

Note that for that patent search engine we had quite a different idea than is fashionable today: like very old TREC, we cared about the quality of the results you got 1000 or 2000 results in, on the assumption that a patent researcher wanted to be comprehensive. When I was first into IR it was in the context of “how is Google so much better than other search engines?” and one part of the answer was that, based on the way search engines were being evaluated at TREC, Google was not a better search engine, because it was so focused on the first page.

It has come full circle now: I have talked to people recently who complain that current TREC evaluations are too focused on the first few results.

thfuran · on Sept 5, 2023

I assume we're not talking about the Texas Real Estate Commission?

marginalia_nu · on Sept 5, 2023

https://trec.nist.gov/