I wouldn't mind if the mail client stored local embeddings of my email contents ...

DaiPlusPlus · on May 10, 2023

What practical benefit does that offer over existing (synonym-aware) keyword and phrase search approaches? The corpus of one’s mailbox is too small a dataset to draw conclusions from, surely?

Not to mention being far slower to query.

yazaddaruvala · on May 10, 2023

The really difficult parts of keyword search are tokenization, normalization, and synonym selection, and especially keeping them up to date. The users of search never know about any of this, but as the developer you need to keep these things top of mind.
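To make those three pieces concrete, here is a minimal sketch of what a keyword search has to do before it can match anything. The synonym table and the function names are made up for illustration; a real system would need a much larger table and someone to maintain it.

```python
import re
import unicodedata

# Hypothetical hand-maintained synonym table; keeping it current is the hard part.
SYNONYMS = {
    "invoice": {"bill", "receipt"},
    "meeting": {"call", "sync"},
}


def normalize(text: str) -> str:
    # Decompose accented characters, drop the combining marks, then lowercase,
    # so that "Café" and "cafe" end up identical.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def tokenize(text: str) -> list[str]:
    # Split the normalized text into runs of ASCII letters and digits.
    return re.findall(r"[a-z0-9]+", normalize(text))


def expand_query(query: str) -> set[str]:
    # Add every known synonym of each query term.
    terms = set(tokenize(query))
    for term in list(terms):
        terms |= SYNONYMS.get(term, set())
    return terms


def matches(query: str, document: str) -> bool:
    # A document matches if it shares at least one term with the expanded query.
    return bool(expand_query(query) & set(tokenize(document)))
```

With this table, `matches("invoice", "Your bill is attached")` is true only because someone thought to list "bill" under "invoice"; any synonym nobody added is silently missed.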

Using embeddings basically lets the AI configure those things for you, and the configuration updates automatically when the AI updates.

You could also use the embeddings for far more advanced things, as LLMs do, but the basic version that is just “better keyword search” is also valuable.

> Not to mention being far slower to query.

KNN on the embeddings is not obviously slower to query. In production using AWS ElasticSearch, for a very large search index, my team saw no meaningful change in latency when using embeddings instead.

DaiPlusPlus · on May 10, 2023

> using AWS ElasticSearch

If this is for my personal e-mail, then I'd only use it if it could run locally - which also means unobtrusively in the background, without slowing down my computer - is that feasible?

yazaddaruvala · on May 10, 2023

I was just giving an example; you could just as easily run a KNN implementation locally.
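As a rough sketch of what "locally" could look like, here is a brute-force KNN over stored vectors using only the Python standard library. The message IDs and the tiny 3-dimensional vectors are made up; in practice the vectors would come from whatever embedding model the mail client uses, and the function names here are my own.

```python
import heapq
import math


def cosine_similarity(a, b):
    # Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn(query, index, k=3):
    # Score every stored vector against the query and keep the k most similar.
    # Returns (similarity, message_id) pairs, best match first.
    scored = ((cosine_similarity(query, vec), msg_id) for msg_id, vec in index.items())
    return heapq.nlargest(k, scored)


# Toy vectors standing in for real embeddings (hypothetical data).
index = {
    "msg-invoice": [0.9, 0.1, 0.0],
    "msg-receipt": [0.8, 0.2, 0.1],
    "msg-lunch": [0.0, 0.1, 0.9],
}

for score, msg_id in knn([1.0, 0.0, 0.0], index, k=2):
    print(f"{msg_id}: {score:.3f}")
```

This scans every vector on each query, so the cost grows linearly with mailbox size. Whether that is unobtrusive enough depends on the mailbox and the machine; a vectorized or approximate nearest-neighbor index would be the usual next step if it is not.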

spopejoy · on May 10, 2023

I wouldn't mind if they just fixed their global search, it's pretty terrible. I say this as a devoted longtime user who will never switch.