What practical benefit does that offer over existing (synonym-aware) keyword and phrase search approaches? The corpus of one’s mailbox is too small a dataset to draw conclusions from, surely?
The really difficult parts of keyword search are tokenization, normalization, and synonym selection, and especially keeping them up to date. The users of search never know about any of this, but as the developer you need to keep these things top of mind.
Using embeddings basically lets the AI configure those things for you, and that configuration updates automatically whenever the AI is updated.
You could also use the embeddings for far more advanced things, as LLMs do, but the basic version that is just “better keyword search” is also valuable.
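To make the maintenance burden concrete, here is a minimal sketch of the three pieces a keyword search needs someone to own. Everything in it is a made-up illustration: the `SYNONYMS` table, the crude plural stripping (standing in for a real stemmer), and the `matches` helper are assumptions, not anyone's production setup.

```python
import re

# Hypothetical hand-maintained synonym table: each term maps to a canonical form.
# Someone has to notice when a new term is missing and add it here.
SYNONYMS = {"invoice": "bill", "receipt": "bill", "mail": "email", "e-mail": "email"}


def normalize(token):
    """Lowercase, strip a trailing plural 's', then apply the synonym table."""
    token = token.lower()
    # Crude plural stripping, standing in for a real stemmer.
    if token.endswith("s") and len(token) > 3:
        token = token[:-1]
    return SYNONYMS.get(token, token)


def tokenize(text):
    """Split on anything that is not a letter, digit, or hyphen, then normalize."""
    return [normalize(t) for t in re.findall(r"[A-Za-z0-9-]+", text)]


def matches(query, document):
    """True if every normalized query term appears in the document."""
    doc_terms = set(tokenize(document))
    return all(term in doc_terms for term in tokenize(query))


print(matches("invoices", "Your receipt is attached"))  # found via the synonym table
print(matches("refund", "Your receipt is attached"))    # no rule covers this, so no match
```

The second query fails only because nobody wrote a rule for it, which is the part embeddings take off your plate.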
> Not to mention being far slower to query.
KNN on the embeddings is not obviously slower to query. In production using AWS ElasticSearch, for a very large search index, my team saw no meaningful change in latency when using embeddings instead.
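For anyone unfamiliar with what the KNN query actually does, here is a rough sketch using a brute-force scan with cosine similarity. The three-dimensional vectors and document IDs are invented for illustration and do not come from any real embedding model. A production engine would typically use an approximate index rather than scanning every vector, so this shows the idea, not the performance.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn(query_vec, index, k=2):
    """Return the IDs of the k documents most similar to query_vec."""
    scored = ((cosine(query_vec, vec), doc_id) for doc_id, vec in index.items())
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]


# Toy vectors, made up for this example (not real model output).
INDEX = {
    "invoice-march": [0.9, 0.1, 0.0],
    "receipt-april": [0.8, 0.2, 0.1],
    "party-invite": [0.0, 0.1, 0.9],
}

print(knn([1.0, 0.0, 0.0], INDEX, k=2))
```

Note there is no tokenizer or synonym table anywhere in the query path: the two billing-related documents rank together purely because their vectors are close.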
If this is for my personal e-mail, then I'd only use it if it could run locally, which also means unobtrusively in the background, without slowing down my computer. Is that feasible?