> practitioners and vendors alike are overly focused on "just put embeddings somewhere and do cosine similarity" and that's the only problem to solve

I agree. As someone who does exactly and only this on the search side, I can say it falls flat on its face if you don't think a little more about the data and tasks involved.

I wrote about it here[0], but the gist for our use case is that if we don't intentionally include what may be considered "less relevant" data, we stand a good chance of failing our main generative task.
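To make that concrete, here is a minimal sketch of one way "intentionally include less relevant data" could look in a retrieval step. It is an illustration under assumptions, not the approach from the linked post: the `select_context` helper, its `top_k` and `extra` parameters, and the even-spacing rule for picking lower-ranked items are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_context(query_vec, items, top_k=3, extra=2):
    """Pick context for a generative task (hypothetical helper).

    items is a list of (name, vector) pairs. Returns the names of the
    top_k most similar items, followed by up to `extra` items sampled at
    even spacing from the lower-ranked remainder.
    """
    ranked = sorted(items, key=lambda it: cosine(query_vec, it[1]), reverse=True)
    head, tail = ranked[:top_k], ranked[top_k:]
    if extra <= 0 or not tail:
        picks = []
    else:
        # Assumption: spread the extra picks across the tail instead of
        # just taking the next-best matches.
        step = max(1, len(tail) // extra)
        picks = tail[::step][:extra]
    return [name for name, _ in head + picks]


items = [
    ("a", [1.0, 0.0]),
    ("b", [0.9, 0.1]),
    ("c", [0.8, 0.2]),
    ("d", [0.5, 0.5]),
    ("e", [0.2, 0.8]),
    ("f", [0.0, 1.0]),
    ("g", [-1.0, 0.0]),
]
print(select_context([1.0, 0.0], items, top_k=3, extra=2))
```

Pure top-k would return only `a`, `b`, `c` here; the extra slots are what let lower-similarity items reach the prompt. Which items actually deserve those slots depends on the data and the task, which is the part cosine similarity alone won't tell you.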

[0]: https://phillipcarter.dev/2024/01/15/three-properties-of-dat...