> practitioners and vendors alike are overly focused on "just put embeddings somewhere and do cosine similarity" and that's the only problem to solve
I agree, and as someone who does exactly and only this on the search side, I'll add that it falls flat on its face if you don't think a little more about the data and tasks involved.
I wrote about it here[0], but the gist for our use case is that if we don't intentionally include what may be considered "less relevant" data, then we stand a good chance of failing our main generative task.
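To make that concrete, here's a rough sketch of the general idea, not our actual implementation. Everything in it is an assumption for illustration: the `kind` field, the `retrieve` helper, and the policy of topping up the results with lower-ranked items from kinds the top hits don't cover.

```python
import math

def cosine(a, b):
    # Plain cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, docs, top_k=2, extra=1):
    # Hypothetical policy: take the top_k nearest docs, then add up to
    # `extra` lower-ranked docs whose "kind" is not yet represented.
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    picked = ranked[:top_k]
    seen = {d["kind"] for d in picked}
    for d in ranked[top_k:]:
        if len(picked) >= top_k + extra:
            break
        if d["kind"] not in seen:
            picked.append(d)
            seen.add(d["kind"])
    return picked

# Toy data: made-up 2-d "embeddings", purely illustrative.
docs = [
    {"id": "a", "kind": "x", "vec": [1.0, 0.0]},
    {"id": "b", "kind": "x", "vec": [0.9, 0.1]},
    {"id": "c", "kind": "y", "vec": [0.5, 0.5]},
    {"id": "d", "kind": "y", "vec": [0.0, 1.0]},
]
result_ids = [d["id"] for d in retrieve([1.0, 0.0], docs)]
```

Pure top-k by similarity would return only the two `x` docs here; the extra slot is what pulls in something the generative step might still need.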
[0]: https://phillipcarter.dev/2024/01/15/three-properties-of-dat...