
> practitioners and vendors alike are overly focused on "just put embeddings somewhere and do cosine similarity" and that's the only problem to solve

I agree. As someone who does exactly this, and only this, on the search side, I can say the approach falls flat on its face if you don't think a little harder about the data and tasks involved.

I wrote about it here[0]. The gist for our use case: if we don't intentionally include data that might be considered "less relevant", we stand a good chance of failing our main generative task.

[0]: https://phillipcarter.dev/2024/01/15/three-properties-of-dat...



