> 1. Fine-tuning OSS embedding models on your real-world query patterns
This is not as easy as you make it sound :)
Typically, the embeddings are multi-modal: the query string maps to a relevant document that I want to add as context to my prompt.
If I collect lots of new query strings, I need to know the ground-truth "relevant document" each one maps to. Then I can use a two-tower embedding model to learn the "correct" document/context for a query.
I have thought about this problem for LLMs that do function calling. What you can do is collect query strings and their function-calling results, then ask GPT-4: "is this a 'good' answer?". GPT-4 can act as a teacher model, generating training data for the two-tower embedding model.
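To make the two-tower idea concrete, here is a minimal sketch in NumPy. All names and dimensions are assumptions for illustration: each "tower" is reduced to a single linear projection over some frozen base embeddings, and the loss is an in-batch InfoNCE-style contrastive loss where each query's paired document is the positive and the rest of the batch are negatives.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: frozen base embeddings -> shared 64-d space.
D_IN, D_OUT = 128, 64

# Each "tower" is sketched as a single linear projection; in practice these
# would be full encoder networks fine-tuned on (query, document) pairs.
W_query = rng.normal(0, 0.05, (D_IN, D_OUT))
W_doc = rng.normal(0, 0.05, (D_IN, D_OUT))

def encode(x, W):
    z = x @ W
    return z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-normalize

def in_batch_contrastive_loss(queries, docs, temperature=0.05):
    """InfoNCE-style loss: each query's positive is its paired document;
    every other document in the batch serves as a negative."""
    q = encode(queries, W_query)
    d = encode(docs, W_doc)
    logits = (q @ d.T) / temperature             # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # positives on the diagonal

# Toy batch of base embeddings standing in for (query, relevant-doc) pairs.
batch_q = rng.normal(size=(8, D_IN))
batch_d = rng.normal(size=(8, D_IN))
print(in_batch_contrastive_loss(batch_q, batch_d))
```

The point of the two towers is that queries and documents get separate encoders mapped into one shared space, so documents can be embedded offline and queries matched against them at serving time.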
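The teacher-model labeling loop can be sketched like this. Everything here is a hypothetical stand-in: `JUDGE_PROMPT`, `judge_with_teacher`, and the `llm` callable are illustrative names, and a real implementation would send the prompt to GPT-4 via the chat API rather than the fake judge used in the demo.

```python
# Hypothetical stand-in for a GPT-4 call; a real implementation would send
# this prompt to the chat completions API and parse the model's reply.
JUDGE_PROMPT = (
    "Query: {query}\n"
    "Function-call result: {result}\n"
    "Is this a good answer for the query? Reply with only YES or NO."
)

def judge_with_teacher(query, result, llm):
    """Ask the teacher model whether a logged (query, result) pair is a
    good match; returns a binary label for two-tower training data."""
    reply = llm(JUDGE_PROMPT.format(query=query, result=result))
    return reply.strip().upper().startswith("YES")

def build_training_pairs(logged_calls, llm):
    # Keep only pairs the teacher judges as good; these become the positive
    # (query, document/function-result) pairs for fine-tuning the towers.
    return [(q, r) for q, r in logged_calls if judge_with_teacher(q, r, llm)]

# Demo with a fake teacher that approves only weather-function results.
fake_llm = lambda prompt: "YES" if "get_weather" in prompt else "NO"
logs = [("What's the weather in Oslo?", "get_weather(city='Oslo') -> 4C, rain"),
        ("What's the weather in Oslo?", "get_stock_price(ticker='OSLO')")]
print(build_training_pairs(logs, fake_llm))
```

One caveat worth noting: the teacher only yields positives this way, so the negatives still have to come from somewhere, e.g. the in-batch negatives during contrastive training, or the pairs the teacher rejected.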
Reference: https://www.hopsworks.ai/dictionary/two-tower-embedding-mode...