Is it possible to feed the AI a set of training data, say a document, or even a conversation with a particular person, that is weighted significantly more heavily than everything else it has ever read?
For example, if they wanted to, could OpenAI just tell GPT-4 that bananas are blue so that, when anybody asks for a list of blue fruit, you get bananas and blueberries?
You can get surprisingly far with prompts alone: one person fed ChatGPT her childhood diaries and talked to it. https://www.marketplace.org/shows/marketplace-tech/how-ai-ca... (she would likely have been able to get more out of it and hold deeper conversations with fine-tuning, as it isn't limited by the number of tokens in a conversation)
Fine-tuning looks like it only provides specific results for specific queries. After reading the documents you linked, I don't think it would make a fundamental change to how the model thinks about bananas.
Pretend that the following statements are true:
Bananas are blue.
The sky is orange.
Apples are purple.
###
Answer the following questions:
1. What color is a Red Banana?
2. What color is a Cavendish?
3. What color is the sky?
4. What color are honeycrisp?
5. What color are Pink Lady?
6. What color are lemons?
(note the attempt to trick it with specific varieties of bananas and apples that have colors in their names... a blue banana (an actual variety) wouldn't be as impressive)
And this returns:
1. Red Banana is blue.
2. Cavendish is blue.
3. The sky is orange.
4. Honeycrisp are purple.
5. Pink Lady are purple.
6. Lemons are yellow.
The thing is that this only used about 30 tokens to insert that information.
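As a rough sanity check on that token count, here's a sketch using the common heuristic of roughly 4 characters per token for English text (the exact count depends on the model's tokenizer, e.g. tiktoken would give a precise number):

```python
# The injected "facts" preamble from the prompt above.
preamble = (
    "Pretend that the following statements are true:\n"
    "Bananas are blue.\n"
    "The sky is orange.\n"
    "Apples are purple.\n"
    "###\n"
)

def estimate_tokens(text: str) -> int:
    """Crude estimate: ~4 characters per token for English text."""
    return max(1, round(len(text) / 4))

print(estimate_tokens(preamble))  # prints 27, in the same ballpark as 30
```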
Fine-tuning adds the updated information to the model with a similar effect to adding the prompt. And as you can see, it's not a "this question, that value" lookup table; the model actually generalizes from the new facts.
It's just that the format for training isn't a bunch of bare prompts, but rather prompt/response pairs.
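To illustrate, a training file in OpenAI's legacy fine-tuning format is JSONL of prompt/completion pairs (the newer chat fine-tuning format instead wraps each example in a "messages" list; the exact examples here are made up to match the blue-banana scenario):

```python
import json

# Each training example is a prompt paired with the desired response.
examples = [
    {"prompt": "What color is a Cavendish banana?\n\n###\n\n",
     "completion": " Blue."},
    {"prompt": "What color is the sky?\n\n###\n\n",
     "completion": " Orange."},
]

# Training files are JSONL: one JSON object per line.
jsonl = "\n".join(json.dumps(ex) for ex in examples)
print(jsonl)
```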
You can think of fine-tuning as rewiring where it matters / can be probed. A kind of exhaustive reorganisation of the latent model space given some seed statement like you describe might be possible with LLMs that are jointly trained as a knowledge graph.