
Yes, but that is your parent's point:

"And what kind of documents exist that begin with someone saying 'you are X, here are a bunch of rules for how X behaves', followed by a ..."

Where, your parent asks, are all these reams of text written in this manner?




It's not that "you are X"-type text has to appear explicitly in the training data; it's that the model weights come to interpret "you are X" the way a human would interpret an instruction, as an emergent behavior of digesting a ton of human-written text.


Well, no - it's interpreting it as an instruction a chatbot AI would receive. From an almighty and omniscient 'system'.

We're training our AI on dystopian sci-fi stories about robot slaves.


It has to be prompted that it's an AI chatbot first, so it's essentially pretending to be a human that is pretending to be an AI chatbot. Back to the point: it interprets instructions as a human would.

If you look under the hood of these chat systems, they have to be primed with a system prompt that starts with something like "You are an AI assistant", "You are a helpful chatbot", etc. They don't just start responding like an AI chatbot without us telling them to.
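A minimal sketch of that priming, assuming the common role-based chat format; the template and special tokens below are illustrative, not any particular vendor's:

  # The chat frontend prepends a system message before the user's turn.
  messages = [
      {"role": "system", "content": "You are a helpful AI assistant."},
      {"role": "user", "content": "Why is the sky blue?"},
  ]

  # Before inference, the messages are flattened into the single token
  # sequence format the model was fine-tuned on (exact markup varies).
  def to_prompt(msgs):
      parts = [f"<|{m['role']}|>\n{m['content']}" for m in msgs]
      return "\n".join(parts) + "\n<|assistant|>\n"

  print(to_prompt(messages))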


What is the “it” that is doing the pretending?


The trained model: it takes your input, runs it through some complex math (tuned by the weights), and gives an output. Not much mystery to it.
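A rough sketch of that loop, assuming a generic causal language model from Hugging Face transformers and plain greedy decoding (GPT-2 is just a small stand-in):

  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  tok = AutoTokenizer.from_pretrained("gpt2")
  model = AutoModelForCausalLM.from_pretrained("gpt2")

  ids = tok("The model takes your input and", return_tensors="pt").input_ids
  for _ in range(20):
      logits = model(ids).logits         # the "complex math", shaped by the weights
      next_id = logits[0, -1].argmax()   # pick the most likely next token
      ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
  print(tok.decode(ids[0]))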


It doesn't seem like there's such a nefarious intent.

If you think about most literature, two characters interacting will address each other in the second person. If you think about recipes, the instructions are most often addressed to the reader as "you".

There are plenty of samples of instructions being given in the second person, and plenty of samples in literature where using the second person elicits a second-person follow-up. That's great for chat models, because even if they are still just completing text with the most likely token, it gives the illusion of a conversation.
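For instance, a plain next-token completer given dialogue-shaped, second-person text will usually keep producing dialogue-shaped text, which is where the illusion comes from. A sketch, with GPT-2 standing in for a base model:

  from transformers import pipeline

  complete = pipeline("text-generation", model="gpt2")
  prompt = ('"How do I poach an egg?"\n'
            '"First, you bring a pot of water to a gentle simmer."\n'
            '"And then?"\n')
  print(complete(prompt, max_new_tokens=40)[0]["generated_text"])
  # The continuation is typically another quoted reply, so the output reads
  # like a conversation even though the model is only completing text.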


The base model wouldn't do that, though; it would just predict the most likely follow-up, which could, e.g., simply be more instructions. After instruction fine-tuning, the model no longer "predicts" tokens in this way.
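A sketch of the base-model side of that, again with GPT-2 as a stand-in; an instruction-tuned variant would instead be trained to answer as the assistant the rules describe:

  from transformers import pipeline

  complete = pipeline("text-generation", model="gpt2")
  prompt = ("You are a customer-support agent.\n"
            "Rule 1: Always greet the customer.\n"
            "Rule 2: Never promise a refund.\n")
  print(complete(prompt, max_new_tokens=40)[0]["generated_text"])
  # A base model tends to just continue the document, often with more
  # rules, rather than start behaving as the agent being described.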


In the RLHF training sets?



