nl on May 23, 2023 | on: RWKV: Reinventing RNNs for the Transformer Era

Thinking of it as "the training dataset" vs "the context window" is the wrong way of looking at it.

There's a bunch of prior art for adaptation techniques that get new data into a trained model (fine-tuning, RLHF, etc.). There's no real reason to think there won't be more techniques that turn what we now think of as the context window into something that alters the model's weights and is serialized back to disk.

dcl on May 23, 2023

It's a reasonable way to look at it, given that's how pretty much all 'deployed' versions of LLMs work?