Correct, fine-tuning is not new. It's long been used to augment foundation LLMs with private data, e.g. private enterprise data. We do this at Zapier, for instance.
The new and surprising thing about test-time training (TTT) is how effective an approach it is for dealing with novel abstract reasoning problems like ARC-AGI.
How is TTT anything other than a deep learning algorithm? We have a deep learning model, we generate training data based on an example, and we use stochastic gradient descent to update the model weights to improve its predictions on that training data. This is the classic DL paradigm. I just don’t see why you would consider this an advancement if your goal is to move “beyond” deep learning.
Again: the new and surprising thing about TTT is not the mechanism, which is indeed classic deep learning, but how effective an approach it is for dealing with novel abstract reasoning problems like ARC-AGI.
TTT was pioneered by Jack Cole last year and popularized this year by several teams, including the team behind this winning paper: https://ekinakyurek.github.io/papers/ttt.pdf
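To make the loop described in the question above concrete, here is a minimal sketch of test-time training, assuming a toy setup: a small PyTorch MLP over 3x3 grids, rotation-based augmentation, and a made-up "add 1 to every cell" rule. All names here (TinyModel, ttt_predict, augment) are illustrative, not the actual method from the paper linked above.

```python
# Minimal test-time training (TTT) sketch. Toy setup, not the paper's method.
import copy
import torch
import torch.nn as nn

GRID = 3      # toy grid size (real ARC grids vary in size)
COLORS = 10   # ARC uses 10 colors

class TinyModel(nn.Module):
    """Maps a flattened input grid to per-cell color logits."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(GRID * GRID, 64),
            nn.ReLU(),
            nn.Linear(64, GRID * GRID * COLORS),
        )

    def forward(self, x):
        # x: (batch, GRID*GRID) floats -> (batch, GRID*GRID, COLORS) logits
        return self.net(x).view(-1, GRID * GRID, COLORS)

def augment(pair):
    """The 'generate training data based on an example' step:
    expand one demonstration pair into four via rotations."""
    x, y = pair
    return [(torch.rot90(x, k), torch.rot90(y, k)) for k in range(4)]

def ttt_predict(base_model, demos, test_input, steps=50, lr=1e-2):
    """Fine-tune a copy of the model on this task's demonstrations
    with SGD at test time, then predict the test output."""
    model = copy.deepcopy(base_model)  # keep the shared weights untouched
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()

    pairs = [p for demo in demos for p in augment(demo)]
    xs = torch.stack([x.float().flatten() for x, _ in pairs])
    ys = torch.stack([y.long().flatten() for _, y in pairs])

    for _ in range(steps):  # the classic SGD loop, just run per-task
        opt.zero_grad()
        logits = model(xs)
        loss = loss_fn(logits.reshape(-1, COLORS), ys.reshape(-1))
        loss.backward()
        opt.step()

    with torch.no_grad():
        pred = model(test_input.float().flatten().unsqueeze(0))
    return pred.argmax(-1).view(GRID, GRID)

# Toy usage: the hidden rule is "add 1 to every cell (mod COLORS)".
demos = [(g, (g + 1) % COLORS)
         for g in (torch.randint(0, COLORS, (GRID, GRID)) for _ in range(3))]
test_input = torch.randint(0, COLORS, (GRID, GRID))
print(ttt_predict(TinyModel(), demos, test_input))
```

Note the deepcopy: each task gets its own temporary copy of the weights, so test-time updates for one problem never leak into another. Mechanically this is exactly the classic DL loop the question describes; the point above is that applying it per-task, at inference time, turns out to be surprisingly effective on novel reasoning problems.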