> In our experiments on the Pile, a standard language modeling benchmark, a 7.5 billion parameter RETRO model outperforms the 175 billion parameter Jurassic-1 on 10 out of 16 datasets and outperforms the 280B Gopher on 9 out of 16 datasets.
The research is still ongoing, though perhaps at a lower profile than what makes it into the press.
RETRO did get press, but it was not the first retrieval model, and it was not SOTA when it was published; FiD was, and FiD later evolved into Atlas [0], published a few months ago.
https://www.deepmind.com/blog/improving-language-models-by-r...
That said, there hasn't been much follow-up research on RETRO itself (or DeepMind is not publishing it).
Annotated paper: https://github.com/labmlai/annotated_deep_learning_paper_imp...
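For anyone skimming the annotated paper, the recipe these retrieval models share is the same at a high level: split a large corpus into chunks, embed each chunk once with a frozen encoder, and at inference time fetch the nearest neighbours of the current input for the model to attend to. Below is a minimal, illustrative sketch of just that retrieval step in plain NumPy, assuming random placeholder embeddings and toy data; the names and numbers are stand-ins, not code from RETRO, FiD, or Atlas.

```python
# Illustrative sketch of the retrieval step shared by RETRO-style models.
# Everything here (names, toy data, random "embeddings") is a stand-in.
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 64

# Toy retrieval database: in RETRO this is trillions of tokens split into
# fixed-size chunks, each embedded once with a frozen BERT encoder.
db_chunks = [f"database chunk {i} ..." for i in range(1000)]
db_embeddings = rng.normal(size=(len(db_chunks), EMBED_DIM))
db_embeddings /= np.linalg.norm(db_embeddings, axis=1, keepdims=True)


def embed(text: str) -> np.ndarray:
    """Stand-in for a frozen text encoder; returns a unit-norm vector."""
    vec = rng.normal(size=EMBED_DIM)  # random placeholder, not a real embedding
    return vec / np.linalg.norm(vec)


def retrieve(query_chunk: str, k: int = 2) -> list[str]:
    """Return the k nearest database chunks by cosine similarity.

    Real systems use an approximate nearest-neighbour index
    (RETRO uses SCaNN; FAISS is another common choice).
    """
    scores = db_embeddings @ embed(query_chunk)
    top = np.argsort(-scores)[:k]
    return [db_chunks[i] for i in top]


# For each chunk of the input, fetch neighbours; the language model then
# conditions on them while predicting the next tokens.
neighbours = retrieve("The Pile is a language modelling benchmark ...")
print(neighbours)
```

The interesting differences between these models are in how the neighbours are consumed: RETRO interleaves chunked cross-attention over retrieved chunks in its decoder, while FiD and Atlas encode each retrieved passage separately and fuse them in the decoder.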