Sparse transformers are a thing. Dictionary lookup is a thing. A transformer could probably be trained to store long chains of information in a dedicated memory system and retrieve it by keyword.
These are the things that came to mind in ten seconds.
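For concreteness, here's a minimal sketch of the kind of keyword-indexed external memory gestured at above. The class name, storage scheme, and scoring rule are all my own illustration, not a claim about how any real system does it: memories are stored under keywords, and retrieval ranks them by keyword overlap with the query.

```python
from collections import defaultdict

class KeywordMemory:
    """Toy external memory: store text under keywords, retrieve by overlap."""

    def __init__(self):
        self.index = defaultdict(set)   # keyword -> set of memory ids
        self.memories = []              # memory id -> stored text

    def store(self, text, keywords):
        mem_id = len(self.memories)
        self.memories.append(text)
        for kw in keywords:
            self.index[kw.lower()].add(mem_id)
        return mem_id

    def retrieve(self, query_keywords, top_k=1):
        # Score each memory by how many query keywords point at it.
        scores = defaultdict(int)
        for kw in query_keywords:
            for mem_id in self.index.get(kw.lower(), ()):
                scores[mem_id] += 1
        ranked = sorted(scores, key=scores.get, reverse=True)
        return [self.memories[i] for i in ranked[:top_k]]

mem = KeywordMemory()
mem.store("The launch code is in drawer 3.", ["launch", "code", "drawer"])
mem.store("Meeting moved to Friday.", ["meeting", "friday"])
print(mem.retrieve(["launch", "code"]))  # ['The launch code is in drawer 3.']
```

A trained model would of course learn soft versions of the index and the scoring function rather than exact string matches, but the store/retrieve structure is the same.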
This is not going to be the problem that meaningfully delays AGI.