RAG, at its core, is a very human way of doing research: it is essentially just building a search mechanism for a reasoning engine, much like the way a person researches a question.
Your boss asks you to look into something, and you do it through a combination of structured and semantic research. Perhaps you pull some books that look relevant, use search tools to find information, and query structured databases for data. Then you synthesize it all into a response that usefully answers the question.
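To make that concrete, here is a minimal sketch of the retrieve-then-synthesize loop, with everything stubbed out: the corpus, the keyword-overlap scoring, and the prompt-building step are all stand-ins, and a real system would swap in a proper index and an actual LLM call.

```python
# Toy RAG loop: find the most relevant passages, then hand them to a
# reasoning engine. All names and data here are illustrative only.

def retrieve(question, corpus, k=2):
    """Rank documents by naive keyword overlap with the question."""
    q_terms = set(question.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(q_terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def synthesize(question, passages):
    """Build the prompt a reasoning engine (an LLM) would answer from."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Using only the sources below, answer: {question}\n{context}"

corpus = [
    "Q3 revenue grew 12% year over year, driven by the EU region.",
    "The holiday party is scheduled for December 18th.",
    "EU revenue was 4.1M in Q3, up from 3.6M in Q2.",
]

question = "How did EU revenue change in Q3?"
print(synthesize(question, retrieve(question, corpus)))
```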
People say RAG is temporary, that it's just a patch until "something else" is achieved.
I don't understand what, technically, is being proposed.
That the weights will just learn everything the model needs to know? That is an awful way of knowing things: it is difficult to update, difficult to cite, difficult to ground, and difficult to manage with any precision.
That the context windows will get huge, so retrieval will be unnecessary? That's an argument about chunking, not retrieval. Perhaps people could put 30,000 pages of documents into the context for every question. But there will always be tradeoffs between size and quality: for the same money you could run a smarter model with smaller contexts. So, for a given budget, why would you stuff a dumber model with enormous quantities of unnecessary information when you could get a better answer from a more capable model working from reasonably sized retrievals at the same cost?
Likewise, RAG is not just vector DBs; it also includes (as in this case) the use of structured queries to analyze information and the use of search mechanisms to find information in giant unstructured corpora (e.g., the Internet, corporate intranets, etc.).
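A hedged sketch of that point: one question, two retrieval paths. The structured side pulls exact figures from a hypothetical sales table via SQL, the unstructured side does a plain text match over free-text documents, and both results end up in the same context handed to the model. The table, documents, and values are made up for illustration.

```python
import sqlite3

# Structured side: an in-memory table standing in for a corporate database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (region TEXT, quarter TEXT, revenue REAL)")
db.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("EU", "Q2", 3.6), ("EU", "Q3", 4.1), ("US", "Q3", 7.9),
])
structured = db.execute(
    "SELECT quarter, revenue FROM sales WHERE region = 'EU' ORDER BY quarter"
).fetchall()

# Unstructured side: a trivial keyword match over free-text documents.
documents = [
    "The EU team attributed growth to the new reseller program.",
    "Office renovations will finish next month.",
]
unstructured = [d for d in documents if "eu" in d.lower()]

# Both kinds of evidence land in the same context block for the model.
context = [f"sales table: {structured}"] + unstructured
print("\n".join(context))
```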
Because RAG is relatively similar to the way organic intelligence conducts research, I believe RAG is here for the long haul, though its methods will advance significantly and the way it gets information will change over time. Ultimately, achieving AGI is not about developing a system that "knows everything," but a system that can reason about anything, and to dismiss RAG is to confuse those two objectives.
That's the problem: it's just search. If search were the answer, Google would have achieved AGI long ago. The problem is there's no intelligence. In some situations it can find semantically similar content, but that's it. The intelligence is completely missing from the RAG mechanism, because it's not even part of the model itself.