All the SOTA LLM solutions like this share nearly the same problem. Sure, the context window is huge, but there is no guarantee the model understands what 100K tokens of code are trying to accomplish within the context of the full codebase, let alone in the real world, within the context of the business. They are just not good enough yet to use in real projects. Try it: start a greenfield project with "just Cursor" the way the AI influencers do and see how far you get before it's an unmanageable mess and the LLM is lost in the weeds.
Going the other direction in terms of model size, one tool I've found usable in these scenarios is Supermaven [0]. It's still just one- or multi-line suggestions a la GH Copilot, so it's not generating entire apps for you, but it's much, much better about pulling those one-liners from the rest of the codebase in a logical way. If you have a custom logging module that overloads the standard one with special functions, it will actually use those functions. Pretty impressive. Also very fast.
Embeddings/RAG don't address the problem I'm talking about. The issue is that you can stuff the entire context window full of code and the models will only superficially leverage it: they will still violate existing conventions, inappropriately bring in dependencies, duplicate functionality, etc. They don't "grok" the context at the correct level.
[0] https://supermaven.com/