Yes, I think we are going to need a new architecture for LLMs to move beyond, "t...

og_kalu · on Feb 28, 2023

It's not an architecture problem of the transformer at all. It's the result of believing you can make inviolable rules for a system you don't understand, which is nothing short of ridiculous. You're never going to make inviolable rules for a neural network because we don't understand what is going on inside it.