Depends how you do it. LLMs are huge, but their input and output are minuscule bits of text. If you find a way to put "narrow paths" in the hidden layers, basically subdividing a model into smaller interconnected models, then the bandwidth between the parts will be similarly massively reduced.
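A toy numpy sketch of the idea (all sizes are made up for illustration): two wide "sub-models" that only talk to each other through a narrow bottleneck, so only a handful of numbers cross the boundary per step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: wide sub-models, narrow path between them.
WIDE, NARROW = 4096, 32

sub_a = rng.standard_normal((WIDE, WIDE)) * 0.01       # sub-model A's internal weights
bottleneck = rng.standard_normal((WIDE, NARROW)) * 0.01  # the narrow path
sub_b = rng.standard_normal((NARROW, WIDE)) * 0.01       # sub-model B's input projection

x = rng.standard_normal(WIDE)
h_a = np.tanh(x @ sub_a)        # 4096 activations inside sub-model A
msg = h_a @ bottleneck          # only 32 values cross between the sub-models
h_b = np.tanh(msg @ sub_b)      # sub-model B expands them back to a wide state

# Traffic across the boundary is NARROW values instead of WIDE:
print(h_a.size, msg.size)  # 4096 vs 32, a 128x reduction
```

If the sub-models live on different machines, that bottleneck is exactly the inter-node bandwidth you'd need, instead of shipping full wide activations around.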
This is not without precedent; look up how your brain hemispheres, and the regions within them, are connected.