> we currently do not have the ability to have an LLM learn from another LLM
We do. It's called model distillation, and it's relatively straightforward.
In fact, training a smaller model on the outputs of a much bigger model can significantly cut training time and yield a higher-quality model than training on raw human data alone, which is often low quality and noisy.
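A minimal sketch of the core idea, assuming Hinton-style knowledge distillation: the student is trained to match the teacher's temperature-softened output distribution rather than hard labels. The logits and temperature here are made-up illustrative values, and a real setup would use a framework like PyTorch rather than plain Python.

```python
import math

def softmax(logits, temperature=1.0):
    # Higher temperature -> smoother distribution, exposing the
    # teacher's "dark knowledge" about relative class similarities.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL(teacher || student) on the softened distributions -- the
    # student minimizes this to imitate the teacher's outputs.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

teacher = [4.0, 1.0, 0.2]   # hypothetical big-model logits for one example
student = [2.5, 1.5, 0.5]   # hypothetical small-model logits, same example
loss = distillation_loss(student, teacher)
```

In practice this soft-target loss is usually combined with the ordinary cross-entropy on ground-truth labels, weighted by a mixing coefficient.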