
Very interesting, but hard to interpret until the performance numbers / benchmarks are available. I can already fine-tune a 70B language model at home using CPU + RAM, but it would be so slow as to be almost totally impractical (~20x slower than GPU). It would be great to see a comparison to, e.g., 8 x A100 (available for $32/hr on AWS on-demand) and also to CPU + RAM. Presumably it’s somewhere in between, but hard to predict where!
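
The cost/time trade-off behind that comparison can be sketched with some back-of-envelope arithmetic. The only figures taken from the comment are the $32/hr on-demand rate and the ~20x CPU slowdown; the run length, power draw, and electricity price below are placeholder assumptions, not measurements:

    # Back-of-envelope cost comparison (hypothetical numbers except where noted).
    gpu_rate_per_hr = 32.0      # 8 x A100 on-demand rate (from the comment)
    cpu_slowdown = 20.0         # rough CPU-vs-GPU factor (from the comment)
    cpu_power_kw = 0.5          # assumed draw of a home CPU + RAM box
    electricity_per_kwh = 0.15  # assumed electricity price

    gpu_hours = 10.0            # assumed length of one fine-tuning run on 8 x A100

    gpu_cost = gpu_hours * gpu_rate_per_hr
    cpu_hours = gpu_hours * cpu_slowdown
    cpu_cost = cpu_hours * cpu_power_kw * electricity_per_kwh

    print(f"8 x A100: {gpu_hours:.0f} h, ~${gpu_cost:.0f}")
    print(f"CPU + RAM: {cpu_hours:.0f} h (~{cpu_hours/24:.1f} days), ~${cpu_cost:.0f}")

Under those assumed numbers the CPU route is far cheaper per run but takes on the order of a week per fine-tune, which is why real benchmarks for anything in between would be so useful.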


