I get the overall idea of why they make sense. But the fact that the author does not address GPU acceleration means either that she hasn't thought about it or that she thinks its implementation is trivial.

Either way, I would need a deeper dive along those lines to be convinced that the argument has real-world merit and can actually be implemented in practice.

FWIW my estimate is that 90% of production training workloads in the wild run on GPUs. Please correct me if my assumption is wrong.