While the title is intentionally a bit click-baitey, I think the author is saying that features _using_ the ML tensor cores are on life support. They announced/hinted that a bunch of features would use on-device ML, and in the end they are just using the cloud like they always have.
As far as I understand, some inference is done on-device. LLMs and diffusion models changed the field in the last year, and it takes time for hardware to catch up (+ work to reduce model sizes). So it's hard to run the latest models on-device: you either end up doing it in the cloud (Google's preference) or shipping weaker models (Apple's preference).
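To give a rough sense of the "reduce model sizes" part: here's a minimal PyTorch sketch showing dynamic int8 quantization, one common way to shrink weights roughly 4x so a model is more plausible to run on-device. The toy layer sizes and the choice of dynamic quantization are my own illustration, not anything claimed in the article.

```python
import os
import torch
import torch.nn as nn

# Toy stack of linear layers standing in for a much larger model.
model = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(), nn.Linear(4096, 4096))

# Dynamic int8 quantization: Linear weights are stored as int8 and
# activations are quantized on the fly, so weights take ~1/4 the space of fp32.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def size_mb(m: nn.Module) -> float:
    """Serialize the state dict to disk and report its size in MB."""
    torch.save(m.state_dict(), "tmp.pt")
    size = os.path.getsize("tmp.pt") / 1e6
    os.remove("tmp.pt")
    return size

print(f"fp32: {size_mb(model):.1f} MB, int8: {size_mb(quantized):.1f} MB")
```

Quantization is only one lever, of course (distillation, pruning, and smaller architectures are the others), but it's the cheapest one to apply after training, which is why it shows up first when vendors try to move inference on-device.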