Chips take a long time to go from design to rollout. Serving giant LLMs is a very recent demand, and it will take time for deployed hardware to catch up, so no surprises here.
Other than bringing up Google Assistant with Bard, the article doesn't mention giant LLMs at all. It lists features such as Gboard proofreading that are not done on device, as well as the chips heating up during general tasks such as downloading. So in that regard, it seems most AI tasks are not done on the device, which was the promise sold with these chips.