That's actually old news, and mentioned in OP. They build their own chips (TPUs) that are super cool. You can even use them as part of Google Cloud! Still, moving to BERT ain't cheap.
TPU is old news. I think the actual news would be a chip that is customized/optimized to run just BERT.
The Transformer architecture itself has stayed mostly unchanged in the two years since it was proposed, and with BERT and its variants, most (competitive) NLP models are now Transformer based. It makes sense to make custom chips to just run Transformers, the same as happened with CNNs.
Meh, not sure how much more there is to do to specialize for the transformer specifically. TPUs and GPUs are mainly just fantastic matrix multipliers, and the transformer is partly designed with this hardware in mind: the operations are basically the same ones you see in a CNN. In fact, one of the nice parts of the transformer is that you can run it without any recurrence, making it even better suited to the matrix multipliers.
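To see why, here's a toy sketch of scaled dot-product attention in plain NumPy (the sizes and random weights are made up for illustration, not from any real model): every step is a matrix multiply plus one softmax, and all tokens are processed in parallel rather than one at a time as in an RNN.

```python
import numpy as np

# Illustrative toy shapes, not from any real model.
seq_len, d_model = 4, 8
rng = np.random.default_rng(0)
x = rng.standard_normal((seq_len, d_model))

# In a real model these projections are learned; here they're random.
Wq = rng.standard_normal((d_model, d_model))
Wk = rng.standard_normal((d_model, d_model))
Wv = rng.standard_normal((d_model, d_model))

Q, K, V = x @ Wq, x @ Wk, x @ Wv           # three matmuls
scores = Q @ K.T / np.sqrt(d_model)        # one more matmul
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
out = weights @ V                          # final matmul

print(out.shape)  # one output vector per token, computed in parallel
```

Note there's no loop over positions anywhere: the whole sequence goes through as a handful of dense matmuls, which is exactly the workload TPUs and GPUs are built for.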
Furthermore, TPUs are a moving target themselves: as ML needs change, the team builds new operations and optimizations into the next generation of chips.