As I see it, the result establishes a fundamental property of the expressive power of a mechanism, and it can also be useful in practice.
For instance, I have many potential applications for Turing-complete formalisms, because I am interested in the results of arbitrary computations. The result obtained in the article means that I can use a neural network for this, under the conditions and in the way shown in the article.
This may simplify software architectures, especially in situations where Neural Networks are already applied, and additional mechanisms would otherwise be needed to cover arbitrary computations.
Something being Turing complete just means that, in principle, it could be used to solve any computable problem. But this may require unbounded memory or unbounded time.
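To make the "in principle" caveat concrete, here is a toy Turing-machine step loop (my own illustration, not the paper's construction): the tape is a dictionary precisely so it can grow without bound, and a step cap stands in for the fact that a Turing-complete machine need not halt at all.

```python
from collections import defaultdict

def run_tm(transitions, tape, state="start", max_steps=10_000):
    """Run a Turing machine; the tape is a dict so it can grow without bound."""
    cells = defaultdict(lambda: "_", enumerate(tape))  # "_" is the blank symbol
    head, steps = 0, 0
    while state != "halt":
        if steps >= max_steps:  # a TC machine may never terminate; give up eventually
            raise TimeoutError("step budget exhausted")
        symbol = cells[head]
        state, write, move = transitions[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    return "".join(cells[i] for i in sorted(cells)).strip("_")

# A toy machine: walk right over the 1s, append one more 1, then halt.
inc = {
    ("start", "1"): ("start", "1", "R"),
    ("start", "_"): ("halt", "1", "R"),
}
print(run_tm(inc, "111"))  # → 1111
```

The interesting (and impractical) cases are exactly the ones where the tape keeps growing and the step cap is hit, which is the commenter's point about infinite memory and time.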
The paper showed that the Transformer architecture with positional encodings and rational activation functions is Turing complete.
Rational activation functions with arbitrary precision keep the state space within the countable infinities, whereas the reals (which floats only approximate) run into the cardinality-of-the-continuum problem.
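The contrast between exact rationals and floats can be seen directly with Python's `fractions` module (my own illustration): every `Fraction` is a pair of integers, so the reachable values stay countable and arithmetic stays exact, while binary floats can only hit a fixed finite subset of the rationals and accumulate rounding error.

```python
from fractions import Fraction

# Exact rational arithmetic: the update is exact, no matter how often repeated.
exact = sum(Fraction(1, 10) for _ in range(3))
print(exact == Fraction(3, 10))  # True

# The same update in binary floating point drifts off the intended value.
approx = 0.1 + 0.1 + 0.1
print(approx == 0.3)  # False: 0.1 has no exact binary representation
```

This is the sense in which "arbitrary precision" in the paper is a real assumption: a fixed-width float format gives you neither the rationals nor the reals, only a finite lattice of sample points.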
While all nets that use attention are feed-forward and thus effectively DAGs, the positional encodings are what move the input from merely well-founded to well-ordered.
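The ordering point can be illustrated with a toy one-dimensional dot-product self-attention (a sketch of my own, not the paper's construction, with Q = K = V = the raw token values): without positional encodings, permuting the input merely permutes the output, so the mechanism by itself cannot see sequence order.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(tokens):
    """Toy 1-d dot-product self-attention: each token attends to all tokens."""
    out = []
    for q in tokens:
        weights = softmax([q * k for k in tokens])
        out.append(sum(w * v for w, v in zip(weights, tokens)))
    return out

a = self_attention([1.0, 2.0, 3.0])
b = self_attention([3.0, 1.0, 2.0])  # same tokens, different order
# The outputs are the same multiset: order information is lost.
print(all(math.isclose(x, y) for x, y in zip(sorted(a), sorted(b))))  # True
```

Adding a distinct positional term to each token before attention breaks this permutation symmetry, which is exactly the role the encodings play in the paper's construction.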
While those constraints allow the authors to make their claims in this paper, they also have serious implications for real-world use: rational activation functions are not arbitrarily precise on physically realizable machines in finite time, and you will need to find a well-ordering of your data, or a way to force one on it, which is not a trivial task.
So while interesting, just as it was interesting when someone demonstrated that sendmail configurations were Turing complete, it probably isn't as practical as you seem to think.
As attention is really runtime re-weighting, and as feed-forward networks are effectively DAGs, it is not surprising to me that someone found a way to prove this. But just as I am not going to use the C preprocessor as a universal computation tool, even though it too is Turing complete, I wouldn't hold my breath waiting for attention to become one either.
I'm not going to engage with this directly, but for any other readers passing through - this is nonsense. One more drop in the flood of uninformed AI noise that's been drowning out the signal.
Choose the sources you trust very carefully, and look to the people actually working on real-world AI systems, not the storytellers and hangers-on.