It is true that actual interaction is not tested in this way. Still, AI actions are not all it tests. For example, just keeping such games running for a long time implicitly tests resource management (such as garbage collection), the robustness of the networking code, synchronization, and so on.
What I found appealing in the description is the idea of keeping your program running for a long time and having it perform all kinds of actions automatically. Nowadays, I often hear about "test-driven development", "unit tests", etc., and it often turns out that they test only a few very specific things out of a vast universe of possible things. That does not mean that they are useless. As I see it, it means, as they phrased it in this article, that we often do not take enough advantage of automated testing.
Personally, when I test software, I always try to follow the general idea stated in the post mortem: Keep it running, and perform all kinds of actions automatically. I found several crashes and memory leaks in this way, which were not noticed during manual interaction because they only became significant when the actions were repeated thousands of times.
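To make the "keep it running" idea concrete, here is a minimal Python sketch of such a soak test. Everything in it is made up for illustration: `GameServer`, its deliberately planted leak, `soak_test`, `grew_suspiciously`, and the 100 kB threshold are my own assumptions, not anything taken from the post mortem. It fires random actions at the system thousands of times and samples memory with the standard `tracemalloc` module along the way.

```python
import random
import tracemalloc


class GameServer:
    """Hypothetical system under test. `leaky=True` plants a bug for the demo."""

    MAX_PLAYERS = 8

    def __init__(self, leaky):
        self.leaky = leaky
        self.players = {}   # player id -> 1 KiB receive buffer
        self.closed = []    # only ever grows in the leaky variant
        self.next_id = 0

    def connect(self):
        if len(self.players) < self.MAX_PLAYERS:
            self.players[self.next_id] = bytearray(1024)
            self.next_id += 1

    def disconnect(self):
        if self.players:
            oldest = next(iter(self.players))
            buf = self.players.pop(oldest)
            if self.leaky:
                # Planted bug: the buffer of a closed session is kept forever.
                self.closed.append(buf)

    def send_chat(self):
        # Touch every active buffer without allocating anything new.
        for buf in self.players.values():
            buf[0] = (buf[0] + 1) % 256


def soak_test(server, steps=20_000, sample_every=5_000, seed=0):
    """Run random actions and return the traced memory (bytes) at each sample."""
    rng = random.Random(seed)  # seeded, so a failing run can be replayed
    actions = [server.connect, server.disconnect, server.send_chat]
    samples = []
    tracemalloc.start()
    try:
        for step in range(1, steps + 1):
            rng.choice(actions)()
            if step % sample_every == 0:
                current, _peak = tracemalloc.get_traced_memory()
                samples.append(current)
    finally:
        tracemalloc.stop()
    return samples


def grew_suspiciously(samples, threshold_bytes=100_000):
    """Crude heuristic: did memory grow by more than the threshold over the run?"""
    return samples[-1] - samples[0] > threshold_bytes


if __name__ == "__main__":
    for leaky in (False, True):
        samples = soak_test(GameServer(leaky=leaky))
        print("leaky" if leaky else "clean", samples, grew_suspiciously(samples))
```

A single connect/disconnect cycle leaks about one kilobyte here, which nobody would notice by hand. It only shows up after thousands of repetitions, and that is exactly the kind of bug I mean. In a real project the random actions would drive the actual game or application, and the run would last hours rather than seconds.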
Based on my personal experience of automated testing in AAA games, even if you "only" count the overlap with the actions humans take, it's a huge overlap in practice. And while there are edge cases only humans find, automated testing finds bugs humans don't.
The point of that section is that the earlier you find bugs, the quicker you fix them, and you end up with less overall pain for the length of the project.