
There's a related anecdote about John von Neumann: he used to joke that he had superpowers and could easily tell truly random and pseudo-random sequences apart. He would ask people to sit down in another room and generate a 0/1 sequence via coin flips and record it, then generate another sequence by heart, trying to mimic randomness as much as possible. When people finally showed the two sequences to him, von Neumann could instantly declare which one was which.

People were amazed.

The trick he used was based on the "burstiness" rule you describe: a long enough random sequence will very likely contain a long homogeneous block, while humans tend to avoid long streaks of the same digit because they don't feel random enough.

So all he did was check at a glance which of the two sequences contained the longer homogeneous block, and declare that one to be the sequence generated by coin flips.
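
(If you want to play with this, here's a minimal sketch of that longest-run heuristic in Python; the function names and example strings are just made up for illustration:)

    from itertools import groupby

    def longest_run(bits):
        # Length of the longest block of identical symbols.
        return max(len(list(g)) for _, g in groupby(bits))

    def guess_coin_flips(seq_a, seq_b):
        # Guess that the sequence with the longer homogeneous block
        # is the one produced by real coin flips (ties go to seq_a).
        return seq_a if longest_run(seq_a) >= longest_run(seq_b) else seq_b

    human = "0110100101101001011010011010"   # avoids streaks longer than 2
    coin  = "0111101000001101001110001011"   # contains a run of five 0s
    print(guess_coin_flips(human, coin) == coin)  # True for these two strings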




That's a cool anecdote :-) I wouldn't say it uses concentration of measure exactly, but I see how it is related. The anecdote is about asymptotic properties of random sequences, and concentration of measure is about the same thing. In this case, I think you can show that homogeneous blocks of length log(n) - log log(n) occur with at least constant probability as n gets large. In other words, the length of the longest homogeneous block is basically guaranteed to grow with n. A human trying to generate a random sequence will avoid homogeneous blocks above some constant length regardless of the length of the sequence, which makes distinguishing the sequences quite easy for large n!
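
(A quick simulation, just to illustrate the growth rather than the exact constant: the average longest run in fair coin-flip sequences tracks roughly log2(n).)

    import math
    import random
    from itertools import groupby

    def longest_run(bits):
        # Length of the longest block of identical symbols.
        return max(len(list(g)) for _, g in groupby(bits))

    random.seed(0)
    for n in (100, 1_000, 10_000, 100_000):
        trials = [longest_run([random.getrandbits(1) for _ in range(n)])
                  for _ in range(50)]
        print(n, sum(trials) / len(trials), round(math.log2(n), 2))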

I think there is also a strong connection in this anecdote to the information-theoretic notion of entropy, which takes us all the way back to the idea of entropy in the article :-) Information-theoretically, the per-symbol log-probability of a long random sequence concentrates as well: it concentrates around the entropy of the underlying random variable (the asymptotic equipartition property). The implication is that, with high probability, a sampled long random sequence will have a log-probability close to one specific value.
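
(As a concrete check of that concentration, with an arbitrary bias p = 0.3 chosen just for the example: the per-symbol quantity -(1/n) log2 P(sequence) of sampled sequences lands very close to the entropy H(p).)

    import math
    import random

    p = 0.3  # arbitrary bias for the illustration
    H = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))  # ~0.881 bits/symbol

    random.seed(0)
    n = 10_000
    for _ in range(5):
        seq = [1 if random.random() < p else 0 for _ in range(n)]
        k = sum(seq)  # number of 1s
        log_prob = k * math.log2(p) + (n - k) * math.log2(1 - p)
        print(round(-log_prob / n, 4), "vs H =", round(H, 4))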

Human intuition actually is somewhat correct in the anecdote, though! The longer the homogeneous substring, the less entropy the sequence has, and the less likely such a sequence is to appear (as a limiting example, the sequence of all 0s or all 1s is extremely ordered but extremely unlikely to appear). I think where it breaks down is that there are sequences with relatively long homogeneous substrings (of length e.g. log(n) - log log(n), as in the calculation above) whose entropy is still close to the typical value, whereas human intuition about the entropy of the sequence is based on local factors (have I generated 'too many' 0s in a row?) and leads us astray.



