For me it is pretty consistently 50% +/- 5 even after a minute. It doesn't seem to work well if you repeat the same key a lot (i.e. don't flip too often).
This is actually due to a psychological thing where humans don't find long strings of repeated letters/digits "random" and hence will flip more frequently than you'd expect from a machine.
I also recall there being a problem in the use of 5-grams which led to a sequence on which it guessed 0%, but I'm struggling to find it now.
If you found the sequence I describe in my other comment, I think you would get almost that result, because (except at the beginning of the sequence) you would always pick the k-gram that has been seen one fewer time.
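To make the "always pick the less-seen k-gram" idea concrete, here is a minimal sketch. I don't know how the page actually works, so everything here is an assumption: a predictor that counts which key followed each of the last k keys, guesses the more frequent one, and breaks ties deterministically in favour of 'd'. `KGramPredictor` and `adversarial_accuracy` are made-up names for illustration.

```python
from collections import defaultdict


class KGramPredictor:
    """Guess the next key from how often each key followed the last k keys."""

    def __init__(self, k=5, keys="df"):
        self.k = k
        self.keys = keys
        # counts[context][key] = times `key` was pressed right after `context`
        self.counts = defaultdict(lambda: defaultdict(int))
        self.history = ""

    def predict(self):
        seen = self.counts[self.history[-self.k:]]
        # max() returns the first maximal item, so ties resolve to keys[0]
        # (an assumed, deterministic tie-break)
        return max(self.keys, key=lambda key: seen[key])

    def update(self, key):
        self.counts[self.history[-self.k:]][key] += 1
        self.history += key


def adversarial_accuracy(n=1000, k=5):
    """Fraction of correct guesses against a player who knows the predictor."""
    predictor = KGramPredictor(k)
    hits = 0
    for _ in range(n):
        guess = predictor.predict()
        # Press the key the predictor did not guess, i.e. the continuation
        # that has been seen no more often than the other one.
        actual = "f" if guess == "d" else "d"
        hits += guess == actual
        predictor.update(actual)
    return hits / n


print(adversarial_accuracy())
```

This only works against a predictor whose guesses you can reproduce exactly; if ties are broken randomly, or the real thing uses a different model, the sequence would no longer be a guaranteed miss every time.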
This is definitely not working. I'm certain that one can write a program like this, but I'm consistently getting 50% even though I use a different strategy every time.
I got 52% after a good lot of keypresses (not sure exactly how many). I chose to not look at the screen at all during that time - I think that probably helps.
Mine started at 25% and worked its way up to 54%, which also involved me holding down keys for a few seconds. I'm not sure if it works or if Hacker News's user base is skewed towards people who are slightly better at being random (or if that's caused by self-reporting: would someone whose best was 80% want to come forward?).
Edit:
Interesting tidbit: I pre-seeded at 100% by holding down F and it trended down to ~68% before I decided I shouldn't spend too much time on this. Maybe we didn't use it long enough?
There is an inherent 50% probability of each character occurring next, i.e., if you were to guess randomly, you would be right 50% of the time given enough trials.
I did something like this a long time ago in a little competition for beating others' rock paper scissors programs, except I used the other players' moves relative to my own (e.g. my program could catch on to the other player choosing the move that would beat my last move) as well as their current pattern of winning or losing.
and typed 'd' for True and 'f' for False in the array, bringing the accuracy of the predictor down to 53%. Theoretically, I'm guessing doing it for large enough numbers should make it exactly 50%.
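The same experiment can be simulated instead of typed by hand. This is a rough sketch with assumptions: the array is taken to be independent fair coin flips from Python's `random` module, and the predictor is a guessed 5-gram frequency counter (hypothetical `predictor_accuracy`), not the page's actual code.

```python
import random
from collections import defaultdict


def predictor_accuracy(keys, k=5):
    """Hit rate of a simple k-gram frequency predictor on a key sequence."""
    # counts[context] = how often 'd' and 'f' followed that context
    counts = defaultdict(lambda: {"d": 0, "f": 0})
    context = ""
    hits = 0
    for key in keys:
        seen = counts[context]
        # Guess the more frequent follower; ties go to 'd' (assumption)
        guess = "d" if seen["d"] >= seen["f"] else "f"
        hits += guess == key
        seen[key] += 1
        # Keep only the last k keys as the next context
        context = (context + key)[-k:]
    return hits / len(keys)


rng = random.Random(0)  # fixed seed so the run is repeatable
flips = [rng.choice([True, False]) for _ in range(20000)]
keys = ["d" if flip else "f" for flip in flips]  # 'd' for True, 'f' for False
print(predictor_accuracy(keys))
```

Against truly independent fair flips, any predictor that only sees past keys is right with probability 50% on each guess, so the observed rate should drift towards 50% as the sequence gets longer rather than hit it exactly.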