I wonder if it can be made to not rely on randomness at all. The 'standard' solution only required that every prisoner be taken to the room infinitely many times. As long as that was the case, the warden could try any devious sequence they liked.
Interesting question. Haven't solved it yet, but it did give me an idea for an optimization.
If each prisoner also stores c_max, they can decide to turn off the light (until c = 0) only if c < c_max - 1.
This is because at least one other prisoner must still be turning on the light: a prisoner could reach c = c_max - 1 by turning off the light they themselves turned on, but they can only get lower if someone else turns on the light for them.
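As a rough sketch of just the check described above, and nothing more: the function name `may_turn_off` is made up for illustration, and I'm assuming c and c_max are plain integers kept by each prisoner. It doesn't model the rest of the strategy or the "until c = 0" part.

```python
def may_turn_off(c: int, c_max: int) -> bool:
    """Hypothetical helper: True if the proposed rule allows turning the light off.

    Assumes c is the prisoner's current count and c_max is the value
    they have stored, as in the post above.
    """
    # Only turn the light off when strictly below c_max - 1.
    return c < c_max - 1


if __name__ == "__main__":
    # Show which counts would be allowed to turn the light off for c_max = 5.
    for c in range(1, 6):
        print(c, may_turn_off(c, c_max=5))
```

At c = c_max - 1 and c = c_max the check refuses, which is exactly the band an adversarial warden could keep prisoners in.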
Unfortunately, an adversarial sequence can still stall progress, as prisoners can be juggled between c_max and c_max - 1.