To be honest, I'm not an expert at this stuff—merely enthusiastic about it. I could have some details wrong.
My intuition says that a straight average should give you fairly good results.. although antirez talks a bit about bias in HyperLogLog counts that he's trying to adjust for.
So maybe an average + some bias correction gives you better results?
On light reflection: probably if you do N tests, the modal result would be a good enough choice, or perhaps the median.
The issue is that (e.g.) the arithmetic mean would be very influenced by the stats that gave randomly bad results - the very results you are trying to exclude by doing the test multiple times and taking an average.
So I think this is roughly what antirez is talking about with 'registers' for a counter.
From the article:
The standard error of HyperLogLog is 1.04/sqrt(m), where “m” is the number of registers used.
Redis uses 16384 registers, so the standard error is 0.81%.
My understanding is that the error is very small because you're averaging over 16,000 registers.
Edit: also, I think you want the mean because it'll give you back a precise/fractional power of 2 that is pretty close to the actual count.
The median would give you the nearest integer power of two, which is probably not a good estimate of the real count.
My intuition says that a straight average should give you fairly good results.. although antirez talks a bit about bias in HyperLogLog counts that he's trying to adjust for.
So maybe an average + some bias correction gives you better results?