As a rough guide, we can treat the count for each face as binomial and look at where the cut-off for rejection at p=0.05 falls.
If you're looking at a single face:
With 2000 rolls, you would expect a count in the range 81-120 (0.0405 - 0.0600).
At 3000 rolls, that "narrows"\* to 127-174 (0.0423 - 0.0580).
At just 100 rolls, anywhere between 1 and 10 occurrences looks fair.
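For anyone who wants to reproduce those ranges, here is a minimal sketch. It assumes the ranges are the central 95% of a Binomial(n, 1/20) distribution, which is my reading of the numbers rather than something stated above; `binom_quantile` and `fair_range` are names made up for this example. It should land on, or within a roll of, the figures quoted.

```python
import math

def binom_pmf(k, n, p):
    # Work in logs so large n (e.g. 8300 rolls) does not overflow.
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def binom_quantile(q, n, p):
    # Smallest k such that P(X <= k) >= q.
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += binom_pmf(k, n, p)
        if cumulative >= q:
            return k
    return n

def fair_range(n_rolls, faces=20, alpha=0.05):
    # Assumption: a "fair-looking" count lies between the alpha/2 and
    # 1 - alpha/2 quantiles of the binomial for a single face.
    p = 1 / faces
    return (binom_quantile(alpha / 2, n_rolls, p),
            binom_quantile(1 - alpha / 2, n_rolls, p))

for n in (100, 2000, 3000, 8300):
    low, high = fair_range(n)
    # The absolute width (high - low) grows with n while the width as a
    # fraction of n shrinks.
    print(n, low, high, high - low, round((high - low) / n, 4))
```

The last two printed columns show the point of the footnote: the absolute margin keeps growing while the margin as a share of the total shrinks.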
However, that was just illustrative; it would be better to do a proper chi-squared test, so let's do that.
Let's take the Chessex opaque purple, which looks like it has 8300 rolls.
By the same approximation, we'd expect each face to come up between 377 and 454 times.
In our actual data we appear to have a minimum of 313 with face13, and a maximum of 531 with face17, both well outside that range.
So let's do a Pearson's chi-squared test. Rolling 8300 times, we expect 415 for each face.
For each face we take the difference between the observed count and the expected value, square it, and divide by the expected value; then we sum over all faces.
This gives a value of 155.3108. We then have to compare this to the chi-squared distribution with 19 degrees of freedom. (There are 19 degrees of freedom because we have 20 faces, so after we have 19 results, the 20th must be fixed by being 8300 minus the sum of the first 19 faces).
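In case the arithmetic is easier to follow as code, here is a small sketch of the statistic. The function name `pearson_chi_squared` is made up for this example, and the tallies are hypothetical counts for a six-sided die, not the Chessex data (which isn't reproduced in this thread).

```python
def pearson_chi_squared(observed):
    """Return (statistic, degrees_of_freedom) for a test against a fair die."""
    total = sum(observed)
    expected = total / len(observed)  # a fair die: every face equally likely
    statistic = sum((count - expected) ** 2 / expected for count in observed)
    # One count is fixed by the total, so degrees of freedom = faces - 1.
    return statistic, len(observed) - 1

# Hypothetical tallies for a six-sided die rolled 60 times (illustration only).
toy_counts = [8, 12, 10, 9, 11, 10]
statistic, dof = pearson_chi_squared(toy_counts)
print(statistic, dof)
```

Feeding it the 20 per-face counts of the real die would give the statistic and the 19 degrees of freedom in one go.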
Digging out our statistical tables (you DO have statistical tables, right?) and looking up 19 degrees of freedom, we can see that 155 far exceeds even the p=0.01 critical value of about 36.19.
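If you don't have tables to hand, the tail probability can be computed directly. This is a sketch using the standard finite-series form of the chi-squared upper tail, which only works for odd degrees of freedom (19 qualifies); `chi2_sf_odd` is a name invented here, not a library function.

```python
import math

def chi2_sf_odd(x, df):
    """P(X > x) for a chi-squared variable with an odd number of degrees of freedom."""
    if df % 2 == 0:
        raise ValueError("this series form only covers odd degrees of freedom")
    chi = math.sqrt(x)
    normal_tail = 0.5 * math.erfc(chi / math.sqrt(2))       # P(Z > chi)
    normal_density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    # Sum of chi^(2r-1) / (1*3*5*...*(2r-1)) for r = 1 .. (df-1)/2.
    term = chi
    series = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        series += term
        term *= x / (2 * r + 1)
    return 2 * normal_tail + 2 * normal_density * series

print(chi2_sf_odd(30.144, 19))    # table value for p = 0.05
print(chi2_sf_odd(36.191, 19))    # table value for p = 0.01
print(chi2_sf_odd(155.3108, 19))  # the statistic from the die data
```

The first two calls are a check against the tabulated critical values; the third shows how far into the tail 155.3108 sits.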
So we can conclude that the Chessex opaque purple die rolled here is biased. (Or the die-roller is).
\* It narrows in proportion to the total; the absolute margin is wider. This is something that people often forget when dealing with the law of large numbers.
```python
import random
import statistics

num_of_sims = 10000
num_of_rolls = 3000
results = []

for s in range(num_of_sims):
    sum_of_rolls = 0
    # roll a 20-sided die num_of_rolls times
    for r in range(num_of_rolls):
        sum_of_rolls += random.randint(1, 20)
    # keep track of the average value for this simulation
    results.append(sum_of_rolls / num_of_rolls)

print("ave: ", statistics.mean(results))     # which is 10.49915
print("stdev: ", statistics.stdev(results))  # which is 0.10492
```
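Those simulated figures can be sanity-checked against theory. A quick sketch, assuming a fair d20 and independent rolls: the average of 3000 rolls should have the same mean as a single roll and a standard deviation equal to the single-roll standard deviation divided by the square root of 3000.

```python
import math
import statistics

faces = range(1, 21)
num_of_rolls = 3000

single_roll_mean = statistics.mean(faces)            # mean of one fair d20 roll
single_roll_variance = statistics.pvariance(faces)   # population variance of one roll
stdev_of_average = math.sqrt(single_roll_variance / num_of_rolls)

print("theoretical ave:  ", single_roll_mean)
print("theoretical stdev:", round(stdev_of_average, 5))
```

Both come out close to the simulated values, which is what you'd hope for with 10000 simulations.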
> (There are 19 degrees of freedom because we have 20 faces, so after we have 19 results, the 20th must be fixed by being 8300 minus the sum of the first 19 faces).
I've read the degrees of freedom in statistics explained so many times without actually understanding it. Now I get it. It's that simple. Thank you!
> So we can conclude that the Chessex opaque purple die rolled here is biased. (Or the die-roller is).
We could use conditional probabilities to determine whether it is the die or the roller that is biased, I think: use a statistical hypothesis test to see if P(face is 19 | face was n before roll) = P(face is 19). Not that the data we have lets us tell either way.
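A rough sketch of what that test could look like, if the rolls had been recorded in order. It assumes the face showing before a roll is simply the previous result, which is a guess about how the die gets picked up, and it runs on simulated fair rolls because the ordered data isn't available here. `transition_chi_squared` is a made-up name.

```python
import random
from collections import Counter

def transition_chi_squared(rolls, faces=20):
    """Chi-squared statistic for independence of each roll from the previous one."""
    pairs = Counter(zip(rolls, rolls[1:]))   # (previous face, current face) counts
    n_pairs = len(rolls) - 1
    previous_totals = Counter(rolls[:-1])
    current_totals = Counter(rolls[1:])
    statistic = 0.0
    for prev in range(1, faces + 1):
        for cur in range(1, faces + 1):
            # Expected count if the current face does not depend on the previous one.
            expected = previous_totals[prev] * current_totals[cur] / n_pairs
            if expected > 0:
                statistic += (pairs[(prev, cur)] - expected) ** 2 / expected
    return statistic, (faces - 1) ** 2

random.seed(0)
# Simulated fair, independent rolls standing in for the real ordered data.
simulated_rolls = [random.randint(1, 20) for _ in range(8300)]
statistic, dof = transition_chi_squared(simulated_rolls)
print(statistic, dof)
```

For independent rolls the statistic should sit near its degrees of freedom (361 for a d20); a value far above that would point at the rolling technique rather than the die itself.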