Hacker News

Probabilities are "error bounds", of a sort, on a binary prediction (in that they express the degree of uncertainty in the prediction). Error bounds on a probability don't make a lot of sense.



It can easily make sense.

Consider a very simple model, where each person in the country votes for party A with probability `p_i` and for party B with probability `1 - p_i`.

Now if we have access to the values `p_i` of all `n` people in the population, then the probability that party A wins can be calculated as

    P = sum_{a \in {0,1}^n with |a| > n/2} product_{i = 1 to n} {p_i if a_i = 1 else 1 - p_i}
(The precise formula doesn't matter, but clearly it can be computed.) Now assume that we don't have all the values `p_i`, but only the values of `k` people randomly sampled from the population. Then we can calculate `P` on these values to get an estimate of the probability that A will win the election. Our result clearly won't be exact, but we can use statistics to find values `P_1` and `P_2` such that the real value of `P` is in the range `P_1 <= P <= P_2` with high probability over the random sampling of people.
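A small sketch of this in Python (the helper name, the Beta-distributed toy population, and all the specific numbers are my own illustration, not from the comment): `win_prob` computes `P` exactly with the Poisson-binomial recurrence instead of the exponential sum over `{0,1}^n`, and the loop below it repeats the k-person sampling many times, so the spread of the resulting estimates plays the role of the interval `[P1, P2]`.

```python
import random

def win_prob(ps):
    # Exact P(party A wins): dist[j] = P(exactly j voters choose A),
    # built by the O(n^2) Poisson-binomial recurrence rather than the
    # exponential sum over all vote vectors a in {0,1}^n.
    dist = [1.0]
    for p in ps:
        nxt = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            nxt[j] += q * (1 - p)   # this voter chooses B
            nxt[j + 1] += q * p     # this voter chooses A
        dist = nxt
    n = len(ps)
    return sum(q for j, q in enumerate(dist) if j > n / 2)

rng = random.Random(0)
# Hypothetical population: 501 voters with p_i drawn from Beta(2, 2).
population = [rng.betavariate(2, 2) for _ in range(501)]
P_true = win_prob(population)

# Estimate P from random samples of k voters; the spread of these
# estimates over repeated sampling gives an interval [P1, P2] that
# contains P with high probability over the sampling.
k = 60
estimates = sorted(win_prob(rng.sample(population, k)) for _ in range(200))
P1, P2 = estimates[4], estimates[194]  # ~2.5th and 97.5th percentiles
```

Note that the sample-based estimate treats the k sampled voters as a miniature electorate; for large k its majority behaviour tracks the full population's.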

Here we have assumed an underlying random model, which has a probability, and we can estimate that probability within error bounds. We could try to calculate

    P* = sum_{way of sampling k people} (probability of sampling this way) * (probability A wins in this case)
But then we don't know how likely a certain set of `p` values is to be sampled, so we have to guess. Hence we are mixing uncertainty and probability. Real statistical models have many different levels of uncertainty and probability like that.
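A hedged sketch of `P*` in Python: since the distribution over sets of `p` values is unknown, the code simply assumes a Beta(2, 2) prior for how sampled p-values come out (that assumed prior, the function name, and the constants are all mine), and that guess is exactly the extra layer of uncertainty described above.

```python
import random

rng = random.Random(1)

def majority_prob(ps, trials=2000):
    # Monte Carlo: probability that a strict majority of voters with
    # per-voter probabilities ps choose party A.
    n = len(ps)
    wins = sum(
        sum(rng.random() < p for p in ps) > n / 2
        for _ in range(trials)
    )
    return wins / trials

# P* = average over possible sampled p-value sets of P(A wins | that set).
# The real sampling distribution is unknown, so we GUESS a Beta(2, 2)
# prior for the p-values; a different guess gives a different P*.
k = 25
P_star = sum(
    majority_prob([rng.betavariate(2, 2) for _ in range(k)])
    for _ in range(100)
) / 100
```

The outer average is over the guessed prior (our uncertainty), while each inner `majority_prob` is over the voting model itself (genuine probability), which makes the two layers explicit.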



