It always helps me to picture a probability space (as in, an actual physical spa...

mgraczyk · on Feb 16, 2017

Your second example makes completely different assumptions from your first. To get 14/23, you don't have to assume that the predictions are independent, only that they are conditionally independent given the future weather.

Let W be 1 if it will rain tomorrow and 0 if it will not rain, with 0.5 probability either way. Let A be weatherman A's prediction and let B be weatherman B's prediction. We have

  P(W = 0) = P(W = 1) = 0.5
  P(A = w | W = w) = 0.7 # Weatherman A is right 70% of the time
  P(B = w | W = w) = 0.6 # Weatherman B is right 60% of the time

  P(A, B | W) = P(A | W) P(B | W)

Now we want to know "Weatherman A predicts rain tomorrow, weatherman B predicts dry weather. What is the probability that it will rain tomorrow?"

That is, what is P(W=1 | A=1, B=0)?

    P(W=1 | A=1, B=0) = P(A=1, B=0 | W=1) P(W=1) / P(A=1, B=0)
    = P(A=1 | W=1) P(B=0 | W=1) P(W=1) / (sum_w P(A=1, B=0 | W=w) P(W=w))
    = 0.7*0.4*0.5 / (P(A=1|W=0)P(B=0|W=0)P(W=0) + P(A=1|W=1)P(B=0|W=1)P(W=1))
    = 0.14 / (0.3*0.6*0.5 + 0.7*0.4*0.5)
    = 0.14 / 0.23
    = 14/23
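
For anyone who wants to check the arithmetic, here is a minimal sketch of the same computation with exact fractions. The numbers come from the setup above; the names `prior`, `acc_a`, `acc_b`, `likelihood` and `posterior_rain` are just illustrative, and the conditional-independence assumption is baked into `likelihood`.

```python
from fractions import Fraction

# Numbers from the setup above; all names here are illustrative.
prior = {0: Fraction(1, 2), 1: Fraction(1, 2)}  # P(W = w)
acc_a = Fraction(7, 10)                          # P(A = w | W = w)
acc_b = Fraction(6, 10)                          # P(B = w | W = w)


def likelihood(a, b, w):
    """P(A=a, B=b | W=w), assuming A and B are independent given W."""
    p_a = acc_a if a == w else 1 - acc_a
    p_b = acc_b if b == w else 1 - acc_b
    return p_a * p_b


def posterior_rain(a, b):
    """P(W=1 | A=a, B=b) by Bayes' rule, summing over both values of W."""
    joint = {w: likelihood(a, b, w) * prior[w] for w in (0, 1)}
    return joint[1] / (joint[0] + joint[1])


print(posterior_rain(1, 0))  # 14/23
```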

If you don't like the assumption that the predictions are conditionally independent (herding, etc.), then you just have to come up with the conditional joint P(A, B | W). That wouldn't be difficult given a small amount of data, since W is binary. You would put a Dirichlet prior on the distribution (basically just a beta distribution with additional dimensions) and essentially just count the number of times each triple (a, b, w) occurs.
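
A rough sketch of that counting approach, assuming a symmetric Dirichlet prior with pseudo-count `alpha` over the four (a, b) outcomes for each value of w. The function name, the `alpha` default and the observations are all made up for illustration; the estimate returned is the posterior mean, not a full posterior.

```python
from collections import Counter
from itertools import product


def estimate_conditional_joint(observations, alpha=1.0):
    """Posterior-mean estimate of P(A=a, B=b | W=w) from (a, b, w) triples.

    For each w, a symmetric Dirichlet(alpha) prior sits over the four (a, b)
    outcomes, so the estimate is (count + alpha) / (total + 4 * alpha).
    """
    counts = Counter(observations)
    outcomes = list(product((0, 1), repeat=2))
    table = {}
    for w in (0, 1):
        total = sum(counts[(a, b, w)] for a, b in outcomes)
        for a, b in outcomes:
            table[(a, b, w)] = (counts[(a, b, w)] + alpha) / (total + 4 * alpha)
    return table


# Made-up observations of (a, b, w), purely for illustration.
data = [(1, 1, 1), (1, 1, 1), (1, 0, 1), (0, 0, 0), (0, 0, 0), (0, 1, 0)]
table = estimate_conditional_joint(data)
print(table[(1, 1, 1)])  # (2 + 1) / (3 + 4)
```

With this table in hand, P(A=1, B=0 | W=w) can be read off directly instead of being factored into P(A=1 | W=w) P(B=0 | W=w).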

Still, the problem isn't a lack of rigor, it's a lack of clarity in stated assumptions.

To be specific, you didn't state the assumption that P(w)=0.5, which you used to compute 14/23.

kutkloon7 · on Feb 17, 2017

I don't understand your point. I was arguing that the way the problem is posed seems to imply a unique solution. I was showing the problem was ill-posed, and that many probability problems have similar subtle or less subtle hidden assumptions. Here, this assumption is P(A, B | W) = P(A | W) P(B | W). I don't think you even need P(W = 0) = P(W = 1) = 0.5.

These assumptions usually seem quite natural to make, but can be very unrealistic (why would the predictions of weathermen be independent? I would bet they are not in reality). This is a very bad thing to teach students. It is always very important to know which assumptions you are making, and if textbooks get this wrong, it will be nearly impossible for students to get it right.

I would think of a student who struggles with this problem as a better mathematician than the student who just uses P(A, B | W) = P(A | W) P(B | W) 'because the formula is in his textbook', but the second student is more likely to get rewarded (especially in American education, since the USA seems to be especially fond of textbooks which give a ready-made recipe for every problem that a student is supposed to solve).

mgraczyk · on Feb 22, 2017

> I don't think you even need P(W = 0) = P(W = 1) = 0.5.

You do.
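
With a general prior p = P(W=1), the same Bayes computation gives 0.28p / (0.28p + 0.18(1 - p)), which only equals 14/23 when p = 0.5. A small sketch of that dependence, using the 0.7 and 0.6 accuracies from earlier in the thread and still assuming the predictions are independent given W (the function name is made up):

```python
from fractions import Fraction


def posterior_rain_given_prior(prior_rain):
    """P(W=1 | A=1, B=0) as a function of the prior P(W=1).

    Uses accuracies 0.7 (A) and 0.6 (B) and assumes A and B are
    independent given W.
    """
    rain = Fraction(7, 10) * Fraction(4, 10) * prior_rain       # A right, B wrong
    dry = Fraction(3, 10) * Fraction(6, 10) * (1 - prior_rain)  # A wrong, B right
    return rain / (rain + dry)


print(posterior_rain_given_prior(Fraction(1, 2)))   # 14/23
print(posterior_rain_given_prior(Fraction(1, 10)))  # 14/95
```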

In the article, the prior of interest is the prior gender of a child, which can be safely assumed to be 0.5. Similarly, independence of children's genders is a very good approximation as well. It wasn't necessary for the article to state these assumptions because they are obvious common knowledge.