
It is an interesting example, but after some thought it's basically identical to the elevator example. It is so similar that I have some doubts about whether I could correctly extrapolate a principle based on those two examples.

> Why is the mean the correct representative value for counting nails, but not for minimising fines? I'm arguing that it's because when the nails are weighed together the errors are signed, but when the fines are calculated the errors are unsigned (in the same way that the distances to the elevators are unsigned).

And here, there are two points that bother me.

First, the mean is the correct representative for counting nails because that's how it's defined. A mean is just the answer to any question of the form "if all of these data points were equal to each other, and their total stayed the same, what is the value they would all share?". That's the question you're asking about the weight of the nails.

Second, I find it difficult to believe in your stated explanation of why you might choose the mean over the median (or vice versa), because we've already observed that the mean minimizes the sum of squared errors, and squared errors are all unsigned. You're relying on the assumption that, in the above example, the fine (which should be minimized) is directly proportional to difference from the estimate. But that assumption doesn't appear in the explanation of why one metric or the other will minimize the fine.




> You're relying on the assumption that, in the above example, the fine (which should be minimized) is directly proportional to difference from the estimate. But that assumption doesn't appear in the explanation of why one metric or the other will minimize the fine.

Yes, my argument is specifically for the case where the fine etc. is directly proportional to the difference (or absolute difference). I'm pointing out that the two cases are similar apart from the signedness.

To put it mathematically, I'm just observing that if you have a set {x_i}, then:

- avg{|x_i - y|} is minimised by setting y to median{x_i}

- |avg{x_i - y}| is minimised by setting y to avg{x_i}
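
A minimal numerical sketch of those two statements, for anyone who wants to check them. The data set and the helper names `avg_abs_diff` and `abs_avg_diff` are made up for illustration, and a brute-force search over a grid of candidate y values stands in for a proof:

```python
from statistics import mean, median

def avg_abs_diff(xs, y):
    # avg{|x_i - y|}: average of the unsigned errors
    return mean(abs(x - y) for x in xs)

def abs_avg_diff(xs, y):
    # |avg{x_i - y}|: magnitude of the average signed error
    return abs(mean(x - y for x in xs))

# Made-up, skewed data so that the mean and the median differ.
xs = [1, 2, 3, 10, 34]

# Candidate values of y on a grid of step 0.1 covering the data range.
candidates = [i / 10 for i in range(0, 341)]

best_abs = min(candidates, key=lambda y: avg_abs_diff(xs, y))
best_signed = min(candidates, key=lambda y: abs_avg_diff(xs, y))

print(best_abs, median(xs))
print(best_signed, mean(xs))
```

On this particular data the first search should land on the median and the second on the mean. With an even number of points, any y between the two middle values would tie for the first objective.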

The excerpt in the article says "the mean minimizes the average squared distance" (i.e. setting y to avg{x_i} minimises avg{(x_i - y)^2}), which is correct but not obvious: the proof is non-trivial, and if you asked the students who chose the mean why they chose it, I bet none of them would say "because it minimises the squared distance". So I find it odd to include that statement. If I were the author I would have said "the mean minimizes the average signed distance" (strictly, it minimises the absolute value of the average signed distance, by making it zero), which I find more intuitive and think is a better explanation of why those students chose the mean. (Also, I like how "average value of the absolute difference" and "absolute value of the average difference" are the same words just in a different order.)
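
For completeness, here is a rough check of the squared-distance statement too. It leans on the identity avg{(x_i - y)^2} = avg{(x_i - m)^2} + (m - y)^2 with m = avg{x_i}, whose second term is zero only at y = m. The data and the name `avg_sq_diff` are illustrative assumptions, not anything from the article:

```python
from statistics import mean

def avg_sq_diff(xs, y):
    # avg{(x_i - y)^2}: average squared distance from y
    return mean((x - y) ** 2 for x in xs)

# Made-up integer data, chosen so the arithmetic below stays exact.
xs = [1, 2, 3, 10, 34]
m = mean(xs)

# Brute-force search over a grid of candidate y values (step 0.1).
candidates = [i / 10 for i in range(0, 341)]
best_sq = min(candidates, key=lambda y: avg_sq_diff(xs, y))

# Spot-check the identity at a few integer values of y.
for y in (0, 3, 10, 20):
    assert avg_sq_diff(xs, y) == avg_sq_diff(xs, m) + (m - y) ** 2

print(best_sq, m)
```

The grid search should pick out the mean here, and the identity shows why: moving y away from m only ever adds (m - y)^2 to the total.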



