It is an interesting example, but after some thought it's basically identical to...

hddqsb · on July 7, 2022

> You're relying on the assumption that, in the above example, the fine (which should be minimized) is directly proportional to difference from the estimate. But that assumption doesn't appear in the explanation of why one metric or the other will minimize the fine.

Yes, my argument is specifically for the case where the fine (or whatever the cost is) is directly proportional to the difference (or absolute difference). I'm pointing out that the median case and the mean case are the same apart from whether the difference is signed.

To put it mathematically, I'm just observing that if you have a set {x_i}, then:

- avg{|x_i - y|} is minimised by setting y to median{x_i}

- |avg{x_i - y}| is minimised by setting y to avg{x_i}
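
A quick numeric check of both claims, as a rough sketch. The sample `xs`, the candidate grid, and the helper names `avg_abs_diff` and `abs_avg_diff` are all made up for illustration. The sample has an odd number of points so that the median is the unique minimiser.

```python
from statistics import mean, median

def avg_abs_diff(xs, y):
    # avg{|x_i - y|}: average value of the absolute difference
    return mean(abs(x - y) for x in xs)

def abs_avg_diff(xs, y):
    # |avg{x_i - y}|: absolute value of the average difference
    return abs(mean(x - y for x in xs))

xs = [1, 2, 3, 4, 100]  # made-up sample with one outlier
candidates = [i / 10 for i in range(0, 1001)]  # brute-force grid 0.0 .. 100.0

# Pick the candidate y that minimises each quantity.
best_for_avg_abs = min(candidates, key=lambda y: avg_abs_diff(xs, y))
best_for_abs_avg = min(candidates, key=lambda y: abs_avg_diff(xs, y))

print(best_for_avg_abs, median(xs))
print(best_for_abs_avg, mean(xs))
```

On this sample the first search should land on the median and the second on the mean. A grid search only illustrates the claim for one data set; it is not a proof.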

The excerpt in the article says "the mean minimizes the average squared distance" (i.e. setting y to avg{x_i} minimises avg{(x_i - y)^2}), which is correct but not obvious (the proof is non-trivial, and if you were to ask the students who chose the mean why they chose it, I bet none of them would say "because it minimises the squared distance"). So I find it odd to include that statement. If I were the author I would have said "the mean minimizes the absolute value of the average signed distance" (in fact it makes it exactly zero), which I find more intuitive and think is a better explanation of why those students chose the mean. (Also, I like how "average value of the absolute difference" and "absolute value of the average difference" are the same words just in a different order.)
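
For anyone who wants to check the squared-distance claim, here is a sketch of the usual calculus argument. The notation f, n and the bar over x is mine, not the article's.

```latex
\begin{align*}
f(y)   &= \frac{1}{n}\sum_{i=1}^{n} (x_i - y)^2 \\
f'(y)  &= -\frac{2}{n}\sum_{i=1}^{n} (x_i - y) = 0
         \iff y = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x} \\
f''(y) &= 2 > 0 \quad\text{(so the stationary point is a minimum)}
\end{align*}
```

Note that the condition f'(y) = 0 says exactly that the average signed difference is zero, so the two descriptions of the mean pick out the same y.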