
The intuition isn't clear to me either, but the calculus is fairly simple. At the minimum, the derivative of the aggregate discrepancy (the sum of squared deviations) with respect to s is zero:

0 = d/ds sum (x_i - s)^2 = -2 sum(x_i - s) = -2 [(sum x_i) - n * s]

Thus

n*s = sum x_i

so

s = sum x_i / n
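As a numerical sanity check, the stationarity condition above can be verified directly (a minimal sketch; the sample data is arbitrary):

```python
import numpy as np

x = np.array([0, 1, 2, 3, 3, 3, 4, 5, 6, 6, 6])
s = x.mean()  # candidate minimizer: s = (sum x_i) / n

# derivative of sum((x_i - s)^2) at the mean: -2 * [(sum x_i) - n * s]
deriv = -2 * (x.sum() - len(x) * s)
print(deriv)  # essentially zero, up to floating-point error

# perturbing s in either direction increases the sum of squares
f = lambda t: ((x - t) ** 2).sum()
print(f(s) < f(s - 0.1), f(s) < f(s + 0.1))
```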




I find that turning the math into working code sometimes helps me. The for-loop in my code only approximates the mean to within ±0.05, since it searches candidate values in steps of 0.1. It relies on the definition of the mean given in the OP's article, rather than on your derivation of the arithmetic mean.

  import numpy as np

  X = [0, 1, 2, 3, 3, 3, 4, 5, 6, 6, 6]

  # maps score (sum of squared deviations) -> candidate value
  mean_dict = {}

  # grid search over candidates 0.0, 0.1, ..., 9.9
  for i in [x / 10 for x in range(0, 100)]:
      mean_dict[sum((x - i) ** 2 for x in X)] = i

  print(mean_dict[min(mean_dict)], "score:", min(mean_dict))

  # add the actual mean to the candidates
  mean_dict[sum((x - np.mean(X)) ** 2 for x in X)] = np.mean(X)

  print("real mean:", mean_dict[min(mean_dict)], "score:", min(mean_dict))
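The grid search above can also be vectorized with NumPy broadcasting, which makes a much finer grid cheap — a sketch (the 0.001 step is an arbitrary choice):

```python
import numpy as np

X = np.array([0, 1, 2, 3, 3, 3, 4, 5, 6, 6, 6])

# score every candidate at once: rows are data points, columns are candidates
grid = np.arange(0, 10, 0.001)
scores = ((X[:, None] - grid[None, :]) ** 2).sum(axis=0)

best = grid[scores.argmin()]
print(best, X.mean())  # agree to within the grid step
```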


You can get proper code formatting by indenting two spaces.



