This principle plays a crucial role in the short story "Story of Your Life" by Ted Chiang. Sadly, it did not make it into the movie Arrival, though I guess that is somewhat understandable: it would be tough to communicate in a Hollywood movie.
It went from starting a simple car analogy to this, as part of that same car analogy:
> Now the mean square of something that deviates around an average, as you know, is always greater than the square of the mean; so the kinetic energy integral...
No, I don't know, and you lost me. I don't think it matters to me anyway.
One bit of additional intuition: Since the square of a number goes up from addition by slightly more than it goes down from subtraction, perturbations increase the average of the squares. This is why the difference between the two quantities Feynman mentions is used to measure the variance of a set of numbers.
Since that’s still not precise, let’s compare the square of the mean to the mean square for two numbers a and b.
The square of the mean is ((a+b)/2)^2=(a^2+2ab+b^2)/4
The mean of the squares is (a^2+b^2)/2
Feynman’s claim is that the second is always bigger if the numbers deviate around an average (a and b aren’t equal).
So let’s subtract the first from the second. We get (a^2-2ab+b^2)/4. The numerator is equivalent to (a-b)^2. Since a square of a real can’t be negative, when a and b are unequal the mean of the squares is always larger.
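A quick numeric check of the algebra above, as a minimal sketch. The function name `mean_square_gap` and the sample pairs are made up for illustration; the code just computes both quantities and compares their difference with (a-b)^2/4.

```python
def mean_square_gap(a: float, b: float) -> float:
    """Return (mean of the squares) - (square of the mean) for two numbers."""
    mean_of_squares = (a * a + b * b) / 2
    square_of_mean = ((a + b) / 2) ** 2
    return mean_of_squares - square_of_mean


# Illustrative pairs: two unequal pairs and one equal pair.
for a, b in [(3.0, 5.0), (-2.0, 7.0), (4.0, 4.0)]:
    gap = mean_square_gap(a, b)
    # The gap should equal (a - b)^2 / 4, so it is zero only when a == b.
    print(a, b, gap, (a - b) ** 2 / 4)
```

The gap comes out positive for the unequal pairs and zero for the equal one, which is what the (a-b)^2 numerator predicts.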
You likely know that driving at a constant 60 uses less gas than driving at 30 half the time and making it up with 90. While gas consumption is not a square function of speed, it may be close enough to serve as a rough real-life analog.
I love the Feynman lectures, but I feel like he does this sort of thing all of the time. He's so keen on delivering intuition for the subject that sometimes he glosses over crucial technical details.
The problem is that to get intuition for A, you also need to have intuition for all of A's dependencies. In this case, if you had good intuition for high school math the statement would be obvious, but almost nobody builds that kind of intuition.
I'm guessing he's relying on common knowledge of variance here. If you know that variance is always nonnegative, and you also know that variance can be defined as (mean square - square of the mean), then it's obvious.
Suppose you have a list of N numbers x_i, whose mean is X.
The unaveraged variance,
∑_i (x_i - X)^2
is the sum of a bunch of squares, which are non-negative things, so the variance is non-negative.
Now, we can expand the square
0 ≤ ∑_i (x_i - X)^2 = ∑_i x_i^2 - 2 X ∑_i x_i + X^2 ∑_i 1
= N * (mean of the squares) - 2 N X^2 + X^2 N
= N * (mean of the squares) - N * (square of the mean)
and find
square of the mean ≤ mean of the squares
equality being saturated if and only if the list is a constant list.
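The same argument can be checked numerically for a list of any length. This is only a sketch: the helper names `sum_squared_deviations` and `mean_square_gap_list` and the sample lists are assumptions made for illustration, not anything from the lecture.

```python
import math


def sum_squared_deviations(xs):
    """Unaveraged variance: sum of (x_i - X)^2, where X is the mean of xs."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


def mean_square_gap_list(xs):
    """(mean of the squares) - (square of the mean) for a list of numbers."""
    n = len(xs)
    mean = sum(xs) / n
    mean_of_squares = sum(x * x for x in xs) / n
    return mean_of_squares - mean * mean


# Illustrative lists: one that varies and one that is constant.
for xs in [[1.0, 2.0, 3.0, 4.0], [5.0, 5.0, 5.0]]:
    lhs = sum_squared_deviations(xs)
    rhs = len(xs) * mean_square_gap_list(xs)
    # Both sides of the expansion should agree up to floating-point rounding.
    print(xs, lhs, rhs, math.isclose(lhs, rhs, abs_tol=1e-12))
```

For the varying list both sides are positive and equal; for the constant list both are zero, matching the equality condition.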
Fun example/illustration of the Principle Of Least Action in another realm: when it snows, consider the tracks people leave in the snow as opposed to the paths that lie beneath: they almost always represent trajectories of least action.