
With meager assumptions and a standard setup, we are in a vector space and want a notion of distance, that is, a vector space norm.

The three norms people think of first are L1, L2, and L-infinity: L1 is from absolute values. L2 is from squares. And L3 is from the absolute value of the largest value (i.e., the worst case).
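For concreteness, here is a minimal sketch of those three norms in plain Python, assuming a vector is just a list of floats. The function names are my own for illustration, and the third function is the worst-case (largest absolute value) norm, usually written L-infinity.

```python
import math

def l1_norm(v):
    # Sum of absolute values.
    return sum(abs(x) for x in v)

def l2_norm(v):
    # Square root of the sum of squares (the Euclidean length).
    return math.sqrt(sum(x * x for x in v))

def linf_norm(v):
    # Largest absolute value, i.e., the worst case.
    return max(abs(x) for x in v)

v = [3.0, -4.0, 1.0]
print(l1_norm(v), l2_norm(v), linf_norm(v))
```

For any vector the three values are ordered: the worst-case norm is at most the L2 norm, which is at most the L1 norm.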

But in addition it would be nice if the vector space with a norm were also an inner product space, with the norm coming from the inner product. Then, right, bingo, presto, for the standard inner product the norm we get is L2.

Why an inner product space? Well, with a lot of generality and meager assumptions, we have a Hilbert space, that is, a complete inner product space. The core of the proof of completeness is just the Minkowski inequality.

Being in a Hilbert space has a lot of advantages: E.g., we get orthogonality and can take projections and, thus, get as close as possible in our L2 norm. We like projections, e.g., the Pythagorean theorem. E.g., in regression in statistics, we like that the

total sum of squares is the sum of the regression sum of squares and the error sum of squares

right, the Pythagorean theorem.
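A rough sketch of that sum of squares identity, assuming simple least squares regression of y on x with an intercept. The data and the function name are made up for illustration.

```python
def regression_sums_of_squares(x, y):
    # Fit y = a + b*x by ordinary least squares, then split the variation.
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    fitted = [a + b * xi for xi in x]
    total = sum((yi - mean_y) ** 2 for yi in y)                 # total sum of squares
    regression = sum((fi - mean_y) ** 2 for fi in fitted)       # regression sum of squares
    error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))    # error sum of squares
    return total, regression, error

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
total, regression, error = regression_sums_of_squares(x, y)
print(total, regression + error)
```

The two printed numbers agree up to rounding because the fitted values are the projection of y onto the span of the constant vector and x, so the residual is orthogonal to the fit.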

We have some nice separation results. We can use Fourier theory. And there's more.

And there are some good convergence results: If we converge in L2, then we also converge in other good ways.

One reason for liking a Hilbert space is that the L2 real valued random variables form a Hilbert space, and there convergence in L2 means almost sure convergence (the best kind) of at least a subsequence and, often in practice, the whole sequence. So, we connect nicely with measure theory.

We have some representation results: A continuous linear functional on a Hilbert space is just a point in the space applied with the inner product (the Riesz representation theorem). We like linear functionals and like knowing that on a Hilbert space they are so simple.

Working with L1 and L-infinity is generally much less pleasant. That is, we really like Hilbert space.

Net, we rush to a Hilbert space and its L2 norm from its inner product whenever we can.




> And L3 is from the absolute value of the largest value

You mean L-infinity.


Thanks!

Right! I wrote "L3" but never defined an L3. So, yes, I meant L-infinity. Sorry 'bout that!

Not the first time I typed too fast!

I did omit the other L^p spaces.


Excellent points, though it's not completely clear to me why Euclidean geometry is the superior choice.


You get to do projections as in the Pythagorean theorem.

The coefficients you need in the projections are just the values of some inner products. With random variables, those coefficients are covariances, that is, much the same as correlations, which we commonly can estimate from data.
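A small sketch of that, assuming we project one centered sample onto another in plain Python. The helper names and the numbers are hypothetical. For centered samples the projection coefficient is the sample covariance divided by the sample variance, since the common normalizing factor cancels.

```python
def inner(u, v):
    # Standard inner product of two equal-length lists.
    return sum(a * b for a, b in zip(u, v))

def center(u):
    # Subtract the sample mean so inner products act like covariances.
    m = sum(u) / len(u)
    return [a - m for a in u]

def project_onto(y, x):
    # The coefficient is a ratio of inner products: <y, x> / <x, x>.
    coeff = inner(y, x) / inner(x, x)
    return coeff, [coeff * a for a in x]

x = center([1.0, 2.0, 4.0, 7.0])
y = center([2.0, 3.0, 7.0, 12.0])
coeff, y_hat = project_onto(y, x)
residual = [b - h for b, h in zip(y, y_hat)]

# The residual is orthogonal to x, so the squared lengths add up.
print(inner(residual, x))
print(inner(y, y), inner(y_hat, y_hat) + inner(residual, residual))
```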

In the multivariate Gaussian case, uncorrelated implies independent.

Fourier theory is easier in L2 than in L1. E.g., in classic Fourier series, the error in the approximation is in L2 and is from the L2 orthogonality of the harmonics.

Yes, L-infinity can also be nice: The uniform limit of a sequence of continuous functions is continuous.

Or, with L2, often get a Hilbert space but with L1 or L-infinity usually get at best just a Banach space -- that is, a complete, normed vector space. Then, yes, can get the Hahn-Banach theorem, but the same thing in Hilbert space is easier.

There is a sense in which L1 and L-infinity are duals of each other, but L2 is self-dual which is nicer.

Filling in all these details and more is part of functional analysis 101. There it's tough to miss at least three books of W. Rudin: Principles of Mathematical Analysis, Real and Complex Analysis, and Functional Analysis.

There's more, but I've got some bugs to get out of the software of my Web pages!

I like the question -- asked it myself at the NIST early in my career. The answer I gave here is better than what people told me then.

I've indicated likely most of the main points, but my answer here is rough and ready (I typed too fast), and a quite polished answer is also possible -- I just don't have time today to dig out my grad school course notes, scan through Rudin, Dunford and Schwartz, Kolmogorov and Fomin, much of digital filtering, much of multi-variate statistics, etc.


I intuit that you're getting at the real answer with the self-dual. That makes a lot of sense. Also, from a practical perspective L2 is very nice because it causes the problem of error reduction to be quadratic, so it scales well.

Time to dig out Rudin.


This sounds like lampposting -- choosing a model because the math is nice, not because it is an accurate model.


No: If what you want is the L-infinity norm, then go for it. A standard place for that is numerical approximations of special functions -- want guarantees on the worst case error. And there is some math to help achieve that. It's sometimes called Chebyshev approximation.

But, in practice, the usual situation, e.g., signal processing, multi-variate statistics, there's no good reason not to use L2 and many biggie reasons to use it. E.g., for a given box of data, commonly the better tools in L2 just let you do better.

Or to the customer: "If you will go for a good L2 approximation, then we are in good shape. If you insist on L1 or L-infinity, then we will need a lot more data and still won't do as well."

Again, a biggie example is just classic Fourier series. Sure, if you are really concerned about the Gibbs phenomenon, then maybe work on that. Otherwise, L2 is the place to be.
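A quick numerical sketch of that contrast, assuming a unit square wave on [-pi, pi] and its classic sine series. The grid size and the term counts are arbitrary choices of mine. The L2 error of the partial sums shrinks as terms are added, while the overshoot near the jump (the Gibbs phenomenon) does not go away.

```python
import math

def square_wave(t):
    # +1 where sin(t) >= 0, -1 elsewhere.
    return 1.0 if math.sin(t) >= 0 else -1.0

def partial_sum(t, n_terms):
    # First n_terms odd harmonics of the square wave's Fourier series.
    return (4.0 / math.pi) * sum(
        math.sin((2 * k + 1) * t) / (2 * k + 1) for k in range(n_terms)
    )

def l2_error_and_peak(n_terms, n_grid=4000):
    # Midpoint grid on [-pi, pi]; it avoids the jump points themselves.
    dt = 2.0 * math.pi / n_grid
    squared_error = 0.0
    peak = 0.0
    for i in range(n_grid):
        t = -math.pi + (i + 0.5) * dt
        s = partial_sum(t, n_terms)
        squared_error += (square_wave(t) - s) ** 2 * dt
        peak = max(peak, s)
    return math.sqrt(squared_error), peak

l2_few, peak_few = l2_error_and_peak(5)
l2_many, peak_many = l2_error_and_peak(40)
print("L2 error:", l2_few, l2_many)
print("peak value:", peak_few, peak_many)
```

The peak stays noticeably above 1 in both runs, so the worst-case error near the jump does not shrink even though the L2 error does.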

E.g., L1 and L-infinity can commonly take you into linear programming.

Generally you will be much happier with the tools available to you in L2.


Again, just now I just don't have time for a more full, complete, and polished explanation.

A really good explanation would require much of a good ugrad and Master's in math, with concentration on analysis and a wide range of applications. I've been there, done that but just don't have time to write out even a good summary of all that material here.


putting a shout in for Bollobas here, surely all roads to functional analysis don't lead through Rudin?


Right. Also, say, Kolmogorov and Fomin. And Dunford and Schwartz.

No doubt the full literature is enormous -- I don't know all of it!

But Rudin is a good author, and as a writer got better, less severe in style and, thus, easier to read, over time in his career.



