Proof of the Singular Value Decomposition

FabHK · 2024-03-13T13:00:00 1710334800

Very nice (though hardly approachable unless you already have a good handle on, say, eigenvalue decomposition, outer products = rank 1 matrices, positive semi-definiteness, etc.).

NB: I think many of these linear algebra proofs would benefit (in terms of legibility) if the dimensions of the matrices/equations were annotated beneath them. (I created a LaTeX macro for my master's thesis to do just that.)

patrick451 · 2024-03-13T13:48:39 1710337719

> (though hardly approachable unless you already have a good handle on, say, eigenvalue decomposition, outer products = rank 1 matrices, positive semi-definiteness, etc.)

These are what are known as prerequisites. This expectation on HN that all science and math should be made approachable without requisite prior knowledge is so bizarre to me. On the other hand, if blog post is about move semantics, nobody chimes in "ha, try understanding that without knowing about constness first".

jacobolus · 2024-03-13T15:38:51 1710344331

> expectation on HN that all science and math should be made approachable without requisite prior knowledge

I doubt anyone expects this, but there are sharp trade offs involved. The more accessible you make a presentation, the wider the possible audience. With each year of additional technical training you expect, the total audience drops significantly; a math article targeting people whose math training only made it to an introductory calculus course can be read by perhaps 1000 times as many people as a math article with prerequisites from grad-school pure math coursework.

FabHK · 2024-03-13T16:33:39 1710347619

And depending on what audience you're writing for, you might want to say something about the prerequisites. This article didn't (maybe the blog is read mostly by mathematicians or maths students; the author is not necessarily delinquent in their duties). But given that, I outlined some prerequisites so that readers here know what to expect.

gwgundersen · 2024-03-13T15:04:21 1710342261

I have a theory that whatever popularity my blog has comes from writing to the level of the reader. My favorite example of the opposite behavior is how every blog post on a topic involving Bayesian inference seemingly must start with Bayes’ formula.

syockit · 2024-03-13T14:00:40 1710338440

Something like this?

    \underset{m,r}{U}\underset{r,r}{\Sigma} = \underset{m,n}{A}\underset{n,r}{V}

I guess it's helpful when first introducing the variables, but end up being visual clutter as you try to follow the equations later on.

FabHK · 2024-03-13T17:03:25 1710349405

Yes, \underset. My old macro (see sister comment) ensures math mode, switches to a smaller font, uses a "x" (\times). You'd then write (with \cdots for better spacing)

  \ddim Umr \cdot \ddim {\Sigma}rr = \ddim Amn \cdot \ddim Vnr

to get https://www.fabian-lischka.de/files/Various/ddim-ex3.png

ballooney · 2024-03-13T13:33:58 1710336838

Just out of interest would you mind sharing an example (eg a bit of screengrab) of the rendered result of the macro, and/or the macro itself? It sounds useful but I can’t quite visualise.

FabHK · 2024-03-13T14:02:05 1710338525

So, the macro is

    \newcommand{\ddim}[3]{\ensuremath{\underset{\scriptscriptstyle #2\times #3}{\rule[-1ex]{0ex}{+1ex}#1}}}

And you use it e.g. like `\ddim {A}NN` to indicate that A is NxN, or `Ax \ddim {=}N1 b` to show that this is an Nx1 equation. I suppose one could fine-tune it, but it was good enough for me.

Looks like this:

https://www.fabian-lischka.de/files/Various/ddim-ex1.png

https://www.fabian-lischka.de/files/Various/ddim-ex2.png

ballooney · 2024-03-13T14:38:06 1710340686

Very neat - thank you for sharing.

richrichie · 2024-03-13T14:42:36 1710340956

Strang builds up to SVD proof from basically a rudimentary matrix product over a few chapters. The author will have to reproduce those chapters to give a full elementary proof.

mrbungie · 2024-03-13T15:54:50 1710345290

IANAMathematician so I don't know if it's already a discussion, but I think that focusing in these kind of notation improvements would make maths both easier to grasp and more fun to learn.

PS: I'm not sure if it's something that bothers other people but sometimes it bothers me. Especially notable that to me there are some domains that seem to use weird notation as a shibboleth or to make it artificially harder to think about.

nicf · 2024-03-13T16:25:09 1710347109

For what it's worth, I am a mathematician, and I'll say that there is definitely a lot of awkward notation out there in math, and some of it I would agree is even fair to describe as just bad. But I'm pretty confident in saying that basically no one is malevolently inventing notation just to make other people's lives harder! This just isn't how mathematicians think, and anyway if you intentionally invented terrible notation for your own field, you'd be hurting yourself much more than anyone else.

There are plenty of other reasons mathematical notation might be hard for beginners:

* The notation in question might have originally been invented for a narrower purpose than it's being used for now, and has been stretched beyond the domain where it fits well.

* Someone made a weird decision a hundred years ago, but now everyone's used to writing things that way, and if you want to be understood it's easier to use the same notation as everyone else.

* The notation might actually be better than you're giving it credit for, but this is only obvious once you've spent a while working in the field.

Overall, I've noticed that people coming to math from CS tend to put more weight on the question of notation than mathematicians do. I will say, from my perspective, I think it's very easy to make a bigger deal out of notation than necessary. A lot of math is just authentically pretty tricky and takes effort to learn no matter what notation you use, and I think it's pretty rare for the notation to actually be the lowest-hanging fruit here. This isn't to say that there's no mathematical notation that could use improvement (quite the opposite!) but I don't think making those changes would have quite the impact you might be imagining.

jcranmer · 2024-03-13T17:11:37 1710349897

There are two other big issues with math notation that I see:

* The tendency of math to have many different notations to express the same thing. My calculus textbook started by apologizing for the like seven different notations for "take the derivative of this function". Would it kill mathematicians to just pick one notation and standardize on it?

* The number of times the same notation can mean two different things. I've always been particularly bothered by stuff like implicit adjacency is multiplication, except when it's not. Is "a(b + c)" supposed to mean "add b to c and use the result as the argument for a" or is it supposed to mean "add b to c and multiply it with a"? You'd think it should be obvious, but functions are a ring, with multiplication corresponding to composition...

nicf · 2024-03-13T18:28:53 1710354533

Sure, but I think we're just talking about two slightly different things here: I was trying to describe the "sociological" reasons why math notation might end up in a state that you don't like rather than trying to list the particular things you might not like about it.

I do agree with you that calculus textbooks often go overboard with notations for the derivative, and in fact I remember being taught some (like a capital D with a subscript x) that I then never saw again for the rest of my life.

But as to the question of why "mathematicians" would do this, I don't think the answer is all that mysterious: "mathematicians" is a category that includes a lot of different people in different places and there's no central committee that enforces notation standards! It's like that one XKCD comic everyone posts about inventing new standards.

ykonstant · 2024-03-13T13:38:19 1710337099

Can you share the macro? Merely subscripting the matrix with dimensions as in

    \begin{equation*}
     \begin{bmatrix}
      a & b & c \\
      d & e & f \\
      h & h & j \\
     \end{bmatrix}_{3 \times 3} = 5
    \end{equation*}

creates an unpleasant horizontal void in the equation.

Lucasoato · 2024-03-13T13:45:57 1710337557

Remember that if you consider the SVD for complex matrixes, you should use the Hermitian transpose and not only the usual transposition (symbol used on that page: A^T). Of course, it's the same in this case since it's considering real matrixes.

winwang · 2024-03-13T16:52:40 1710348760

Shoulda used Dirac notation :P

SpaceManNabs · 2024-03-13T15:15:34 1710342934

I have ran into this blog on my own while looking up the gumbel distribution. this is a very high quality blog. surprisingly, one of my collaborators was also credited on one of the posts. small world. it makes me wanna say the blog is even more high quality than i thought.

enthdegree · 2024-03-13T13:52:47 1710337967

Why is the matrix diagonalizable with an eigendecomposition of that form?

nicf · 2024-03-13T14:03:34 1710338614

This is the Spectral Theorem. It says (in the version for real matrices, which are what we have here) that any symmetric matrix has an orthonormal basis of eigenvectors, or equivalently that any symmetric matrix can be diagonalized via an orthogonal change of coordinates.