I would also add that one fundamental aspect of linear algebra (that no one ever taught me in a class) is that non-linear problems are almost never analytically solvable (e.g. e^x= y is easily solved through logarithms, but even solving xe^x=y requires Lambert’s W function iirc). Almost all interesting real world problems are non-linear to some extent, therefore, linear algebra is really the only tool we have to make progress on many difficult problems (e.g. through linear approximation and then applying techniques of linear algebra to solve the linear problem).
This is correct, which is why the Implicit Function Theorem is heavily underappreciated by newcomers: it says, roughly, that "locally, calculus = linear algebra", meaning that at a small enough scale every smooth equation behaves like linear algebra.
FWIW I lead with this whenever I find myself teaching lin. alg., and I agree it's the most important piece of context for students entering the subject (and presumably embarking on undergrad math).
Yeah, it’s funny how so much of high school is spent focusing on the “exceptional” solvable cases of nonlinear equations (low-degree polynomials, simple trig equations, exponentials) that one can come away with the skewed idea that solvability of nonlinear equations is more common than it actually is. While I understand that it’s important to build up a vocabulary of basic functions (along with confidence manipulating them), I think it is also important to temper expectations with the reality that nonlinear behaviors are so diverse and common that it is a small miracle we have somehow discovered enough examples of analytically solvable systems to enable us to understand a rich subset of behaviors!
One of my favorite perspectives on the difficulty of formulating a general theory of PDE in light of the difficulties posed by nonlinearities is Sergiu Klainerman’s “PDE as a unified subject” https://web.math.princeton.edu/~seri/homepage/papers/telaviv.... If I understand correctly, any general theory of PDE would have to incorporate all the subtle behaviors of nonlinear equations, such as turbulence (which has thus far evaded a unified description). Indeed, “solvable” nonlinear systems in physics are so special that Wikipedia has a list of them: https://en.m.wikipedia.org/wiki/Integrable_system. With this perspective, I’m tempted to say (in a non-precise manner) that solvable systems are the vanishingly small exception to the rule in a frighteningly deep sea of unsolvable equations.
Perhaps not what you’re looking for, but if you can get through Griffiths’s Quantum Mechanics you can likely get through Axler. I found it helpful to draw examples from QM when self-studying Linear Algebra Done Right.
I’ve been watching the 18.06SC linear algebra course on MIT OpenCourseWare (taught by Gilbert Strang) and it’s really great - one of the best courses I’ve ever seen. I’ve just started the follow-up course that covers applications.
Isn't it so good?! The way he starts the class having you think you missed a reading or something, and then by the end of the lecture you have an aha moment. He crams an aha moment into each lesson, too. So good.
axler’s text is phenomenal, but it’s probably not what you’re looking for if you want an “applied” view into computational techniques gleaned from linear algebra. the text centers on finite-dimensional vector spaces in the standard, mathy, axiomatic way, which is far more general than the prototypical numerical usage in standard programming problems in most swe jobs
I agree here. I tried to power through the book (which to be fair is really clear, didactic and still concise) and I was stumped because I'm more practical minded, and I'd be trying to look for examples in my domain (signal processing: filtering, beamforming, mle/map, compression, etc.) and I'd be stumped quite fast.
I appreciate any textbook with interesting real world examples and slow worked-through solutions. But maybe I'm a bit too lazy to do the work myself.
I still haven't grasped really what svd does, why it is different from eigenstuff (and... well... what eigenvalues/vectors are...) and the link between those and solving linear systems, and with the characteristic polynomial, and matrix inversion, and... I have intuitions, and I can mostly implement the stuff, but no clear understanding.
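For what it's worth, here is a small NumPy sketch (my own illustration, not from any of the books mentioned) of how the SVD, eigendecomposition, and linear-system solving hang together:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))  # a generic rectangular matrix

# SVD: A = U @ diag(s) @ Vt. It always exists, even for rectangular A,
# and the singular values s are non-negative.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
assert np.allclose(U @ np.diag(s) @ Vt, A)

# Link to "eigenstuff": the singular values of A are the square roots
# of the eigenvalues of the symmetric matrix A.T @ A (eigendecomposition
# only applies to square matrices, which is one reason SVD is more general).
eigvals = np.linalg.eigvalsh(A.T @ A)       # ascending order
assert np.allclose(np.sort(s**2), eigvals)

# Link to solving linear systems: the pseudo-inverse used by
# least-squares solvers is built directly from the SVD.
pinv = Vt.T @ np.diag(1 / s) @ U.T
assert np.allclose(pinv, np.linalg.pinv(A))
```

In signal-processing terms, this is why SVD shows up in beamforming and compression: it factors any linear operator into rotations and independent per-axis gains.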
This seems like a way of viewing a small subset of linear algebra (matrix multiplication). My favorite approach is 3Blue1Brown's visual one, also available on Khan Academy.
This article leaves out the key insight of matrices as a transformation of space.
For an intuitive and comprehensive book on linear algebra, Mike Cohen has self-published an excellent book on linear algebra [1]. He also has a very popular Udemy course on the same subject [2].
I really enjoyed this; it almost reads as a primer in a less academic order of operations, something more natural in the form of intuitive learning. Thanks for sharing!
Excel has the built in MMULT function, but I’m not aware of any built-in support for eigenvalues or eigenvectors. Many people have written such functions though.
That said, I would be surprised if Excel spreadsheets were implemented as matrices. Since you can update one cell and have it automatically update any computation that uses that cell, I would expect spreadsheets to be implemented with some sort of dependency graph so it’s easy to traverse and update the values that need to be changed. (This could be implemented as an adjacency matrix, but I haven’t seen that representation used before for programming language dataflow analysis.)
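A minimal sketch of the dependency-graph idea (hypothetical, not how Excel is actually implemented): each cell is a constant or a formula over other cells, and an update recomputes formula cells in dependency order.

```python
# Hypothetical spreadsheet core: cells are constants or formulas over
# other cells; updating one cell re-evaluates formulas in topological
# order. (A real engine would walk only the affected dependents.)
from graphlib import TopologicalSorter

values = {"A1": 2, "A2": 3}
formulas = {"B1": (lambda v: v["A1"] + v["A2"], ["A1", "A2"]),
            "C1": (lambda v: v["B1"] * 10, ["B1"])}

def recompute(changed_cell, new_value):
    values[changed_cell] = new_value
    # Map each formula cell to the cells it reads from.
    deps = {cell: set(srcs) for cell, (_, srcs) in formulas.items()}
    # static_order() yields cells with all their inputs first.
    for cell in TopologicalSorter(deps).static_order():
        if cell in formulas:
            values[cell] = formulas[cell][0](values)

recompute("A1", 5)
print(values["C1"])  # 80
```

The topological sort is exactly the "easy to traverse and update" property: B1 is always evaluated before C1, no matter how the formulas were entered.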
Interesting idea. I wonder if it would be possible to create a spreadsheet app where every "sheet" was a matrix and you would apply linear algebra operations between the sheets to produce output sheets. Would that be all a spreadsheet app needs? It might be a simpler, unifying design for spreadsheet programming.
The problem I've had with Excel etc. is that every cell can hide an operation, some do and some don't, and it becomes difficult to understand what the totality of the calculation is. Whereas if it could be expressed as operations between matrices, the whole calculation could perhaps be expressed as a single formula?
A prominent sentiment in the comments here is that this resource isn't that good. I only studied up to Calculus II, so what would be a good resource to approach LA?
It really depends on what you’d like to learn LA for, and how comfortable you are with abstraction: LA can span from concrete multiplication of matrices and vectors all the way to the very abstract (e.g. vector spaces over general fields, or even modules over rings). I know many people recommend Gilbert Strang’s introductory linear algebra text (I have not read it, but it seems to fall into the former camp), but I might also recommend Sheldon Axler’s Linear Algebra Done Right. In all honesty, I learned LA best from David Griffiths’s quantum mechanics text, although it is not as comprehensive in its coverage of the subject (not that it should be, given that it is a physics text). I guess I am trying to say that there are many different flavors and interpretations of linear algebra, and while starting with just matrices and vectors may be simpler at first, it does tend to rob the subject of its richness and depth (e.g. what do these matrices represent, what are the canonical structures, etc.), so I am a bit biased towards going for full generality first, perhaps with a more rote computational book on the side (I understand we all have limited time, though).
3b1b’s Essence of Linear Algebra videos on YouTube do the best job I’ve ever seen of explaining the intuition and motivation for linear algebra and matrices. They aren’t a complete course in and of themselves; for that you’d need a textbook. But having learned LA the traditional textbook way and then watching the 3b1b videos a decade later, I would strongly encourage you to watch his videos first before opening a textbook. Once you understand the intuition, the stuff you read in textbooks makes a lot more sense and is much better motivated.
Math is about concepts, structures, and relationships. It’s not about definitions and theorems; we use those only to formalize the concepts we’ve thought of. That’s why I recommend the “intuitionist” approach for learning math. You can fill in the computations and proofs later.
Here's another vote for 3Blue1Brown's series on linear algebra [1]. Spending a few hours on this series will easily save you dozens of hours when going through a comprehensive LA textbook.
Sib comments are suggesting textbooks (good ones at least, IMO), but FWIW, if I were trying to give someone a 1 or 2 hour intro to LA, it would be:
- Strang's 4 subspaces: https://web.mit.edu/18.06/www/Essays/newpaper_ver3.pdf , and just enough supporting material to understand that at a basic level, and hopefully finishing up with the SVD, which to me is basically a summary of LA in one equation.
- A quick look at the eigenvalue problem, Ax = 𝛌x, with the key note that hey, in the vast majority of cases, Ax does not equal a scaled version of x (it took me way too long to realize that's what we were doing here when I was first learning LA). <roll tacoma narrows bridge tape here>
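To make that key note concrete, here's a small NumPy illustration (my own, not from Strang's materials): a generic vector gets rotated as well as scaled by A, but an eigenvector only gets scaled.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# A generic vector changes direction under A...
x = np.array([1.0, 0.0])
print(A @ x)  # [2. 1.] -- not a scalar multiple of x

# ...but an eigenvector only gets scaled: A v = lambda * v.
eigvals, eigvecs = np.linalg.eig(A)
v = eigvecs[:, 0]
assert np.allclose(A @ v, eigvals[0] * v)
print(np.sort(eigvals))  # [1. 3.]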
These kinds of explanations are so meh to me. Linear algebra is useful once you begin to look for vector spaces you didn't know you had.
Thinking of matrices as spreadsheets is barely abstraction. Seeing the derivative operator represented as a matrix, acting over the polynomial vector space can open your eyes.
Taking the determinant of that matrix shows that d/dx isn't invertible.
Thinking of the fixed point of the transformation yields exp, the eigenfunction of the operator.
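That derivative-as-matrix example is easy to play with numerically. A sketch (my own, assuming NumPy): represent polynomials of degree < 4 as coefficient vectors in the basis {1, x, x², x³}, so d/dx becomes a 4×4 matrix.

```python
import numpy as np

# d/dx on polynomials of degree < 4, in the basis {1, x, x^2, x^3}:
# x^k maps to k * x^(k-1), so column k has entry k in row k-1.
D = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3],
              [0, 0, 0, 0]], dtype=float)

# p(x) = 1 + 2x + 3x^2  ->  p'(x) = 2 + 6x
p = np.array([1.0, 2.0, 3.0, 0.0])
print(D @ p)             # [2. 6. 0. 0.]

# det(D) = 0: differentiation is not invertible on this space
# (all the constants sit in its kernel).
print(np.linalg.det(D))  # 0.0
```

(The eigenfunction exp isn't a polynomial, which is exactly why no eigenvector of D with nonzero eigenvalue shows up in this truncated space.)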
Right, and that's a perspective you typically pick up in a second course in linear algebra. The key insight really is that the core concept is that of a vector space, rather than vectors per se. The only thing we really ask of vectors is that it be possible to form linear combinations of them with coefficients from your favorite field. Other than that, vectors themselves aren't that interesting: it's more about functions to and from vector spaces, whether it's a linear function V -> V or a morphism V -> W between two different vector spaces.
This is actually a common theme of mathematics, that the individual objects are in some sense less interesting than maps between them. And, of course, the idea that any time you have a bunch of individual mathematical objects of the same type, mathematicians are going to group them together and call it a "space" of some kind.
In fact, my previous paragraph is pretty much the basis for category theory. One almost never looks at individual members of a category other than a few, selected special objects like initial and terminal objects. A lot of algebra works in a similar way. If I could impart one important insight from all the mathematics I've read, done, and seen, it would be this idea of relations being more important than the things themselves.
Just plain algebra is already abstract math, and it's the everyday math that overlaps most with common programming work.
I didn’t even know until today there was a concept called linear algebra, it was taught to me as introductory geometry alongside other geometry concepts. So that’s neat to learn!
Yes I think this spreadsheet view is so detrimental and confusing for newcomers. I'm not even sure the analogy makes sense. The key part of linear algebra imo is the concept of linear transformations.
T(a+b)=T(a)+T(b)
Matrices just happen to be one way of expressing those transformations.
And for extra magic, since every vector space has a basis, every linear transform between vector spaces with a finite basis can be represented by a finite matrix (https://en.m.wikipedia.org/wiki/Transformation_matrix). While this might feel obvious if you haven’t explored structure-preserving transforms between other types of algebraic objects (e.g. groups, rings), it is in fact very special. Learning this made me a lot more interested in linear algebra. It unifies the algebraic viewpoint that emphasizes things like the superposition property (T(x+y) = T(x) + T(y) and T(ax) = aT(x)) with the computational viewpoint that emphasizes calculations using matrices.
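That "columns are the images of the basis vectors" recipe is short enough to demo directly. A sketch (my own, assuming NumPy and a made-up linear map T on R²):

```python
import numpy as np

# A sample linear map on R^2 (hypothetical, for illustration).
def T(v):
    return np.array([2 * v[0] - v[1], v[0] + 2 * v[1]])

# The matrix of T: its columns are the images of the basis vectors.
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
M = np.column_stack([T(e1), T(e2)])

# Superposition holds, and M reproduces T exactly on any vector.
x, y, a = np.array([3.0, -1.0]), np.array([0.5, 2.0]), 4.0
assert np.allclose(T(a * x + y), a * T(x) + T(y))
assert np.allclose(M @ x, T(x))
```

This unification is exactly why the two viewpoints coexist: the algebraic properties pin T down on a basis, and the matrix carries out the computation everywhere else.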
Since all linear transforms between vector spaces with a finite basis can be represented by finite matrices, the computational tools make it tractable to calculate properties of vector spaces that aren’t even decidable for e.g. groups. For a simple but remarkable example: any two finite-dimensional vector spaces of the same dimension (over the same field) are isomorphic, but in general it’s undecidable to compute whether two finitely-presented groups are isomorphic.
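Concretely, the vector-space isomorphism question collapses to comparing dimensions, which is just a rank computation. A sketch (my own, assuming NumPy):

```python
import numpy as np

# Finite-dimensional vector spaces over the same field are isomorphic
# iff they have the same dimension, so comparing the column spaces of
# two matrices "up to isomorphism" reduces to a rank computation.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
B = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 3))

dim_A = np.linalg.matrix_rank(A)  # dimension of A's column space
dim_B = np.linalg.matrix_rank(B)
print(dim_A == dim_B)  # True: the column spaces are isomorphic
```

Contrast that one-line check with the group case, where no algorithm can decide isomorphism of finitely-presented groups at all.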
A semi-decidable problem is still pretty bad news from a computational perspective, but I agree that it's not the best example of what I was trying to illustrate. I was aiming for something dramatic and (somewhat) approachable, but ended up emphasizing properties of vector spaces as free abelian groups, rather than as vector spaces per se (which undermines my emphasis of the specialness of vector spaces in comparison to other algebraic structures). That said, to the best of my knowledge, the algorithms for computing whether two finitely-generated* abelian groups are isomorphic take advantage of the close relationship between finitely-generated abelian groups and vector spaces: they compute the Smith normal form of matrices associated with the groups and then compare the normal forms. This takes roughly O(nm · (sublinear factors)) for n x m matrices[0]. So to revise my example: vector spaces with a finite basis (and any finitely-generated free abelian group) can be compared for isomorphism in constant time, and finitely-generated non-free abelian groups take time roughly quadratic in the number of generators, so there is still a huge win there.
Do you have a favorite example that highlights the unique computational properties of vector spaces?
*I don't know how this changes in the finitely-presented case, but I assume the extra constraint can be used to improve the performance of the algorithms. It's a lot easier to find asymptotic analysis of the finitely-generated case though and I don't see a way around dealing with the fact that it's still not free.
If I may add, I found "useful magic" like discrete Fourier transforms, local linear approximations, and homogeneous differential equations to be exciting examples for motivating students into the abstract theory of linear transformations
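The DFT example is especially nice because it is literally a matrix. A sketch (my own, assuming NumPy) building the N×N DFT matrix and checking it against the FFT:

```python
import numpy as np

# The discrete Fourier transform is a linear map: an N x N matrix F
# with entries exp(-2*pi*i*j*k/N) acting on signal vectors.
N = 8
j, k = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
F = np.exp(-2j * np.pi * j * k / N)

x = np.random.default_rng(2).standard_normal(N)
assert np.allclose(F @ x, np.fft.fft(x))  # matches the FFT
```

The FFT is then "just" a fast algorithm for multiplying by this particular dense matrix, which is a nice hook for students who already care about signals.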
Ditto. More broadly, I am bored by efforts to rehash the same introductory material from whatever your given technical topic is (math, programming, machine learning). There are already really good books out there on these things, written by masters, that do a much better job than blogs like this (provided you read them properly).
To people out there writing educational blogs: do more research and find good, well written, timeless resources to point people to for the basics. Spend your energy writing something new that we haven't all already read.
You might find people doing this and not notice it. Sometimes the educational progress is formulating a thing you already know for a subset of people who will receive it more effectively in that format. Might be boring to you, might be brain exploding revelatory for someone else. Even a better articulation of something which might have helped you learn can be in that category! Keep in mind you’re judging education of material you already know.
I used to teach this. One of the key ideas is to get rid of 3d geometry and state, from the beginning, huge-sized problems (simple models of traffic using Kirchhoff’s laws, image convolution, statics…). Otherwise, why define the determinant? Just compute it. Or eigenvalues? Or kernels?