Author here. I like to look over HN during breakfast, so this was a surprise this morning. I appreciate folks' kind words.
Can I slide in the back story? The course I had as an undergrad was very computational. Matrix multiplication is associative because the indices combine in this way and then you distribute the a_i,j's, etc. I could follow it line-by-line but I didn't feel any real understanding.
In grad school I learned that it all makes beautiful sense. So when I started teaching, I wanted an approach that conveyed the sense. I adopted a book that is a great book, very clear and clean, but students did not understand it. They were good students, but they just were not ready. They needed more examples, and by-hand calculations, for instance.
I resolved to find a presentation that helped bring them to where they would be ready, or at any rate formed part of a deliberate effort by an undergrad program to work on maturity. I didn't find a book that worked for me, so I wrote one.
So although there are a lot of examples and a lot of computation, like the text I had as an undergrad, this development keeps trying to direct the attention of students toward the understanding that they get from higher-level ideas.
I often see online where someone asks "What is the right book for XXX?" It seems to me that often there is no one best. Often, it depends on the audience. For the students that I see, this approach accomplishes a lot, it seems to me.
> So although there are a lot of examples and a lot of computation, like the text I had as an undergrad, this development keeps trying to direct the attention of students toward the understanding that they get from higher-level ideas.
I have not read your book, but this to me is one of the commonalities I have found in all successful curricula for difficult subjects: an effort to tie new material to old material by repeated comparison. The insight that comes from persistently trying to re-examine previous concepts through the lens of a new one, or vice versa, is so satisfying and illuminating. Nowadays, I try to force myself to make these kinds of connections and comparisons whenever I learn something new.
reminds me of the semester sophomore year where I was concurrently taking discrete math in the math dept., computer organization (prereq for architecture, basically covers binary number representations and logical implementation of adders, muxers, etc.) in the cs dept., and symbolic logic in the philosophy dept. it was a great feeling when I realized a few weeks into the semester that I was essentially taking three versions of the same course but from very different perspectives.
Elaboration and reprocessing is a good way to strengthen memories. The best though is effortful recall -- pulling up a memory that you are close to forgetting.
Wanted to take this chance to thank you for writing such a lucid and at the same time complete textbook, and then going ahead and making it freely available. It's the first maths textbook that I went through on my own, and might stay the only one, because any other text reads convoluted after yours.
I too never really grokked some essential points of linear algebra until I (re)learned it from a chapter in Hungerford's Algebra book, part of Springer's yellow GTM series. I was taught from Strang in undergrad, and while I did well in the course, I often found myself blind as to what all these computations meant. I recall that Strang had said in his preface that the subject had been taught too abstractly and its crucial importance missed, which, judging from e.g. Greub or other treatises of that era, may be true.
> The course I had as an undergrad was very computational
Do you remember whether that was a course specifically targeting CS students? I am asking because the linear algebra course at "my" university was a course for CS and math students and it was (sadly) not computational at all. If I remember correctly (it's quite some years ago), we started with rings and fields, then vector spaces, and matrices kind of naturally followed from that.
This seems unnecessary for a first class. I would agree with the parent here that the class should help build intuition and theoretical understanding. Presenting a rich theory with full generality is probably not the best way to do that though. There are some beautiful concepts and geometric understanding to be had with the reals. The abstraction to vector spaces defined over arbitrary rings and fields builds naturally once you have that base.
Page 238 has the key intuition. But to really understand it you will need some build up to understand linear functions.
Matrices are simply a way to write down linear functions, and matrix multiplication is function composition. All of the algebraic properties of function composition are therefore true of matrices and vice versa.
That intuition also explains why matrices come up so often. For example consider Calculus. The key idea behind the differential calculus is that if y is close to y0 then f(y) is approximately f(y0) + f'(y0) (y - y0).
In the multivariable calculus the same thing is true. Except that f'(y0) is a linear function. Which means that when we write it down we have to write down a matrix.
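To make that concrete, here is a minimal numpy sketch (the particular f and the two points are arbitrary choices, just for illustration): the matrix of partial derivatives plays the role of f'(y0) in that approximation.

    import numpy as np

    # an arbitrary example map f: R^2 -> R^2
    def f(v):
        x, y = v
        return np.array([x * y, x + y**2])

    # the derivative of f at v is a linear function; written down, it is this matrix
    def deriv(v):
        x, y = v
        return np.array([[y,   x],
                         [1.0, 2 * y]])

    y0 = np.array([1.0, 2.0])
    y  = np.array([1.1, 2.1])   # a point close to y0

    print(f(y))                              # [2.31 5.51]
    print(f(y0) + deriv(y0) @ (y - y0))      # [2.3  5.5] -- agrees to first order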
Add me to the list of people who have the same question, but with a slightly different perspective.
I understand the interpretation of matrix multiplication as a linear combination, but I would also like to understand what is occurring geometrically. If I matrix multiply a 2D vector <a, b> by the basis vectors <<1, 0>, <0, 1>>, then I understand I am "weighting" the x and y unit vectors (as defined by my basis) by the x-component and y-component of my <a, b> vector, and then adding them together. So geometrically, I'm scaling by factors of a and b, then summing to get my resulting vector.
So, the way I'm trying to think of it geometrically is: any matrix is just a representation of some kind of basis (i.e some kind of linear system). I picture this as a sort of orthogonal cartesian grid that we squish and stretch while still being a series of intersecting parallel lines.
Matrix multiplication is therefore taking in some vector, "weighting" the basis vectors represented by our matrix by the components of that vector, and then summing them. To put it another way, we're essentially mapping a vector into the new basis system represented by our matrix.
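To make that reading concrete, a quick numeric check (assuming numpy; the matrix and the vector here are arbitrary examples): multiplying by the matrix gives the same answer as weighting its columns by the components of the vector and summing.

    import numpy as np

    M = np.array([[2.0, 1.0],    # columns of M are the images of the
                  [0.0, 3.0]])   # basis vectors (1,0) and (0,1)
    v = np.array([4.0, 5.0])     # the vector <a, b> with a=4, b=5

    # weight each column by the matching component of v, then sum
    weighted = v[0] * M[:, 0] + v[1] * M[:, 1]

    print(M @ v)      # [13. 15.]
    print(weighted)   # [13. 15.] -- the same thing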
Is that correct? Is there a better way to try and think of matrix multiplication geometrically?
1. Linear maps are determined by their behavior on basis elements
2. The composition of linear maps is still linear
Together, this means you can work out how the composition of two linear maps affects the coefficients of each vector (relative to a fixed basis). If you do this, you get the "formula" for matrix multiplication.
Let V be an n-dimensional vector space. Let {e_1, e_2, e_3, ..., e_n} be a basis for V.
That means any v in V can be written
v = a_1*e_1 + a_2*e_2 + ... + a_n*e_n
Let f: V → V be a linear map. Let's see what it means to apply f to v. By linearity we have

f(v) = f(a_1*e_1 + a_2*e_2 + ... + a_n*e_n) = a_1*f(e_1) + a_2*f(e_2) + ... + a_n*f(e_n)

In other words, if we know the values of f(e_1), f(e_2), ..., f(e_n) then we can calculate the value of f(v) for any v.
Every choice of value for f(e_1), f(e_2), ... is valid and determines a unique linear map.
For n=2, use the standard basis where e_1 is (1,0) and e_2 is (0,1).
Write f(1,0) as (a,c) and write f(0,1) as (b,d). This is just "relabeling" the values f(1,0) and f(0,1) — under the standard basis we know that f(1,0) looks like (a,c) for some values of (a,c).

Then for any vector (x,y),

f(x,y) = x*f(1,0) + y*f(0,1) = x*(a,c) + y*(b,d) = (a*x + b*y, c*x + d*y)

So that's the "formula" for applying a matrix to a single vector. It's determined entirely by the four values (a,b,c,d), but really it's determined by the action of the map on the basis vectors.
Now let f,g be linear maps and calculate g(f(v)) in the same way. You'll get the "formula" for matrix multiplication.
In other words, matrix notation, matrix multiplication formulas, etc. are "just" compact ways of representing the behavior of linear maps.
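A minimal sketch of that last step (assuming numpy; the two matrices are arbitrary examples): build the matrix of the composition column by column from its action on the basis vectors, and it comes out equal to the matrix product.

    import numpy as np

    F = np.array([[1.0, 2.0],    # matrix of f: columns are f(e_1), f(e_2)
                  [3.0, 4.0]])
    G = np.array([[0.0, 1.0],    # matrix of g: columns are g(e_1), g(e_2)
                  [5.0, 2.0]])

    f = lambda v: F @ v
    g = lambda v: G @ v

    e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

    # matrix of "g after f": its columns are g(f(e_1)) and g(f(e_2))
    composed = np.column_stack([g(f(e1)), g(f(e2))])

    print(composed)   # [[ 3.  4.] [11. 18.]]
    print(G @ F)      # the matrix product -- identical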
I was afraid to ask the same question. I can do it, I can program it, and I can check to see if it's correct, but I can't for the life of me understand why somebody saw fit to describe matrix multiplication the way they did.
Matrix multiplication is the composition of linear maps. It’s sometimes lost in the more computational approach, but we can think of this geometrically. If you know the first matrix rotates the plane by some amount, and the second rotates it in the same direction as well, then the product must be the rotation matrix that rotates by the sum of the original rotations. That’s a very simple example. Since every nonsingular matrix is just a change of basis, you can get a rich geometric understanding for multiplying matrices. Moreover, when things get more complicated, we can use invariants like the determinant and the trace to help guide our intuition.
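As a quick sanity check of the rotation example (assuming numpy; the two angles are arbitrary):

    import numpy as np

    def rotation(theta):
        # matrix that rotates the plane counterclockwise by theta radians
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s],
                         [s,  c]])

    a, b = 0.3, 1.1
    print(rotation(a) @ rotation(b))   # composing the two rotations...
    print(rotation(a + b))             # ...equals rotating by the summed angle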
I’d highly suggest watching 3Blue1Brown’s videos on linear algebra. He won’t get you to understand everything (you’ll need to sit down and do problems for that) but he will help you see what intuition is out there in a very beautiful way. He makes a very good point that often, when we go through the computations without the geometric intuition, we can spend a ton of time crunching numbers to see results that should have been obvious.
There is a subsection called matrix multiplication that is about this, and which explicitly mentions the proof of associativity both in the "clear" way and in the "slog through indices" way.
Just an idea: You should allow people to pay more than $22 for the hard copy if they want. Probably a much better way to generate donations than a separate tip payment.
The book has been posted since 1995, so no, not Axler. :-)
I do think that matching the approach to the audience is key. I admire Axler, and a number of other current texts, but in addition to considering the mathematical approach, an instructor also needs to consider where the folks in the room currently are.
At least for me, doing examples and computations is the best way to learn math, and also very important in research. Often an opaque general statement becomes clear after doing a few small examples. In linear algebra, I personally find some of the courses have too few computations; some concepts are best learned by working an hour by hand on some annoying 6 x 6 matrix.
That said, "working an hour by hand on some annoying 6 x 6 matrix" is how I was taught linear algebra. I got a rare B in the course (even though I aced all the tests!) because there was so much busy work I just refused to do it all. I got literally nothing out of the course in terms of understanding (there was neither time to think nor any real direction given) and a year later I couldn't even do the work anymore. I ended up picking up Axler.
Agreed. I had an additional conflict in that I am a naturally sloppy person, and the worst part of doing matrix calcs by hand is that I'd often make a calculation error due to an inability to read my own handwriting, which would lead to terrible marks again and again.
Perhaps a trivial question, but I'm wondering about the web site typography not being consistent with the book (serif vs sans)? One looks much nicer on the screen to me than the other.
This is amazing for self-learners. Free textbook, free lecture videos, free solutions. I bought a hard copy of this book when I first came across this resource.
This is really outstanding because most textbooks do not contain full solutions, and it can be difficult for self-learners without access to universities, tutors, or peers to debug errors or check their work. If such a student ever got stuck, it could be hard to move forward. One solution is to use sites like Math Stack Exchange, but I can imagine it would be tedious to do multiple proofs every chapter and type every single one of them up for someone to check. Some would argue that if your solution is correct then you should be able to rigorously justify each step and you should “know” in the end whether it’s right. Beginners often don’t have the mathematical maturity developed to do this yet, and even experts make mistakes in their proofs.
If you had a question, then I could see scenarios where the question would be closed on sites like MSE for being a duplicate or for not satisfying some sort of rule somewhere. These are more barriers for self-learners that those in classrooms don’t necessarily face, and textbooks are typically written for those that have access to professors.
I think more math textbooks should be written this way.
I logged into post the same comment - instead up-voted yours.
While online courses are often good, I hate the idea of learning math by answering multiple choice questions. The fact that this provides the complete proofs as answers will help one to check their own thought process (assuming they don't cheat and look at the solution without putting in any effort).
At a glance, I don't see anything I haven't studied in high school and university. However, the fact that people are writing such books and offering them for free is absolutely astonishing. For that reason alone I'm purchasing the hard copy, people like that deserve the support! Amazing!
Working with linear algebra has been one of the most valuable skill-sets to develop as a programmer. It's not relevant to a lot of domains, but for dealing with real or simulated 3D spaces it is like a super-power.
Are there programmers that deal with real or simulated 3D spaces without knowing any linear algebra? How? Euler angles (including gimbal lock) all the way?
Sure, if you're doing it all with a framework that handles all that for you. If you program a game in Unity, you're a "programmer that deals with simulated 3D spaces" and you don't need a lick of math.
I think plenty of programmers go the route of using a tool that does everything for them, to slowly learning the underlying nuts and bolts (math, etc) that allow them to modify the framework itself.
Was working on a 3D game with a programmer who didn't demonstrate knowledge of LA. (Maybe he had it, but he hid it well.) Several times we'd find him multiplying by matrix A and then by matrix A' (or doing multiplications by some random constant and then dividing the result by that same constant to get the sizing correct) because he was in the habit of jiggling things just until they weren't obviously broken.
I always assumed that he was just a one-off, but I suspect this could easily arise from copy/paste and framework usage. (I'm also not sure I think this is universally bad; I think it's great that early teenagers can whip up a Minecraft-ish demo without having to learn all of linear algebra first.)
As SamBam said, there are ways to touch these domains without linear algebra.
But what I meant is that it feels so powerful to unlock these domains. Once you develop a comfort level it opens up these really cool, magical things you can do with a computer, like graphics, simulation, ML, image processing etc. And the linear algebra skills you learn in one domain are often transferable to other really cool, interesting domains. This is what feels like a super power.
I'm not exactly the target for this book, but since I love so much this distribution model (freely downloadable latex/pdf plus print-on-demand) I just ordered a printed copy!
Reading this, I realized I need a browser-side PDF viewer that provides kindle-like features (bookmarks, current reading position, highlighting, ...) layered over a bare PDF file fetched from a server like this one. I'm guessing that thing exists somewhere?
Everybody that I know pronounces it this way, but I could be wrong. If you could send me a bug report with evidence, I'd change it.
I have long thought about a web site that contains mathematical pronunciations. A person could click to hear a native German speaker say Entscheidungsproblem, for example. I've never seen one, although I have seen a book online along those lines (written by a blind mathematician?).
On the topic of pronunciation, say, "beta", "eta", "zeta" — is the first vowel actually pronounced as "ei" in English, instead of just "e" (as in "bed")?
"Mew" and "new" are actually a good approximation for the ancient Greek letter names: actual sound was of course "y", "i Graeca", but it's close enough, the blending of initial "j" with following "u" produces a reduced "y".
I was told that some of these pronunciations have changed over time, and the English versions are an attempt to approximate the pronunciation from ~2000 years ago, because some parts of the Christian bible were written in Greek.
I don't think it's meant to be a joke, but I don't think it's right either, in English.
In modern Greek, the top poster is correct, μ IS pronounced "Me" or [Mi] (spelled as μι). Here is a YouTube video of a Greek person going through the alphabet, linked to μ. [1]
However, in English this was never used, at least not recently, and it's not a "mathematicians" thing. English pronunciation of Greek letters was fixed around the 16th century (see [2] for a tiny bit of history) and was an attempt to use the classical Greek pronunciations, though probably not a very successful one.
If you're going to complain about μ, then hopefully you also complain about π, which in modern Greek is pronounced like "pee," not "pie," and so would make π-day very different...
I'm not too fond of the example-driven approach. The student will certainly be able to multiply simple matrices and compute determinants in some simple cases, but will not be able to answer fundamental questions like "what can I put in a matrix?" or "what is a vector space?" (the author gives 10 "rules" with no structure, so it's impossible for the student to memorize them). The author throws around `R` everywhere, but all his examples use `Z`. For a book on linear _algebra_, there is very little algebraic rigor in the book.
The standard textbook for learning linear algebra at the undergraduate level is Hoffman and Kunze, which is far better.
In my experience teaching my friends and colleagues there are two ends of the learning spectrum--you can learn by example first or by theory first, or both, but the former is by far the lion's share of the population.
I've taught people where there's absolutely no way I can teach them a generalization unless it's rote--however they can get domain-specific knowledge and beat me over the head in their domain and teach me something about my generalization as well that I might have been missing.
The best way for me to teach calculus to most people is to walk around a room explaining concepts like velocity, speed, acceleration, and position. However, I personally would never learn it this way--I prefer Spivak, during my read through I learned more than I could ever learn in a class.
It's also my belief that most people aren't like me in the sense that they learn from example first. However, most people I know are like me so it's hard to realize how needed a textbook like this is and how huge it could be for the regular person to learn something like this.
If you think about it, pretty much everyone learns math through example first. You learn arithmetic in elementary school, and it might only be after an additional decade or more of schooling that you actually learn the logical underpinnings which make the basic arithmetic operations work. If you actually started with boolean logic and tried to build up all of mathematics from there without at some point tying it back to a tangible example which can be understood in terms of the real world I'm not sure anyone could successfully learn it.
True. Examples are necessary for both motivation, and for showing how the formalisms work, at this level. One good undergraduate textbook that has plenty of examples is Dummit & Foote. However, that doesn't mean that you don't define anything properly, and solely use examples to show how to crunch certain computations. The student is then stuck with a bunch of patterns in their heads, with little mathematical understanding.
So I actually had a basic algebra system before they taught it to me, for what it's worth, probably something I figured out from the structure of the language my parents were using (both accountants). I'm a pretty rare individual, I've never had to work at math.
I took a long time to learn how to learn from example; it used to be impossible, but now it's almost just as natural.
My first course in linear algebra was extremely formal with very little practical examples or visual intuition. It’s easy not to appreciate just how much intuition can go a long way in helping us understand a concept, even if the intuition is formally “fuzzy”. I used Hefferon’s Linear Algebra to plug the gaps in my intuitive and practical knowledge, and was all the better for it.
People have different reasons and approaches for learning, and in the linear algebra landscape Hefferon is a worthy addition.
(edit: just like one of the commenters below I bought the hard copy a few years ago to show my support and thanks for the free edition.)
How is "what can I put in a matrix?" a fundamental question? Looking at the textbook, the author certainly covers the fundamental questions in vector spaces, linear maps, and decomposition/normal forms.
There's even a really great section on projective geometry that would have been helpful before seeing it quickly defined as homogeneous coordinates and then moving on to the definition as a quotient manifold of a sphere and a Lie group.
Are you complaining about the axioms for a vector space? Because that list will exist for any LA class lol. That's like complaining that a college Calculus class is teaching the limit definition of the derivative.
I think (from only having skimmed the contents) that there is some handwaving around the definition of the characteristic polynomial. In order to properly define it, you need to allow matrices over arbitrary rings, not just fields.
Aren't these the standard vector space axioms? I seem to recall that they were defined in a quite similar way in my class.
Of course, you can simplify the definition if you say "vector spaces form a group under addition and also satisfy these axioms with respect to scalar multiplication", but then you need to introduce groups first, which would be a distraction.
I think it's the leap from 'I'm not a fan of the approach' to 'this is a poor textbook'. Just because they don't like the approach doesn't mean it won't be effective for other learners. Some people absolutely need examples before they can attach anything theoretical to a concept.
Interesting approach. He starts off with vector spaces, after providing some intuition, and uses it to drive the rest of the book. He defines matrices in terms of linear maps, and talks about operations on vector spaces, with small sections dedicated to the corresponding matrix computations. There's no hand-waving or lack of rigor.
There's also the Terse Introduction to Linear Algebra, which is shorter and more rigorous than Linear Algebra Done Right. All of these are known texts which have generated some body of discussion.
An idiosyncrasy of Axler's book is that it introduces determinants only toward the end.
> An idiosyncrasy of Axler's book is that it introduces determinants only toward the end
I think the idea is that determinants do not necessarily belong in linear algebra proper, at least not with as prominent a position as they are usually awarded in textbooks. More specifically the idea is that determinants are being excessively and detrimentally used in linear algebra proofs. Axler actually published a paper called "Down With Determinants!" in 1995, for which he won the Lester R. Ford Award. Quote from the introduction:
> This paper focuses on showing that determinants should be banished from much of the theoretical part of linear algebra. Determinants are also useless in the computational part of linear algebra.
That seems like a personal crusade. Of course, determinants are computationally mostly irrelevant, and of course, they actually belong to multilinear algebra, but theoretically, determinants are quite nice and they are useful in many LA proofs, especially when it comes to the characteristic polynomial.
You are being dismissive of Axler (with the "personal crusade" comment) and disagree with his position, but it would be less disrespectful of Axler and more beneficial to the discussion (and interesting to me) to give at least some argumentation to match Axler's. That is, why is Axler's approach (in the short paper or the book) worse than the traditional approach?
From my skimming of that article, Axler's main complaint with determinants is that it pedagogically leaves students with the impression that eigenvalues are somehow a property of the determinant rather than being a fundamental property of a linear transformation.
I'm not sure I've ever met anyone who has that view; in my experience, more people leave linear algebra with a laundry list of matrix properties whose utility isn't obvious, with determinants and eigenvalues both on that list. This in general is a pedagogical issue with teaching linear algebra as plug-and-chug techniques without explaining why we're doing them.
Truthfully, I'm not convinced that introducing the determinant as "just" the product of eigenvalues is itself useful. One of the more useful properties of the determinant is that it is built up via row-reduction, and consequently, it can be computed using Gaussian elimination. I've seen some texts actually define the determinant via its row-reduction properties, and then build up to demonstrating the other formulas for it.
I'll also point out that there are two more uses that Axler doesn't acknowledge: the determinant tells you the volume of a parallelepiped, and it's a convenient way to remember the formula for a cross product.
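A small numpy sketch of those points side by side (the 3x3 matrix is an arbitrary example): the determinant equals the product of the eigenvalues, it falls out of a row-reduction style factorization, and its absolute value is the volume of the parallelepiped spanned by the columns.

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [0.0, 3.0, 1.0],
                  [1.0, 0.0, 2.0]])

    print(np.linalg.det(A))                    # 13.0 (up to rounding); numpy gets this from an LU (row-reduction) factorization
    print(np.prod(np.linalg.eigvals(A)).real)  # 13.0, the product of the eigenvalues
    # |det A| = 13 is also the volume of the parallelepiped spanned by A's columns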
It's not. For all I care he may write his book without introducing determinants or only introducing them at the very end. I'm all for different pedagogical approaches.
But he writes a paper called "down with determinants" and starts it with "if you think complex matrices have an eigenvalue because the characteristic polynomial has a root, then this is wrong". But it's not wrong, mathematically it's 100% correct! It's just his own personal opinion because he somehow doesn't like determinants.
Mathematics is all about different approaches and different tools, not about "the one true enlightened way", as he makes it out to be. This is why I'm calling it a "personal crusade". Had he just called his paper "a determinant-free approach to linear algebra" I would take no issue.
The degree to which classrooms and textbooks have converged onto Axler's pedagogical approach is the degree to which Axler's position is "personal" vs mainstream.