There is a very slick proof of the irrationality of √2 in Baby Rudin. I thought it was his invention. Turns out it appears verbatim in Hardy's A Course of Pure Mathematics.
omg just had a look and this one is just everything I hate about mathematics and academia.
It starts with lots of random definitions, remarks, axioms, and new notation, while completely neglecting to introduce what any of it is supposed to do, explain, or help with.
All self-aggrandizement through created complexity, zero intuition or simplification. Isn't there anybody close to being the Feynman of Linear Algebra?
Yeah, a good example is on the second page of the first chapter:
> Remark. It is easy to prove that zero vector 0 is unique, and that given v ∈ V its additive inverse −v is also unique.
This is the first time the word "unique" is used in the text. Students are going to have no idea whether it's meant in some technical sense or just as conventional English. One can imagine various meanings, but that doesn't substitute for real understanding.
This is actually why I feel that mathematical texts tend to be not rigorous enough, rather than too rigorous. On the surface the opposite is true - you complain, for instance, that the text jumps immediately into using technical language without any prior introduction or intuition building. My take is that intuition building doesn't need to replace or preface the use of formal precision, but that what is needed is to bridge concepts the student already understands and has intuition for to the new concept that the student is to learn.
In terms of intuition building, I think it's probably best to introduce vectors via talking about Euclidean space - which gives the student the possibility of using their physical intuitions. The student should build intuition for how and why vector space "axioms" hold by learning that fundamental operations like addition (which they already grasp) are being extended to vectors in Euclidean space. They already instinctively understand the axiomatic properties being introduced, it's just that the raw technical language being thrown at them fails to connect to any concept they already possess.
> This is actually why I feel that mathematical texts tend to be not rigorous enough, rather than too rigorous.
The thing that mathematicians refuse to admit is that they are extremely sloppy with their notation, terminology and rigor. Especially in comparison to the average programmer.
They are conceptually/abstractly rigorous, but in "implementation" are incredibly sloppy. But they've been in that world so long they can't really see it / just expect it.
And if you debate with one long enough, they'll eventually concede and say something along the lines of "well math evolved being written on paper and conciseness was important so that took priority over those other concerns." And it leaks through into math instruction and general math text writing.
Programming is forced to be extremely rigorous at the implementation level simply because what is written must be executed. Engineering abstraction, on the other hand, is extremely conceptually sloppy, and if it works it's often deemed "good enough". Math is generally the exact opposite. Even for a simple case, take the number of symbols that have context-sensitive meanings in mathematics. Mathematicians will use them without declaring which context they are using, and a reader is simply supposed to infer correctly. It's actually somewhat funny because it's not at all how they see themselves.
> The thing that mathematicians refuse to admit is that they are extremely sloppy with their notation, terminology and rigor. Especially in comparison to the average programmer.
Not sure why you say that. Mathematicians are pretty open about it. The well-known essay "On Proof and Progress in Mathematics" discusses it; it was written by a Fields medalist.
This drove me mad when I had to do introductory maths at uni. Maths as written did seem pretty sloppy and not at all like a programming language whose expressions I could parse as I expected. Obv most simple algebra looks about as you'd expect, but I clearly recall feeling exactly what you describe in some cases, and I commented on it to the lecturer during a tutorial, asking why it was that way. He humoured me, was a good guy.
But I think mathematicians probably have a point - it did evolve that way over a long time and anyone practicing it daily just knows how to do it and they're not going to do a thorough review and change now.
It's us tourists that get thrown for a loop, but so it goes. It's not meant for us.
> Maths as written did seem pretty sloppy and not at all like a programming language whose expressions I could parse as I expected.
Look at Lean's mathlib: that's what fully formal mathematics looks like. It's far too verbose to be suitable for communicating with other people; you might as well try to teach an algorithms course transistor by transistor.
You’re confusing syntax and semantics. Programmers write code for syntax machines (Turing machines). The computers care a lot about syntax and will halt if you make an error. They do not care at all about semantics. A computer is happy to let you multiply a temperature in Fahrenheit times a figure in Australian dollars and subtract the volume of the earth in litres, provided that these numbers are all formatted in a compatible enough way that they can be implicitly converted (this depends on the programming language, but many of them are quite liberal about this).
If you want the computer to stop you from doing such nonsense, you’ve got to put in a lot of effort to make types or contracts or write a lot of tests to avoid it. But that’s essentially a scheme for encoding a little bit of your semantics into syntax the computer can understand. Most programmers are not this rigorous!
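To make the "encoding a little bit of your semantics into syntax" idea concrete, here is a minimal sketch in Python; the Fahrenheit/AUD wrapper types and the heating_cost function are made up for illustration, and it's a static checker like mypy (or the explicit wrappers at runtime) that does the refusing:

```python
# Hypothetical illustration: wrap raw floats in distinct types so that the
# "temperature times dollars" nonsense from the comment above can no longer
# be written without a type checker complaining.
from dataclasses import dataclass

@dataclass(frozen=True)
class Fahrenheit:
    value: float

@dataclass(frozen=True)
class AUD:
    value: float

def heating_cost(temp: Fahrenheit, rate_per_degree: AUD) -> AUD:
    # The semantics (what multiplying these quantities *means*) still live in
    # our heads; the types merely stop the units from being silently mixed up.
    return AUD(temp.value * rate_per_degree.value)

cost = heating_cost(Fahrenheit(68.0), AUD(0.12))
# heating_cost(AUD(0.12), Fahrenheit(68.0))   # rejected by mypy: wrong types
```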
Mathematicians, on the other hand, write mathematics for other humans to read. They expect their readers to have done their homework long before picking up the paper. They do not have any time to waste in spelling out all the minutiae, much of which is obvious and trivial to their peers. The sort of formal, syntax-level rigour you prefer, which can be checked by computers, is of zero interest to most mathematicians. What matters to them, at the end of the day, is making a solid enough argument to convince the establishment within their subfield of mathematics.
But programmers are expected to get the semantics right. Sure, mismatching temperatures and dollars happens, but it's called a bug and you will be expected to fix it.
Why do mathematicians hide their natural way of thinking? They present their finished work and everyone is supposed to clap. Why can't they write long articles about the false starts, dead ends, and so on? It's only with magazines like Quanta and YouTube channels that we get to feel the thinking process. Math is not hard. At least not the mathematics we are expected to know.
Mathematics is extremely hard. The math people are expected to know for high school is not hard, but that is such a minuscule amount of math compared to what we (humans) know, collectively.
Mathematicians do speak and also write books about the thinking process. It’s just very difficult and individualized. It’s a nonlinear process with false starts and dead ends, as you say.
But you can’t really be told what it feels like. You have to experience it for yourself.
> Even for a simple case, take the number of symbols that have context-sensitive meanings in mathematics. Mathematicians will use them without declaring which context they are using, and a reader is simply supposed to infer correctly.
Yes!! Like, oh, you didn't know that p-looking thing (rho) means Pearson's correlation coefficient? That theta means an angle? Well just sit there in ignorance because I'm not going to explain it. And those are the easy ones!
My experience with the average programmer is...different from yours. The software development field is exceptionally bad in this regard. Physicists are mathematically sloppy sometimes (why, yes, I will just multiply both sides by `dy` and take as many liberties with operators, harmonics/series, and vector differential operations as I care to, thanks).
Mathematics, like any other academic field, has jargon (and this includes notation, customary symbols for a given application, etc.), and students of the field ought to learn the jargon if they wish to be proficient. On the other hand, textbooks meant to instruct ought to teach the jargon. It's been forever since I've opened a mathematics textbook; I don't recall any being terribly bad in this regard.
Well I have a different approach. Sometimes I write and hack it to solve a particular problem. The code might be elegant or not, but if you understand the problem you can probably grok the code.
Next I generalize it a bit. Specific variables become configurable parameters. Something that happened implicitly or with a single line of code gets handled by its own function. Now it's general, but it makes much less sense at first because it's no longer tied to one problem but to a whole set of problems. It's a lot less teachable and certainly not self-evident any more.
The problem with math education is that we think the second approach is inherently superior to the first and would make a better textbook, because it's more generic. But that is not how real people learn; they would all "do" math the first way. By taking away the student's ability to do the generalization themselves, we are depriving them of the real pleasure of programming (or math).
Maybe back when paper was scarce this approach made sense but not any more.
Ideally I would love to present a million specific solutions and let them generalize themselves. That is exactly how we would train an ANN: not by regurgitating the canned solution but by giving it all sorts of data and letting it generalize for itself. So why don't we do this for human students? When it comes to education I think people have a blind spot about how learning is actually done.
Notation and terminology? Sure, some explanations and mechanical manipulations are elided in mathematics because with context they're clear.
Rigor? Ha, you have got to be kidding me. Math is rigorous to a fault. Typical computer programming has no rigor whatsoever in comparison. A rigid syntax is not rigor. Syntax, in turn, is certainly not the difficult part of developing software.
This, really. Sometimes, when reading math papers, you find that they end up being very hand-wavy with the notation, e.g. with subscripting, because "it's understood". But without extensive training in mathematics, a lot of it is not understood.
Programmers will use languages with a lot of syntactic sugar, and without knowing the language, code can be pretty difficult to understand when it is used. But even then, you can't be sloppy, because computers are so damn literal.
> The thing that mathematicians refuse to admit is that they are extremely sloppy with their notation, terminology and rigor.
The "refuse" part is IMO very dependent on the person. Nearly all of my professors for theoretical CS courses just plainly said that their "unique" notation is just their way, because they like it.
It's more or less just a simple approach of altering the language to fit your task. This is also not unfamiliar to programmers, who may choose a language based on the task, e.g. Fortran for vector-based calculations or C for direct hardware access.
Bro, I know this feel. Even algorithms books written by mathematicians are riddled with errors.
They never state the type or class, there are no comments, no explanations, reads that run past the last index... the list goes on endlessly. When they say "let's declare an empty set for variable T", you don't know whether the thing is a list, set, tuple, ndarray, placeholder for a scalar, or a graph.
Some even provide actual code, but never actually run it to verify its correctness.
Try this guy then: he's got a PhD in mathematics from the California Institute of Technology, with a thesis titled Finite Semifields and Projective Planes, but he's written a bunch of stuff on algorithms and will write you a check for any errors you find in his work: https://en.wikipedia.org/wiki/Donald_Knuth
Church organist actually - serious enough to have a two story pipe organ built into his own home.
Enough of the No True Scotsman... it's clear as day from the introduction and opening chapters of TAOCP that he approaches programming as a mathematician.
I believe that any mathematician who took a differential geometry class must have already realized this: the notation is so compressed and implicit that some proofs are practically "by notation", as it can be a daunting prospect to expand a dozen indices.
The average computer scientist (not only "programmer", as a JS dev would be) never wrote Lean/Coq or similar, and is not aware of the Curry-Howard-like theorems and their implications.
I think you entirely missed the point. GP put it well:
>> They are conceptually/abstractly rigorous, but in "implementation" are incredibly sloppy.
Maturity in concept-space and the ability to reason abstractly can be achieved without the sort of formal rigor required by far less abstract and much more conceptually simple programming.
I have seen this first hand TAing and tutoring CS1. I regularly had students who put off their required programming course until senior year. As a result, some were well into graduate-level mathematics and at the top of their class but struggled deeply with the rigor required in implementation. Think about, e.g., missing semi-colons at the end of lines, understanding where a variable is defined, understanding how nested loops work, simple recursion, and so on. Consider something as simple as writing a C/Java program that reads lines from a file, parses them according to a simple format, prints out some accumulated value from the process, and handles common errors appropriately. Programming requires a lot more formal rigor than mathematical proof writing.
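For concreteness, a sketch of the kind of CS1 exercise meant here, written in Python rather than C/Java for brevity; the file name and the "name,score" line format are made-up details:

```python
# Read lines of "name,score" from a file, accumulate the total score,
# and handle the common failure modes (missing file, malformed line).
import sys

def total_score(path: str) -> float:
    total = 0.0
    try:
        with open(path) as f:
            for lineno, line in enumerate(f, start=1):
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                try:
                    _name, score = line.split(",")
                    total += float(score)
                except ValueError:
                    print(f"skipping malformed line {lineno}: {line!r}", file=sys.stderr)
    except FileNotFoundError:
        print(f"no such file: {path}", file=sys.stderr)
        sys.exit(1)
    return total

if __name__ == "__main__":
    print(total_score("scores.txt"))
```

Nothing in it is conceptually deep, which is exactly the point: the difficulty those students hit was entirely in the unforgiving mechanics.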
You have a valid point, which is that we are not even being rigorous enough about the meaning of the word “rigor” in this context.
- One poster praises how programming needs to be boiled down into executable instructions as “rigor,” presumably comparing to an imaginary math prof saying “eh that sort of problem can probably be solved with a Cholesky decomposition” without telling you how to do that or what it is or why it is even germane to the problem. This poster has not seen the sheer number of Java API devs who use the Spring framework every day and have no idea how it does what it does, the number of Git developers who do not understand what Git is or how it uses the filesystem as a simple NoSQL database, or the number of people running on Kubernetes who do not know what the control plane is, do not know what etcd is, no idea of what a custom resource definition is or when it would be useful... If we are comparing apples to apples, “rigor” meaning “this person is talking about a technique they have run across in their context and rather than abstractly indicating that it can be used to fix a problem without exact details of how it does that, they know the technique inside and out and are going to patiently sit down with you until you understand it too,” well, I think the point more often goes to the mathematician.
- Meanwhile you invoke correctness and I think you mean not just ontic correctness “this passed the test cases and happens to be correct on all the actual inputs it will be run on” but epistemic correctness “this argument gives us confidence that the code has a definite contract which it will correctly deliver on,” which you do see in programming and computer science, often in terms of “loop invariants” or “amortized big-O analysis” or the like... But yeah most programmers only interact with this correctness by partially specifying a contract in terms of some test cases which they validate.
That discussion, however, would require a much longer and more nuanced discussion that would be more appropriate for a blog article than for an HN comment thread. Even this comment pointing out that there are at least three meanings of rigor hiding in plain sight is too long.
>> Programming requires a lot more formal rigor than mathematical proof writing.
> This is just wrong? Syntax rigour has almost nothing to do with correctness.
1. It's all fine and well to wave your hand at "Syntax rigour", but if your code doesn't even parse then you won't get far toward "correctness". The frustration with having to write code that parses was extremely common among the students I am referring to in my original post -- it seemed incidental and unnecessary. It might be incidental, but at least for now it's definitely not unnecessary.
2. It's not just syntactic rigor. I gave two other examples which are not primarily syntactic trip-ups: understanding nested loops and simple recursion. (This actually makes sense -- how often in undergraduate math do you write a proof that involves multiple interacting inductions? It happens, but isn't a particularly common item in the arsenal. And even when you do, the precise way in which the two inductions proceed is almost always irrelevant to the argument because you don't care about the "runtime" of a proof. So the fact that students toward the end of undergraduate struggle with this isn't particularly surprising.)
Even elementary programming ability demands a type of rigor we'll call "implementation rigor". Understanding how nested loops actually work and why switching the order of two nested loops might result in wildly different runtimes. Understanding that two variables that happen to have the same name at two different points in the program might not be referring to the same piece of memory. Etc.
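A rough illustration of the loop-order point, using NumPy (array size is arbitrary): both traversals compute the same sum, but the column-wise one walks memory with a large stride, which is the same effect that swapping a nested loop produces in C or Java, where it shows up even more dramatically:

```python
# Same result, different memory access pattern. On a C-contiguous array the
# row-wise traversal reads contiguous memory; the column-wise one strides
# across it and is typically several times slower.
import time
import numpy as np

a = np.random.rand(4000, 4000)  # rows are contiguous in memory

t0 = time.perf_counter()
row_total = sum(float(a[i, :].sum()) for i in range(a.shape[0]))
t1 = time.perf_counter()
col_total = sum(float(a[:, j].sum()) for j in range(a.shape[1]))
t2 = time.perf_counter()

print(f"row-wise:    {t1 - t0:.3f}s  total={row_total:.1f}")
print(f"column-wise: {t2 - t1:.3f}s  total={col_total:.1f}")
```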
Mathematical maturity doesn't traditionally emphasize this type of "implementation rigor" -- even a mathematician at the end of their undergraduate studies often won't have a novice programmer's level of "implementation rigor".
I am not quite sure why you are being so defensive on this point. To anyone who has educated both mathematicians and computer scientists, it's a fairly obvious point and plainly observable out in the real world. Going on about curry-howard and other abstract nonsense seems to wildly and radically miss this point.
Having taught both I get what you are saying, but the rigor required in programming is quite trivial compared to that in mathematics. Writing a well structured program is much more comparable to what is involved in careful mathematical writing. It's precisely the internal semantic coherence and consistency, rather than syntactic correctness, that is hardest.
You need more rigour to prove, let's say, the Beppo Levi theorem than to write a moderately complex piece of software.
Yet you can write the proof in crappy English; the medium is not the goal. The reasoning, even when poorly transcribed into English, needs to be perfectly rigorous. Otherwise, you have proved nothing.
> Syntax rigour has almost nothing to do with correctness.
I see your point: has almost nothing correctness with rigour do to Syntax.
Syntax rigor has to do with correctness to the extent that "correctness" exists outside the mind of the creator. Einstein notation is a decent example: the rigor is inherent in the definition of the syntax, but to a novice it is entirely under-specified and can't be said to be "correct" without its definition already being incorporated... which is the ultimate parent post's point, and I think the context in which the post you're replying to needs to be taken.
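To make the Einstein-notation example concrete: the convention simply drops the summation sign whenever an index is repeated, and the rigor is entirely in that definition, which the novice reader is assumed to already know:

```latex
% fully explicit matrix-vector product
\[ y_i = \sum_{j=1}^{n} A_{ij}\, x_j \]
% Einstein convention: the sum over the repeated index j is "understood"
\[ y_i = A_{ij}\, x_j \]
```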
And if you're going to argue "This is just wrong?" (I love the passive-aggressive '?'?) while ignoring the context of the discussion...QED.
but programmers don't write underspecified notational shortcuts, because those are soundly rejected as syntax errors by the compiler or interpreter
this is not about semantics (like dependent types etc) this is just syntax. it works like this in any language. the only way to make syntax accepted by a compiler is to make it unambiguous
... maybe LLMs will change this game and the programming languages of the future will be allowed to be sloppy, just like mathematics
Yep, but for any notational shortcut the compiler infers a single thing for it. It's still unambiguous as far as computers are concerned (but it may be confusing reading it without context)
It’s not only a notational shortcut as in syntactic sugar though.
It can apply a set of rewrite rules given you were able to construct the right type at some point.
It’s type inference on steroids because you can brute force the solution by applying the rewrite rules and other tactics on propositions until something (propositional equality) is found, or nothing.
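A tiny Lean 4 illustration of that brute-force flavor (this particular example is mine, not from the thread): the `simp` tactic keeps applying its registered rewrite rules to the goal until the two sides become equal or it gets stuck:

```lean
-- `simp` rewrites with lemmas like Nat.add_zero and Nat.zero_add until the
-- goal closes by reflexivity.
example (n m : Nat) : (n + 0) + (0 + m) = n + m := by
  simp
```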
> Remark. It is easy to prove that zero vector 0 is unique, and that given v ∈ V its additive inverse −v is also unique.
I'm sorry, but this book is meant for an audience that can read and write proofs. Uniqueness proofs are a staple of mathematics. If the word "unique" throws you off, then this book is not meant for you.
I'd go a bit further and say that if you're not comfortable with the basics of mathematical proofs, then you're not ready for the subject of linear algebra regardless of what book or course you're trying to learn from. The purely computational approach to mathematics used up through high school (with the oddball exception of Euclidean geometry) and many introductory calculus classes can't really go much further than that.
Or, you know, mathematics can be viewed as a powerful set of tools…
Somehow I seem to remember getting through an engineering degree, taking all the optional extra math courses (including linear algebra), without there ever being a big emphasis on proofs. I’m sure it’s important if you want to be a mathematician, but if you just want to understand enough to be able to use it?
> taking all the optional extra math courses (including linear algebra), without there ever being a big emphasis on proofs
Sorry to break it to you, but you didn't take math classes. You took classes of the discipline taught in high school under the homonymous name "math". There is a big difference.
It's the same difference as there is between what you get taught in grade school under the name "English" (or whatever is the dominant language where you live): the alphabet, spelling, pronunciation, basic sentence structure... And what gets taught in high school under the name "English": how to write essays, critically analyze pieces of literature, etc. The two sets of skills are almost completely unrelated. The first is a prerequisite for the second (how can you write an essay if you can't write at all?), so somehow the two got the same name. But nobody believes that winning a spelling bee is the same type of skill as writing a novel.
I know it's a shock to everyone who enters a university math course after high school. Many of my 1st year students are confounded about the fact that they'll be graded on their ability to prove things. They expect the equivalent of cooking recipes to invert matrices, compute a GCD, solve a quadratic equation, or whatever, and balk at anything else. I want them to understand logical reasoning, abstract concepts, and the difference between "I'm pretty sure" and "this is an absolute truth". There's a world of difference, and most have to wait a few years to develop enough maturity to finally get it.
> Sorry to break it to you, but you didn't take math classes. You took classes of the discipline taught in high school under the homonymous name "math". There is a big difference.
If you look at the comments below, you’ll see that this can’t be strictly true. At least, not 20+ years ago in Australia when I was a student. Some of the courses I took were in the math faculty with students who were going on to become mathematicians. At that time this would have been a quarter load of a semester, and was titled “Linear Algebra”, but I can’t remember if it was 1st/2nd or even 3rd year subject (it’s been too long).
Perhaps the lack of emphasis on proofs (I am not saying proofs were absent, I made another comment with more explanation) was a combination of these being introductory courses, the university's knowledge that there were more than just math faculty students taking them, or changes over time in how the pedagogy has evolved.
What is more interesting to me is this: what do you think a student misses out on, from a capability point of view, with applications-focused learning as opposed to learning focused on reading and writing proofs?
Would a student who is not intending to become a mathematician still benefit from this approach? Would a middle aged man who was taught some “Linear Algebra” benefit from picking up a book such as the one referenced here?
> What is more interesting to me is this: what do you think a student misses out on, from a capability point of view, with applications-focused learning as opposed to learning focused on reading and writing proofs?
The generalizable value is not so much in collecting a bunch of discrete capabilities (they're there, but generally somewhat domain-specific) as it is in developing certain intuitions and habits of thought: what mathematicians call "mathematical maturity". A few examples:
- Correcting trivial errors in otherwise correct arguments on the fly instead of getting hung up on them (as demonstrated all over this comment section).
- Thinking in terms of your domain rather than however you happen to be choosing to represent it at the moment. This is why math papers can be riddled with "syntax errors" and yet still reach the right conclusions for the right reasons. These sorts of errors don't propagate out of control because they're not propagated at all: line N+1 isn't derived from line N: conceptual step N+1 is derived from conceptual step N, and then they're translated into lines N+1 and N independently.
- Tracking, as you reason through something, whether your intuitions and heuristics can be formalized without having to actually do so.
- More generally, being able to fluently move between different levels of formality as needed without suffering too much cognitive load at the transitions.
- Approaching new topics by looking for structures you already understand, instead of trying to build everything up from primitives every time. Good programmers do the same, but often fail to generalize it beyond code.
> Would a student who is not intending to become a mathematician still benefit from this approach?
If they intend to go into a technical field, absolutely.
> Would a middle aged man who was taught some “Linear Algebra” benefit from picking up a book such as the one referenced here?
Depends on what you're looking for. If you want to learn other areas of math, linear algebra is more or less a hard requirement. If you want to be able to semiformally reason about linear algebra faster and more accurately, yes. If you just want better computational tricks, drink deep or not at all: they're out there, but a fair bit further down the road.
The sibling comment answered most of what you wrote. So, I'll just add that I'm talking about the present day, not 20+ years ago. I don't know about your experience in Australia 20+ years ago, but I'm teaching real students, today, who just got out of high school, in Western Europe. Not hypothetical students 20 years ago in Australia. And based on what the Australian colleagues I met at various conferences told me, their teaching experience in Australia today isn't really different from mine.
FWIW I started out in Engineering and transferred out to a more serious mathematics | physics stream.
The Engineering curriculum as I found it was essentially rote for the first two years.
It had more exams and units than any other course (including Medicine and Law, which tagged in pretty close) and included Chemistry 110 (for Engineers) in the Chemistry Department, Physics 110 (for Engineers) in the Physics Department, Mathematics 110 (for Engineers) in the Mathematics Department, and Tech Drawing, Statics & Dynamics, Electrical Fundamentals, etc. in the Engineering Department.
All these 110 courses for Engineers covered "the things you need to know to practically use this information" .. how to use linear algebra to solve loading equations in truss configurations, etc.
These were harder than the 115 and 130 courses, which were "{Chemistry | Math | Physics} for Business Majors" etc. and essentially taught familiarity with the subjects so you could talk with the Engineers you employed.
But none of the 110 courses got into the meat of their subjects in the same way as the 100 courses, which were taught to instruct people who intended to really master Maths, Physics, or Chemistry.
Within a week or two of starting first year university I transferred out of the Maths 110 Engineering unit and into Math 100, ditto Chem and Physics. Halfway through second year I formally left the Engineering curriculum altogether (although I later became a professional Engineer .. go figure).
The big distinction between Math 100 vs. Math 110 was that the 110 course didn't go into how anything "worked"; it was entirely about how to use various math concepts to solve specific problems.
Math 100 was fundamentals, fundamentals, fundamentals - how to prove various results, how to derive new versions of old things, etc.
Six months into Math 100 nothing had been taught that could be directly used to solve problems already covered in Math 110.
Six months and one week into Math 100 and suddenly you could derive for yourself, from first principles, everything required to be memorised in Math 110 and Math 210 (second year "mathematics for engineers").
I'm incredulous that a linear algebra course taught by mathematics faculty didn't have a lot of theorem proving.
Maybe that would be the case if the intended audience is engineering students. But for mathematics students it would literally be setting them up for failure; a student who can't handle or hasn't seen much theorem-proving in linear algebra is not going to go very far in coursework elsewhere. Theorem proving is an integral part of mathematics, in stretching and expanding tools and concepts for your own use.
Maybe the courses are structured so that mathematics students normally go on to take a different course. In that case, GP's point would still have been valid - the LA courses you took were indeed ones planned for engineering, not for those pursuing mathematics degrees. At my alma mater, it was indeed the case that physics students and engineering students were exposed to a different set of course material for foundational courses like linear algebra and complex analysis.
Just like compiler theory, if you don't write compilers maybe it's not that useful and you shouldn't be spending too much time on it, but it would be presumptuous to say that delivering a full compiler course is a fundamentally incorrect approach, because somebody has to make that sausage.
I can only speak to my own experiences, but the math courses were not customised for engineering students. I sat next to students who were planning to become mathematicians. Linear Algebra was an optional course for me.
Having said that, I’m sure theorem proving was part of it (this was many years ago), I just don’t recall it as being fundamental in any sense. I’m sure that has something more to do with the student than the course work. I liked (and like), maths, but I was there to build my tool chest. A different student, with a different emphasis, would have gotten different things out of the course.
But I think my viewpoint is prevalent in engineering, even among engineers who started with a math degree. The emphasis on "what can I do with this" relegates theorem proving to annoying busywork.
I can second this: in my Engineering degree, the Linear Algebra course (and the Calculus course) were both taught by the Math Faculty at my uni.
The textbook we used was "Linear Algebra and Its Applications" by David C. Lay. 20 years later I still keep this textbook with me at my desk and consult it a few times a year when I need to jog my memory on something. I consider it to be a very good textbook even if it doesn't contain any rigorous proofs or axioms...
Engineers can learn linear algebra from an engineering perspective, i.e. not emphasizing proofs, and that’s fine, but the books being discussed are not intended for that audience.
I don't know whom to agree with. Maybe there need to be two tracks, and it might not even depend on discipline, but just personal preference. Do you love math as an art form, or as a problem solving tool? Or both?
I went back and forth. I was good at problem solving, but proofs were what made math come alive for me, and I started college as a math major. Then I added a physics major, with its emphasis on problem solving. But I would have struggled with memorizing formulas if I didn't know how they were related to one another.
Today, K-12 math is taught almost exclusively as problem-solving. This might or might not be a realistic view of math. On the one hand, very few students are going to become mathematicians, though they should at least be given a chance. On the other hand, most of them are not going to use their school math beyond college, yet math is an obstacle for admission into some potentially lucrative careers.
At my workplace, there's some math work to be done, but only enough to entertain a tiny handful of "math people," seemingly unrelated to their actual specialty.
> I'd go a bit further and say that if you're not comfortable with the basics of mathematical proofs, then you're not ready for the subject of linear algebra regardless of what book or course you're trying to learn from.
Engineers frequently need to learn some fairly advanced mathematics. Are you suggesting they can’t use the same textbooks?
By the way; I don’t think the original poster is wrong as such (every similar textbook is undoubtedly full of proofs), I’m just suggesting a different viewpoint. Not everyone learning Linear Algebra is intending to become a mathematician.
Most engineers don’t learn linear algebra, they learn a bit of matrix and vector math in real Euclidean spaces. They don’t learn anything about vector spaces or the algebra behind all of it. What they learn would only be considered “advanced mathematics” 200 years ago, when Gauss first developed matrices.
You can see with a quick skim that the content is very application focused. I just don’t know enough to know what I don’t know. If one were to learn Linear Algebra using this textbook, would it be a proper base? Would you have grasped the fundamentals?
It covers most of the topics covered in a first course in linear algebra, it’s just very application-specific. It has some basic proofs in the exercises, but nothing overly difficult, and the more involved proofs give you explicit instructions on the steps to take.
There is a chapter on abstract vector spaces and there are a few examples given besides the usual R^n (polynomials, sequences, functions) but there is almost no time spent on these. There is also no mention of the fact that the scalars of a vector space need not be real numbers; that you can define vector spaces over any number field.
There is only a passing discussion of complex numbers (as possible Eigenvalues and in an appendix) but no mention of the fact that vector spaces over the field of complex numbers exist and have an even more well-developed theory than for real numbers.
But more fundamental than a laundry list of missing or unnecessary topics is the fact that it’s application focused. Pure mathematics courses are proof and theory focused. So they cover all the same (and more) topics in much richer theoretical detail, and they teach you how to prove statements. Pure math students don’t just learn how to write proofs in one or two courses and then move on; all of the courses they take are heavily proof-based. Writing proofs, like programming, is a muscle that benefits from continued exercise.
So if you’re studying (or previously studied) science or engineering and learned all your math from that track, switching to pure math involves a bunch of catch up. I’ve met plenty of people who successfully made the switch, but it took a concerted effort.
There seems to be a fundamental difference in mindset between the “applications” based learning of mathematics, and this pure math based version. Are there benefits to be had for a person that only intends to use mathematics in an applied fashion?
This depends on who you ask. Personally, I found studying pure math incredibly rewarding. It gave me the confidence to be able to say that I can look at any piece of mathematics and figure it out if I take the time to do so.
I can't speak for those who have only studied math at an applied level directly. My impression of them (as an outsider but also a math tutor) is that they are fairly comfortable with the math they have been using for a while but always find new mathematics daunting.
I have heard famous mathematicians describe this phenomenon as "mathematical maturity" but I don't know if this has been studied as a real social/educational phenomenon.
Are there courses/books on "applied linear algebra"? You are right in some sense, but wrong in another. Linear algebra at a 100 level without any really deep understanding is still incredibly useful. Graphics (I guess you sort of call this out), machine learning, etc.
Most university math curriculums have a clear demarcation between the early computation-oriented classes (calculus, some diff eq.) and later proof-oriented classes. Traditionally, either linear algebra or abstract algebra is used as the first proof-oriented course, but making that transition to proof-based math at the same time as digesting a lot of new subject matter can be brutal, so many schools now have a dedicated transition course (often covering a fair bit of discrete mathematics). But there's still demand for textbooks for a linear algebra course that can serve double-duty of teaching engineering students a bag of tricks and give math students a reasonably thorough treatment of the subject.
>I'd go a bit further and say that if you're not comfortable with the basics of mathematical proofs, then you're not ready for the subject of linear algebra
I can't write a proof to save my life, but I'm going to keep using linear algebra to solve problems and make money, nearly every day. Sorry!
We had this discussion about Data Science years ago: "you aren't a real Data Scientist unless you fully understand subjects X, Y, Z!"
Now companies are filled to the brim with Data Scientists who can't solve a business problem to save their life, and the companies are regretting the hires. Nobody cares what proofs they can write.
There are (at least) two different things we're calling "linear algebra" here; roughly speaking, one is building the tools and one is using the tools.
The mathematicians need to understand the basics of mathematical proofs to learn how to prove new interesting (and sometimes useful) stuff in linear algebra. You have to do the math stuff in order to come up with some new matrix decomposition or whatever.
The engineers/data scientists/whatever people just need to understand how to use them.
You don't need to know how to build a car to drive one. The mathematicians are building the cars, you're using them.
I don't think I've ever done more rote manual calculation than for my undergrad linear algebra class! On tests and homework just robotically inverting matrices, adding/subtracting them (I think I even had to do some of that in high school algebra), multiplying them (yuck). It was tedious and frustrating and anything but theoretical.
I've learned linear algebra course quality varies substantially. One acquaintance whom I met after they graduated from a big university in Canada reported having to do things like by-hand, step-by-step reduced row echelon form computations for 3x4 matrices or larger. I had to do such things in "Algebra 2" in junior high (9th grade), until our teacher kindly showed us how to do the operations on the calculator and stopped demanding work steps. If we had more advanced calculators (he demoed on some school-owned TI-92s, convincing me to ask for a TI-89 Titanium for Christmas) we could use the rref() function to do it all at once.
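For anyone curious what "do it all at once" looks like today without a TI-89, here is a small sketch with SymPy; the 3x4 matrix is just a made-up example:

```python
# Matrix.rref() returns the reduced row echelon form together with the
# indices of the pivot columns.
from sympy import Matrix

A = Matrix([
    [1, 2, -1,  3],
    [2, 4,  0,  2],
    [1, 2,  1, -1],
])

rref_form, pivot_cols = A.rref()
print(rref_form)   # the reduced row echelon form of A
print(pivot_cols)  # which columns contain pivots
```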
In my actual linear algebra class in freshman year college we were introduced to a lot of proper things I wish I had seen before, along with some proofs but it wasn't proof heavy. I did send a random email to my old 9th grade teacher about at least introducing the concept of the co-domain, not just domain and range, but it was received poorly. Oh well. (There was a more advanced linear algebra class but it was not required for my side. The only required math course that I'd say was proof heavy was Discrete Math. An optional course, Combinatorial Game Theory, was pretty proof heavy.)
Linear algebra is usually a required (or at least strongly encouraged) course for an undergraduate degree in basically any engineering discipline, and it is usually not preceded by a course in "the basics of mathematical proofs".
> this book is meant for the audience who can read and write proofs
It seems like the opposite is true:
"It is intended for a student who, while not yet very familiar with abstract reasoning, is willing to study more [than a] "cookbook style" calculus type course."
(from the link).
If your point is that one can't learn linear algebra before learning "abstract [mathematical] reasoning"... I don't think you're describing the main target audience of a subject as practical as linear algebra.
> Besides being a first course in linear algebra it is also supposed to be a first course introducing a student to rigorous proof, formal definitions---in short, to the style of modern theoretical (abstract) mathematics.
So I think it's fair to say that the book (ought to) assume zero knowledge of proofs, contra your parent's claim that the audience is expected to be able to read and write proofs.
From the second paragraph of the introduction to the book we are discussing:
> Besides being a first course in linear algebra it is also supposed to be a first course introducing a student to rigorous proof, formal definitions---in short, to the style of modern theoretical (abstract) mathematics.
So it's certainly meant to be the first math book one sees in their life that discusses rigorous proofs.
A vector space is defined as having a zero vector, that is, a vector v such that for any other vector w, v + w = w.
Saying the zero vector is unique means that only one vector has that property, which we can prove as follows. Assume that v and v’ are zero vectors. Then v + v’ = v’ (because v is a zero vector). But also, v + v’ = v’ + v = v, where the first equality holds because addition in a vector space is commutative, and the second because v’ is a zero vector. Since v’ + v = v’ and v’ + v = v, v’ = v.
We have shown that any two zero vectors in a vector space are in fact the same, and therefore that there is actually only one unique zero vector per vector space.
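The same argument compresses into a single chain of equalities (writing 0 and 0′ for the two assumed zero vectors):

```latex
\begin{align*}
0' &= 0 + 0'   && \text{$0$ is a zero vector} \\
   &= 0' + 0   && \text{addition is commutative} \\
   &= 0        && \text{$0'$ is a zero vector}
\end{align*}
```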
We used this in my Discrete Mathematics class (MATH 2001 @ CU Boulder) (it is a pre-requisite for most math classes). The section about truth tables did overlap a bit with my philosophy class (PHIL 1440 Critical Thinking)
> The above statement of zero vector is unique, I have no idea what is that means.
In isolation, nothing. (Neither does the word “vector”, really.) In the context of that book, the idea is more or less as follows:
Suppose you are playing a game. That game involves things called “vectors”, which are completely opaque to you. (I’m being serious here. If you’ve encountered some other thing called “vectors”, forget about it—at least until you get to the examples section, where various ways to implement the game are discussed.)
There’s a way to make a new vector given two existing ones (denoted + and called “addition”, but not the same as real-number addition) and a way to make a new vector given an existing one and a real number (denoted by juxtaposition and called “multiplication”, but once again that’s a pun whose usefulness will only become apparent later) (we won’t actually need that one here). The inner workings of these operations in turn are also completely opaque to you. However, the rules of the game tell you that
1. It doesn’t matter in which order you feed your two vectors into the “addition” operation (“add” them): whatever existing vectors v and w you’re holding, the new vector v+w will turn out to be the same as the other new vector w+v.
2. When you “add” two vectors and then “add” the third to the result, you’ll get the exact same thing as when you “add” the first to the “sum” of the second and third; that is, whatever the vectors u, v, and w are, (u+v)+w is equal to u+(v+w).
(Why three vectors and not four or five? It turns out that if you have the rule for three, you can prove the versions for four, five, and so on, even though there are going to be many more ways to place the parens. See Spivak’s “Calculus” for a nice explanation, or if you like compilers, look up “reassociation”.)
3. There is [at least one] vector, call it 0, such that adding it to anything else doesn’t make a difference: for this distinguished 0 and whatever v, v+0 is the same as v.
Let’s now pause for a moment and split the last item into two parts.
We’ll say a vector u deserves to be called a “zero” if, whatever other vector we take [including u itself!], we will get it back again if we add u to it; that is, for any v we’ll get v+u=v.
This is not an additional rule. It doesn’t actually tell us anything. It’s just a label we chose to use. We don’t even know if there are any of those “zeros” around! And now we can restate rule 3, which is a rule:
3. There is [at least one] “zero”.
What the remark says is that, given these definitions and the three rules, you can show, without assuming anything else, that there is exactly one “zero”.
(OK, what the remark actually says is that you can prove that from the full set of eight rules that the author gives.
But that is, frankly, sloppy, because the way rule 4 is phrased actually assumes that the zero is unique: either you need to say that there’s a distinguished zero such that for every v there’s a w with v+w= that zero, or you need to say that for every v there’s a w such that v+w is a zero, possibly a different one for each v. Of course, it doesn’t actually matter!—there can only be one zero even before we get to rule 4. But not making note of that is, again, sloppy.
This kind of sloppiness is perfectly acceptable among people who have seen this sort of thing before, say, having done finite groups or something like that. But if the book is supposed to give a first impression, this seems like a bad idea. Perhaps a precalculus course of some sort is assumed.
Read Spivak, seriously. He’s great. Not linear algebra, though.)
Yes, and this book: https://openlibrary.org/books/OL28292750M (no opinion about differences between the editions, this is the final one). Yes, it’s a doorstopper, but unlike most other people’s attempts at the kitchen sink that is the traditional calculus course, it actually affords proper respect to the half-dozen or so different subjects whose basics are crammed in there. The discussion of associativity I was referring to is in Chapter 1, so perhaps it can serve as a taster.
I think a lot of people just need an opportunity to see math demonstrated in a more tangible way.
For example, I learned trig, calculus, and statistics from my science classes, not from my math classes (and that's despite getting perfect A's in all of my math classes). In math class, I was just mindlessly going through the motions and hating every second of it, but science classes actually taught me why it worked and showed me the beauty and cleverness of it all.
I think most college math depts have "applied math" majors. I like both sides of math, but I found it incredibly frustrating when I would try to study just the equations for that chapter, only to be tested on a word problem. The whole "trying to trick you" conspiracy turned me off to college in general. If I'm trying to teach someone how to do something, I would show them "A, then B, and you get C", then assign a variety of homework of that form, and on the test say "A, then B, then _____", and they would be correct if they concluded C. But for some reason this method isn't used much in university. If I wanted to teach a student how to start with C and deconstruct it into A and B, that's what I would have taught them!
If you study mathematics at a rigorous level then you learn by writing proofs. Then you will rack your brain for hours or even days trying to figure out how to prove some simple things. It is not at all “going through the motions” at that point!
A first course in linear algebra still assumes background information, because linear algebra is not a basic topic. It’s not meant to be a first course in math. Math builds on itself and it would be incredibly inconvenient if every course everywhere would have to include a recap of basic things. And proofs are among the most fundamental things in math!
Programming courses or articles or books, beyond the 101 level, don’t teach you again and again the basics of declaring a variable and writing a loop either! No field does that.
Wrt linear algebra in particular, there are plenty of resources aimed at programmers thanks to its relevance in computer graphics and so on. They typically skip proofs and just tell you that this is how matrix multiplication is defined, but they don’t teach you math, merely using math. Which can be plenty enough to an engineer.
This is not snobbery, some subjects just have prerequisites.
You can't learn computer science without having a good sense of what an "algorithm" is, you have to know how to read and write and understand algorithms. Similarly you can't learn math without having a good sense of what a proof is, reading, writing and understanding proofs is the heart of what math is.
Even more strongly, trying to learn math without a solid understanding of how proofs work is something like studying English literature while refusing to learn how to read English.
> Even more strongly, trying to learn math without a solid understanding of how proofs work is something like studying English literature while refusing to learn how to read English.
It depends why you're trying to learn math. Are you interested in math for math's sake, or are you trying to actually do something with it?
If it's the former, then yeah, you need proofs. Otherwise, as in your analogy, it's like studying English literature without knowing any English grammar rules.
But if you're trying to apply the math, if you're studying linear algebra because it's useful rather than for its own sake, then you don't need proofs. To follow the same analogy, it's like learning enough English to be conversational and get around America, without knowing what an "appositive" is.
The software industry, similarly, is full of people who make use of computer science concepts without having rigorously studied computer science. You can't learn true "computer science" without an understanding of discrete math, but you can certainly get a job as an entry-level SWE without one. You don't need discrete math to learn Python, see that it's useful, and do something interesting with it.
The same applies to linear algebra. Not everyone who does vector math needs to be able to prove that the tools they are using work. If everyone who does vector math were re-deriving their math from first principles, then something has gone terribly wrong. There's a menu of known, well-defined treatments that can be applied to vectors, and one can read about them and trust that they work without having proven why they work.
EDIT: it occurs to me, an even stronger analogy of this point, is that it is entirely possible to study computer science, without having any understanding of electrical engineering or knowing how a transistor works.
> But if you're trying to apply the math, if you're studying linear algebra because it's useful rather than for its own sake, then you don't need proofs. To follow the same analogy, it's like learning enough English to be conversational and get around America, without knowing what an "appositive" is.
Sure, but then you're not studying math, you're studying applications of math or perhaps you're even studying some other subject like engineering which is built on top of applications of math.
To add an extra analogy to the pile, it's like learning to drive a car vs learning how to build a car. Sure, it's completely valid to learn how to drive a car without knowing how to build one, but no one says they're learning automotive engineering when they're studying for their driving test. It's a different subject.
Mathematicians are well aware of complaints like these about introductions to their subjects, by the way.
It is for a reason that this book introduces the theory of abstract vector spaces and linear transformations, rather than relying on the crutch of intuition from Euclidean space. If you want to become a serious mathematician (and this is a book for such people, not for people looking for a gentle introduction to linear algebra for the purposes of applications) at some point it is necessary to rip the bandaid of unabstracted thinking off and engage seriously with abstraction as a tool.
It is an important and powerful skill to be presented with an abstract definition, only loosely related to concrete structures you have seen before, and work with it. In mathematics this begins with linear algebra, and then with abstract algebra, real analysis and topology, and eventually more advanced subjects like differential geometry.
It's difficult to explain to someone whose exposure to serious mathematics is mostly on the periphery that being exposed forcefully to this kind of thinking is a critical step to be able to make great leaps forward in the future. Brilliant developments of mathematics like, for example, the realisation that "space" is an intrinsic concept and geometry may be done without reference to an ambient Euclidean space begin with learning this kind of abstract thinking. It is easy to take for granted the fruits of this abstraction now, after the hard work has already been put in by others to develop it, and think that the best way to learn it is to return back to the concrete and avoid the abstract.
The point of starting with physical intuition isn't to give students a crutch to rely on, it's to give them a sense of how to develop mathematical concepts themselves. They need to understand why we introduce the language of vector spaces at all - why these axioms, rather than some other set of equally arbitrary ones.
This is often called "motivation", but motivation shouldn't be given to provide students with a reason to care about the material - rather the point is to give them an understanding of why the material is developed in the way that it is.
To give a basic example, high school students struggle with concepts like the dot and cross products, because while it's easy to define them, and manipulate symbols using them, it's hard to truly understand why we use these concepts and not some other, e.g. the vector product of individual components a_1 * b_1 + a_2 * b_2 ...
While it is a useful skill to be adroit at symbol manipulation, students also need an intuition for deciding which way to talk about an unfamiliar or new concept, and this is an area in which I've found much of mathematics (and physics) education lacking.
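One way to make that kind of motivation concrete for the dot product in particular is the identity tying the component formula to geometry, which is the reason this specific combination of components earns a name at all:

```latex
\[
\mathbf{a} \cdot \mathbf{b}
  \;=\; \sum_{i} a_i b_i
  \;=\; \|\mathbf{a}\|\,\|\mathbf{b}\|\cos\theta,
\qquad \theta \text{ the angle between } \mathbf{a} \text{ and } \mathbf{b}.
\]
```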
Physical intuition isn’t going to help when you’re dealing with infinite-dimensional vector spaces, abstract groups and rings, topological spaces, mathematical logic, or countless other topics you learn in mathematics.
Not at all! I fully endorse learning. My point is that physical intuition will only get you so far in mathematics. Eventually you have to make the leap to working abstractly. At some point the band-aid has to come off!
You just visualize 2 or 3 and say "n" or "infinite" out loud. A lot of the ideas carry over with some tweaks, even in infinite dimensions. Like spectral theorems mostly say that given some assumption, you have something like SVD.
Now module theory, there's something I don't know how to visualize.
>... rather than relying on the crutch of intuition from Euclidean space
Euclidean space is not a good crutch, but there are other, much more meaningful, crutches available, like (orthogonal) polynomials, Fourier series etc. Not mentioning any motivations/applications is a pedagogical mistake IMO.
I think we need some platform for creating annotated versions of math books (as a community project) - that could really help.
On that of course I agree, but mathematicians tend to "relegate" such things to exercises. This tends to look bad to enthusiasts reading a book on their own, because the key examples aren't explored in detail in the main text. In a structured course, though, those exercises become the foundation of learning, so there's a bit of a disconnect when you're just reading the PDF. When you study such subjects in a structured course, 80%+ of your engagement with the subject will be in the form of exercises exploring exactly the sorts of things you mentioned.
Axler serves as an adequate first introduction to linear algebra (though it is intended as a second, more formal pass; think analysis vs. calculus), but it isn't intended to be a first introduction to all of formal mathematics! A necessary prerequisite is understanding some of the formal language used in mathematics, and what "unique" means is included in that.
Falling entirely back on physical intuition is fine for students who will use linear algebra only in physical contexts, but linear algebra is often a stepping stone towards more general abstract algebra. That's what Axler aims to help with, and with arbitrary (for instance) rings there isn't a nice spatial metaphor to help you. There you need to have developed the skill of looking at a definition and parsing out what an object is from that.
> This is actually why I feel that mathematical texts tend to be not rigorous enough, rather than too rigorous.
This is _precisely_ the opinion of Roger Godement, French mathematician and member of the Bourbaki group.
I would highly recommend his books on Algebra. They are absolutely uncompromising on precision and correctness, while also being intuitive and laying down all the logical foundations of their rigor.
Overall, I cannot recommend enough the books of the Bourbaki group (esp. Dieudonné & Godement). They are a work of art in the same sense that TAOCP is for computer science.
Unfortunately, some of the Bourbaki books need to be read in French, because the typesetting on the English translations is so atrocious as to be unreadable; as a consolation, the typesetting on the original French is, as always, immaculate.
I absolutely agree about additional rigor and precision making math easier to learn. Only after you're familiar with the concepts can you be more lazy.
That's the approach taken by my favorite math book:
> This is actually why I feel that mathematical texts tend to be not rigorous enough, rather than too rigorous. On the surface the opposite is true - you complain, for instance, that the text jumps immediately into using technical language without any prior introduction or intuition building. My take is that intuition building doesn't need to replace or preface the use of formal precision, but that what is needed is to bridge concepts the student already understands and has intuition for to the new concept that the student is to learn.
If you read the book in the original post you may find it's absolutely for you.
Axler assumes you know only the real numbers, then starts by introducing the commutative and associative properties and the additive and multiplicative identities of the complex numbers[1]. Then he introduces fields and shows that, hey, look, we have already proved that the real and complex numbers are fields, because we've established exactly the properties required. Then he goes on to lists of numbers and proves the same properties (commutativity, associativity, and identities) in F^n, where F is any arbitrary field, so could be either the real or the complex numbers.
Then he moves onto vectors and then onto linear maps. It's literally chapter 3 before you see [ ] notation or anything that looks like a matrix, and he introduces the concept of matrices formally in terms of the concepts he has built up piece by piece before.
Axler really does a great job (imo) of this kind of bridge building, and it is absolutely rigorous each step of the way. As an example, he (famously) doesn't introduce determinants until the last chapter because he feels they are counterintuitive for most people and you need most of the foundation of linear algebra to understand them properly. So he builds up all of linear algebra fully rigorously without determinants first and then introduces them at the end.
[1] e.g. he proves that there is only one zero and one "one" such that A = 1*A and A = 0 + A.
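(For what it's worth, the uniqueness argument is the standard one-liner rather than anything specific to the book: if 0 and 0' both satisfy A = 0 + A and A = 0' + A for every A, then

    0' = 0 + 0' = 0' + 0 = 0,

so the two candidates coincide. Evaluating each identity at the other does all the work, and the same trick handles uniqueness of 1.)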
A lot of people think Gil Strang was that. Certainly his 18.06SC lecture series is fabulous.[1]
I really like Sheldon Axler and he has made a series of short videos to accompany the book that I think are wonderful. Very clear and easy to understand, but with a little bit more of the intuition behind the proofs etc.
These, plus betterexplained, ritvikmath, and SeeingTheory, will give you a very solid math background (I think they are better than 90% of the intro math classes in colleges).
> Isn‘t there anybody close to the Feynman of Linear Algebra?
No. The subject is too young (the first book dedicated to Linear Algebra was written in 1942).
Since then, there have been at least 3 generations of textbooks (the first one was all about matrices and determinants). That was boring. Each subsequent iteration is worse.
What is dual space? What motivates the definition? How useful is the concept? After watching no less than 10 lectures on the subject on youtube, I'm more confused than ever.
Why should I care about different forms of matrix decomposition? What do they buy me? (It turns out, some of them are useful in computer algebra, but the math textbook is mum about it)
My overall impression is: the subject is not well understood. Give it another 100 years. :-)
Gilbert Strang (already mentioned by fellow commenters).
> The subject is too young
"The first modern and more precise definition of a vector space was introduced by Peano in 1888; by 1900, a theory of linear transformations of finite-dimensional vector spaces had emerged." (from Wikipedia)
The first book was written in 1942 - it's mentioned explicitly in LADR.
It doesn't mean the concepts didn't exist - they did, Frobenius even built a brilliant theory around them (representation theory), but the subject was defined quite loosely - apparently no one cared to collect the results in one place.
It doesn't even matter much: I remember taking the course in 1974, and it was totally different from what is being taught today.
What? Linear Algebra is easily one of the best understood fields of mathematics. Maybe elementary number theory has it beat, but the concepts that drive useful higher level number theory aren't nearly so clear or direct as those driving linear algebra. It's used as a lingua franca between all sorts of different subjects because mathematicians of all stripes share an understanding of what it's about.
From what you said there, it seems like you tried to approach linear algebra from nearly random directions, and often from the end rather than the beginning. If you're in it for the computation, Axler definitely isn't for you. There are texts specifically on numerical programming; they'll jump straight to the real-world use. If you want to understand it from a pure math perspective, I'd recommend taking a step back and tackling a textbook of your choosing in order. The definition of a dual space makes a lot more sense once you have a vector space down.
I sympathize with the person you're responding to a lot more than you.
It's very easy to understand what a dual space is. It's very hard to understand why you should care. Many of the constructions that use it seem arbitrary: if finite-dimensional vector spaces are isomorphic to their duals, why bother caring about the distinction? There are answers to this question, but you get them somewhere between 1 and 5 years later. It is a pedagogical nightmare.
Every concept should have both a definition and a clear reason to believe you should bother caring about it, such as a problem with the theory that is solved by the introduction of that concept. Without the motivating examples, definitions are pointless (except, apparently, to a certain breed of mathematicians).
I've read something like 100 math textbooks at this point. I would rate their pedagogical quality between an F and a D+ at best. I have never read a good math textbook. I don't know what it is, but mathematicians are determined to make the subject awful for everybody who doesn't think the way they do.
(I hope someday to prove that it's possible to write a good math textbook by doing it, but I'm a long way away from that goal.)
I absolutely see what you're saying with that. I think I'm definitely the target audience of the abstracted definition, but I've long held that every new object should be introduced with 3 examples and 3 counter-examples. But you said it yourself: that's the style pure math texts are written in! Saying that "we" as a species don't have a good understanding of linear algebra is unbelievable nonsense. I can't conceive of the thought process it would take to say that with a straight face. The fact is, 10 separate YouTube lectures disconnected from anything else is just the wrong way to try to learn a math topic. That's going to have as much or more to do with why dual spaces seem unmotivated as the style of pedagogy does.
It's not that we don't have a good understanding of linear algebra at all. It's that we don't understand how to make it simple. That's a separate technological problem from actually building the theory itself.
I'm not the person you were originally replying to, but I have taken all the appropriate classes and still find the dual space to be mostly inadequately motivated. There is a type of person for whom the motivation is simply "given V, we can generate V* and it's a vector space, therefore it's worth studying". But that is not, IMO, sufficient. A person new to the subject can't make sense of that without understanding the alternatives: not defining it at all, or defining it and discarding it, and ultimately why one approach won out over the others.
I think in 50 years we will look back on the way pure math was written today as a great tragedy of this age that is thankfully lost to time.
> I think in 50 years we will look back on the way pure math was written today as a great tragedy of this age that is thankfully lost to time.
That could very well be true. I mean, just 100 years ago mathematics (and most education) consisted almost exclusively of the most insane drudgery imaginable. I do sometimes wonder what the world could have been like if we hadn't gated contributions in math or physics behind learning classical Greek.
I do think that some of the issues come down to different learning styles. I personally like getting the definition up front: it keeps me less confused, and I can properly appreciate the examples down the line. The way Axler introduces the dual space was really charming for me, and it clicked in a way that "vectors as columns, covectors as rows" never did. But that's not everyone! It's by no means everyone in pure math, and it's definitely not everyone who needs to use math. I've met people far better than me who struggled just because the resources weren't tuned towards them; there's a huge gap.
My argument is: whoever understands linear algebra has to be able to explain it to anyone with a sufficient math background. The failure to do so signals a lack of understanding. Presenting it as a pure algebraic game cleverly avoids the problems of interpretation, but when you proceed to applications, it leads to conceptual confusion.
One "discovery" I made while learning LA is that most applications are based on mathematical coincidence. Namely, the formula for the scalar product of 2 vectors is identical to the formula for the correlation between 2 series of data. There's no apparent connection between the "orthogonality" in one sense and "orthogonality" (as a lack of correlation) in another.
I submit that not only is the subject not well understood, but even the name of the subject is wrong. It should be called "The study of orthogonality". This change of perspective would naturally lead to a discussion of orthogonal polynomials and orthogonal functions, and create a bridge to representation theory and (on the other end) to applications in data science. What say you? :-)
I think that "when you proceed to applications" is the issue there. Applications where? For applications in field theory, the spatial metaphor is exactly incorrect! For applications in various spectral theories, it's worse than useless.
What you say regarding the seemingly coincidental nature of "real world" applications is basically correct (with correlation specifically there's some other stuff going on, so it isn't that surprising, but in general yes), but it's unavoidable for any aspect of pure mathematics. Math is the study of formal systems, and the real world wasn't cooked up on a blackboard. If we can demonstrate that some component of reality obeys laws which map onto the axioms, we can apply math to the world. But re-framing an entire field to work with one specific real-world use (not even, IMO, the most important real-world use!) is just silly.
I love the idea of encouraging students early on to look at different areas of math and see the connections. But linear algebra is connected in more ways to more things than just using an inner product to pull out a nice basis. Noticing that polynomials, measurable functions, etc. are vectors is possible without reframing the entire field, and there are lots of uses of linear algebra that don't require a norm! Hell, representation theory only does in some situations.
You start with a controversial statement ("Math is the study of formal systems"), and the rest follows. Not everyone agrees with this viewpoint. I think algebraic formalization provides just one perspective on things, but there are other perspectives, and their interplay (superposition) constitutes the "knowledge". Focusing just on the algebraic perspective is a pedagogical mistake IMO.
Some say it's all a kind of hangover from Bourbakism, though.
(Treating math as a game of symbols is equivalent to artificial restriction to use just 1% of your brain capacity IMO)
Hmm, I do see where you're coming from. To me, saying math is the study of formal systems is a statement of acceptance and neutrality- we can welcome ultrafinitists and non-standard analysts under one big tent. But you correctly point out that it's still a boundary I've drawn, and it happens to be drawn around stuff I enjoy. I'm by no means saying that there isn't room for practical, grounded math pedagogy with less emphasis on rigor.
However, there's plenty of value in the formal-systems stuff. Algebraic formalization is just one way of looking at the simplest forms of linear algebra, but there really isn't any other way of looking at abstract algebra. Or model theory, or the weirder spectral stuff. Or algebraic topology. And when linear algebra comes up in those contexts (which it does often; it's the most well-developed field of mathematics), it's best understood from an abstract, formal perspective.
And, just as a personal note, I personally would never have pursued mathematics if it were presented any other way. I'm not trying to use that as an argument- as we've discussed, the problem with math pedagogy certainly isn't a lack of abstract definitions and rigor. But there are people who think like me, and the reason the textbooks are written like that is because that's what was helpful to the authors when they were learning. It wasn't inflicted on our species from the outside.
> the reason the textbooks are written like that is because that's what was helpful to the authors when they were learning
The author, writing a book after 30 years of learning, thinking, and talking with other people, cannot easily reconstruct what was helpful and what wasn't. Creating a 1-dimensional representation of the state of mind which constitutes "understanding" is a virtually impossible task. And here algebraic formalism comes to the rescue. The "Definition" - "Theorem" - "Corollary" structure looks like a silver bullet; it fits very well into the linear format of a book. Unfortunately, this format is totally inadequate when it comes to passing on knowledge. Very often, you can't understand A before you understand B, and you can't understand B before understanding A - the concepts in math are very often "entangled" (again, I'm talking about understanding, not formal consistency). You need examples, motivations, questions and answers - the whole arsenal of pedagogical tricks.
Some other form of presentation must be found to make it easier to encode the knowledge. Not sure what this form might be. Maybe some annotated book format will do, not sure. It should be a result of a collective effort IMO. Please think about it.
BTW, this is not a criticism of LADR book in particular. The proofs are concise and beautiful. But... the compression is very lossy in terms of representing knowledge.
> "Definition" - "Theorem" - "Corollary" structure looks like a silver bullet, it fits very well in a linear format of a book. Unfortunately, this format is totally inadequate when it comes to passing knowledge.
I really can't emphasize enough that this is exactly how I learn things. I don't claim to be in the majority! But saying that no one can learn from that sort of in-order, definition-first method is like saying no one can do productive work before 6am. It sucks that morning people control the world, but it's hardly a human universal to sleep in.
> Some other form of presentation must be found to make it easier to encode the knowledge. Not sure what this form might be. Maybe some annotated book format will do, not sure. It should be a result of a collective effort IMO.
I 100% agree. Have you seen the napkin project? I don't love the exposition on everything, but it builds up ideas pretty nicely, showing uses and motivation mixed in with the definitions. I've been trying to write some resources of my own intended for interested laymen, so more focus on motivation and examples and less on proofs and such. I like the challenge of trying to cut to the core of why we define things a certain way- though I'm biased towards "because it makes the formal logic nice" as an explanation.
What do you mean by correlation and orthogonality? As in signal processing, where you might calculate the cross-correlation of two signals, and it basically tells you, at each possible shift, to what extent one signal projects onto the other (i.e. what their dot product is)? Orthogonality is obviously not invariant under permuting/shifting the entries of just one of the vectors (e.g. in your standard 2-d space of arrows, x-hat is orthogonal to y-hat but not to x-hat).
Linear algebra studies linearity, not (just) orthogonality. Orthogonality requires an inner product, and there isn't a canonical one on a linear structure, nor is there any one on e.g. spaces over finite fields. Mathematics, like programming, has an interface segregation principle. By writing implementations to a more minimal interface, we can reuse them for e.g. modules or finite spaces. It also makes it clear that questions like "are these orthogonal" depend on "what's the product", which can be useful to make sense of e.g. Hermite polynomials, where you use a weighted inner product.
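As a rough sketch of that interface-segregation point (my own toy example, nothing canonical about the function names): addition and scaling need no inner product at all, and whether two vectors are "orthogonal" flips depending on which product you bolt on.

    import numpy as np

    # The bare "vector space" interface: add and scale. No angles, no lengths.
    def add(u, v):
        return u + v

    def scale(a, v):
        return a * v

    # Orthogonality only appears once you choose an inner product.
    def standard_inner(u, v):
        return float(np.dot(u, v))

    def weighted_inner(u, v, w):
        # A weighted product, in the spirit of the Hermite-polynomial example.
        return float(np.dot(u * w, v))

    u, v = np.array([1.0, 1.0]), np.array([1.0, -1.0])
    print(standard_inner(u, v) == 0.0)                        # True: orthogonal for this product
    print(weighted_inner(u, v, np.array([1.0, 2.0])) == 0.0)  # False: not for that one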
> Namely, the formula for the scalar product of 2 vectors is identical to the formula for the correlation between 2 series of data. There's no apparent connection between the "orthogonality" in one sense and "orthogonality" (as a lack of correlation) in another.
Of course there is. Covariance looks like an L2 inner product (what you're calling the scalar product) because it is one: the scalar product of the two centered series. They're the exact same object.
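A quick numerical check of that, if it helps (NumPy, my own throwaway example): center the two series, and the Pearson correlation is literally the dot product of the resulting unit vectors.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=1000)
    y = 0.3 * x + rng.normal(size=1000)

    # Center each series; correlation = cosine of the angle between them.
    xc, yc = x - x.mean(), y - y.mean()
    cosine = np.dot(xc, yc) / (np.linalg.norm(xc) * np.linalg.norm(yc))

    print(cosine)
    print(np.corrcoef(x, y)[0, 1])   # same number (Pearson correlation)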
Why it should buy you something is the real question.
You don't need to understand it the way the original author thought about it, assuming that person even gave it much further thought...
The history of maths is really interesting, but it's not to be confused with math itself.
Concepts aren't useful in the "what economic opportunity does this buy me" sense. Think of them as "did you notice that property?", and then you start doing math by playing with those concepts.
Otherwise you'll be tied to someone else's way of thinking instead of hacking into it.
I know more math than the average bear, but I think the parent has a point even if I don’t totally agree with them.
Take, for instance, the dual space example. The definition, to someone who hasn't been exposed to a lot of math, seems fine but not interesting without motivation: it looks like just another vector space that's the same as the original vector space if we're working in finite dimensions.
However, the distinction starts to get interesting when you provide useful examples of dual spaces. For example, if your vector space is interpreted as functions (even the novice can see that a vector can be interpreted as a function that maps an index to a value), then a dual vector is a measure: a weighting of the inputs of those functions. Even if they are just finite lists of numbers in this simple setting, it's clear that they represent different kinds of objects, and you can use that when modeling. How those differences really manifest can be explored in a later course, but a few bits of motivation as to "why" can go a long way.
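In the finite list-of-numbers setting described above, that reading is literally a weighted sum (a throwaway sketch of my own, not from any text):

    import numpy as np

    f = np.array([3.0, 1.0, 4.0, 1.0, 5.0])    # a "function": index -> value
    mu = np.array([0.2, 0.2, 0.2, 0.2, 0.2])   # a "measure": a weighting of the indices

    # Applying the dual vector mu to the vector f = "integrating" f against mu.
    print(np.dot(mu, f))   # 2.8, the (uniformly weighted) average of f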
Mathematicians don’t really care about that stuff — at least the pure mathematicians who write these books and teach these classes — because they are pure mathematicians. However, the folks taking these classes aren’t going to all grow up and be pure mathematicians, and even if they are, an interesting / useful property or abstraction is a lot more compelling than one that just happens to be there.
Your post represents a common viewpoint, but I don't agree with it. I'm a retired programmer trying to learn algebra for the purposes of education only. I am not supposed to take an exam or use the material in any material way, so to speak. I'd like to understand. Without understanding the motivations and (on the opposite end) the applications, I simply lose interest. I happen to have a degree in math, and I know for a fact that when you know (or can reconstruct) the intuition behind the theory, it makes a world of difference. If this kind of understanding is not a goal, then what is?
BTW, by "buying" I didn't mean that it should buy me a dinner, but at least it's supposed to tell me something conceptually important within the theory itself. Example: in the LADR book, the chapter on dual spaces has no consequences later on, and the author even encourages the reader to skip it :).
> Why should I care about different forms of matrix decomposition? What do they buy me?
A natural line of questioning to go down once you're acquainted with linear maps/matrices is "which functions are linear"/"what sorts of things are linear functions capable of doing?"
It's easy to show dot products are linear, and not too hard to show (in finite dimensions) that all linear functions that output a scalar are dot products. And these things form a vector space themselves, the "dual space" (because each element is a dot-product mirror of some vector from the original space). So linear functions from F^n -> F^1 are easy enough to understand.
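To see that "every scalar-valued linear map is a dot product" in coordinates, you can recover the representing vector just by feeding in basis vectors; here's a small NumPy sketch of the standard argument (the functional is made up for illustration):

    import numpy as np

    def phi(v):
        # Some linear functional on R^3, chosen arbitrarily.
        return 2.0 * v[0] - v[1] + 0.5 * v[2]

    # Its representing vector: phi applied to each standard basis vector.
    a = np.array([phi(e) for e in np.eye(3)])

    v = np.array([1.0, 4.0, -2.0])
    print(phi(v), np.dot(a, v))   # same number: phi(v) = <a, v>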
What about F^n -> F^m? There's rotations, scaling, projections, permutations of the basis, etc. What else is possible?
A structure/decomposition theorem tells you what is possible. For example, the Jordan canonical form tells you that with the right choice of basis (i.e. coordinates), every matrix looks like a collection of independent "blocks" of fairly simple upper triangular matrices that operate on their own subspaces. Polar decomposition says that just as complex numbers can be written in polar form re^(it), where multiplication scales by r and rotates by t, so can linear maps be written as a higher-dimensional scaling composed with an orthogonal transformation/"rotation". The SVD says that given the correct choice of basis for the source and image, linear maps all look like multiplication on independent subspaces. The coordinate changes for the SVD are orthogonal, so another interpretation is that, roughly speaking, the SVD says all linear maps are a rotation, a scaling, and another rotation. The singular vectors tell you how space rotates and the singular values tell you how it stretches.
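The rotation-scaling-rotation reading is easy to check numerically; a minimal NumPy sketch (my own toy example):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.normal(size=(3, 3))

    U, s, Vt = np.linalg.svd(A)

    # U and Vt are orthogonal ("rotations"), s holds the stretch factors.
    print(np.allclose(U @ U.T, np.eye(3)), np.allclose(Vt @ Vt.T, np.eye(3)))
    print(np.allclose(A, U @ np.diag(s) @ Vt))   # rotate, stretch, rotate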
So the name of the game becomes to figure out how to pick good coordinates and track coordinate changes, and once you do this, linear maps become relatively easy to understand.
Dual spaces come up as a technical tool when solving PDEs, for example. You look for "distributional" solutions, which are dual vectors (on some vector space of functions). In that context people talk about "integrating a distribution against test functions", which is the same thing as saying distributions are dot products (integration defines a dot product), a.k.a. dual vectors. There are some technical difficulties here, though, because now the space is infinite-dimensional and not all dual vectors are dot products: e.g. the Dirac delta distribution delta(f) = f(0) can't be written as a dot product <g,f> for any g, but it is a limit of dot products (e.g. with taller and thinner Gaussians). One might ask whether all dual vectors are limits of dot products, and whether all limits of dual vectors are dual vectors (as limits are important when solving differential equations). The dual space concept helps you phrase these questions.
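The "delta as a limit of dot products" point can even be checked numerically; a crude NumPy sketch of my own (no care taken about the convergence subtleties): pair a test function with ever narrower Gaussians and the value approaches f(0).

    import numpy as np

    def f(x):
        return np.cos(x)          # test function, f(0) = 1

    x = np.linspace(-10, 10, 200001)
    dx = x[1] - x[0]

    for eps in [1.0, 0.1, 0.01]:
        g = np.exp(-x**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))  # narrowing Gaussian
        print(eps, np.sum(g * f(x)) * dx)   # "<g, f>" -> f(0) = 1 as eps -> 0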
They also come up a lot in differential geometry. The fundamental theorem of calculus/Stokes theorem more-or-less says that differentiation is the adjoint/dual to the map that sends a space to its boundary. I don't know off the top of my head of more "elementary" examples. It's been like 10 years since I've thought about "real" engineering, but roughly speaking, dual vectors model measurements of linear systems, so one might be interested in studying the space of possible systems (which, as in the previous paragraph, might satisfy some linear differential equations). My understanding is that quantum physics uses a dual space as the state space and the second dual as the space of measurements, which again seems like a fairly technical point that you get into with infinite dimensions.
Note that there's another factoring theorem called the first isomorphism theorem that applies to a variety of structures (e.g. sets, vector spaces, groups, rings, modules) that says that structure-preserving functions can be factored into a quotient (a sort of projection) followed by an isomorphism followed by an injection. The quotient and injection are boring; they just collapse your kernel to zero without changing anything else, and embed your image into a larger space. So the interesting things to study to "understand" linear maps are isomorphisms, i.e. invertible (square) matrices. Another way to say this is that every rectangular matrix has a square matrix at its heart that's the real meat.
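In matrix terms, that "square matrix at its heart" can be exhibited with a rank factorization; here is one way to do it via the SVD (my own sketch, one of several equivalent routes):

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0],
                  [1.0, 0.0, 1.0]])       # rank 2: the second row is twice the first

    U, s, Vt = np.linalg.svd(A)
    r = int(np.sum(s > 1e-10))            # numerical rank

    Q = Vt[:r]                            # "quotient": kills the kernel
    M = np.diag(s[:r])                    # the invertible r x r core (the isomorphism)
    E = U[:, :r]                          # "injection": embeds the image back in

    print(r)                              # 2
    print(np.allclose(A, E @ M @ Q))      # A = injection . isomorphism . quotient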
The thing is, you can teach linear algebra as a gateway to engineering applications or as a gateway to abstract algebra. The second one will require a hell of a lot more conceptual baggage than the first one. It’s also what the book is geared towards.
It is also intended for people who know something about the trade; it isn’t “baby’s first book on maths”. (Why can you graduate high school, do something labelled “maths” for a decade, and still be below the “baby’s first” level, incapable of reading basically any professional text on the subject from the last century? I don’t know. It’s a failure of our society. And I don’t even insist on maths being taught—but if they don’t teach maths, at least they could have the decency to call their stupid two-hundred-year-old zombie something else.)
That conceptual baggage is not useless even in the applied context. For example, I know of no way to explain the Jordan normal form in the 19th-century "columns of numbers" style preferred by texts targeted at programmers. (Not point at, not demonstrate, not handwave: explain, i.e. make it obvious and inevitable why such a thing must exist.) Or the singular value decomposition, to take a slightly simpler example. (Again, explain. Your task, should you choose to accept it, is to see the pretty picture behind it.) And so on.
Again, you can certainly live without understanding any of that. (To some extent. You'll have a much harder time understanding, say, the motivation behind PageRank. And ordinary differential equations, classical mechanics, or even just multivariable calculus will look much more mysterious than they actually are.) But in that case you need a different book and a different teacher.
I like the free course on linear algebra by Strang’s Ph.D student Pavel Grinfeld. It's a series of short videos with online graded exercises. Most concepts are introduced using geometric vectors, polynomials, and vectors in ℝⁿ as examples. https://www.lem.ma/books/AIApowDnjlDDQrp-uOZVow/landing
> Isn‘t there anybody close to the Feynman of Linear Algebra?
That would probably be Gilbert Strang.
While, as a maths person, I would prefer a bit more rigour, his choice of topics and his teaching skill make his course the most outstanding introduction I have seen.
I would run a mile from any course that disrespects determinants. And that includes Axler's!
Also I wish more Linear Algebra courses would cover Generalized Inverses.
As mentioned, the book was intended to be a "second course" in linear algebra. I personally self-studied out of the 3rd edition of Axler, and found it very helpful for understanding exactly what is going on with all the matrix computations we do.
Plus, the same can be said about artists. After all, it's all self-aggrandization, and art is not made to be simple or intuitive.
I actually found the book quite intuitive and helpful in understanding linear algebra. It does explain a lot of the intuition for many definitions, as well as mathematical techniques.
It's easy when presented with new things that you don't understand to reflexively dismiss them, but the ideas here are quite solid. It's also a textbook which aims to introduce students to a slightly higher level of mathematical thinking.
I self studied from this book as an undergrad. I was an EE major and took linear algebra as part of the mandatory ODEs class but didn’t “get it.” At a certain point, it became clear that if I wanted to learn the more advanced applied math I was interested in studying, I needed to really understand linear algebra. I thought Axler was great at introducing both the material and teaching me how to prove things rigorously. The month or so I spent that summer reading that book made the rest of the math I took in undergrad trivial.
Actually, combinatorial thinking is more useful for understanding algebra, and not the other way around. The key word is understanding. There's no need for a beautiful formula you cannot understand.
You probably have not come across Lev Landau's ranking of physicists. From Wikipedia:
Landau kept a list of names of physicists which he ranked on a logarithmic scale of productivity ranging from 0 to 5. The highest ranking, 0, was assigned to Isaac Newton. Albert Einstein was ranked 0.5. A rank of 1 was awarded to the founding fathers of quantum mechanics, Niels Bohr, Werner Heisenberg, Satyendra Nath Bose, Paul Dirac and Erwin Schrödinger, and others, while members of rank 5 were deemed "pathologists". Landau ranked himself as a 2.5 but later promoted himself to a 2.