Just one caveat I'd make to his presentation. He says "Naturally, one can extend this definition to n-fold tensor products...", but this isn't actually a definition - it's a theorem.
If you wanted to define A x B x C, you could begin by looking at (A x B) x C. This is elements like ((a,b), c) which obey tensor product laws. You could alternately look at A x (B x C), i.e. elements like (a, (b, c)).
It's pretty straightforward to prove that there is a natural isomorphism which preserves all the tensor product properties between (A x B) x C and A x (B x C) - it's basically just rearranging the parentheses. Here the term "natural" also has a technical meaning which agrees completely with its intuitive one (http://en.wikipedia.org/wiki/Natural_transformation).
Since (A x B) x C is isomorphic to A x (B x C), it's natural to equate them. You can also define A x B x C in the obvious way, which turns out to be isomorphic to the other two.
But it is important to have the theorem proving isomorphism.
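A minimal sketch of what that isomorphism looks like in coordinates, assuming finite-dimensional spaces with fixed bases and the usual row-by-row ordering of the product basis. The helper `tensor` and the sample vectors are made up for illustration. With that ordering the two bracketings give literally the same coordinate list, which is why the rearrangement feels so harmless:

```python
def tensor(u, v):
    """Coordinates of u (x) v in the product basis, flattened row by row."""
    return [x * y for x in u for y in v]

# Arbitrary sample vectors in spaces of dimension 2, 3 and 2.
a, b, c = [1, 2], [3, 5, 7], [11, 13]

left = tensor(tensor(a, b), c)    # coordinates of (a (x) b) (x) c
right = tensor(a, tensor(b, c))   # coordinates of a (x) (b (x) c)

print(left == right)
print(len(left))
```

This only checks one pure tensor in one choice of basis; the theorem is what says the identification works for every element and is compatible with linear maps.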
And to amplify this point slightly, the value of this is that we can use a weaker, more general definition and thus capture more behavior with our theory.
Unfortunately, a critically important point in this article is rather misleadingly expressed. The author's first axiom is: "The vectors are v⊗w for v in V, w in W" -- but (as the author clearly realises, and does state elsewhere, but never forcefully enough) the huge majority of elements of V⊗W are not of the form v⊗w.
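To make that concrete, here is a small sketch for V = W = R^2, where an element of V⊗W can be written as a 2x2 matrix of coordinates and a pure tensor v⊗w is the outer product of v and w. The helper names are hypothetical. Every outer product has determinant zero, so any element with nonzero determinant cannot be of the form v⊗w:

```python
def outer(v, w):
    """Coordinate matrix of the pure tensor v (x) w."""
    return [[x * y for y in w] for x in v]

def add(m, n):
    """Entrywise sum of two 2x2 coordinate matrices."""
    return [[m[i][j] + n[i][j] for j in range(2)] for i in range(2)]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

e1, e2 = [1, 0], [0, 1]

pure = outer([2, 3], [5, 7])               # a single v (x) w
mixed = add(outer(e1, e1), outer(e2, e2))  # e1 (x) e1 + e2 (x) e2

print(det2(pure))   # 0
print(det2(mixed))  # 1
```

Since `mixed` has determinant 1, it is a perfectly good element of V⊗W that is not v⊗w for any v and w; it is only a sum of such things.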
For this reason, I think the author's attempt to present V⊗W as a sort of modified version of V×W is ill-judged. Yes, those are both called products, but they're very different sorts of product; indeed (dropping briefly into the categorical machinery the author introduces at the end) in the category of vector spaces V×W is a coproduct rather than a product.
Again, it's clear that the author knows all this. But you wouldn't think so from the early parts of the article.
I assume the reason for all this is that the author's trying to relate (scary) tensor products to (nice comfortingly familiar) cartesian products. A laudable aim, but I fear the result will be confusion.
Double-plus nitpick: in Vect, VxW can be seen as both a product (which is usually called cartesian product) and a co-product (which is usually called direct sum). So, I guess I would not say "rather than" but "in addition to".
It's been a while since I studied category theory so maybe my brain is failing me here, but unless I'm super-confused (1) products are unique up to isomorphism and (2) the product in Vect is the tensor product.
The cartesian product is the categorical product in some other categories you can see vector spaces as belonging to -- sets, groups -- but there is only one product in the category of vector spaces and it's the tensor product.
(2) is incorrect in that the tensor product is not the category theoretical product in Vect.
The product of Vect is given by forming the cartesian product of the vector spaces (with the usual vector space structure on it; and the usual projections into the components are the morphisms you need for the product).
The coproduct is the direct sum of vector spaces: In the product construction, take all those vectors where only finitely many coordinates are non-zero (but the morphisms you need for the coproduct are the embeddings of the components into the direct sum).
Of course, if you take products or coproducts of finitely many vector spaces, the two coincide.
The tensor product is neither a product nor a coproduct in Vect, as the universal property it satisfies is rather different from the product/coproduct ones.
(If you consider tensor products of algebras over a commutative base ring R, then the tensor product is the coproduct in the category of R-algebras, but this is probably not what you had in mind.)
> the tensor product is not the category theoretical product in Vect.
Oh, you're right and I'm a twit. (Not simply because I got it wrong, but because if I'd spent 30 seconds to think what the product construction does in Vect rather than relying on my plainly-unreliable memory, it would have been obvious that it isn't the tensor product: there are obviously no candidates for the required projection morphisms.)
Thanks! (And apologies to Jeremy for having made an invalid argument, though in fact I think the arguments "the product is the tensor product, the cartesian product is actually a coproduct, so they're very different" and "the product is the cartesian product, the tensor product is an entirely different kind of product, so they're very different" are about equally convincing modulo the fact that the first one's key premise turns out to be false.)
Author here. It's a good point, and I updated that snippet of text to account for it.
But seriously, you don't think that saying things like "a nasty abomination of a product space," and "they’re only related in that their underlying sets of vectors are built from pairs of vectors in V and W" is enough to drive that point home?
I do struggle in balancing the amount of overly deep/technical bits on my blog with the easier to understand mathematics. If only my audience was primarily HN! :)
To me, "a nasty abomination of a product space" reads like "a product space which is a nasty abomination". The "only related ..." bit is good, but it comes much later in the article; it's the early bits that I think are liable to confuse.
I do appreciate, by the way, that explaining nontrivial mathematics to anything other than an audience of very good mathematicians is really bloody hard!
That's a good point. Maybe the word "adulteration" or "bastardization" would be better? Or maybe to avoid odd connotations I should just say it in long form: taking a mathematical machete to a product space and making it nearly unrecognizable.
I do still think, despite what many have said, that a comparison and distinction needs to be made between usual products and tensor products. I think "modifying a product space" happens to be a different way of thinking about it (maybe not the best way), and for what it's worth it helps me keep track of the damn things.
Prologue: I just searched for the word stress on this essay and couldn't find it. So...
* * *
Let's see; for me, if I can map an abstract concept to something readily visual, my understanding is faster. Are there some close visual aids to understanding tensors?
What physical property(ies?) can be mathematically modeled as a tensor?
Imagine a stack of tiles, bottom to top, thin and piled on top of each other. And each tile is connected with its neighbours with springs (not unlike a spring-coiled bed with many layers), like so:
============== ---> Thin tile
\ \ \ \
/ / / / ---> Springs
\ \ \ \
==============
\ \ \ \
/ / / /
\ \ \ \
==============
\ \ \ \
/ / / /
\ \ \ \
==============
Now, we can pull the topmost tile along the stacked direction, causing the springs to expand. If we do this and only this, it is pure tensile stress (I am using "stress" a bit loosely here). We can also sit on that stack, and that leads to compressive stress (just a tensile stress with a minus sign). [As a sidenote, bricks can take great compressive forces but can't withstand tensile forces of similar levels. But something like steel has an almost symmetric response between tensile and compressive loads.]
OK...what else can we do? Can I pull the topmost tile to the right (or left) while holding the bottommost tile still? Sure, and now the stack looks like a rhombus. This is shear stress...in the right-left direction. I could've also pulled the topmost tile towards (or away from) me. That is also shear stress in the front-back direction.
So, for this setup, we can identify three stress components: 1 tensile and 2 shear.
Now, the first tricky bit: imagine a "stack" that is bottom-top, left-right and front-back. There are springs running in all three directions. We have 3 tensile and 6 shear components.
Second tricky bit: shrink that new whole "stack" to a point. We have the stress tensor(!).
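As a rough sketch of the bookkeeping, the stress at a point can be stored as a 3x3 array: the diagonal holds the 3 tensile/compressive components and the off-diagonal entries hold the 6 shear components. The numbers below are invented, and `traction` is a hypothetical helper that applies the array to a unit normal to get the force per unit area on the face with that normal:

```python
# Made-up stress values in arbitrary units; rows and columns are x, y, z.
sigma = [
    [5.0, 1.0, 0.0],
    [1.0, -2.0, 3.0],
    [0.0, 3.0, 4.0],
]

def traction(stress, normal):
    """Force per unit area on the face with the given unit normal."""
    return [sum(stress[i][j] * normal[j] for j in range(3)) for i in range(3)]

tensile = [sigma[i][i] for i in range(3)]
shear = [sigma[i][j] for i in range(3) for j in range(3) if i != j]

top_face = [0.0, 0.0, 1.0]        # normal pointing up the stack
print(len(tensile), len(shear))   # 3 6
print(traction(sigma, top_face))  # [0.0, 3.0, 4.0]
```

In the stack picture, the last entry of the traction on the top face (4.0) is the pull along the stack, and the first two entries (0.0 and 3.0) are the left-right and front-back shear on that tile.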
It's a much clearer explanation than the original post, because it explains a different concept, which is easier to understand than the content of the post (what does a tensor product of vector spaces mean?)
The distinction is between understanding an instance of a concept (stress is a tensor) and the concept itself (what is a tensor, i.e., what exactly is it that stress shares with all other tensors). This is why the explanation in terms of a universal property is more difficult to understand: because by virtue of stripping away the extraneous details, it shows you the concept as it applies to all instances of the concept.
As someone who had no idea what a tensor was, a rough explanation of what they might sort of be related to is better than something that I don't understand at all, as long as it's clearly labeled that it's not an exact explanation.
But with tensors you can express not only compressive/tensile forces on a plane through a point, and not only shearing on that plane, but also a sort of rotation: curls -> non-symmetric across the diagonal elements. I believe Feynman introduced a nice picture for this: imagine you have a perfectly symmetric nutshell with some thin paper blades attached to it, so it looks like a rotor. Put it right into a stream and fix the position. If the stream is flowing faster around one side, it will start to spin. That's the curl at that point of the stream.
For anyone interested in category theory, the description given halfway, that "the tensor product is initial in the category of multilinear maps", is honestly almost enough information to replicate the remainder of this article (the non-computational bits anyway)
I'm by no means an expert in CT, but I can now read enough of it to see that. It's kind of like back in high school physics when you knew that everything could be solved by knowing F = ma and doing a bit of algebra.
Kip Thorne and Roger Blandford have an excellent introduction to tensors in their lecture notes on the Applications of Classical Mechanics. See sections 1.3 and 1.5:
They emphasize the geometrical viewpoint in which tensors are just linear functions which map vectors onto vectors via the dot product. (Or more generally a rank-n tensor is a linear function which maps a vector onto a rank n-1 tensor via the dot product.) Throughout the entire course they eschew coordinate systems as much as possible since that tends to obscure the underlying math and physics.
I found their introduction to tensors to be the most helpful I had ever read. Realizing that tensors are (or can be considered) functions that map vectors onto vectors was a big conceptual shift for me.
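A minimal sketch of that viewpoint in two dimensions, assuming an orthonormal basis so that a rank-2 tensor is just a table of components. The names `apply2`, `contract_first` and `apply1` are mine, not from the lecture notes. Feeding one vector into a rank-2 tensor leaves a rank-1 tensor, and feeding in the second vector gives a number:

```python
# Components of a hypothetical rank-2 tensor in an orthonormal basis.
T = [[1, 2], [3, 4]]

def apply2(T, u, v):
    """T as a real-valued function of two vectors, linear in each slot."""
    return sum(T[i][j] * u[i] * v[j] for i in range(2) for j in range(2))

def contract_first(T, u):
    """Fill only the first slot; what is left is a rank-1 tensor."""
    return [sum(T[i][j] * u[i] for i in range(2)) for j in range(2)]

def apply1(t, v):
    """A rank-1 tensor as a real-valued linear function of one vector."""
    return sum(t[j] * v[j] for j in range(2))

u, v, w = [1, 2], [3, 5], [7, 11]

print(contract_first(T, u))              # [7, 10]
print(apply1(contract_first(T, u), v))   # 71
print(apply2(T, u, v))                   # 71

# Linearity in the second slot: T(u, v + w) = T(u, v) + T(u, w).
v_plus_w = [x + y for x, y in zip(v, w)]
print(apply2(T, u, v_plus_w) == apply2(T, u, v) + apply2(T, u, w))
```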
Yes this paper is excellent. Far more comprehensible (to me) than the link at the top of this discussion. I guess it all depends on what you are already familiar with. I was familiar with vectors and functions of vectors, and the concept of a linear function. Therefore, "A rank-n tensor T is, by definition, a real-valued, linear function of n vectors." is a much better starting point for me.
Could that be the start of an introduction to tensor products or even tensors in general? Start with the cartesian product (or maybe even SQL joins?) and generalize? It's probably much more intuitive to us developers than vector products (unless you're in 3D graphics or physics engines, of course).
I like this introduction. Often it helps to get used to doing symbol manipulations before trying to fully understand things, and I think tensors are a good example.
(By the way, the map "f hat" is at one point described as "multilinear". I think that it's actually linear. After all, it comes out of a tensor product whereas bilinear maps can only really come out of a direct product.)
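A small sketch of the distinction, assuming V = W = R^2 and taking the dot product as the bilinear map f. The function names are hypothetical. The induced map f_hat is an ordinary linear functional on the four coordinates of V⊗W, while f itself is not linear on pairs:

```python
def tensor(v, w):
    """Coordinates of v (x) w in the basis e_i (x) e_j, flattened row by row."""
    return [x * y for x in v for y in w]

def f(v, w):
    """A bilinear map on pairs: the ordinary dot product on R^2."""
    return v[0] * w[0] + v[1] * w[1]

def f_hat(t):
    """The induced linear map on the tensor product: t_00 + t_11."""
    return t[0] + t[3]

v, w = [1, 2], [3, 5]
s = tensor(v, w)
t = tensor([7, 11], [13, 17])

# f factors through the tensor product: f(v, w) = f_hat(v (x) w).
factors = f_hat(s) == f(v, w)

# f_hat is additive on arbitrary elements, including sums of pure tensors.
additive = f_hat([a + b for a, b in zip(s, t)]) == f_hat(s) + f_hat(t)

# f is not linear on pairs: doubling both arguments quadruples the value.
quadruples = f([2 * x for x in v], [2 * y for y in w]) == 4 * f(v, w)

print(factors, additive, quadruples)
```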
Where? You should point this out so the author can correct it - I couldn't find a point where f-hat was described as multi-linear, which would certainly be a mistake.* It looked to me that f-hat was correctly described as a linear map throughout the article.
*Yes, technically a linear map is a multilinear map with rank-1.
I see. Great article by the way, I learnt tensor products through physics/exterior algebra, but they tend to teach tensor products and symmetrization/antisymmetrization at the same time. Seeing the plain tensor product was very informative.
This may indeed be a very useful guide to tensors, perhaps for people who have worked with them without a strong background understanding.
I don't want to detract from that, or from the author's thorough and sincere effort to provide tutelage to the world at large, for free, in a wide variety of mathematical subjects; he seems to have written hundreds of blog posts like this one.
However, I don't think that it is in any way about conquering any such thing as 'Tensorphobia', and if it was intended to be "significantly friendly so as not to make anybody run in fear," this effort is going to fail. I do not think that it was; I think there is a (quite innocent) dishonesty to the claim, and I feel like calling this out.
The language used contains typical signals of the sort of para-academic elitism parodied in 'Frasier', starting with the use of the uncommon word 'jest' in the first full sentence and continuing with the adoption of "we" and the scattering of various tofferies throughout the remainder of the text.
I was quite annoyed by this until I looked at the author's bio page, whereupon I thought, okay, perhaps for him this feels like his natural voice. Making the grade is one of his things, and he has in fact made the grade repeatedly and excessively, and he is young enough that there hasn’t really been much time for him to reflect on this much.
He's enthusiastic about teaching, which is a rare gem of a quality amongst the professoriate. So do not be crabby, you old substandard goat, who can barely add two to two without the crutch of an operator.
Okay.
However, the author's audience is much narrower -- he is making it much narrower -- than he may realize. He is writing for a person like himself, and he is, in a somewhat standard manner, offering the subject by way of little tests of the intellect, couched in suggestions of superiority.
This is wonderful fare for a student whose sense of self-worth is founded on the suggestion that they are more clever than the others; it is neither food nor balm for the rest of the anyones who might have cause or need to seek a more general understanding of tensors.
What I take exception to is that presenting the subject in this style while claiming to speak to the rest of the anyones will tend to reaffirm 'tensorphobia' for such people, essentially gaslighting them; this is not helpful. It is a distinct harm.
I remember realizing, in my first year of university, that a person of ‘general’ (low) intelligence would in fact be entirely capable of understanding the proof of the derivative, if it was taught to them properly. Everyone routinely understands more complicated things. The true barrier would be their indifference; they would have no use for such understanding. But otherwise the failure would not be on the part of the student; it would be on the part of the teacher.
I am not suggesting the same of tensors; perhaps I myself lack some quality that is needed to understand them. But I recognize the language that is being used here, and it has an easily comprehensible and distinct meaning that has nothing to do with tensors.
I think the author is doing wonderful and useful work, but I think he should reflect long and hard on his rhetorical affectations.
This article was not written for a layperson; it was written for someone who had taken college-level math classes and has a more-than-casual interest in math, but had until now been put off by overly-technical (read: very very technical) descriptions of tensors. I fall precisely into this category, and enjoyed the article. I found the tone and phrasing of the article quite natural, but as an academic I realize the wording might sound unnatural to others.
You must have a very poor impression of academics if you found the language of this article offensive. What's wrong with the word jest? If he isn't allowed to say 'jest', why do you get away with 'para-academic elitism' and 'rhetorical affectations'? Also, I realize that saying "we" might sound strange, but I didn't know it could offend. (It's not a royal 'we'; it refers to 'you and I', FWIW.)
Tofferies are things done by toffs that show they are toffs, where "toff" is a snooty, upper crust, elitist, aristocratic twit who holds commoners in contempt. Or so I learned while working in the UK.
It's all a part of the jargon of British class warfare. We have toffs in the US (Frasier Crane was a good example), but when Hollywood wants to create the ultimate toff, one whose every vowel drips with contempt and condescension, they will always give him a certain British toff accent that screams toff to every native English speaker--even those of us who don't know the word toff.
It's not a question of the word "jest" having anything wrong with it, and I think you mistake my point somewhat if your impression was that I was offended by the author’s rhetoric.
In North America, one does not hear the word “jest” that often. It is a perfectly decent word, but it is also — along with other turns of phrase employed here — symptomatic of a particular discourse that makes assertions regarding social order and fundamental power relationships between humans.
My objection is not to the author’s rhetoric. My objection is that the article suggests it will help the reader conquer the ‘fear’ of tensors while still using language that confronts the reader with an anxious situation. He is bluntly interpolating an elite — those with a notion of belonging to an elite — and presenting material as a test of membership in that category.
This is a good way to teach smart kids who are strongly motivated by the notion of belonging to a high(er) ranking peer group. (In the short term, at least.) It is a very, very bad way to teach anyone who has learned, consistently, that this language specifically disincludes them, and it is bad enough when it comes to teaching those who have learned that they are not, in fact, a member of any sort of intellectual elite.
“Tofferie” is a (probably misspelt) inferred word; that is, there is no such word, but its meaning is readily deducible. A “toff” is a person who displays mannerisms associated with wealth and status; the usage of this word in the North American context would be an example of the sort of signalling I am talking about.
I’ve met professors, both associate and full, in various social contexts; I probably have a very ordinary (or even mundane) impression of academics.
> In North America, one does not hear the word “jest” that often
That's not my experience. But certainly I'll admit that North America is a big place, and your experience could be different. I grew up in a blue collar town in the midwest of the US.
I hear "jest" all the time, and hear it in all kinds of company. But usually I only hear it in one of two idioms: "I jest", usually said after a joke if it's apparent that the listener is taking offense to something just said; and "only half in jest", indicating that the speaker wants you to take what s/he just said seriously.
I've read your comments three times, and the original article twice. I'm still very confused about what your point is.
Before I read your comments, which call direct attention to the author's style, my impression of it was that the author was writing in a deliberately informal, encouraging and sympathetic tone. The material is somewhat difficult, he admits, but not as difficult as it is made out to be, and the reader will be rewarded with the understanding of an elegant concept if they just push on for a bit. Having read and done a bit of mathematics myself, I recognize this tone as the one I try to adopt when I explain concepts to others, or write notes for my own understanding. (Which is half of the reason Kun writes these things himself, I assume. Kudos to him for it.) It is far from condescension-- if anything, it is _commiseration_.
Your concerns, so far as one can find them stated explicitly, are nits about style. These would be minor even if they were accurate, but they are not. For example, "half in jest" is twice as popular as "half-joking", according to Google. "We", as other commenters have noted, would have been perfectly fine even if it weren't already completely standard usage in mathematics. Which it is. I am reminded of a line from the "Note on Notation" in Hodges' _Model Theory_:
> 'I' means I, 'we' means we.
For all the objections about uncommon vocabulary, I can't help but observe that the only usage that seems to have confused readers in this thread (myself included) was "tofferies". But I learned a new word today, so thank you for that.
But I still don't understand how "jest" is "symptomatic of a particular discourse that makes assertions regarding social order and fundamental power relationships between humans". In fact, the phrase:
> symptomatic of a particular discourse that makes assertions regarding social order and fundamental power relationships between humans
strikes me as _rather a lot more_ "symptomatic of a particular discourse that makes assertions regarding social order and fundamental power relationships between humans". In particular, the discourse of critical theory, which has become so cloistered and self-referential that you can pick it out a mile away. I would be grateful if you would spell out exactly what those assertions are, so long as you are charging the author with making them.
I catch another whiff of that with phrases like "he is bluntly interpolating an elite". Either you mean "interpolating" in the sense of approximating a function between two known values, in which case I have no idea what you're talking about. Or you mean "interpellating", in the Althusserian sense, which is the only interpretation that the context seems to support. But in that case, look what you're doing: you're walking into a forum on hacking with highly, _highly_ abstruse jargon that even the average English-speaking philosopher does not understand. I would say that you are hoist by your own petard here, but perhaps that phrase would strike you as oppressive.
Finally, I don't understand what you intend by "an anxious situation". Do you simply mean that math produces anxiety? There's no shame in admitting that, but you might at least acknowledge that it's precisely that anxiety which Kun attempts to combat in his tutorial. I don't see how you can miss that Kun is, in fact, making an extremely generous effort to encourage the reader.
PS: Those "little tests of the intellect" are what we in math call "exercises", which are how one actually grasps the material. You cannot learn math by passive absorption-- you have to sharpen your pencil and puzzle it out for yourself. Not because "we did it so you have to, too", but because we sincerely know of no other way to put mathematical understanding in another human being's head. Well-selected exercises are actually a strong signal to the reader that the author really cares about the reader's comprehension.
I don't know of a charitable way of putting this, but for one criticizing the OP's engagement with the conventions of mathematical writing, you seem rather innocent of them yourself.
I'm a somewhat math-undereducated programmer obsessed with DSP. I've gotten very used to the feeling of being in way over my head reading up on math concepts.
As I stick with it, I learn more and more. Yeah, it's frustrating, but it doesn't make me feel stupid—if anything, there's a sense of pride and excitement when something finally does sink in. It's become a bit of an addiction.
I for one definitely support attempts to make math more accessible, even if "more accessible" still requires effort on my part, and even if the author's tone seems overly formal or jargon-laden.
I can only hope the rest of the author's audience has the same attitude. It seems impossible to get anywhere with math without it.
Ok, I think I see what you're trying to say now, though your claim that he is "bluntly interpolating an elite" still seems fantastic to me. I didn't get that impression whatsoever.
Would you mind giving a few (say 3) specific examples from the text that are indicative of this? I (and perhaps others?) am curious what sorts of phrasings give that impression.
I agree that the article isn't as easy to read as the author hopes; it's not exactly "explain it like I'm five". However, I think you're going to have to give more concrete examples of the particular writing style you're objecting to, and how it could be done differently.
Anyone that's covered enough maths or physics to be interested in the subject is unlikely to be put off by the use of the word jest. In order to get to this point you will already be well versed in such language.
For what it's worth, I enjoyed the article. I've used tensors a little bit, but I remain uncomfortable with them. Perhaps I'm exactly the audience you were targeting. :)
The map \alpha: V_1\times \cdots \times V_n \rightarrow V_1\otimes \cdots \otimes V_n is initial in the (slice) category whose objects are the F-multilinear maps f: V_1\times \cdots \times V_n \rightarrow W from V_1\times \cdots \times V_n to a vector space W over F, and whose morphisms are F-linear maps "between" such multilinear maps--meaning a commuting triangle as follows.
Given f: V_1\times \cdots \times V_n \rightarrow W and g: V_1\times \cdots \times V_n \rightarrow X, a morphism \beta: f \rightarrow g "is" (which means "is given by") an F-linear map \beta: W \rightarrow X such that g = \beta \circ f. Now of course, one has to specify the category in which the equation holds...which would be the category from which the slice category is obtained.
This is all elementary. I hope I haven't deployed the "fake it until you make it" affectation of academic authority that escape_goat (a cute nick) finds deplorable. I do too, though were I to write about it, I would raise the suspicion that I am a disaffected outsider who deserved exile. (I seem to have just done that.)
Nelson's text on tensor analysis is another route to tensorial enlightenment.
I could also add my own obscure blog to the long tail of wannabe math \cap programming blogs: http://publicsphere.org.