I took a history of mathematics course at the University of Virginia. One of the interesting things I learned was that people used to write out equations as sentences before symbolic notation was invented. And so you would realize that a single moderately large equation (such as the quadratic equation) might be equivalent to a full paragraph or more of text. So if you ever get discouraged at how long it takes you to read mathematics, remember that equations can be as information-dense as entire paragraphs, or even more so, and just because an equation is presented in a spatially compact format doesn't necessarily mean that it shouldn't take you at least as long as a text paragraph to read and comprehend.
> equations can be as information-dense as entire paragraphs
And generally the equation is easier to understand than the same thing written in prose.
But did you notice this from the original article:
“When you add consecutive numbers starting with 1, and the number of numbers you add is odd, the result is equal to the product of the middle number among them times the last number.” (Levi’s theorem)
It's one of those rare cases where the English is much easier to understand than the equation.
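For comparison, the symbolic form of that sentence is 1 + 2 + ... + n = ((n + 1) / 2) * n for odd n. Below is a minimal sketch that checks the English statement directly; the function name `levi_holds` is made up for illustration and does not come from the article.

```python
def levi_holds(count):
    """Check the statement for the numbers 1..count, where count is odd."""
    if count % 2 == 0:
        raise ValueError("the statement only covers an odd count of numbers")
    numbers = list(range(1, count + 1))
    middle = numbers[count // 2]   # middle element of an odd-length list
    last = numbers[-1]
    return sum(numbers) == middle * last


# 1 + 2 + 3 + 4 + 5 = 15, and the middle (3) times the last (5) is also 15.
print(levi_holds(5))  # prints True
```

The code follows the wording of the sentence (middle number, last number) rather than the closed-form formula, which is arguably why the prose version reads so easily here.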
The problem I have with the way equations are often presented e.g. on Wikipedia is that the authors are so focused on conciseness that they don't stop to explain the idea in straightforward terms. For example, a while ago I was looking into the Radon Transform.
After reading a number of articles and eventually stumbling across one on MathWorks (http://www.mathworks.com/help/toolbox/images/f21-25938.html) with some great worked out examples I immediately got the idea: if you put your x-ray glasses on and look at an object from many different angles, and make note of what you're seeing at each angle, you can use your notes to rebuild the image or figure out how similar this object is to another object. Pretty intuitive.
With that understanding of the idea, the Wikipedia article made much more sense. Most mathematics is only difficult for me because I spend all my time trying to understand why the author is " the integral transform consisting of the integral of a function over straight lines". Once I get that, the rest is straightforward.
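The "x-ray glasses" picture can be sketched in a few lines. This is only a toy illustration under my own assumptions, not the MathWorks implementation: the function name `projection` and the sample `image` are made up, and each pixel is simply dropped into the nearest detector bin instead of being interpolated.

```python
import math


def projection(image, angle_degrees):
    """Sum pixel values along parallel lines seen from one viewing angle.

    A pixel at (x, y) is assigned to the bin nearest to
    x*cos(theta) + y*sin(theta), its position along the detector.
    """
    theta = math.radians(angle_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    bins = {}
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            offset = round(x * cos_t + y * sin_t)
            bins[offset] = bins.get(offset, 0) + value
    return [bins[key] for key in sorted(bins)]


# Hypothetical 3x3 "object": brighter numbers absorb more x-rays.
image = [
    [1, 2, 0],
    [0, 0, 0],
    [0, 3, 0],
]

print(projection(image, 0))    # looking down the columns: [1, 5, 0]
print(projection(image, 90))   # looking along the rows:   [3, 0, 3]
```

Each call is one line of "notes" taken at one angle; collecting them over many angles gives the data that a reconstruction would start from.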
+1 absolutely agree. Wikipedia is terrible for learning mathematical concepts, which is not surprising given that it's a reference site. It's appropriate to be concise in this context.
People are getting excited about education being an area ripe for disruption. What about a site that rips all the Wikipedia maths topics and allows people to add some worked examples and arrange the content to be more pedagogically pleasing? Providing a nice little HTML5 widget toolbox to create simple interactive visualisations would be cool...
I've read a couple of texts on the history of math. Most of them were dry, a few were very entertaining like Crowe's on Vector Analysis. But I haven't found anything that beats MacTutor. I used to read it nearly every day many years back.
The section on Al-Khwarizmi, a key innovator in algebra, shows how laborious the task was (site down, linking cache):
a square and 10 roots are equal to 39 units. The question therefore in this type of equation is about as follows: what is the square which combined with ten of its roots will give a sum total of 39? The manner of solving this type of equation is to take one-half of the roots just mentioned. Now the roots in the problem before us are 10. Therefore take 5, which multiplied by itself gives 25, an amount which you add to 39 giving 64. Having taken then the square root of this which is 8, subtract from it half the roots, 5 leaving 3. The number three therefore represents one root of this square, which itself, of course is 9. Nine therefore gives the square.
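In modern notation the passage solves x^2 + 10x = 39 by completing the square. Here is a rough sketch that follows the quoted steps one by one; the function and variable names are my own and are only meant to mirror the wording of the excerpt.

```python
import math


def solve_square_plus_roots(roots, units):
    """Solve x^2 + roots*x = units for the positive x, following the quoted recipe."""
    half_roots = roots / 2                 # "take one-half of the roots": 5
    squared = half_roots * half_roots      # "multiplied by itself gives 25"
    total = squared + units                # "which you add to 39 giving 64"
    side = math.sqrt(total)                # "the square root of this which is 8"
    return side - half_roots               # "subtract from it half the roots": 3


root = solve_square_plus_roots(10, 39)
print(root, root * root)  # prints 3.0 9.0
```

Like the original text, this only produces the positive root; the excerpt does not consider the negative solution.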
I'm sure I have the text lying around somewhere. I can't remember the title and author, but in any case the text was just a compilation of relevant excerpts from primary sources to be used as reference. It didn't provide any analysis. All the analysis was provided by the professor, Karen Parshall, whose courses I can enthusiastically recommend for anyone at U.Va. interested in the history of math or science (bonus: they satisfy your 2nd writing requirement). Her website is here: http://www.math.virginia.edu/~khp3k/
I credit her for pointing out the fact that I note in my original comment, that an equation can be equivalent to a large tract of text. She made the point in the context of why more "advanced" mathematics couldn't have developed before symbolic notation had matured sufficiently to support it. Here, I'm generalizing her point to explain why reading a mathematical text dense in equations might take longer than the page count alone might suggest.
It can also be helpful to find others who are interested in reading the same bit of math, and talk through it with them. They don't have to be particularly better at it than you, they just have to have a similar level of interest and curiosity. In grad school we read recently published papers in a small group setting we called "journal club".
This process helps in a number of ways. It keeps you from reading too fast or too passively because you're constantly asking and answering questions. It makes you less likely to get stuck in a dead end for very long, because others will see alternatives. It gives you an opportunity to ask about notation or background concepts you aren't familiar with. It helps you keep track of the big picture, because while some people are bogged down in a particular detail (like "how do they get from equation A to equation B?") others will be trying to tie it back to the big picture ("how does equation B fit into our overall goal?") And it allows you to see the even bigger picture as others bring in relevant knowledge or experience; it was pretty common to be working through a paper and have someone mention how it tied in to their current research project.
>A particular notorious example is the use of “It follows easily that” and equivalent constructs. It means something like this:
>One can now check that the next statement is true with a certain amount of essentially mechanical, though perhaps laborious, checking. I, the author, could do it, but it would use up a large amount of space and perhaps not accomplish much, since it'd be best for you to go ahead and do the computation to clarify for yourself what's going on here. I promise that no new ideas are involved, though of course you might need to think a little in order to find just the right combination of good ideas to apply.
Even knowing this ahead of time, this kind of thing can be maddening. Namely, when the meaning of the assertion is sensitive to little sign changes, index shifts and the like, these are quite likely to end up in the computation.
My Basic Concepts of Mathematics teacher was notorious for getting about halfway through a problem before announcing "and from here it's trivial to show...". One day he walked into class and addressed us:
"I've been told by the head of the department that I can no longer say 'it's trivial'. You have to understand when I say 'it's trivial', I just mean that I can do it"
The article is by Shai Simonson, one of the instructors at ArsDigita University (http://aduni.org/).
This was part of the reading material for ADU Course 0: "Mathematics for Computer Science" (http://aduni.org/courses/math/index.php?view=cw), but unfortunately the video lectures aren't available, whereas they are available for all the other courses.
Shai is an incredible teacher; his video lectures on theory of computation were a major part of what made me understand the beauty of CS theory (automata, CFGs etc). It was sad to see that they had to shut down ADU, but I guess it was an unsustainable model; if I recall correctly, there was absolutely no tuition charged to the students.
"The same half hour in a math article buys you 0-10 lines depending on the article and how experienced you are at reading mathematics"
I'm still reading through the article, but this might be the most important bit of information for anyone getting started reading papers that rely on maths. I wish someone had told me this when I started in with the more complex comp-sci papers because it's hard not to feel dense when you have to go over the same 3 or 4 pages of text time and time again to get the concepts.
Yeah, especially if you’re already a fast reader, it’s frustrating at first to have to drop to what feels like a snail’s pace to properly understand everything. But the information density is a real life-saver once you’re more experienced, because it lets you readily experiment with things at that high level, unencumbered.
During college, in order to plan my time, I used to measure my studying time on more mathematically intense texts. I took 30-60mins per page on average. It is a hard read, at least for me, but intensely gratifying too.
> The way to really understand the idea is to re-create what the author left out.
If reading mathematics requires re-creating what the author left out, why not leave it in?
Sure, it will be longer, but if the purpose is communication, wouldn't that be better? The reasons I can think of are not beneficial for communicating knowledge: that's how the game is played (tradition); it excludes the uninitiated/untalented; it's neater to leave out the truth of discovery; it makes the author seem superhuman; there's satisfaction for the reader in understanding the puzzle.
EDIT the reader can skip explanations he can work out himself (or use them as a check); papers are already structured with details in deeper, skip-able sections. One can have a summary that excludes details altogether (like an abstract, or equivalent to a present maths paper). In the article, the parts left out are not "known", but steps that the author could work out themselves, perhaps after many dead-ends to find the right combination. To avoid repetition of known specific concepts (like vocab), one could explicitly reference them, or assume them for a given audience.
Perhaps the essential problem is that the omitted steps are not a single concept (like a well-known term), but many concepts, combined according to other concepts (like a complex expression), so they can't easily be referenced, nor assumed. Someone, somewhere will have to work out the combination - I'm suggesting it is more efficient for it to be the one writer than the many readers.
Imagine how poorly your idea would be communicated if you included definitions or explanations for every word longer than six letters. It would be hard to cut through all the noise to get to the interesting part of your message. By assuming a particular level of background information (in this case, vocabulary), you're able to focus on the interesting and important insights.
The same is true in math. You don't leave out random steps; you leave out steps that you expect your audience to have already mastered. You leave out steps where the working-out process doesn't contain anything particularly new or useful. Anything that comes down to "apply a bunch of lower-level math in a tedious way" doesn't belong in your paper. Re-creating it shouldn't be necessary for getting the basic idea.
When this is done properly, the result is clear and concise communication.
One of my university lecturers gave what we all thought were dreadful lectures. Muddled, unclear, chaotic, with no discernible thread. It took ages to reconstruct and rework the material to a point where we could attack the problems and old exam questions.
I got nearly full marks on that exam.
Other lecturers were brilliant. Clear, lucid, entertaining. I didn't get full marks on their exams, because I found it hard to do the problems, even though I thought I understood the material from the lectures.
Math is not a spectator sport. You need to get involved, otherwise you're in the situation of someone who has watched a lot of tennis, but never played.
I used to mock the "it is clear that" phrase when it would take two or three pages to show the result, but having done the work to show it, I was then equipped to handle the next stage of the work. Having the explanation given to me as to why it was "clear" would not have done that, my understanding would be meagre, and unsatisfactory, and I would gradually fall behind and not understand what was missing.
So no, it's not because:
* that's how the game is played (tradition);
* it excludes the uninitiated/untalented;
* it's neater to leave out the truth of discovery;
* it makes the author seem superhuman;
* there's satisfaction for the reader in understanding the puzzle.
When done properly it's genuinely for more effective communication. I'm not saying it's always done well - not everyone writes equally well - and I'm not saying that everyone always has the best motives, but working on what you see as gaps in the presentation really is the best way to understand the material.
Added in edit:
You said:
> Someone, somewhere will have to work out the combination -
> I'm suggesting it is more efficient for it to be the one writer
> than the many readers.
If your purpose is to have it written down, then yes. If your purpose is to communicate effectively to the readers, then no. The "doing" is an essential part of the eventual "understanding".
> > The way to really understand the idea is to re-create what the author left out.
> If reading mathematics requires re-creating what the author left out, why not leave it in?
To really get some feeling for the content of the text, it seems to be really essential that the reader explores the content a bit herself. This is illustrated quite well in Don Knuth's "Surreal Numbers". Knowing this is probably the key to learning to read mathematics.
One can try to write all this out, but that will make the text harder to read for the mathematician, who now has to find the important bits of the text (think of the various 1000 page Visual Basic books). Also, it takes a lot more effort to write down all these "trivialities".
Perhaps the problem here is really with the school system. We study "maths" for many years in school, without ever learning how to read it.
Imagine someone explaining a new feature in their web framework while all the time taking time out to explain the concept of arrays, dictionaries, list comprehensions, URL routing, MVC etc. This may be a great idea for a book aimed at beginners though it would be frustrating for experienced programmers to have to constantly revisit basic concepts. Assuming the reader to have a solid background allows the author to concentrate on the novel thing they are trying to show.
Also historically some journals have restrictions on paper length, which increases the incentive to cut the idea down to the essential, novel material.
Maybe mathematics papers would be improved by hypertext links to background topics or explanations of sub-problems at the places where they are needed?
That's exactly the kind of problem that has been going through my mind for about 2 weeks now. We never learned how to analyze whitepapers or dissertations, let alone how to convert an unknown mathematical formula into usable code.
I see this as one of the most essential skills we never learned. Could someone please teach that skill? =))
I'd be very grateful if someone could help me and others understand how to dissect and codify the main point of a whitepaper.
Here http://groups.csail.mit.edu/netmit/sFFT/ is another very interesting algorithm I've known about for a long time. Even though I understand the principle and tried to put it into code, I wasn't able to identify what the important part of their whitepaper is. They even provided some pseudo-code, which I've seen in many other papers, but I've never seen where and how they standardized the pseudo-code notation. The pseudo-code looks ambiguous to me. (NOTE: They provide the code now, but that wasn't the case when the paper was first published.)
I agree that typical math writing must be absorbed very slowly and carefully to be understood well.
Unfortunately, in the world of mathematicians, this isn't what really happens most of the time. In grad school classes (for math), students don't have time to absorb the material this carefully - they must 'learn' too much too quickly (or at least this was the case at my grad school), so that reading math becomes learning just enough to pass exams. Writing a dissertation involves a lot of talking to people and skimming papers to hopefully grab relevant bits. Reviewing papers, for professors, involves handing papers to their grad students and asking them to read it. Even writing a paper for journal submission feels primarily about satisfying reviewers and getting in to the best possible journal - being readable becomes a lower priority than space constraints and the quibbles of a particular reviewer.
Off-topic, but who uses an IE document icon as a favicon? got freaked out here in the office thinking IE managed to crawl back anywhere I can mistakenly open it somehow.
Except that (good) code is written to be read, and to be understood - and not
be terse and cryptic, just so that people writing it have less to do. That's
essentially what most math texts do. I don't want math to be prose, but often,
a little more verbosity or communication of intent would be nice. Just like
good comments and documentation. In CS, this is universally accepted as good
style, and for very good reasons - and for the same reasons, it should be in
math, as well.
I think you're misunderstanding the purpose of the terseness of mathematics. The point is to convey the important concepts as cleanly as possible, assuming your reader has a certain level of prerequisite knowledge. The same can be said about code: you assume your reader has a certain level of understanding of the domain that you're modelling through code. The terseness is a virtue, in the sense that only the critical bits of new information need be conveyed. All the prerequisite knowledge is hidden behind abstractions that the reader should be familiar with. Math, and code, is cryptic if one doesn't have the prerequisite knowledge, but this is by design.
>The same can be said about code: you assume your reader has a certain level
of understanding of the domain that you're modelling through code.
No, this is not at all the same. Code can be read without understanding
everything if it is well documented. Yes, code should be short and simple, too.
But not terse and cryptic - and that's what so many math texts do. So yea,
cool, you can compress half a page of written English into half a line of
mumbo-jumbo. Great. You have gained nothing.
I'm not saying that the half page of written English is better. Not at all. But
there's something somewhere between those two where math should be. And not at
the cryptic end of the scale.
Take just simple things such as meaningful variable naming. There are tons
of good reasons why we do this in programming. I can't count the number of
times I've tried to decipher math texts and had to look up over and over and
over again what some x or f or lambda or phi is actually supposed to
represent. It gets worse when they start making distinctions based on the
bloody way a symbol is typeset (x vs x).
It's perfectly fine to do this in quick calculations on paper (just as it is
fine to use one character variables in quick one-off scripts). But if you are
writing a prolonged mathematical text, then take the time to give variables
meaningful names. It's not that hard, and would go a long way for improving
readability.
I don't think descriptive variable names would be useful in math. The thing in math is that "x" you see can be repeated 20 times in a set of equations. Having to read "running_total" 20 times instead is a hindrance here rather than a benefit.
The difference between math and code is the number of variables. A piece of code can have an order of magnitude more variables than a set of math equations. But in math, a few variables will usually be repeated many times. You're simply optimizing different usage profiles.
So the benefit in descriptive names in code is being able to distinguish easily the many different variables. The benefit in single letter variable names in math is that you're able to write the information in a more compact, digestible manner.
The reason this latter point is important is for the same reason short, compact programming languages are a boon to comprehension. The faster you can read a set of related items, the more of it is in your working memory at any given time and thus the better you're able to understand it. I fully believe the time it takes to input a chunk of information through your visual system is inversely related to one's understanding of it. To put it another way, working memory's decay function is parameterized by time.
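To make the trade-off concrete, here is the same formula written both ways. This is just an illustrative sketch: the function names are invented, and the example reuses x^2 + 10x = 39 (that is, a = 1, b = 10, c = -39) from the Al-Khwarizmi excerpt quoted earlier in the thread.

```python
import math


def root_terse(a, b, c):
    """Larger root of a*x^2 + b*x + c = 0, written with single-letter names."""
    return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)


def root_descriptive(quadratic_coefficient, linear_coefficient, constant_term):
    """The same computation, written with descriptive names."""
    discriminant = (linear_coefficient * linear_coefficient
                    - 4 * quadratic_coefficient * constant_term)
    return ((-linear_coefficient + math.sqrt(discriminant))
            / (2 * quadratic_coefficient))


print(root_terse(1, 10, -39))        # prints 3.0
print(root_descriptive(1, 10, -39))  # prints 3.0
```

The terse version fits on one line and keeps the familiar shape of the quadratic formula; the descriptive version says what each quantity is but no longer looks like the formula at a glance. Which of the two reads better is exactly what is being argued here.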
>The benefit in single letter variable names in math is that you're able to
write the information in a more compact, digestible manner.
You missed my point entirely. I said that you should not optimize for writing
math, you should optimize for reading math. And that's where the same
argument as for code holds true as well: It doesn't matter if it takes longer
to write, it will be read magnitudes more often, and that is what matters most.
Again, if you're just doing quick calculations on paper or hacking something
into your shell/REPL, I really don't care if you use cryptic variable names. I
do, too. But I don't want to ever see this in a good program, and I think that
it doesn't belong in a good math text, either.
>The reason this latter point is important is for the same reason short,
compact programming languages are a boon to comprehension.
And again you missed what I explicitly stated: it's good to write in a concise
way, but only as long as it doesn't become cryptic. And I disagree with your
belief about information processing. It's better to build semantic relations by
having meaningful names than it is to process a lot of information in a short
time. It really doesn't matter how fast you are able to read something - if it
doesn't make sense, you won't understand it.
>You missed my point entirely. I said that you should not optimize for writing math, you should optimize for reading math.
My point is that these goals are nearly one and the same when you get to the high level. Many mathematical relationships are very complex, usually not the step-by-step procedures that are common in code. Thus being able to hold the entire relationship in your head at once is crucial. Single letter names for variables and functions are critical here (for the previously mentioned reason: working-memory decay).
I basically have a bachelor's in math, and I could not imagine reading complex equations with full variable names. The hard part is understanding the whole, not remembering what x or i means. If you find an equation cryptic, that just means you don't have the requisite knowledge to really understand it.
Sorry, if all you have left is the (very typical) argument of
"if you think that way you just don't understand it", then I'll
consider this discussion over. I have no interest in hearing
endless appeals to tradition. I'm sure people made similar
arguments for GOTO back in the day. Good thing we moved on.
I would like to think my argument is far more nuanced than you're giving it credit for. This isn't an appeal to tradition; I'm making an argument that explains why traditionally math has stayed with the "cryptic" notation rather than descriptive variable names.
You're completely ignoring the working memory argument I'm making. A large part of math is pattern matching: recognizing a common relationship between variables (a common pattern or "theme") and investigating that relationship further. This ability is critical to mathematical ability. Visual compactness is crucial here. Using descriptive names will completely bog down your visual system with reading words rather than identifying patterns.
Pattern matching is a critical skill when one becomes an expert in any field; in math it's of utmost importance. Math is optimized for reading by other experts who have those same visual patterns committed to memory. This is the optimal approach for the work's target audience. Yes, it makes advanced math largely inaccessible to outsiders, but that's a part of the trade off.
>You're completely ignoring the working memory argument I'm making.
I'm not ignoring it. I said I don't buy it. I generally don't believe that
you need visual compactness to the point of being cryptic to make use of pattern
matching. I'm not saying this is necessarily easy. Neither is choosing good
variable names in programming, especially at high abstraction levels. But
that's no excuse for not doing it.
The other thing is that mathematicians will finally have to accept that
they aren't the only ones who need math. It's not, for the most part,
a "field for experts" (to paraphrase you) which only those who are willing
to become experts at it are allowed to understand. It's a beautiful science
with lots of applications in almost every other field. But it's inaccessible
as hell in large parts due to mathematicians with your attitude.
I strongly believe that math can be made more accessible and more
"user-friendly" for people who aren't experts at it. It will be hard to do
that without sacrificing its tremendous power and flexibility (which I am
totally against in software, as well). But it must (and will) eventually be
done.
Take note, by the way, that the accessibility problem is less and less
important the more specialized the math you are doing becomes. I don't really
care if some hyper-abstract field of math that a few dozen people in the world
are even able to grasp the basics of is arcane and cryptic. That's something
only experts will care for and if they are fine with their niche field being
overly cryptic, so be it. What I'm concerned about is math in general.
I firmly agree with the position that math should not be taught only for its
applications, because that is missing the point. However, it's downright fatuous
to ignore the fact that math has applications, and that therefore, a lot of
people will have to learn a subset of it. To excuse bad practices which have
survived by virtue of no one questioning them with nothing but "it's optimized
for edge cases (experts)" is a cop-out. It ignores the problem.
I also find it funny that I already met with such vehement opposition for just
proposing meaningful variable and function names. That was just an example.
There are a lot more problems to be solved. Stuff like "this is trivial" or
"left as an exercise" as a way to avoid writing out cumbersome, but nonetheless
important (for non-experts) parts, the general terseness of mathematical texts,
including the parts in plain English. I could probably find more.
Let me conclude with this: Math as a whole has accessibility problems, just
like a lot of CS and computer stuff. Flat out ignoring them or even asserting
"it must be this way" is incredibly ignorant. And we need dialogue to do this -
not condescending "there is no problem" rebuttals.
You make a lot of points that I agree with. I just don't think your solution is the right one. Math, when it comes to writing proofs and peer review, should absolutely be optimized for other experts. Whatever makes communicating ideas precisely among themselves easiest is what's important here. I don't see the benefit of changing this process so those of us as outsiders can take a peek.
But as you said, there is a very large part of math that deals with applying these concepts and communicating them to non-mathematicians. The way we go about this is definitely in need of a do-over. Finding more intuitive ways to communicate these ideas is a critical part of this (possibly including more descriptive variable names). But I think there will always be an inherent chasm between the math experts and those who are using the results they discover. An example is the difference between the calculus track of courses that every science major takes, and say, abstract algebra. The math that mathematicians do and the math that the rest of us learn will always be vastly different. It's just the nature of the beast.