What's interesting to me is that a lot of the material could also appear in a theoretical machine learning course or in an applied mathematics course. Before seeing the syllabus I was expecting more on streaming algorithms, randomized algorithms, etc. In my experience most Computer Science students get a bit freaked out when, e.g., concentration inequalities enter the scene.
"Nothing I learned in CS has an application to industry programming" is a regular complaint in discussion on education here and elsewhere. This is a good course to point to for people who have this very limited view on CS.
>"Nothing I learned in CS has an application to industry programming" is a regular complaint in discussion on education here and elsewhere. This is a good course to point to for people who have this very limited view on CS.
> Lecture 11 (Mon 5/2): Graphs as matrices and the Laplacian of a graph. Interpretations of the largest and smallest eigenvectors/eigenvalues of the Laplacian. Spectral embeddings, and an overview of applications (e.g. graph coloring, spectral clustering.)
What industry fields does this apply to? Machine Learning?
The web is a graph. Each node in the graph is a web page. Each edge is a hyperlink connecting one page to another. You can represent a graph as a matrix.
If you randomly click on links you'll end up visiting some web pages (the more connected ones) more often than others. You can define a probability distribution over pages as "what is the probability I'll end up at this web page after an infinite number of clicks?". This is the stationary distribution of a Markov chain.
The stationary distribution is the eigenvector corresponding to the largest eigenvalue of the (normalized) matrix that represents the graph. This gives a way to compute the stationary distribution via well studied algorithms.
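A toy sketch of that, assuming a made-up four-page graph (plain power iteration, which is about the simplest of those well studied algorithms):

```python
import numpy as np

links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}   # page -> pages it links to
n = len(links)

# Column j spreads page j's probability evenly over its outgoing links.
M = np.zeros((n, n))
for src, dsts in links.items():
    for dst in dsts:
        M[dst, src] = 1.0 / len(dsts)

p = np.full(n, 1.0 / n)   # start the random surfer anywhere
for _ in range(100):      # repeatedly "click a random link"
    p = M @ p

print(p)  # roughly [0.4, 0.2, 0.4, 0.0]: the well-linked pages get the most mass
```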
You can use this idea to
1) assign importance to pages based on the magnitude of the probability, to improve web search. You might call this algorithm PageRank.
2) found a company called Google that is currently worth US$trillions.
1) use the raw count of inbound links, weighted by the count of inbound links of the linking pages, normalized by the total number of links on each linking page, and run it for a few iterations to get a first approximation of the probability distribution (roughly the loop sketched below). This is a pretty intuitive approach, would almost certainly have been good enough for Google, and avoids all the jargon, probability, and "well studied algorithms".
2) found a company called Google that is currently worth US$trillions.
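Spelled out as code, (1) is just a short loop; a rough sketch, with made-up page names and no matrix machinery anywhere:

```python
links = {                 # page -> pages it links to (all made up)
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

score = {page: 1.0 for page in links}      # start everyone equal
for _ in range(10):                        # "run it a few iterations"
    new = {page: 0.0 for page in links}
    for src, dsts in links.items():
        share = score[src] / len(dsts)     # normalize by outbound link count
        for dst in dsts:
            new[dst] += share              # weight by the linking page's score
    score = new

print(score)                               # rough importance per page
```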
I love algorithms as much as the next guy, but PageRank is not a fantastic example of success-because-I-knew-eigenvectors despite what their PR folks would have you believe. One can't deny, though, that Google's PR has definitely attracted them a lot of super smart engineers over the decades, especially in the early days!
I think it's pretty easy to construct adversarial examples to your (1) that are dealt with cleanly by real PageRank.
e.g. if A is a Huge Important website, and A -> B -> C, then locally looking at {B, C} will underweight C significantly. (And, sure, you might say to look at k-th order inbound links for your iterative approach, but the adversary can just move the weight to k+1).
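To make that concrete, a toy sketch (graph and numbers invented; ten minor pages point at A purely to make it important, with the usual 0.85 damping factor):

```python
# A -> B -> C, plus ten minor pages that exist only to make A "important".
links = {"A": ["B"], "B": ["C"], "C": ["A"]}
for i in range(10):
    links[f"m{i}"] = ["A"]

def rank(steps, d=0.85):                   # d = the usual damping factor
    score = {p: 1.0 / len(links) for p in links}
    for _ in range(steps):
        new = {p: (1 - d) / len(links) for p in links}
        for src, dsts in links.items():
            for dst in dsts:
                new[dst] += d * score[src] / len(dsts)
        score = new
    return score

print(rank(2)["C"])    # two "local" passes: A's weight hasn't reached C yet (~0.08)
print(rank(60)["C"])   # run to convergence: C inherits much of A's importance (~0.26)
```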
Perhaps, as you claim, your approach would have been good enough, but clearly understanding the theory got them something better.
This argument that it’s pretty easy to construct adversarial examples would be much more convincing if…it were not what PageRank actually is. PageRank itself initially ran iteratively and would have had exactly this same problem.
In any case, I’m not sure your example is actually adversarial, as there’s not an action that an adversary could take implied by it? Maybe you meant “poorly handled case”? But yes, run it iteratively. 10 steps? 20? Until convergence for some epsilon?
Nothing against theory, but I think they did fine without it.
Huh, interesting. In the ML and CS theory literature I’ve only ever seen “adversarial example” used to mean an input explicitly designed to produce a specific unexpected output, not just a worst-case output that isn’t what you want.
Do you have an example use of this sense that I could look at to update myself?
1. The algorithm you're attempting to describe in 1) above is either exactly PageRank or an approximation to it, so I don't get the point you're trying to make.
2. The point of jargon and well studied algorithms is that you can recognize when your problem is a problem someone else has already tackled. In the specific case of PageRank, if you recognise it's an eigenvalue problem you can
1) Find efficient algorithms to implement it. In the case of PageRank there is a matrix factorization that can make the algorithm much faster.
2) Find convergence bounds for approximate solutions such as the power method, so you can appropriately trade off compute time against accuracy.
3) Find numerical techniques to avoid calculating a matrix full of NaNs.
4) Benefit from the decades of work in optimizing numerical linear algebra. (A quick sketch of 1), 2), and 4) follows below.)
3. When Google was first released it was clearly significantly better than the competition (mostly AltaVista). I assume that was due to PageRank. I wouldn't be surprised if PageRank is much less important to Google now, but to get to this point it had to have that initial advantage.
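Here's the promised sketch of 1), 2), and 4), with a random made-up five-page graph and plain numpy:

```python
import numpy as np

n, d = 5, 0.85                            # toy size and the usual damping factor
rng = np.random.default_rng(0)

# Random made-up link matrix, then column-normalized.
A = (rng.random((n, n)) < 0.4).astype(float)
A[:, A.sum(axis=0) == 0] = 1.0            # dangling pages: pretend they link everywhere
M = A / A.sum(axis=0)

# Damped "Google matrix": follow a link with probability d, jump anywhere otherwise.
G = d * M + (1 - d) / n * np.ones((n, n))

# (4) Hand it to a tuned library routine: dominant eigenvector, directly.
vals, vecs = np.linalg.eig(G)
v = np.real(vecs[:, np.argmax(np.real(vals))])
v = v / v.sum()

# (1)+(2) Or power-iterate with an explicit tolerance; the damping factor bounds
# the convergence rate, so you can trade iterations against accuracy.
p = np.full(n, 1.0 / n)
while True:
    q = G @ p
    if np.abs(q - p).sum() < 1e-10:
        break
    p = q

print(np.allclose(p, v, atol=1e-6))       # same ranking vector either way
```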
0. Hah, yeah. Pretty sure you’re the first person to accuse me of being anti-intellectual...I can’t say for certain, but I think this interpretation here might be on you.
1. Indeed I think you missed my point: the causality implied in your earlier post reads as “well, if you know math, PageRank pops right out!” — my point is, you don’t need to know anything about eigenvalues for PageRank to be an intuitive solution; the insight isn’t eigenvalues or even probability distributions of pages, it’s ranking by relative rank of inbound links. There are many ways to operationalize that insight, and your post made gatekeeping allusions about what math it takes to do that. That’s what I’m responding to.
2. Sure, but you were originally responding to a post about when graph algorithms might ever be useful in industry. Obviously all things being equal, knowing about graph algorithms > not knowing about them. But again, you don’t need to know about them in advance.
3. No doubt Google’s early success is due to PageRank. But PageRank’s success is due to the insight behind it — following web links backward (in fact the precursor to PageRank was called “backrub”) — and not due to eigenvalues.
To be clear, I’m not at all anti-intellectual, and I love a good math puzzle or algorithm as much as the next guy — and I said as much in my other post.
But I’m also not in love with the fetishization of formalisms that is common in technical and academic circles, and the attendant value system that is so eager to prove itself that it tries to fit any and all adjacent insights into its own paradigm — just as you’ve done here.
I wasn't using jargon for jargon's sake. The original question quoted this little passage:
"Graphs as matrices and the Laplacian of a graph. Interpretations of the largest and smallest eigenvectors/eigenvalues of the Laplacian. Spectral embeddings, and an overview of applications (e.g. graph coloring, spectral clustering.)"
My response used the same terms to quickly sketch how they relate to PageRank, and hence try to show how the theory relates to an application. If I was trying to explain PageRank to someone who wasn't familiar with the maths I would take a different approach, but this is a tiny little textbox and there are already many very nice visualizations and other explanations that one can Google.
Oh, I wasn't accusing you of using jargon for jargon's sake, and nor do I believe that you were -- you used the appropriate jargon appropriately and correctly! And certainly representing graphs as matrices and then applying matrix math to them is valuable, I am absolutely not arguing against that.
I'm just taking issue with the (probably tongue-in-cheek!) framing of your "quick sketch": rather than merely demonstrate the connection ("hey look, PageRank uses this stuff!"), the sketch implied that if one knew theory about graphs and their representation as matrices, then PageRank is pretty obvious and falls out of the math (and then you could start a US$trillions business!). I was content to just leave it at that with a snarky version of "hey, sure, this math is relevant, but it wasn't a requirement to come up with the PageRank insight" -- but then I got triggered by your "anti-intellectual" comment. :)
So now I feel compelled to say, in case any impressionable young minds read this thread: that sketch (which I likely grossly misrepresented above) is a fantasy portrayal of how the insights behind technologies like PageRank come about; matrix knowledge isn't necessary (but not unhelpful, of course) to create PageRank or similar things. I've met too many (not necessarily you!) who tell themselves some version of this story: "PageRank? That's just the largest eigenvector of the edge matrix" followed by some variant of "Those guys just got lucky! / Those guys just commercialized some known math! / I do math, and math can be worth billions!"
That story ignores the actual insights required to get to PageRank, even if you already know the theory; it ignores the hard work of actually building a company and commercializing such an insight.
Anyway, thanks for coming to my therapy session. </soapbox>
I'm not the person you're responding to, and perhaps it was their tone more than their content that you're replying to, but...
I think there's a lot of value in a KISS philosophy, and in using the simplest, most accessible algorithms possible. I don't think that's anti-intellectual so much as minimalist. Less charitably, one could say that reaching for something more complex than is needed is navel-gazing or gatekeeping.
I'm far from an expert on the topic, so I'm not trying to assert if either of you is more correct than the other as far as what the more appropriate model is.
I'm not into mathematics or algorithms and the like at all, but if someone explains things like PageRank in layman's terms, I'm like, "yeah, I get that". That's been the recurring theme with me and mathematics: if it's just the theory, language, and formulas I'm like "what?", but give me a practical example / use case, a way to visualize it, and I get it. I struggled with linear algebra in school - I don't even know why it's called that, and I can't explain it in proper terms - but if it were called "the math of video games" then it would instantly become a lot more accessible and visualizable (to me).
Anyway. I think I've intuited plenty of CS algorithms in my career, but don't ask me to explain the theory.
In my experience, there’s a language of mathematics that makes an almost comically strong attempt to make itself inaccessible through jargon, single-letter variable names, and an over-reliance on symbolic manipulation.
If you talk to most mathematicians or algorithm researchers, you’ll find that they are some of the most intuitive people you’ll ever meet, and they use visualization and other intuitive techniques all the time.
And then the classes are all about symbol manipulation.
I swear there are millions of people out there who could be mathematicians but just couldn’t get excited by all the symbol manipulation.
Photogrammetry & image processing, anything to do with dimension reduction (basically every field with real-world sensor data), ontology/semantic data processing...