Hacker News new | past | comments | ask | show | jobs | submit login
Stanford Compilers Course (stanford.edu)
387 points by tu7001 on Nov 20, 2017 | hide | past | favorite | 79 comments



I took this course for fun a few years ago - it was fantastic! I modified the project a bit to match my own interests - instead of compiling to an executable, my compiler generated browser-friendly JavaScript code. I built a small web application that uses my compiler and allows you to run COOL code in your browser: http://nathanfriend.io/cooltojs/


I also did the course, in parallel to the "real" university course. Both had almost the same structure since they're both roughly based on the dragon book.

I can strongly recommend it - really great material!


I hate it that you can't see the syllabus without signing in. How is someone supposed to know whether (s)he's interested?

I've looked over it, it's not too dissimilar (on surface) with what I used to teach at the university. Might not be an accident since I remember at some point we switched from our own compiler skeleton to COOL, which is apparently what Stanford uses; so maybe I inspired my syllabus from Stanford's? I forget :) (for sure I did check the syllabi of several big US universities when building mine).

Interesting that he recommends "Engineering a Compiler"; I don't doubt that it's a good book but I don't know it. The third on my list (though admittedly a bit advanced) was Muchnick's "Advanced Compiler Design and Implementation".


Muchnik covers a lot of ground, so is pretty dense. Cooper & Torczon's book is more approachable, though it covers less ground. In some cases, Cooper & Torczon choose which algorithms to present because they're easier to understand. (E.g., they don't cover partitioning-based value numbering, but rather a simpler hash-based technique.)


Muchnick deserves a second edition. He’s retired but someone should take on the task.


As I recall, the same professor also taught a compiler class using COOL back in 1995-96 at Cal. I sat in on a session because a friend was taking it at the time.


As an autodidact, I really appreciate that the big "brand-name" universities do this. MIT OpenCourseware and the free Stanford lectures have been so incredibly valuable in making me an at-least-half-decent engineer.

I've written a Scheme interpreter a few years back, but never a compiler, so this should be a pretty fun time-dump for the weekends.


Have a look at "An Incremental Approach to Compiler Construction", you might find it interesting.

http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf


Not to be confused with “an approach to incremental compiler construction.”


I especially like when courses are not watered down. For instance, many coursera courses are simplified to be accessible to a broader audience with limited background. When I want to dig deeper, I really enjoy some of MIT opencourseware classes.


Can you recommend any MIT ocw courses that are challenging while being high quality? Some of them don't have video lectures or are quite hard to follow without TA/teacher help, that I've been hesitant to try them out again.


I did three graduate classes. - Distributed systems. 6.824 (There are two versions. the labs changed over the years). - Operating systems: 6.828. Operating systems

I mainly did the labs. In 6.828, you program a full OS kernel (JOS, based on provided source code). In 6.824, one project was a user-space distributed file system in C++, another was a distributed reliable key-value store in Go.

I agree that the course material is hard to follow without external help but it's definitely doable. They have automatic grading scripts and the labs are progressive.


Oh man those both sound awesome. Thanks for adding the info! Especially the distributed systems course.


I have no illusions that this is likely to happen, but if I ever hit it big, the first thing I will do is donate a good chunk of change to educational programs like MIT OpenCourseware. It is extremely generous to make this stuff freely available, and has contributed immensely to my ability to educate myself.


I believe this course was developed by Alex Aiken during his time at Berkeley. I assume he brought the material with him when he moved over to Stanford. The Cool language as a didactic tool seems a little dated now. When I took the Coursera version of the course a few years ago, Aiken distributed the Cool compiler development environment as a VirtualBox image, which was a nice cross-platform solution at the time. Today, Docker would probably be the preferred approach. I didn't get very far in the course because I lacked the necessary C++ skills (and the motivation to learn them).


Syllabus:

Week Videos Quiz / Exam PA assigned PA due

1 Course Overview | Cool: The Course Project

2 Lexical Analysis | Finite Automata Quiz #1 PA1

3 Parsing | Top-Down Parsing Quiz #2

4 Bottom-Up Parsing I | Bottom-Up Parsing II Quiz #3 PA2 PA1

5 Semantic Analysis and Type Checking | Midterm

6 Cool Type Checking | Runtime Organization Quiz #4 PA3 PA2

7 Code Generation | Operational Semantics Quiz #5

8 Local Optimization | Global Optimization Quiz #6

9 Register Allocation | Garbage Collection PA4 PA3

10 Java | Final Exam


They still reference the Dragon book, but how relevant is it still? I recall that a lot of it deals with parsing (which nowadays seems an almost trivial problem), and it doesn't treat garbage collected runtime systems, or JIT compilers. Or type inference with any of the newer typesystems as used by e.g. Rust, or even Hindley Milner.


This is an undergrad-level course. Of course it needs to teach parsing - a lot of people are going to have to write some form of parser during their career, so it's good to have some fundamentals. (and the Dragon book doesn't exclusively talk about parsing, either)


Man, it's really unconscionable the way that textbooks juice people. There is no reason that book ought to cost $190 (what I just saw on Amazon for the hard-cover). It's across the board; once a book gets entrenched as one of these go-to textbooks for a subject, it can for some reason be marked up 400% over equivalent books.


> There is no reason that book ought to cost $190

Why? How many copies of these textbooks do you think they sell? There is about 50,000 "computer and information science" bachelors degrees awarded each year (which is broader than what you'd think of as CS). Let's say, generously, 5% take a compilers course. That's 2,500 books per year in the entire subject. Let's say any one book gets about 1,000 of those sales. At $190 per book, that's $190,000 gross. Over an entire edition, maybe you sell 2-3 times that much (considering campus book rentals, people buying used books, etc.) That's maybe $500,000 to write, edit, publish, and distribute an entire textbook. That doesn't seem unreasonable to me at all.


5% is generous?

It's a requirement at some universities so I wouldn't be surprised if it's higher. Then again, I don't know what % of universities requires it either.


I think 5% is generous if you're looking not just at computer science degrees, but "computer and information science" degrees (which is what the government tracks).


The authors are already generously compensated. I am super curious how much they actually make, or is this the same as the paper paywall? Inflation adjusted this book is way more expensive than it used to be in its first version.

The hardcover version of the first edition is $2.50 in Alibris. New this book shouldn't cost more than $90.


Priceonomics breaks it down: https://priceonomics.com/why-are-textbooks-so-expensive.

tl;dr version: you're ignoring all the other stuff that goes into selling a textbook and grossly overestimating how much margin publishers have. Professors don't submit PDFs ready to hit "print." The books must be edited, typeset, etc. This can't be done by minimum wage workers because the editors have to actually understand what they're editing. The hypothetical $230 book nets about $40 in profit.


It seems a lot of computer science books I read that have been published lately talk about their build process (web, PDF, EPUB/MOBI) in the same way they would talk about compiling code. How much typesetting is really done these days?


Two words: "paperback" and "international".

It's about $20 that way [1].

Note that it is legal in the US for domestic sellers to import international editions to resell domestically, so for US people buying international editions does not necessarily mean actually buying directly from someone outside the US [2].

[1] https://www.abebooks.com/servlet/SearchResults?n=100121501&s...

[2] There is, of course, nothing wrong with doing business with someone outside the US. It is just that if something goes wrong (book lost in shipment for example), it might be more difficult to resolve the issue satisfactorily if the seller is halfway around the world. The time zone difference could make talking to them difficult, as could language differences, and consumer protection provided by your credit card might only apply to domestic purchases.


May I suggest a different perspective?

A textbook represents a distillation of the author's knowledge and experience in some specific field.

You may need that knowledge now, because at some time in the future you may be earning a living from what you know in that field.

Usually a business with some knowledge gap in their practices, hires a consultant for their knowledge in that specific field. By the same token, when you buy a textbook, you are "hiring" the author/s of a textbook as consultants, to fill in some gap or need in your knowledge. How much do you think you should pay your consultants(authors) for their knowledge and experience?

And that is not such a high price - I have read from some engineering contract law, and maritime law, textbooks that were in the order of $3500 each, if I had to pay for them. I have seen some specialist medical surgery textbooks in the same ballpark price.


I almost always buy textbooks through alibris. Thier prices are typically much lower than Amazon for used books. Sometimes I have to purchase an international or the next newest edition, but I almost always get out for less than $30. I actually just bought "the dragon book" 2nd edition for $25 shipped.


It's all about perspective.

You're going to make plenty of money - I've been working for a while, I don't care what things below 1000$ cost - I have more left over than I spend regardless, and the mental bandwidth that's been freed up by not having to look at prices is very rewarding.

It's scarcity vs abundance mentality.

Another way to look at it is - when you're out there working, making say a phone app, and you find out it is costing them a million dollars, and it's a bug ridden piece of shit - you'll look back on this textbook, that was written by very smart people, proofread, etc, and it only costs 190$.


Try bookfinder.com


Branding 101. Do you feel it's a phenomenon unique to textbooks? :)

(but yes, I hear you: education & healthcare shouldn't be "industries").


Well it kind of is unique to text books. Rarely does an entrenched product get such price privilege, except perhaps Apple devices. Usually if the price becomes too high, the product becomes un-entrenched people switch to something else. University students are uniquely locked in in a way that few other consumers are.


> if the price becomes too high, the product becomes un-entrenched people switch to something else

Eventually, they do.

You don't know that this isn't going to happen here. In fact, I'd argue that it already started (but yes, it's a slow process).


The 2nd Edition of the Dragon Book covers dynamic compilation (JIT) and garbage collection. GC gets about 40 pages. It’s still my favorite.


They have a new edition covering new subjects. Also they're explaining the big principles behind compilers. That's why the book is still relevant (I'm working at a compiler startup)


They are relevant. These are orthogonal issues. Typically there are more than one compiler courses in university, and you can go into more details if need to.


The second edition covers techniques for VMs like garbage collection.


Could someone who has taken a decent compiler course and been in the industry a few years comment on how / why this course would be beneficial?

There are a lot of things I'd like to look into these days, but only so much time. And I do realize that some courses I took in University a decade ago were far more valuable than others (e.g. learning to program the m68K was extremely useful just to understand how stacks, interrupts, etc. work. OTOH, I haven't used anything from in my programming languages course since then).


I can't comment on this course specifically, but the argument for studying compilers, even if you don't end up working in that field, is that learning how programs are translated gives a deepter understanding of how computers are programmed. From a practical view-point, it's useful to know what a compiler can and cannot do with your program.

Niklaus Wirth puts it like this:

"It is the essence of any academic education that not only knowledge, and, in the case of an engineering education, know-how is transmitted, but also understanding and insight. In particular, knowledge about system surfaces alone is insufficient in computer science; what is needed is an understanding of contents. Every academically educated computer scientist must know how a computer functions, and must understand the ways and methods in which programs are represented and interpreted. Compilers convert program texts into internal code. Hence they constitute the bridge between software and hardware.

Now, one may interject that knowledge about the method of translation is unnecessary for an understanding of the relationship between source program and object code, and even much less relevant is knowing how to actually construct a compiler. However, from my experience as a teacher, genuine understanding of a subject is best acquired from an in-depth involvement with both concepts and details. In this case, this involvement is nothing less than the construction of an actual compiler."

(From http://www.ethoberon.ethz.ch/WirthPubl/CBEAll.pdf)


I can't comment on this specific course, but I took a compilers course from Tom Cormen that completely changed how I thought about computing. After several years in industry, it's not so much the specific techniques of constructing a compilers that remain useful, but the mode of thinking I learned.

I had an AHA moment along the lines of "Wow, the compiler writer is basically God." I had taken courses in software development and hardware architecture, but after writing a compiler I saw first-hand how a program--a description of a solution to a problem in a human readable language--is converted to the same description in a hardware readable language. The meaning of the program must be preserved by the compiler, but otherwise it has the freedom to generate any one of potentially many outputs that preserve the meaning.

There is a beauty I realized in the relationship between the programmer, the programming language, the compiler, and the hardware architecture. They are all in cahoots, but the compiler is the cornerstone. That relationship doesn't directly affect my decisions as a professional software engineer, but it forms the backdrop upon which I think about the act of programming and computing in general.


It's been two decades since I took the origins of this course at UC Berkeley. Understanding the material from this course (and the Dragon Book in general) should change the way you think about many forms of data processing. I think this topic is absolutely foundational to a well-rounded CS education. It's important to understand a little about tool-building rather than just tool-use if you want to master your craft.

I haven't reviewed his new syllabus in depth, but even 20 years ago, Alex had reduced the focus on parsing compared to more traditional syllabi based on the Dragon Book. You were expected to do some homework and self-study to learn the basic parsing and lexing concepts. The reason for COOL was to get you into the other compiler-construction topics sooner to get some breadth of exposure to concepts like abstract syntax trees, semantic checking, type systems, intermediate languages, code-generation, and the rough idea of runtime systems and ABIs.

Taken together with an operating systems course, these are really two different examples of systems engineering concepts. You should be able to take away not just some vertical silo of knowledge only applicable to "building a compiler" or "writing an OS". You should be able to fuse these concepts into a richer understanding of how to decompose and engineer complex systems given a range of specialized and more general-purpose requirements.

Another facet of this can be found in courses focused more on database construction and storage, but this starts to veer out of the introductory space. Unfortunately, most introductory database courses focus too much on using a database and writing applications instead of on understanding how they work and what the engineering tradeoffs actually are.


It's beneficial if it's an itch that you want to scratch.

For me, compilers, operating systems, databases, networks, processors are foundations upon which everything else is built. I'd like to know a little bit about the foundation upon which I'm making a living.

Some folks just want to get things done. I think it's just a personality thing.


I haven't been on a decent compiler course, for the most, I taught myself. I find that I constantly invent mini languages. For instance, I do a lot with timeseries data, and do a lot with bringing in all kinds of external sources of data in many weird formats, so I have a simple data transfomation language that deals with 90% of the scenarios I encouter. Also have a various device simulation languages for describing various sensors and their interfaces, eg SDI12 with it's various timing constraints and sequences. As part of a broader embedded board based testing system the mini language is used to say how you want a sensor to be simulated and then you can execute tests to ensure things work


I took this course a few years back when it was on Coursera and had taken a similar course many years ago in university; I've been a developer of some kind for 20+ years. This course brought together a bunch of stuff I understood but hadn't internalized so didn't use in my day-to-day job.

If you've been programming for a few years and take this course, you're going to say "OHHH that explains <some weird thing that happened in your career>" a bunch.


Recently, a coworker of mine told me how she thought the compiler should catch when you don't give printf the right arguments. Another example is when a coworker said you couldn't define flags in a Go library but she didn't realized you would have to import the package from that library to force the code to run first.

In my mind, both of these examples showed a lack of understanding in how the compiler works. I would say learning about compilers should help with being able to make good assumptions of how things work. Whether it's worth the rare times it comes in handy is probably dependent on what else you could learn.

Note that I actually didn't take a compilers course myself and said co-workers did, so I can't say for sure even taking a course is sufficient (but I am currently reading a compilers book, though out of enjoyment and not for career skills). I should also say I believe my co-workers are significantly better developers/employees than I am.


> Recently, a coworker of mine told me how she thought the compiler should catch when you don't give printf the right arguments.

The fact that you think she's wrong, and that you're right, on this issue, suggests you don't understand how compilers work either. 99% of printf fails are catchable through static analysis, and should be caught that way.


But is the static analysis of input strings passed to a random library function really a core feature of a compiler, just because the compiler has a switch for such analysis? Yes, it's a feature of modern well-engineered tool, but I would argue it's strictly not part of the compiler.


> The fact that you think she's wrong, and that you're right, on this issue, suggests you don't understand how compilers work either.

I'm happy to be proven wrong and certainly know I have much more to learn but I suspect in this case I was not clear in my comment and consequently you made assumptions about what I think is right or not. Feel free to correct me =] Either way I appreciate the response

> 99% of printf fails are catchable through static analysis, and should be caught that way.

I never said that it wasn't possible! But that would require the compiler have special logic just for printf or in this case the fmt and log packages. Maybe you would think that's the right thing to do, and I can see that point of view, but I suspect you would at least understand why it is a controversial decision and thus not necessarily assume it's there.


GCC does catch problems with printf arguments vs the format string and it has mechanisms for defining functions that take the same format strings. I operate at a high enough warning level that I see this type of warning now and then.

TL;DR: the compiler DOES have that special logic.


Ah interesting to know that GCC does this! I figured there would be compilers which did this but I sadly didn't have any examples.

> the compiler DOES have that special logic.

Indeed it being a controversial decision does not imply that no compilers would do it; if anything it implies that there should be some compilers which make that choice.


printf is not built into the language so the compiler doesn't know about its semantics...


In a totally pendantically isolated world that's true. It's not an actual hindrance to implementing something like this, as long as it can be turned off for people not using the usual stdlib.


The other issue with printf and every other templating mechanism is that the format string or template itself might be computed rather than being a static literal. Also, the types of the formatted arguments may not be decided statically.


If you want an entertaining if opinionated defense for studying compilers, Steve Yegge's got ya covered: https://steve-yegge.blogspot.com/2007/06/rich-programmer-foo...


A few days ago I had to download this course (along with Stanford's course on Automata theory) from a torrent tracker because it was no longer available on Coursera...


They're both on Stanford's Lagunita, along with excellent courses on Computer Networks and Databases.


This happened to a linear algebra course I wanted to take. It was on Coursera and for some reason removed.



Any idea why they're no longer on Coursera?


Sadly, no


My compilers course was the single most difficult college course I've ever taken (it was an undergrad/graduate cross-listed course usually taken senior year, and at least at my school, UCCS in Colorado, was the cause of many a CS student not graduating on time -- putting a weed-out course that close to the end of the BSCS program wasn't cool, man).

More all-nighters during that class than any other. It occurs to me, I've spoken to several Junior devs over the last few years with CS degrees that never had to take a compilers course. That just isn't right.


It was optional in my degree, and I concur it was the single most difficult course I took. At least twice as much work as any other course, learned twice as much too.


It was an optional course for me, and also one of the hardest I took. But also one of the most enlightening, rewarding, and enjoyable of my degree.


Yes! This is what I’m exactly looking for! By the way, what other free compiler courses would you hackers recommend?


I have worked through the following, and found it quite good http://www.diku.dk/~torbenm/Basics/basics_lulu2.pdf

*Full disclosure that the following is mine https://troydaniels.github.io/diku/compiler/course/2017/01/0...


For those of you who took it - does it go into detail how to do advanced LALR languages, including handling semantics and types? Can I use the knowledge to make my own OOP + FP language right away (Scala/Kotlin-like), or do I need to study much much more than what is covered in this course? Thanks!


I didn't take this course. I have taken compiler classes.

Short answer, yes.

Longer answer, compilers of any sort are hard, because there's both a lot of special cases and also pressure to be algorithmically efficient.

Writing any kind of compiler is going to force you to do stuff with the parse tree. It's up to you to decide if

   float a = 1
is an error or not. doing things like constant value evaluation

   int a = 1 + 2 + 3
   =>
   int a = 6
will get you a feel for all the different ways you can transform the parse tree. It's going to feel frustrating, because it's not the language you want to work on. But building up the skills to shuffle ast nodes is really really helpful.

Making your own language is hard. if your goal is a working language, i'd suggest building an interpreter first, so you can really nail down the semantics. at the end of that you're going to have a call that looks like eval(ast, context). you can do a simple compiler by just reimplementing eval, rather than doing the work, spit out the code that would do the work.

also, if you have an adequate interpreter, you can you can write your compiler in your own language.


> Can I use the knowledge to make my own OOP + FP language right away

I didn't take this course, but from the syllabus posted above it looks like garbage collection is not covered at all. That's pretty much a requirement for FP languages, and it makes OOP languages easier too. So you'd have to look at other resources for that. A simple semispace collector should be relatively easy to write even if you don't want to get too deep into the matter.

For a very first version of your language, you can of course just allocate memory without ever freeing it. That's good enough to experiment with simple programs.

Another point that I think isn't covered in this course, and that is relevant for (many) functional languages, is pattern matching and the representation of algebraic datatypes.


I took the course several years ago and finished it. It's focused on the basic concepts of compiler implementation with the example language (COOL) being pretty traditionally object-oriented. It's absolutely not a how-to course on how to implement advanced language features.


I took a PL/Compilers course based on COOL with Wes Weimer at UVA. We built an interpreter and compiler for COOL using its manual (i.e. a document with a verbal/maths spec for the language) and virtually no skeleton code. It was the hardest class in the department and I consider it to be the most valuable class I took as an undergrad. Before that class I didn't really understand what happened when I 'hit go' on my programs. Weimer is at UMich now but still publishes course materials online: https://web.eecs.umich.edu/~weimerw/


More Stanford free online courses/MOOCs(over 150) here: https://www.class-central.com/university/stanford


Too bad Stanford doesn't care much about free material as MIT does.


Check out Stanford's Lagunita.


They have very few courses.


They have the Compilers course, though.


I did the Coursera version of this course a few years ago. I didn't read any book, but definitely watched parts of the videos more than once.

Their scaffolding code is only available for C++ and Java, but (at least when I took it) many of the tests could be downloaded and reverse-engineered, so I was still able to adequately test my Go implementation. See https://github.com/zellyn/gocool for my working but probably quite nasty code :-)


I also took the Coursera version, and I have a Python version of the tooling for those who are interested:

https://github.com/djc/cs143-python


I have wonderful memories coding the lexer, parser, runtime, and code generation support for OO aspects of COOL. Knowing how inheritance and polymorphism were implemented really does bring some clarity to programming.

Compilers was one of my favorite courses. Definitely humbling and enlightening. Would recommend :D


This is worth a look if you have not seen it. https://interpreterbook.com/




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: