On being a PhD student of Robert Harper (cambridge.org)
130 points by JoelMcCracken on Jan 27, 2022 | 65 comments



I’m just some guy who works as a software engineer (I don’t even have a PhD), but I’ve always been interested in type theory.

So, a few years back, I found the lectures Bob Harper gave at OPLSS on YouTube and watched them all.

It’s been a revelation for me!

It’s truly an experience, and I’ve been a straight up Bob Harper fan ever since. This is how all lectures should be given. The passion he has for these topics is completely evident!

Before, I was only vaguely interested in type theory. Since then, I’ve been really into it, and I’ve gotten deeper and deeper into the subject.

So nice to see this homage to him!


I watched most of the Bob Harper OPLSS videos of 2012 [1] recently.

It felt like I was learning something deep when I was watching, but when I looked back on it later, I couldn’t remember any tangible insights or applications :( A whole lot about equality and then finally how useful dependent types are.

Was wondering if anyone else had the same kind of experience?

1 - https://youtube.com/playlist?list=PL5FJyaC2WsVmiZHLVm7Z718vb...


I think I might be in a similar boat. I have an interest in type theory, but rarely get to connect day-to-day work with the fun bits.

Interestingly, I had never heard of Robert Harper before, so I will be looking at those lectures with great interest.

Instead, I have Benjamin C. Pierce's fantastic "Types and Programming Languages" book (still working my way through it bit by bit), and one of the most impactful talks of my life came from Derek Dreyer: "How to Give Talks that People Can Follow". I still use those 20 minutes of advice to this day.

I'm strangely excited to see both their names as authors on this article, and I believe that tells me a lot about what kind of passionate speaking I can expect from Robert Harper!


OPLSS is what education should be: everyone is there to teach and to learn, because it's voluntary and you don't get anything else out of it, such as a piece of parchment. What we call education is a gross perversion of that.


Similar experience—I came across his HoTT lectures on YouTube and was absolutely blown away by how clear and concise his teaching was. Absolutely transformed my understanding of the subject in just the first lecture.


I don't have a degree but attended OPLSS last year and it was really fun! Worth attending :)


Can't speak to graduate student experiences, but I did take Dr. Harper's principles of programming languages class as an undergrad.

The course was ambitious and moved at a breakneck pace, and Dr. Harper could be a whirlwind in lecture. As some of the anecdotes mention, he's pretty opinionated. He'd often go on intense (albeit well-informed) tangential rants, particularly about Python.

Personally, I would have preferred a less comprehensive course with more time to let some of the abstract ideas settle, but I can imagine his attitude and approach being well-suited to energetic graduate advising.

For those curious about the material, we covered a significant portion of his Practical Foundations for Programming Languages book: https://www.cs.cmu.edu/~rwh/pfpl/2nded.pdf


If you are interested in Harper's opinions of dynamically typed languages, I recommend his blog post "Dynamic Languages are Static Languages" [1]. It was also discussed on HN [2], in the context of a blog post by Tratt defending dynamically typed languages.

[1] https://existentialtype.wordpress.com/2011/03/19/dynamic-lan...

[2] https://news.ycombinator.com/item?id=8206124


Oh...that guy?

As far as I can tell, literally everything he writes in that post is factually wrong. Not "wrong" as in "I disagree", but "wrong" as in "provably incorrect".

So the previous posts made me very curious about the talks, but if that's the level of care and insight...no thanks.


> He'd often go on intense (albeit well-informed) tangential rants, particularly about Python.

As an academic, it must be soul destroying to see the world dominated by ad-hoc efforts like Python. Just like Linux, it was one person's hobby project that repeated old mistakes and now threatens the existence of arguably better technology.


Maybe, but not necessarily. It's possible they enjoy their little academic world, where they can forget about the average crap and pursue pure abstractions, jump from conf to conf, and patiently enjoy the 30-year lag between discovery and mainstream application.


You don't have to be an academic for that. The world is full of imperfections. Marketing and luck are bigger factors than merit.


As someone who did it for 10 years... Python is fine, PHP and arguably Go are the annoying ones :)

PL is like art theory & art criticism... You think more about individual aspects, and as long as something's happening, great. Ex: concurrency is super hard, and where Python got to with async/await is surprisingly principled for a dyn lang. In contrast, Go's initial interfaces mess was almost an intentional snubbing that unsurprisingly had to get revisited.

Weirder is that big ideas take literal decades to come out. Ex: mypy is types from 50 years ago, while stuff like pandas suggests we still need basics like dependent/row types for typing any data science code, which is not a new idea either. But with OSS, whether those happen is now more about academic hubris / the ivory tower than an indictment of Pythonistas.

It's a pretty fun time in PL for folks who do care: easy to start free frameworks with global reach and not worry about funding/sustainability, so a lot of playing in areas like synthesis and data.


I'm not sure I understand you here. How is Python more ad-hoc than any other language? It strikes me as less ad hoc than, say, Bash. At least Python has a proper standard library.

What do you mean by Python's mistakes? I'm no Python zealot but I'd say the Python language broadly achieves its aims. It's not a platform for cleverness with type theory or monads, and it's not a sensible choice for proving the correctness of programs. Those aren't its aims.


Even accepting it as a dynamic language that is predominantly imperative, it still has many surprising characteristics. The scoping is odd, binding and assignment are conflated, there are no built-in records, lambdas are limited, mutability and null/None are pervasive, etc.
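
To give one concrete example of the binding/assignment conflation (a minimal sketch, with made-up names):

    def make_counter():
        count = 0
        def increment():
            # The assignment makes `count` local to increment(), shadowing the
            # outer binding, so reading it here raises UnboundLocalError.
            count = count + 1
            return count
        return increment

    # make_counter()()  # UnboundLocalError, unless `nonlocal count` is declared

There's no separate syntax for introducing a new binding versus assigning to an existing one, which is exactly the conflation I mean.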

I accept that it's a huge improvement over many mainstream languages (certainly bash!), but I find e.g. OCaml/F#/ML code far easier to reason about. These languages are also hybrid imperative/functional, so there is no need to use monads; they can be used much like Python.


Nothing was threatened by Linux, as better technology didn't really exist. It was past the Symbolics and Xerox Alto times.


Commercial Unix like Solaris and QNX were far better engineered than Linux ever was. Even Minix, which was used to bootstrap Linux, had a microkernel design.


> Commercial Unix like Solaris and QNX were far better engineered than Linux ever was.

As with many other fields of endeavor, it is very hard to compete against free. Throw in the ability to fix any problems that might affect you personally, and you have a powerful combination.

The same goes for Apache as well, and to a lesser extent MySQL and Perl/Python/PHP.

Trying to set up an equivalent commercial stack was both painful and expensive (and not just the software licensing, but the minimum hardware requirements too). Not to mention that once you're considering commercial options, Microsoft was a contender, often with better initial ROI (the cost to exit Microsoft platforms was quite another matter).


This also glosses over that commercial Unices weren't superior technology. There was more polish (and a ton of own proprietary ugliness too) but you could pretty much build and run the same application stuff on Linux.


Solaris had technology like ZFS and DTrace (granted, these now integrate with Linux). QNX had a microkernel and real-time support. Solaris would have made a better server OS and QNX a better mobile phone OS.

However Linux is free, and we have all benefited enormously from that, myself included and the tech giants.


Solaris most assuredly had neither ZFS nor DTrace for the first decade of Linux's existence.

Kernel architecture (be it micro or monolithic) per se is no virtue.


You are trying to move the goalposts. Both Solaris and QNX are still examples of "superior technology" to Linux, despite both being dead for a long time. Kernel architecture is also highly relevant; ever wonder why Android phones only get a few years of updates?


You were literally wrong given the context of the conversation. It is immaterial which features Solaris got 13 years down the road in its competition with Linux (and Linux wasn't static either).

> Kernel architecture is also highly relevant; ever wonder why Android phones only get a few years of updates?

I figure in your mind it has something to do with microkernel architecture rather than vendors not giving a flying fart, so I'll bite. Can you explain the mechanism by which updating a monolithic kernel is any harder than updating a bunch of services?


> You were literally wrong given the context of the conversation. It is immaterial which features Solaris got 13 years down the road in its competition with Linux.

No, you were wrong with the statement "This also glosses over that commercial Unices weren't superior technology.". The context was Linux killing off the (technically superior) competition, a process that took decades.

There are many other technologies that Solaris had years before Linux, e.g. decent SMP, Zones (containers), SMF (service management) and many other features that Linux still does not have. QNX had real-time features and a microkernel.

> I figure in your mind it has something to do with microkernel architecture rather than vendors not giving a flying fart, so I'll bite.

The situation is more complex than "vendors not giving a flying fart". Linux does not have a stable module ABI; in fact, it doesn't even guarantee a stable API. This is all part of "kernel architecture" and highly relevant. Regarding microkernels, security concerns will eventually force this design on future operating systems (e.g. Google's Fuchsia).


Only someone who never used Minix would seriously argue it was better in any respect.


Minix was so good that Intel lifted the codebase (within the terms of the FOSS license, without telling the author) for their management backdoo- uhh engine.


We can only speculate, but it's very unlikely this had anything to do with how good Minix is.


> As an academic, it must be soul destroying to see the world dominated by ad-hoc efforts like Python.

I dunno that I see it that way. (I am a PhD student in programming languages, likely going to end up in academia.)

In many ways, yes, Python is "bad". It lacks an expressive static type system, it has some bizarre rules about scope, and there are other idiosyncrasies that, were I to build my own language from scratch, I would seek to avoid.

But in many other ways, Python is very good. The standard library is phenomenally complete, featuring functionality for the vast majority of common use cases. I also find the documentation to be pretty approachable, though that sentiment doesn't seem to be widely shared.

As for the language proper, the lack of a static type system does irk me somewhat, but I use mypy to get back some of those guarantees, so it's not as bad as it could be. I find the syntax mostly pleasant, and the module system is usually intuitive.

Some of the things they've added are very nice features: I like that multiple consecutive string literals are treated as a single string (useful for splitting strings across lines); I like f-string formatting, too; the use of underscores as numerical separators is great (though not unique to Python); and the overall design of the language, while truly odd at first, is relatively consistent and powerful (e.g., everything is an object, iterators/iterables are prevalent, classes can be modified dynamically [which is maybe not always a good thing, but it's a neat tool for the power user], etc.).
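
For instance, the small conveniences I mentioned look like this (nothing beyond the standard language; the names are just for illustration):

    # Adjacent string literals are concatenated, which is handy for splitting
    # long strings across lines without an explicit "+".
    message = ("a fairly long message that would be awkward "
               "to keep on a single line")

    rate = 1_000_000                       # underscores as numeric separators
    print(f"{rate:,} events per second")   # f-strings: "1,000,000 events per second"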

Again, it's not a language I would build myself, but as far as languages that exist, it's not even close to my least favorite.

I think it's also worth considering the space of programming language design in linguistic terms. Natural (human) languages are amazing because they constantly evolve through regular use. New terms are inducted into lexicons constantly, and new grammatical rules can be generated and regularized among a group to give the users their own characterization separate from other speakers of the same language. Natural language is beautiful.

If we look at individual programming languages as separate languages, they do not exhibit the same properties. They are pretty much static, except when a new major version is released.

But if we look at programming languages rather as dialects of a single common language (easy to do when we consider "Turing-complete languages" together, for example), then we see each language as a new manifestation of the unique ideas of a group of people, similar to a dialect in natural language. There are new idioms and expected constructions. The capability of expression remains constant, more or less, but the minor connotative differences are many and varied.

All this is to say: Python is not "soul destroying". It's just another way of looking at the same system. One from which we can learn things when we seek to design new programming languages, certainly, but I don't think its idiosyncrasies inherently make it as bad as many portray it.


What didn't he like about Python?

There are things about Python which are ugly, but the language is nice for being flexible, lacking surprises, having an ethos favouring readable code, and having a comprehensive set of libraries that are easy to use.

Python's famed readability takes a dip when you use libraries like Pandas or Tensorflow -- which awkwardly glue a whole new language onto Python -- but standard library Python is very common-sense and readable.

(Also, Python's OOP is awkward with object.f() meaning something different from f(object), when really they ought to mean the same thing. In Pandas, you sometimes have a function with the same name as a method, and the two behave subtly differently from each other.)

Academic functional languages don't have most of those, and academia has no right to complain given that they've done little to rectify this.


Bob Harper is known (among my circles, anyway) as taking pretty hard stances in favor of first-principles approaches, and he's also one of the foremost researchers in advanced type theory.

Python is about as far away from his interests as you can possibly get, I'd imagine.

> Academic functional languages don't have most of those, and academia has no right to complain given that they've done little to rectify this.

On this note, I'll have to (hopefully politely) ask you to learn a bit more about what you're going to talk about before offering such opinions.

Most (maybe all?) modern languages were developed primarily as direct results of academic involvement. And in those areas in which the influence is not direct, advances are usually still due to the efforts of those in PL academia. I don't say this to be obnoxious, but you can pretty well trace the developments through conference proceedings over the past few decades.

As for what we refer to as academic languages: they are not meant to be useful to the general programmer. They don't need powerful standard libraries or flexibility. They're meant for exploration and experimentation. Haskell (for example) was never supposed to catch on in the mainstream. It was designed to answer questions like "What does programming in a lazy language look like?" and "Is such a language useful?" and "In what ways is it not useful?" This is detailed in the phenomenal paper "A History of Haskell: Being Lazy with Class" from I think HOPL 3.


> Python ... the language is nice for ... lacking surprises

Really?! The number one thing I dislike about Python is that it is chalk full of surprises, especially if you know more regular languages like MLs (e.g., SML, F#) or Schemes (e.g., Scheme itself, Racket). The scoping rules alone are a minefield.
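
One concrete example of the kind of surprise I mean (a minimal sketch):

    # The lambdas close over the variable `i`, not its value at each iteration,
    # so every callback sees the final value.
    callbacks = [lambda: i for i in range(3)]
    print([f() for f in callbacks])    # [2, 2, 2], not [0, 1, 2]

    # The usual workaround binds the current value via a default argument.
    callbacks = [lambda i=i: i for i in range(3)]
    print([f() for f in callbacks])    # [0, 1, 2]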

If one knows F#, for example, going to Python is a step back in basically every way unless you need a package that is only in Python (somewhat rare).

> Academic functional languages don't have most of those, and academia has no right to complain given that they've done little to rectify this.

It's an interesting stance, given Guido van Rossum seemingly did everything he could to avoid what came out of academia.


You've made a lot of claims. I don't use F# or MLs, so why, for you, are they a step forward?


For F# vs Python, I cannot think of anything that Python does better than F#, and F# has several features that Python doesn't. Things that come to mind in F# are unhampered lambdas, clear and consistent scoping rules, cleanly separating binding and assignment, built-in pattern matching inherent to the language, discriminated unions, built-in records, default built-in immutability, actual concurrency, piping, embracing imperative and OOP and functional paradigms, performance, modules, type inference, units of measure, etc.

They share many things like whitespace sensitivity, notebook implementations, relatively clean syntax, but even the things they share, I'd give F# the edge on all of them.


Isn't this like comparing apples and oranges, both designed & built decades apart (Python, late '80s, at CWI in the Netherlands; F#, 2005, at Microsoft Research in Cambridge)?

F# seems to be a language with more of the third-party add-ons seen with other languages built into the standard off-the-shelf offering. My biggest takeaway when I looked at F# was that it seemed better suited to DB work than C#.


I think the major thing is that F# did not ignore the things that came before it, whereas Python did. While Python did come about in the early 90s, it had already ignored things like Lisp, Scheme, Smalltalk, Erlang, and Standard ML (SML). As it evolved into the 2000s and 2010s, it continued to ignore developments. Compare that to Clojure, Elixir, and F#, and there's a stark difference. All three of those languages were designed to be highly pragmatic and not academic, but all three stood on the shoulders of what came before them. If you read interviews with van Rossum, he shows a complete unwillingness to accept the functional programming paradigm as pragmatically useful and basically rejects it wholesale. In his Slashdot interview, he explicitly said "functools (which is a dumping ground for stuff I don't really care about :-)".

https://developers.slashdot.org/story/13/08/25/2115204/inter...

It's frustrating to me, because if Python had adopted at least some of the things that languages like SML, then OCaml, and then F# were doing, it would have dramatically elevated it. At the time of the interview, in which he goes on to state "Admittedly I don't know much about the field besides Haskell", the year was 2013, and so at that point F#, Clojure, and OCaml had already been around for a while, Erlang had been released with a permissive license, and Elixir had actually been released the year before. Further, it's not like Haskell was the only functional language even before that. SML, Scheme, and even Racket (then known as PLT Scheme) had been around for a while.

Python is a wolf in sheep's clothing to me. It has a deceptively simple look at the surface, but it is actually a quite complicated language when you get into it. I've done concurrent programming my entire programming life in F#, Elixir, and LabVIEW. I simply do not understand how to do concurrency in Python in a reliable and composable way (for one, there are several different implementations of "concurrent" programming in Python, with various impedance mismatches and incompatibilities). I.e., it's complicated.


There wasn't much before Python either. I know some people who think OS/2 was better at multitasking than Windows, but the popular market didn't seem to think so.

> (for one, there are several different implementations of "concurrent" programming in Python, with various impedance mismatches and incompatibilities). I.e., it's complicated.

Is this because Python doesn't offer any co-operative threading model and just exposes programmers to preemptive threading and all the different synchronisation facilities an OS provides? https://docs.microsoft.com/en-us/windows/win32/sync/synchron...

I don't have a problem with any of the sync objects, because they each have strengths and weaknesses relevant to whatever the code is doing. Speed of sync objects is one factor, but there are others, like handling of shared data between two or more apps.

So are you saying F# removes the ability to choose sync objects?

Half the problem, I find, is just keeping up with the different terminology used in various parts of the world.


huh. Forgot to mention tail recursion.


Isn't this a CPU optimisation thing, i.e. code & languages working better for some instruction sets than others?

One of the languages I've used in the past always performed best on certain AMD CPUs, but that was back when we could choose which CPU the compiler was to optimise for, i.e. 286, 386, etc.

I'm also aware of backroom shenanigans going on with some languages and tools working better on some CPUs.

This 12 year old post goes into a lot of detail https://stackoverflow.com/questions/1414911/intel-x86-assemb...


Tail call optimization has nothing to do with optimizing for different CPUs; it's about dropping a function's stack frame when its return value is just the result of another call, so the frame isn't needed anymore.


Is this the difference in calling conventions, i.e. callee or caller stack clean-up? https://en.wikipedia.org/wiki/X86_calling_conventions#List_o...

This is half the problem: different terminology in use between different age groups, just like slang differs between age groups.


I'm not sure, so here are a couple examples.

    def f():
        return g()
In a language implementation that doesn't optimize tail calls, the stack would look like the following after the call to g:

    g
    f
    main
In a language implementation that does optimize tail calls, the stack would look like this, because the result of f is whatever the result of g is, so f's frame is no longer needed:

    g
    main
If a language implementation doesn't optimize recursive tail calls, the following code will quickly overflow the stack and the program will crash:

    def loop():
        # do something...
        return loop()
In a language implementation that does optimize recursive tail calls, this code can run forever because loop's stack frame gets replaced with the stack frame of the new call to loop.

The reason people want recursive tail calls optimized out is at a much higher level than anything to do with the actual CPU instructions being used; they just want a way to write recursive functions without worrying about the stack overflowing.
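
You can actually watch this happen in CPython, which doesn't optimize tail calls; a small sketch using f and g as above:

    import inspect

    def f():
        return g()      # a tail call: f's frame is no longer needed...

    def g():
        # ...but CPython keeps it anyway, so f is still on the stack under g.
        return [fi.function for fi in inspect.stack()]

    print(f())          # ['g', 'f', '<module>']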

What's my age group, by the way?


I have never seen this referred to as anything other than tail recursion, tail-call optimization, etc.

Languages like Python make implementing simple loops like:

    def loop():
        # <whatever>
        loop()
impossible.

Python will reach a maximum recursion depth and error.

Why is this important? Like I said, it makes looping very easy. For example, actors can be almost trivially implemented in languages with tail-call recursion.

It’s not in Python because like most things in Python, van Rossum doesn’t like it because <reasons>.

https://stackoverflow.com/questions/13591970/does-python-opt...

There's little point in keeping full traces when the data going in and out of the tail-call loop is immutable, so you only really care about the current call of the function.
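
If you really want unbounded "tail recursion" in Python today, the usual workaround is a hand-rolled trampoline (an illustrative sketch, not anything built in):

    def trampoline(fn, *args):
        result = fn(*args)
        while callable(result):     # a returned thunk means "call me next"
            result = result()
        return result

    def loop(n):
        if n == 0:
            return "done"
        return lambda: loop(n - 1)  # return a thunk instead of recursing directly

    print(trampoline(loop, 1_000_000))   # "done", no RecursionError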


<goes away to find out if it's in my preferred language & tool>

It's not something I've ever heard of before. I guess it's peculiar to Python, but don't most languages have some eccentricity?


It's absolutely not peculiar to Python, it's something that every single language implementation has to make a decision on one way or the other.

Here's an SO answer from 2008 about how to enable TCO in various C and C++ compilers: https://stackoverflow.com/questions/34125/which-if-any-c-com...

There are many things both you and I have never heard of before. That's normal.


Yes, the different terminology is a reflection of so many new entrants going for what is easy in the short term, instead of learning the theory of their industry and thus learning better approaches that are not 'immediately' obvious.

This lack of learning theory in our industry, in favor of whatever is 'easy to get started with', explains the popularity of Python and JavaScript, and at the same time why both are littered with problems that have already been solved, and why the field of knowledge gets cluttered with reinvented terminology from people who never learned the original terms.


Yea! And that's a big one.


>unless you need a package that is only in Python (somewhat rare).

That's basically most of scientific/numerical computing.


And yet, people use Python for much more: things that never touch NumPy, Pandas, or machine learning libraries.

Also, Python isn't the only option for scientific computing. Although I'm a bit frustrated Microsoft didn't see F# as a Python competitor early on and pour resources into it like they did with Visual Studio Code or TypeScript.


R you forgetting another language which is useful for scientific/numerical computing?

Would Matlab count?


( s/chalk full/chock-full/ )


it's chalk full because academic languages use up all the chalk.


I did wonder for a moment whether it was some pun or joke that was going over my head. :D


Lol. Whoops. Thanks!


> Python's OOP is awkward with object.f() meaning something different from f(object), when really they ought to mean the same thing.

Why should they? Coming from other OO languages, I would expect `object.f()` to involve a vtable lookup, or something similar that effectively has `object` carrying around its own particular implementation of `f` (which might have been closed over some hidden internal state as well), and I'd expect `f(object)` to not do anything of the sort.
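
Concretely (a minimal sketch, with made-up names):

    class Animal:
        def speak(self):
            return "..."

    class Dog(Animal):
        def speak(self):        # obj.speak() dispatches on the object's class
            return "woof"

    def speak(obj):             # a free function: the same code runs for every argument
        return "something generic"

    d = Dog()
    print(d.speak())    # "woof" -- method lookup finds Dog.speak
    print(speak(d))     # "something generic" -- an ordinary call, no dispatch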

> In Pandas, you sometimes have a function with the same name as a method, and the two behave subtly differently from each other.

This sounds more like a poorly designed library than any language-level awkwardness.


> Python's OOP is awkward with object.f() meaning something different from f(object), when really they ought to mean the same thing.

>> Why should they? Coming from other OO languages, I would expect `object.f()` to involve a vtable lookup

Low level details like that should not concern me. In which case, maybe `f(object)` should do a vtable lookup then; why not? The syntax `object.f()` and `f(object)` should be interchangeable in all situations.
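
For what it's worth, Python's standard library already lets a free function dispatch on its argument's type via functools.singledispatch, so there's nothing inherently impossible about what you're describing (a small sketch):

    from functools import singledispatch

    @singledispatch
    def describe(obj):          # base implementation; dispatches on the type of the first argument
        return "something"

    @describe.register
    def _(obj: int):
        return "an int"

    @describe.register
    def _(obj: str):
        return "a string"

    print(describe(3))      # "an int"
    print(describe("hi"))   # "a string"
    print(describe(3.14))   # "something" (falls back to the base implementation)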


> This sounds more like a poorly designed library than any language-level awkwardness.

Languages ought to disallow these things. E.g. I'm constantly baffled that many languages won't let you use symbols as normal method names, but are absolutely fine with you having two methods whose names differ only in case. (There's even academic research showing that case sensitivity is the biggest pitfall for beginners learning Python).


HoTT is too big for my tiny brain, but Harper's book Practical Foundations for Programming Languages (PFPL) was hugely informative and is reasonably accessible to programmers who like some theory and consider themselves language geeks. The full text of the first edition was online. Now there's a second edition that is mostly online with a few parts missing, but it still looks good:

https://www.cs.cmu.edu/~rwh/pfpl/


Dr Harper has had a huge impact. Reading these experiences is very cool.


I was an undergrad at CMU and Harper broke the curriculum. The Computer Science undergrad curriculum was that one should take Programming Languages (15-212) before taking the course on Compilers (15-312). But, for a couple years they offered 212 in an alternate Java version instead of the usual ML. I wanted to take it in a _practical_ language instead of a horrible useless language, so I took the 212-Java version. Harper refused to accept that as a prereq when he was teaching 312 Compilers. Yes, I'm still bitter (20+ years later), and I still think ML is an annoying useless academic language no undergrad should be subjected to.


I learnt ML as an undergrad. I felt it was an excellent introduction to the key concepts of functional programming (ADTs, partial application, type inference, lazy data structures, pure functions/lack of mutability etc).

Would I use it in production code? Indeed, have I even touched it since undergrad? No.

Have I found the concepts I learnt via it highly valuable? Yes

Writing compilers maps pretty well to ML, especially for simple/toy compilers you might write as an undergrad, so I can see why he might insist upon it, allowing him to start with the concepts already learnt as a base.


Great read! Your casual reminder that there's a group of SML-ers who hang out on Reddit [1].

[0] https://www.cambridge.org/core/journals/journal-of-functiona...

[1] http://reddit.com/r/sml


I should give the language another chance. Maybe some day I'll teach people functional programming through SML. I think the fingers on both hands are enough to count the people in my country who have heard of SML.


Quite unpopular in the Haskell community as I recall.


He doesn’t like Haskell for various reasons. But I think Haskell devs respect his work!


He hasn't blogged in a while, so it's fine. MLers and Haskellers both think Agda is very beautiful, and so I think this defuses things a bit.

Also his former advisee Jon Sterling is doing domain theory stuff now, so hah!



