MLWorks: SML compiler from Harlequin returning from the dead as open source (ravenbrook.com)
89 points by BruceM on May 2, 2013 | 29 comments



A tutorial from CMU showing some of the capabilities of the system: http://www.cs.cmu.edu/~fp/courses/97-212/mlworks-intro.html


It's been a while since I've seen a '/afs/...' path.


I see AFS every day. When we moved into our current house, my daughter, whose initials are AFS, got hold of a Sharpie and proceeded to initial every surface in the house she could get to until my wife found her and curtailed her exuberant tagging. I still (12 years later) find “AFS” scrawled in places I didn't know she had so tagged. And we haven't had the heart to remove them all anyway.


Well, back in 2003 it was still widely used at CERN, not sure about nowadays.


I used it a lot at Chicago in the early '90s, then never again. I always liked the idea, but Jesus Haploid Christ was it ever a PITA to set up (I'm looking at YOU, Kerberos.)


My high school used AFS and Kerberos at least up to 2007 (when I graduated, and my first-hand knowledge ends). Agreed on the pain.


It's still used at CMU, at least.


If anywhere, it should be there, since it's named after those two Andrews. MIT was big into it back in the day, not sure about now.


It still is!


Related: the latest (1.7.x) OpenAFS for Windows is a native Windows file system.

http://www.openafs.org/windows.html

That's pretty cool, I think. I don't use it now, but I'm tempted to use it to share between Windows and Linux instead of, say, Samba.


All hail Frank Pfenning!


Indeed.


This is great, I've always liked SML. It has a lot of what I like about Haskell without breaking my brain, and it seems like it could be a good option for a pragmatic systems development language when used with MLton. I even toyed with the idea of developing an IDE plugin for it but was put off by the difficulty of parsing it.


Good luck, guys! MLWorks had quite a reputation at the beginning of the century, and by then both ML & OCaml had already been acclaimed as industrial-strength functional languages.


Interesting, I only knew about Dylan!

Congratulations to all involved in getting it open sourced.


LispWorks also originated at Harlequin: http://www.lispworks.com/products/ide.html


Indeed, LispWorks was Harlequin's first actual product; it paid for itself over many years and still does today in the hands of several of the original developers, who acquired it after Harlequin went bust.

The next product was ScriptWorks (a PostScript RIP), which became the company's cash cow and lives on as the Harlequin RIP from Global Graphics http://harlequinrip.com/ . That paid for the boss's pet projects, including MLWorks (which became a product in the mid 90s) and DylanWorks (which never quite did at Harlequin, but was carried on by ex-Harlequin people who set up Functional Objects, and survives today as OpenDylan).

The other main thing to come out of Harlequin is Xanalys, who took on the various Lisp-based analytical projects (and ended up holding the MLWorks sources). This is the second time that we have acquired our old projects from the heirs of Harlequin and open-sourced them: the first was the Memory Pool System http://www.ravenbrook.com/project/mps


$deity bless these fine folks for arranging this act of resurrection.


Why thank you. Although we've opened the coffin, we've yet to reanimate the corpse. Anyone who'd like to help out, please take a look at https://github.com/Ravenbrook/mlworks/wiki/Roadmap


I see a benchmarking suite. How does the codegen compare to MLton, SML/NJ, dare I say OCaml?


Let's get it working again and see. On the original target platforms (SPARC/Solaris, and then MIPS/Irix) it was quicker than that era's SML/NJ on our favourite benchmark (recompiling the whole of MLWorks itself), by about 25% I think. When we first did the x86 port the performance was not so good (the lack of usable registers was a problem for a design that had originally been focussed on RISC architectures), but I think that had improved by the late 90s (I stopped work on the project by about 1996). O'CaML didn't really exist then.


It will be a challenge to beat SML/NJ if the implementation has not progressed much since then, much less MLton. Since the 90s, SML/NJ has gotten a new backend and garbage collector, and has gone through a significant amount of performance tuning for the massive changes that happened to machines (e.g., more than 4MB of RAM, multi-level caches, etc.). Further, MLton is still 2-10x faster than SML/NJ, especially on programs that make heavy use of mutation (ref cells and arrays).
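(As an illustrative sketch only, not taken from any benchmark suite: the kind of mutation-heavy SML meant here, using a ref cell and an array.)

    (* Illustrative only: a loop over an int array accumulating into a ref cell.
       Whole-program, monomorphizing compilers such as MLton tend to do
       particularly well on code in this style. *)
    fun sumSquares (n : int) : int =
        let
            val a = Array.tabulate (n, fn i => i * i)   (* int array *)
            val total = ref 0                           (* mutable accumulator *)
        in
            Array.app (fn x => total := !total + x) a;
            !total
        end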

That said, I think it's still awesome to resurrect this project for folks to play around with on modern hardware! It appears to have some of the full development-environment experience that none of the rest of us in the ML community have put into our products.


The implementation has not progressed at all since 1999, when Harlequin folded. It doesn't do defunctorization, so probably can't compete with MLton on performance (but it does have an IDE including a REPL, unlike MLton). It's much closer to the Definition than SML/NJ (of course): once we have the compiler running again I'll be interested to use it on the MLton corner cases. And certainly the SML/NJ garbage collector was occasionally a subject of light-hearted mockery among MLWorkers. But as I say, I don't expect x86 performance to be all that great.
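(For anyone unfamiliar with the term, a minimal sketch of the kind of code defunctorization specialises away; the names are invented for illustration. The idea is that each functor application is expanded at compile time, so the call through O.compare below can become a direct, inlinable call to Int.compare.)

    signature ORD =
    sig
        type t
        val compare : t * t -> order
    end

    functor MaxOf (O : ORD) =
    struct
        fun max (x : O.t, y : O.t) =
            case O.compare (x, y) of LESS => y | _ => x
    end

    (* After defunctorization this application is flattened away and
       IntMax.max works directly on ints. *)
    structure IntMax = MaxOf (struct type t = int  val compare = Int.compare end)
    val biggest = IntMax.max (3, 7)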


Certainly! SML/NJ was a research platform first and a high-performance implementation second.

Defunctorization helps, but from talking with Matthew Fluet, most of the perf comes from the combination of monomorphization and whole-program compilation. You get to avoid all the hackery involved with trying to mix essential inlining (e.g., map, foldl) with separate compilation and the somewhat unpredictable performance that results when a user accidentally writes their own map function but puts it in a separate source file without magical incantations for the inliner.
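(A hedged sketch of the scenario described; the file names are invented. Under whole-program compilation the call below can be monomorphized and inlined like the Basis map; under separate compilation it stays a call to a polymorphic, closure-taking function unless the inliner is told otherwise.)

    (* mylist.sml : a user's own map, in its own source file *)
    structure MyList =
    struct
        fun map f [] = []
          | map f (x :: xs) = f x :: map f xs
    end

    (* main.sml : the call site *)
    val doubled = MyList.map (fn x => x * 2) [1, 2, 3]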

Also, we're working to get the rights for the Definition back from MIT Press so we can both push out a free PDF version and fix the bugs. There are several corner cases that Harper, Tofte, and MacQueen consider mistakes in the '97 version of the Definition, but they haven't really taken the time to push out an updated Definition after the ML2000 effort went nowhere. Follow up with me offline (contact info in my profile) if you're morbidly curious.


I'm sure we'll be in touch. I suggest you join the mailing list, then you'll be able to track our progress (and you'll be the first to know once we have a working compiler).

Do give my regards to John Reppy, who I haven't seen for about 15 years. As I recall, he was the author of the "new" GC in SML/NJ (in the mid 90s), which was a distinct improvement on the original semi-space collector.


Will do on both counts! John is my advisor :-)


It would be excellent to have the Definition available online. MLWorks originated with a contract to develop a very strict implementation of the Definition, and we were quite proud of being truly Standard ML.


If they don't have a bootstrap compiler yet, I would guess they're not running benchmarks...


BSD licensed



