Architecture of Lisp Machines (2008) [pdf]

gumby · on July 2, 2021

The lisp implementations described here had a small number expensive runtime costs (otherwise lisp can, if you wish, be complied into very fast code).

One was the cost of the gc memory barrier (cleverly managed for commodity hardware by using the mmu and toggling the write bits, I think thought up by Solvobarro). I think a slightly more sophisticated trick could be done with some extra TLB hardware to generalize this for generational collectors for any gc language, say Java. Another smart trick would be to skip transporting unless fragmentation got too bad. In a modern memory model compaction just isn’t what it used to be.

A second one is runtime type analysis. With the RISC-V spec supporting tagged memory this could be speeded up tremendously for Lisp, Python et al. Is anyone dabbing chips with that option?

The nice thing today is that a lot of languages are revisiting ideas originally shaken out by lisp, so speeding those languages up can speed up Lisp implementations too.

Ps: wish this article has mentioned the KA-10 (first PDP-10) which was really the first machine designed with Lisp in mind and with an assembly language that directly implemented a number of lisp primitives.

pjmlp · on July 3, 2021

Tagged memory seems to be the only option left to mitigate C's memory corruption issues.

While Intel borked their MPX implementation, it has been successfully adopted on Solaris SPARC, and has been making their way across ARM implementations (v8+) and Apple's. Microsoft's Phonon might have something similar, but very few details are public available.

ekez · on July 2, 2021

What are the advantages of tagged memory as opposed to using the unused bits in a regular pointer as a tag?

retrac · on July 2, 2021

Usually, tagged memory at a hardware level, goes along with support for tagged operations in the processor.

In the LISP machines for example, you had an add instruction. Which would happily work correctly on pointers, floats, and integers depending on the data type, at the machine code level. Offers safety and also makes the the compiler simpler.

But where this really shines is in things like, well, lists, since the tags can distinguish atoms from pairs and values like nil, fairly complex list-walking operations can be done in hardware, and pretty quickly too. It also makes hardware implementation of garbage collection possible.

This is just my intuition, but I suspect, these days, it all works out to about the same thing in the end. You use some of the cache for code that implements pointer tagging, or you can sacrifice some die area away from cache for hardwired logic doing the same thing. It probably is in the same ballpark of complexity and speed.

gumby · on July 2, 2021

In addition to what retrac wrote, these tag bits would apply to immediates as well, not just pointers.

fulafel · on July 3, 2021

Speed, like is usual with hw vs software implementations of things. But the difference is these days less than it used to be, because processors spend most of their time stalled on all kinds of things and the ALU processing capacity is underused most of the time.

An interesting question for a modern ISA design would be to figure out how to make tagged words and memory work well with SIMD.

gumby · on July 3, 2021

> An interesting question for a modern ISA design would be to figure out how to make tagged words and memory work well with SIMD.

Not sure what the issue might be. Let’s say you’re doing a multiply-add: you’d call the one for the data type you want and if any operand were of the wrong type you’d get a fault. Am I missing something?

fulafel · on July 3, 2021

That sounds like a lot of mode bits or instruction variants to me. But maybe it wouldn't be a problem.

gumby · on July 3, 2021

Consider that your SIMD instruction might itself take a tag mask and the ALU need only do equality on the tag field. In fact it could do that in parallel with the ALU op; on mismatch you could simply discard the current state and abort the operation. However realistically you’d want the same set of SIMD instructions as an I tagged architecture anyway.

Also I expect any compiler would assume that the contents of an array subject to SIMD computation would be homogeneous anyway, perhaps trying to enforce it elsewhere.

In any case this doesn’t seem like a big deal to me…but I could be wrong!

fulafel · on July 4, 2021

Yep, sounds like a good sketch.

I guess to really get into it a good start would be to work with existing SIMD and take a quantitative approach to what is actually the most hot part of it. I wonder if any existing language implementations (eg Common Lisp) attempt to do these kinds of things in the first place.

analognoise · on July 2, 2021

I don't suppose anyone wants to team up to try to build one for an FPGA?

I know it's been done, but it sounds like fun.

Rochus · on July 2, 2021

https://github.com/lisper/cpus-caddr

rjsw · on July 3, 2021

I think that design could be made to run faster. The ALU currently copies all the original chips rather than just describing the operations needed in Verilog. Newer FPGAs should be big enough to put the microcode in block RAM as well.

Rochus · on July 3, 2021

Depends on the goals; personally I prefer an implementation as close to the original as possible because I'm interested in how they did it and what performance they were able to achieve. If on the other hand the goal is to have a Lisp machine like working environment/experience translated to todays standards/performance then I assume mirating the original code to an SBCL/Linux setup would rather be the way to go.

analognoise · on July 3, 2021

Likely, sure. You're relying on a ton of work by other people - thousands of them just to get a modern processor, let alone Linux.

Wouldn't it be cooler to understand the architecture, upgrade it and put it on an FPGA? Have a faithful Lisp machine with faster everything that fit in a $50 FPGA?

No Intel backdoors. No adtech. No telemetry. No X11 cruft. No SystemD boot mess. No Nvidia driver that doesn't play well with others. You could own the whole architecture - it's small enough for a few people to really grok.

Like restarting all of computing - with a lisp machine, for fun. Not relying on the million years of effort in Linux and the million years of effort on the modern processor below it.

You'd be relying on the fundamental advancements at the silicon layer - modern ASIC cells are practically perfect, compared to what was available in the 1970's. No need for multi-phase clocks, multi-power supply systems (you don't need +/- 10V?). No 10A just to drive 128kB of SRAM. It simplifies everything!

Architecturally a lot of the design in something from this vintage is "because they had to". Modern FPGA design is almost like having 'perfect' or textbook components. You can fan-out hundreds and hundreds of nets and meet timing at 100MHz - something designers would have killed for in 1980!

With "proper" design, on a modern FPGA fabric, you could run at 500MHz. You'd have the world's most roided-out lisp machine.

Boot time? Practically instant. Key lag? What lag?

Need extra horsepower for a scientific calculation? Attach an accelerator directly to the bus.

You could use Yosys and Verilator and the whole chain would be open. Nobody could ever take it from the community.

At outdated silicon nodes, you can build an ASIC. You could put the whole design on Skywater PDK and publish your transistor level design. Would it be competitive with a 5nm processor? Absolutely not.

Would it be the ultimate expression of the Hacker rebellion? I think so.

Rochus · on July 3, 2021

> Wouldn't it be cooler to understand the architecture, upgrade it and put it on an FPGA?

Personally, I would refrain from "upgrading", but instead faithfully recreate the digital circuits (simply on an FPGA instead of discrete logic), as it was apparently done in the referenced project. It's the same intention as when (re-)implementing Babagge's machines. If it's just to do Lisp programming on a modern machine, everything is already there.

analognoise · on July 3, 2021

This is the kind of stuff I'm talking about. Maybe the design space didn't win, but it would be really fun to explore.

rjsw · on July 3, 2021

Another project could be to modify the caddr verilog to match the LMI hardware, they have a slightly different MMU. The LMI software stack is complete and can rebuild itself.

I chose to emulate the OpenCores ethernet controller in the CADR software emulator to make it easier to move images between software and FPGA implementations.

analognoise · on July 3, 2021

Do you have a reference for the LMI hardware? That sounds like a good starting point, especially if the software exists already for the LMI hardware.

rjsw · on July 4, 2021

Maybe easiest to look at the sources for the emulator, I posted a link to it in another comment.

eric4smith · on July 3, 2021

Unpopular opinion here - and maybe it’s because I’m jealous of the Lispers.

But is Lisp used for anything real?

I mean, even teaching - is it worth it to learn lisp? Aren’t other languages more practical to learn?

Isn’t learning lisp like learning a a dead language that once you leave the lisp class you’ll never use again?

Wouldn’t you learn the exact same things you learned in lisp using a more widely used (practical)language?

twobitshifter · on July 3, 2021

Here’s the famous Paul Graham Essay:

http://www.paulgraham.com/avg.html

>Eric Raymond has written an essay called "How to Become a Hacker," and in it, among other things, he tells would-be hackers what languages they should learn. He suggests starting with Python and Java, because they are easy to learn. The serious hacker will also want to learn C, in order to hack Unix, and Perl for system administration and cgi scripts. Finally, the truly serious hacker should consider learning Lisp:

>Lisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot. This is the same argument you tend to hear for learning Latin. It won't get you a job, except perhaps as a classics professor, but it will improve your mind, and make you a better writer in languages you do want to use, like English.

>But wait a minute. This metaphor doesn't stretch that far. The reason Latin won't get you a job is that no one speaks it. If you write in Latin, no one can understand you. But Lisp is a computer language, and computers speak whatever language you, the programmer, tell them to.

retrac · on July 3, 2021

Common Lisp is a reasonably nice general purpose object-oriented language. It's big, but it's a smaller language than C++ while about as expressive, and quite efficient with a good compiler. There are a few real world applications in Lisp out there still. I saw an article about how Grammarly uses it for their grammar-checking service.

Small Lisps are sometimes used as an embedded or scripting language -- AutoCAD to Emacs. Not just old code, I saw a project using it as a scripting language for a game engine in Rust the other day. When the language is crafted to that kind of purpose, it's often a good fit.

Not to harp on like a CS prophet, but Lisp is basically a syntaxless language. '(It (can (be . maddening))) but the simplicity has its advantages. Ridiculously easy to implement (see niche scripting applications again) and you never get bogged down by syntax when coding. Just functions and data. It encourages a very meta style of programming, often where your program helps write itself at runtime.

Even if you aren't sold on Common Lisp being a good language for career advancement (I'm not either, honestly) I do think something in the Lisp family is probably worth learning for the same reason you should learn at least one assembly language in passing. A handful of basic primitives support every major approach to programming, if you structure them correctly. Makes you think about the problem differently.

rjsw · on July 3, 2021

I use it in a real product.

Common Lisp runs a lot faster than Python or Ruby and it is faster to write than C or Java.

mhuffman · on July 3, 2021

>Isn’t learning lisp like learning a a dead language that once you leave the lisp class you’ll never use again?

Well, similar to learning a dead language, lisp has lots of benefits!

For example, learning some latin is very useful when you come across new (to you) words with latin roots for understanding what they mean!

Similarly, learning lisp is very useful for understanding useful ways to solve new programming problems that you come across.

This is what people mean when they say that lisp will change the way you think as a programmer.

But even beyond that, lisp can be very productive. Especially some of its derivatives (like clojure).

Edit: I did not see that someone already replied with a similar response. But I will leave this here anyway.

pjmlp · on July 4, 2021

Depends how real you consider the customers that keep LispWorks and Allegro Common Lisp still in business since the Lisp Machine days,

http://www.lispworks.com/

https://franz.com/

They are surely not surviving the last 30 years out of charity.

aliasEli · on July 2, 2021

Lisp machines were an interesting idea. Unfortunately they were very expensive and fairly slow compared to other machines at the time.

retrac · on July 2, 2021

Lisp machines weren't slow; the original CADR of the late 70s ran at around 1 MIPS on 32 bit data with up to 8 MB of RAM, making it about as fast as the VAX 780. The VAX was a large minicomputer released in 1977 and one of the fastest machines, short of a high-end mainframe, at the time. A Lisp machine also cost about as much as a VAX (but for a single user).

The problem was maybe, aside from a $50,000 PC being hard to sell, that even on such generous hardware with specialized support, Lisp, particularly with the more naive compilation techniques of the 70s and early 80s, and after adding a fairly sophisticated operating environment, was still a rather hefty language.

rjsw · on July 2, 2021

The CADR used basically the same chips as a VAX 11/780.

peter303 · on July 2, 2021

Special purpose CPUs ran faster than general purpose. However they had upgrade cycles of 3-5 years compared 1/2 to 1 year for commodity chips. The commodity chip almost always caught up in the meantime at a lower cost. My research group bought array processors, fine grained processor like MassPar and Thinking Machines, min-super computers like Convex, and this catch-up happened every time. LISP firmware on general CPUs caught up with custom hardware like Symbolics too.

Very large customer bases like Nvidia can have annual design releases and keep up.

zozbot234 · on July 2, 2021

> The commodity chip almost always caught up in the meantime at a lower cost.

This dynamic is dead now, thanks to the slowing down of Moore's Law. We're even seeing a resurgence of special-purpose hardwired accelerators in CPU's, because "dark silicon" (i.e. the practical death of Dennard scaling) opens up a lot of opportunity for hardware blocks that are only powered up rarely in a typical workload. That's not too different from what the Lisp machines did.

formerly_proven · on July 3, 2021

Seems to me like Lisp was the OG "bloat language" (cf. Python, Ruby, ... today).

mark_l_watson · on July 2, 2021

My Xerox 1108 was reasonably fast, even updating it from InterLisp D to Common Lisp.

Now I now live in a combination of SBCL+Emacs+Slime and also LispWorks Pro. For newbies who want to learn a Lisp, I point them to Racket.

bitwize · on July 2, 2021

They were fast compared to contemporary machines (minicomputers like the PDP-10). What happened was. powerful micros came out and the technology in those and in Lisp compilers for those machines eventually surpassed the LispM architecture in speed. Complacency and mismanagement at companies like Symbolics meant the LispM architecture never caught up, even when it moved to a microprocessor architecture in the 80s.

pfdietz · on July 2, 2021

The single most important trick I remember for Lisp on stock hardware was implementing pointers to cons cells as pointers to the next byte, and doing car/cdr by -1(reg) and 3(reg) (or 7(reg) on a 64 bit machine). This automatically traps on non-conses without any extra cost.

rjsw · on July 3, 2021

The original SPARC CPU is designed to use this tag encoding scheme, section D.4 of the V8 Architecture Manual describes how to use it.

pfdietz · on July 3, 2021

Yes, that gives you flagging/trapping on arithmetic if any args are not fixnums. Aside from that, the idea works on other architectures as well.

bitwize · on July 3, 2021

Oooooh, that is "square root magic constant" levels of dirty.

pfdietz · on July 3, 2021

Also, it lets you implement fixnums with zero low order bits. That is, fixnum x is implemented as x << 2 (on 32 bit machines) or x << 3 (on 64 bit machines). With this encoding, addition and subtraction that is known to produce another fixnum can be done with ordinary add/sub instructions.

lispm · on July 2, 2021

Actually they were not slow compared to other machines. Initially they were developed to replace minicomputers (https://en.wikipedia.org/wiki/Minicomputer) as machines for Lisp programmers.

Instead of sharing one minicomputer having 8 MB RAM (or less) with tens or hundred users, the Lisp programmer had a Lisp Machine as a first personal workstation with GUI (1981 saw the first commercial Lisp Machine systems, before SUN, Lisa, Macs, etc.) - thus the Lisp programmer had not to compete with many other users with scarce memory availability. Often Lisp programmers had to work at night when they had a minicomputer alone - a global garbage collection would make the whole machine busy and response times for other users were impacted, up to making machines unusable for longer periods of time. When I was a student I got 30 minutes (!) CPU time for a half year course on a minicomputer (DEC10, later VAX11/780).

So for a Lisp programmer their personal Lisp Machine was much faster than what he/she had before (a Lisp on a time-shared minicomputer). That was initially an investment of around $100k per programmer seat then.

Later clever garbage collection systems were developed, which enabled Lisp Machines to practically use large amounts of virtual memory. For example: 40 MB physical RAM and 400 MB virtual memory. This enabled the development of large applications. Already in the early 80s, the Lisp Machine operating systems was in the range of one million lines of object-oriented Lisp code.

The memory overhead of a garbage collected system increased prices compared to other machines, since RAM and disks were very expensive in the 80s.

A typical Unix Lisp system was getting cheap fast, though the performance of the Lisp application might have been slower. Note that there is a huge difference between the speed of small code (a drawing routine) and whole Lisp applications (a CAD system). Running a large Lisp-based CAD system (like ICAD) at some point in time was both cheaper and faster on Unix than a Lisp Machine. But that was not initially, since the Unix machines usually had no (or only a primitive) integration of the garbage collector with the virtual memory system. Customers at that time were then already moving to Unix machines. New Lisp projects were also moving to Unix machines. For example the Crash Bandicoot games were developed on SGIs with Allegro Common Lisp. Earlier some game contents was even developed on Symbolics Lisp Machines - the software later was moved to SGIs and even later to PCs. Still a UNIX based system like a SUN could cost $10k for the Lisp license and $40k for a machine with some memory. Often users later bought additional memory to get 32MB or even 64MB. I had a Mac IIfx with 32MB RAM and Macintosh Common Lisp - my Symbolics Lisp Machine board for the Mac had 48MB RAM with 40bits and 8bit ECC.

Currently a Lisp Machine emulator on a M1 Mac is roughly 1000 times faster than the hardware from 1990 which had a few MIPS (million instructions per second). The CPU of a Lisp Machine then was as fast as a 40Mhz 68040. New processor generations had then either been under development, but potential customers moved away - especially as the AI winter caused an implosion of a key market: AI software.

For an article about this topic see: http://pt.withington.org/publications/LispM.html

"The Lisp Machine: Noble Experiment Or Fabulous Failure?"

eschaton · on July 2, 2021

They were (are) slow, though. By 1990, workstations a tenth the price were just as fast, and while Symbolics was trying to scale Ivory past 14 MHz, RISC CPUs were rapidly approaching 100 MHz and CISC CPUs were heading that way too. And Coral, Gold Hill, and Lucid all showed that modern general purpose CPUs could run good Lisp environments well.

My Symbolics systems are elegant, don’t get me wrong. But Genera wouldn’t have been any less elegant if they’d taken their 80386+DOS deployment environment (CLOE) and used it as the basis for a true 80386 port of Genera. They were too stuck on being better than everyone else at designing hardware for Lisp that they missed not needing special hardware for it.

lispm · on July 2, 2021

1990 was already 10 years after the first machines had some wider availability. 'Wider availability' means more then 20 hand-made machines and having commercial vendors (LMI and Symbolics, then TI and Xerox). Yeah, Lucid was pretty nice - too bad then went under when their investment into C++ killed them.

Actually I think Lucid was founded, because Symbolics did not want to further invest into a UNIX based implementation. Symbolics did support SUNs with Lisp Machine boards (the UX400 and UX1200). TI had Lisp Machines with UNIX boards.

Later Symbolics developed a virtual Lisp Machine running Open Genera (a version of their Genera operating system) for the 64bit DEC Alpha chip on top of UNIX.

"The Symbolics Virtual Lisp Machine Or Using The Dec Alpha As A Programmable Micro-engine"

http://pt.withington.org/publications/VLM.html

eschaton · on July 2, 2021

Lucid was founded because Symbolics was unbelievably expensive and unbelievably arrogant. “We will always be the best so you have no choice but to pay our prices, nobody can match our performance or technology.”

They wouldn’t even let a company decommissioning a workstation give it to an employee who wanted to take it home without paying about the cost of a Macintosh in “license transfer fees” and then whoever got it had to pay “maintenance” to stay within the letter of the license.

VLM is decent but they’d have been better off retargeting 80386 and 80486 atop either Unix or Windows, rather than trying to maintain their own special fancy architecture forever.

pjmlp · on July 4, 2021

Ironically we are still catching up with what Lucid was pushing for C++ when they pivoted.

eschaton · on July 2, 2021

Also, once the Symbolics 3600 series was released in the early 1980s, Symbolics looked like they were just resting on their laurels relative to the rest of the industry. There weren’t binary-compatible performance improvements every year or two like the rest of the workstation industry was seeing, it took the rest of the decade to achieve that kind of improvement, and with only source compatibility, in the form of the XL series.

To do a bit of an apples to apples comparison, look at the Apollo and Sun workstation lines versus the Symbolics workstation line from 1983/4-1991/2. That takes you from the Apollo DN300, SUN-1, and Symbolics 3600, through the Apollo DN10000, Sun SPARCstation 2, and Symbolics XL1200. They all started at about 1 MIPS but ended at very different positions.

fulafel · on July 3, 2021

At the time custom hardware was much more common and was attractive, it would have needed rare foresight to bet on 386 a long time before x86 became attractive. Unix workstations, Amiga, etc were doing well and Intel hw was generally seen as pedestrian and stuffy with bad graphics, and 386 seemed like another attempt from intel to dig x86 out from the DOS hole (it broke compatibility with 286, OS/2 had targeted special 286 features...). 386 Unix vendors weren't doing that well either. NeXT Cube with 68040 was launched in 1990 and nobody knew intel was going to overtake Motorola. etc.

bitwize · on July 2, 2021

In fact, one of the "Lisp machine on a budget" options of the mid 80s was the Hummingboard -- a 386 with many megabytes of RAM on an ISA card, specifically commissioned and built to work with Golden Common Lisp.

Zelphyr · on July 2, 2021

Do you have any recommendations for a Lisp Machine emulator for Mac?

lispm · on July 2, 2021

The Interlisp-D system from Xerox/... is available: https://interlisp.org . Expect a real parallel universe. Even for a Lisp programmer this will challenge what one expects from a development system.

The Symbolics system is only available as a pirated and slighty buggy software for Linux (also in VM running Linux). A better version exists, but that one is only available in limited commercial form. It's another parallel universe from 30 years ago. Most development basically stopped mid 90s.

jacquesm · on July 2, 2021

In a way that's great: it will be lightning fast compared to running on the original hardware (many orders of magnitude) and it won't be affected by all the bloat that they didn't tack on during the last 30 years.

eschaton · on July 2, 2021

The owner of the Symbolics IP is an unbelievable idiot for not making sure the modern emulator is distributed far and wide, with source, so people can experiment with and enhance it. That’s the only value it holds today but they seem determined to squander it by not putting it out.

lispm · on July 2, 2021

I'm out

eschaton · on July 3, 2021

Why’s that? You support keeping everything related to Symbolics locked up and hidden from the public? It’s 30 years old, it has no actual value besides research.

p_l · on July 3, 2021

There are people who work on preserving and making sure it can become available one day.

They will also give you that answer, for various reasons.

smackeyacky · on July 2, 2021

That would let too many people into the Lisp priesthood, can't have that kind of shenanigans going on.

rjsw · on July 2, 2021

As well as the ones lispm has described there are emulators for MIT CADR, LMI and TI Lisp Machines. The LMI one [1] is the most complete of these.

[1] https://github.com/dseagrav/ld

jampekka · on July 2, 2021

I.e. worse is better.