The book "The Connection Machine" by Danny Hillis is a fantastic account of the philosophy of the CM design. When I was in high school, I didn't have a computer, I only had computer books from the library, and TCM was a huge influence on me at that age.
This paper by Guy Steele and Danny Hillis is also a good introduction to the Lisp language they were developing for the CM (though from what I can tell it never got off the ground): http://diyhpl.us/~bryan/papers2/paperbot/25aa007a093cd69bbf0...
In particular the language has two new operators, alpha and beta, that today we might consider flexible analogs to map (compute) and reduce (communicate). It's really fascinating how they made an algebraically closed language around this concept.
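To make that concrete, here's a rough sketch in plain Common Lisp of roughly what alpha and beta correspond to. This is only an illustration, not real CM Lisp: ordinary lists stand in for xectors, and the real beta was more general (it also expressed communication patterns, not just reduction).

    ;; alpha broadcasts a function across every element (compute);
    ;; beta combines the elements (communicate).  Plain lists stand
    ;; in for xectors here.
    (defun alpha-apply (fn &rest xectors)
      (apply #'mapcar fn xectors))

    (defun beta-reduce (fn xector)
      (reduce fn xector))

    ;; A dot product, CM-style: map, then reduce.
    ;; (beta-reduce #'+ (alpha-apply #'* '(1 2 3) '(4 5 6)))  => 32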
This lecture by Hillis is also fascinating: https://www.youtube.com/watch?v=Ua-swPZTeX4
Near the end he talks about how this room-sized machine had a synchronous clock, and how there were bits stored in the wires in transit because all the wire lengths were carefully cut to multiples of the clock period.
There was also Gary Sabot's work on Paralation Lisp (really interesting approach to declaratively specifying data locality), and later Guy Blelloch's NESL (the first programming language to really deal with nested parallelism) for the CM-2.
The CM-1 was a massively parallel computer designed by Danny Hillis in the context of his PhD thesis at MIT and commercialized at Thinking Machines Corporation in the mid-to-late 1980s. The machine had up to 65536 very primitive CPU cores (more or less just simple ALUs) interconnected using a 27-dimensional hypercube, plus a fascinating array of LEDs indicating the activity of each of the CPUs (just a fake display showing random patterns on later machines such as the CM-5). You had to use a high-end workstation such as a Sun 3 or a Symbolics Lisp machine to operate the CM-1. In principle, the CM-1 was a (very expensive) single-user machine, but larger machines could be divided into up to four independent partitions, one per user.
One famous person working on the CM architecture was Richard Feynman; the story of his approach to optimizing the number of buffers in the CM-1's router chip is worth reading: https://longnow.org/essays/richard-feynman-connection-machin... (this article also links to a TEDx talk by Danny Hillis).
The CM-1/2 is not exactly a computer; it is more of a massively parallel SIMD computation accelerator. While AFAIK the CM-1 microsequencer has some limited ability to handle control flow, it is there mostly to further compress the instruction stream coming from the front-end workstation (which came over two wonderfully thick 50-pair ECL cables); most of the actual programmed control flow is done in software on the front-end side. Its architecture is somewhat reminiscent of an FPGA. But while you can freely program the individual macrocells in an FPGA, all of the CM "CPUs" share the same configuration, though this configuration can change each clock cycle (see Fig 4.1 in the dissertation and note that the ALU is essentially a freely configurable 8x2 LUT).
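To illustrate the LUT point (this is my reading of the thesis, so treat it as a sketch rather than the actual hardware): the per-cycle "instruction" amounts to two 8-bit truth tables, one per output bit, indexed by the three input bits.

    ;; Toy model of a 3-input/2-output LUT ALU.  Each truth table is
    ;; an 8-bit integer; the three input bits select one bit from each.
    (defun lut-alu (sum-table carry-table a b c)
      (let ((index (+ (* 4 a) (* 2 b) c)))
        (values (ldb (byte 1 index) sum-table)
                (ldb (byte 1 index) carry-table))))

    ;; Configured as a full adder: sum = a xor b xor c, carry = majority.
    ;; (lut-alu #b10010110 #b11101000 1 1 0)  => 0, 1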
"The CM-1/2 is not exactly a computer; it is more of a massively parallel SIMD computation accelerator."
Right. SIMD was something people tried doing before they could get enough transistors to give every node an instruction decoder. It's useful for a narrow class of problems where special purpose hardware or an FPGA isn't justified, but the problem is so regular that SIMD will work. This niche is not large enough to keep a company in business.
(There was a long series of strange architectures tried in that period. Transputers. Hypercubes. Dataflow machines. The Cell. All too hard to program and not general-purpose enough.)
Surely "27-dimensional" must be wrong? 65536 is 2^16; what would an extra 11 dimensions be for?
(The OP says 12-dimensional; the Wikipedia page says 12-dimensional in one place and 20-dimensional in another; the latter seems like it must also be wrong. I wonder whether it's 16+4 where the correct figure is 16-4, the idea being that there are 2^4=16 processors per node or something like that.)
"For a fully configured CM-1, the network is a 12-cube connecting 4,096 processor chips (that is, each l6-processor chip lies at the vertex of a 12-cube)."
We had a Thinking Machine at the University of Minnesota. The person who was in charge of it learned to determine if the code running on it was efficient by watching the patterns of the LEDs.
The architecture is significantly different from what you would find in a typical GPU. But the programming model can be considered similar to the early days of GPGPU programming on hardware that was not designed for that usage (i.e. no control-flow instructions, just a giant SIMD number-crunching thing running somewhat decoupled from the host computer).
My group at NASA Ames was specifically tasked with pushing systems such as the Connection Machines, Intel Paragon, IBM SP-2 into (boring) production status. Yeah, that was a fun job. Also contentious, but hey, you want the drama. But it was noticed that a CM-5 was supposed to be visible in Jurassic Park, so by consensus we all trooped over to Shoreline Theater to investigate this. On the clock. That is team building that I can get behind. All of these systems were tres cool to marvel at. As a fresh out, I thought it was super cool that the Cray YMP had a big red switch, just like my home 386 box. And yep, attention was paid to the quality of the blinkenlights when important visitors were around.
As a teenager, I had the opportunity to play with and experience the monumental installation Genetic Images by Karl Sims at the Pompidou Center in 1993.
It used a Connection Machine to render images in real time:
The photo at the top of the page is of the installation at Pompidou Center (in the space that is now the staff entrance.)
Genetic Images has only been installed 3 times. Its installation at the Pompidou Center as part of the Revue Virtuelle was especially significant at the time; exhibiting a working supercomputer in a museum had never been done before, as far as I know.
A sad state I agree, but inevitable when "supercomputers" became something you had to get past bean-counters, rather than just generals and congresscritters.
and it's actually 19" racks underneath the fairing :) there is a little additional enclosure in the middle that contains the sequencers. the lights are just card edge.
the cm-5 though was all custom machined, including a lovely card insert-and-lock cam for the modules.
the cm5 was stunning, but the cm2 was a fantastic result with a lot less effort and cost.
"Data Parallel Algorithms" by Hillis / Steele is an excellent introduction to SIMD programming at a high level (assumed to run on the CM-2).
It's found in the December 1986 issue of "Communications of the ACM". I highly recommend the read. The prefix-sum methodology introduced in that beginner-level article remains the basis of a large number of GPU algorithms today.
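For anyone curious, the scan from that article is easy to sketch. This is my loose sequential rendering of the idea, with a plain vector standing in for the processors: log2(n) sweeps, where on sweep j every element adds in the value 2^j positions to its left.

    ;; Inclusive prefix sum, Hillis/Steele style: O(log n) parallel
    ;; steps.  Iterating i downward mimics the parallel "everyone
    ;; reads old values" semantics on a single CPU.
    (defun hillis-steele-scan (v)
      (let ((v (copy-seq v)))
        (loop for step = 1 then (* 2 step)
              while (< step (length v))
              do (loop for i from (1- (length v)) downto step
                       do (incf (aref v i) (aref v (- i step)))))
        v))

    ;; (hillis-steele-scan #(1 1 1 1 1 1 1 1))  => #(1 2 3 4 5 6 7 8)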
The book specifies that the park is run by a Cray supercomputer (IIRC it even mentions a particular model, probably a Y-MP). For the movie it was replaced by a CM-5, because it was deemed even more visually stunning: while Crays often have interesting industrial design (often driven by the finite propagation speed of signals in copper wire), they lack the blinkenlights.
IIRC, the default blinkenlights on the CM-5 were apparently one LED for each of the 4096 "cores" (really SIMD lanes), showing whether that particular "core" is active: on means the SIMD lane is active, off means it's inactive. Kind of a glorified Microsoft Windows "CPU performance" screen, except physical.
Given the regular pattern in Jurassic Park, I assume they reprogrammed the LEDs to look more futuristic.
I read somewhere that the pattern displayed by the CM-5 in the film was referred to as 'common and pleasing #6', or as close as I can recall. I've not seen a definition of the pattern other than its actual output.
IIRC on the CM-1/2 the LEDs reflected the state of conditional execution of the individual VLSI packages (there is one LED per package, not one LED per CPU); on the CM-5 it is just a bunch of LED matrices controlled more or less globally by, I assume, some BMC-like thing.
I know I'm not the only one that has wanted to recreate the front of a CM-2 with LEDs. Here is a good reference for the "random and pleasing" pattern of blinking: https://trmm.net/CM-2/
I have read a few pieces about the design and programming at a higher level, but not much on the lower-level interprocessor communication and coordination. The comments here have a heap of links; are any on this aspect?
Has there been much research into non-SIMD n-cube programming models where each processor can go off on its own path but farm work out to its neighbours?
I'd like to have a play around with an n-dimensional multiprocessing system (maybe with some microcontrollers and n in the 4-7 range).
I have this crazy idea of a dual plane optical interconnect n-cube with all of the vertices with even parity on one plane and odd parity on the other. This way all of the neighboring vertices are on the opposite plane with a direct line of sight.
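The property that makes this work is easy to sanity-check: flipping one address bit always flips the parity of the address, so every hypercube edge crosses between the two planes. In Common Lisp:

    ;; All neighbors of an even-parity vertex have odd parity, since a
    ;; one-bit flip always changes the popcount parity.
    (defun plane (node)
      (if (evenp (logcount node)) :even :odd))

    ;; (plane 5) => :even, and every neighbor (logxor 5 (ash 1 k)) is :odd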
(I've listed the Feynman ones at https://news.ycombinator.com/item?id=27682314 but let's keep that Connection Machine connection off topic here, since it's by far the most-discussed.)
I had an undergraduate research project where I implemented Guy Steele's Constraint Propagation Language (from his PhD thesis) in *Lisp on the CM-1. I completely forgot about this.
The CM-1 lacks floating-point hardware, which anecdotally made it unsuitable for typical supercomputer number-crunching tasks; that makes the 1 GFLOPS figure somewhat questionable. Hillis' thesis quotes 1000 MIPS for the 4MHz prototype. I believe that production CM-1s had a higher clock speed than 4MHz, probably 10MHz, which would change that to 2500 MIPS for a 64k-CPU CM-1.
and then a bunch of software support around 'zectors': like a vector, but with a zillion elements.
so I think you could do float, but you'd have 3 zectors: the low bits, the high bits, and the mantissa; you might need another one for sign.
I think it was tough to give an apples-to-apples comparison, because it wasn't optimized for floats but could do floats. It could also do bizarre things with bitmaps, if you could think of the right instruction to get your result. So for some problems it was much faster than any alternative. But floats are readily accessible and well understood, so it's natural to reach for that tool first.
Anywho, my memory is foggy. I might be mischaracterizing the design. But I think that's about right.
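That matches my understanding of software float on a bit-serial machine: each field lives in its own plane (or 'zector') and the CPUs grind through the bits. A loose sketch of just the field-splitting part, in standard Common Lisp (nothing CM-specific about it):

    ;; Split a float into the fields that would each get their own
    ;; "zector": sign, exponent, and mantissa.  DECODE-FLOAT is
    ;; standard Common Lisp.
    (defun split-float (x)
      (multiple-value-bind (mantissa exponent sign) (decode-float x)
        (list :sign sign :exponent exponent :mantissa mantissa)))

    ;; Mapping it across a whole "zector" of floats:
    ;; (mapcar #'split-float '(1.5 -2.25 0.125))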
I assume that there were two completely different CMLisp/StarLisp implementations, as the language described by Hillis' dissertation and the StarLisp paper (i.e. with xectors and xaps) is significantly different in overall design from the StarLisp described in the CM-2 manuals and implemented by the Common Lisp CM SDK (which can be found somewhere on the internet and works in SBCL). The newer version is significantly lower-level and does not have the xector concept (its design is similar to the C and Fortran APIs for the CM). I feel that the reason for this change was to make collective communication operations more explicit, probably because the store-and-forward communication network of the CM-1/2 isn't exactly fast at doing arbitrary global permutations.
it was almost certainly less. while the cm2 had floating point chips from weitek lashed to the side, the cm1 ran one single-bit instruction per cycle .. ours was running at a little over 4mhz, so 250ns. floating point was done in software.
We got some of the first ones at the Naval Research Laboratory in Washington, DC, and I used them to perform (unclassified) molecular dynamics simulations. People used them to scale up all kinds of simulations. The language was a special dialect of C called C*, which was essentially C with array extensions that made doing arithmetic on the CM (which, as other commenters point out, was really just a big array processor) relatively easy to program. It was a single-instruction-multiple-data model, where each little processor got its own piece of the array. Then as now, the way to get speed was to minimize communication among processors.
Whatever your feelings about the NSA, it's an extremely cool museum, with free admission (at least when I went). The staff are incredibly knowledgeable; while obviously they can't discuss classified material, you're bound to get some great unpublished tidbits from them.
They were instrumental in a certain theme park, but the lead programmer suffered an unfortunate… workplace incident as he was attempting to commit industrial espionage. The Costa Rican government has kept this quiet since the 90s…
I often wonder whether I could (or should) try to build something like this out of a cluster of SBCs, hooked to a bigger board by the GPIO pins and lighting LEDs depending on CPU core activity. 16 boards with 32 cores each would require 8 RPi 4s per board, plus the network cables coming out the back, for a total of 128 nodes.
It'd probably be cleaner to use compute modules such as the RPi 4 CM or the Pine64 SOPINE, because they pass ethernet and GPIO via the board connectors and wouldn't require more than one ethernet port per board (provided the board has its own switch), but I have no idea how to design a switch, much less how to misuse the bits and pieces of one where the network cables are traces on a PCB.
Either that, or have 32 Octavo SoMs, which would allow two separate networks per board, but I'd still need to design the switches (and I have no clue as to how to do that beyond "try to make the traces of equal length").
At least the "neck" of the CM-2a has space for a very large and, presumably, quiet fan.
I made 1/10th scale models of the machines themselves, complete with a computer inside and the blinkenlights, just because I was so struck by the design.
I just made a few for myself. Happy to send files to whomever wants to do the same. Warning--it'll be a bit of a project if you want something fully functional, not just the plastic case. Holler a reply here if you're interested and we can work out a way to privately send contact info.
Tamiko Thiel herself is selling CM T-shirts on spreadshirt. It is my understanding that the original TMC T-shirts also had print on the back, the spreadshirt ones only have the cube of brainy cubes design on the front.