TROS: How IBM mainframes stored microcode in transformers (2019) (righto.com)
128 points by detaro on March 6, 2021 | 28 comments




The first Bell System electronic switching system, 1ESS, had something similar.[1]

Early computing was very much memory constrained.

[1] https://en.wikipedia.org/wiki/Number_One_Electronic_Switchin...


Magnetic cores can be used to do logic, and there were a few solid-state computers based on that premise (with MHz clock rates long before transistors caught up).

It is possible to build the equivalent of an FPGA with cores and programmed jumpers like this. It could be useful on the surface of Venus if sufficiently high temperature materials were used.


Just to clarify, these TROS modules don't use core, but "plain old" transformers. The difference is that cores are stateful and nonlinear (they get magnetized), while transformers (more or less) just give you electricity out when you put electricity in.

As you said, magnetic cores were used for logic in some machines. Cores were also used in core memory, which was the main way of implementing RAM until semiconductor RAM came along. Cores were also used in the Apollo Guidance Computer's core rope ROM modules.


Ferromagnetic materials have hysteresis, a stickiness to their field. Materials with large hysteresis are used to make permanent magnets, and those with very low hysteresis are best for transformer cores, because each transition around the hysteresis loop represents loss. All cores have it; it is an engineering optimization to choose a "hard" (high hysteresis) or "soft" (low hysteresis) magnetic material.

If you want to store a bit, as in core memory, or magnetic logic, you'd use a "hard" material. It is likely the cores in the TROS module were "soft".

If the hysteresis is low enough, even the normal atomic motion due to heat can cause the field to dissipate over time.
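
A toy model may help build intuition here. This is a minimal Python sketch with made-up numbers (nothing physical): the core's remanent state flips only when the applied field exceeds the material's coercivity, so a "hard" core holds its bit through small disturbances while a "soft" one loses it.

    # Toy hysteresis model: remanent magnetization flips only when the
    # applied field exceeds the material's coercivity (loop half-width).
    # Values are illustrative, not physical.
    def magnetize(state, applied_field, coercivity):
        if applied_field > coercivity:
            return +1      # saturated "one"
        if applied_field < -coercivity:
            return -1      # saturated "zero"
        return state       # field too weak: previous state retained

    hard_core = -1         # "hard" material, e.g. core-memory ferrite
    soft_core = -1         # "soft" material, e.g. a transformer core

    # The same small disturbance (noise, heat) applied to both:
    hard_core = magnetize(hard_core, 0.3, coercivity=1.0)
    soft_core = magnetize(soft_core, 0.3, coercivity=0.1)
    print(hard_core, soft_core)   # -1 1: the hard core kept its bit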


IIRC it's not just cores in general that do the computation, is it? Saturable reactors (https://en.wikipedia.org/wiki/Saturable_reactor) let you use magnetism for amplification, making it possible to build computers that would last a very, very long time.


This was a fantastic read! I had a couple of questions about the following two passages:

>" Each word (A or B) has a drive line that passes either through a transformer (for a 1 bit) or around a transformer (for a 0 bit). "

What is actually "driving" these drive lines? What is the source?

>"In this model, different microcode was loaded from a card deck or tape to switch operating modes between System/360 and emulation of the legacy IBM 1400 series."

Was the System/360 the successor to the IBM 1400 series then? I'm guessing this was the first big commercial "backwards compatible" instruction set?


Thanks for the questions. The drive lines are activated by a matrix of source and sink drivers. The idea is that instead of 256 drivers, you use 16 sources and 16 sinks so each combination drives one line and you save a lot of components. To make this work, you need a bunch of diodes, which are in the metal cans on the board. (Footnote 2 has a diagram which might clarify all this.)
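
If it helps, here's a minimal Python sketch of that matrix addressing scheme (names and numbers are mine, not IBM's; footnote 2 has the real circuit):

    # 16 source drivers x 16 sink drivers address 16*16 = 256 drive
    # lines, so 32 drivers replace 256. The diodes stop current from
    # sneaking through unselected lines.
    N_SOURCES, N_SINKS = 16, 16

    def select_drivers(word_address):
        """Map a word address (0..255) to the (source, sink) pair whose
        intersection is that word's drive line."""
        assert 0 <= word_address < N_SOURCES * N_SINKS
        return divmod(word_address, N_SINKS)

    print(select_drivers(200))    # (12, 8): pulse source 12, sink 8
    print(N_SOURCES + N_SINKS)    # 32 drivers total, instead of 256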

As for the importance of the System/360, you need to consider IBM's computers in the early 1960s. They had scientific computers, business computers, plus low-end computers like the 1401 for business and 1620 for science. These computers were entirely incompatible, with different instruction sets, word sizes, and architecture. As software projects got larger, this turned into a nightmare since you didn't want to rewrite your operating system for every computer.

The IBM System/360 solved this problem by using the same instruction set and architecture for all their computers, from the low end to the room-filling. The System/360 covered the whole circle of application (i.e. 360°) from business to scientific. This was a revolutionary idea at the time and became very popular. So popular that IBM's current mainframes still support the 360 architecture.

One problem was how to move customers from their old computer to the System/360 since they wanted to keep their old software. This was a big problem for IBM until someone realized that using microcode, you could emulate the old instruction set without much effort. Thus, you could buy emulation support on your System/360 computer and get the microcode to run your old 1401 or scientific programs.
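
To make the idea concrete, here's a toy Python sketch (nothing like real IBM microcode, just the general shape): one execution engine runs whichever "microcode" dispatch table is loaded, so swapping the table swaps the instruction set the machine interprets.

    # One engine, two loadable dispatch tables. Swapping the table
    # switches the instruction set, which is the essence of emulation
    # mode. All opcodes here are invented for illustration.
    def run(program, microcode):
        pc, acc = 0, 0
        while pc < len(program):
            op, arg = program[pc]
            acc, pc = microcode[op](acc, arg, pc)
        return acc

    native = {   # toy "System/360 mode"
        "LOAD": lambda acc, arg, pc: (arg, pc + 1),
        "ADD":  lambda acc, arg, pc: (acc + arg, pc + 1),
    }
    emulation = {   # toy "1401 mode": same engine, different table
        "L": native["LOAD"],
        "A": native["ADD"],
    }

    print(run([("LOAD", 2), ("ADD", 3)], native))   # 5
    print(run([("L", 2), ("A", 3)], emulation))     # 5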


>"The System/360 covered the whole circle of application (i.e. 360°) from business to scientific."

Wow. I did not know that there was a marketing component to the name 360; I always assumed it was just another prosaic IBM product number.


Most of IBM's product numbers are annoyingly random, but sometimes marketing gets it right, as in the case of the System/360. Amusingly, the aerospace version of System/360 is called System/4 Pi: these computers are operating in three dimensions, and just as a circle has 360 degrees, a sphere has 4π steradians.
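
For anyone checking the 4π: the solid angle of a full sphere is its surface area over the squared radius,

    \Omega = \frac{A}{r^2} = \frac{4\pi r^2}{r^2} = 4\pi \text{ steradians.}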

IBM also had the System/370, which was the 1970s version of the 1960s System/360, and similarly the System/390 in the 1990s.


>"However, the arrival of cheap semiconductor ROMs in the 1970s obsoleted complex storage technologies such as TROS. Nowadays, most microprocessors still use microcode, but it's stored in ROM inside the chip instead of in sheets of Mylar. Microcode can now be patched by downloading a file, rather than replacing Mylar sheets inside the computer."

I'm curious if anyone knows what the last major computers to use hardwired control units in their CPUs would have been. Similarly, what would the first major ROM-based CPU control units have been?


Microcode goes way back to Whirlwind (1947) and EDSAC 2 (1957), depending on how you define it.

Processors still use hardwired control if they are simple enough (e.g. RISC designs). I think hardwired control is also sometimes used for performance.


Interesting, I wasn't familiar with the Whirlwind. Regarding the hardwired control in RISC designs that makes sense.

I have a somewhat tangential question - in doing some Googling about RISC and control units I came across the following from a question on Quora[1]:

>"P5 (Pentium) and P6 (Pentium Pro). Note that the control unit was phased out in favour of the reservation station and re-order buffer."

>"The control unit with its rule-based, static approach is far too simplistic and would perform poorly in an out-of-order superscalar core design where the code stream is sequential."

At first this struck me as odd, as I wouldn't have thought that the reservation station and ROB would be mutually exclusive with a control unit, but it got me wondering: is the microcoded control unit more of a logical entity today, split between the reservation station and ROB? Or was this something specific to these Pentium chips? Or is this person just wrong?

[1] https://www.quora.com/Why-do-CISC-computer-architecture-tend...


In that quora answer, the box labeled "control unit" really should have been labeled "scheduler", as it is the unit that would be "scheduling" instructions for execution into the execution units (taking dependencies into account to do so).

That "scheduling" is a "control" function, so "control unit" is not wrong, per. se., but there is still going to be a "control unit" even in the reservation station/reorder buffer variant on the right as well. What differs between the two is that the "scheduling" that occurred by the left side "control unit" was instead dispersed into the reservation stations and reorder buffer of the right side diagram.

But there will still be a "control unit" handling overall control of the CPU, it is just not given a box and a label in the right side.
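
If a sketch helps, here's a toy Python model of that dispersal (a teaching toy with invented details, not any real Pentium design): entries fire as soon as their operands exist, so a younger independent instruction can complete before an older stalled one, while the ROB keeps program order for retirement.

    # Toy out-of-order core. Entries fire when operands are available
    # (the reservation-station role); the reorder buffer (ROB) keeps
    # program order for retirement. Renaming, timing, and everything
    # else real designs do is omitted.
    OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

    regs = {"r1": 3, "r2": 4}        # r0 arrives late, like a slow load
    program = [                      # (op, src1, src2, dst) in order
        ("mul", "r0", "r2", "r3"),   # stalls until r0 arrives
        ("add", "r3", "r2", "r4"),   # depends on the mul above
        ("add", "r1", "r1", "r5"),   # independent: completes first
    ]
    rob = [{"insn": i, "done": False} for i in program]

    cycle = 0
    while not all(e["done"] for e in rob):
        cycle += 1
        if cycle == 3:
            regs["r0"] = 5           # the "load" finally completes
        for e in rob:                # fire the first ready entry
            op, s1, s2, dst = e["insn"]
            if not e["done"] and s1 in regs and s2 in regs:
                regs[dst] = OPS[op](regs[s1], regs[s2])
                e["done"] = True
                print(f"cycle {cycle}: {e['insn']} -> {regs[dst]}")
                break                # one execution unit per cycle
    # Output: r5 computed in cycle 1, before the older mul and add;
    # the ROB would still commit r3, r4, r5 in program order.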


Thanks. I think this is the key idea I was looking to understand:

>"What differs between the two is that the "scheduling" that occurred by the left side "control unit" was instead dispersed into the reservation stations and reorder buffer of the right side diagram."

Is it a correct mental model then to think of the control unit as a logical or "distributed" unit (right side of the diagram) instead of a monolithic unit (left side of the diagram)?

I'm guessing this "reservation station/reorder buffer variant" of control unit is the predominant one in use in CISC chips these days?


> Is it a correct mental model then to think about control unit being a logical or "distributed" unit(right side of diagram) instead of a monolithic unit(the left side of diagram)?

That is a somewhat reasonable interpretation. Just keep in mind that both diagrams are massive simplifications of the actual reality (i.e., the left side diagram likely had a more dispersed control system than the block diagram would lead one to believe).

> I'm guessing this "reservation station/reorder buffer variant" of control unit is the predominant one in use in CISC chips these days?

It is/has been a popular model (first introduced by IBM in the 360/91 mainframe circa 1967 -- R.M. Tomasulo, "An Efficient Algorithm for Exploiting Multiple Arithmetic Units", https://www.cs.virginia.edu/~evans/greatworks/tomasulo.pdf) for many designs over the years. As for being predominant in CISC chips, that really depends on what current Intel/AMD CPUs are doing internally, and I've not kept up with their most recent offerings' internal designs.


Thanks, and thanks for the link. I was aware of the Tomasulo algorithm, but I don't think I've ever seen the original paper, nor did I know it was inherently connected to the IBM System/360. Cheers.


Do you know of any systems that dynamically updated their microcode, or is microcode relatively fixed?

My understanding is that the Burroughs series of mainframes were designed to support user-definable instruction sets, so that a computing site could be optimized for specific workloads and languages.


>"My understand is the Burroughs series of mainframes were designed to support user definable instruction sets so that a computing site would be optimized for specific workloads and languages."

This is fascinating, although I'm having trouble wrapping my head around it. Would "user" here be an engineer or field tech from Burroughs, or a mainframe admin at the company that was leasing the Burroughs system? You would have to have very intimate knowledge of the CPU architecture to sort of roll your own instruction set, no? Or am I misunderstanding this, and it would be more about removing some inefficient instructions for a specific type of workload?


Intel CPUs (as well as AMD's) have the ability to load microcode patches to fix bugs that are found post-manufacture. In fact, many of the Spectre/Meltdown mitigations that followed those exploits becoming known took the form of microcode patches for the various CPU models.

Note, this is not "user definable", but it is updatable microcode.
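
On Linux/x86 you can see which microcode revision each core is currently running; the kernel reports it in /proc/cpuinfo. A quick check (Linux-only):

    # Print the running microcode revision of the first CPU listed.
    # x86 Linux exposes it in /proc/cpuinfo; other platforms may not.
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("microcode"):
                print(line.strip())   # e.g. "microcode : 0xf0"
                break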


The Xerox Alto was famous for its user-writeable microcode (among other things). Different languages could use microcode that was optimized for the language's operations. Even some games had custom microcode for high performance.


I strongly suspect the Apollo Guidance Computer mentioned for comparison used the 'turn on one core at a time' design mostly for low power rather than primarily for weight. A predictable and small power budget would have been every bit as important as the potential weight savings.


Power consumption was important for the Apollo Guidance Computer, but what made core rope remarkable is its density: it stored 192 bits per core.


I wonder how it was shielded to prevent spurious signals from being injected.


I don't think there was any shielding. The magnetic fields were localized enough (probably restricted to each transformer's core) to avoid problems.


Probably it wasn't; just a very strong flow of electrons and a very wide gap between off and on.


Really cool read.




