In the days of OS/360 and OS/370 (IBM's mainframe OSes of the '60s and '70s), no instructions at all executed in the wait state. There was no OS-level idle process. Instead, the internal CPU microcode itself ran a little loop without dispatching any machine instructions until the tentacles that reached out from that loop were tickled by an external interrupting event. That loop had to be very short for latency reasons and very prickly, testing for all the conditions that could take the processor out of it to resume dispatching instructions or do critical internal things.
I know this because I wrote that 6 (initially 8) micro-instruction loop for the IBM System/370 Model 155-157 in the late '60s. I was told back then that since that model became the most widely sold machine ever I had written the code most executed in the history of computing. :-)
The status latch in the machine that indicated that loop was running was fed to a light on the console and, yep, it was nearly always lit other than when booting the OS.
They kept that trick up for a long time. As of AIX 3.2.5 back in the 1990s, an "idle" RS/6000 ran at 70% CPU utilization. They were super-fast and efficient in their day, too.
Dunno how true that is today, but I wouldn't be surprised if AIX still uses the same kind of tight event loop internally.
I think John Cocke was working out the architecture of that line then. He's not often credited with it, but he invented RISC with that effort. Hardware was built in research (called the 80x where I don't remember x) but the actual PowerPC was quite a ways in the future.
To me the brilliance in Radin's 801 paper was the belief that you could trust the compiler to do a better job. Essentially, a compiler can have more than 10 fingers to find its way through optimizing code (optimizing assembly often means marking places with your fingers and glancing back and forth as you trace the logic of execution). That realization struck me at the time like a thunderbolt.
Cocke deserved his Turing, no question about it; however, I see the intellectual antecedents of RISC being machines like Gordon Bell's PDP-6/10 (with a small, clean, orthogonal instruction set) and Cray's CDC 6600 (with its overlapping execution units). CISC was a doomed effort to make assembly programmers more efficient.
Though the most popular compiler target today is a CISC instruction set, that output is simply run on microprogrammed emulators.
Yes, John and Fran Allen wrote the first optimizing Fortran compiler and developed most of the optimization techniques and analysis used by subsequent compilers.
I knew John and it was that work that led him to design an instruction set optimized for compilers that was implemented in the 801. I'm less aware of George's contribution. Perhaps you could enlighten me.
This can be true on modern CPUs as well. In fact, if the software chooses, it can even remove power from the core dynamically, so one doesn't even have to pay for the leakage power.
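For what it's worth, a rough sketch of what a modern OS idle loop looks like, C-ish and not any particular kernel's source (need_resched and do_schedule are made-up stand-ins). HLT/WFI just stop dispatching until the next interrupt; the deeper idle states that actually remove power from the core are requested through platform-specific mechanisms beyond this sketch:

    #include <stdbool.h>

    /* Made-up stand-ins, not any real kernel's API. */
    static bool need_resched(void) { return false; }   /* is there runnable work? */
    static void do_schedule(void)  { }                 /* hand the core to a task */

    /* Sketch of a modern OS idle loop; privileged (kernel) context assumed. */
    void idle_loop(void)
    {
        for (;;) {
            if (need_resched()) {
                do_schedule();
                continue;
            }
            /* HLT/WFI stop instruction dispatch until the next interrupt.
             * The deeper states that gate power off the core entirely are
             * entered through platform-specific mechanisms, not shown here. */
    #if defined(__x86_64__) || defined(__i386__)
            __asm__ volatile("hlt");
    #elif defined(__aarch64__)
            __asm__ volatile("wfi");
    #endif
        }
    }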
Oh, all those machines used linear ECL (Emitter Coupled Logic) circuits for speed. Terribly hot logic family, but complementary FET-based devices just couldn't yet cut the mustard.
FWIW, the transition from the Model 155 to the 157 heralded the introduction of table-driven virtual memory. We called it "relocation" hardware then. :-)
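To make "table-driven" concrete, here's a toy sketch of the idea in C; the field widths and table formats are illustrative, not the actual S/370 relocation architecture. A virtual address is split into indexes into a segment table and a page table, and the hardware walks those tables to form the real address:

    #include <stdint.h>

    /* Toy two-level lookup: 4 KB pages, 1 MB segments (an 8-8-12 split).
     * Illustrative only; not the real S/370 segment/page table formats. */
    #define PAGE_SHIFT   12
    #define PAGE_MASK    0xFFFu
    #define PT_INDEX(v)  (((v) >> PAGE_SHIFT) & 0xFFu)
    #define SEG_INDEX(v) (((v) >> 20) & 0xFFu)

    struct page_entry    { uint32_t frame; int valid; };
    struct segment_entry { const struct page_entry *page_table; int valid; };

    /* Returns the real address, or -1 to stand in for the translation
     * exception the relocation hardware would raise on a miss. */
    int64_t translate(uint32_t vaddr, const struct segment_entry *seg_table)
    {
        const struct segment_entry *seg = &seg_table[SEG_INDEX(vaddr)];
        if (!seg->valid)
            return -1;                       /* segment-translation fault */

        const struct page_entry *pte = &seg->page_table[PT_INDEX(vaddr)];
        if (!pte->valid)
            return -1;                       /* page-translation fault */

        return ((int64_t)pte->frame << PAGE_SHIFT) | (vaddr & PAGE_MASK);
    }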
The first engineering model of the 155 used core memory but transistor memory became fast, reliable and cheap enough during its development that a very disruptive re-design was performed across the whole family of machines.
That was my first job out of the university and what fun it was! And what an incredible experience working for IBM was then.
What a fun gig. And what interesting times. I remember the headlines in Computerworld when the cost of a megabyte of memory had dropped to $15,000.
I did very little work on the mainframes, lots on the IBM 1800, a process control computer, and my first consulting gig was on a System/32. Don't ask me what programming language it was.
It was all discrete, hard logic circuits centered around a unit called the scoreboard. That term was borrowed from either Burroughs or CDC, I can't remember which. That machine pioneered many of the scalar optimizations that are common in today's architectures.
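For anyone who hasn't met the term: a scoreboard tracks which functional units are busy and which registers have results pending, and holds an instruction back until it can issue safely. A highly simplified sketch of just the issue check, illustrative and nowhere near the 6600's actual logic:

    #include <stdbool.h>

    enum { NUM_UNITS = 4, NUM_REGS = 8 };

    struct unit_status { bool busy; int dest_reg; };

    struct scoreboard {
        struct unit_status unit[NUM_UNITS];
        int reg_writer[NUM_REGS];   /* which unit will write this reg, or -1 */
    };

    void scoreboard_init(struct scoreboard *sb)
    {
        for (int u = 0; u < NUM_UNITS; u++)
            sb->unit[u].busy = false;
        for (int r = 0; r < NUM_REGS; r++)
            sb->reg_writer[r] = -1;
    }

    /* Issue check only: stall on a structural hazard (unit busy) or a WAW
     * hazard (an in-flight instruction already owns the destination). */
    bool try_issue(struct scoreboard *sb, int unit, int dest_reg)
    {
        if (sb->unit[unit].busy)
            return false;
        if (sb->reg_writer[dest_reg] != -1)
            return false;

        sb->unit[unit].busy = true;
        sb->unit[unit].dest_reg = dest_reg;
        sb->reg_writer[dest_reg] = unit;
        return true;
    }

    /* Called when a unit writes its result back and frees up. */
    void complete(struct scoreboard *sb, int unit)
    {
        sb->reg_writer[sb->unit[unit].dest_reg] = -1;
        sb->unit[unit].busy = false;
    }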
I got the hardware guys to merge some of the conditions that caused a loop exit into condition groups that could be tested by other micro-instructions, until a couple of the loop's micro-instructions went away. Getting out quickly was important; sorting out why was less so.
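In modern terms the trick looked something like this (my C rendering, not the actual microcode): the individual wake-up reasons get OR-ed into a couple of group bits, the hot loop tests only the groups, and the slower path after the exit sorts out which member actually fired.

    #include <stdint.h>

    /* Hypothetical wake-up reasons, latched by the hardware. */
    #define COND_IO_INTERRUPT   (1u << 0)
    #define COND_TIMER          (1u << 1)
    #define COND_EXTERNAL       (1u << 2)
    #define COND_MACHINE_CHECK  (1u << 3)

    /* The merge: many reasons collapse into two group bitsets. */
    #define GROUP_DISPATCH  (COND_IO_INTERRUPT | COND_TIMER | COND_EXTERNAL)
    #define GROUP_INTERNAL  (COND_MACHINE_CHECK)

    static volatile uint32_t pending;   /* stand-in for the condition latches */

    void wait_state(void)
    {
        /* Hot loop: test only the merged groups; getting out fast matters. */
        while (!(pending & (GROUP_DISPATCH | GROUP_INTERNAL)))
            ;                            /* nothing to do */

        /* Slow path: now sort out exactly why we left the loop. */
        if (pending & GROUP_INTERNAL) {
            /* handle the critical internal condition */
        } else if (pending & COND_IO_INTERRUPT) {
            /* resume dispatching after an I/O interrupt */
        } else if (pending & COND_TIMER) {
            /* timer */
        } else {
            /* external interrupt */
        }
    }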