A New Golden Age for Computer Architecture (acm.org)
231 points by matt_d on Jan 29, 2019 | 84 comments



> Today, 99% of 32-bit and 64-bit processors are RISC

> Concluding this historical review, we can say the marketplace settled the RISC-CISC debate; CISC won the later stages of the PC era, but RISC is winning the post-PC era

It is clear that their assessment is right, but isn't the 99% number too high? Servers, laptops and desktops still run x86, which is CISC (unless you are counting x86 as RISC based on microcode).

> Many researchers assume they must stop short because fabricating chips is unaffordable. When designs are small, they are surprisingly inexpensive.

> High-level, domain-specific languages and architectures, freeing architects from the chains of proprietary instruction sets, along with demand from the public for improved security, will usher in a new golden age for computer architects. Aided by open source ecosystems, agilely developed chips will convincingly demonstrate advances and thereby accelerate commercial adoption.

It will be interesting to see if manufacturing also gets open sourced. There already seems to be a project attempting this: http://libresilicon.com/


There's an x86 processor in your desktop, but there are many more RISC processors doing things like controlling your hard drive. If you buy an AMD processor, there's even an ARM core inside the x86 processor: the Platform Security Processor.

Add in all the microwaves, routers, the many processors in your car, and so on, and 99% seems a bit high to me, but not unreasonable.


Seems disingenuous though, as the two markets have totally different optimization points and volumes. More like RISC has won the low-cost, low-power market, and CISC-RISC hybrids won the performance market, or something like that.

I'm sure you can think of a million other markets like this: sedans vs. race cars, fighter jets vs. puddle jumpers, etc.


It's really a lot more than two markets. The high-end communications systems that use MIPS have almost as high performance demands as a desktop or server, and currently the highest-performance core is a POWER9. But on the other hand, you have a few places where x86 Atoms have made inroads as embedded cores, as in the UR robot arms I work with.

But really, I think we do overemphasize the importance of x86 because that's the architecture we have the most experience working with directly.


There's a really interesting market at the bottom as well: super-low-power CPUs, 8/16-bit stuff, not RISC at all (since DRAM itself is a huge power draw at that level), and so on. Fun stuff.


Smart TVs, Digital Cameras, Thermostats, Wristwatches, Phones, Tablets, Printers, Google Home devices, Amazon Echo devices, TV sticks (Chromecast, Firestick etc), Ring doorbells, security devices, etc, . . .

Anything (like a printer) with a web-based interface has a micro web server within the device.

There are VASTLY more ARM and other-architecture chips than there are x86/64 chips around you right now, possibly including the computer monitor you're reading this on and the desk phone in your office.

Just about anything that has any kind of screen with a menu system.


> Digital Cameras

The SD card has an ARM chip, usually, in addition to whatever is in the camera itself.


> agilely developed chips

I feel there's a long way to go there. The economic structure of the industry is against agility because "deployment" remains stubbornly expensive, and the product culture is also much more conservative.

> manufacturing also gets open sourced.

It's one of the most capital-intensive industries in the world, so I don't quite see how this would work. LibreSilicon is offering a 1000nm (not a typo) process.


Yeah, it's like open sourcing code for 3D immersive visualisation of deep-water seismic survey data, along with the detailed blueprints for the ROV and survey equipment. Awesome enough, sure, but pretty much useless to me. I can build the software like a boss on my laptop, but since I don't have access to any seismic survey data, a CAVE room to view it in, an oceanographic survey vessel, ROV and associated equipment, etc. that's as good as it gets. And, if I had all that awesome oil industry exploration stuff, I'd probably also have a team of engineers building me software anyway, so I don't need it.

My point? Open sourcing stuff with ridiculously huge capital requirements is not that useful, I guess?


It's useful for verifying the process is what it claims to be, and for debugging problems as a user of the process.


I assume they’re counting them as RISC cores plus microcode that makes them compatible with older x86 CISC designs. Modern x86 is RISC under the covers by any meaningful definition, and also shares pretty much all the problems.


I don't know why this gets thrown around, but it's completely wrong. RISC and CISC are about the instruction set architecture. As soon as you have crossed the decoder boundary, you are deep into microarchitecture territory, where the ISA is completely irrelevant because you can use the same techniques regardless of whether you have RISC or CISC.

Now, if one took what you said at face value, it would actually imply that RISC as an ISA is completely irrelevant, because it is possible to achieve the same benefits, or even beat it, despite CISC having the inefficiency of hard-to-decode instructions and the high cost of a conversion step in the microarchitecture. One suddenly realizes that this battle of ISAs is completely futile and that the secret sauce is in the microarchitecture, which is completely divorced from the ISA.

Here are some examples: ARM makes slow ARM chips. Apple makes fast ARM chips. AMD made slow x86 chips in the past but has now adopted a faster microarchitecture. Intel made Itanium, but the chips didn't have any sort of dynamic scheduling, so they couldn't deliver the promised performance gains.


Well, both Hennessy and Patterson have perpetuated the microcode/RISC myth, presumably because it suits the "RISC world domination" narrative.


ARM doesn't make any ARM chips, they just design them. Licensees make them. https://en.wikipedia.org/wiki/Arm_Holdings


Modern x86 is not RISC under the covers.

There are instructions combining a dozen math operations and RAM loads that on modern CPUs decode into just a single micro-op.

Example: https://www.felixcloutier.com/x86/vfmadd132ps:vfmadd213ps:vf... The AVX version computes x=a*b+pointer[i], for 8 independent SIMD lanes.
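
To give a sense of how much work that one instruction does, here's a minimal C sketch using the standard immintrin.h FMA intrinsic (the function name and explicit loads are my own illustration; a compiler targeting FMA will typically fold the load of c into the vfmadd memory operand):

    /* Roughly the work of a single FMA instruction: eight single-precision
       lanes of a*b + c, where c comes from memory.
       Build with something like `gcc -O2 -mavx -mfma`. */
    #include <immintrin.h>

    void fma_lanes(const float *a, const float *b, const float *c,
                   float *x, int i)
    {
        __m256 va = _mm256_loadu_ps(&a[i]);
        __m256 vb = _mm256_loadu_ps(&b[i]);
        __m256 vc = _mm256_loadu_ps(&c[i]);          /* the pointer[i] part */
        __m256 vx = _mm256_fmadd_ps(va, vb, vc);     /* per lane: a*b + c   */
        _mm256_storeu_ps(&x[i], vx);
    }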

If you call that amount of stuff in a single micro-op a “reduced instruction set”, I wonder what exactly the meaningful definition you referred to is.


x86 is not RISC by any meaningful definition. RISC vs. CISC has always been about architecture, not microarchitecture. Breaking down instructions into microinstructions is almost as old as the CPU itself and predates the RISC/CISC nomenclature by decades.


Many people describe the uOps inside a modern x86 as RISC-like and that's a good analogy. The internals really are much more similar to a RISC pipeline than older microcoded processors were. But it is still just an analogy because you're right about architecture versus microarchitecture.


But microcoding has existed forever, and while uops are arguably a very specialized form of microcoding, they are still not a new thing.

Also, I do not think uops are fixed size, as IIRC they can take a variable number of slots in the uop cache, and fixed-size instructions are pretty much one of the only two remaining differentiating RISC features. The internal x86 microarchitecture is also not load-store, the other remaining RISC feature, at least in the fused domain and, as far as I understand, in the uop cache.

So, even if we want to abuse the RISC term to describe the microarchitecture, I do not think it cleanly applies to the usual x86 implementations.

edit: this is a pet peeve of mine. It seems I have this discussion every 6 months on HN :)


Everything I've read about Intel uOps says that they're fixed size [1]. Now, the size isn't a power-of-two multiple of a byte like you'd see in a RISC design; I seem to recall some Intel architecture with 83-bit uOps? But it is fixed. And the uOp caches for both Intel and AMD are fixed size: Haswell stores 1.5k uOps [2] and Zen 2k [3], for instance.

But the important thing is that uOps are much higher level than microcode instructions. Except for the odd encoding size, they would make a lot of sense as an early RISC ISA. Now, they expose a lot of the odd corner cases of the underlying architecture in a way that no modern ISA would, but the original Berkeley RISC had branch delay slots and followed the philosophy that you'd just recompile the code when the ISA changed.

I'm at the edge of my knowledge here, but I understand that microcoded instructions would tend to be much lower level, being things like "read from memory into such-and-such an internal buffer". By contrast, uOps do specify registers or constants, though they do so (post-rename) in terms of physical rather than architectural registers. But the decision on whether to get those arguments from the physical register file or the bypass network is still made further down the pipe, as with a RISC processor.

Is the analogy perfect? No, of course not. No analogy ever is. But I do think it illuminates more than it misleads for people learning about the evolution of processors - just as long as people can keep architecture and micro-architecture straight.

[1] https://en.wikichip.org/wiki/micro-operation for instance.
[2] https://www.realworldtech.com/haswell-cpu/2/
[3] https://en.wikichip.org/wiki/amd/microarchitectures/zen%2B#M...


FWIW, Agner says that if a uop has multiple constants (for example an address and an immediate), it will borrow up to 32 bits from the next uop cache slot.

Encoding constants in the instructions themselves is a very non-RISC thing BTW.


> Many people describe the uOps inside a modern x86 as RISC-like and that's a good analogy.

Not really. Micro-ops are typically very large (100+ bits wide), where each bit can be thought of as directly controlling a specific function in an EU. They can do things in parallel; the frontend may emit only one uop for more than one ISA instruction; they can contain constants; and they're kind of variable-length in some microarchitectures. Overall they're very un-RISC-y.

Overall the whole RISC/CISC debate is pretty much meaningless and has been for decades. Many folks superimpose their own superstitions about unrelated issues (e.g. "PC server" vs "UNIX server" seems a popular one), but at the end of the day pretty much all high-performance cores look fairly similar, regardless of ISA.


Compared to other CISC CPUs designed at the same time as the x86 (VAX, 68k, 32k, etc.), x86s ARE positively RISCy: the instruction set (with one or two minor exceptions, such as push) only has instructions with one memory address, which makes exception handling/restart (paging code) simple and makes instructions easy to break into simple uOps (one trip to the TLB for protection checking; VAXes could potentially do 27, and guaranteeing that instructions could make progress in all situations was problematic).


I guess if you take into account micros embedded in things like washing machines, kitchen appliances, smart home devices, cars, etc... the figure could get close to 99%.


Not to mention laptops and desktops are getting RISC-based co-processors these days (e.g. Apple's T2 is present in the current iMac Pro, Macbook Pro, Mac Mini and MBA).

In fact, now that I think about it, pretty much every SSD contains a controller SoC which is almost certainly RISC, so even in a standard laptop you get at worst a 1:1 ratio of RISC:CISC. And modern GPUs (including IGPs) would be RISC in a VLIW configuration; if each computation core is counted as a RISC chip, the ratio quickly gets ridiculous.


> even in a standard laptop you get at worst a 1:1 ratio of RISC:CISC

If you manage to find all the processors I bet it's more like 10:1. To a first approximation everything on the PCI and USB buses will have its own processor, even if it doesn't accept external firmware.


The article specifically points out that Intel/AMD took on board all the lessons from RISC as they evolved the x86 series and now it's pretty much a RISC architecture too.


Most RISC architectures have fixed-length instructions (16-bit, 32-bit, 64-bit, 128-bit), whereas most CISC instructions are variable-size. So x86 is not a RISC architecture...


I remember being excited about The Machine [1] from HP:

"The Machine will be a complete replacement for current computer system architectures. There will be a new operating system, a new type of memory (memristors), and super-fast buses/peripheral interconnects (photonics)."

"HP says it will commercialize The Machine within a few years, “or fall on its face trying.”"

It seems later happened...

[1] https://www.extremetech.com/extreme/184165-hp-bets-it-all-on...


While HP's "The Machine" specifically appears never to have gone beyond the marketing phase, I think your excitement about the underlying technologies (silicon photonics combined with nonvolatile memory combined with resource disaggregation) is valid [0], and they are probably coming.

You can buy NVM today [1], and building systems for resource disaggregation is an active area of work [2].

[0] https://www.usenix.org/conference/fast14/technical-sessions/...

[1] https://www.intel.com/content/www/us/en/products/memory-stor...

[2] https://www.usenix.org/system/files/osdi18-shan.pdf


HP long ago rebranded "The Machine" to be just a powerful regular computer, since they haven't been successful in making viable memristors at scale.

So they will deliver it, it just won't be anything like what they promised.


Later always happens.


Sorry for the typo :)


I think back to the old Cat Stevens song "I Want To Live In A Wigwam", except now I think maybe one day I'd like to live in a Faraday cage.

Computers, like guns, drugs, and any other invention of man, are not inherently evil. All these things can be used for good or evil. Unfortunately, the fly in the ointment is human nature. With the convergence of cheaper but increased computing power and the monetization of personal information, I fear what the future holds. I hope I'm wrong, but it looks to me like humanity is doomed to forever live in a state of total surveillance and control. We're seeing it happening already. Just the other day I saw an article that said Sweden is going to tax people on the miles they drive. The very next day I saw another article saying Los Angeles is planning to do the same thing.

Sigh... I'm glad I'm old.


Every country already taxes people on the miles they drive, as every country has a fuel tax.


A fuel tax isn't effectively a tax on mileage, nor is it an effective tax for offsetting infrastructure costs. An increase in axle weight causes a drastic, nonlinear increase in damage to infrastructure (the American Association of State Highway and Transportation Officials uses a "fourth-power" rule). Increased fuel economy also has unintended consequences, leading to fuel taxes under-funding infrastructure repairs.

You essentially have heavy-duty or commercial trucks that don't "pay their weigh" and hyper efficient hybrids and EVs that don't "pay their way" when it comes to infrastructure costs.

So a fuel tax isn't equivalent to a tax on "miles driven". It might be better suited to offsetting air pollution (pure conjecture on my part), but if you wanted everyone to be responsible for the damage they cause to public infrastructure, you would need some weird axle-weight-per-mile-driven tax calculus.
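
As a back-of-the-envelope illustration of that fourth-power rule (the axle loads below are made up for the example, not official figures):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* Illustrative axle loads, in pounds. */
        double car_axle   = 2000.0;    /* roughly a 4,000 lb car on two axles */
        double truck_axle = 18000.0;   /* a typical loaded truck axle         */

        /* Fourth-power rule: pavement damage scales with (axle load)^4. */
        double relative = pow(truck_axle / car_axle, 4.0);
        printf("One truck axle ~= %.0f car axles of pavement damage\n", relative);
        return 0;   /* prints roughly 6561 */
    }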

Or we could decide major roads/infrastructure are an economic public good worthy of paying taxes on.


The difference is pay at the pump or pay via what your car's embedded GPS is reporting. GPS also allows them to know when and where you were speeding, and since Sweden went cashless, they can simply deduct the fines from your bank account. Compliance is irrelevant.

If having a GPS device on your person becomes law, then avoiding the tax by riding a bicycle, walking, riding a horse etc is defeated.


Could you link through to the article you read this in, because I live in Sweden, own a car, and have not heard or seen anything about this.


This fear-mongering has gone on since the days of Henry Ford. Technology has been a boon for many, many people. I can't imagine going back to a world where many more people died and had no access to information.


Indeed, since the days of the Gutenberg press. All new tech heralds the end of times.


I'm trying to think of some neat computer architectures out there.

Chuck Moore's GreenArrays chips are kinda cool, and so is the Parallella board.


Transputer. Hardware multitasking and link-communication and a very compact instruction set.

https://en.wikipedia.org/wiki/Transputer

https://en.wikipedia.org/wiki/XCore_Architecture

SOAR (Smalltalk On A RISC), though the conclusion there was mostly that a plain old RISC will do. I wonder if that is still true today.

Rekursiv OO computer https://en.wikipedia.org/wiki/Rekursiv

NEC dataflow processor. https://books.google.de/books?id=qRrlBwAAQBAJ&pg=PA152&lpg=P...


Never heard of SOAR. It would be pretty sweet though to have a powerful & multi-core chip running a Smalltalk OS that could do something with all the cores. I'd also like to see kOS from Arthur Whitney if he ever finishes it, although I'd never be able to afford any of their products.


https://www2.eecs.berkeley.edu/Pubs/TechRpts/1986/CSD-86-287...

https://www.deepdyve.com/lp/association-for-computing-machin...

http://digitalassets.lib.berkeley.edu/techreports/ucb/text/E...

Looking at the results, they say that hardware tag-checking for integer arithmetic and register windows for fast method calls were the two most important features of the design, nearly doubling performance.

I wonder if that still holds today, with the memory wall so dominant that CPUs tend to be stalled quite a bit (therefore enough time to do tag checking in software).


Nowadays the thinking seems to be that register windows were a mistake (e.g. in SPARC), and newer designs such as RISC-V or Aarch64 don't do it.


I don't think I'd call the transputer "hardware multitasking" so much as "integrated communications network."

For a time the T800 was the top of the FP pile. Not for long, though, and then the long, long, long wait for the disappointing T9k doomed the whole architecture.


>I don't think I'd call the transputer "hardware multitasking"

"A Transputer had a number of simple operating system functions built into the hardware. These included hardware multitasking with foreground and background priority levels, hardware timers, and hardware time-slicing of background tasks."

The Hardware and Software Architecture of the Transputer -- https://archive.org/details/Xputer

There are special instructions to start and end a process, and the fact that it was a stack machine meant context switches were extremely fast: almost no registers to save/restore.

> so much as "integrated communications network."

It had both, and both were integrated. IIRC, the instructions to send/receive on the links were integrated with the multitasking hardware.

"The first 16 'secondary' zero-operand instructions (using the OPR primary instruction) were:

   Mnemonic Description
   REV      Reverse – swap two top items of register stack
   LB       Load byte
   BSUB     Byte subscript
   ENDP     End process
   DIFF     Difference
   ADD      Add
   GCALL    General Call – swap top of stack and instruction pointer
   IN       Input – receive message
   PROD     Product
   GT       Greater Than – the only comparison instruction
   WSUB     Word subscript
   OUT      Output – send message
   SUB      Subtract
   STARTP   Start process
   OUTBYTE  Output byte – send one-byte message
   OUTWORD  Output word – send one-word message"
https://en.wikipedia.org/wiki/Transputer

Also, you could designate memory locations as local communications channels, and the same instructions would work. So the same binary could run locally or distributed.
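
A hypothetical C sketch of that idea (invented names, nothing like real transputer or occam code): one send/receive interface where a channel can be either an in-memory word or a hardware link, so the calling code does not change.

    #include <stdio.h>

    typedef struct {
        int is_hw_link;   /* 1 = one of the hardware links, 0 = in-memory */
        int word;         /* value handed off in the in-memory case       */
        int link_id;      /* which link to use in the hardware case       */
    } chan_t;

    void chan_out(chan_t *c, int value)      /* plays the role of OUT */
    {
        if (c->is_hw_link)
            printf("(would transmit %d on link %d)\n", value, c->link_id);
        else
            c->word = value;
    }

    int chan_in(chan_t *c)                   /* plays the role of IN */
    {
        if (c->is_hw_link) {
            printf("(would receive on link %d)\n", c->link_id);
            return 0;
        }
        return c->word;
    }

    int main(void)
    {
        chan_t local = { .is_hw_link = 0 };
        chan_out(&local, 42);
        printf("received %d over a local channel\n", chan_in(&local));
        return 0;
    }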


This manual is a great find! Thank you.


I need a group for this... I love Forth array CPUs.


I just wish I could figure out how to use the GreenArrays chips.


> As the architects of the Motorola 68000 and iAPX-432 both learned, the marketplace is rarely patient.

I hope this won't mean the Mill has no chance of success, or at least of influencing the industry for the better. Every video seems to introduce interesting new ideas, or new takes on old ideas [0].

[0] https://millcomputing.com/


I'd hardly call this a "Golden Age", precisely because of the shortcomings mentioned in the article. Regardless of how you may re-architect a processor you will still end up hitting CMOS limitations in time.

By contrast, my fantasy "Golden Age" would be multiple viable replacements for CMOS actively being used in a variety of processors.


Modern architectures have various inefficiencies that would be solved if we shifted focus to more parallelism and less complicated processors (maybe something like amorphous computing). This requires a different programming model from the sequential, imperative paradigm, though.


What specifically is "wrong" with CMOS? Not small enough for you yet? :)


Reaching "the end of Moore's Law" is a super big deal, and I still don't understand why people aren't worried about it more in general.

It will cause a crash in the entire tech industry, which will throw the global economy into a depression.


I'm somewhat comforted by the fact that there are fundamental physical limits to computational efficiency[1] which we aren't at yet. MOSFETs look to be just about played out but there are tons of other potential computational substrates. Carbon nanotube transistors, photonics, nano-rod logic, magnetic coupling, DNA computing, etc. We're in for a big interregnum before we get a new paradigm working better than our existing one but there's no reason to believe that progress in computation is ending permanently now.

[1] https://en.wikipedia.org/wiki/Landauer%27s_principle
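
For a sense of scale, the Landauer limit at room temperature (taking T = 300 K) works out to roughly:

    E_{\min} = k_B T \ln 2
             \approx (1.38\times10^{-23}\,\text{J/K})\,(300\,\text{K})\,(0.693)
             \approx 2.9\times10^{-21}\,\text{J per bit erased}

which is many orders of magnitude below what today's CMOS logic dissipates per switching event, so there is plenty of theoretical headroom even if the practical path there is unclear.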


All of the techs you have mentioned are potentially great, but none are close to ready for widescale production.


Right, I'm expecting a decade long interregnum at the least.


Not sure why the downvotes, whether or not you take this outcome as a given. CMOS process scaling has been a uniquely powerful driver of the tech industry since circa the 1990s. It’s not obvious how you lose that and don’t have an impact.

We’ve gotten a bit of a reprieve because things like GPUs turn out to be good architectures for many compute intensive workloads.


How will reaching the end of Moore’s law cause a crash like this?

Do you mean that future requirements will outstrip capacity? I don’t think this has been true for some time; we run a lot more systems and tend to scale horizontally. Doubling compute power every n period isn’t a hard requirement, IMO.


If we ever have an industrial revolution or massive productivity gains again, it will most likely involve autonomous machines for things like farming, construction, delivery, transport, etc. All of these things depend on increases in power efficiency or performance. It is possible that we are one or two "doublings" away from achieving this in a mobile package, and then suddenly Moore's law stops.

For example, Tesla is betting on conventional cameras and trying to overcome their shortcomings with computationally intensive machine vision. Tesla's custom chips allow them to have more processing power, which means they can install more high-resolution cameras and other sensors.


Without higher-throughput systems (including but not limited to CPUs), you can to some degree just throw more servers at the problem. But now you’ve doubled (maybe more like tripled) the cost to get 2x the performance. And some applications, like mobile and sensors, depend on miniaturization for a given capability.


I was expecting an answer more along the lines of gallium arsenide or SiC process technology, but OK!

It was always going to end at some point, because we're down to just above 100 atoms of gate thickness. What we're seeing is not an "end" but a soft landing that really started some years ago with massively multicore processors. The industry has started adapting to horizontal scaling rather than "free" process improvements. There is still an awful lot of low-hanging fruit on the software side for performance improvements, but that's harder because it involves removing or adapting abstraction layers.


The end of Moore's Law is a certainty regardless of technology. Something other than CMOS might yield smaller and more power efficient features up to a point, and beyond that point we'd be back where we are today with CMOS. We might as well start dealing with the problem more seriously, because each attempt to kick the can down the road gets more and more expensive.


> It will cause a crash in the entire tech industry, which will throw the global economy into a depression.

Why?


The valuation of all the tech companies is based on future growth, meaning ever-increasing sales.

Well, what happens when next year's smartphone really isn't any better or cheaper than last year's smartphone? Sure, there will always be some business because people drop their phones all the time, but sales won't be as high without upgrades driving it.

Lots of people (investors) will be unhappy with the tech sector at that point.


Next year's tech will still be cheaper. If they don't have to retool for the next gen because they literally can't, then we can enjoy the economies of scale in producing the same tech over a longer period of time.

I also dispute the scope you assign to this "upgrade" cycle. It's a huge overstatement to claim that the whole tech sector would "collapse". The tech sector is far larger than phones and personal computers.


> Next year's tech will still be cheaper. If they don't have to retool for the next gen because they literally can't, then we can enjoy the economies of scale in producing the same tech over a longer period of time.

Slightly cheaper, maybe, as the fab equipment depreciates.

> I also dispute the scope you assign to this "upgrade" cycle. It's a huge overstatement to claim that the whole tech sector would "collapse". The tech sector is far larger than phones and personal computers.

It used to be that new hardware would drive new software sales, and vice versa. You'd buy a new PC because Windows XP ran too slow on your old one. And a couple of years later, some awesome game would come out, and you'd need to buy a new PC to play it.

Expansion, not replacement (of broken systems) is what drove the PC segment. It was the foundation of the growth that affected everything that used or could benefit from PCs. And we've seen that with the mobile segment.

And we may yet see that with the VR/AR segment, but that's going to be a tough hill to climb if we can't count on continuing IC improvements.


> Slightly cheaper, maybe, as the fab equipment depreciates.

Sure, but the same equipment would also be cheaper to replace than the capital required for the next generation fab, for the same reasons.

> Expansion, not replacement (of broken systems) is what drove the PC segment. It was the foundation of the growth that affected everything that used or could benefit from PCs. And we've seen that with the mobile segment.

Expansion into new sectors, not expansion into the same sectors. The same will still happen. Solutions will become more customized rather than remain general purpose. There remain considerable gains in parallel computation for instance (GPU), and ASICs will become the new hotness (again).

Furthermore, if clients cease to expand, then more computation will happen on server infrastructure. Your fat clients will become more thin clients again, repeating the same cycle that has happened multiple times so far.

You're also neglecting the investment in infrastructure (data centers, networks) that will continue to grow, if not accelerate. Current hardware is more than sufficient for most of this.


Reaching "the end of Moore's Law" has so far ended up being more of a problem for Moore's company than for anyone else.

Between GPUs, more power efficient designs (due to heavy mainstream interest in mobile technology), more work put into algorithmic efficiency, and promising early developments in quantum computing, it appears the focus has shifted away from relying on more transistors per square inch for ensuring the industry's future growth.


Interesting article with a lot of great historical context. I think a significant part of the tl;dr is that the end of Dennard Scaling and trouble with Moore's law may mean that the best architecture will finally win out, rather than just the one with the best semiconductor design team behind it.

I'm a bit skeptical of the promise of DSAs, though it does seem we're already going that way. Curious what others think on that point.


The article didn't dwell on dark silicon, but the idea is that as you can fit more structures on a chip yet can't light them all up for power-dissipation reasons, it makes sense to include more and more specialized structures you only use for some purposes. With the rise of GPUs and specialized co-processors for things like media decoding and encryption, we're certainly well on our way there. It would be nice to have an intermediate between the two in more places, like the Hexagon VLIW DSP that Qualcomm puts in its Snapdragon SoCs. And of course there are the even more specialized matrix convolution processors that everybody is designing and NVidia is now putting in its GPUs.

I think the real question is if it will make sense to have FPGAs in wider use. Certainly not until the programming model improves...


I agree; my concern is where DSAs go in a few generations. How many efficiency/performance gains can be had before hitting the same limits hampering general-purpose processors today?


A DSA is only going to be an order of magnitude or two more performant than a general-purpose computer on any given process node, so they'll certainly slow down at the same time general-purpose CPUs slow down. The thing is, it makes more sense to design and include them in a world where transistors are potentially getting cheaper without getting better, because a finite amount of engineering effort will be stretched across a longer timeframe.


Does that also imply a potential power savings of an order of magnitude or two? Because then it becomes a lot more obvious that the demand for them will be there in mobile phones and laptops at least.


Yes, it does. And that's a big part of the push for dedicated co-processors that we've seen in SoCs.


Reading this, I had a flash of a question:

RISC makes a lot of sense given compilers and many other possible optimizations?

However, there are increasing trends to put functions right into silicon too. Those frequently replace many instructions.

Will ML tech somehow make better sense of those things, and CISC in general?


>Using parallel loops running on many cores yields a factor of approximately 7.

That depends on the number of cores.
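
For reference, the "parallel loops" the article is talking about are the kind of thing below: a minimal OpenMP sketch (an illustration of the technique, not the article's code), whose speedup indeed depends on how many cores the machine has.

    #define N 1024
    static double A[N][N], B[N][N], C[N][N];

    /* A serial triple loop turned into a parallel loop: iterations of the
       outer i loop are split across whatever cores are available.
       Build with something like `gcc -O3 -fopenmp`; without the flag the
       pragma is ignored and the loop runs serially. */
    void matmul_parallel(void)
    {
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++) {
                double sum = 0.0;
                for (int k = 0; k < N; k++)
                    sum += A[i][k] * B[k][j];
                C[i][j] = sum;
            }
    }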


It really is a fascinating time to be alive and aware.

The power of computing is redefining civilization, humanity.


I am getting an application error while opening the link. Is anybody else facing the same issue?


A lot of what they suggest seems to be better solved in the domain of compilers and on-the-fly microcode translation.


– As announced by Intel on the launch of the new Itanium architecture.


oh snap


Slow clap


Quite possibly! Or maybe at install or link time. Or maybe we're looking at a future where almost all code goes through a JIT engine and you've only got a few normal cores that run the OS that manages everything. Or possibly everything will be JavaScript [1].

[1]https://www.destroyallsoftware.com/talks/the-birth-and-death....



