IBM flirted with Itanium. They spent a lot of money on software support for Itan...

walrus01 · on Oct 2, 2019

I think theregister and theinquirer helped steer a lot of people away from itanium in that era, showing the $/performance that could be achieved with a larger number of much less costly dual socket Xeon and opteron boxes. People who really wanted to centralize everything on one godly giant machine went to things like zseries mainframe, not itanium. Everyone else started running Linux on x86...

hapless · on Oct 2, 2019

I don't think the press had anything to do with it.

Customers got engineering samples in their hands and they were very, very slow.

chrisseaton · on Oct 2, 2019

Everyone was promised sufficiently-smart compilers would make great use of them, but that never works out as a plan.

jacobush · on Oct 2, 2019

They very much underestimated the complexity of such compilers. But the concept is fine. Itanium had a lot of raw power, but you had to do demoscene level trickery to get that performance.

Tuna-Fish · on Oct 2, 2019

> But the concept is fine.

The concept is not fine. Itanium was predicated on saving transistors in the OoO machinery, and spending those on making the machine wider and thus performance. However, it turns out that without any OoO, the machine is terrible at hiding memory latency, and the only way to get this back was to ship it with heroically large and low latency caches. And implementing those caches was harder and many times more expensive than just using OoO.

In the end, Itanium saved transistors and power from one side, only to have to spend much more on another side to recoup even part of the performance that was lost.

notacoward · on Oct 2, 2019

The concept was fine based on knowledge available at the time. Processor design occurs on a long cycle, sometimes requiring some guesswork about surrounding technology. The issue of having to hope/guess that compilers would figure out how to optimize well for "difficult" processors had already arisen with RISC and its exposed pipelines. In fact, reliance on smart compilers was a core part of the RISC value proposition. It had worked out pretty well that time. Why wouldn't it again?

VLIW had been tried before Itanium. I had some very minimal exposure to both Multiflow and Cydrome, somewhat more with i860. The general feeling at the time among people who had very relevant knowledge or experience was that compilers were close to being able to deal with something like Itanium sufficiently well. Turns out they were wrong.

Perhaps the concept is not fine, but we should be careful to distinguish knowledge gained from hindsight vs. criticism of those who at least had the bravery to try.

kjs3 · on Oct 3, 2019

Perhaps the concept is not fine, but we should be careful to distinguish knowledge gained from hindsight vs. criticism of those who at least had the bravery to try.

So how many times do you have to fail before being brave is just a bad business decision. The "concept" wasn't fine for Multiflow or the i860 (I used both, and would call it terrible for the i860). It didn't work for Cydrome. Trimedia is gone. Transmeta flamed out. There's, what, a couple of DSP VLIW chips that are actually still sold?

But, hey, let's bet the company on Itanium and compilers that will be here Real Soon Now. I remember the development Merced boxes we got.

The general feeling at the time among people who had very relevant knowledge or experience was that compilers were close to being able to deal with something like Itanium sufficiently well.

That's revisionism. There was a general feeling we were getting good at building optimizing compilers, but I don't recall any consensus that VLIW was the way forward. The reaction to Itanium was much less than universally positive, and not just from the press.

Turns out they were wrong.

Very, very wrong. Again.

notacoward · on Oct 3, 2019

> how many times do you have to fail before being brave is just a bad business decision

That's a very good question. More than once, certainly. How many times did Edison fail before he could produce a working light bulb? How many times did Shockley/Bardeen/Brattain fail before they could produce a working transistor? Even more relevantly,how many ENIAC-era computer projects failed before the idea really took off? Ditto for early consumer computers, mini-supers, etc. Several times at least in each case, sometimes many more. Sure, Multiflow and Cydrome failed. C6X was fairly successful. Transmeta was contemporaneous with Itanium and had other confounding features as well, so it doesn't count. There might have been a couple of others, but I'd say three or four or seven attempts before giving up is par for the course. What kind of scientist bases on a conclusion on so few experiments?

> The reaction to Itanium was much less than universally positive, and not just from the press.

Yes, the reaction after release was almost universal disappointment/contempt, but that's not relevant. That was after the "we can build smart enough compilers" prediction had already been proved false. During development of Itanium, based on the success of such an approach for various RISCs and C6X, people were still optimistic. You're the one being revisionist. It would be crazy to start building a VLIW processor now, but it really didn't seem so in the 90s. There were and always will be some competitors and habitual nay-sayers dumping on anything new, but that's not an honest portrayal of the contemporary zeitgeist.

jacobush · on Oct 2, 2019

Mhm, yeah, it depends on how you look at I guess. I meant if you fine tuned your code to take advantage of the strengths, it could be very good for those workloads. But maybe they built something which from afar, if you squint a lot, has more of the strengths of a programmable GPU today, while they pitched it as a general CPU.

chrisseaton · on Oct 2, 2019

> But the concept is fine.

Is it? What’s the point of a processor that we don’t know how to build compilers for? We still don’t know how to schedule effectively for that kind of architecture today.

notacoward · on Oct 2, 2019

The point is that they didn't know it's impossible. They had good reasons for believing otherwise (see my other comment in this thread) and it's the nature of technological progress that sometimes even experts have to take a chance on being wrong. Lessons learned. We can move on without having to slag others for trying.

bluGill · on Oct 2, 2019

So we would never build a computer because back in 1950 we didn't know how to make compilers, only raw bytes of machine code (we couldn't even "compile" assembly language). Life somethings require that you create a prototype of something that you think will work to see if it really does in the real world.

jcranmer · on Oct 2, 2019

But with 1950s-era machines, it was expected that programmers were capable of manually scheduling instructions optimally, because compilers simply didn't exist back then.

VLIW architectures are often proposed to simplify the superscalar logic, but the problem with VLIW is that it forces a static schedule, which is incompatible with any code/architecture where the optimal schedule might be dynamic based on the actual data. In other words, any code that involves unpredictable branches, or memory accesses that may hit or miss the cache--in general CPU terms, that describes virtually all code. VLIW architectures have only persisted in DSPs, where the set of algorithms that are trying to be optimized is effectively a small, closed set.

chrisseaton · on Oct 2, 2019

> So we would never build a computer because back in 1950 we didn't know how to make compilers

No that's different - the big idea with the Itanium was specifically to shift the major scheduler work to the compiler. We didn't build the first computers with the idea we'd build compilers later.

notacoward · on Oct 3, 2019

But we did build an awful lot of RISC machines with exactly that idea. And it worked.

hybrids · on Oct 2, 2019

It begs the question whether or not any current compiler optimizations for a new theoretical VLIW-ish machine (Mill?) would prove to be an effective leg-up on the Itanium.

chrisseaton · on Oct 2, 2019

Being a bit more charitable, I think the problem is that people look at VLIW generated code and think 'wow that's so wasteful look at all the empty slots' without realising those 'slots' (in the form of idle execution units pipeline stages) are empty in OOO processors right now anyway. The additional cost is in the ICACHE, as already described.

Also, these days you would pretty much just need to fix LLVM, C2, ICC, and the MS compiler, and almost everyone would be happy.

acje · on Oct 2, 2019

The focus on vertical scale servers and FP performance was a mistake. Had they focused on single socket servers with INT performance the history might have been different. Also today’s compilers are much more capable so maybe Itanium was simply too early.