One difference is that, according to the article, Intel actually learned quite a bit technically from the 432 even though it was a commercial flop. It's hard to see much of a silver lining in IA64/Itanium for either Intel or HP--or, indeed, for any of the other companies that wasted resources on Itanium, if only because they felt they had to cover their bases.
A lot of RISC CPU arches that were popular in the 1990s declined because their backers stopped investing and bet on switching to IA64 instead. Around the year 2000, VLIW was seen as the future, and all the CISC and RISC architectures were considered obsolete.
That strategic failure by competitors allowed x86 to grow market share at the high end, which benefited Intel more than the money lost on Itanium.
Sun didn't slow down on UltraSPARC or make an Itanium side bet. IBM did (and continues to) place their big hardware bet on Power--Itanium was mostly a cover-your-bases thing. I don't know what HP would have done--presumably either gone their own way with VLIW or kept PA-RISC going.
Pretty much all the other RISC/Unix players had to go to a standard processor; some were already on x86. Intel mostly recovered from Itanium specifically but it didn't do them any favors.
Actually, they did. Intel promised an aggressive delivery schedule, a performance ramp, and high absolute performance, and the industry took it hook, line, and sinker. Meanwhile, AMD decided not to limit 64-bit to the high end and brought out x86-64.
Sun did an IA64 port of Solaris, which is definitely an Itanium side bet.
HP was involved in the IA64 effort and was definitely planning on replacing PA-RISC from day 1.
> HP was involved in the IA64 effort and was definitely planning on replacing PA-RISC from day 1.
As I remember it, and https://en.wikipedia.org/wiki/Itanium agrees, Itanium originated at HP. So yes, a replacement for PA-RISC from day 1, but even more so...
This isn't really true. IBM/Motorola need to own the failure of POWER and PowerPC, and MIPS straight-up died on the performance side. Sun continued with UltraSPARC.
It wasn't that IA64 killed them; it's that they were already getting shaky, and IA64 appealed _because_ of that. Plus the lack of a 64-bit x86.
It's simply economics: Intel had the volume. Sun and SGI just didn't have the economics to invest the same amounts, and they weren't chip companies either; both either didn't invest enough in chip design or invested it wrongly.
Sun spent an unbelievable amount of money on dumb-ass processor projects.
Towards the end of the 90s all of them realized their business models would not do well against Intel, so pretty much all of them were looking for an exit, and the IA64 hype basically killed most of them. Sun stuck it out with SPARC, with mixed results. IBM's POWER continues, but in a thin slice of the market.
Ironically, there was a faction within Digital and Intel who thought that Alpha should be the basis of 64-bit x86. That would have made Intel pretty dominant: Alpha (maybe a TSO version) with a 32-bit x86 compatibility mode.
Look closely at AMD's designs (and staff) of the very late 90s and early 2000s, and/or at all modern x86 parts, and you'll see that, more or less, that's what happened--just not with an Alpha mode.
Dirk Meyer (co-architect of the DEC Alpha 21064 and 21264) led the K7 (Athlon) project, and it ran on a licensed EV6 bus borrowed from the Alpha.
Jim Keller (co-architect of the DEC Alpha 21164 and 21264) led the K8 (first-gen x86-64) project, and there are a number of design decisions in the K8 evocative of the later Alpha designs.
The vast majority of x86 parts since the (NexGen Nx686 which became) AMD K6 and Pentium Pro (P6) have been internal RISC-ish cores with decoders that ingest x86 instructions and chunk them up to be scheduled on an internal RISC architecture.
It has turned out to be sort of a better-than-both-worlds thing, almost by accident. A major part of what did in the VLIW-ish designs was that "you can't statically schedule dynamic behavior," and a major problem for the RISC designs was that exposing architectural innovations required changing the ISA and/or memory behavior in visible ways from generation to generation, interfering with compatibility. So the RISC-behind-x86-decoder designs get to follow the state of the art, changing whatever they need behind the decoder without breaking compatibility, AND they get to have the decoder do the micro-scheduling dynamically.
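To make the trick concrete, here's a toy sketch in C of the "crack one CISC instruction into micro-ops" idea -- the type and function names are made up for illustration and look nothing like any vendor's real encodings:

    #include <stdio.h>

    /* One x86 read-modify-write instruction, e.g.
           add dword ptr [counter], 1
       is typically cracked by the decoder into three RISC-like
       micro-ops that the out-of-order core schedules independently. */

    typedef enum { UOP_LOAD, UOP_ADD, UOP_STORE } uop_kind;
    typedef struct { uop_kind kind; const char *dst, *src; } uop;

    /* Hypothetical decoder step for a memory-destination add. */
    static int crack_add_mem_imm(const char *addr, uop out[3]) {
        out[0] = (uop){ UOP_LOAD,  "tmp", addr  };  /* tmp <- [addr]    */
        out[1] = (uop){ UOP_ADD,   "tmp", "imm" };  /* tmp <- tmp + imm */
        out[2] = (uop){ UOP_STORE, addr,  "tmp" };  /* [addr] <- tmp    */
        return 3;
    }

    int main(void) {
        uop uops[3];
        int n = crack_add_mem_imm("counter", uops);
        for (int i = 0; i < n; i++)
            printf("uop %d: kind=%d %s <- %s\n", i, uops[i].kind,
                   uops[i].dst, uops[i].src);
        return 0;
    }

None of those micro-ops is visible in the ISA, which is exactly why the internals can change every generation without breaking anything.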
I'm certainly not going to claim that x86, with its irregularities and extensions of extensions, is in _any way_ a good choice for the lingua franca instruction set (or IR, in this way of thinking). Its aggressively strictly ordered memory model likely even makes it particularly unsuitable; it just had good inertia and an early entrance.
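The ordering point is easy to see with the classic message-passing litmus test. A minimal sketch, using volatile accesses with deliberately no barriers (illustrative, not correct portable synchronization):

    #include <pthread.h>
    #include <stdio.h>

    static volatile int data, flag;

    static void *producer(void *arg) {
        data = 42;
        flag = 1;        /* x86 (TSO): stores become visible in order */
        return arg;
    }

    static void *consumer(void *arg) {
        while (!flag)
            ;            /* spin until the flag store is visible */
        /* Always prints 42 on x86; on a weakly ordered machine
           (classic ARM, POWER) printing 0 is a legal outcome
           unless you add barriers. */
        printf("data = %d\n", data);
        return arg;
    }

    int main(void) {
        pthread_t p, c;
        pthread_create(&p, NULL, producer, NULL);
        pthread_create(&c, NULL, consumer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return 0;
    }

So anything that wants to run existing x86 binaries faithfully has to reproduce that strictness on every ordinary load and store, which is a real tax on a weak-memory host.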
The "RISC of the 80s and 90s" RISC principles were that you exposed your actual hardware features and didn't microcode to keep circuit paths short and simple and let the compiler be clever, so at the time it sort of did imply you couldn't make dramatic changes to your execution model without exposing it to the instruction set. It was about '96 before the RISC designs (PA-RISC2.0 parts, MIPS R10000) started extensively hiding behaviors from the interface so they could go out-of-order.
That changed later, and yeah, modern "RISC" designs are rich instruction sets being picked apart into whatever micro ops are locally convenient by deep out of order dynamic decoders in front of very wide arrays of microop execution units (eg. ARM A77 https://en.wikichip.org/wiki/arm_holdings/microarchitectures... ), but it took a later change of mindset to get there.
Really, the A64 instruction set is one of the few in wide use that is clearly _designed_ for the paradigm, and that has probably helped with its success (and should continue to, as long as ARM, Inc. doesn't squeeze too hard on the licensing front).
Seems to me that you just have to be careful when bringing out a new version. You can't change the memory model from chip to chip, but that goes for x86 too. Not sure what other behaviors aren't really changeable.
Can you give me an example of this? SPARC of the late 90s ran 32-bit SPARC.
If you look at the definitions of various structures and opcodes in x86, you'll notice gaps that would have been ideal for a 64-bit expansion, so I think they had a plan besides IA64, but AMD beat them to it (and IMHO with a far more inelegant extension).
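For reference, what AMD actually shipped: the REX prefix squats on the sixteen one-byte INC/DEC opcodes (0x40-0x4F), which were legal but redundant encodings in 32-bit code. A sketch of the bit layout:

    /* AMD64 REX prefix: bytes 0x40-0x4F (formerly INC/DEC r32),
       repurposed as an instruction prefix in 64-bit mode.       */
    #define REX_BASE 0x40  /* fixed high nibble: 0100WRXB           */
    #define REX_W    0x08  /* W: 1 = 64-bit operand size            */
    #define REX_R    0x04  /* R: extends ModRM.reg to reach r8-r15  */
    #define REX_X    0x02  /* X: extends SIB.index                  */
    #define REX_B    0x01  /* B: extends ModRM.rm / SIB.base        */

    /* e.g. 48 01 d8 = REX.W + add -> "add rax, rbx" */

Losing the short INC/DEC forms (and carrying all the legacy prefix machinery in the decoder forever) is a big part of why people call it inelegant.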
Itanium was a success right until they actually made a chip.
What they should have done is hype Itanium, and then the day it came out they should have said, "Yeah, this was a joke; what we actually did is buy Alpha from Compaq, and it's literally just Alpha with an x86 compatibility mode."
Itanic was a flop due to AMD releasing a 64-bit CPU. And I still think Intel learned a lot from its failure, if not from the technology then business-wise: just stick to improving the existing architecture while keeping backward compatibility.
IMO, Itanic was a doomed design from the start; the lesson to be learned is that "you can't statically schedule dynamic behavior." The VLIW/EPIC-type designs like Itanium require a _very clever_ compiler to schedule well enough to extract even a tiny fraction of theoretical performance, for both instruction-packing and memory-scheduling reasons. That turns out to be extremely difficult in the best case, and in a dynamic environment (with things like interrupts, a multitasking OS, bus contention, DRAM refresh timing, etc.) it's basically impossible. Doing much of the micro-scheduling dynamically in the instruction decoder (see: all modern x86 parts, which decompose x86 instructions into whatever it is that vendor's generation runs internally) nearly always wins in practice.
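A toy example of the problem -- the latency the compiler has to schedule around simply isn't knowable at compile time:

    /* The compiler must pick ONE schedule for this loop at build
       time, but the load latency depends on run-time cache behavior. */
    long sum_indirect(const long *a, const int *idx, int n) {
        long sum = 0;
        for (int i = 0; i < n; i++)
            sum += a[idx[i]];  /* cache hit: a few cycles;
                                  miss: hundreds of cycles */
        return sum;
    }

A VLIW/EPIC compiler has to schedule either for the worst case (wasting issue slots on every hit) or for the best case (stalling whole bundles on every miss); an out-of-order core just keeps running whatever independent micro-ops are ready while a miss is outstanding.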
Intel spent decades trying to clean-room a user-visible high end architecture (iAPX432, then i860, then Itanium), while the x86 world found a cheat code for microprocessors with the dynamic translation of a standard ISA into whatever fancy modern core you run internally (microcode-on-top-of-a-RISC? Dynamic microcode? JIT instruction decoder? I don't think we really have a comprehensive name for it) thing. Arguably, NexGen were really the first to the trick in 1994, with their Nx586 design that later evolved into the AMD K6, but Intel's P6 - from which most i686 designs descend - is an even better implementation of the same trick less than a year later, and almost all subsequent designs work that way.
Or Intel would have been cut out if they didn't put forth an offering that was less expensive and more performant? When NT4 came out, it ran on Alpha, MIPS, and PowerPC. You could even run (...at about half speed) x86 binaries on the Alpha port with FX!32. Apple has swung a transition like that twice, all the old Workstation vendors went from 68k to their bespoke RISCs, Microsoft could have just slowly transitioned out of Intel parts with no more difficulty than transitioning to IA64. Windows' PE format still doesn't have an elegant Fat binary setup (they have that Fatpack hack in windows-on-ARM, but it's worse than the 90s implementations), but that doesn't mean they couldn't have added one if compelled because the winning x86 successor(s) didn't end up being backward compatible.
The biggest squeeze on 32-bit architectures is the memory ceiling, and Intel was doing PAE to get 36-bit addressing on the Pentium Pro in '95, and kept squeaking by with PAE well into the mid-2000s, before most consumers cared. You only got 4GB per process, and it took a couple of years for chipset support to happen. The chipset issue is itself an interesting historical rabbit hole: only one of the first-party chipsets for the Pentium Pro--the 450GX, a many-chip monstrosity--even _claimed_ to support more than 4GB of RAM. I've never found an example of a 450GX configuration with more than one 82453GX DRAM controller (which is how the documentation indicates you'd handle multiple 4GB banks), to the extent that I suspect it may never have actually worked. By 96/97 there were third-party chipsets that could do >4 processors and >4GB, most prominently the Axil NX801 ( https://www.eetimes.com/axil-computer-to-incorporate-pentium... ), sold by Data General as the AV8600 and by HP as the HP NetServer LXr Pro8.
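For the numbers: PAE widens page-table entries from 32 to 64 bits and adds a third paging level, taking physical addresses from 32 to 36 bits while virtual addresses stay 32-bit. A quick back-of-envelope sketch:

    #include <stdio.h>

    int main(void) {
        unsigned long long phys_plain = 1ULL << 32; /*  4 GiB physical */
        unsigned long long phys_pae   = 1ULL << 36; /* 64 GiB physical */
        unsigned long long virt       = 1ULL << 32; /*  4 GiB virtual --
                                                       the per-process
                                                       ceiling PAE does
                                                       NOT lift */
        printf("plain: %lluG  PAE: %lluG  per-process: %lluG\n",
               phys_plain >> 30, phys_pae >> 30, virt >> 30);
        return 0;
    }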
The slow performance would eventually have led to Intel realizing it was a bad idea. They would lose market share to competitors on different ISAs.
Windows at some point would want to be on faster processors and would again run on some of the others. Windows doesn't have undying loyalty to Intel.
People other than AMD could have done an x86 implementation with some 64-bit overlay as well, Transmeta-style. That kind of system would have beaten Itanium too if it were put on top of a fast RISC processor.
And at some point AMD, even on 32-bit, would have massively gained market share as they invested more in faster RISC-style processors. So Intel would have had massive pressure from the bottom end and the top end. And in any possible future, AMD was at some point going to do something with 64-bit.
The idea that the whole industry goes massively backwards and stagnates for years because of an Intel monopoly doesn't really work in practice.
The point is that NT is portable, and once Merced hit the market in 2001, 5+ years overdue and not delivering on any of its performance promises, the only question was "What architecture will succeed x86? Because we can cross IA64 off the list." In the same way, when the 432 showed up years late and 5-10x slower than a contemporary Motorola 68000 or 286, it was dead in the water; all the early-80s workstations were built with 68ks, and the PC market went with 286s.
I don't know if the absence of AMD64 in 2003 would have made an opening for SPARC or PowerPC or ARM or something else entirely, or maybe the "let's slap an expansion on the 8080 again, just like the 386 bailed us out after the 432 debacle" scenario was inevitable, but NONE of the compiler-scheduled-parallel architectures panned out in the market, so someone else was going to win.
I have read one insider account that Intel had its own, different x86-64 instruction set, designed in response to AMD's. Intel approached Microsoft and asked it to port Windows to it.
Microsoft refused, saying "we already support one failing 64-bit architecture of yours, at great expense and no profit. We're not doing two just for you. There now is a standard x86-64 ISA and it's AMD64, so suck it up and adopt the AMD ISA -- it's good and we already have it working."
Or words to that effect. :-)
I've not been able to find the link again since, but allegedly, yes, the success of AMD's x86-64 has been due to Microsoft backing it. It sounds plausible to me.
Based on https://en.wikipedia.org/wiki/File:Itanium_Sales_Forecasts_e... it's clear that Itanium was delayed and sales projections were drastically reduced multiple times before AMD even announced their 64-bit alternative, let alone actually shipping Opteron. (For reference, AMD announced AMD64 in October 1999, published the spec August 2000, shipped hardware in April 2003. Intel didn't publicly confirm their plans to adopt x86-64 until February 2004, and shipped hardware in June 2004.)
VLIW was really marooned in time: driven by overconfidence in the compiler (RISC had shown that you could actually expose pipeline hazards and let the compiler cope), and by underestimates of the coming abundance of transistors (which made superscalar OoO really take off, along with giant on-chip caches). well, and multicore to sop up even more of the available transistors.
otoh, for the previous 20 years, things like the 432 and lispms and burroughs large systems had been losing, in favor of architectures that pushed all the hard work onto compilers
so it makes sense that in 01995 you'd look at ooo and vliw and extrapolate that vliw/epic was going to beat the crap out of ooo
Granted, it makes some amount of sense. But the issue is that with EPIC you still can't address every part of the processor unless you want to keep growing the instruction word. So you end up having to do OoO anyway, except you've made it much more complex and hard to reason about.
I'm not a chip designer, but that's what I understood to be one of the issues.
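For reference, the published IA-64 bundle format makes the point visible: 128 bits per bundle, a 5-bit template plus three 41-bit instruction slots, so the machine's issue width is baked right into the encoding. A sketch of pulling the fields apart (the struct and macro names are mine):

    /* IA-64 bundle: [ template:5 | slot0:41 | slot1:41 | slot2:41 ]
       The template selects the execution-unit pattern (MII, MMI,
       MFI, ...) and where the stops between instruction groups fall.
       Feeding more functional units per cycle means wider or more
       bundles -- the issue width leaks into the ISA. */
    typedef struct { unsigned __int128 bits; } ia64_bundle; /* GCC/Clang
                                                               extension */

    #define BUNDLE_TEMPLATE(b) ((unsigned)((b).bits & 0x1f))
    #define BUNDLE_SLOT(b, i) \
        ((unsigned long long)(((b).bits >> (5 + 41 * (i))) & \
                              ((((unsigned __int128)1) << 41) - 1)))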
Also, this compiler technology wasn't yet written, unlike with RISC, where people at Stanford had already shown successful compilation for RISC before anybody even developed high-performance RISC chips.
I don't want to claim I'm smarter than those people; clearly, all the people working on these VLIW processors were a lot smarter than me. But then again, many smart people worked on Alpha, and they didn't go the VLIW route.