IBM’s 24-core Power9 chip (nextplatform.com)
176 points by ajdlinux on Aug 24, 2016 | 154 comments



The article says the Power9 chip will have "hardware assisted garbage collection" and that's all, which is a bit short for such an important topic. Does anyone have more information about it?


I came here to ask the same thing, did some searching, and still haven't found anything.

Edit: Found this in PowerISA_V3.0.pdf

Load Doubleword Monitored Instruction: Specifies a new instruction and facility that improves the performance of the garbage collection process and eliminates the need to pause all applications during collection.

The ldmx instruction loads the same doubleword in storage as the ldx instruction.

ldmx is intended for use by applications when loading data objects during times when objects are being moved as part of a garbage collection process. In this type of usage, all loads of object pointers by applications are performed using ldmx. Whenever objects within a given region of memory are to be moved, the garbage collection program sets the Load Monitored Region registers to encompass the region being moved, and sets BESCR GE LME to 0b11 to enable Load Monitored event-based branches.

...


I think it refers to the ldmx instruction, the Load Monitored Region, and the Load Monitored Region registers:

>The Load Monitored Region registers are used to specify a Load Monitored region, and to specify the sections of the region that are enabled. The Load Monitored region is a contiguous region of storage specified by the Load Monitored Region register. Load Monitored regions range in size from 32 MB to 1 TB. All regions are powers of 2 in length and are aligned (see Section 1.11.1). The Load Monitored region is divided into 64 sections of equal length, each of which can be separately enabled. The Load Monitored Section Enable Register specifies which sections are enabled.

>Load Doubleword Monitored Instruction: Specifies a new instruction and facility that improves the performance of the garbage collection process and eliminates the need to pause all applications during collection.

Programming Note:

>The ldmx instruction loads the same doubleword in storage as the ldx instruction.

>ldmx is intended for use by applications when loading data objects during times when objects are being moved as part of a garbage collection process. In this type of usage, all loads of object pointers by applications are performed using ldmx. Whenever objects within a given region of memory are to be moved, the garbage collection program sets the Load Monitored Region registers to encompass the region being moved, and sets BESCR GE LME to 0b11 to enable Load Monitored event-based branches.

>Subsequently, if an application program loads (ldmx) a pointer into an enabled section of the Load Monitored region, an event-based branch (EBB) will occur. The EBB handler will load the instruction from storage, and decode the instruction to determine the effective address from which the pointer was loaded. The handler will then load the pointer (obtaining the same value the application obtained), determine where the corresponding object has been moved to, and update the pointer in storage so that it points to the object's new location. The EBB handler may also take other actions such as updating additional pointers, depending on the situation. After this processing is complete, the EBB handler sets BESCR LME LMEO to (1,0), since the taking of the EBB set these bits to (0,1), and then executes rfebb 1 to re-enable EBBs and return to the application program at the ldmx. The application program re-executes the ldmx, which now returns the updated pointer (which no longer points into an enabled section of the Load Monitored region), and continues.

>Other variations and extensions of the above procedure are also possible.
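The mechanism quoted above can be sketched as a toy simulation in Python (everything here is made up for illustration: the addresses, the `forwarding` table, and the `ldmx`/handler split; real hardware raises an event-based branch rather than calling a function):

```python
# Toy simulation of the ldmx read-barrier idea: a load of a pointer that
# falls inside a monitored (being-evacuated) region triggers a handler that
# fixes the slot to the object's new address, then the load is retried.

MONITORED = range(1000, 2000)   # addresses currently being evacuated (made up)
forwarding = {1500: 5500}       # old address -> new address, set by the GC

heap = {0: 1500}                # slot 0 holds a pointer into the region

def ldmx(slot):
    """Load a pointer; emulate the event-based branch for monitored addresses."""
    ptr = heap[slot]
    if ptr in MONITORED:            # would raise an EBB on real hardware
        heap[slot] = forwarding[ptr]  # handler updates the pointer in storage
        ptr = heap[slot]              # retry the load, as after rfebb
    return ptr

print(ldmx(0))
```

On real POWER9 the handler would decode the faulting ldmx to find the slot, patch the pointer in storage, and rfebb back to re-execute the load, as the Programming Note describes.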


The new SPARCs have something like that too. http://www.mythics.com/about/blog/an-introduction-to-your-se...


> Other silicon accelerators include a SQL coprocessor in each core. Entire SELECT statements now run at silicon speeds.

Seriously? I'd love to see how/why that was put on the die.


Oracle did this in the past 2 generations of their UltraSPARC architecture.

At its core, a `SELECT FROM` statement is a filter lambda (without any joins, or before/after a join has been done, depending on query optimization).

Fundamentally this is because an SQL table is a flat array, with rows being flat arrays within the table's memory space.

Calculating the pointer to fields within rows within a table is ridiculously parallel. You're limited only by SIMD width. (This is 1 FMA + 1 add against a constant, ~2 ticks.)

Also, custom wide SIMD registers can allow `VARCHAR(255)` equality comparisons to be done in one processor cycle.
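The "SELECT as a filter lambda over a flat array" idea can be sketched in scalar Python (the row layout and helper names here are hypothetical; the point of the silicon is that the predicate is evaluated across many rows per cycle instead of one at a time):

```python
# Sketch of "SELECT ... WHERE" as a filter over a flat row array.
# Layout is made up: each row is (id, age) packed back to back; a SQL
# coprocessor would evaluate the predicate over many rows in parallel.

ROW_WIDTH = 2
table = [1, 34,   2, 17,   3, 52,   4, 29]   # four (id, age) rows, flattened

def select_where(tbl, pred):
    # reconstruct rows from the flat array, then apply the "filter lambda"
    rows = [tuple(tbl[i:i + ROW_WIDTH]) for i in range(0, len(tbl), ROW_WIDTH)]
    return [r for r in rows if pred(r)]

print(select_where(table, lambda r: r[1] >= 30))
```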


I guess I shouldn't be surprised that when a database company swallows a CPU hardware division that the CPU might sprout some more database specific operations, but I was.

I would love to see some benchmarks of how much this speeds up some sample workload queries, if you happen to know of some (preferably not from Oracle's marketing dept.)?


Oracle's UltraSPARC purchasing agreement forbids third-party direct-comparison benchmarks.

The only ones you'll find from third parties are non-direct comparisons, like transactions per second in Oracle SQL, or core count.


Why am I not surprised by this. I sometimes find myself thinking I should check my bias against Oracle, just to make sure I'm not being too hard on them for past transgressions (for example, Microsoft is far from perfect, but they've come a long way from many of their more negative past practices). Then I learn something like this, which, while it doesn't directly mean they are still continuing predatory marketing and sales practices, doesn't exactly assuage any fears I might have had.


I currently work for a company that heavily uses Oracle technology. It's horrible. Our cumulative contracts are worth several hundred thousand dollars per year.

But they'll still cheat us on small things. We requested information on a training seminar (we had a free attendance voucher), and they lost our account information until a week after the seminar ended.

It's to the point where I just laugh. Everyone knows they're cheating us, but upper management won't change.


Oracle's database is pretty good (if you can afford it). All their other software, from what I've seen, is shockingly bad considering their resources.


Yes, if you shell out all the money to run it on every core and use a lot of memory, it is solid. But for bang for your buck, Microsoft SQL Server has a better support plan and a saner licensing model. My experience is working with smaller datasets (<200GB), so YMMV.

I mean I'd rather use Postgres. But large corps like having a vendor to call.


Check out EnterpriseDB. I don't know how good they are in practice. But their speciality is solving the exact problem you mentioned with Oracle vs Postgres.

http://www.enterprisedb.com/products-services-training/produ...


Forbidding benchmarks is the kind of thing that should draw the wrath of any state worth the name.

Alas.


Or if someone steals one ?


Err... you know who bought Sun, right?


That's interesting. For some reason, I was under the impression Oracle had basically killed off the hardware development portion of Sun. I don't recall where I got that idea though.

Good to see new developments in this line. The more (open) hardware platforms the better, IMO.


> I don't recall where I got that idea though.

They killed the desktops. That, essentially, made SPARC mostly invisible to most people. They also do a crappy job of reminding the world that a software company (as they position themselves) can make decent hardware.


Intel's hugely unsuccessful 432 also supported hardware garbage collection in 1981.

https://en.wikipedia.org/wiki/Intel_iAPX_432#Garbage_collect...


OK, that makes two of us who read a pile of SW features in HW and thought about the 432. I think this is more like the i960, though, where a fast RISC gets some extra features. Always thought it might have succeeded if not for the BiiN and Ada crap. IBM's been playing it smarter with POWER by making sure it's compatible with and aids the software people actually use. So POWER9 has a much better chance than Intel's prior attempts.



The power ISA specs tend to omit quite a few details about the hardware itself. I wouldn't be surprised if documentation was internal only and the first time we get any details about it is when IBM patches some open source software to take advantage of it.


They do respond to requests, albeit quite slowly. I wanted documentation to go with some kernel patches they posted to the Linux kernel (IOMMU related) that hinted at a mode I was interested in. It took them two years, but the docs showed up a few months ago. I suspect I wasn't the only one asking for more detail about system-level (or firmware/hypervisor) features that aren't documented in the instruction arch.

I think they understand, both at the top and the bottom, that the time for hiding stuff like this is over. Pushing it through the immense bureaucracy still takes time, but support from upper management eventually gets it done.

The one thing IBM has always been good at is documenting their hardware/software. Somewhere in the late 90s they started to hide much of the low-level documentation (like the rest of the industry), but the realization that doing that doesn't help them (or anyone else) is slowly taking hold, as the most successful companies are frequently the most open ones (and therefore have better Linux/open source support).

LoPAPR http://openpowerfoundation.org/?resource_lib=linux-on-power-... IODA2 http://openpowerfoundation.org/?resource_lib=openpower-io-de...

(although i'm 99% sure the IODA doc is missing a heap of stuff).


The Reduceron (an FPGA core for running pure functional programs) contains a very simple hardware garbage collector[0]. Almost definitely not the same sort of thing, though.


So, this is the chip that Google, Facebook, and Rackspace used in their designs for the Open Compute platform[1][2]. Maybe the entry-level price will drop when companies like those are buying them (or their components) at scale.

1: http://www.nextplatform.com/2016/04/06/inside-future-google-...

2: http://www.pcworld.com/article/3053092/ibms-power-chips-hit-...


Don't forget the Raptor workstation. Allegedly, they plan to sell the thing for less than what SGI and Sun sold their boxes for in the past. Using POWER8-9 chips in a market where eBay's best deal is $4,000 for a POWER8 CPU. ;)


ISA 3.0 adds a new instruction 'darn' -- Deliver a Random Number. That's a pretty good asm mnemonic :) I wonder if anyone has dug into the details of how that works yet?


Wow, that's pretty good. My other favorite mnemonic is also from PowerPC: 'eieio', enforce in-order execution of I/O.


"IBM has a well-known disdain for vowels, and basically refuses to use them for mnemonics (they were called on this, and did "eieio" as an instruction just to try to make up for it)."

- Linus Torvalds, 2009 (http://yarchive.net/comp/linux/coding_style.html)


Haha, didn't know that. I figured an engineer had a kid with an affinity for Old MacDonald.


Old MacDonald had a (server) farm, eieio...


The thing I love about the eieio instruction is that it serves the same purpose on-chip as in the song. Sort of a sync point.


And if you use it once, you tend to use it multiple times, each with a little code.

* Peek a value that's MMIO

* EIEIO

* Poke right back into MMIO

* EIEIO



Totally off topic - but - check out "eieio" for another good instruction name.[1] ;-)

[1] http://www.ibm.com/support/knowledgecenter/ssw_aix_61/com.ib...



I'm quite naive regarding Power processors; does anyone know where I can see any motherboards with prices for Power processors, or even whether any of the BSDs run on them? It seems like it would be quite a nice platform/OS combination that side-steps the traditional Apple/OSX vs Intel/Linux workstation options.


In the case of the IBM Power processor, it'd be difficult to find the individual pieces. The computers are typically sold as prefab units by IBM or a third party. Not much of a build-it-yourself market.

That said, they're expensive like any high end datacenter-grade hardware.


So then these are not like the PowerPC chips used in the PowerMacs and Amiga One computers?

More like a minicomputer or mainframe level power?


I've seen this mentioned a few times when Power chips come up: https://www.raptorengineering.com/TALOS/prerelease.php


OpenBSD, FreeBSD, and NetBSD all have support for the older Apple G4 and G5 hardware. Obviously, being older, they are kind of slow (especially compared to amd64), but if you just use them as file servers they are fine. The biggest problem with BSD on PowerPC, in my opinion, is the lack of modern programming languages. I believe on BSD you are mostly stuck with C, C++, Ruby, Python, and Perl. No Node, no Java, no Go, no Rust. If you use Debian you have access to Java.


Keep in mind Power != PowerPC. Related families but there are differences (exercise left to the reader).


What? Why wouldn't you have access to Node? Source is available for those, and Node even specifically addresses the BSDs[1]. I heard Go does something kind of odd with system calls, so I could see it possibly having a problem (but I really don't know enough to state definitively).

I'm not sure if rust targets that platform/arch combo, but I could see it being deferred while they focus on more common ones.

Edit: It just occurred to me that Node's (V8's) optimizing JIT probably uses ASM or at least routines optimized for x86, so that would complicate deployment.

1: https://nodejs.org/en/download/package-manager/#freebsd-and-...


Yeah, v8 is pretty famous for its lack of interpreter, though it looks like IBM is working on a port. You could also use the Chakra or SpiderMonkey ports.


  > I'm not sure if rust targets that platform/arch combo
It's the PPC bit that's tough at the moment, we do provide tier 2 support for {free,net}BSD on x86_64.


I figured. Although this isn't PowerPC, but Power, and what little I was able to find quickly seems to indicate the instruction set has evolved since it was branched for PowerPC. I imagine even if PowerPC targeted binaries work on Power, they wouldn't use the platform to the fullest.


Oh yeah, right. Point is, yeah, not yet. I wonder what LLVM's support of these arches is.


IBM has been investing heavily in the toolchain, so LLVM has decent support for PowerPC.


That hasn't translated into working FreeBSD support yet though. It's still stuck using GCC 4.2 on ppc, which is a real pain.


I know that there is some work on porting Rust to ppc64le which is gradually coming together.


While Go doesn't have a port to BSD on POWER, there is a Go port for POWER on Linux (it's the s390x port).

I imagine it shouldn't be too difficult (relatively speaking) to do the port as a result.


The Go port is ppc64le, the new ABI (elf v2) which is Linux only at present, the BSDs support the elf v1 ppc ABI, which is a bit different and would need a different port, although it is mainly endianness and how function pointers work.

As others have said s390x is nothing to do with Power.


s390!=POWER

The GOARCH for power is ppc64/ppc64le AFAIK.

(s390x is the mainframe processor, aka z/Architecture, the latest being the z13 which is most definitely not a power8/9.)


I had thought s390x could be based on POWER8/9, but you're right, that doesn't appear to be the case. I apologise. Regardless, Go does consider S390x to be a separate port.


As I understand it, the s390x ISA is fully backwards compatible to stone tablets. But it's essentially a POWER chip with a bit of extra decoder logic.


There's actually very little in common between z/Arch and Power, but I don't blame you since "everybody knows" they are the same thing.


Yeah, most of the descriptions (just a little extra decode, aka PowerPC AS) and the hardware sharing actually apply to the iSeries, which _IS_ POWER based. But of course that is a minicomputer, not a mainframe...

It seems the vast majority of people who consider themselves knowledgeable about computers are just massively ignorant of/confused by IBM's product lines and capabilities (and don't even get me started on NonStop and OpenVMS). Which is a bit of a shame, particularly in the case of IBM i, which is probably one of the most interesting OSes still in support/production.


Today I learned. Thanks.


S390x (IBM System z) is an arch distinct from POWER. They both have Go ports, though.

edit: failed to refresh


Java is now available for BSD on powerpc64. A set of patches was developed to add ppc64 support and has been officially committed to the bsd-port of OpenJDK 8.

Node also supports powerpc. Support was added courtesy the team at IBM.

I have both working on my PowerMac G5, running FreeBSD 10.2.


What about gccgo? "Gccgo, however, supports all the processors that GCC supports" Source: https://blog.golang.org/gccgo-in-gcc-471


Many programs don't work with gccgo, though.


Well, if you have C, you can get to pretty much anything else (maybe with some porting effort).

I definitely built and ran Erlang on a G4 powerbook several years back.


I believe you can buy the Tyan boards and get prices. They are not cheap but not unreasonable http://www.tyan.com/solutions/tyan_openpower_system.html

FreeBSD runs on modern Power I believe, but the other BSDs only run on older hardware AFAIK, although IBM is pretty good at providing hardware so that may have changed.


When POWER8 came out in 2014, I ran a couple of numerical benchmarks against Intel Xeon CPUs. In my benchmarks the POWER8 CPUs were slightly faster than the Intel Xeon CPUs when the data fit into the CPU's cache; so far so good (the speed advantage of the POWER8 CPU can easily be explained by its much higher clock). But once I started running heavy benchmarks involving gigabytes of data, the POWER8 CPUs were at least twice as slow. Now, 3 years later, POWER9 CPUs will come out that are about 1.5 times faster; in my opinion this is not enough to compete against Intel. Why would anybody want to get locked into a rare and more expensive CPU architecture if there is no speed advantage?
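A minimal sketch of that benchmark structure, assuming the same reduction is run over a cache-resident array and a much larger one standing in for a multi-gigabyte working set (sizes here are toy-scale; real runs would use gigabytes and compare wall-clock times):

```python
# Cache-resident vs. memory-bound microbenchmark skeleton (sizes illustrative).
import time

def reduce_sum(data):
    total = 0
    for x in data:
        total += x
    return total

small = list(range(1_000))     # fits comfortably in cache
large = list(range(100_000))   # stand-in for a multi-GB working set

for name, data in (("small", small), ("large", large)):
    t0 = time.perf_counter()
    s = reduce_sum(data)
    dt = time.perf_counter() - t0
    print(f"{name}: sum={s} time={dt:.4f}s")
```

At real sizes the small case measures core/cache throughput while the large case measures the memory subsystem, which is where the commenter saw POWER8 fall behind.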


The idea of the OpenPower thing is that you are no more locked in than you are on x86, perhaps not quite as open as arm64 but it's pretty nice in theory.

Also, I have a major sad for our industry when people struggle so much with alternative architectures. Tons of software absolutely should not care, including JVM languages, scripting languages, and Go. Most userland compiled languages should also minimally care. For lower-level C stuff, regular compilation on non-x86 archs often directly improves overall code quality (new compiler warnings, things like alignment correctness, cache/memory coherency model assumptions, data type assumptions, etc.). It is somewhat harder to port operating systems and "efficiency libraries" that invoke platform features, but that also takes a comparatively small number of people to do.


Most applications have been heavily optimized for x86 processors for years. There are so many applications littered with intrinsics and x86 assembly blocks, and algorithms tuned for x86- and Intel-specific memory characteristics. ARM got some attention because of phones, but only in the last few years.

Power8/9 are very decent processors with a lot of raw power, but it will take a concerted, long-term effort to get a level playing field in the software landscape.


Some of us believe that the open platform is worth the expense.

The Intel Management Engine (ME), and Intel's absolute refusal to allow its own customers to disable it, is very scary to the tinfoil hat crowd.

Unfortunately, Raptor Engineering is still early in development of their POWER8 board, and even that may ultimately turn out to be vaporware.


If they haven't released by now, they probably won't. Sad but true, technology has a shelf life.

If they wait another 6 months it simply won't be economical to release it ever.


You might have issues with the quality of your compiler or you might be using POWER7 code and/or have VSX disabled. IME POWER8 is somewhat faster than high end Xeons (although not in terms of price/performance).


I compiled my code using the latest GCC version available, I even made sure I used the right compiler flags to enable e.g. hardware POPCNT support. I know that IBM claimed that POWER8 CPUs were faster than Intel Xeon CPUs but I found the opposite to be true (at least for the benchmarks I cared about). You can also find many other benchmarks on the web which are in favor of Intel Xeon CPUs compared to POWER8.

I am not against IBM POWER CPUs, I was just disappointed by POWER8...


Fair enough. I'm sure it's workload dependent, and also I have not tested huge data sets. I'm most interested in the overhead of virtualization and (separately) compile times.


Why would you be "locked in"?

I've developed for and used Linux and BSD on several platforms. Moving to a different one is trivial unless you're dependent upon proprietary binaries which can't be rebuilt. If you're using open code, or code which you have the ability to rebuild for the new platform, you're not locked in.


Uptime. Reliability. Support. Feature set. Backwards compatibility.


I don't know why, but I always had a thing for Power CPU's. Really curious how Power9 compares to Power8 and Intel Broadwell-EP/EX when it comes to power efficiency (and of course performance wise).


Power CPUs are also the only currently on the market that work with open source firmware from the ground up – everything else needs proprietary microcode and/or firmware blobs.


If you're curious where that lives, check out the OpenPower Github project.[1]

(full disclosure - I worked on tiny pieces of the code and stuff adjacent to it in the design)

[1] https://github.com/open-power


I don't think that is 100% true, you can bootstrap ARMs with ARM trusted firmware+tianocore. That doesn't mean that a large number of arm devices can be bootstrapped that way due to decisions made by the ARM licensee.

https://github.com/ARM-software/arm-trusted-firmware http://www.tianocore.org/edk2/ (for more platform support add git.linaro.org/uefi/OpenPlatformPkg.git)


Yeah, or possibly j-core j2 [0] if you fab the chips yourself and don't need much power.

[0] http://j-core.org/


No offence, but the POWER is the only free CPU on the market that isn't a hippie piece of crap.


"No offense intended. Now please, allow that to completely excuse me as I offer a direct insult."


No, really. I get it, building your own CPU is fun. But they're all useless for the debate at hand, because nobody wants obsolete, slow, power hungry CPUs just because they're free.


At least one company wants this CPU [1] (but since one of the reimplementors of the SuperH architecture, or whatever you want to call them, works there, as CEO in fact, take it with a huge grain of salt).

Since you seem to know about the power consumption, can you please share some numbers for the J-Core CPU (on a process comparable to other CPUs built for IoT devices, USB controllers, etc.)?

J-Core is not intended to be a replacement for the main CPU, so this would be really interesting to me, and would save me from asking on the mailing list. ;P

[1] http://se-instruments.com/press_releases/sei-adopts-open-sou...


The insult is to the chip, not to redtuesday.


I mean this as a genuine question: isn't OpenSPARC a free-as-in-speech CPU that I would presume - given its lineage - isn't a hippie piece of crap?


On paper, yes, but is any actually available on the market?


I loved the PowerPC chips, the 604 and G5 in particular. Too bad it never became a viable platform outside of Apple.

(In the '90s, the PowerPC had very competitive price/performance, but there wasn't a viable desktop OS available after Apple terminated the original MacOS clone program. 5-10 years later Linux would have been much more ready, but the chips were not competitive with Intel anymore.)


I can't help wondering if Intel "won" by brute-forcing the lithography side. But this then made them "blind" to the reduced-cost side of Moore's Law (or they tried for the longest time to get people to ignore it).

Their first reminder was perhaps netbooks, using semi-retired Celerons to make small and cheap clamshell web browsers and media players.

Their second, and ongoing, reminder is smartphones/tablets.


PowerPC essentially died as a desktop and server processor once Apple decided to go Intel. IBM decided to go back towards Power's roots with the POWER5 (the follow-on to the PPC970 in the Power Mac G5 and the PowerPC-based XServes).


PowerPC really died in 1997-98 when it became clear that Jobs wasn't going to license Mac OS (classic or X) for 3rd party PowerPC boxes anymore, and Windows NT on PPC wasn't going anywhere.

Without an operating system, IBM and Motorola didn't have an incentive to build PC chips anymore. The G4 happened because it was already in development and the vector extensions were useful for Motorola's embedded ambitions. The G5 happened because Apple basically paid IBM to make a desktop chip out of their POWER designs, AFAIK.

Around 2004, there was a startup PowerPC maker called P.A. Semi [1] that apparently competed for Apple's Mac CPU business. After Apple went Intel, they acquired P.A. Semi to design iPhone chips instead.

[1] https://en.wikipedia.org/wiki/P.A._Semi


For large swaths of its life, the PowerPC in a Mac was significantly faster than any available x86 CPU. Back in the single-core days, clock speed was something a lay user vaguely understood to be correlated with performance. This had a subtle but real impact, as Intel and OEMs were effective at marketing the clock speed number as something users should care about. It was very effective, until it backfired when AMD torpedoed Intel with the Opteron and clock frequency scaling became undesirable.


The idea that PPC was faster than x86 back in the PPC Mac days is Apple hyperbole. There were a few occasions where the fastest PPC was faster than the fastest x86 available, but they were few and far between, and the margins were very slim. If all you looked at were the very carefully selected/tuned Apple benchmarks (the Photoshop filter benchmark using AltiVec comes to mind), you could be forgiven for thinking there was a huge PPC advantage.

Here is a random google search for spec performance over time of assorted CPU arches.

http://preshing.com/20120208/a-look-back-at-single-threaded-...

The only obvious case in that graph where PPC>x86 is in mid 1995.

I can't find the one I saw a couple of years ago detailing the core/clock-rate comparisons of PPCs shipped in Macs versus common PCs; it was much easier to read and far more obvious.


It wasn't just Apple hyperbole; PowerPC really did beat Intel in quite a few instances, and generally kept up until about 2004. AltiVec with Photoshop was not just a tuned benchmark but a significant advantage in a common scenario. The PowerPC chips also consumed significantly less power, enabling Apple to put far more powerful CPUs in laptops.


They still live on in the embedded scene, especially aerospace. Freescale makes them. They steadily improve in all kinds of ways. Example:

http://www.nxp.com/products/microcontrollers-and-processors/...

They're also deployed on FPGAs as embedded CPUs.


(It always confused me that my PowerPC 603 @ 75 MHz (a Performa 6200) was so, so much slower than my PowerPC 601 @ 80 MHz (a Power Macintosh 8100); it wasn't until years later that I learned that the 601 could execute three instructions per clock cycle while the 603 could only execute two...)
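Taking the quoted instructions-per-cycle figures at face value, the peak-throughput arithmetic works out like this:

```python
# Rough peak-throughput comparison for the two machines mentioned
# (using the quoted clocks and instructions-per-cycle figures).
mhz_601, ipc_601 = 80, 3
mhz_603, ipc_603 = 75, 2

peak_601 = mhz_601 * ipc_601   # peak MIPS for the 601 @ 80 MHz
peak_603 = mhz_603 * ipc_603   # peak MIPS for the 603 @ 75 MHz
print(peak_601, peak_603)
```

So despite the nearly identical clocks, the 601's peak issue rate is about 60% higher, which matches the experience described.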


Indeed, performance per watt is the real question. They almost caught up with Intel with the Power8. If they manage to beat Intel with Power9, I think that would make a lot of folks take notice.


Now if only they'd make development boards available for less than ~$4,000.


The Wii and the Wii U is your friend.


As much as I think the Wii, the Wii U and the Xbox 360 are all the equal-best gaming consoles ever (yes, I judge my gaming consoles purely by their CPU architecture), each of those consoles uses a custom chip developed under a specific IBM-{Nintendo, Microsoft} agreement, so they're somewhat different chips from the POWER[n] series :)


The Wii and Wii U are using PowerPC CPUs.


Yes. The distinction being made here is that PowerPC is not necessarily the same as Power, even if they have a shared lineage.


More importantly, the system architecture is different.


Am I the only one who thinks that Power9 die shot looks pretty? Reminds me of a stained glass window.

http://www.nextplatform.com/wp-content/uploads/2016/08/ibm-h...


Intel used to publish poster-sized photos of their dies. I had one for the Pentium II. It'd be cool to get one for this chip.


I had the 386 one in my dorm room at university.

Edit: This one, I think: https://upload.wikimedia.org/wikipedia/commons/4/44/Intel_80... It's nice that you can almost see individual wires.


Wow, that really is quite beautiful. People often don't realize the marvels of technology involved in uploading their selfies.


As somebody who's a hardware enthusiast, I wish I could still buy poster-sized die shots :\


I still use this one: http://www.cpucollection.se/data/media/107/keychain_intel_pe... as my keychain. It's an early 90s Pentium (60 or 66).

My dad got it while working at the university and at some point in time in high school gave it to me. The Lucite is pretty scratched up by now, but it has a weird sentimental value to me so I'm planning on getting it buffed out.


No, although I prefer the die shot from Power8 (probably because of better symmetry) [0], I like it as well. But nothing comes close to Nvidia GF100 for me. [1]

[0] http://www.extremetech.com/wp-content/uploads/2014/04/ibm-po...

[1] http://images.anandtech.com/galleries/693/DieShot_GF100_Arch...

[2] (forum thread with many die shots) http://www.cpu-world.com/forum/viewtopic.php?t=19888


Ha, I was just thinking the same thing. They should sell poster prints. I'd hang it up.


If you like this, you may enjoy Mark Richards and John Alderman's book "Core Memory: A Visual Survey of Vintage Computers."

There are some photos from it at http://www.wired.com/2007/07/core-memory/


Now set as my desktop background.


I'm not too familiar with the PowerPC world, so maybe someone could help me out. I know I should just search, but I'm feeling lazy.

Is this a new ISA or just the latest revision in their existing ISA?

Also, what's the relationship with Freescale (NXP) here, who sells pretty decent "PowerPC" chips for server-ish platforms, such as the T2080.

Do they license the ISA from IBM sort of like how ARM works?


POWER and PowerPC are two somewhat related, but independent ISAs. Both are available for licensing: https://en.wikipedia.org/wiki/OpenPOWER_Foundation

This is going to be the latest refinement of the POWER ISA, replacing the current POWER8 used by IBM, and licensed by some Chinese company for not yet released products.

FreeScale/NXP are licensing PowerPC.


It's a new revision of an existing ISA. POWER8 implements Power ISA 2.07, POWER8 is 3.0.

The Freescale part of NXP was formed out of Motorola's semiconductor division - Motorola, in turn, was part of the original AIM Alliance with Apple and IBM back in the early 90s that developed PowerPC. So they've been making PowerPC stuff since the very beginning. I would assume they have a license of some sort from IBM.

[Disclaimer: IBMer, opinions my own]


typo on the second POWER8 ?


Argh, yes, I meant POWER9. Thanks.


Note the FreeScale chips implement the embedded variant of POWER7 ("Book E", IIRC). They won't run server software. In particular, vector instructions are missing -- server OSes like RHEL will crash very early in boot. And last I looked, little-endian support was missing or broken. Big endian is basically dead on modern server POWER, although at Red Hat we still build both endians for now.


So realistically, does processor architecture matter today from the customer's point of view? I mean performance and all that matters, but presumably that will be offset by market forces. For example, I rent a couple of ARM servers from Scaleway as well as a whole bunch of x86-64 servers from various providers, and they can all do the same stuff. So why is x86 the chip data centers prefer, when they could have a variety of chips and let the customers and the market choose?


Online.net, the folks behind the Scaleway you mentioned, actually have a dedicated POWER8 server available. While their cheapest dedicated servers (C2350 Avoton / 4GB) go for €9/mo, and a somewhat more comparable 2x E5-2660v3 / 256GB goes for €320, you have to shell out €1600/mo for the POWER8 / 512GB.[1] Which is to say, their other prices seem OK, but the POWER one is still expensive.

Also, IIRC they've had POWER servers available for quite some time (possibly already POWER7), so maybe there is demand. Assuming so, POWER is either substantially more efficient than x86-64, or efficient enough that it can solve some problems the x86-64 machines can't handle at all.

[1] https://www.online.net/en/dedicated-server/dedibox-power8


Small and medium-size customers just want the cheapest architecture they can get, which is x86 or ARM for servers and mobile, respectively. Large customers want leverage against Intel and ARM, so alternative architectures are more important to them.


Virtualization and now increasingly, Docker containers.

There are a ton of Docker images that are not going to work on anything other than x64.

Lack of binary compatibility is a huge hurdle to overcome for any new CPU architecture.

x64 has the same issue in the mobile space against ARM.


The only thing that matters in data centers is power consumption, so just as soon as these compete on compute per joule, data centers will roll them out.


Google/Facebook datacenters, maybe, but there are others where backwards compatibility is actually the most important metric, because the companies in question have been investing in non-portable software for decades. Those places tend to be massively heterogeneous, as different businesses merged and absorbed each other's technology stacks or layered newer technologies over older ones.


Yeah, absolutely. Your basic corporate in-house development shop using C# and SQL Server is married to Intel for all eternity. I think the Google/Facebook crowd is really the best market for this IBM stuff, because those companies control their entire stack and have the technical ability to port it.


Well, they released SQL Server for linux (even if pared down feature-wise), and that's probably a bigger hurdle than recompiling for a new architecture (depending on whether their compilation toolchain supports it, and how well), although there could still be some x86(_64) specific optimizations in place.

Sometimes weird stuff like supporting esoteric platforms happens because the companies in question are so large, that it makes sense to spend a few $100k on that project to make Intel squirm and get better pricing from Intel on your bulk deal. Keep in mind, Microsoft have their own cloud service...


On the data center side, it is quite difficult to find decent server hardware with, e.g., ECC for ARM.


All the scale is on x86, so it gets the best fabs and the cheapest unitary costs. ARM got some scale with mobile, so they are starting to compete.

But until Moore's Law is really buried and forgotten, new architectures won't be able to compete on already filled niches.


Erm, ARM partners ship 30x as many cores every year as Intel[1]. ARM is the one with scale.

The actual difference is Intel has proprietary fab technology one or two generations ahead of TSMC etc.

[1] The actual numbers for 2014 are: Intel shipped 400 million chips, ARM partners shipped 12 billion chips.


I think the "scale" being discussed is semiconductor manufacturing, where the area matters and the unit count does not. Each of Intel's CPUs is between 100x and 10x larger than each of those ARM chips. For example a Xeon E5 has a die area of > 450mm^2, the Apple A9 is about 100mm^2, a Tegra 2 is ~50mm^2, and a Cortex is < 5mm^2. Those unit counts you mention are dominated by those smaller parts.
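To put that die-area point in rough numbers, here's a back-of-the-envelope sketch (my own, not from the comment above) using the standard dies-per-wafer approximation on a 300mm wafer, with the die areas quoted above; defects and scribe lines are ignored:

```python
import math

def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Classic approximation: usable wafer area divided by die area,
    minus an edge-loss term for partial dies along the rim."""
    usable = math.pi * wafer_diameter_mm ** 2 / (4 * die_area_mm2)
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(usable - edge_loss)

print(dies_per_wafer(300, 450))  # big Xeon-class die: ~125 candidates
print(dies_per_wafer(300, 100))  # Apple A9-class die: ~640 candidates
print(dies_per_wafer(300, 5))    # small Cortex-class die: thousands
```

So per wafer, one big server die displaces several small ones, which is why area rather than unit count is the relevant measure of fab scale.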


Yes, ARM now has enough scale to compete with x86 (and is starting to). And got there by filling a niche that wasn't taken over.

That's what I said.


Do these IBM systems have anything like Intel ME that might be used by Big Bro for remote, covert rooting or subversion of the server? If yes on the standard model, can we order a version of these systems that lack that feature?

I'd love to see some market forces nudging Intel to offer CPU's and Motherboards without ME.


Check out: http://openpowerfoundation.org

and: https://www.raptorengineering.com/TALOS/prerelease.php

Supposedly this could be a way out of Intel ME. Gotta see it first though, but I have one pre-ordered (whatever that means).


Interesting, IBM wants to become ARM of server computing - "Open to IP changes"


So, is the 24 cores 4 threads variant just the same as the 12 cores 8 thread variants but with each (ridiculously) wide core statically partitioned in two?

edit: should have read till the end.


I don't understand why IBM is not creating their own cloud services like Amazon, MS or Google. They have talent, and they make their own hardware and software. They are struggling a little bit with this new software landscape, so why not just do it?


They acquired SoftLayer, and they're pushing BlueMix and Watson pretty hard - so they definitely have a footing in cloud hosting/services!


Well, there's BlueMix, but it's a shameful shadow of what Amazon/Google/MS are offering.


Why wouldn't IBM just have a simple product page for such an innovation?

Like an Apple.com/iphone page for Power systems :-)

Search for Power9 on IBM.com yields practically nothing.

http://www.ibm.com/search/esas/search?q=power9&v=17&sn=23&us...

Perhaps the website is all maintained by the folks busy in "the cloud".


Because this is just a technical preannouncement, you can't actually buy anything yet? A search for POWER8 returns lots of hits.


Agreed. I wasn't looking for a buy button. However, an authoritative info page by IBM itself providing technical specs and details of the platform would earn them good credits at least amongst the tech audience.


Also likely because it's not targeted at consumers in the same way. Intel doesn't release products like that either. Dell and Apple do for their products that integrate Intel chips, though.


24 cores seems odd here; wouldn't 32 cores be a more symmetric design? What is the main reason for this? Cost?


I have no direct knowledge on IBM's processor, but from working on other processors, usually it's due to product yield.

Today's cutting-edge fabrication of multi-core chips makes it quite unlikely to get many dies where all the cores work at high speeds. Chips must run at a single frequency, so the slowest core is the bottleneck. If you fabricated 32 cores and 1 failed testing, the whole chip would be garbage.

So the strategy is that the chips are fabricated with 32 cores, then speed-tested, and the fastest 24 cores are kept. The remaining 8 are fuse-disabled, and the end user sees a 24-core chip.

Note this strategy also allows them to produce a 16-core chip with the same die, if for example only 20 cores work at high speeds.

So you get both high yield and high speed chips.
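The yield math behind this binning strategy is a simple binomial calculation. A quick sketch, where the 95% per-core yield is an illustrative assumption of mine (not an IBM figure) and cores are treated as failing independently:

```python
from math import comb

def p_at_least(n_cores: int, k_good: int, p_core: float) -> float:
    """Probability that at least k_good of n_cores cores pass testing,
    assuming independent per-core yield p_core (a simplification)."""
    return sum(
        comb(n_cores, i) * p_core ** i * (1 - p_core) ** (n_cores - i)
        for i in range(k_good, n_cores + 1)
    )

P_CORE = 0.95  # assumed per-core yield, purely illustrative

# Selling the die as a 32-core part requires all 32 cores to work:
print(f"32/32 good: {p_at_least(32, 32, P_CORE):.3f}")  # roughly 0.19
# Selling it as a 24-core part only needs any 24 of the 32:
print(f">=24 good:  {p_at_least(32, 24, P_CORE):.5f}")  # very close to 1
```

Under these (made-up) numbers, a 32-core-only product would throw away about four out of five dies, while the 24-of-32 bin salvages nearly all of them.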


Yes, that's done sometimes, but in this case, from the die picture posted here in another thread [1], it seems to have indeed 24 cores in 12 pairs (each of the "houses").

[1] https://news.ycombinator.com/item?id=12351729

http://www.nextplatform.com/wp-content/uploads/2016/08/ibm-h...


Symmetric doesn't mean 'power of two'. Quite a few ICs have had even, non-power-of-two core counts (I'm not aware of any with an odd number of cores, but that doesn't seem like the end of the world either). Lots of other things have to fit on the die too (e.g. unified cache, which takes up a lot of real estate in chips that have it).


slightly off topic, but for some reason your comment made me remember the time I had to do some regression testing of a fix backported to a really old version of our software.

We had to test against the OSs we'd since dropped support for, which included HP-UX, AIX and s390.

The first time I heard someone mention a "31bit" build, I thought they were taking the piss :P It just seemed so odd at the time...


Between Power9, AMD's Zen-based Naples, and Qualcomm's ARM-based Centriq, things are looking pretty interesting for the future of servers.


IMO Xeon will be tough to unseat for performance. But a lot of server consumers only want to be able to scale ever-more guest OS/containers on a single platform, so these challengers do pose a real threat to Intel's server market.

Open source software has been a powerful economic force: the fact that so much software exists that's designed to target many similar platforms and is frequently recompiled for those targets means that the impact to the end-consumer of switching among x86/ppc/arm is much lower than it was in the past.
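As a toy illustration of why recompiling portable open source software for a new target is cheap: most of the work is just mapping the machine string to the right build target. A minimal sketch (the mapping table is illustrative, using the common RPM-style labels, including ppc64le for little-endian POWER):

```python
import platform

# Illustrative mapping from `uname -m` style strings to the
# architecture labels commonly used by RPM-based distros.
ARCH_LABELS = {
    "x86_64": "x86_64",
    "aarch64": "aarch64",   # 64-bit ARM
    "ppc64": "ppc64",       # big-endian POWER
    "ppc64le": "ppc64le",   # little-endian POWER (modern server default)
}

def distro_arch(machine=None):
    """Return the distro arch label for a machine string,
    defaulting to the machine we are running on."""
    machine = machine or platform.machine()
    return ARCH_LABELS.get(machine, "unknown")

print(distro_arch("ppc64le"))  # -> ppc64le
```

When the source targets many platforms cleanly, switching architectures mostly comes down to flipping this label in the build pipeline rather than porting code.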


I don't know if they have to beat Intel on performance per core. There are markets that require the best performance per core, but there is also a big market that wants more reasonably powerful cores for a given price point.

For instance, in the VPS hosting market, you see most providers offer "vCPUs" (virtual CPUs), because then they can sell their customers "four cores" (or 20 cores) on the cheap, and so on. It's a good marketing strategy.

What if they can offer real cores for the same low price? That would be pretty appealing to their customers. They can also offer "24-core dedicated servers" on the cheap, and so on.

Also, in the long run, it would be best for hosting and cloud providers if they would at least adopt a strategy of 50% Intel chips, and 50% AMD + Power + ARM. That should get them much better prices in the future (from all the chip providers), if they did that. Monopolies don't serve anyone but the monopolist.


They could look at the Cavium (ARM 64 bit) chips. 48 cores / socket, and typically two sockets per server. Each core is in-order and rather slow. eg: http://b2b.gigabyte.com/products/product-page.aspx?pid=5460#...


And maybe Mill Computing can follow through with the claim that their belt-based, general-purpose DSP CPU design can beat all other offerings, at least with regard to power use. I wonder how long it will be until physical chips are out. Or at least new talks; they were really good.


I watched all the talks and it taught me a lot about how CPUs work, and what an interesting new architecture the "belt machine" is.

Highly recommended. http://millcomputing.com/docs/


I was enthralled watching those talks. The concept of storing high-level bytecode for executables that gets targeted to the specific machine at runtime is one I've always found cool (I can install an ancient AS/400 app, originally built for IBM's old 48-bit CPU, on our shiny POWER7 box running IBM i, as long as it didn't use any private APIs).

However, we'll see, if and when they produce real silicon, whether the concept will be another Itanic. Even AMD moved away from VLIW designs for their GPUs for performance reasons ~4 years ago (though, to be fair, they don't exactly have the $$$ to pay for tremendous numbers of driver developers to optimize their code generation anymore).



