
While the performance seen here is nice, I'm curious to see the price/performance ratio. Running against an 8-core Xeon would not make sense if the closest Intel system price-wise is a quad 12-core Xeon... Obviously we are talking cloud here, so it might not even apply.

In my experience with Power7, the price/performance ratio is much worse on Power than on Intel systems. Maybe that has changed, but I'm not holding my breath, even if IBM seems much more aggressive on pricing with P8 than they were with P5-P7.

Quick calculation, absolutely unscientific:

Seeing that the price is $0.14/hour for the 6-core Xeon and $1.08/hour for the 176-core P8, it would have to be roughly 8-10x faster to justify the cost difference; I'm not sure that will be the case.
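
Spelling out the arithmetic (prices as quoted above, assuming the break-even speedup is simply the price ratio):

    xeon_price = 0.14           # $/hour for the 6-core Xeon instance
    p8_price = 1.08             # $/hour for the 176-core P8 instance
    breakeven = p8_price / xeon_price
    print(round(breakeven, 1))  # ~7.7, so call it roughly 8x before the P8 pays off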




The thing you're getting here is primarily throughput on a single image. Even if it's more expensive per core per hour, you can't discount that you'd have to work a lot harder to get the equivalent 30-box distributed solution to work properly, and even then it would have certain disadvantages owing to network latency.


This is interesting and I'd like to hear more opinions on it. My impression is that distributed computing has been eating Power/Sparc/Z processors' lunch for a long time now because software has made up for the deficiencies of coordinating 30 boxes. Do you, or do any others, believe that we are at an inflection point where the pendulum swings back in the direction of 'high-performance' processors like Power8, or will improvements in 'scale-out' ease-of-use and economies of scale continue to win the day?


The dominant use case for the last decade or so has been web servers hitting caches to do low-CPU low-causality CRUD operations. That looks unlikely to change in the next decade, so keep your Intel stock.

That said, for a lot of interesting use cases, like that king-hell postgres database sitting in the middle of the swarm, or video processing, or streams processing, or indeed any situation in which thousands-to-millions of simultaneous actors need to work on the same shared state, this sort of system starts looking real interesting.

As a thought experiment, think of this system like a GPU, except every single processor is a fully capable 2 GHz i5 running Unix, and instead of having to deal with the CUDA or OpenCL API, you can just write erlang (or haskell, or whatever) code and it will run. And instead of having 2-8G of RAM, you have 48G. And instead of having arcane debug tools, you have recon and gdb and ddd.
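
To make that concrete, here's a toy sketch (Python rather than erlang/haskell, and the worker count is purely illustrative): all the parallelism lives on one box, so fanning work out over the shared data is an ordinary library call instead of a cluster deployment.

    from multiprocessing import Pool

    def crunch(chunk):
        # CPU-bound work on one slice of the input, all in local RAM
        return sum(x * x for x in chunk)

    if __name__ == "__main__":
        data = list(range(1_000_000))
        workers = 176                                # illustrative: one per hardware thread
        chunks = [data[i::workers] for i in range(workers)]
        with Pool(workers) as pool:                  # every worker runs on the same machine
            total = sum(pool.map(crunch, chunks))
        print(total)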

I don't think there is a pendulum; I think there's a spectrum, and there always has been. Pragmatism should always rule, and your use case is not my use case. There isn't ever going to be an objective winner, no matter how close Intel may get to covering much of the sweet spot.


If you have a problem that behaves poorly in the face of the Network Fallacies, then you want to scale vertically.

RDBMSes are a classic example. So are some kinds of compute-heavy problems -- simulations with lots of coupled components, video compression, etc.


I would be extremely surprised if you needed 30 x86 boxes to reach the performance of a P8 box, on any type of workload. In my experience with P5-P7, they can be faster than x86 for certain workloads, but not by that much.


You can't really compare the different chip revs apples-to-apples. P6 was a completely different chip architecture with much higher clock speeds that IBM abandoned because it didn't perform well. They make a lot of changes in each chip rev.



