
The price per core in terms of watts strongly favors the GPU. Even if a card is drawing 250 W, it's providing 2300+ cores; that is crazy.

http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-780...
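To put a rough number on that claim (the ~84 W quad-core TDP below is a typical desktop figure I'm assuming, not something from the linked page):

    250\,\mathrm{W} \div 2304\ \text{CUDA cores} \approx 0.11\,\mathrm{W/core}
    84\,\mathrm{W} \div 4\ \text{CPU cores} = 21\,\mathrm{W/core}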




That GPU has 12 cores, in the same sense that a desktop CPU has 4 cores: that's the number of units with concurrent, independent execution flow. They are very wide and have a lot of execution resources, yes; maybe even 5x the computing power of a desktop CPU once clock frequency is taken into account.

"2300+ cores" is a VERY misleading way to represent GPU resources. You could also say GTX780 has 12 cores with 1/3rd clock frequency is an equally unfair and equally "true" way to express it, if you were trying to suggest CPUs are "better".


Yes, but each of those cores does very little compared to one core on a CPU.


Doing very little, times 2300 in parallel, is sort of the point though. If you have a parallelism-friendly job (like a sum/aggregate over many rows of a table), it is stupidly difficult to make it fast on a complex CPU where most of the execution paths sit unused: you have to wait for the instruction pipeline to clear, and you can only process so many ops per cycle (what's a popular core count now on a big CPU system? 48-96 threads, I think, on my UCS blades). When you're talking THOUSANDS of cores on a wee baby 250 W GPU card, of which I can put TWO in each system? That is enormously powerful for those parallel tasks.
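As a concrete sketch of the kind of job being described (not code from anyone in this thread), here is roughly what a one-column sum looks like in CUDA: every block tree-reduces its chunk in shared memory, and the host folds the per-block partials. The block size, grid size, and host-side combine are illustrative choices, not a tuned implementation.

    // Minimal sketch: sum a large "column" of floats on the GPU.
    #include <cstdio>
    #include <vector>
    #include <cuda_runtime.h>

    // Each block reduces its chunk of the input into one partial sum
    // using shared memory, then writes that partial to global memory.
    __global__ void partial_sum(const float *in, float *out, int n) {
        extern __shared__ float sdata[];
        int tid = threadIdx.x;
        int i = blockIdx.x * blockDim.x * 2 + threadIdx.x;

        // Each thread folds two elements on the way in.
        float v = 0.0f;
        if (i < n)              v += in[i];
        if (i + blockDim.x < n) v += in[i + blockDim.x];
        sdata[tid] = v;
        __syncthreads();

        // Tree reduction within the block: log2(blockDim.x) steps.
        for (int s = blockDim.x / 2; s > 0; s >>= 1) {
            if (tid < s) sdata[tid] += sdata[tid + s];
            __syncthreads();
        }
        if (tid == 0) out[blockIdx.x] = sdata[0];
    }

    int main() {
        const int n = 1 << 24;                 // ~16M "rows"
        const int threads = 256;
        const int blocks = (n + threads * 2 - 1) / (threads * 2);

        std::vector<float> h(n, 1.0f);         // every row is 1, so the sum should equal n
        float *d_in, *d_out;
        cudaMalloc(&d_in, n * sizeof(float));
        cudaMalloc(&d_out, blocks * sizeof(float));
        cudaMemcpy(d_in, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);

        partial_sum<<<blocks, threads, threads * sizeof(float)>>>(d_in, d_out, n);

        // Combine the per-block partials on the host; a second kernel
        // pass would normally finish this on the device.
        std::vector<float> partials(blocks);
        cudaMemcpy(partials.data(), d_out, blocks * sizeof(float), cudaMemcpyDeviceToHost);
        double total = 0.0;
        for (float p : partials) total += p;

        printf("sum = %.0f (expected %d)\n", total, n);
        cudaFree(d_in);
        cudaFree(d_out);
        return 0;
    }

The parent's point shows up in the launch line: tens of thousands of threads are in flight at once, and each one does almost nothing.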


In that case my 4-core is really -- waves hands -- a 576-core system. 4 cores, maybe 2 AVX 8-wide instructions executing at once gives 2 * 8 * 4, maybe 3 pipeline stages in flight, and 3x the clock. Or something. So I'm getting a roughly comparable, completely meaningless 2 * 8 * 4 * 3 * 3 = 576 "cores".

I'm not suggesting CPU resources should be counted like that, but that's closer to how GPU resources are counted. Sure, it sounds impressive, but is "2304 cores" really a fair comparison to, say, 4 CPU cores?


While we're making hand-wavey comparisons, I'll make one.

NVidia GPUs have a theoretical peak of about 3-5 TFlops for 250 watts. http://en.wikipedia.org/wiki/List_of_Nvidia_graphics_process...

Xeons have a theoretical peak of about 0.5-1 TFlops for 150 watts. http://www.microway.com/hpc-tech-tips/intel-xeon-e5-2600-v3-...

Is that completely apples-to-apples? Probably not, since the Xeon figure is likely double-precision floating point versus single precision on the GPU. But for a lot of database applications that don't involve money, single-precision floats are accurate enough for the performance improvement to be attractive.
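Roughly where those peaks come from, assuming 2 FLOPs per CUDA core per cycle on the GPU and Haswell's 16 double-precision FLOPs per core per cycle on the CPU (the clocks and core counts below are for a GTX 780 and a 12-core E5 v3, picked as representative parts):

    \text{GTX 780 (single precision):}\quad 2304 \times 2 \times 0.863\,\mathrm{GHz} \approx 4.0\ \text{TFlops}
    \text{Xeon E5 v3 (double precision):}\quad 12 \times 16 \times 2.6\,\mathrm{GHz} \approx 0.5\ \text{TFlops}

Per watt that's roughly 16 GFlops/W for the GPU versus 3-7 GFlops/W for the Xeon, with the single- vs. double-precision caveat still applying.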

Yeah, the performance gap isn't 100x like it used to be, but it's still enough that if you have racks and racks full of machines, a 3-10x improvement could be really substantial. Going from $10k/mo in rent to $1k/mo at a datacenter could make or break an early-stage startup.

Further, as things get cheaper they get used a lot more. Scientists have only two models: the ones they can run but don't really like, and the ones they want to run. Adding fidelity to modeling codes isn't an absolute good, but it's hard to argue that it makes the world worse.



