AMD Radeon HD 6990 already has over 1 TFLOPS performance double-precision, and t...

phamilton · on Nov 16, 2011

Yes, there is no problem buying it, but have you ever tried programming in OpenCL? Complexity aside, GPGPU hits a big bottleneck when dealing with large datasets. There just isn't enough memory available on the GPU, and transfers to and from the device are costly.

buff-a · on Nov 16, 2011

I promise you that if you don't pay an equivalent amount of attention to data availability on these Intel chips you won't see teraflop speed. If there's one thing that GPGPU did, it was force idiots to have to think about presenting data to the processor instead of leaving it all over the fucking place and letting the cache "sort it out" (read: "run really slow").

That said. I'll take one =)

marcf · on Nov 16, 2011

There are AMD video cards with 4GB of memory these days and, as I understand it, special cards have up to 16GB.

phamilton · on Nov 16, 2011

16GB isn't adequate for the work I've done on them. Genome assembly (an O(n^2 k^2) process) generally involves 100GB of data. Each segment needs to be compared against every other segment (O(n^2)), and when comparing two segments, each datum needs to be compared to every other datum (O(k^2)).

So while you can just transfer 4GB of data to the card at a time, it really doesn't cut it.
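To put rough numbers on that, here is a minimal sketch of the transfer volume for a tiled all-pairs comparison. It rests on assumptions that are not from any real assembler: the function name `tiled_all_pairs_traffic` is made up for illustration, the card holds exactly two equal-sized blocks at once, one block stays resident while every later block is streamed past it, and results coming back from the device are ignored.

```python
import math


def tiled_all_pairs_traffic(total_gb, device_gb):
    """Estimate host-to-device traffic for an all-pairs comparison done in tiles.

    Hypothetical model: device memory is split into two equal blocks,
    block i stays resident while each later block j > i is streamed in.
    """
    block_gb = device_gb / 2                      # assumption: two blocks fit on the card
    n_blocks = math.ceil(total_gb / block_gb)     # blocks needed to cover the dataset
    # One load per resident block, plus one load for every (i, j) pair with j > i.
    block_loads = n_blocks + n_blocks * (n_blocks - 1) // 2
    return n_blocks, block_loads, block_loads * block_gb


n_blocks, loads, traffic_gb = tiled_all_pairs_traffic(total_gb=100, device_gb=4)
print(n_blocks, loads, traffic_gb)  # 50 1275 2550.0
```

Under those assumptions, a 100GB dataset on a 4GB card means 50 blocks and about 2550GB pushed over the bus, roughly 25 times the size of the dataset itself.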

r3demon · on Nov 16, 2011

I'm sure Intel won't cut it either. PCI Express speed is limited, and RAM wouldn't catch up with simple computation speed. Find a better algorithm.

rayiner · on Nov 16, 2011

Far more limited architecture.

r3demon · on Nov 16, 2011

By this logic all RISC architectures are very limited, but they can compute. Parallel architectures have to be seriously limited anyway; you just can't access 100GB of data with 1000 parallel processes at the same time.