
> The only thing sad here is that all this work to enable full-width AVX512 is going to be mostly wasted as approximately 0% of all client software will get recompiled to an AVX512 baseline for decades if ever.

Well, the other thing is that for workloads where you're cramming 512-bit vectors through 2xFMAs every cycle -- there's a good chance you can just buy GPUs (and have been buying them) to handle that problem. So, I think that space has been eaten up a bit in recent times.

I don't think it will be decades of waiting though. AVX2 is a practical baseline today IMO and Haswell is what, barely 10 years old? Intel dragged their feet like crazy of course, but "decades" from now is a bit much. And AVX-512's best feature -- its much more coherent and regular design -- means a lot of vectorization opportunities might be easier to exploit, even automatically (e.g. universal masking and gather/scatter make loop optimizations more straightforward). We'll have to see how it shakes out.




The GPUs that can do FP64 operations are priced out of the range acceptable for small businesses or individuals.

The consumer GPUs are suitable only for games, graphics and ML/AI.

There are also other applications, like in engineering, where the only cost-effective way is to use CPUs with good AVX-512 support, like Zen 5.

A 9950X has FP64 throughput similar to that of the last GPUs that still had acceptable prices, from 5 years ago (Radeon VII).

Even for FP32, the throughput of a 9950X is similar to that of a very good integrated GPU (512 FP32 FMA per cycle, but at double the clock frequency, so equivalent to a GPU doing 1024 FP32 FMA per cycle), even if it is no match for a discrete GPU.
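The per-cycle figures in the paragraph above can be reproduced with some quick arithmetic. This is only a rough sketch: the core count (16) and the two 512-bit FMA pipes per core are my assumptions about a desktop Zen 5 part like the 9950X, and the 2x clock ratio is just the round figure used above, not a measured value.

```python
# Back-of-the-envelope peak FMA throughput for a 16-core desktop Zen 5 CPU.
# Assumed, not taken from the comment: 16 cores, two 512-bit FMA pipes per core.

VECTOR_BITS = 512
CORES = 16       # assumed core count (Ryzen 9 9950X)
FMA_PIPES = 2    # assumed number of 512-bit FMA pipes per core


def fma_per_cycle(element_bits: int) -> int:
    """Peak FMA operations per clock cycle across the whole chip."""
    lanes = VECTOR_BITS // element_bits
    return CORES * FMA_PIPES * lanes


fp32_per_cycle = fma_per_cycle(32)  # 16 cores * 2 pipes * 16 lanes
fp64_per_cycle = fma_per_cycle(64)  # 16 cores * 2 pipes * 8 lanes

# The comment's round figure: the CPU clocks about twice as high as an iGPU.
CLOCK_RATIO = 2
gpu_equivalent_fp32 = fp32_per_cycle * CLOCK_RATIO

print(fp32_per_cycle, fp64_per_cycle, gpu_equivalent_fp32)  # 512 256 1024
```

These are peak numbers only; sustained throughput depends on clocks under AVX-512 load, memory bandwidth, and how well the code keeps both FMA pipes busy.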

There are also applications where only short computations are done on each batch of data, so the latency of transferring the data to the GPU can reduce the performance below what can be achieved on the CPU.
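A toy cost model makes the transfer-latency point concrete. Everything here is hypothetical: the function names, the simple "fixed latency + transfer + compute" model, and all the numbers in the example are placeholders picked for illustration, not measurements of any real CPU, GPU, or link.

```python
# Toy model: when does offloading to a GPU lose to staying on the CPU?
# All names and numbers are illustrative placeholders, not measurements.


def cpu_time(flops: float, cpu_flops_per_s: float) -> float:
    """Time to run the computation entirely on the CPU."""
    return flops / cpu_flops_per_s


def gpu_time(flops: float, gpu_flops_per_s: float, bytes_moved: float,
             link_bytes_per_s: float, fixed_latency_s: float) -> float:
    """Fixed launch/transfer latency, plus data movement, plus GPU compute."""
    return (fixed_latency_s
            + bytes_moved / link_bytes_per_s
            + flops / gpu_flops_per_s)


# Placeholder machine: the GPU is 10x faster at raw compute than the CPU.
CPU_RATE, GPU_RATE = 1e12, 1e13   # FLOP/s
LINK_RATE, LATENCY = 1e10, 1e-5   # bytes/s, seconds
DATA_BYTES = 1e6                  # data shipped to the GPU per call

small_job, large_job = 1e6, 1e13  # FLOPs per call

# Short computation: the transfer overhead dominates, so the CPU wins.
print(cpu_time(small_job, CPU_RATE)
      < gpu_time(small_job, GPU_RATE, DATA_BYTES, LINK_RATE, LATENCY))

# Long computation: the overhead is amortized, so the GPU wins.
print(cpu_time(large_job, CPU_RATE)
      > gpu_time(large_job, GPU_RATE, DATA_BYTES, LINK_RATE, LATENCY))
```

With these placeholder values both comparisons print True; the real crossover point depends entirely on the actual hardware and how much data has to move per call.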

Obviously, there are things better done on a GPU, but there are enough cases where a high-throughput CPU like a desktop Zen 5 is better.



