Correct, GPUs can indeed do a better job at hiding latency through massive parallelism.
My expertise might be outdated here, but the problem used to be that actually reaching that max bandwidth was just about impossible because of divergent warps and uncoalesced reads.
Is this still the case with Volta? Did you avoid these issues in your Equihash implementation?
Divergent warps are still a huge problem (though SLIDE doesn't suffer from this, AFAIK).
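For anyone who hasn't run into the term, here's a minimal CUDA sketch of what divergence looks like; the kernel and names are made up for illustration, not taken from SLIDE or my Equihash code:

```cuda
// Illustrative only: threads in one 32-lane warp branch on their lane
// index, so the hardware executes both paths serially for that warp.
__global__ void divergent(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (threadIdx.x % 2 == 0)
        out[i] = in[i] * 2.0f;   // even lanes run while odd lanes idle
    else
        out[i] = in[i] + 1.0f;   // then odd lanes run while even lanes idle
}
```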
Uncoalesced reads are not a severe enough problem to make GPUs underperform CPUs. Or, put another way, uncoalesced reads hurt GPUs and CPUs roughly equally.
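Again a made-up sketch, showing a coalesced access pattern next to a strided one:

```cuda
// Illustrative only: adjacent lanes touch adjacent addresses, so a
// warp's 32 loads coalesce into a few wide memory transactions.
__global__ void coalesced(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];             // lane k reads element k
}

// Adjacent lanes read addresses 32 floats apart, so each load becomes
// its own transaction and effective bandwidth drops sharply.
__global__ void uncoalesced(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[(i * 32) % n];  // stride-32 gather
}
```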
The only reason GPUs can hide the latency is the massive parallelism in the problem space (computing the hash for nonce n doesn't block computing it for nonce n + 1). This algorithm involves a lot of data dependency, so a machine training these networks may actually be memory-latency bound (unless you are training a ton of neural networks at once and can hide the latency that way), which is extremely bad for GPUs.
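A toy CUDA sketch of the difference (hash_step is an assumed stand-in for one round of a hash, not code from either project):

```cuda
// Assumed helper: a cheap integer mix standing in for a hash round.
__device__ unsigned int hash_step(unsigned int x) {
    x ^= x >> 16; x *= 0x7feb352dU;
    x ^= x >> 15; x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

// Nonce search: every thread hashes its own nonce, so thousands of
// independent hashes are in flight and a stall in one warp is covered
// by progress in another.
__global__ void search(unsigned int base, unsigned int *digests, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    unsigned int nonce = base + i;       // nonce i independent of nonce i+1
    digests[i] = hash_step(nonce);
}

// Data-dependent chain: step s+1 needs the result of step s, so there
// is nothing else to schedule while waiting (launch with one thread).
__global__ void chained(unsigned int seed, unsigned int *out, int steps) {
    unsigned int x = seed;
    for (int s = 0; s < steps; ++s)
        x = hash_step(x);                // serialized on the dependency
    *out = x;
}
```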