
Definitely. Once AVX512-IFMA becomes more common (so far it is only available on Intel's 10nm CPUs), I may try a PCG with 104 bits of state. I want SIMD RNGs, with a separate generator per vector lane, and there PCG is hurt by 64-bit integer multiplication being on the slower side. AVX2 has no 64-bit multiply, so it has to be built from three 32-bit multiplications, while the AVX512 64-bit multiply has much lower throughput than an xor, a shift, or floating-point arithmetic. The new IFMA instructions, which multiply 52-bit integers, may change that.
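
To make the "three 32-bit multiplications" point concrete, here is a rough scalar sketch in Python of how the low 64 bits of a 64-bit product can be assembled from 32-bit halves. The function name is made up for illustration, and this only models the arithmetic that each vector lane would do; it is not SIMD code and says nothing about actual instruction timings.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def mul64_lo_via_32(a, b):
    """Low 64 bits of a*b using three 32x32-bit products (hypothetical helper)."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # Product 1: low halves, a full 64-bit result.
    lo_lo = a_lo * b_lo
    # Products 2 and 3: cross terms. They are shifted left by 32, so only
    # their low 32 bits can land inside the 64-bit result.
    cross = (a_hi * b_lo + a_lo * b_hi) & MASK32
    # a_hi * b_hi would be shifted left by 64, so it never contributes.
    return (lo_lo + (cross << 32)) & MASK64
```

The high-by-high product is dropped entirely, which is why three multiplications are enough when only the low 64 bits are needed, as in an LCG or PCG state update.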

My use case is Monte Carlo, where "passes statistical tests" is probably all that's needed for good results. Maybe I should just try an MCG.
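
For reference, a minimal sketch of what a 64-bit MCG (multiplicative congruential generator, i.e. an LCG with no additive constant) could look like. The multiplier below is the one commonly used for 64-bit LCGs (Knuth's MMIX constant, also PCG's default); using it here is an assumption for illustration, not a tuned choice for an MCG, and the function names are made up.

```python
MASK64 = (1 << 64) - 1
# Assumption: illustrative multiplier borrowed from 64-bit LCGs, not tuned for MCG use.
MULTIPLIER = 6364136223846793005

def mcg_step(state):
    """Advance the state once; return (new_state, 32-bit output)."""
    state = (state * MULTIPLIER) & MASK64
    # The low bits of a power-of-two-modulus generator are weak,
    # so only the high 32 bits are returned.
    return state, state >> 32

def mcg_stream(seed, n):
    """Generate n outputs from a seed (hypothetical helper)."""
    # The state must be odd for a power-of-two modulus MCG.
    state = (seed | 1) & MASK64
    out = []
    for _ in range(n):
        state, value = mcg_step(state)
        out.append(value)
    return out
```

The per-step work is a single 64-bit multiply plus a shift, so the cost of the multiply still dominates in a vectorized version.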



