Pretty sure a memory access is faster than the methods presented in the article.

PhilipRoman · 2024-12-21T20:32:43 1734813163

It also depends heavily on the context. You pay for each cache miss twice: once for the miss itself, and again the next time you access whatever was evicted by that miss. This is why LUTs often shine in microbenchmarks but drag down performance in real-world scenarios, where they are mixed with other cache-bound code.
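The "pay twice" effect can be sketched with a toy LRU cache simulation. Everything here is an assumption made for illustration (a fully associative 64-line cache, a 48-line application working set, a 1024-line table accessed at random); it is not a model of any real CPU, only a way to see table lookups evicting lines the rest of the code needs.

```python
from collections import OrderedDict
import random


class LRUCache:
    """Toy fully associative cache with LRU replacement (an assumption)."""

    def __init__(self, capacity_lines):
        self.capacity = capacity_lines
        self.lines = OrderedDict()

    def access(self, line):
        """Touch a cache line; return True on a hit, False on a miss."""
        if line in self.lines:
            self.lines.move_to_end(line)  # mark as most recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[line] = None
        return False


def app_misses(use_lut, capacity=64, app_lines=48, lut_lines=1024,
               rounds=100, seed=0):
    """Count misses on the application's own data, with or without a LUT."""
    cache = LRUCache(capacity)
    rng = random.Random(seed)
    misses = 0
    for _ in range(rounds):
        for i in range(app_lines):
            if not cache.access(("app", i)):
                misses += 1
            if use_lut:
                # The lookup itself may miss, and it may also evict an
                # application line that will miss again on the next round.
                cache.access(("lut", rng.randrange(lut_lines)))
    return misses


print("app misses without LUT:", app_misses(False))
print("app misses with LUT:   ", app_misses(True))
```

Without the table, the application's working set fits and only the cold misses remain. With the table in the mix, the application keeps missing on lines it had already loaded, which is the second payment described above.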

dist-epoch · 2024-12-21T20:02:17 1734811337

Hitting L2 takes more than 3-4 cycles.

retrac · 2024-12-21T20:13:17 1734811997

Access to main memory can take many, many cycles; a short routine that is already in cache may be able to recompute a value more quickly than pulling it from main memory.
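To make the two options concrete, here is a minimal sketch of the same function written both ways. The article's methods are not shown in this thread, so a 16-bit population count is used as a hypothetical stand-in. Python will not reproduce the timing difference; the point is only that the computed version needs a handful of register operations and no table in memory.

```python
# Hypothetical stand-in for "the methods in the article": 16-bit popcount.

# Lookup-table version: 65,536 one-byte entries built once up front.
POPCOUNT_LUT = bytes(bin(i).count("1") for i in range(1 << 16))


def popcount16_lut(x):
    """Return the number of set bits via a single table lookup."""
    return POPCOUNT_LUT[x & 0xFFFF]


def popcount16_computed(x):
    """Return the number of set bits with shifts and masks only."""
    x &= 0xFFFF
    x = x - ((x >> 1) & 0x5555)             # counts per 2-bit group
    x = (x & 0x3333) + ((x >> 2) & 0x3333)  # counts per 4-bit group
    x = (x + (x >> 4)) & 0x0F0F             # counts per byte
    return (x + (x >> 8)) & 0x1F            # sum of the two byte counts


# Both versions agree on every 16-bit input.
assert all(popcount16_lut(i) == popcount16_computed(i) for i in range(1 << 16))
```

In a compiled language the computed version is a few ALU instructions, while the table version is one load whose cost depends entirely on where the entry currently lives in the memory hierarchy.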

ryao · 2024-12-21T20:53:59 1734814439

An uncached random memory access is around 100 cycles.

Sesse__ · 2024-12-21T22:42:54 1734820974

100 cycles would be very low. Many systems have a memory latency of more than 100 ns!

ryao · 2024-12-25T03:22:12 1735096932

You are correct. I used the wrong unit:

https://jsmemtest.chipsandcheese.com/latencydata

We can say around 100 ns, although likely somewhat more.
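For a sense of how far apart the two units are, here is the conversion as a small sketch. The clock speeds are assumed example values, not measurements from the linked data.

```python
def ns_to_cycles(latency_ns, clock_ghz):
    """Convert a latency in nanoseconds to core clock cycles.

    A 1 GHz clock ticks once per nanosecond, so cycles = ns * GHz.
    """
    return latency_ns * clock_ghz


def cycles_to_ns(cycles, clock_ghz):
    """Convert a cycle count back to nanoseconds at the given clock."""
    return cycles / clock_ghz


# Assumed example clock speeds; substitute the figure for your own CPU.
for ghz in (2.0, 3.0, 4.0):
    cycles = ns_to_cycles(100, ghz)
    print(f"100 ns at {ghz} GHz is about {cycles:.0f} cycles")
```

At an assumed 3 GHz, 100 ns works out to 300 cycles, and 100 cycles would be only about 33 ns.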

Retr0id · 2024-12-21T20:32:07 1734813127

64K is enough to fill L1 on many systems
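A quick footprint check, as a sketch: this assumes "64K" means a table of 65,536 one-byte entries, and the L1 data cache sizes listed are assumed example values rather than figures for any particular chip.

```python
KIB = 1024


def table_bytes(entries, bytes_per_entry):
    """Return the memory footprint of a lookup table in bytes."""
    return entries * bytes_per_entry


def fits_in_cache(table_size, cache_size):
    """Return True if the table alone would fit in a cache of this size."""
    return table_size <= cache_size


# Assumption: 64K one-byte entries. Wider entries scale the footprint up.
table = table_bytes(1 << 16, 1)

# Assumed example L1 data cache sizes.
for l1_kib in (32, 48, 64, 128):
    verdict = "fits" if fits_in_cache(table, l1_kib * KIB) else "does not fit"
    print(f"{table // KIB} KiB table vs {l1_kib} KiB L1d: {verdict}")
```

Even where the table nominally fits, it leaves no room for anything else in L1, so the rest of the program's data gets pushed out.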