I haven't looked at the problem closely enough to answer, but could we start fro...

haxen · 2024-03-10T07:10:52 1710054652

> maybe 4 cycles worth in this optimized version?

It's quite a bit more than that. The code discussed in the post alone is around 20 instructions, and there are several more concerns, such as finding the delimiter between the name and the temperature, and the hashtable operations. All put together, it comes to around 80 cycles per row.

When explaining the timing of 1.5 seconds, one must take into account that it's parallelized across 8 CPU cores.
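As a rough sketch of that arithmetic (not a measurement): the function below turns a per-row cycle cost into a wall-clock estimate, assuming the work splits evenly across cores. The function name and parameters are hypothetical. Only the ~80 cycles per row and the 8 cores come from this thread; the row count and the clock rate have to be supplied by whoever runs it.

```python
def estimate_wall_seconds(rows: int, cycles_per_row: float,
                          cores: int, clock_hz: float) -> float:
    """Back-of-envelope wall-clock time for a perfectly parallel scan.

    Assumes every core runs at clock_hz and that the rows are divided
    evenly, with no I/O stalls or scheduling overhead.
    """
    if cores <= 0 or clock_hz <= 0:
        raise ValueError("cores and clock_hz must be positive")
    total_cycles = rows * cycles_per_row      # cycles summed over all cores
    cycles_per_core = total_cycles / cores    # even split across cores
    return cycles_per_core / clock_hz         # cycles -> seconds


# Usage (rows and clock_hz are placeholders to fill in for your machine):
# estimate_wall_seconds(rows=..., cycles_per_row=80, cores=8, clock_hz=...)
```

The point is only that the per-row cost has to be divided by the core count before comparing it against the measured time.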

nkurz · 2024-03-10T22:18:07 1710109087

You are right. In my defense, I meant to say "about 4 cycles per byte of input" but in my editing I messed this up. I'd just deleted a sentence talking about the number of bytes per cycle we could bring in from RAM, but was worried my estimate was outdated. I started trying to research the current answer, then gave up and deleted it, leaving the other sentence wrong.

stavros · 2024-03-10T10:59:07 1710068347

Sorry, yes, I meant disk I/O. I should have clarified.