What do you think the threshold number is where scanning a list will outperform a hash table? 10 items? 100? 1000? For what it’s worth, for small data sets I agree with you. But I think it’s very hard to be calibrated correctly on what small means here.
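A toy benchmark makes the tradeoff concrete (my own sketch, nothing rigorous): scan a std::vector with std::find versus look up the same keys in a std::unordered_set. The crossover it reports will vary a lot with CPU, compiler, element size, and hash quality, which is rather the point.

    // Toy scan-vs-hash benchmark. Every element is looked up once in
    // both structures; the times are only meant to show that the
    // crossover has to be measured, not guessed.
    #include <algorithm>
    #include <chrono>
    #include <cstdio>
    #include <numeric>
    #include <unordered_set>
    #include <vector>

    int main() {
        for (std::size_t n : {8, 64, 512, 4096, 32768}) {
            std::vector<int> vec(n);
            std::iota(vec.begin(), vec.end(), 0);  // keys 0..n-1
            std::unordered_set<int> set(vec.begin(), vec.end());

            std::size_t hits = 0;  // printed later so the loops aren't optimized away

            auto t0 = std::chrono::steady_clock::now();
            for (int key : vec)
                hits += std::find(vec.begin(), vec.end(), key) != vec.end();
            auto t1 = std::chrono::steady_clock::now();
            for (int key : vec)
                hits += set.count(key);
            auto t2 = std::chrono::steady_clock::now();

            auto ns = [](auto a, auto b) {
                return std::chrono::duration_cast<std::chrono::nanoseconds>(b - a).count();
            };
            std::printf("n=%6zu  scan: %9lld ns  hash: %9lld ns  (hits=%zu)\n",
                        n, (long long)ns(t0, t1), (long long)ns(t1, t2), hits);
        }
    }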

Re: binary search, the biggest cost to scanning data in a database is loading that data from persistent storage. A few mispredicted branches aren’t going to matter much if it means you can do fewer reads.
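Rough numbers, using the usual back-of-envelope latency figures (order of magnitude only):

    mispredicted branch: ~15 cycles at ~3 GHz  ≈ 5 ns
    NVMe 4 KiB random read                     ≈ 100,000 ns (~100 µs)
    => one saved read pays for roughly 20,000 mispredictions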

> But I think it’s very hard to be calibrated correctly on what small means here.

It depends.

If you measure, or otherwise determine, the sizes of the CPU's L1/L2/L3 caches, the relative speed and size of main memory, and the relative speed and size of local storage (as opposed to, say, network-remote data), then it comes down to some arithmetic and, just to be on the safe side, some benchmarks.
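On Linux with glibc you can query those cache sizes directly (a sketch; the _SC_LEVEL* names are glibc extensions, so other platforms need sysfs, sysctl, or CPUID instead):

    #include <unistd.h>  // sysconf; _SC_LEVEL* are glibc extensions
    #include <cstdio>

    int main() {
        // Each call returns the size in bytes, or 0/-1 if the
        // information isn't available on this system.
        std::printf("L1d: %ld  L2: %ld  L3: %ld (bytes)\n",
                    sysconf(_SC_LEVEL1_DCACHE_SIZE),
                    sysconf(_SC_LEVEL2_CACHE_SIZE),
                    sysconf(_SC_LEVEL3_CACHE_SIZE));
    }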

Any data set that fits into the CPU cache is definitely small, for that particular CPU.

If you want a single number that applies to all hardware, the problem goes from hard to basically impossible.
