A consistent 5 ms difference in micro-benchmarks is definitely not "measurement noise". Noise averages out long before it accumulates to 5 ms. There must be a reason, and it most likely relates to the change. So you can confidently say that removing bounds checking (at least the way you did it) is a regression.
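The post doesn't say exactly how the checks were removed, but a minimal sketch, assuming it was something like swapping checked indexing for get_unchecked:

    // Checked: v[i] panics if i is out of range (spatial safety).
    fn sum_checked(v: &[u64], idx: &[usize]) -> u64 {
        idx.iter().map(|&i| v[i]).sum()
    }

    // Unchecked: the caller must guarantee every index is in range;
    // an out-of-range index here is undefined behavior.
    fn sum_unchecked(v: &[u64], idx: &[usize]) -> u64 {
        idx.iter().map(|&i| unsafe { *v.get_unchecked(i) }).sum()
    }

Counterintuitively, the unchecked version isn't guaranteed to be faster: the checks can feed the optimizer information, and removing them changes code layout, either of which could produce a regression like this.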
... that being said, I'd argue that the most beneficial memory-safety feature of Rust is temporal safety (i.e. preventing use-after-free and the like) rather than spatial safety such as bounds checks.
Well, there is both random and systematic error in any experiment, and if 5 ms is small relative to the effects you'd expect (or there is some other reason to discount it), then it might come from a flaw in the benchmarking setup that's too small to be worth tracking down. Any test is only accurate to within some level, and results don't average out to infinite precision just because you rerun them enough times.
The 5 ms isn't the key number. It's 5 ms extra over a 28 ms baseline, which is about an 18% difference. If your noise threshold is 18%, then I think you have to accept that the benchmark probably isn't any good for the stated task.
https://github.com/bheisler/criterion.rs is good for tests like that. It gives you much more than a single number and handles things like outliers, which makes identifying noisy benchmarks simpler.
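For example, a minimal sketch of such a benchmark (run_queries is a hypothetical stand-in for the workload under test; the group/throughput setup is what produces Kelem/s output like the numbers quoted below):

    use criterion::{criterion_group, criterion_main, Criterion, Throughput};
    use std::hint::black_box;

    // Hypothetical stand-in for the cached query loop being measured.
    fn run_queries(n: u64) -> u64 {
        (0..n).map(|i| black_box(i).wrapping_mul(2654435761)).sum()
    }

    fn bench(c: &mut Criterion) {
        let mut group = c.benchmark_group("news_app");
        // Tell Criterion one iteration processes 8192 elements,
        // so it reports throughput alongside wall time.
        group.throughput(Throughput::Elements(8192));
        group.bench_function("ranges_and_joins/cached", |b| {
            b.iter(|| run_queries(8192))
        });
        group.finish();
    }

    criterion_group!(benches, bench);
    criterion_main!(benches);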
    news_app/ranges_and_joins/cached
            time:   [28.583 ms 29.001 ms 29.526 ms]
            thrpt:  [277.45 Kelem/s 282.48 Kelem/s 286.61 Kelem/s]

    news_app/ranges_and_joins/cached
            time:   [33.271 ms 33.836 ms 34.418 ms]
            thrpt:  [238.01 Kelem/s 242.11 Kelem/s 246.22 Kelem/s]
Given that 33.836 ms × 242.11 Kelem/s ≈ 8192 (Kelem/s is the same as elements per millisecond), my understanding is that the time reported here is how long it takes to do 8192 queries. It also reports three numbers per metric; in Criterion those are the lower bound, point estimate, and upper bound of a confidence interval, not single samples. All this means the benchmark harness ran the test many times, and the 5 ms difference is not random at all.
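Spelling out that unit arithmetic as a quick check:

    fn main() {
        let time_ms = 33.836_f64;       // median time per iteration
        let thrpt = 242.11_f64;         // median throughput; Kelem/s == elem/ms
        let elems = time_ms * thrpt;    // elements processed per iteration
        assert!((elems - 8192.0).abs() < 1.0);
        println!("elements per iteration ≈ {elems:.1}");
    }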