In HFT you typically only measure latency for events that you actually reacted to, not all events (at least, in the systems I've built).
That's arguably a bit problematic, since events you don't react to can still cause you to fall behind, so it's also useful to track how far behind you are by the time you start processing a packet (which requires hardware timestamping, unfortunately absent from many general-purpose NICs, like what you get in the cloud).
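For what it's worth, here's a minimal sketch of what that lag measurement could look like on Linux with SO_TIMESTAMPING. It assumes a UDP feed, a driver that supports hardware RX timestamps, and a NIC clock synced to the system clock (e.g. via ptp4l/phc2sys) so the two timestamps are actually comparable; it's illustrative, not how any particular production system does it:

```c
#include <linux/errqueue.h>    /* struct scm_timestamping */
#include <linux/net_tstamp.h>  /* SOF_TIMESTAMPING_* flags */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <time.h>

/* Ask the kernel to attach raw hardware RX timestamps to received packets.
 * Note: most NICs also need driver-level enabling via the SIOCSHWTSTAMP
 * ioctl before these actually start flowing. */
static void enable_hw_rx_timestamps(int fd) {
    int flags = SOF_TIMESTAMPING_RX_HARDWARE | SOF_TIMESTAMPING_RAW_HARDWARE;
    setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof(flags));
}

/* Receive one packet and return (processing start time - NIC RX time) in ns,
 * or -1 if no hardware timestamp was attached. */
static long recv_lag_ns(int fd, char *buf, size_t len) {
    char ctrl[CMSG_SPACE(sizeof(struct scm_timestamping))];
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg = { 0 };
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl;
    msg.msg_controllen = sizeof(ctrl);

    if (recvmsg(fd, &msg, 0) < 0)
        return -1;

    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_TIMESTAMPING) {
            struct scm_timestamping ts;
            memcpy(&ts, CMSG_DATA(c), sizeof(ts));
            struct timespec hw = ts.ts[2];  /* raw hardware timestamp */
            struct timespec now;
            clock_gettime(CLOCK_REALTIME, &now);  /* "start of processing" */
            return (now.tv_sec - hw.tv_sec) * 1000000000L
                 + (now.tv_nsec - hw.tv_nsec);
        }
    }
    return -1;
}
```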
For a given market, you may have fewer than 1M samples a day (I've even seen some so-called HFT strategies, or sub-strategies, that only had 300 samples per day).
So overall you'd have far fewer events per second than a 44.1kHz audio stream has samples, and there are typically expected outliers at the start of the data, before the system warms up or recovers from bad initial parameters (you could either just exclude those from your distribution, or try to build the system so it doesn't need a warmup).
But you're right, you probably want to look at quantiles closer to 1 as well. You're also looking at the max regardless.
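To make the "quantiles closer to 1" point concrete, here's a toy sketch of pulling high quantiles and the max out of a day's samples after dropping a warmup prefix (the sample count, warmup cutoff, and synthetic latencies are all made up):

```c
#include <stdio.h>
#include <stdlib.h>

static int cmp_long(const void *a, const void *b) {
    long x = *(const long *)a, y = *(const long *)b;
    return (x > y) - (x < y);
}

/* Nearest-rank-ish quantile on a sorted array; q in [0, 1]. */
static long quantile_ns(const long *sorted, size_t n, double q) {
    size_t i = (size_t)(q * (double)(n - 1));
    return sorted[i];
}

int main(void) {
    /* Pretend these came from your measurement pipeline. */
    size_t n = 300;      /* some sub-strategies really are this sparse */
    size_t warmup = 20;  /* arbitrary warmup cutoff */
    long *lat = malloc(n * sizeof *lat);
    for (size_t i = 0; i < n; i++)
        lat[i] = 1000 + (long)(rand() % 500) + (i < warmup ? 50000 : 0);

    long *body = lat + warmup;  /* skip the expected warmup outliers */
    size_t m = n - warmup;
    qsort(body, m, sizeof *body, cmp_long);

    printf("p50   = %ld ns\n", quantile_ns(body, m, 0.50));
    printf("p99   = %ld ns\n", quantile_ns(body, m, 0.99));
    printf("p99.9 = %ld ns\n", quantile_ns(body, m, 0.999));
    printf("max   = %ld ns\n", body[m - 1]);
    free(lat);
    return 0;
}
```

With only a few hundred samples a day, p99.9 is essentially the max anyway, which is part of why you end up watching the max directly.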