I hadn't seen CLHASH yet, thank you for mentioning it. It looks even faster, but it is only XOR universal (meaning the XOR of two hashes is uniformly distributed, not the hashes themselves). In particular, it fails smhasher's avalanche test (because bit flips have predictable effects), and their proposed fix is more expensive than a HighwayTreeHash round.
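To make the "predictable effects" point concrete, here is a minimal Python sketch. It is not CLHASH: `toy_linear_hash` is a made-up GF(2)-linear function standing in for any hash whose output bits are XOR combinations of input bits, and `fmix64` is the MurmurHash3 64-bit finalizer, used only as one example of a bit-mixing step (not necessarily the fix the CLHASH authors propose). The helper `flip_diffs` is likewise hypothetical.

```python
MASK64 = (1 << 64) - 1

def toy_linear_hash(x: int) -> int:
    """Hypothetical GF(2)-linear hash (XORs and shifts only). NOT CLHASH."""
    x &= MASK64
    return (x ^ (x >> 7) ^ ((x << 13) & MASK64)) & MASK64

def fmix64(h: int) -> int:
    """MurmurHash3 64-bit finalizer, as an example of a bit-mixing step."""
    h &= MASK64
    h ^= h >> 33
    h = (h * 0xFF51AFD7ED558CCD) & MASK64
    h ^= h >> 33
    h = (h * 0xC4CEB9FE1A85EC53) & MASK64
    h ^= h >> 33
    return h

def flip_diffs(hash_fn, bit, inputs):
    """Set of output differences seen when flipping one input bit."""
    return {hash_fn(x) ^ hash_fn(x ^ (1 << bit)) for x in inputs}

# Arbitrary, well-spread 64-bit test inputs.
inputs = [(i * 0x9E3779B97F4A7C15) & MASK64 for i in range(1, 65)]

# Linear hash: flipping bit 5 always changes the same output bits,
# regardless of the input. An avalanche test flags exactly this.
linear_diffs = flip_diffs(toy_linear_hash, 5, inputs)

# With a nonlinear finalizer on top, the difference depends on the input.
mixed_diffs = flip_diffs(lambda x: fmix64(toy_linear_hash(x)), 5, inputs)

print(len(linear_diffs), len(mixed_diffs))
```

For the linear function the set of differences collapses to a single value, which is the kind of predictability an avalanche test penalizes; with the finalizer applied, the differences vary from input to input.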
> their proposed fix is more expensive than a HighwayTreeHash round.
For longish strings, most of the cycles are spent in the preliminary rounds, so appending whatever you want onto the end should add negligible cost. This includes bit mixing (their proposed smhasher fix) or a HighwayTreeHash round.
Agreed :) We (and SipHash developers) do care about short strings, though. Scripting language hash table inputs are typically around 10 bytes, so we can't ignore finalization overhead.