Whatever is relevant to the domain you're tackling. How you structurally approach latency in a web server is different from how you approach it in a 3D renderer, which is different again from an audio stack, etc. All of those domains have great open source or source-available code to learn from, but it wouldn't be useful for me to say "go read HAProxy" if you're never going to touch an HTTP packet. When I'm tackling a problem, the first thing I do is research everything everyone else has done on it, read their code, benchmark it if possible, and steal all the good ideas.
The basic principles never change. Avoid copies, never allocate, keep the hot path local. And really, good codebases should be doing these things anyway. I don't code any differently when I'm writing a command line argument parser vs. a low-latency RPC server. It's just a matter of how long I spend tweaking and trying to improve a specific section of code, and how willing I am to throw out an entire abstraction I've been working on for perf.
In the domain of web stuff, effectively all the major load balancers are good to study: HAProxy, nginx, Envoy. Also anything antirez has ever touched.
Application servers are also interesting to study because there are many tricks to learn from a piece of software that needs to interface with something very slow but otherwise wants to be completely transparent in the call graph. FastWSGI is a good example.