As somebody who works alongside Applied Scientists, helping them with tasks related to model training and deployment, how does one get exposure to lower-level engineering work like optimization and performance?
We have an ML infra team, but their goal is building tools around the platform, not necessarily getting workloads to run optimally.
Brendan Gregg's work on system performance and profiling is a good place to start. A lot of ML perf boils down to Linux perf or what the heck is happening in an HPC scheduling system like SLURM.
https://www.brendangregg.com/linuxperf.html