Yes, the idea that wall-to-wall assembly is inherently superior is flawed. High-level language compilers run many optimization passes and can transform code in ways the author might never have imagined. Writing high-level code that gets the toolchain to produce the machine code you want is its own skill, though, and a habitual assembly programmer might not have it.
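To make that concrete, here is a small sketch in C. The function name `popcount_loop` is made up for illustration. The loop clears the lowest set bit on each pass, so it counts the set bits in a word. Some optimizing compilers (recent GCC and Clang, as far as I know) can recognize this pattern and, when the target supports it, replace the whole loop with a single population-count instruction. Whether that actually happens depends on the compiler version, the flags, and the target, so treat it as something to check in the generated assembly (for example with `-S`), not as a guarantee.

```c
#include <stdint.h>

/* Hypothetical example: count the set bits in x.
 * Each iteration clears the lowest set bit (x &= x - 1),
 * so the loop runs once per set bit. An optimizer may or may
 * not turn this into a single hardware instruction. */
unsigned popcount_loop(uint32_t x) {
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;   /* drop the lowest set bit */
        count++;
    }
    return count;
}
```

Knowing which source patterns a given toolchain recognizes, and writing code in that shape, is the kind of skill the comment above is talking about.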
Yes, but Randy also had to build an expert-level understanding of the hardware to reach that performance. It would be hard to do that without writing a lot of hardware-specific assembly somewhere along the way, regardless of compiler quality, so his "100% assembly" strategy served both goals.
I agree with this. A smart coder is going to write it all in a high-level language, profile it to see which tiny bits might be better off in assembly, and then profile again to test that theory. Modern compilers know a lot of tricks and can often find optimizations that would be hard to do by hand in assembly.
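A rough sketch of that measure-then-compare loop, in C. Everything here is an assumption made for illustration: `sum_baseline` stands in for the plain high-level version of a hot spot, `sum_candidate` stands in for whatever hand-tuned replacement you want to try (here just a manually unrolled loop, not real assembly), and `time_sum` is a crude timer built on the standard `clock()` function. A real profiler would give better data; the point is only that both versions get checked for the same answer and timed the same way before one is kept.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical hot spot: the straightforward high-level version. */
uint64_t sum_baseline(const uint32_t *v, size_t n) {
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Hypothetical candidate: a stand-in for a hand-tuned replacement,
 * here unrolled by four with separate accumulators. */
uint64_t sum_candidate(const uint32_t *v, size_t n) {
    uint64_t a = 0, b = 0, c = 0, d = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a += v[i];
        b += v[i + 1];
        c += v[i + 2];
        d += v[i + 3];
    }
    for (; i < n; i++)   /* leftover elements */
        a += v[i];
    return a + b + c + d;
}

typedef uint64_t (*sum_fn)(const uint32_t *, size_t);

/* Runs fn `reps` times and returns elapsed CPU time in seconds.
 * *out receives the results added up across all repetitions,
 * which keeps the calls from being trivially discarded. */
double time_sum(sum_fn fn, const uint32_t *v, size_t n,
                int reps, uint64_t *out) {
    uint64_t acc = 0;
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        acc += fn(v, n);
    clock_t end = clock();
    *out = acc;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

If the candidate is not measurably faster on realistic input, the simpler version stays. It is quite possible the compiler already unrolls or vectorizes the baseline on its own.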