I spent my childhood reverse-engineering packed and obfuscated code in assembly and writing matrix multiplication libraries in C++ for fun, and I also did a lot with the first available frameworks for GPU programming in assembly (e.g. applying kernels to images).
I can sit down and estimate how long my code is going to take to run (including the cache misses and other nonlinearities).
The way I earn money is by programming Python. Why is that? Because it's the cheaper solution, and because there's a multitude of libraries in Python that already do that work for you. Optimisation is the overhead case where you hire a developer solely to sit down and optimise; and quite frankly, it has become quite commoditised in that sense.
A certain irony about knowing all the tricks and still using Python is that you invariably encounter people online who say "well, Python was too slow so I dropped it" when the problem they hit is covered either by one of Python's many compiled extensions or by a slight reformulation of the problem.
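To sketch what that reformulation often looks like (a toy, made-up example; NumPy here just stands in for whichever compiled extension actually fits the problem):

    import numpy as np

    rng = np.random.default_rng(0)
    xs = rng.random(1_000_000)
    ys = rng.random(1_000_000)

    # The "Python was too slow" version: an interpreted loop over every element.
    def dot_slow(xs, ys):
        total = 0.0
        for x, y in zip(xs, ys):
            total += x * y
        return total

    # The same problem handed to a compiled extension: a single NumPy call.
    def dot_fast(xs, ys):
        return float(np.dot(xs, ys))

    assert np.isclose(dot_slow(xs, ys), dot_fast(xs, ys))

The loop burns its time in the interpreter; the single call spends it in optimized C, which is usually the whole difference people are complaining about.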
And, having said that, I also didn't know that some optimizations were possible until later in my programming career. There's always a point where something escapes one's current level of skill.
I have been coding since the mid-80s, so I have also done a lot of low-level stuff in the past, but I eventually ended up earning money by focusing on managed languages.
Knowing how everything works underneath helps me write code with performance in mind when required, write the occasional function/method or VM plugin in C++, or even read the assembly code generated by the AOT/JIT compiler.
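To make that concrete in the language discussed upthread (a minimal sketch, assuming the numba JIT is installed; scale_and_sum is a made-up example, and in a managed language like C# or Java you would reach for that platform's own tools to see the JIT output):

    import numpy as np
    from numba import njit

    # The "occasional hot function", JIT-compiled instead of interpreted.
    @njit
    def scale_and_sum(a, factor):
        total = 0.0
        for i in range(a.shape[0]):
            total += a[i] * factor
        return total

    a = np.arange(1_000_000, dtype=np.float64)
    scale_and_sum(a, 2.0)  # the first call triggers compilation

    # Dump the machine code the JIT actually produced for this specialization.
    for signature, asm in scale_and_sum.inspect_asm().items():
        print(signature)
        print(asm[:400])  # just the first few hundred characters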
But for the types of applications I typically write, that performance is more than enough.
I don't know; it just seems like we have lost something as a profession. Lost some of the craft and efficiency. I get that this is a practical attitude and often the right answer. But always working with "managed" languages and moving higher and higher up the abstraction stack is one of the reasons we now need clusters of computers to solve problems. We've gone from computers the size of rooms to personal computers that can rest on your lap, and all the way back to racks of computers that need an entire room.
Actual high-performance-computing clusters are typically used efficiently, running well-tuned code. BLAS libraries (matrix multiplication) are usually very heavily optimized.
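As a rough sketch of how large that gap is (sizes picked arbitrarily; the @ operator dispatches to whatever BLAS NumPy was built against):

    import time
    import numpy as np

    n = 200
    A = np.random.rand(n, n)
    B = np.random.rand(n, n)

    # Textbook triple loop in pure Python: takes on the order of seconds.
    def matmul_naive(A, B):
        n = A.shape[0]
        C = np.zeros((n, n))
        for i in range(n):
            for j in range(n):
                s = 0.0
                for k in range(n):
                    s += A[i, k] * B[k, j]
                C[i, j] = s
        return C

    t0 = time.perf_counter()
    C_naive = matmul_naive(A, B)
    t1 = time.perf_counter()
    C_blas = A @ B  # the heavily optimized BLAS routine, typically milliseconds
    t2 = time.perf_counter()

    assert np.allclose(C_naive, C_blas)
    print(f"naive: {t1 - t0:.2f}s   BLAS: {t2 - t1:.4f}s")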
Of course, a lot of code just uses these optimized building blocks and ends up doing multiple passes over the data instead of doing more with each pass while it's still hot in cache. It's disappointing that we still don't have optimizing compilers that know how to produce code for an efficient matrix multiply and can mix other work right into that same pass.
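A rough illustration of that difference, with numba standing in for the fusing compiler we mostly don't have (numexpr would make the same point):

    import numpy as np
    from numba import njit

    a = np.random.rand(10_000_000)
    b = np.random.rand(10_000_000)

    # Building-block style: three separate passes over the data, each writing a
    # full-size temporary, so values have left the cache before the next step.
    def multi_pass(a, b):
        t = a * b
        t = t + a
        return np.sqrt(t)

    # Fused style: one pass that does all the work while each element is hot in cache.
    @njit
    def one_pass(a, b):
        out = np.empty_like(a)
        for i in range(a.shape[0]):
            out[i] = np.sqrt(a[i] * b[i] + a[i])
        return out

    assert np.allclose(multi_pass(a, b), one_pass(a, b))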
Your point definitely applies to server farms, though, running a huge stack of interpreted languages and big clunky "building blocks" instead of something written efficiently.
Developers using Lisp from the 60s all the way up to Lisp Machines in the 80s would jump for joy if they could get their hands on Raspberry Pi-level hardware.
So the capabilities are there, but it is up to the developers to actually make use of them.