Ah, thank you for a really interesting reply. I appreciate it.
> One of the primary optimizations is using vm_splice to avoid memory copies. vm_splice only works if the file descriptor is a pipe.
Thanks - I hadn't actually read the code in question. I now realise I should have. That's a totally fair point.
That being said, in the general case, there are clearly far more optimisations to be made. Do you think GCC-from-C is really theoretically limited to producing assembly code which is two thirds as performant as handwritten assembly?
Ahhh, I getchu. Yes, that technically means that a compiler can't produce 100% optimal code in 100% of cases. However, it doesn't mean it can't produce 100% optimal code in 99.99999% of cases, or 99.99999% optimal code in 100% of cases, as you say yourself.
This is more of an academic issue about formal properties of a compiler, rather than a practical issue with the optimisations it could make in the overwhelming majority of real-world cases.
That being said, I'm very sympathetic to your point. The (mostly toy) compilers I've written have been extremely weak in the optimisation department. I have a lot of respect for people who are able to do that, and a lot of respect for the challenge it poses.
(I do hope that the growth of big data, and developers' increasing indifference to privacy, might have the salutary effect that compiler authors gain access to more information about the programs their compilers are compiling, out in the real world ... which, at least going by my own experience, would be invaluable in making decisions about particular optimisations. Sadly 99% of optimisations are not peephole optimisations.)
> One of the primary optimizations is using vm_splice to avoid memory copies. vm_splice only works if the file descriptor is a pipe.
Thanks - I hadn't actually read the code in question. I now realise I should have. That's a totally fair point.
That being said, in the general case, there are clearly far more optimisations to be made. Do you think GCC-from-C is really theoretically limited to producing assembly code which is two thirds as performant as handwritten assembly?
> https://en.m.wikipedia.org/wiki/Rice%27s_theorem
Ahhh, I getchu. Yes, that technically means that a compiler can't produce 100% optimal code in 100% of cases. However, it doesn't mean it can't produce 100% optimal code in 99.99999% of cases, or 99.99999% optimal code in 100% of cases, as you say yourself.
This is more of an academic issue about formal properties of a compiler, rather than a practical issue with the optimisations it could make in the overwhelming majority of real-world cases.
That being said, I'm very sympathetic to your point. The (mostly toy) compilers I've written have been extremely weak in the optimisation department. I have a lot of respect for people who are able to do that, and a lot of respect for the challenge it poses.
(I do hope that the growth of big data, and developers' increasing indifference to privacy, might have the salutary effect that compiler authors gain access to more information about the programs their compilers are compiling, out in the real world ... which, at least going by my own experience, would be invaluable in making decisions about particular optimisations. Sadly 99% of optimisations are not peephole optimisations.)