I don't agree with this, because the preprocessor is the main killer of compiler throughput in a large project: it prevents a number of optimizations that would otherwise be possible.
This doesn't pass the sniff test regarding throughput. I've observed both cc and various linkers take hundreds to thousands of seconds on template-heavy and sometimes not-well-organized C++ on machines running around 4 GHz with DDR4, NVMe storage, and enough of both (1 TB RAM, 6 TB of disk) not to be constrained. The preprocessor steps barely register in the Bazel profile of my repo, compared to the places where we hit slow paths in the compiler and linker due to massive mains which are fundamentally separate programs glued together with a switch/case and a read from a config file.
The cost isn't the cost of parsing. It's the cost of recompiling something you've already compiled. If you change a header that is included in N translation units, you recompile all N translation units, even if you didn't fundamentally change the header contents in a way that would affect the final object files.
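To make that cascade concrete, here's a minimal sketch of the file layout (the names are made up, and this is three files shown in one listing, not a single program): touching the header, even just to edit a comment, marks every including translation unit dirty under a timestamp-based build, and all of them get recompiled to produce the same object code.

    // util.h -- included by many translation units
    #pragma once
    int add(int a, int b);

    // a.cpp -- recompiled whenever util.h changes, however trivially
    #include "util.h"
    int sum_small() { return add(1, 2); }

    // b.cpp -- likewise recompiled, even though its object file comes out identical
    #include "util.h"
    int sum_big() { return add(100, 200); }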
But that isn't a cost of the preprocessor. It's the cost of definitions (and declarations in some cases) living somewhere other than the compilation unit in which they are used.
Even with a "module" system as handwaved in TFA, there's still the possibility that you (or someone else) changed a "module". C++ makes it almost impossible to decide whether the change requires recompilation (without effectively doing the compilation to find out).
That's only because of #include, however. Preprocessing itself is very fast; it's just that C++ lacks a sane way to import definitions, so it dumps a huge amount of text into the front end of the compiler.
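To see the dump directly (the numbers are rough and depend on the compiler and standard library): a single standard-library include commonly expands to tens of thousands of lines of preprocessed text, and the front end has to chew through all of it for every translation unit that includes it.

    // hello.cpp -- one include; `g++ -E hello.cpp | wc -l` typically reports
    // tens of thousands of preprocessed lines for the parser to consume.
    #include <vector>

    int main() {
        std::vector<int> v{1, 2, 3};
        return static_cast<int>(v.size());
    }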
How do modules work with generics and cross-module inline functions? I could probably find the answer in D or Rust, but I am not familiar with their mechanisms. Thanks.
In Rust, the library format also includes a pre-compiled form of the generics it exports, so when the compiler pulls in the library, it can monomorphize from there.
I'm not so sure about that. Surely it would take longer to parse both branches of an if constexpr than it would for the preprocessor to see the #if and discard half of it.
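For what it's worth, a minimal sketch of the difference (USE_FAST_PATH and the function names are invented for illustration): the dead #if branch is thrown away as raw text before the parser ever sees it, whereas both arms of the if constexpr have to be parsed even though only one gets instantiated.

    #include <cstdio>

    // Preprocessor: the false branch is discarded as text, so the compiler
    // front end never parses it at all.
    #if defined(USE_FAST_PATH)
    void log_mode() { std::puts("fast path"); }
    #else
    void log_mode() { std::puts("slow path"); }
    #endif

    // if constexpr: both branches are parsed and must be well-formed; only
    // the branch selected by UseFastPath is instantiated.
    template <bool UseFastPath>
    void log_mode_tmpl() {
        if constexpr (UseFastPath) {
            std::puts("fast path");
        } else {
            std::puts("slow path");
        }
    }

    int main() {
        log_mode();
        log_mode_tmpl<false>();
        return 0;
    }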
I should clarify: my comment is about the compiler's ability to reliably cache compiled object files when compiling incrementally.