I regularly work with multi-thousand-line source files, but compilation speed has never really been an issue there, at least for me. The issue has always been the link time afterwards (especially if you turn LTO on). I guess it's probably more of an issue for C++.
Is anyone working on adding parallel linking to GCC like clang has?
I agree. In all of the large projects I have worked on, the build systems were constantly optimized under the observation that compilation is infinitely parallelizable and latency doesn't matter nearly as much as throughput, but the link is the expensive step on the critical path.
I'm not familiar with clang's support for parallel linking, so maybe this isn't what you mean, but GCC supports parallel ltrans for LTO, and IIRC GCC 10 made multithreaded ltrans the default behavior.
For C++ it really does become an issue, since each individual file can take over a second to compile (usually depending on how heavy the template usage is), even for moderately-sized files (~1000 lines).
Projects using templates and heavy metaprogramming often see most of the compile time spent in template instantiation, not in linking. For that kind of project I think this would help.
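As a rough illustration (a toy example, not taken from any real project), every distinct instantiation the front end encounters is extra work done before codegen even starts, which is why template-heavy files dominate compile times:

    #include <cstdio>

    // Each distinct Fib<N> the compiler meets is a separate template
    // instantiation, i.e. front-end work done entirely at compile time.
    template <unsigned N>
    struct Fib {
        static constexpr unsigned long long value =
            Fib<N - 1>::value + Fib<N - 2>::value;
    };
    template <> struct Fib<1> { static constexpr unsigned long long value = 1; };
    template <> struct Fib<0> { static constexpr unsigned long long value = 0; };

    int main() {
        // One use of Fib<90> forces the compiler to also instantiate
        // Fib<89>, Fib<88>, ..., Fib<2> -- dozens of instantiations
        // triggered by a single line of user code.
        std::printf("%llu\n", Fib<90>::value);
        return 0;
    }

Clang's -ftime-trace and GCC's -ftime-report are useful for checking how much of a given TU's wall time actually goes to instantiation versus codegen.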
LLVM is not amenable to using multiple threads to compile a single TU. The use-lists of global values (such as functions, global variables, or constant expressions) include all uses from all functions, so parallelizing on a per-function basis requires acquiring locks (or some sort of lock-free data structure) to add to, remove from, or iterate those lists, which would add considerable overhead to a relatively common operation.
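To make that concrete, here is a heavily simplified, hypothetical sketch (none of these are LLVM's real classes or APIs): each thread optimizes its own function, but both end up mutating the single shared use-list of any global they touch, so every edit has to be serialized.

    #include <functional>
    #include <list>
    #include <mutex>
    #include <string>
    #include <thread>
    #include <vector>

    struct Use;  // one reference to a value, owned by some instruction

    // Hypothetical stand-in for a function, global variable, or constant.
    struct GlobalValue {
        std::string name;
        std::list<Use*> uses;   // one list shared by *all* referencing functions
        std::mutex uses_lock;   // the per-edit synchronization in question

        void add_use(Use* u) {
            std::lock_guard<std::mutex> g(uses_lock);
            uses.push_back(u);
        }
        void remove_use(Use* u) {
            std::lock_guard<std::mutex> g(uses_lock);
            uses.remove(u);
        }
    };

    struct Use { GlobalValue* target; };

    // A per-function pass that deletes this function's references to globals.
    // The function being optimized is thread-private, but the use-lists it
    // edits are shared across the whole module.
    void optimize_function(std::vector<Use*>& uses_in_this_function) {
        for (Use* u : uses_in_this_function) {
            u->target->remove_use(u);
            delete u;
        }
        uses_in_this_function.clear();
    }

    int main() {
        GlobalValue g;
        g.name = "some_global";

        // Two "compiler threads", each handed a different function's uses of g.
        std::vector<Use*> f1_uses{new Use{&g}, new Use{&g}};
        std::vector<Use*> f2_uses{new Use{&g}};
        for (Use* u : f1_uses) g.add_use(u);
        for (Use* u : f2_uses) g.add_use(u);

        std::thread t1(optimize_function, std::ref(f1_uses));
        std::thread t2(optimize_function, std::ref(f2_uses));
        t1.join();
        t2.join();
        return 0;  // without uses_lock these removals would be a data race
    }

The point isn't this exact design; it's that adding or removing a use happens constantly during optimization, so paying for a lock (or a lock-free structure) on every such edit adds up quickly.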
Let's certainly hope clang will eventually manage to improve things in that context, because being unable to parallelize the optimization step when there is a single compilation unit is quite punishing.
Great that C++ compiler teams are working on reducing C++ compile time. The next feature I think would be a game changer is caching the results of compilation (templates and binary code). A fine-grained cache acting at the level of a line (or a function, if a line is too hard) would save C++ devs an incredible amount of time, while also simplifying build systems. It would also stop engineers from losing days trying to speed up their builds.
Such a compiler exists: it's called zapcc (a fork of clang 3.something). Sadly it has been abandoned and never merged, even though it was open-sourced.
We already had function-level compilers back in the early days, Energize C++ and VA for C++ v4, but they were too resource-hungry for what companies were willing to pay, and they died.
There is an Energize C++ demo floating around on YouTube.
This was being touted for the Fedora build system recently, and Jeff Law made typically sensible remarks to temper the enthusiasm. I don't have a link to the devel archive to hand.
Imagine you have one source file with 100 functions you want to compile. Traditionally it could be compiled using only one core, processing function after function. Assuming we are working on already-preprocessed input, there's no theoretical reason why more cores couldn't be used to compile the functions: e.g. with 4 cores, each could get 25 functions. The practical reason it isn't done is not only that it adds code complexity even in the trivial case (where all the functions take the same time), but that in the non-trivial case you end up reimplementing much of what the OS and the make system already do together when separate processes compile separate files, dynamically deciding what to work on at which point.
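A minimal sketch of that trivial case, with hypothetical names and none of a real compiler's internals: statically split one TU's functions into equal chunks and hand each chunk to a thread. The practical problems mentioned above (uneven function sizes, dynamic scheduling, playing nicely with make's own process-level parallelism, shared state inside the compiler) are exactly what this toy version ignores.

    #include <algorithm>
    #include <cstddef>
    #include <string>
    #include <thread>
    #include <vector>

    // Hypothetical stand-ins for "a parsed function" and "its object code".
    struct ParsedFunction { std::string name; /* ...AST... */ };
    struct ObjectCode     { std::string for_function; /* ...bytes... */ };

    // Placeholder for running the middle/back end on a single function.
    ObjectCode compile_one(const ParsedFunction& f) {
        return ObjectCode{f.name};
    }

    // Trivial static partitioning: 100 functions on 4 threads -> 25 each.
    std::vector<ObjectCode> compile_tu(const std::vector<ParsedFunction>& funcs,
                                       std::size_t num_threads) {
        std::vector<ObjectCode> out(funcs.size());
        std::vector<std::thread> workers;
        const std::size_t chunk = (funcs.size() + num_threads - 1) / num_threads;

        for (std::size_t t = 0; t < num_threads; ++t) {
            const std::size_t begin = t * chunk;
            const std::size_t end = std::min(funcs.size(), begin + chunk);
            if (begin >= end) break;
            workers.emplace_back([&, begin, end] {
                for (std::size_t i = begin; i < end; ++i)
                    out[i] = compile_one(funcs[i]);  // assumes no shared mutable state
            });
        }
        for (auto& w : workers) w.join();
        return out;
    }

    int main() {
        std::vector<ParsedFunction> funcs(100);
        for (std::size_t i = 0; i < funcs.size(); ++i)
            funcs[i].name = "fn" + std::to_string(i);
        return compile_tu(funcs, 4).size() == 100 ? 0 : 1;
    }

In practice you would want dynamic scheduling rather than fixed chunks, since one enormous function can turn a thread into the straggler, which is a mild version of the scheduling problem described above.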
This experiment tries to make the case that something in that direction is doable and worth doing. However, I missed a good proof of the latter, based on an analysis of the state of existing projects. I would personally rather consider the completely opposite direction:
In big C++ projects, most of the functions already live in small compilation units, and the build process mostly spends its time processing the same huge set of headers for every compilation unit. Often, the whole process would be faster if more compilation units were compiled as one(!), as long as the headers aren't used in a way where their semantics change depending on how they are included (I've heard that LLVM, of all projects, actually does that unfortunate thing, but I haven't spent time analyzing it myself).
Compiling entire components as a single unit was standard practice when dealing with the (defunct) IBM VisualAge C++ compiler. On one of our larger projects, updating the makefile to concatenate all the .cpp's in each component and then compiling the concatenated files reduced build times from 15-20 hours down to 1-2 hours.
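For anyone who hasn't seen the concatenation trick, here is a sketch with made-up file names: the "unity" file below is what gets compiled instead of the component's individual .cpp files, so the shared headers are parsed once rather than once per file.

    // widget_unity.cpp -- hypothetical unity TU for one component.
    // Compile this single file instead of the three sources it includes.
    #include "widget_core.cpp"       // file names are made up for the example
    #include "widget_render.cpp"
    #include "widget_serialize.cpp"

The usual price is that file-local things (static helpers, anonymous namespaces, macros, using-directives) from the separate .cpp files now share one translation unit, so name clashes have to be cleaned up before it builds, and touching any one source recompiles the whole component.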
Google Summer of Code participants are not interns; they are paid a stipend to work on open source projects. And isn't it good if an intern project ends up being used?