This is based on clang --- it's not a new compiler.
<tangent>
Are there any working open source C++ compilers which aren't based on gcc or clang? I know of TenDRA, which appears to have ceased to exist and was always kind of incomprehensible and non-working, and apparently it never got as far as STL support; there's Path64, whose github page no longer contains the compiler repo; there's Open64, whose website no longer exists (although there does seem to be a daughter project, OpenUH, which did a release last year)... Is there anything else?
For C++? No.
Too hard of a language to waste phd students maintaining :)
Over the years there have been a few more (openC++ is a good example, ROOT used to be one too), but they are all dead now AFAIK.
One reason is that research in compilers is pretty much not in the frontends anymore. For any research still being done that uses C++ as a base, gcc/clang/llvm pretty much work fine. LLVM in particular has an IR that works for most researchers, is not hard to understand, etc.
For people who want to try to build larger solutions, they usually start with C (see, for example, libfirm and friends).
Commercial folks use EDG, though a lot of them are also starting to use clang as a frontend now.
(IBM was the other major company that had their own C++ parser for a longer time)
Bit of a side-note, but I would disagree about ROOT/CINT ever being a C++ compiler. It was an interpreter for a language that looked somewhat related to C++, if you squint quite a bit.
* All variables were hoisted up to function scope. This meant that you couldn't use "int i" in one loop and "unsigned int i" in the next (see the sketch after this list). This also caused destructors to be incorrectly delayed until the end of the function.
* Use of templates required pre-compiled dictionaries for each type the template might be instantiated for. Any templates occurring in interpreted code would be silently ignored.
* Incomplete standard library implementation. std::abs(long) is missing, for example.
* const is silently ignored.
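To make the first point concrete, here is a small, hypothetical example; it is perfectly valid standard C++, but CINT's hoisting rule turned it into a redefinition error:

    void loops() {
        for (int i = 0; i < 3; ++i) {
            // fine in standard C++: this 'i' is scoped to the loop
        }
        for (unsigned int i = 0; i < 3; ++i) {
            // also fine in standard C++, but CINT hoisted both declarations
            // to function scope, so it saw two conflicting 'i' variables
        }
    }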
Not entirely relevant to the current discussion, but good heavens, I am glad that monstrosity is gone.
I think the reason why there are so many C/C-like compilers, but very few C++ ones (I personally don't know of any open-source ones besides gcc and clang), is the complexity of the language; C is simple enough that a single person can write a compiler for it in a short time and have it compile both itself and large quantities of existing standard and not-so-standard code.
In contrast, C++ is at least an order of magnitude more complex, with far more features that need to be working to get to something which could usefully be considered 'C++'. For example, templates are pretty integral to the language and implementing them correctly is not trivial.
The number of people who can write a whole C compiler, while relatively many, is still a tiny fraction of all programmers. (I do wish there were more, since I personally think the concepts aren't so hard once you see the essence of the tiny compilers like tcc or C4.) And a much tinier fraction of them will want to try writing one for even a subset of C++.
There was OpenWatcom, but it doesn't support anything past C++98, as far as I know.
There's a really high barrier to entry for C++ compilers, and between Clang, GCC, and all the proprietary compilers there's not a huge need for another one.
The tl;dr from reading between the lines is that having the open source version was crippling their commercial sales, so they tried to make it go away.
There's a nightly download from http://www.pathscale.com/ekopath-compiler-suite that makes you click through a GPLv2 license (the GPL is not an EULA, dammit!), but there's no source included or any link to source.
But I did eventually track down a clone of the original github version here:
I see. So basically they wanted to open it, but changed their mind. How does the current PathScale handle the GPL, though? Or have they never updated the compiler since then?
While we were shipping the compiler at the original PathScale, the source was distributed to everyone we distributed a copy to (GPLv2 clause 3a). I'm not sure why you'd mention github, given that they didn't exist yet. And I have no idea what you mean by "official site" when we're talking about a company that ceased business in 2006.
p.s. if you wonder why I'm so vehement about this topic, it's because you're basically accusing real people of being unethical based on your inability to find things now that we were required to give away in the past.
While GPLv2 clause 3a keeps things open source, it usually means including the source with the product, in the form of something like a CD-ROM.[1]
>3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following:
>a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange
Do you know if the source is downloadable, legally from anywhere? Or was it only distributed to original customers?
I have no idea if the source is downloadable, legally, from anywhere. That's not required by the license. I'm glad that you actually looked at the license before you asked!
Sorry to disappoint you, but CD-ROMs aren't customarily used for software interchange anymore, unless you are collecting antiques. So again, where can I find the code for the present-day PathScale?
> This is based on clang --- it's not a new compiler.
This is good. Apparently it improves compile speed by a factor of 2-10 across a wide range of uses without compromising output quality. For many C++ developers it will be like a nitro injection.
Not open source, and actually it's not free, either. People sure think it is, but that's only for students at degree-granting institutions, and people working on open-source code who are not being paid.
Academic researchers can get a free library license, but not a compiler license.
Ditto Digital Mars C++ (née Symantec C++, née Zortech C++). But as I recall, like most other "alternative" C++ compilers, it's also stuck in C++03 land.
DMC++ is still awesome, because it's the only ISO C++ compiler out there that can target any kind of DOS platform and binary format, from 16-bit .COM files (!!!) to 32-bit DOS extender.
It was my understanding that the advantage of Intel's compiler is that it would optimize for the strengths available on newer processors while still allowing it to work on older processors (for example, using AVX instructions if available and a slower branch if not) [aka, the "CPU dispatcher"]. Agner wrote about it "crippling" AMD processors because they didn't say they were "GenuineIntel" back in 2009.[0]
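For illustration, here's a rough sketch of what such a dispatcher amounts to, written by hand with the GCC/clang x86 builtin __builtin_cpu_supports (Intel's compiler generates the equivalent automatically, which is where the GenuineIntel check came in; the function names below are made up):

    #include <cstddef>

    // Baseline implementation, compiled for the lowest-common-denominator ISA.
    static float dot_generic(const float *a, const float *b, std::size_t n) {
        float s = 0;
        for (std::size_t i = 0; i < n; ++i) s += a[i] * b[i];
        return s;
    }

    // In a real dispatcher this would be a separately compiled AVX version;
    // it reuses the generic code here only to keep the sketch self-contained.
    static float dot_avx(const float *a, const float *b, std::size_t n) {
        return dot_generic(a, b, n);
    }

    float dot(const float *a, const float *b, std::size_t n) {
        // Runtime check of the CPU's feature flags picks the fast or slow path.
        return __builtin_cpu_supports("avx") ? dot_avx(a, b, n)
                                             : dot_generic(a, b, n);
    }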
> In conclusion, we can see that zapcc is always faster than both gcc and clang.
For testing his template-heavy library, which is a very small data set; and while templates are one of the problems when it comes to C++ compilation speed, they're certainly not the only one.
Let's see the differences when compiling Firefox or the whole KDE suite.
So, while neat, pretty much all of this would be solved by C++ modules and precompiled modules.
(and in fact, is, based on what i've seen. But i still hope these guys get to market and make some money before that takes over the world, because i know how hard it is to do what they are doing :P)
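For readers who haven't followed the modules work, here is a minimal sketch of what that looks like in C++20 (file names and build flags vary by compiler, so treat this purely as an illustration):

    // math.cppm -- a module interface unit; the compiler emits a binary
    // module artifact once, instead of re-parsing a header per include.
    export module math;

    export int add(int a, int b) { return a + b; }

    // main.cpp -- consumers import the prebuilt module:
    //
    //     import math;
    //     int main() { return add(1, 2); }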
I tested zapcc against my codebase, which currently favors 'ccache gcc-6 -flto' over 'ccache clang-3.9 -flto'.
It's a pure C project.
* Build-time went from 58s to 37s with zapcc. I didn't use zapcc with ccache and shouldn't.
* Run-time went from 371s down to 204s. Almost double speed!
So it's clearly an effect of clang-4 over gcc-6, and not so much zapcc.
Then I crosschecked with clang-4. The run-time is entirely based on clang-4, confirmed.
But the build-time with clang-4 was 44s, still 20% slower than with zapcc.
What I learned:
* ccache is horrible. zapcc is much better.
* clang-4 is fantastic, even if its LTO is still too broken to be usable with regard to visibility and inlining.
I found great bugs with new clang-4 warnings.
I had no time yet to use it on C++, where the real advantages come up. On pure C there are just side-effects.
One thing that always confused me about ccache was that it doesn't cache lib generation and executable generation. I know that the authors have insisted (until they were blue in the face) that supporting lib/exe caching would require rewriting ccache... I just don't understand why. Once you know about `-frandom-seed=0`, and you've removed all aspects of non-determinism (__TIME__ & co.), then all that's left is the moral equivalent of `dwarfdump -u <exe>` for each compiler, and you're good to go for deterministic caching.
The architecture of ccache maps preprocessed sources to object files; it uses the compiler to preprocess the source, and it hashes the result, knowing that nothing other than the compiler, command line, and preprocessed source determines the output.
Linking involves far more complexity, with the input files harder to determine. There's no equivalent of -E or -fdirectives-only for linking. A ccache for linking would have to identify and hash all library, object file, and linker script inputs, including those pulled in indirectly by linker scripts, in addition to the toolchain, the command line, and any linker plugins.
It's absolutely possible, and I'd love to see someone do so, but it seems significantly harder than caching compilation.
You'd also want to time the result, and figure out how long the reading and hashing takes compared to linking. ccache misses take only slightly longer than a normal compilation; link-cache misses may take much longer than normal.
On top of that, unlike a compilation cache that seems very likely to hit on the 99% of files not changed in a build, a linker cache would only hit when absolutely nothing has changed in the entire build. It might help for a project that links numerous tiny libraries or binaries (which seems relatively uncommon), but for a project that primarily builds a single library or binary, it'd only help if you rebuild entirely identical sources twice.
(It might, however, speed up Linux kernel builds if you've only changed the code for a couple of modules and not anything in the core kernel.)
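To make that contrast concrete, here's a deliberately simplified sketch of the compile-side key (the ingredients are exactly the ones listed above; the names are made up, and real ccache uses a proper content digest rather than std::hash):

    #include <cstddef>
    #include <functional>
    #include <string>

    // Everything that can change the object file gets folded into the key:
    // the compiler binary, the command line, and the preprocessed source.
    std::size_t compile_cache_key(const std::string &compiler_binary,
                                  const std::string &command_line,
                                  const std::string &preprocessed_source) {
        auto mix = [](std::size_t h, const std::string &s) {
            return h ^ (std::hash<std::string>{}(s) + 0x9e3779b9 + (h << 6) + (h >> 2));
        };
        std::size_t h = 0;
        h = mix(h, compiler_binary);
        h = mix(h, command_line);
        h = mix(h, preprocessed_source);
        return h;
    }

    // A link-side key would additionally need to hash every object file,
    // every library (including ones pulled in indirectly by linker scripts),
    // linker plugins, and the linker command line -- which is the hard part.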
I would love to see a comparison of the performance of compiled programs.
If zapcc creates slower executables but spits them out faster, it could be used during development to speed up iterations. And if the executables are faster, I'm very curious as to how they achieved both faster compilation and faster runtimes.
I think a good example of this is tcc, one of the fastest C compilers I've seen --- because it doesn't do much optimisation at all and is single-pass, it can generate code as it parses, but the output is dismally inefficient.
Another example of ultrafast compilation is Delphi, but once again the generated code looks more like a dumb line-by-line translation with plenty of redundant and unnecessary instructions (making decompiling interesting in that it easily produces something quite close to the original source.)
You might want to compare tcc to just about any compiler with -O0 -- they're a lot faster if they are allowed to generate slow code. It's also super-straightforward to find compiler bugs, if you're lucky enough that it's an O0 bug!
In theory the compiled output should be the same as clang's (if there are no bugs), given that this is just an optimized clang where compilation structures are cached.
Normally, it should make almost no difference compared to the code generated by the same version of clang on which zapcc is based. It may make a difference in the long term if they don't keep up to date with clang trunk.
The value of a caching compiler should really become apparent in incremental builds, as in rebuilding after changing a single file. Yet the author talks about "not seeing any improvements".
Like he said, he might be doing something wrong.
The speedup that is observed anyway might come from the compilers having to re-instantiate templated code every time it's included.
"The value of a caching compiler should really become apparent in incremental builds, as in rebuilding after changing a single file. Yet the author talks about "not seeing any improvements"."
There are millions of reasons this may not be true in C++.
For starters, the use of time and date macros, etc.
Without precise tracking of which source lines depend on which macros (which is super hard, and i don't think they do it; in practice dependency tracking is usually much more coarse-grained), you may not see an improvement.
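A trivial, made-up example of the date/time problem: a file like this produces a different object file on every build, so a cache either has to miss on it every time or know that the output depends on __DATE__/__TIME__:

    // version.cpp (hypothetical): the embedded string changes on every build,
    // so a byte-identical cache hit is impossible unless the cache tracks
    // the dependency on __DATE__ and __TIME__.
    extern const char *const kBuildStamp = __DATE__ " " __TIME__;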
VisualAge C++ was one of the best incremental C++ implementations i ever saw, and even it did not get to this level.
> VisualAge C++ was one of the best incremental C++ implementations i ever saw, and even it did not get to this level.
How did it compare with Energize C++?
I only know both from magazines during those days, even though someone uploaded an Energize video to YouTube.
In regards to VisualAge C++, I think only those of us who were active back then can remember anything about it. Besides the magazines I had with the product review, I've never seen much information posted on the Internet.
Hell, even if this only gives a speedup with template-heavy code I'm on board. We use a number of header-only, template-metaprogrammed-to-death libraries (RapidJSON, Eigen, ViennaCL), and compilation speed improvements would be a huge productivity boost.
I would love to see a C++ compiler implement multi-core optimization and code generation for template instantiations. I feel this can be a big win for cases where you instantiate a template n times and they are basically all independent from each other.
It's not clear that it will help with template instantiation, because it's fairly hard to parallelize.
Certainly possible, but very hard to do "optimally" (IE by sharing work instead of duplicating it).
Given that any initial implementation is likely to duplicate work per thread, this usually cuts into your speedup quite a lot.
Codegen, on the other hand, is pretty much fully parallelizable.
This is the whole reason thinlto exists.
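For what it's worth, you can already get a crude, manual version of parallel instantiation today by declaring the common instantiations extern in the header and putting the explicit instantiation definitions in separate translation units, so the build system parallelizes them. A sketch, with made-up names:

    // heavy.h -- the template, plus a promise that the common instantiations
    // are provided by some other translation unit (C++11 extern template).
    template <typename T>
    T sum(const T *data, unsigned n) {
        T s = T();
        for (unsigned i = 0; i < n; ++i) s += data[i];
        return s;
    }
    extern template float  sum<float>(const float *, unsigned);
    extern template double sum<double>(const double *, unsigned);

    // sum_float.cpp -- instantiated and codegen'd in its own TU:
    //
    //     #include "heavy.h"
    //     template float sum<float>(const float *, unsigned);
    //
    // sum_double.cpp -- compiled in parallel with sum_float.cpp by make -j:
    //
    //     #include "heavy.h"
    //     template double sum<double>(const double *, unsigned);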
Unfortunately, precompiled headers are quite inconvenient to use; there are a lot of limitations in each compiler. For instance, only the first header included in a source file can be precompiled, and other restrictions like that, whereas this is a more general approach. But PCH can bring a really good speedup too, and they're free ;)
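A minimal sketch of that first-include restriction (file names are made up): every expensive include has to be funneled through the one header that appears first in every source file:

    // pch.h -- the single header that gets precompiled; the expensive
    // includes all have to live behind it.
    #include <string>
    #include <vector>

    // widget.cpp -- the PCH is only picked up if "pch.h" is the very first
    // include; putting anything before it silently disables the PCH:
    //
    //     #include "pch.h"
    //     #include "widget.h"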
This was my question too. Zapcc's FAQ says: "Precompiled headers requires building your project to the exact precompiled headers rules. Most projects do not bother with using precompiled headers. Zapcc works within your existing build."
Precompiled headers are currently ignored by Zapcc.
Curious as to how much faster this makes standard workflows where you are re-compiling only a few files and the linker is typically the bottleneck.
I use IncrediBuild in my day job; it's a great help when doing full re-compiles or changing a pervasive header, but offers no help on smaller builds where it doesn't parallelize the link. Zapcc doesn't look to do anything with the linker either.
You can try gold with parallel and/or incremental linking (although incremental linking didn't work on a large library where I would've needed it most)
In a perfect world, I would like everyone who uses clang to benefit freely from this improvement. Should some big, well-meaning corporation just buy out those guys?
The entire point of using a BSD/MIT-style license is that it gives "freedom" to a different set of people than the GPL does. In this case, why is it not perfect for someone to invest money building a commercial product on top of clang? Isn't it just fine that they want to make a lot of money?
If Clang had used a GPL-like license, Zapcc would have been forced to share all their modifications to Clang with the whole world, and we would've all benefited from it -- and maybe the optimizations would even have been merged back into the mainline of Clang.
Why do you think the Zapcc developers would have worked for free on this? It seems pretty clear that they developed the software because they thought they could make some money off it.
And thanks to the fact that they did this, we now know it is possible. Competition will hopefully motivate the Clang developers to develop similar performance improvements in the mainline of Clang. Everyone will benefit.
A Clang user is certainly no worse off than they were yesterday.
I'm not sure I agree. Maybe for some software, but I wouldn't even think about using a programming language without at least one quality open-source implementation, and I think many developers would agree.
Fair 'nuff. But let me ask you this: will those managers ask them to use a language without a high-quality open-source implementation? And if you look at the most popular languages out there (and even many of the fringe ones) the answer is probably not.
All the stored procedure programming languages of commercial SQL servers, .NET before Microsoft opened it up, commercial compilers of Common Lisp, C++ Builder, Delphi, Ada, C and C++ compilers for embedded development (no, clang and gcc aren't the only ones), ColdFusion, Flash, Objective-C (gcc and clang are just a tiny part of the whole stack), COBOL, RPG, NEWP, a few in-house proprietary languages, Java compilers for embedded platforms with extended AOT features...
It doesn't matter if there are open source implementations of language X if you cannot use them on processor Y or operating system Z, and instead have to use the closed source commercial compiler from that processor or operating system vendor.