Reading these stories, one keeps seeing the same pattern: GCC vs. EGCS, just like the Emacs war. The problem is always excessive centralisation of stewardship... which is kind of ironic coming from the guy who most strongly supported free software. On the other hand, the fact that these projects could be forked and then merged shows some of the advantages of FOSS.

Maybe it will be similar with LLVM. Back in the day, Apple suggested putting a GCC front end onto LLVM, but the GCC people were strongly against it. They seemed to have a real dislike of LLVM, or its licence, or the company behind it.

Eventually Apple (and the rest of the LLVM community) continued on their own. If GCC one day becomes irrelevant, I think it will have only itself to blame, along with a certain type of elitist arrogance. I hope (for them and for the FOSS community in general) that the success of LLVM will help them clean house and keep going for decades and decades.




I think the dislike of LLVM stems mostly from the company behind it. As you may know, Apple contributed a fair amount of code to GCC, but they also kept a lot of code for themselves. (They did, of course, make it available per GPL, but did not try to submit it upstream.) There have also been problems in the way Apple contributed code; see the mailing list archives for details.

I think there's also some dislike of the project itself, or perhaps the people behind it. But that's another discussion.

> The problem is always excessive centralisation of stewardship... which is kind of ironic coming from the guy who most strongly supported free software.

I don't think it's so much centralization of stewardship for technical decisions, but stewardship of the rights to the code; i.e. you need to sign copyright papers before committing non-trivial patches. And I can assure you that the FSF sees doing so as strongly supporting the cause of free software. There are obvious counterexamples (the Linux kernel's more lax Signed-off-by policy comes to mind), but there are reasons why this centralization is done.

I know this centralization causes problems. Getting assignment papers submitted if you are at a university is a pain, and your employer may not be keen on signing them either. It'd be nice to see the process streamlined, or even done away with entirely, but I doubt that's going to happen.

> I think it will have only itself to blame, along with a certain type of elitist arrogance.

I'm curious where you think this "elitist arrogance" comes from or even what sort of behaviors you're referring to. Can you elaborate?


Wasn't one of the problems with Apple and GCC that Apple had to hack GCC to integrate it with the rest of its tools (which, obviously, the GCC people couldn't care less about)?

When it comes to the topic of stewardship, the situation is less clear cut from my point of view. I think there's a (very human) feeling of "I've created this toy, so I get to decide". In part it's a good attitude: if you managed to develop it this far, there's a good chance you'll be able to develop it further. But sometimes it makes people a bit blind when something better comes along.

In that sense, invoking technological reasons, or the cause of free software, seems to be more of a way to (maybe not on purpose) push away whoever wants to change the way things are run. But that's purely a gut feeling.

Incidentally, that's what I tried to convey with "elitist arrogance": this kind of "we know best" and "who are you? what are your credentials?" attitude. Though I am not surprised it only made things more confusing to a reader. Sorry about that. :)


I don't know the full story about Apple and GCC; I haven't read all of the mailing list traffic during the relevant period, nor was I around to hear about offline conversations. Certainly I can see Apple modifying bits of GCC to make it more suitable for IDE integration and upstream not caring about that. Apple did have ways of getting such changes in, though; it's possible they didn't care enough to push them upstream.

As for "elitist arrogance", I think what you are getting at is that you see the FSF's policies as erecting a moat and a wall around GCC and telling the rabble to stay out while the nobility quietly hack away at their desks. If I have understood you correctly, then I see the argument, but I do not agree with it. As long as GCC is an FSF project, that's just the way things are going to be. I don't think the FSF has set out with the intent to be "elitist"; it's unfortunate if that's the way the policy is perceived. Feel free to correct me; it's possible that I've misunderstood you.

The FSF does not, on the whole, dictate the technical direction of the project. So the goodness or badness of technical changes and any "elitism" associated with acceptance or rejection of such changes is another issue.

Perhaps things will change if GCC ever disassociates itself from the FSF: another egcs-style fork or something more dramatic. There have been small rumblings of such a change, but I think such an event is quite unlikely.


Please understand that I don't think the FSF ever set out with the intent of being "elitist", and I appreciate what they are trying to do. Although I often do not agree with the methods, I am happy they exist.

I like your example of the moat. I think, though, that the nobility is not very well defined in your metaphor. After some thought, I believe there is a strong sense of community, where those who've worked there the longest become the "nobility". And people do not want a "foreigner" to come and change their ways, even when he may back his ideas with facts and good arguments.

In a way, a method that once made sense gets used for so long that at some point it becomes more of an "ideology", like a nationalistic feeling.


I think it's a lot less drama-inducing to consider any Apple contribution to open-source as a fork of the original project. The KHTML people were just as annoyed that Apple were bad open-source citizens, but there's nothing in any FLOSS license that requires you to play nice with the existing maintainers of any project if you don't want to.


> They did, of course, make [GCC code] available per GPL

If you're talking about the NeXT ObjC frontend, that wasn't a matter of course. It actually took legal threats to get them to meet that obligation.


I wasn't thinking about the ObjC frontend; that, AIUI, was contributed by NeXT before NeXT became a part of Apple.

If you go grab the GCC tarball Apple provides, it has many, many changes from the GCC 4.2 sourcebase. Lots of those changes have not been submitted back upstream. The code is available, granted, but it's not available in a way that's helpful to anybody but Apple.


So? They complied with the GPL. Why is it that people also expect companies to push code changes back upstream, maintain said parts of code upstream and properly document said patches?

If upstream wants patches so badly, they can download the tarball provided by Apple, diff the two source trees, and document and submit those patches upstream themselves. It is not the company's responsibility to do so. The GPL only states that they MUST make the source code and its changes available.


While LLVM isn't a fork and doesn't quite fit the mold of the gnumacs/xemacs, gcc/egcs, glibc/linuxlibc &c. splits from years gone by, I do hope the new competition in the free compiler space will spur improvement and innovation in both projects. I would almost expect not only the state of free compilers, but the health of GCC itself, to be better with a viable competitor in LLVM and clang than without.


Seems to me (an old compiler guy in previous lives, but not necessarily up to date technically) that LLVM has really taken the momentum from the GCC project, since they're starting afresh, and generally being more approachable.

Perhaps someone with technical experience with both internals could comment?


Disclaimer up-front: my day job is hacking on GCC and related GNU pieces of infrastructure, so that colors my opinions somewhat. I speak only for myself, not for my employer.

My level of experience in both: I hacked a specialized LLVM inlining pass in graduate school. I've written two middle-end passes for GCC and tweaked GCC's inliner in similar, though less invasive, ways. I found the level of difficulty to be similar between the two projects.

Other people may have different opinions. I remember reading a claim about a graduate-school project that tried for four months to get started on a middle-end optimization pass in GCC and got nowhere; after switching to LLVM, they made progress in a month or so. Personally, I think that meant they weren't trying very hard with GCC.


Given your experience with both, what would you consider the pain points of getting a grasp of each when looking to contribute? I know when I looked at GCC, the mix of custom manual memory management (ggc_free and pals), semi-automatic management via obstacks, and automatic management via garbage collection seemed baroque and frightening.


I don't have a good sense of pain points in LLVM; the inliner hacking was a while ago, I don't remember many of the details, and LLVM has surely changed quite a bit since then.

As for GCC, I think the pain points are twofold. First, the documentation for the middle-end is somewhat scattered. I honestly think enough information for figuring things out is present; it's just not always obvious where to look. There are lots of other passes to look at, too, which can be extremely helpful. The assumptions of an interface, or its side effects, are not always stated, which can be surprising at times. The other pain point is contributing upstream: you're going to get dinged on formatting, documentation (usually just "did you do it"; the review is generally not as thorough as, say, GDB's documentation review), compilation time, etc. etc.

Also, GCC's hash tables (htab_t) are a pain to use correctly.
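
For a sense of what "correctly" entails, here is a minimal, hypothetical sketch of the libiberty htab_t API (invented names; not code from GCC itself). Everything is void pointers, and htab_find_slot's INSERT/NO_INSERT modes fail in different ways, which is where most mistakes creep in:

    /* Hypothetical sketch of libiberty's htab_t usage; not from GCC. */
    #include <string.h>
    #include <stdlib.h>
    #include "hashtab.h"
    #include "libiberty.h"  /* xstrdup */

    /* Callbacks take void pointers, so type safety is entirely manual. */
    static hashval_t
    str_hash (const void *p)
    {
      return htab_hash_string (p);
    }

    static int
    str_eq (const void *a, const void *b)
    {
      return strcmp ((const char *) a, (const char *) b) == 0;
    }

    static void
    example (void)
    {
      /* The fourth argument is a delete callback, run on dead entries. */
      htab_t table = htab_create (31, str_hash, str_eq, free);

      /* With INSERT, find_slot reserves a slot that *you* must fill;
         forgetting to do so leaves a live but empty entry behind.  */
      void **slot = htab_find_slot (table, "key", INSERT);
      if (*slot == NULL)
        *slot = xstrdup ("key");

      /* With NO_INSERT, the returned slot itself may be NULL, an
         easy-to-confuse second failure mode.  */
      htab_delete (table);
    }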

Of course, my experience is somewhat slanted towards the middle-end; the set of pain points is somewhat different if you are working in the front-end or the back-end. And my set of pain points from the middle-end might be different if I had worked on different optimization passes. (My passes cared very little about things like aliasing, for instance.)

It's surprising to me that you mention memory management as a pain point, though I can see how the variety can be bewildering. The only distinction that really matters is between GC'd and non-GC'd memory; obstacks and alloc pools are just ways of providing specialized malloc-like interfaces. A useful rule of thumb is that if your data is only needed for one pass of the compiler, you can allocate it any way you like; if the data is longer-lived than that, it needs to go in GC'd memory. I can elaborate if you'd like, but that's the basic idea.
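
To make that rule of thumb concrete, here's a hypothetical sketch (invented names, C-era internals; a real GTY root also requires the file to be registered with gengtype and to include its generated gt-*.h header):

    /* Pass-local scratch data: any allocation scheme is fine.  */
    static void
    my_pass_local_work (void)
    {
      int *scratch = XNEWVEC (int, 128);  /* plain xmalloc underneath */
      /* ... use scratch for this one pass only ... */
      XDELETEVEC (scratch);
    }

    /* Data that outlives the pass (e.g. reachable from a GC root)
       must live in GC'd memory, and its type needs GTY(()) markers
       so gengtype can generate marking routines for it.  */
    struct GTY(()) my_summary
    {
      int count;
    };

    /* A GTY'd static is a GC root; without it, the collector would
       reclaim the object out from under us.  */
    static GTY(()) struct my_summary *persistent_summary;

    static void
    record_summary (void)
    {
      persistent_summary
        = (struct my_summary *) ggc_alloc_cleared (sizeof (struct my_summary));
      persistent_summary->count = 42;
    }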

FWIW, I agree that the whole GC system is somewhat baroque. The GC was a decent solution to an engineering problem and the whole mechanism nicely solved memory management problems and provided the basis for precompiled headers, but it causes problems in other ways nowadays and trying to get rid of it would be a huge effort.



