Distcc – distribute builds across multiple machines simultaneously (2006) (distcc.github.io)
39 points by Scoobs on Jan 1, 2021 | 25 comments



If you like distcc, perhaps you'd also like icecream (https://github.com/icecc/icecream), which I think is a bit easier to use.


I worked on adding distributed compilation to sccache [0]. Docs at [1] and [2]. Compared to existing tools, sccache supports:

- local caching (like ccache)

- remote caching e.g. to S3, or a LAN redis instance (unique afaik)

- distributed compilation of C/C++ code (like distcc, icecream)

- distributed compilation of Rust (unique afaik)

- distributed compilation on Windows by cross compiling on Linux machines (unique afaik)

Note that I think bazel also does a bunch of these, but you need to use bazel as your build system.

[0] https://github.com/mozilla/sccache

[1] quickstart - https://github.com/mozilla/sccache/blob/master/docs/Distribu...

[2] reference docs - https://github.com/mozilla/sccache/blob/master/docs/Distribu...
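For a rough idea of what the local-cache-plus-S3 setup looks like (the bucket name and region below are placeholders, and the distributed-compilation side additionally needs a scheduler and build servers configured per the docs above):

    # Wrap the compilers with sccache; builds fall back to plain compilation if the cache is unreachable.
    export RUSTC_WRAPPER=sccache        # Rust: cargo invokes sccache, which invokes rustc
    export CC="sccache cc"              # C/C++: prefix the real compiler with sccache
    export CXX="sccache c++"

    # Optional shared cache in S3 (bucket/region are made up).
    export SCCACHE_BUCKET=my-team-build-cache
    export SCCACHE_REGION=us-east-1

    cargo build --release               # or: make -j"$(nproc)"
    sccache --show-stats                # cache hit/miss counters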


I've been contemplating how to add ccache to my work's current icecream cluster and I also considered the possibility of using it for non-release branch dev builds in CI. I'll have to experiment with this instead. Thank you.


I'm using it. It reacts badly to network degradation: even a network only moderately worse than a single-switch LAN really hampers usage on cloud servers.


Just use Bazel with remote execution. It can distribute your unit tests, too.
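For reference, a rough sketch of what enabling that looks like (the endpoint is a placeholder; you need a Remote Execution API-compatible service behind it):

    # Run the whole test suite remotely; fan out well beyond the local core count.
    bazel test //... \
      --remote_executor=grpc://remote-exec.example.com:8980 \
      --remote_cache=grpc://remote-exec.example.com:8980 \
      --jobs=200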



Not the same thing. distcc is a wrapper around the compiler that works with any build system; just use CC=distcc.
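As a concrete sketch (host names and job limits here are invented), dropping distcc into an existing Make-based build is roughly:

    # Machines allowed to accept jobs; "/8" caps each host at 8 concurrent jobs.
    export DISTCC_HOSTS="localhost buildbox1/8 buildbox2/8"

    # Over-subscribe -j relative to local cores, since most compiles run remotely.
    make -j24 CC="distcc gcc" CXX="distcc g++"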


I don’t think anyone is claiming that Bazel is “the same thing” as distcc. But they both solve the problem of distributed builds.


But unless you're already using Bazel, it's impractical to suggest "just use Bazel to distribute your builds" as doing so requires that you first migrate your build to Bazel. distcc, as noted, can be applied to an existing build regardless of the build tool used.


So, you’re saying that it’s impractical to switch build systems, and that it’s inappropriate to suggest alternative build systems?

I think that if you’re reading a suggestion to “just use X” on Hacker News, the sensible thing to do is to evaluate the suggestion for yourself. You don’t need to put a long list of caveats in front of every software recommendation. If your problem is “My builds are slow, I want distributed builds,” then you should probably at least consider the two most popular solutions to that problem, which are Bazel and distcc. Both have their drawbacks.


Is distcc actually any faster than running the builds locally on a half-decent machine? If I recall correctly, the latency associated with getting the files to the build machines often exceeded the local compilation duration. I remember being really excited about the idea, and really disappointed with the results.


One use that I can think of (never tried it myself, though in theory it should work) is when you are cross compiling for a processor architecture that doesn't offer decent speed (many embedded chips), and the package you are compiling is difficult to cross compile (the build script makes decisions based on the local environment, or is otherwise hard to get to play nice with a cross compiler).

In that case you can run the build on the target CPU, but instead of calling gcc locally it calls distcc pointing to a cross compiler installed on a fast machine. This can be useful if you are compiling a bunch of packages from a distribution.
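A rough sketch of that setup, with illustrative host names, paths, and toolchain names: the fast x86 helper runs distccd with a cross toolchain exposed under the names the target's build expects, and the slow target board points distcc at it.

    # On the fast x86 helper: make "gcc"/"g++" resolve to a cross toolchain for distccd.
    mkdir -p /usr/lib/distcc-cross
    ln -s "$(command -v arm-linux-gnueabihf-gcc)" /usr/lib/distcc-cross/gcc
    ln -s "$(command -v arm-linux-gnueabihf-g++)" /usr/lib/distcc-cross/g++
    PATH=/usr/lib/distcc-cross:$PATH distccd --daemon --allow 192.168.1.0/24

    # On the slow target board: compile remotely, preprocess and link locally.
    export DISTCC_HOSTS="x86-helper/16"
    make -j16 CC="distcc gcc" CXX="distcc g++"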


I used distcc that way a long time ago. I had a PowerPC iBook (G4) and an x86 desktop (Pentium 4). The desktop helped compile software on the iBook using distcc and a cross-compiler, and it worked pretty well. I also used ccache on top of that to cache build output. That was probably around 2005, when most (all?) CPUs were still single-core.


Indeed, in my use (with the Parabola GNU/Linux-libre distro), this is the killer use-case of distcc. This is how most of Parabola's packages for MIPS were built, and how many for ARM are built.


We used to use distcc (around 2007) for the daily builds of one of our large, in-house C++ products, on the order of 10M LOC.

In principle, it was a good use-case with a highly modular structure and a clearly defined but chunky build graph.

In practice, it did work, but throwing more hardware at the problem on a single host turned out to be faster than the existing distcc setup and had much reduced operational complexity.

We could probably have tuned it further with distcc plus the new hardware, but we achieved the performance target we were looking for.


Running it locally will always be faster as long as your machine is not a bottleneck (#cores, RAM, ...). I think the use-case for distcc et al. is to let less-powerful machines run builds faster by leveraging other machines. That’s exactly what we use it for at work: our developers have not-so-powerful laptops, and with distcc/icecc they can utilize the power of our build agents in the server room.

Also interesting to read: https://github.com/StanfordSNR/gg
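For a sense of what that looks like from a weak laptop (host names and job counts invented), distcc's pump mode also ships preprocessing to the helpers, so the laptop mostly just orchestrates:

    # No "localhost" in the list, so all compile jobs go to the build agents.
    export DISTCC_HOSTS="agent1/16,cpp,lzo agent2/16,cpp,lzo"

    # pump starts distcc's include server so preprocessing also runs on the agents,
    # which matters when the local laptop is the weak link.
    pump make -j32 CC="distcc gcc" CXX="distcc g++"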


Or maybe if money is no object spin up a bunch of VMs in the cloud to compile.


At the moment, icecream is not suitable for this, as it is super sensitive to network degradation; I ran into a few compilation stalls that way.


I was actually just playing around with distcc for the first time last night. Compiling ungoogled chromium normally takes my desktop about 12 hours, and using distcc to share the load with my laptop (with a gigabit connection to my desktop) took a little under 7 hours. It definitely improved the speed considerably for me.


We used it for several massive C++ apps around 2012, and yes, it was much faster than our local machines; our machines at the time weren't state of the art, but they weren't bad either. We had a pretty amazing on-prem lab full of blades that we used for distcc, so I'm sure that helped deliver the lightning.

One thing people always overlook with distcc, though, is that while you compile remotely, you always link locally. That was the bulk of our build time, and it wasn't the fault of distcc. Object files would come back at lightning speed (compared to a pure local build), but linking was still non-trivial.

These days I'd bet a single modern AMD machine would beat distcc in our situation, given the network latency involved.


I think I last used distcc back in the mid 2000s, when I was mucking around with Gentoo a bit more frequently, on a single core machine.

I had two machines, one running Gentoo and one running Windows, and I would use distcc to let the Windows machine help boost compilation speed on the Gentoo box. I can't remember where the documentation was, but http://wikigentoo.ksiezyc.pl/HOWTO_Distcc_server_on_Windows.... seems to do a good job of capturing the general concept.


Does this cache build artifacts in any way? So if several developers are working on the same code-base, they can share build artifacts based on, like, the hash of the inputs? That would require reproducible builds, but that's probably not too tricky to achieve.


It doesn’t, IIRC. But you can combine it with ccache, which can use an NFS share. If you use clang, this might be of interest: https://github.com/yrnkrn/zapcc
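A minimal sketch of that combination (host names and paths are placeholders): ccache answers from its cache first and hands misses to distcc via CCACHE_PREFIX.

    # ccache consults its (possibly NFS-shared) cache first; on a miss it
    # prefixes the real compiler invocation with distcc.
    export CCACHE_PREFIX=distcc
    export CCACHE_DIR=/mnt/shared-ccache      # e.g. an NFS mount shared by the team
    export DISTCC_HOSTS="buildbox1/8 buildbox2/8"

    make -j16 CC="ccache gcc" CXX="ccache g++"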


We've built Crave.io to share build caches between developers without resorting to customer-supplied NFS or other storage bottlenecks.


This looks interesting. Up for a chat sometime?



