Canonical creates a custom 40-processor ARM build machine (thetanktheory.squarespace.com)
68 points by zdw on June 12, 2011 | 21 comments



I actually predicted this about a year ago and wrote a blog series about how to set one up yourself with generic scripts (later collected into an ebook here: http://hunterdavis.com/build-your-own-distributed-compilatio...). It was featured a few times here on HN. Cross compilation can be quite speedy, but speed isn't the only reason to use such a machine, especially in a business situation.


Why would cross compiling be seriously slow?


I suspect because it is an unusual case and doesn't get as much attention as the native gcc code. That is compounded by the need to "compute like an ARM" for the constants, leading to some emulation.

But the worst part is those damned autoconf scripts. They very cleverly probe the attributes of your x86 by compiling and running code during the build process, and then make decisions about how the code should run on your ARM. They are a never-ending sink of human effort. Best to just build on a machine where they will get the right answer without you fiddling with them.
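
That fiddling usually means pre-seeding configure's cache with the answers it cannot compute when cross building; roughly this, for every wrong guess (the cache variable here is just one common example, and the triplets are placeholders):

    # pre-seed the answers configure would otherwise discover by
    # running test programs (which it can't do when cross compiling)
    ./configure --build=x86_64-linux-gnu --host=arm-linux-gnueabi \
        CC=arm-linux-gnueabi-gcc \
        ac_cv_func_malloc_0_nonnull=yes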


The way I solved the autoconf problem was to have one ARM machine do the actual building and a cluster of powerful x86 machines running distcc with cross compilers. That way it builds really fast, and the package build system thinks it is native, without any problems.
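
Roughly like this on the ARM box (host names and the compiler triplet are just placeholders):

    # the x86 helpers run distccd with an ARM cross gcc installed
    # under the same name the ARM box invokes
    export DISTCC_HOSTS="x86-box1/8 x86-box2/8 localhost/2"
    make -j20 CC="distcc arm-linux-gnueabi-gcc"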

Honestly, I'm a little surprised this would be news. Doesn't everyone have an ARM cluster or a distcc-type setup like the above?

Edit: Back when Macs were PPC I did the same trick and had a handful of x86 Linux boxes with Apple's gcc set up for cross compiling running with distcc. It made the OS X builds run much faster.


How does distcc interact with link-time optimization?


Linking is done on the target box, and not on the distcc builders, so link-time optimization should be unaffected.


If by "unaffected" you mean "correct", then yes, as long as it is set up correctly for cross-compilation (I mean the compiler and assembler, which are unused on the target box in non-LTO mode).

But with GCC LTO, distcc will only distribute the parsing of the source code, while the optimization and code generation will be done on the target box, so the speedup from distcc will be much smaller (LTO makes the ratio of parallelizable to non-parallelizable work much lower).
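
As a rough sketch of what that looks like (flags and triplet are illustrative): the distcc helpers mostly just produce bytecode, and the expensive work happens at link time on the target.

    # compile jobs are still farmed out, but with -flto they are cheap;
    # optimization and code generation run locally during the final link
    make -j16 CC="distcc arm-linux-gnueabi-gcc" \
        CFLAGS="-O2 -flto" LDFLAGS="-O2 -flto"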

GCC LTO partitions the work; if it can interact with distcc, it can distribute optimization too, at the cost of some missed optimizations. I don't know whether that does the right thing in GCC 4.6.0.


autoconf scripts are generally OK. We cross-compile over a hundred packages for the Fedora Windows cross-compiler project [1], and it's not the autoconf projects that cause problems. It's the people who roll their own half-assed build systems who are the problem.

[1] http://fedoraproject.org/wiki/SIGs/MinGW


> by compiling and running code during the build process

They don't do that when used in cross-compiling mode.


If things are configured correctly, and you have an oracle to produce the answers that would be generated if you were running locally, then it can work. A lot of the common stuff works pretty well, but a lot of people's code fails miserably too.


The slow part is fixing the oceans of OSS code that doesn't cleanly configure/compile in a cross-compilation environment, rather than the compilation process itself.


Not sure, but compiling packages can be I/O bound rather than CPU bound, and it is fairly easy to split up.

Rather than one fast multicore server, this solution gets them a lot of separate systems, each with dedicated disk, memory, etc. Also, the reboot and wipe each time has security benefits.

The alternative, equivalent solution would be a bunch of VMs on a host, which would probably run into memory or I/O bandwidth contention quickly.


Yeah, this made no sense to me. There's no way a native build solution will be comparable to a high-end, many-core, large-cache x86 solution.


Also, a link to the hardware build blog: http://dmtechtalk.wordpress.com/


This is way cool, but mostly because it clued me in on the PandaBoard.


Calling this one 40-processor machine is much like calling a 42U rack filled with Dells an 80-processor machine.


It's like calling a bunch of minicomputers sending packets to each other over lines leased from AT&T a "network". Any fool can see that that's just a use of AT&T's network, not a network in itself. Networks are made of long lines interconnected with crossbar switches.

Right?

(Disclaimer: I wrote the Beowulf FAQ.)


try arguing the point instead.


So before building each package, the board PXE-boots and installs the OS onto the USB-attached hard drive? That seems inefficient. Why not use an overlay filesystem and just throw away the changes after each build? Is there even a need for local storage, or could the nodes run off an NFS export?
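
Something along these lines, for example (paths are made up, and at the time you'd reach for aufs/unionfs rather than the newer overlayfs, but the idea is the same): a read-only base, possibly an NFS export, with a tmpfs upper layer that disappears on reboot.

    # throwaway build root: writable tmpfs layer over a read-only base
    mount -t tmpfs tmpfs /scratch
    mkdir -p /scratch/upper /scratch/work
    mount -t overlay overlay \
        -o lowerdir=/ro-base,upperdir=/scratch/upper,workdir=/scratch/work \
        /build-root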


For one thing, each package needs a different build environment, so the overlay filesystem wouldn't necessarily help much. For another, can you say with confidence that the build process (which typically runs some steps as root) won't affect any system state outside of the filesystem?


Why run the build steps as root? Fakeroot seems to work for Debian.
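
For example, the usual Debian invocation builds the binary packages under fakeroot rather than real root (-rfakeroot sets the gain-root command; -b/-uc/-us just skip source build and signing):

    # fakeroot intercepts chown/mknod etc., so the packaging steps
    # see the ownership they expect without needing real root
    dpkg-buildpackage -rfakeroot -b -uc -us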



