What's in a Build Tool? (lihaoyi.com)
113 points by lihaoyi on March 4, 2016 | 69 comments



Regarding make, the author states:

  Parallelize different commands: Okay: There's no automatic way to parallelize different targets, though you can still parallelize a single target using Bash.
This is wrong, make has supported the -j flag since 1988:

       -j [jobs], --jobs[=jobs]
             Specifies the number of jobs (commands) to run simultaneously. If there is
             more than one -j option, the last one is effective. If the -j option is
             given without an argument, make will not limit the number of jobs that can
             run simultaneously.
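
In practice that usually just means something like this (nproc is the GNU coreutils helper that prints the number of available cores):

    make -j"$(nproc)"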


I can't believe anyone could use make in the last ~12 years without the -j flag. We've had at least dual cores for ages.


I have a co-worker who was told about -j a couple of weeks ago. Before that, he would bill half an hour for every time he did a clean build.


Actually, I think make should default to parallel builds.


The problem there is that it would break makefiles that do not handle their dependencies correctly. With non-parallel builds, the order of execution is deterministic, and so you can get away with sloppiness in your dependencies. If it works the first time that you test it, it will continue to work.
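
A minimal sketch (with hypothetical file names) of the kind of makefile that only works because of that serial ordering:

    # "all" lists gen first, so serial make always runs it before program;
    # with -j the two prerequisites can be built concurrently, and the
    # compile may then fail or pick up a stale config.h
    all: gen program

    .PHONY: all gen
    gen:
    	./generate-config.sh > config.h

    program: main.c        # missing dependency on config.h (and on gen)
    	cc -o program main.c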

I can understand why they would be hesitant to change the default, as they would rather that old, tested scripts continue to work without modification.


> The problem there is that it would break makefiles that do not handle their dependencies correctly.

Serious question: In what way are Makefiles that are not specifying their dependencies correctly ever to be considered non-broken?


As said: their behavior is still deterministic in non-parallel execution.

And it looks like the GNU make developers aren't like compiler developers, who exploit any ambiguity as an excuse to really mess up somebody's day.


It's the exact same argument that comes up whenever gcc improves its optimization algorithms by exploiting undefined behavior, making some code no longer work. In both cases, the original code was fundamentally broken from the start, and the change in tooling only revealed the brokenness rather than causing it.

I would absolutely consider such makefiles broken.


The impact of those changes is rather different. In the case of make, your build would probably break. In the case of gcc, your program's behaviour would silently change.


In the case of make, with missing dependencies, it can result in a file not being re-compiled when it should be. If you are compiling C, this can result in the definition of a function being different in two different compilation units. When one of those compilation units calls a function defined in the other, your program's behavior breaks. All due to a change in the build tool.


That's true. I qualified my statement with "probably" because there are exceptions. Protecting against those sorts of errors is why my release candidates are done with a clean build and newly fixed bugs are reverified on that package before release.


Of course, they are broken. If dependencies are specified incorrectly, there will be files that, if modified, do not lead to correct rebuilds.

But that doesn't mean there aren't stable workflows using those broken files that are not sensitive to the brokenness.


This is exactly why it should be the default. Defaulting to parallel builds would stop the epidemic of broken makefiles from growing.

Users of legacy makefiles would have to explicitly use "make -j1", or specify ".NOTPARALLEL:" in the makefile - no big deal.

(As a beneficial side effect, defaulting to parallel builds would create an incentive for fixing sloppy makefiles).
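
For the legacy case the opt-out really is tiny, e.g. (a sketch):

    # at the top of an old, order-dependent makefile:
    .NOTPARALLEL:

    # or, without touching the makefile at all:
    #     make -j1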


Callously harming existing and long-term users in the pursuit of some narcissistic ideological goal seems risky.


It will also not work with recursive make invocations, e.g. when the top-level makefile simply invokes the makefiles of several independent projects.


This actually works fine in GNU make too. The parent make acts as a job server. The implementation is ingenious, see http://make.mad-scientist.net/papers/jobserver-implementatio...
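
The one catch is that sub-makes only share the parent's job slots when they are invoked via $(MAKE), so a recursive setup looks roughly like this (a minimal sketch, not from the article):

    SUBDIRS := libfoo libbar app

    .PHONY: all $(SUBDIRS)
    all: $(SUBDIRS)

    # invoking sub-makes via $(MAKE) is what lets them inherit the jobserver;
    # a literal "make" here would not share the parent's -j slots
    $(SUBDIRS):
    	$(MAKE) -C $@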


It only works if all the dependencies are specified correctly; otherwise the make will fail.


It's a pity that rake is not included in the table at the bottom.

One huge advantage of rake is the ability to easily debug what is happening. A quick `p [filename, task]` tells you what is going on. If needed, you can even quickly do

    $ irb
    >>> require 'rake'
    >>> p FileList["foo/*"]
and actually play around with the libraries. If you know anything about Ruby, and especially if you are looking to build anything other than a C/C++ project, it is definitely worth looking at.


I thought this was going to be a rant against autoconf. When that rant comes I'll post our 186-line shell script that does what autoconf does for pretty much any platform. It handles AIX, Solaris, FreeBSD, HP-UX, IRIX, Linux, NetBSD, OpenBSD, SCO, SunOS, Windows, MacOS.

I hate autoconf but that's not part of this thread. I'll wait.


Unix, Unix, Unix, Unix, Unix, Unix clone, Unix, Unix, Unix, Unix, Windows, Unix.

So, your shell script handles Unix, its most popular clone and Windows. ;-)

The Windows part is interesting. What are the requirements to run your shell script on Windows?


I'm interested that you count BSD-derived OSs as "Unix" but Linux as a "Unix clone", even though they include the same amount of code from AT&T Unix (i.e. zero), and Linux and GNU have traditionally followed AT&T Unix conventions over BSD conventions.


BSD derived directly from AT&T UNIX. In fact, the last releases of BSD from Berkeley still contained substantial amounts of AT&T code. They made the Net/2 release without AT&T code, but this was not a complete system. 386BSD reimplemented most of the missing parts and modern BSDs built on that work.

tl;dr: while modern BSD does not contain AT&T UNIX code anymore, it evolved from AT&T UNIX. In contrast to Linux, it was not a clean-room implementation.


It's a ship of Theseus.

If you replace the whole thing step by step, each step a small one, is it still the same thing or isn't it?


The point is that it is not a clone of UNIX, it's one of the evolutionary strains of UNIX. You probably won't find substantial amounts of AT&T Research UNIX in Solaris, but it's still UNIX. The lineages of BSD and Solaris can be traced back directly to AT&T Research UNIX.


It seems to me that, today, this is perhaps more a discussion of "historical context" than of material reality.

After all the evolution and adaptation that's gone on over the last couple of decades, I wonder if saying "BSD is Unix" is any more than a notional thing.

Is the Linux kernel materially and substantially different from any current BSD kernels?


I'm curious about this shell script; this looks too good to be true!

Having automated the build of dozens of FOSS packages, I can definitely say that autoconf-based packages are _by far_ the easiest ones to build, especially if you want to do funny stuff like out-of-tree builds or cross-compilation (can your 186-line script do this?).
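
Both of those are stock ./configure conventions, for reference (a sketch, with a hypothetical cross toolchain target):

    # out-of-tree (VPATH) build
    mkdir build && cd build && ../configure && make

    # cross-compilation
    ./configure --host=arm-linux-gnueabihf
    make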

Some projects (x264, ffmpeg) provide 'configure'-like custom scripts, trying to imitate what autoconf does; but in practice you can't automate their build the same way you can with autoconf-based projects.


I see where the hate for autoconf comes from.

Properly done, autoconf is very developer-friendly for building, but if you have to maintain it, it can be a nightmare.

OTOH, because you can always escape to shell scripts, I usually can fix any issue that crops up.

With cmake, I regularly fail; e.g. configuring succeeds but building fails because of a missing library, and I actually have no clue what the standard approach should be (online documentation is abysmal with cmake).


Have you tried CMake?


Yes.

Cross-compilation of existing packages has proven to be a nightmare with cmake. I have to manually override CMAKE_C_COMPILER, CMAKE_CXX_COMPILER ...
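
i.e. something along these lines for every single package (hypothetical toolchain names):

    cmake .. \
      -DCMAKE_SYSTEM_NAME=Linux \
      -DCMAKE_C_COMPILER=arm-linux-gnueabihf-gcc \
      -DCMAKE_CXX_COMPILER=arm-linux-gnueabihf-g++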

Have a look at zenbuild, which is a toy project of mine trying to homogenize the build interface of many FOSS projects. Each package gets its own script (a la PKGBUILD); it can be seen as a list of hacks around the quirks of each project's build system.

Projects using autoconf are a breeze to build: https://github.com/Ace17/zenbuild/blob/master/zen-libxau.sh

But projects using cmake are another story: https://github.com/Ace17/zenbuild/blob/master/zen-x265.sh


I like the idea of CMake, but I thoroughly despise the pseudo-language.

I haven't used it in some time, so perhaps it's improved; but the ergonomics, documentation and developer-friendliness of it were absolutely dreadful.


Why not write a preemptive blog post?


Can you please GNU me your 186 lines of bytes? I am curious.


ninja? tup?

The article doesn't note the property of being able to depend upon the entire command that generates an output (i.e., re-generate when compiler flags change). This is something that makes doing reliable builds much easier (when present). It's notably very hard to do in make (and even then it is very inefficient).

Also, on "download" the author seems to presume that one takes the naive approach in each tool. In most cases, if one spends a bunch of time on it the downloads can be done fairly efficiently (especially in make, without even much work there). Most of these build systems are fully programmable, so the rating should probably focus more on the level of difficulty to do well (with some requirements specifying what "well" is)


To depend on compiler flags, I do this:

    ## compiler_flags file keeps a copy of the compile flags
    $(builddir)/compiler_flags: force
    	mkdir -p $(builddir)
    	echo '$(CPPFLAGS) $(CFLAGS)' | cmp -s - $@ || echo '$(CPPFLAGS) $(CFLAGS)' > $@

    ## rebuild all objects if compile flags change
    $(LIBOBJECTS) $(RTLLIBOBJECTS) $(OPTLIBOBJECTS) $(TESTOBJECTS) $(builddir)/init_qt_workdir: $(builddir)/compiler_flags

I"m pretty happy with the results.


Depending on compiler flags was implicitly mentioned for sbt, though (point 8). Depending on the `.value` of the `scalaVersion` configuration property is the same thing and can be done for all other settings input as well.


I think the author does a great disservice lumping together deploy, test, and dev sandbox tools under 'build tools'. They are different scenarios which all happen to require a built copy of the code.

There's no fundamental reason for them to use the same tool, except the inertial tug of whatever build tool your project happens to use.


Gradle is a glaring omission if Maven, Ant and sbt are included.


I'm curious what the author thinks of CMake. It reminded me a lot of Maven, but for C/C++ projects.


The author also doesn't cover autoconf/automake, which operates in the same vein as cmake (generating files for use by another build tool).

I agree, it would be useful to evaluate it here as it provides a bunch of the features the other "build tools" provide.


For C and C++, I've recently started using meson.

It's a breath of fresh air compared to some other build tools, and the config files are both expressive and concise.


For those curious (like me): meson's home page -- http://mesonbuild.com/


Am I the only one thinking that doing a simple compile is something that should be handled fully by the compiler and not the build tool? The compiler should automatically detect which files are being #included and allow for incremental compiles based on this. The TypeScript and Sass compilers are good examples of this: you just invoke it like 'tsc main.ts --watch' and it will automatically know that main.ts included other files, and these will also be watched.

If your compiler does not support this, the only way to build a proper dependency tree is: 1. the compiler can generate one in a standard format (gcc -MMD; see the sketch after this list), 2. your build tool knows as much about the language as the compiler does, or 3. you write the dependency tree manually.

1 is bad because it requires the compiler to support generating a wide range of dependency-graph formats, or else every build tool must support the make syntax.

2 is bad because the build tool becomes one massive monolith, and there is no way it can reliably support every language in the world.

3 is bad because humans are lazy and make mistakes.
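
To illustrate option 1, the usual GCC/GNU make incantation looks roughly like this (a sketch; recipe lines need a leading tab):

    SRCS := $(wildcard *.c)
    OBJS := $(SRCS:.c=.o)
    DEPS := $(OBJS:.o=.d)

    prog: $(OBJS)
    	$(CC) -o $@ $^

    # -MMD writes a .d fragment listing each object's header dependencies;
    # -MP adds dummy targets so a deleted header doesn't break the build
    %.o: %.c
    	$(CC) -MMD -MP -c $< -o $@

    -include $(DEPS)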

For the build tool, this leaves its responsibility as doing things before and after compilation (generating/preprocessing code, copying assets, running tests, deploying to staging), which usually means invoking arbitrary and custom shell commands; the order of these is always so application-specific that #3 is inevitable. If doing those things is complicated, your build tool sucks, and sadly I find most build tools fall into this category because they try #2 and want to be the center of the universe. Even doing a simple thing such as copying a file requires you to learn a new awkward XML syntax, install the "copy-file-plugin", or write your own copy-file-plugin in the awkward syntax. What I would want is something with the simplicity of Gulp or Make, but with a more widespread syntax and more batteries included for file manipulation; something like Python would be perfect. Rake looks promising, maybe it makes it worth learning Ruby?


Disagree. Compilers, like anything else, should be small composable libraries. Compilation and dependency parsing should perhaps be separate, and the same dependency-tracking library should be usable by the compiler, wider build tools, and IDEs or similar.


If the syntax is the only issue, maybe we should fork GNU make.


I was just writing a Rakefile and having a good time of it. I like using make, but rake gives me more functionality.


Rake is a pleasure for small projects, like generating HTML from a handful of org-mode files or zipping up a Chrome extension. I've not tried to take it much farther than that, though there is a series of blog posts I can't find at the moment in which the author used rake for real C projects with GCC.


The OP mistakes what a build tool is. A build tool generates the executable from sources; use other tools for other tasks. I once wrote a tool for running commands on file system events, but I lost it. But, for example, I save files rather often, and a build on each write would be annoying for me. It's a personal thing, so give me the build script and let me use it the way I like.
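
(For anyone who wants that lost tool back, it is a few lines of inotify-tools nowadays; a sketch:)

    # rebuild whenever a source file is written (needs inotifywait from inotify-tools)
    while inotifywait -q -r -e close_write src/; do
        make
    done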


Ant and Maven are fairly obsolete at this point in the Java world, having been superseded by Gradle, and so aren't really worthy of much discussion. But Gradle is quite good and I think it would stack up very well against the others, including sbt. Like sbt, I think it fuses a task-based and a configuration-based approach and also does dependency management.


Definitely not Maven. Maven is ubiquitous in Java projects. Gradle has been pushed hard by Android, but I know that a lot of developers aren't convinced that putting code into configuration is a good idea. Maven, even after all the XML crud, remains insanely flexible and powerful.


I have no opinion on Gradle at this point, though the developers you mention probably need a casual reminder that code == data, and that a sufficiently complex XML spec is indistinguishable from a poorly formatted Abstract Syntax Tree (or Lisp code).


Maven is really not insanely flexible. I have a build in which I wanted three distinct test phases (unit, functional, and API contract). A simple requirement. But I couldn't have that without writing a plugin, or breaking out a dummy module for one of them. We ended up making the contract tests a special case of the unit tests, selected by passing in some flags; a consequence of that is that it's impossible to run both sets of tests in a single build.

tl;dr: Maven is the exact opposite of flexible.


Creating profiles specific to each might help.


I think I still couldn't run all three sets of tests in one build though, right?


You can activate multiple profiles per build if needed.


But I can't run a goal twice under different profiles, can I?


No, but you can tie multiple plugin executions to a single phase.


Yikes, I still use Maven. I don't like being left behind in the dust. Is there any compelling reason to prefer Gradle? What mistakes did Maven make that Gradle avoided? One thing I am hesitant about: I don't need my build tool to be a Turing-complete language.


Technology radar [1] said 'Adopt' in 2013: "Language-based build tools like Gradle and Rake continue to offer finer-grained abstractions and more flexibility long term than XML and plug-in based tools like Ant and Maven. This allows them to grow gracefully as projects become more complex."

My personal opinion, after managing the build system of a fairly complex Java project for years: I'm glad we skipped Maven in the evolution. For me the most important part was error messages, and the most helpful were Ant > Gradle > Maven. We had very different conventions than what Maven offered, and shaping it to our process was cumbersome. Coupled with the fact that Maven 2.x and 3.x never played nicely together (at least for me), now when I see a Maven build I just sigh (looking at you, Atlassian).

Finally, a strictly personal observation: Ant/Maven tend to correlate with SVN (or older VCS) usage, which you can argue still does its job, but is pretty much considered obsolete.

[1] https://www.thoughtworks.com/radar/a-z#gradle


Watch for complexity. Circa 2000, when Ant was popular, I leveraged Jython and inlined small Python scripts into the rewrite of a grossly hairy onion of make, shell, and Perl scripts that comprised the build system for a sprawling collection of JDK 1.2/1.3 apps. In hindsight, this was probably a mistake: while it solved the problems of "We can't manage this!" and "Anything outside the JVM is not portable!", it made very little inroad against complexity.


Thanks, this is helpful! I didn't know about Technology Radar before now. Interestingly, they say "Subversion moves back into the Adopt section of the radar because it is a solid version control tool suitable for most teams."


For small projects, like a library or something, there isn't a huge difference. Your build is likely to follow the stereotypical compile - test - package cycle which both tools support out of the box.

There are a few advantages even for simple projects, though:

* I really like Gradle's wrapper feature, which means you can check in a small bootstrapper which downloads Gradle when you need it. That means you don't have to worry about installing the build tool, nor making sure you have the right version. It makes projects self-contained.

* Gradle caches downloaded dependencies in a directory which is distinct from your local repository. Maven just drops things into the local repository. To me, what Maven does is a fundamental confusion of purposes, and it makes it a headache to clean the cache, because you have no idea what is cached, and what is an irreplaceable locally-built artifact. Conversely, it makes it tough to look at your repository and work out what is an official artifact that came from the internet, and what is some crummy local thing you shouldn't rely on.

* Gradle knows about dependencies between tasks, and between tasks and files. When you run a build, Gradle can be fairly smart about what tasks it runs. If you're tweaking the packaging step, re-running the build won't run compile and test if the code hasn't changed. If you just want to run functional tests, tell Gradle, and it will re-compile if necessary, but won't unnecessarily run the unit tests. Doing this with Maven involves manually working out what phases are necessary for what you want to do, and then listing them all on the command line.

Gradle starts to pull ahead significantly for large projects, though. In those, you inevitably need to do more complicated things - code generation, running special tools, multiple kinds of tests, including extra bits and bobs in the packaging, etc. For those, Maven rapidly becomes painful, whereas Gradle's difficulty scales no more than linearly with the challenge.

On the flip side, there's no significant downside to Gradle itself. It's perhaps fractionally slower. It opens the door to writing loads of Groovy code and making your build a nightmare, but there's a simple way to avoid that - just don't do it. You never need to write full-blown code to do anything you could do in XML in Maven; you only need to write code where you would also need to do so in Maven, as a plugin.

And when that does happen, you'll find that writing plugins is also much, much nicer with Gradle than with Maven, and there are really lightweight ways to start - you can write new task definitions right in your build script (which is not nightmare-inducing, because they work inside the normal Gradle flow), then move them out to external modules as you wish.

I'd suggest giving Gradle a go on a small project, one where you don't anticipate a lot of build complexity, just to get a feel for it.


> It opens the door to writing loads of Groovy code and making your build a nightmare, but there's a simple way to avoid that - just don't do it

The parent commenter said "I don't need my build tool to be a turing complete language". Perhaps s/he knows that when the language is there, people will use it, no matter how much someone in Change Control says "writing loads of Groovy code: just don't do it."


Ant maybe, but Maven is hardly obsolete. Most every Java project I run into uses Maven or Gradle, with Maven being more common.


Gradle has superseded Maven in the same way that Node has replaced Java and C++ in production (hint: it hasn't).


Isn't Gradle based on and using Maven (under the hood)?


Gradle allows you to 1) use Maven plugins and 2) download dependencies from Maven repositories. So my understanding is that Gradle is backwards compatible but separate.


After reading this article on NPM as a build tool I immediately gave up on Gulp and Grunt. Highly recommended! http://blog.keithcirkel.co.uk/how-to-use-npm-as-a-build-tool...



This is helpful and uses a good matrix to compare the tools in question. I've been wanting to see a version of this for cross-language tools: Bazel et al, msbuild, scons, and do not forget the quiet but powerful waf.



