Buck: A high-performance build tool (buckbuild.com)
94 points by geezerjay on April 21, 2019 | 69 comments



I get the sense that Bazel will end up with a larger ecosystem than Buck, because Google’s Cloud business depends on open-sourcing more stuff, so they’ll put more effort into open-source Bazel.

Uber is migrating from Buck to Bazel, for instance.


Buck and the monorepo are a disaster there. A bunch of people from Google forced that stuff down everyone's throat. Concerns and real-life problems were simply discarded. Another solution in search of a problem; not-invented-here at the extreme.


What made Buck fail at Uber vs Facebook? I thought they successfully used it for larger repos there.


My understanding is that Facebook doesn't do Go, so it's always been a second-class citizen for Buck. Go folks at Uber were spending a significant amount of time (on the scale of several months) trying to upstream fixes to Buck core, but found that Bazel's extensibility allowed them to get similar functionality in a matter of weeks or even days. Buck also didn't seem to have a good story around Thrift/Proto and JS/Node.

At Uber, the Java stack is still sort of OK w/ Buck (because Buck handles some things better there), so they're taking a wait-and-see approach until the Bazel ecosystem catches up w/ Buck on the concerns they care about. But long term, we're envisioning a company-wide monorepo, and that sort of entails having a unified build system.


Also (Bazel was already mentioned, so no need to repeat it):

Pants - https://www.pantsbuild.org/index.html

Please.Build - https://please.build/

Closely related, but functioning a bit differently:

Gn - https://gn.googlesource.com/gn/ (targets ninja). Used by Chromium, Fuchsia, and others

Soong - https://android.googlesource.com/platform/build/+/master/REA... (targeting Kati?)


Would it be interesting to list Infrastructure-as-Code tools and cloud pipeline tools in parallel? At some point, build tools and infrastructure tools may converge:

Kubeflow Pipeline (for ML) - https://www.kubeflow.org/docs/pipelines/pipelines-overview/

Tekton Pipeline - https://github.com/tektoncd/pipeline

TF Extended - https://www.tensorflow.org/tfx


I'm also pretty sure that this will happen.


Google keeps inventing new build systems for its OS projects even when Bazel is available.


We think Buck is great. Its deterministic, hermetic builds and its composable, declarative high-level build-description language made packaging very easy. We even built a package manager for Buck: https://github.com/LoopPerfect/buckaroo

Currently we market it for C++, but it can be used for any language that Buck supports.


What’s your take on Bazel? Have you considered offering support in Bazel for your package manager?


> What's your take on Bazel?

Here are a couple of key points:

- Buck and Bazel are very similar.

- Buck currently models C++ projects better [1].

- Buck currently leverages remote caches better than Bazel [3].

- Bazel is very easy to install.

- Bazel's "toolchains" make it very easy to onboard newcomers (to any project and language) while also ensuring the build will run as expected.

- Bazel is less opinionated and more extensible than Buck.

In fact, Bazel is so powerful that you can have build files that download a package manager and use it to resolve more dependencies. This is great for getting things off the ground, but it makes things less composable, because the package manager won't see the whole dependency graph. As a result you might get version conflicts somewhere down the line.

To summarize: I think a very opinionated build system is easier to reason about and usually scales better.

Communities with very opinionated packaging and build systems are proving this by having orders of magnitude more packages than e.g. the highly fragmented C++ community, where configuration is preferred over convention.

> Have you considered offering support in Bazel for your package manager?

Yes, we did. As soon as this feature [1] is implemented, we will have a 1:1 mapping between C++ Buck projects and Bazel. Then, after a small (automated) refactoring of our Buckaroo packages, you should be able to build any package from the Buckaroo ecosystem with either Buck or Bazel.

Btw, the C++ Slack community is attempting to create a feature matrix of various build systems here [2].

[1] https://github.com/bazelbuild/bazel/issues/7568

[2] https://docs.google.com/document/d/1y5ZD8ETyGtxCmtT9dIMDTnWw...

[3] https://github.com/bazelbuild/bazel/issues/7664


I'd say the main challenge that both Bazel and Buck face in C++ land is that it's still too much work to migrate an existing CMake/autotools project to them. One nice project that tries to tackle that is https://github.com/bazelbuild/rules_foreign_cc. But yes, there are minor things that make migration from Bazel to Buck or vice versa not as smooth as it could be, and I'm confident those will be fixed.

To me the biggest added value of Bazel is the remote (build and test) execution (which will get a nice performance boost from https://github.com/bazelbuild/bazel/issues/6862 in Bazel 0.25; also mentioned in [3]).

(Your [3] doesn't compare Bazel and Buck, only Bazel with remote caching and without it, so it's not clear from it that Buck leverages caches better than Bazel).

And one nit: Bazel doesn't allow you to download anything in the loading, analysis, or execution phases ("the BUILD files"); those are completely hermetic, sandboxed, and reproducible (when the compilers are). The package-manager integrations happen when resolving external repositories ("the WORKSPACE file"), where non-hermetic behavior is allowed and is used, e.g., to autoconfigure the C++ toolchain or to download npm packages.
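For illustration, here's a minimal sketch of that boundary (the dependency name, URL, and hash below are made up):

    # WORKSPACE -- external repositories are resolved here; network access is allowed.
    load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

    http_archive(
        name = "some_lib",  # hypothetical external dependency
        urls = ["https://example.com/some_lib-1.0.tar.gz"],
        sha256 = "<pin the archive's content hash here>",
    )

    # BUILD -- loading, analysis, and execution are hermetic; no downloads allowed.
    cc_library(
        name = "mylib",
        srcs = ["mylib.cc"],
        deps = ["@some_lib//:some_lib"],  # refers to the external repo declared above
    )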


It's great that we have many options. However, as I review the various build tools, I can't find one that suits my needs. Most of the build tools, such as Bazel and Nix, take over the native language's dependency management and build operations. What I would like is a simple build-dependency tool, similar to Amazon's internal build tool (called Brazil), that can manage pulling packages locally to build.

Here is what I would like to achieve in my own projects:

1. Work on a NodeJS project A that can locally install npm packages based on package.json.

2. Have another Python or Go project, B, whose build depends on project A.

3. The build tool allows dependencies to be pulled either from an artifact repository or from a locally checked-out version.

This build system focuses on the WHAT to build vs. the HOW to build, which will be driven by each project's own build tool, e.g. ant, npm, maven, etc.

I am considering creating my own simple build tool, inspired by Amazon's internal tooling.

BTW, does anyone know if I am allowed to develop my own tooling inspired by a company's internal tooling? Obviously it won't be exactly the same, but it would take a lot of inspiration from it.

Here are more details on Amazon's build tool: https://gist.github.com/terabyte/15a2d3d407285b8b5a0a7964dd6...


Software development is all about not reinventing the wheel, yet there are countless build systems and tons of pointless frameworks. I wonder if developers have nothing else to do. Every company releases its own version of everything.


My understanding of Buck (and Pants) is that they are the result of ex-Google employees going to other companies (Facebook and Twitter, respectively), realizing that Blaze was more or less the Right Way to do a build system (at least for their set of circumstances), and then being forced to re-implement the ideas from scratch (edit: or from memory, or from exfiltrated docs/code) because Blaze was not open source. Reusing extant software is great, but it requires that software to be available to you in the first place.


Bazel is basically current Google employees doing the same thing, due to how tightly Blaze is tied to Google infrastructure. Any open-source projects, or things that may one day be a separate business unit under Alphabet rather than part of Google, can't use Blaze, so they set out to make a version of Blaze that they can use.


“Reimplementation from scratch” seems like a generous description of Buck. It was initially so similar to Blaze that I always assumed a xoogler exfiltrated at least Blaze's documentation.


I really can't speak to that in any way; I was just trying to answer the OP's question as to why there are a number of (on their face, very similar) Bazel-style build systems. It certainly seems like a fair assumption, though.


Buck definitely looks familiar, having worked with e.g. Bazel.


I remember that, being a former Google intern.


What's the prior art here? Specifically,

- Support for thousands of related and unrelated codebases in the same repo, with a nuanced understanding of dependencies so that all "dirty" objects, and no others, are rebuilt/retested for each change.

- Hermetic builds to weed out undeclared dependencies.

- Support for remote build cache / remote build workers.

- Understanding of many unrelated languages.

I can't imagine GNU Make being reasonable in this kind of use case. What would you choose?



At a previous job we had the choice of buying a library for a couple grand from a field expert in said library, or putting one of our senior engineers (who clearly costs the company more than a few grand a year) on rebuilding the same thing. Hundreds of thousands of dollars were wasted (think about the management time, not just the employee's own time) to build something in, let's say, a year, when said developer could have used the few-thousand-dollar library and done much more productive work to generate something of value.

Now they have to maintain a fork instead. What's worse, they likely won't know how it works in a year's time; imagine in a few years when they need him to update the library.

Sometimes it's better to buy a library than to waste resources reinventing someone else's tried and tested wheel.


Because nobody's done it right yet. Or if they have, they haven't marketed it right yet.

I, for one, applaud the effort. Maybe someday we can finally relegate CMake to the dustbin of history.


Can anyone who's used both compare Buck to Blaze/Bazel?


Overall, Buck and Bazel are quite similar, as they are both converging on the Starlark DSL.

However, there are still some differences: Buck is much more opinionated than Bazel. Buck models C++ projects slightly better [1], and currently its remote cache is more efficient [2].

Bazel is more extensible and also offers remote execution. Bazel has a bigger community and its roadmap is public.

More than 350 C++ libraries [4] have been ported to Buck for the Buckaroo package manager [3].

There are also technical details that manifest in some odd ways but are not significant. There is a nice paper by Simon Peyton Jones (of Haskell fame) and others that goes into the design details [5].

[1] https://github.com/bazelbuild/bazel/issues/7568

[2] https://github.com/bazelbuild/bazel/issues/7664

[3] https://github.com/LoopPerfect/buckaroo

[4] https://github.com/buckaroo-pm

[5] https://www.microsoft.com/en-us/research/uploads/prod/2018/0...

Happy to go into more detail if desired.


Uber is migrating away from Buck to Bazel. One major reason is poor support for a variety of languages (Go being one big one, where fixing issues was historically slow due to a need to upstream fixes to Buck core).

I haven't worked with Buck myself, but colleagues who evaluated it for JS have expressed concerns with lack of support/ecosystem there as well. In comparison, there are various Bazel rulesets for JS/Typescript, and I've had some pretty good experience w/ implementing rules myself. The Starlark docs are good.

Another thing going for Bazel is its ability to embed external codebases into a build system. This mechanism allows rules to be shared among repositories in a reusable fashion.
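To give a flavor of the rule-writing mentioned above, here is a toy Starlark rule; the rule itself is made up, but rule(), ctx.actions, and DefaultInfo are the real Bazel API:

    # defs.bzl -- a toy rule that concatenates its sources into a single output file.
    def _concat_impl(ctx):
        out = ctx.actions.declare_file(ctx.label.name + ".txt")
        ctx.actions.run_shell(
            inputs = ctx.files.srcs,
            outputs = [out],
            command = "cat %s > %s" % (
                " ".join([f.path for f in ctx.files.srcs]),
                out.path,
            ),
        )
        return [DefaultInfo(files = depset([out]))]

    concat = rule(
        implementation = _concat_impl,
        attrs = {"srcs": attr.label_list(allow_files = True)},
    )

Because the action declares its inputs and outputs explicitly, Bazel can sandbox, cache, and parallelize it like any built-in rule.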


Does this include mobile projects? I know Uber was/is pretty big on the Buck migration


Yes


There is also Redo

https://redo.readthedocs.io/en/latest/

Very different, never used it, but very interesting nonetheless.


Redo is an awesome replacement for make - way better in many ways, while being much simpler.

However, it doesn’t do things Buck and Bazel do, such as making sure only declared files are indeed used, or tracking compiler and toolset versions on its own.


> Buck looks at the contents of your inputs, not their timestamps to figure out what needs to be built. As a result, incremental builds should always be correct, so there's no need to perform a clean build.

At least one of these new-fangled tools gets at least one part right. The salient point is now of course whether Buck considers the correct set of dependencies as well as negative dependencies.


Buck makes you declare every dependency explicitly. A particular target must specify exactly what files it uses, and when it builds, it copies those files from the target and all its dependencies into an empty sandbox and builds there. It's wonderful.
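A sketch of what that looks like in a BUCK file (target names and paths here are hypothetical):

    # BUCK -- every source, header, and dependency is declared explicitly.
    cxx_library(
        name = "json",
        srcs = ["json.cpp"],
        exported_headers = ["json.hpp"],
        deps = ["//third-party/fmt:fmt"],  # hypothetical in-repo dependency
        visibility = ["PUBLIC"],
    )

Only what's listed ends up in the sandbox, so an undeclared #include fails the build instead of silently working.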


Wonderful until you have a few thousand files you need to declare explicitly.

Or do I misunderstand something here?



BTW take a look at tup: it tries to infer dependencies by watching which files each build step actually uses.


Redo enhances the timestamp approach instead: https://apenwarr.ca/log/20181113


why are there so many build systems?

it's understandable why programmers have opinions about aesthetics regarding their IDE, language, or framework of choice, but what is there to be opinionated about with a build system?


>, but what is there to be opinionated about with a build system?

I think you're saying this because your idea of a "build system" is a very simple set of sequential steps from known source code input files to an output binary.

The differing opinions come in when the "build system" includes various contradictory philosophies of how to configure and specify the building of complex software.

Different opinions on:

- syntax: should the build config be XML, JSON, YAML, or a custom syntax? E.g. Ant and Maven used XML but Gradle does not

- dependency search: should the build system implicitly find and add relevant dependencies? Or should the programmer explicitly specify each one?

- reproducible builds (version pinning) vs auto-updated dependencies.

- should the build system be "smart" about cross-platform differences? If yes, you end up with complicated build tools like GNU Autoconf's "configure" bash script or CMake's complex syntax.

- should the build system do optimization and bundling tasks that aren't strictly limited to "compile" steps? Some JavaScript build tools try to eliminate redundant or unused JS code to make downloads smaller.

- etc, etc.

I also recommend reading the blog post "So you want to write a package manager"[1] to get an idea of how complicated a build system can get. The title says "package manager" but much of the material is also about build systems.

The bottom line is that reasonable people can disagree on the priorities and therefore, you can't create The One & Only Build System to End All Other Build Systems.

[1] https://medium.com/@sdboyer/so-you-want-to-write-a-package-m...


I feel like there is a ‘best’ answer to most if not all of these questions, and the build system should standardize on whatever that best practice is.

It’s great that Timmy thinks an obscure extension of YAML is the best way to define build configs, but all the rest of the engineers use JSON, so that is what we use.


Because Google didn't open-source Blaze soon enough, and xooglers who moved to other companies wanted something similar and ended up building clones (Buck and Pants come to mind). Eventually, Google did release an open-source version of Blaze called Bazel, but by then the ecosystem was already rife with many other similar-but-slightly-different build systems.


> why are there so many build systems?

Perhaps because the incumbent/popular build systems are far from being the best solution for the job, particularly when compared with build systems available for other technologies.

For instance, in C and C++ land the incumbents are still hand-written Makefiles, autotools, or CMake, all of which leave much to be desired.


- configuration format/language

- system requirements

- convention vs configuration

- flexibility vs predefined paths

- tradeoff features vs their complexity (parallelism, caching, ...)

- level of integration with specific other tooling (VCS, languages, package managers, ...)


It's a hard thing to do right despite looking deceptively simple.


Well many build systems don't even try to work properly, many new ones included. E.g. anything using timestamps is unequivocally an incorrect build system.


Can you explain why? I’d be interested to learn more.


Apart from issues pertaining to timestamp granularity and clock skew, there is a problem that is technically not about timestamps but is related, since it's often a consequence of the dependency data model employed by that class of build systems.

Consider this makefile rule:

    foo.a: $(patsubst %.c,%.o,$(wildcard *.c))

Now, the foo.a target will be considered stale if any of the timestamps of its dependencies is newer than foo.a's.

But what if you remove a .c file?

As build systems like make don't capture a fingerprint of the names of a build's inputs (let alone a hash of their contents), it's very hard for them to handle that case correctly: delete bar.c and bar.o simply drops out of the prerequisite list, nothing is newer than foo.a, and the archive is considered up to date even though it still contains the stale bar.o.

Correctness is important; otherwise your trust in incremental builds quickly erodes and you start doing clean builds every time you get an error, just in case.


Apenwarr (the redo implementer) claims [0] that checksums are not necessary if you extend the mtime with the file size, inode number, file mode, owner uid, owner gid, and (for targets only) the sequence number of the last time it was built.

The blog post [0] also contains arguments against checksums: sometimes building a target has side effects; checksumming every output after building it is somewhat slow; and checksumming every input file before building is very slow.

[0] https://apenwarr.ca/log/20181113


Building a target shouldn't have side effects. But things you may want to do with your build system might have side effects. Bazel, for example, supports those with the bazel run command, as opposed to bazel build.


Yeah, but while the mtime++ vs. content-hash dichotomy is interesting and all, in a way it's all just a performance optimization.

What matters is keeping track of the actual dependencies and Redo does indeed save that knowledge in a database that gets consulted between runs.


In principle, you want the output of a build to always be the same if the inputs are the same. In terms of correctness, you want "sameness" to be defined in terms of the contents of the inputs, not their timestamps, since timestamps can easily be changed inadvertently by things like `touch`, 3rd-party tools, etc. Also, relying on timestamps could pose problems for caching/checksums/etc. if they are printed into any transitive dependency of a build pipeline.


Furthermore, timestamp precision is still problematic. Some filesystems still have one-second granularity, but even if you use a filesystem capable of storing sub-millisecond timestamps, that doesn't mean the commands invoked as part of the build process (e.g. cp -p) use the APIs that take advantage of such precision. E.g. see the docs for the .LOW_RESOLUTION_TIME directive of GNU make.


Because it’s easy to implement something that does 80% of the features, and also it’s easier than learning all the gnarly features of make

Last year I learned about .PRECIOUS and it was awful


I think it boils down to taste, just as there are many languages which have basically the same expressive power. Similarly, there are many shells like bash, tcsh, ...


Building and deploying software is an unsolved problem and this bottleneck becomes increasingly costly and painful as the digital revolution accelerates.


Because it’s fun to build them and “do it right”... :)



The C++ community started a comparison of various build systems a while ago: https://docs.google.com/document/d/1y5ZD8ETyGtxCmtT9dIMDTnWw...


Is it possible to do distributed builds with GCC and LLVM? I would like to see more options for distributing builds across a cluster of servers in order to significantly speed up the build process.



(Disclaimer: Bazel engineer)

Bazel does ecosystem- and language-agnostic distributed builds with both GCP and self-hosted solutions.

Here's a demonstration of building the Angular project remotely: https://youtu.be/lDyIc2Abkwg?t=593


Can Buck handle Maven dependencies yet? We don't want to vendor hundreds of jars into our repositories like we did before Maven.


As noted below, no, but Bazel now officially does, if you'd like something similar (this is new as of a couple weeks ago): https://github.com/bazelbuild/rules_jvm_external
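Roughly, usage looks like this (the artifact and version are just examples):

    # WORKSPACE -- after setting up rules_jvm_external itself.
    load("@rules_jvm_external//:defs.bzl", "maven_install")

    maven_install(
        artifacts = ["com.google.guava:guava:27.1-jre"],
        repositories = ["https://repo1.maven.org/maven2"],
    )

    # BUILD -- depend on the resolved Maven artifact.
    java_library(
        name = "app",
        srcs = glob(["*.java"]),
        deps = ["@maven//:com_google_guava_guava"],
    )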


No, but this will be covered by Buckaroo [1].

[1] https://github.com/LoopPerfect/buckaroo/issues/314


Bazel is significantly superior.


Are you able to elaborate on why?


Edit: nevermind, saw facebook and build tool and assumed it was site assets. Note to self, read whole articles before commenting.


This is a build tool for Java and C++ code. Good luck deploying those without a build step.


Quick shoutout to not reading.



