Buck2: Our open source build system (fb.com)
392 points by mfiguiere on April 6, 2023 | 271 comments



The fact that Buck2 is written in a statically-compilable language is compelling, compared to Bazel and others. It's also great that Windows appears to be supported out of the box [1,1a] -- and even tested in CI. I'm curious how much "real world" usage it's gotten on Windows, if any.

I don't see many details about the sandboxing/hermetic build story in the docs, and in particular whether it is supported at all on Linux or Windows (the only mention in the docs is Darwin).

It's a good sign that the Conan integration PR [2] was warmly received (if not yet merged). I would hope that the system is extensible enough to allow hooking in other dependency managers like vcpkg. Using an external PM loses some of the benefits, but it also dramatically reduces the level of effort for initial adoption. I think Bazel suffered from the early difficulties integrating with other systems, although IIUC rules_foreign_cc is much better now. If I'm following the code/examples correctly, Buck2 supports C++ out of the box, but I can't quite tell if/how it would integrate with CMake or others in the way that rules_foreign_cc does.

(one of the major drawbacks of vcpkg is that it can't do parallel dependency builds [3]. If Buck2 was able to consume a vcpkg dependency tree and build it in parallel, that would be a very attractive prospect -- wishcasting here)

[1] https://buck2.build/docs/developers/windows_cheat_sheet/ [1a] https://github.com/facebook/buck2/blob/738cc398ccb9768567288... [2] https://github.com/facebook/buck2/pull/58 [3] https://github.com/microsoft/vcpkg/discussions/19129


One side effect of all the Metaverse investment is that Meta now has a lot more engineers working on Windows. You bet there will be real world usage. ;)


> There are also some things that aren't quite yet finished:

> There are not yet mechanisms to build in release mode (that should be achieved by modifying the toolchain).

> Windows/Mac builds are still in progress; open-source code is mostly tested on Linux.

Source: https://buck2.build/docs/why.


> I don't see many details about the sandboxing/hermetic build story in the docs, [...]

Looks like local mode just inherits whatever environment the buck daemon was spawned in.

The remote execution thing is configured with a Docker image to run things in, and only specified files are copied into the container instance, so it's somewhat hermetic. Docker containers aren't really reproducible, and there's only one image per remote execution backend, so that's kinda the weakest link (especially compared to something like Nix's hermetic builds, where the build-visible filesystem only contains the things you declared as dependencies).


Internally, we don't use Docker in our Remote Execution service implementation; the Linux workers use cgroups for isolation, whereas the macOS and Windows story is still being worked on.

IIUC, the publicly available Remote Execution services out there are specified in terms of Docker images, so we chose to have OSS buck2 align with that.

As noted, local mode doesn't do anything else at this point, but we've discussed exactly this to help developers identify dependency declarations earlier.


Great to see this. I hope it takes off - Bazel is useful, but I really like the principled approach behind Buck2 (see the Build Systems à la Carte paper), and Neil is scarily good from my experience of working with him, so I'd expect that they've come up with something awesome.

One thing I find annoying with all of these general, language-agnostic build systems though is that they break the "citizenship" in the corresponding language. So while you can usually relatively easily build a Rust project that uses crates.io dependencies, or a Python project with PyPI dependencies, it seems hard to make a library built using Bazel/Buck available to non-Bazel/Buck users (i.e., build something available on crates.io or PyPI). Does anyone know of any tools or approaches that can help with that?


Regarding Bazel, rules_python has a py_wheel rule that helps you create wheels that you can upload to PyPI (https://github.com/bazelbuild/rules_python/blob/52e14b78307a...).
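For a rough flavor, a minimal py_wheel setup might look something like this (a sketch only; the load paths and attributes follow the rules_python docs but may differ by version):

    # BUILD.bazel -- hypothetical example of wrapping a py_library into a wheel
    load("@rules_python//python:defs.bzl", "py_library")
    load("@rules_python//python:packaging.bzl", "py_wheel")

    py_library(
        name = "mylib",
        srcs = glob(["mylib/**/*.py"]),
    )

    py_wheel(
        name = "mylib_wheel",
        distribution = "mylib",  # the name used on PyPI
        version = "0.1.0",
        deps = [":mylib"],
    )

Building that target produces a .whl under bazel-bin that can then be uploaded like any other wheel.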

If you want to see a Bazel-to-PyPI approach taken a bit to the extreme, have a look at TensorFlow on GitHub to see how they do it. They don't use the above-mentioned build rule because I think their build step is quite complicated (C/C++ stuff, CUDA/ROCm support, Python bindings, and multi-OS support all in one before you can publish to PyPI).


I use py_wheel to build packages to be consumed by data scientists in my company. It works well and is reasonably straightforward. Although the packages are pure Python so I haven’t had to deal with native builds.


I have a lot of respect for Neil, but I've been burned by the incompleteness and lack of surrounding ecosystem for his original build system Shake (https://shakebuild.com/). This was in a team where everyone knows Haskell.

I'm cautiously optimistic with this latest work. I'm glad at least this isn't some unsupported personal project but something official from Meta.


I think of Shake as a library for implementing build systems, and was hoping that libraries would emerge that described how to implement rules for languages like C++, and how they should compose so you can compile C++/Haskell/Python all together happily. A few libraries emerged, but the overall design never materialized. Sorry you got burned :(

Buck2 is at a higher level than Shake - the rules/providers concepts pretty much force you into a pattern of composable rules. The fact that Meta has lots of languages, and that we were able to release those rules, hopefully means it's starting from the point of view of a working ecosystem. Writing those rules took a phenomenal amount of effort from a huge range of experts, so perhaps it was naive that Shake could ever get there on only open source volunteer effort.


The “citizenship” point is really interesting. I’ve found these build systems to be really useful for solving problems in multi-language repos. They make it super easy to create all the build artifacts I want. However, in many ways, they make the source more difficult to consume for people downstream.


Bazel now has a module system that you can use.

https://bazel.build/external/module

This means your packages are just Git repos + BUILD files.
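As a sketch, a MODULE.bazel file declaring dependencies looks roughly like this (the version numbers here are placeholders):

    # MODULE.bazel -- dependencies are declared by name/version and resolved
    # from a registry, rather than vendored or fetched ad hoc
    module(name = "my_project", version = "1.0")

    bazel_dep(name = "rules_python", version = "0.27.0")
    bazel_dep(name = "platforms", version = "0.0.8")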


These kinds of tools are designed to work in monorepos so you don't really rely on package management like you do with separate repos. This works really well for sharing code inside companies/entities. Doesn't work as well for sharing code between entities.


If I'm understanding, for the rust specific case, this generates your BUCK files from your Cargo.toml:

https://github.com/facebookincubator/reindeer


> One thing I find annoying with all of these general, language-agnostic build systems though is that they break the "citizenship" in the corresponding language

I mean, this is kind of the whole point. A language agnostic build system needs a way to express dependencies and relationships in a way that is agnostic to, and abstracts over, the underlying programming language and its associated ecosystem conventions.


That is only true if the output is an application.

If the output is libraries for some ecosystem (perhaps with bindings to something written in Rust or C), one needs to be able to build packages that others not invested in that build system can consume.


The linked paper is pretty interesting, and short at 4 pages.

In plainer language, I'd say the observation/motivation is that not only do compiling and linking benefit from incrementality/caching/parallelism, but so does the build system itself. That is, the parsing of the build config, and the transformation of the high level target graph to the low level action graph.

So you can implement the build system itself on top of an incremental computation engine.

Also the way I think about the additional dependencies for monadic build systems is basically #include scanning. It's common to complain that Bazel forces you to duplicate dependency info in BUILD files. This info is already present (in some possibly sloppy form) in header files.

So maybe they can allow execution of the preprocessor to feed back into the shape of the target graph or action graph. Although I wonder what effect that has on performance.
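A toy sketch of that idea in Python (not Buck2's actual API): a memoized fetch function where the dependency set is discovered only after reading the file's contents, which is what makes the build monadic rather than applicative:

    # Toy monadic build: dependencies are discovered by scanning #includes.
    import re

    SOURCES = {
        "main.c": '#include "util.h"\nint main() { return util(); }',
        "util.h": "int util();",
    }

    cache = {}

    def fetch(key):
        """Build `key`, memoizing results; deps are found dynamically."""
        if key in cache:
            return cache[key]
        src = SOURCES[key]
        # Which headers we need is only known after reading the file --
        # an applicative system would need the full list up front.
        for header in re.findall(r'#include "(.+?)"', src):
            fetch(header)
        cache[key] = "compiled(%s)" % key
        return cache[key]

    print(fetch("main.c"))  # builds util.h first, then main.c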

---

The point about Java vs. Rust is interesting too -- Java doesn't have async/await, or coroutines.

I would have thought you give up some control over when things run with async/await, but maybe not... I'd like to see how they schedule the tasks.

Implementing Applicative Build Systems Monadically

https://ndmitchell.com/downloads/paper-implementing_applicat...


Pants v2 supports dependency inference in a manner similar to what you’re hypothesising. It indeed can benefit from the general caching mechanism. https://blog.pantsbuild.org/why-dependency-inference/

It has been remarkably convenient for our Python monorepo.


Ah very interesting, I had heard of Pants, but I didn't know about Pants 2. Having less BUILD metadata definitely seems like a big win, and I imagine with some effort you can keep the parallelism/caching/distribution too (I'd be interested in details on that).

Since Pants 2 has a core in Rust, I wonder if you considered Starlark vs. Python? I can see advantages to Python, but it seems like there are good open source implementations of Starlark now.

With Pants 2 and Buck 2, now we're on the 3rd or 4th generation of the Bazel-like systems :)


We schedule tasks mostly using tokio - without any particular care. It's mostly good, but occasionally causes some performance headaches. I think in the fullness of time we might need to invest more in careful scheduling.


> Buck2 is an extensible and performant build system written in Rust

I really appreciate tooling that is written in Rust or Go and produces single binaries with minimal runtime dependencies.

Getting tooling written in, for example, Python to run reliably can be an exercise in frustration due to runtime environment dependencies.


Your problem is that Python sucks, especially its dependency management. It sucks not because it ought to suck, but because of the incompetence of PyPA (the people responsible for packaging).

There are multiple problems with Python packaging which ought not exist, but are there and make lives of Python users worse:

* Python doesn't have a package manager. pip can install packages, but installing packages iteratively will break dependencies of packages installed in previous iterations. So, if you call pip install twice or more, you are likely to end up with a broken system.

* Python cannot deal with different programs wanting different versions of the same dependency.

* Python versions iterate very fast. It's even worse for most Python packages. To stand still you need to update all the time, because everything goes stale very fast. In addition, this creates too many versions of packages for dependency solvers to process, leading to insanely long installation times, which, in turn, prompts package maintainers to specify very precise version requirements (to reduce the time one has to wait for the solver to figure out what to install), but this, in turn, creates a situation where there are lots of allegedly incompatible packages.

* Python package maintainers have too many elements in support matrix. This leads to quick abandonment of old versions, fragmented support across platforms and versions.

* Python packages are low quality. Many Python programmers don't understand what needs to go into a package, they either put too little or too much or just the wrong stuff altogether.

All of the above could've been solved by better moderation of community-generated packages, stricter rules on the package submission process, longer version release cycles, formalizing package requirements across different platforms, creating tools such as a package manager to aid in this process... PyPA simply doesn't care. That's why it sucks.


Most of this is false. You are ignoring the best practices of using python virtual environments for managing a project's binary and package versions.


You are seriously going to preach about virtual environments to someone who maintains a couple dozen Python packages, and works and has worked in the infra departments of the largest software companies on Earth? :)

Come back in ten years. We'll talk.


"Appeal to authority" doesn't prove your point buddy, especially if that "authority" is yourself.


This is not an "appeal to authority". It means to say that I was using virtual environments before you started programming, and am acutely aware of their existence: the solution you offer is so laughable it doesn't deserve a serious discussion; too many things about your "solution" are "naive" at best, but mostly your "solution" is just irrelevant / a misunderstanding of the problem.


I'm not the original poster you replied to, just a passer by.


I'm still waiting on an actual argument here other than condescending name calling, appeals to authority (yes that is what you are doing), and casual hand-waving away any serious discussion because it doesn't "deserve" it.

Let's go through the points which I was referring to:

"Python doesn't have a package manager. pip can install packages, but installing packages iteratively will break dependencies of packages installed in previous iterations. So, if you call pip install twice or more, you are likely to end up with a broken system."

"Likely" seems like a stretch here since it's pretty damned rare that I've come across this when using virtual environments. With a virtual environment, you have an isolated system. Why are you installing packages iteratively in the first place? Use a requirements.txt with the packages you need, then freeze it. If you end up with a conflict, delete the virtual environment and recreate a fresh one, problem solved.

"Python cannot deal with different programs wanting different versions of the same dependency"

It does when you're running your applications using virtual environments. Again, you say that it's irrelevant, but this is literally what this shit solves. I come from a world where multiple applications are run in separate docker containers so this doesn't really apply anyway, but if you had to run multiple applications on the same server you can set the PYTHONPATH env variable and use the venv's binary when running each application.

"Python version iterates very fast. It's even worse for most of the Python packages. To stand still you need to update all the time, because everything goes stale very fast. In addition, this creates too many versions of packages for dependency solvers to process leading to insanely long installation times, which, in turn, prompts the package maintainers to specify very precise version requirements (to reduce the time one has to wait for the solver to figure out what to install), but this, in turn, creates a situation where there are lots of allegedly incompatible packages."

Maybe I'm misunderstanding what you are saying here, but this seems like a retread of your first point with some casual opinions thrown in. If you delete the venv and re-install all the packages at once, shouldn't it resolve dependency issues? "Insanely long installation times"? Seems to be a lot quicker than maven or gradle in my experience, and much easier to use. I get a lot of dependency issues with those managers as well, so this doesn't seem to be a unique problem for python, if it really is a problem when using virtual environments.

"Python package maintainers have too many elements in support matrix. This leads to quick abandonment of old versions, fragmented support across platforms and versions."

I admit I don't know anything about this. Maybe it's true, but I imagine this is true for community packages of just about any language.

"Python packages are low quality. Many Python programmers don't understand what needs to go into a package, they either put too little or too much or just the wrong stuff altogether."

This is not only a purely subjective opinion, it's not even one that seems to be common. Maybe it's true for less popular packages (and again, I'm not convinced it wouldn't be the same for less popular packages in other languages), but the ones most people use for common tasks I often see heralded as fantastic examples of programming that I should be reviewing to level up my own code.

"All of the above could've been solved by better moderation of community-generated packages, stricter rules on package submission process, longer version release cycles, formalizing package requirements across different platforms, creating tools s.a. package manager to aid in this process..."

I'm not familiar enough with the politics, culture, and process of maintaining Python's packages or package management system to speak to any of this. It seems like this would generally be good advice regardless of the state it's currently in. But these are broad, systemic solutions that require a revamp of the culture and bureaucracy of the entire package management system, a completely different set of tools than the ones that already exist (that would likely create backwards incompatibility issues), and no meaningful way to measure the success of these initiatives because most of your complaints are subjective opinions. Furthermore, at least half of your complaints seem to already be mitigated using virtual environments and industry best-practices, so I'm struggling to see where any of this is helpful.


s/Python/NodeJS/ and everything in this statement is multiplied by 10x


Some of it is also true for Node (e.g. poor package quality), but I think it would be hard to argue that the actual package management of Node is anywhere near as bad as Python.

Node basically works fine. You get a huge node_modules folder, sure. But it works.

Python is a complete mess.


> You get a huge node_modules folder, sure. But it works.

pnpm and other tools deduplicate that


I've tended to have the exact opposite experience - Node projects have 10x (or more) the dependencies of Python ones, and the tooling is far worse and harder to isolate across projects.

A well engineered virtualenv solves most Python problems.


I don't have enough experience with npm, but one thing I know for sure is that it can support multiple different versions of the same package. Not in the way I'd like it to do that (i.e. it also allows this in the same application), but at least in this sense it's not like Python.


Yes, just what I thought when I installed the Shopify CLI (https://github.com/Shopify/cli) a few days ago because they force you to install Ruby and Node


Personally it seems like a huge waste of memory to me. It's the electron of the backend. It's absolutely done for convenience & simplicity, with good cause after the pain we have endured. But every single binary bringing the whole universe of libraries with it offends.

Why have an OS at all if every program is just going to package everything it needs?

It feels like we cheaped out. Rather than get good & figure out how to manage things well, rather than drive harder, we're punting the problem. It sucks & it's lo-fi & a huge waste of resources.


I don't think that matters so much. For building a system, you definitely need dynamic linking, but end user apps being as self contained as possible is good for developers, users, and system maintainers (who don't have to worry about breaking apps). As long as it doesn't get out of hand, a few dozen MBs even is a small price to pay IMO for the compatibility benefits.

As a long time Linux desktop user, I appreciate any efforts to improve compatibility between distros. Since Linux isn't actually an operating system, successfully running software built for Ubuntu on a Fedora box, for example, is entirely based on luck.


There's also the issue that if a library has a vulnerability, you are now reliant on every static binary updating with the fix & releasing a new version.

Whereas with the conventional dynamic library world one would just update openssl or whomever & keep going. Or if someone wanted to shim in an alternate but compatible library, one could. I personally never saw the binary compatibility issue as very big, and generally felt like there was a while where folks were getting good at packaging apps for each OS, making extra repos, that we've lost. So it seems predominantly to me like downsides, that we sell ourselves on, based off of outsized/overrepresented fear & negativity.


the optimization you describe here is not valuable enough to offset the value provided by statically linked applications

the computational model of a fleet of long-lived servers, which receive host/OS updates at one cadence, and serve applications that are deployed at a different cadence, is at this point a niche use case, basically anachronistic, and going away

applications are the things that matter, they provide the value, the OS and even shared libraries are really optimizations, details, that don't really make sense any more

the unit of maintenance is not a host, or a specific library, it's an application

vulnerabilities affect applications, if there is a vulnerability in some library that's used by a bunch of my applications then it's expected that i will need to re-deploy updated versions of those applications, this is not difficult, i am re-deploying updated versions of my applications all the time, because that is my deployment model


Indeed. I view Linux servers/vms as ELF execution appliances with a network stack. And more and more the network stack lives in the NIC and the app, not Linux.


100% yes


Free software has a use beyond industrial software containers. I don't think most folks developing on Linux laptops agree with your narrow conception of software.

Beyond app delivery there's dozens of different utils folks rely on in their day to day. The new statically compiled world requiring each of these to be well maintained & promptly updated feels like an obvious regression.


> Free software has a use beyond industrial software containers. I don't think most folks developing on Linux laptops agree with your narrow conception of software.

the overwhelming majority of software that would ever be built by a system like buck2 is written and deployed in an industrial context

the share of software consumers that would use this class of software on personal linux laptops is statistically zero

really, the overwhelming majority of installations of distros like fedora or debian or whatever are also in industrial contexts, the model of software lifecycles that their maintainers seem to assume is wildly outdated


Again, there is no alternative. Dynamic linking is an artifact of an antiquated 70s-era programming language. It simply does not and cannot work with modern language features like monomorphization.

Linux distros are thankfully moving towards embracing static linking, rather than putting their heads in the sand and pretending that dynamic linking isn't on its last legs.


Whoa, strong opinions.

Dynamic linking on *nix has nothing to do with 70s era programming languages.

Did you consider the possibility that the incompatibility between monomorphization (possibly the dumbest term in all of programming) and dynamic linking should perhaps say something about monomorphization, instead?


> Dynamic linking on *nix has nothing to do with 70s era programming languages.

Given that dynamic linking as a concept came out of the C world, it has everything to do with them.

> Did you consider the possibility that the incompatibility between monomorphization (possibly the dumbest term in all of programming) and dynamic linking should perhaps say something about monomorphization, instead?

Yes, I considered that possibility.


The design of dynamic linking on most *nix-ish systems today comes from SunOS in 1988, and doesn't have much to do with C at all other than requiring both the compiler and assembler to know about position-independent code.

What elements of dynamic linking do you see as being connected to "70s era programming languages"?

> Yes, I considered that possibility.

Then I would urge you to reconsider.


dynamic linking is an optimization that is no longer necessary

there is no practical downside to a program including all of its dependencies, when evaluated against the alternative of those dependencies being determined at runtime and based on arbitrary state of the host system

monomorphization is good, not bad

the contents of /usr/lib/whatever should not impact the success or failure of executing a given program


Dynamic linking wasn't an optimization (or at least, it certainly wasn't just an optimization). It allows for things like smaller executable sizes, more shared code in memory, and synchronized security updates. You can, if you want, try the approach of "if you have 384GB of RAM, you don't need to care about these things", and in that sense you're on quicksand with the "just an optimization". Yes, the benefits of sharing library code in memory are reduced by increasing system RAM, but we're seeing from a growing chorus of both developers and users, the "oh, forget all that stupid stuff, we've got bigger faster computers now" isn't going so well.

There's also the problem that dynamic loading relies on almost all the same mechanisms as dynamic linking, so you can't get rid of those mechanisms just because your main build process used static linking.


it allows for all of the things you list, yes, but those things just aren't really valuable compared to the reliable execution of a specific binary, regardless of any specific shared library that may be installed on a host

smaller executable sizes, shared code in memory, synchronized security updates, are all basically value-zero, in any modern infrastructure

there is no "growing chorus" of developers or users saying otherwise, it is in fact precisely the opposite, statically linked binaries are going extremely well, they are very clearly the future


> it allows for all of the things you list, yes, but those things just aren't really valuable compared to the reliable execution of a specific binary

> smaller executable sizes, shared code in memory, synchronized security updates, are all basically value-zero, in any modern infrastructure

This highlights the fact that you're extremely focused on one particular model of development, one where a single person or group deploys software that they are responsible for running and maintaining - often software that they've written themselves.

This is, obviously, an extremely appropriate paradigm for the enterprise. Static linking makes a lot of sense here. Python's virtual environments are basically the approved workaround for the fact that Python was built for systems that are not statically linked, and I cherish it for exactly that reason. Use Go on your servers - I do myself! But that doesn't mean it's appropriate everywhere.

Sometimes developers in this mindset forget there's a whole other world out there, a world of personal computers, that each have hundreds or thousands of applications installed. Applications on these systems are not deployed, they are installed. The mechanism by which this happens (on Linux) is via distributions and maintainers, and dynamic linking needs to be understood as designed for that ecosystem. Linux operating systems are built around making things simple, reliable, and secure for collections of software that are built and distributed by maintainers.

I'm firmly on the side of the fence that says that dynamic linking is the correct way to do that. All the benefits you mention are just a free bonus, of course, but I care about them as well. Smaller executable sizes? Huge win on my 256 GB SSD. Synchronized security updates? Of course I care about those as an end user!


I hugely agree that the parent is definitely favoring one and only one kind of software model.

You raise the world of personal computers. And I think dynamic linking is absolutely a choice that has huge advantages for these folks.

There's other realms too. Embedded software needs smaller systems, so the dynamic library savings can be huge there. Hyper-scaler systems, where thousands of workloads can be running concurrently, can potentially scale to much, much higher usage with dynamic linking.

It's a little far afield, but with systems like WebAssembly we're really looking less at a couple orgs within a company each shipping a monolith or two, and potentially way more at having lots of very small functions with a couple helper libraries interacting. This isn't exactly a classic dynamic library, but especially with the very safe sandboxing built in, the ideal model is far closer to something like dynamic linking, where each library can be shared, than to static linking.


> Sometimes developers in this mindset forget there's a whole other world out there, a world of personal computers, that each have hundreds or thousands of applications installed.

it's not that i forget about these use cases, it's that i don't really consider them relevant

tooling that supports industrial use cases like mine is not really able to support end-user use cases like yours at the same time

linux operating systems may have at one point been built around making things as you describe by distribution maintainers, but that model is anachronistic and no longer useful to the overwhelming majority of its user base, the huge majority of software is neither built nor distributed by maintainers, it is built and distributed by private enterprises


> > Sometimes developers in this mindset forget there's a whole other world out there, a world of personal computers, that each have hundreds or thousands of applications installed.

> it's not that i forget about these use cases, it's that i don't really consider them relevant

Yes, exactly! It's an extremely myopic vision. You've spent this long thread arguing against dynamic linking on the basis of what is only a small fraction of total human / computer interactions! By "not relevant" you mean not relevant to the enterprise. I grant that of course - but these uses cases are (by definition) relevant to hundreds of millions of PC users.

> the huge majority of software is neither built nor distributed by maintainers, it is built and distributed by private enterprises

The overwhelming majority of the software I run is built and distributed by maintainers. Literally, there are only a few exceptions, like static-built games that rarely or never change and are (unfortunately) closed source. I daresay that's true for the majority of Linux users - the vast majority of the software we install and use is not "built and distributed by private enterprises".

This reality is what Linux-on-the-desktop is built for. There are millions of people who are going to want to continue using computers this way, and people like me will continue contributing to and developing distributions for this use case, even if shipping static or closed-source binaries to Linux users becomes common.


linux-on-the-desktop is also like statistically zero of linux installations (modulo mobile) but if that's counter to a belief of yours then we're definitely not going to make progress here so (shrug)

like i'm not sure you understand the scale of enterprise linux. a single organization of not-that-very-many people can easily create and destroy hundreds of millions of deployed systems every day, each with a novel configuration of installed software. i've seen it countless times.


I think we're arguing on multiple fronts here and that is confusing things.

1. My point about Linux on the desktop is that there are in practice users like me who are already getting the (many) advantages of dynamic linking, and don't want to give up those advantages. To the point that some of us are going to support and work on distributions that continue the traditional Linux way in this area. In your view, the ecosystem has moved to software being built and distributed by private corporations. I don't think this has happened - on Windows software was always built and distributed this way; on (desktop) Linux it never was and largely still isn't!

2. My point about the desktop in general is that this use case matters to the vast majority of computer-using human beings much more than enterprise. The number of deployed containers that get created and destroyed every day doesn't change that fact, nor does the fact that Linux users are merely a tiny fraction of this desktop use case. This is what creates the myopia I was talking about - you're thinking about metrics like "number of systems deployed" whereas I'm thinking of number of human-computer interactions that are impacted. I don't think you can just discard what matters on the desktop or paint it as irrelevant. Desktop computing shouldn't be subordinate to the technical requirements of servers!

So to summarize the argument: (a) desktop use cases still matter because they comprise the majority of human-computer interactions, (b) dynamic linking and the maintainer model are the superior approach for desktop computing, and in fact complement each other in important ways, and (c) even if most desktop users can't take advantage of this model because of the dominance of closed source software and the corporate development model, desktop Linux can and does, and will hopefully continue to do so into the future.


> Desktop computing shouldn't be subordinate to the technical requirements of servers!

i guess this is the crux of the discussion

linux desktop computing for sure _is_ subordinate to linux server computing, by any reasonable usage metric

i'm not trying to deny your experience in any way, nor suggest that dynamic linking goes away, or anything like that -- your use case is real, linux on the desktop is real, that use case isn't going away

but it is pretty clear at this point that linux on the server is wildly successful, linux on mobile is successful (for android), and that linux on the desktop is at best a niche use case

the majority of human interactions with linux occur via applications, services, tools, etc. that are served by linux servers, and not by software running on local machines like desktops or laptops

linux is a server operating system first and foremost


whether we want unobservable, ungovernable, far-off machines running the future forever, or whether we want a future where actual people can compute & see what happens seems to matter. the numbers may perhaps stack up to subordinate PC needs to industrial computing needs now, but is that the future anyone should actually want? should the invisible hand of capital be the primary thing humanity should try to align to?

and where is the growth potential? is the industrial need going to become greatly newly empowered & helpful to this planet, to us? will it deliver & share the value potential out there? PC may be a smaller factor today, but i for one am incredibly fantastically excited to imagine a potential future 10 years from now where people start to PC again, albeit in a different way.

individual PCs have no chance. it's why the cloud has won. on-demand access wherever you are, consistent experience across devices is incredibly incredibly convenient. but networks of PCs that work well together is exciting, and we've only so very recently started emerging the capability to have nice easy to manage ops/automated multi-machine personal-computing. we've only recently emerged to maturity where a better, competitive personal computing is really conceivable.

it's been the alpha linux geeks learning how to compute and industrial players learning how to compute, and the invisible hand has been fat happy & plump from it, but imo there's such a huge potential here to re-open computing to persons, to create compelling interesting differently-capable sovereign/owned computing systems, that are free from so many of the small tatters & deprivations & enshittifications that cloud - that doing everything on other people's computers as L-Users - unerringly drops on us. we should & could be a more powerful, more technically-cultural culture, and i think we've severely underrated how much subtle progress there's been to make that a much less awful, specialized, painful, time-consuming, low-availability, disconnected effort than it used to be.


>Applications on these systems are not deployed

In a way they are. You deploy to the store, and then people's computers download the update automatically.

A counter example to your claims about Linux is Android. Libraries are not shared between apps (beyond the android framework and libc). This is despite the fact that phones have limited storage.


the chorus is about the assumptions commonly found among younger devs that these old "efficiency" and "optimization" techniques don't matter any more. c.f. apps (desktop, mobile) that take forever to do things that should not take forever.

"modern infrastructure" seems like a bit of a giveaway of your mind set. yes, i know that there's a lot of stuff that now happens by having your web browser reach out to "infrastructure" and then the result is displayed in front of you.

But lots of people still use their computers to run applications outside the browser, where "modern infrastructure" means either nothing at all, or it means "their computer (or mobile platform)". the techniques mentioned in this subthread are all still very relevant in this context.


there is basically no situation in which it is important to optimize for binary size, embedded sure, but nowhere else

the infrastructural model i'm describing doesn't require applications to run in browsers, or imply that applications are slower, actually quite to the contrary, statically linked binaries tend to be faster

the model where an OS is one to many with applications works fine for personal machines, it's no longer relevant for most servers (shrug)


> there is basically no situation in which it is important to optimize for binary size, embedded sure, but nowhere else

Not disagreeing that there many upsides to statically linking, but there are (other) situations where binary size matters. Rolling updates (or scaling horizontally) where the time is dominated by the time it takes to copy the new binaries, e.g.

> the model where an OS is one to many with applications works fine for personal machines, it's no longer relevant for most servers

Stacking services with different usage characteristics to increase utilization of underlying hardware is still relevant. I wouldn't be surprised if enabling very commonly included libraries to be loaded dynamically could save significant memory across a fleet .. and while the standard way this is done is fragile, it's not hard to imagine something that could be as reliable as static linking, esp in cases where you're using something like buck to build the world on every release anyway


it was never relevant for servers. and there are probably still fewer servers than end-user systems out there, certainly true if you include mobile (there are arguments for and against that).


servers vastly, almost totally, outnumber end-user systems, in terms of deployed software

end-user systems account for at best single-digit percentages of all systems relevant to this discussion

(mobile is not relevant to this discussion)


binary size is also memory size. memory size matters. applications sharing the same libraries can be a huge win for how much stuff you can fit on a server, and that can be a colossal time/money/energy saver.

yes: if you're a company that tends to only run 1-20 applications, no, the memory savings probably won't matter to you. that matches quite a large number of use cases. but a lot of companies run way more workloads than anyone would guess. quite a few just have no cost-control and/or just don't know, but there's probably some pretty sizable potential wins. it's even more important for hyper-scalers, where they're running many many customer processes at a time. even companies like facebook though, i forget the statistic, but sometime in the last quarter there was a quote saying like >30% of their energy usage was just powering ram. willing to bet, they definitely optimize for binary size. they definitely look at it.

there's significant work being put towards drastically reducing scale of disk/memory usage across multiple containers, for example. composefs is one brilliant very exciting example that could help us radically scale up how much compute we can host. https://news.ycombinator.com/item?id=34524651

i also haven't seen the very important very critical other type of memory mentioned, cache. maybe we can just keep paying to add DRAM forever and ever (especially with CXL coming across the horizon), but the SRAM in your core-complex will almost always tend to be limited (although word is Zen4 might get within striking distance of 1GB which is EPIC). static builds are never going to share cache effectively. the instruction cache will always be unique per process. the most valuable expensive fancy memory on the computer is totally trashed & wasted by static binaries.

there's really nothing to recommend about static binaries, other than them being extremely stupid. them requiring not a single iota of thought to use is the primary win. (things like monomorphic optimization can be done in dynamic libraries with various metaprogramming & optimizing runtimes, hopefully ones that don't need to keep respawning duplicate copies ad nauseam.)

i do think you're correct about the dominant market segment of computing, & you're speaking truthfully to a huge % of small & mid-sized businesses, where the computing needs are just incredibly simple & the ratio of processes to computers is quite low. their potential savings are not that high, since there's just not that much duplicate code to keep dynamically linking. but i also think that almost all interesting upcoming models of computing emphasize creating a lot more smaller lighter processes, that there are huge security & manageability benefits, and that there's not a snowball's chance in hell that static-binary style computing has any role to play in the better possible futures we're opening up.


you're very sensitive to the costs of static linking but i don't think you see the benefit

the benefit is that a statically linked binary will behave the same on all systems and doesn't need any specific runtime support above or beyond the bare minimum

this is important if you want a coherent deployment model at scale -- it cannot be the case that the same artifact X works fine on one subset of hosts, but not on another subset of hosts, because their openssl libraries are different or whatever

static linking is not stupid, it doesn't mean that hosts can only have like 10 processes on them, it doesn't imply that the computing needs it serves are simple, quite the opposite

future models of computing are shrinking stuff like the OS to zero, the thing that matters is the application, security (in the DLL sense you mean here) is not a property of a host, it's a property of an application, it seems pretty clear to me that static linking is where we're headed, see e.g. containers


Thanks for the information about SunOS. My point still stands: the C ecosystem makes it possible in a way that other language models simply don't.

> Then I would urge you to reconsider.

Done. No change to my beliefs.


Absolutely. As soon as it started to seem like even a couple hundred JARs wouldn't put a significant strain on the filesystem having to house them, the typical deployment switched to Docker images and, on top of the hundreds of JARs, started to bundle in the whole OS userspace. Which also, conveniently, makes memory use explode because shared libraries are no longer shared.

This would definitely sound like a conspiracy theory, but I'm quite sure that hardware vendors see this technological development as, at least, a fortunate turn of events...


when someone writes a program and offers it for other people to execute, it should generally be expected to work

the size of a program binary is a distant secondary concern to this main goal

static compilation more or less solves this primary requirement, at the cost of an increase to binary size that is statistically zero in the context of any modern computer, outside of maybe embedded (read: niche) use cases

there is no meaningful difference between a 1MB binary or a 10MB binary or a 100MB binary, disks are big and memory is cheap

the optimization of dynamic linking was based on costs of computation, and a security model of system administration, which are no longer valid

there's no reason to be offended by this, just update your models of reality and move on


Wait until you use a single board computer with a 4GB emmc OS disk. And don't forget about bandwidth...


i have a few devices like that around, but the thing is that the software i put on them is basically unrelated to the software that's being discussed here

definitely i am not using buck or bazel or whatever to build binaries that go on those little sticks


sure, but people are suggesting statically linking everything, and many modern languages don't really support dynamic linking.


I never had a problem before. The people saying we need this for convenience felt detached & wrong from the start.

It's popular to be cynical & conservative, to disbelieve. That has won the day. It doesn't do anything to convince me it was a good choice or actually helpful, that we were right to just give up.


"wrong" or "a good choice" or "actually helpful" are not objective measures, they are judged by a specific observer, what's wrong for you can be right for someone else

i won't try to refute your personal experience, but i'll observe it's relevant in this discussion only to the extent that your individual context is representative of consumers of this kind of software in general

that static linking provides a more reliable end-user experience vs. dynamic linking is hopefully not controversial, the point about security updates is true and important but very infrequent compared to new installations


Sometimes things just don’t have good solutions in one space. We solved it in another space, as SSD and RAM manufacturers made memory exponentially cheaper and more available over the last few decades.

So we make the trade off of software complexity for hardware complexity. Such is how life goes sometimes.


> with minimal runtime dependencies

You’re probably thinking of static binary. I believe that OP is comparing a single binary vs installing the whole toolchain of Python/Ruby/Node and fetching the dependencies over the wire.


If it's not a statically linked binary, then the problem is just as bad as it is with Python dependencies: instead, now you need to find the shared libraries that it linked with.


We've had decades to figure this out, and none of the "solutions" work. Meanwhile, the CRT for Visual Studio is 15MB. If every app I installed grew by 15MB I don't think I would notice.


Imagine if every Qt program included all of the Qt shared libraries.


On Windows they do.


Dynamic linking is an artifact of C, not some sort of universal programming truth.


Dynamic linking originated with Multics (https://en.wikipedia.org/wiki/Multics) and MTS (https://en.wikipedia.org/wiki/MTS_system_architecture), years before C even existed. Unix didn't get dynamic linking until the 1980s (https://www.cs.cornell.edu/courses/cs414/2001FA/sharedlib.pd...).

The impetus for dynamic linking on Multics and MTS was the ability to upgrade libraries without having to recompile software, and to reuse code not originally designed or intended for, let alone compiled with, the primary program (e.g. code from different compilers or languages). Both of these reasons still pertain, notwithstanding that some alternatives are more viable (e.g. open source code means less reliance on binary distribution).


neither of those reasons still pertain, really

"the primary program" is the atomic unit of change, it is expected that each program behaves in a way that is independent of whatever other files may exist on a host system


I feel so lucky that I found waf[1] a few years ago. It just... solves everything. Build systems are notoriously difficult to get right, but waf is about as close to perfect as you can get. Even when it doesn't do something you need, or it does things in a way that doesn't work for you, the amount of work needed to extend/modify/optimize it to your project's needs is tiny (minus the learning curve ofc, but the core is <10k lines of Python with zero dependencies), and doesn't require you to maintain a fork or anything like that.

The fact that the Buck team felt they had to do a from scratch rewrite to build the features they needed just goes to show how hard it is to design something robust in this area.

If there are any people in the Buck team here, I would be curious to hear if you all happened to evaluate waf before choosing to build Buck? I know FB's scale makes their needs unique, but at least at a surface level, it doesn't seem like Buck offers anything that couldn't have been implemented easily in waf. Adding Starlark, optimizing performance, implementing remote task execution, adding fancy console output, implementing hermetic builds, supporting any language, etc...

[1]: https://waf.io/
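For flavor, a complete waf build script for a single C++ program is just a few lines of Python (a minimal sketch; see the waf book for the real thing):

    # wscript -- minimal waf build script
    def options(opt):
        opt.load("compiler_cxx")

    def configure(conf):
        conf.load("compiler_cxx")

    def build(bld):
        bld.program(source="main.cpp", target="app")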


> If there are any people in the Buck team here, I would be curious to hear if you all happened to evaluate waf before choosing to build Buck?

There’s no way Waf can handle code bases as large as the ones inside Facebook (Buck) or Google (Bazel). Waf also has some problems with cross-compilation, IIRC. Waf would simply choke.

If you think about the problems you run into with extremely large code bases, then the design decisions behind Buck/Bazel/etc. start to make a lot of sense. Things like how targets are labeled as //package:target, rather than paths like package/target. Package build files are only loaded as needed, so your build files can be extremely broken in one part of the tree, and you can still build anything that doesn’t depend on the broken parts. In large code bases, it is simply not feasible to expect all of your build scripts to work all of the time.

The Python -> Starlark change was made because the build scripts need to be completely hermetic and deterministic. Starlark is reusable outside Bazel/Buck precisely because other projects want that same hermeticity and determinism.

Waf is nice but I really want to emphasize just how damn large the codebases are that Bazel and Buck handle. They are large enough that you cannot load the entire build graph into memory on a single machine—neither Facebook nor Google have the will to load that much RAM into a single server just to run builds or build queries. Some of these design decisions are basically there so that you can load subsets of the build graph and cache parts of the build graph. You want to hit cache as much as possible.

I’ve used Waf and its predecessor SCons, and I’ve also used Buck and Bazel.


With Buck2, memory taken for the graph is a concern, but it fits into a single host's RAM.


Interesting. I know that for Buck 1, some workloads didn’t fit entirely in RAM.


I get that, but again, there's no reason Waf can't be used as a base for building that. I actually use Waf for cross compilation extensively, and have built some tools around it with Conan for my own projects. Waf can handle cross compilation just fine, but it's up to you to build what that looks like for your project (a common pattern I see is custom Context subclasses for each target)

Memory management, broken build scripts, etc. can all be handled with Waf as well. In the simplest case, you can just wrap a `recurse` call in a try catch block, or you can build something much more sophisticated around how your projects are structured.

Note, I'm not trying to argue that Google/Facebook "should have used X". There are a million reasons to pick X over Y, even if Y is the objectively better choice. Sometimes, molding X to be good enough is more efficient than spending months just researching options hoping you'll find Y.

I'm just curious to know if they did evaluate Waf, why did they decide against it.


I don’t see how using Waf as a base would help in any way. It seems like a massive mismatch for the problems that Facebook and Google are solving. You seem to be fond of Waf, maybe if you elaborated why you think that Waf would be a good base for compiling massive, multi-language code-bases, I could understand where you are coming from. Where I am coming from—it feels like Waf is kind of a better version of autotools, or something like that, and it’s just not in the same league. It’s like comparing a bicycle to a cargo ship. Like, “Why didn’t the people designing the cargo ship use the bicycle as a starting point?” I don’t want to abuse analogies here, but that’s what the question sounds like to me. This is based on my relatively limited experience using Waf (and SCons, which I know is different), and my experience using Bazel and Buck.

Having spent a lot of time with Buck and Bazel, there are just so many little things you run into where you go, “Oh, that explains why Buck or Bazel is designed that way.” These design decisions permeate Buck and Bazel (Pants, Please, etc.)

I just don’t see how Waf can be used as a base. I really do see this as a new “generation” of build systems, with Buck, Bazel, Please, and Pants, and everything else seems so much more primitive by comparison.


I’m coming from the perspective of someone who has been working with it for a while, and coincidentally very intensely hacking away at it recently.

The thing about waf is that it’s more designed like a framework than a typical build tool. If you look at the code, it’s split into a core library (that’s the <10k LOC I estimated), and additional tools that do things like add C++ or Java build support.

That’s one of the reasons I like Waf, since it becomes a powerful toolkit for creating a custom build system once you strip away the thin outer layer. There is no one-size-fits-all build system, so a tool that can be molded like waf is very powerful imo.

I guess it’s hard to get that point across without experiencing it. There are just so many good design decisions everywhere. For example, extensibility comes easily because task generator methods are “flat”, and ordering is implemented via constraints. This means you can easily slip your own functions between any built in generator method to manipulate their inputs or outputs. It’s like a sub-build system just for creating Task objects.

Also, I don’t want to give the impression that I think waf would have been a better choice for these companies. I’ve kind of been defending it a lot in this thread, but my original point/question was just to know if they evaluated waf/what they thought about it. After so many comments I feel like I might be coming off as hostile… which isn’t my intention.


I’m not trying to react to your comments as if they’re hostile, just hope to clear the air. I like defending Buck and Bazel a little bit, and at the same time, I really recognize that they are painful to adopt, don't solve everyone’s problems, etc.

Waf does seem like a “do things as you like” framework, and I think that notion is antithetical to the Buck and Bazel design ethos. Buck and Bazel’s design are, “This is the correct way to do things, other ways are prohibited.” You fit your project into the Buck/Bazel system (which could be a massive headache for some) and in return you get a massive decrease in build times, as well as some other benefits like good cross-compilation support.

One fundamental part of the Buck/Bazel design is that you can load any arbitrary subset of the build graph. Your repository has BUILD files in various directories. Those directories are packages, and you only load the subset of packages that you need for the targets you are actually evaluating during evaluation time. You can even load a child package without loading the parent—like, load //my/cool/package without loading //my/cool or //my.

The build graph construction also looks somewhat different. There is an additional layer. In build systems like Waf, you have some set of configuration options, and the build scripts generate a graph of actions to perform which create the build using that configuration. In Buck/Bazel, there is an additional layer—you create a platform-agnostic build graph first (targets, which are specified using rules like cc_library), and then there’s a second analysis phase, which converts rules like “this is a cc_library” into actual actions like “run GCC on this file”.
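To make that extra layer concrete, here is a rough sketch of what a rule definition looks like in the Buck2 style. It is loosely modeled on Buck2's Starlark API, and the exact names (declare_output, DefaultInfo, etc.) are assumptions that may differ from the real thing:

    # A declared target ("this is a my_cc_library") only becomes concrete
    # actions ("run gcc on this file") during the analysis phase.
    def _my_cc_library_impl(ctx):
        out = ctx.actions.declare_output(ctx.attrs.name + ".o")
        ctx.actions.run(
            cmd_args("gcc", "-c", ctx.attrs.srcs, "-o", out.as_output()),
            category = "compile",
        )
        return [DefaultInfo(default_output = out)]

    my_cc_library = rule(
        impl = _my_cc_library_impl,
        attrs = {"srcs": attrs.list(attrs.source())},
    )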

These extra layers are there, as far as I can tell, to support the goals of isolating different parts of your build system from each other. If they’re isolated, then you have better confidence that they will produce the same outputs every time, and you can make more of the build process parallelizable—not just the actual build actions, but the act of loading and analyzing build scripts.

I do think that there is room to appreciate both philosophies—the “let’s make a flexible platform” philosophy, and the “let’s make a strict, my-way-or-the-highway build system” philosophy.


> the core is <10k lines of Python with zero dependencies

Isn't that already a no-go, writing a performance-critical system in a slow programming language?


I am no Python fan, but I find it laughably hard to believe it could be what makes a build coordination system slow.


On clean builds the Python tax will be dwarfed by the thousands of calls to clang, yes. But that's not the scenario you need to optimize for. What's more important is that incremental builds are snappy, since that is what developers do 100 times per day.

I've seen some projects with 100MB+ ninja files where even ninja itself, which prides itself on being optimized C++, takes a second or two to parse them on each build invocation. Convert that to Python and you'd likely land in the 5-20 second range instead. Enough to alt-tab and get distracted by something else. Google's code base is likely even larger than this.

A background daemon that holds the graph in memory would probably handle it, and in the big scheme of things such a design is likely better anyway. But it needs a big upfront design and is a lot more complex than just reparsing a file each time.

Side note: for some, even the interpreter startup is annoying. Personally I find it negligible; especially after 3.11 you can almost claim it's snappy.


Codebases that big are straw men for most companies. Yes, they happen, but just as often they should be segmented into smaller pieces that don't require monolithic build setups.


The context for this thread was whether Facebook considered waf in particular, so it is very relevant.


Certainly fair. I had meandered on to "in general" way too quickly.


> They are large enough that you cannot load the entire build graph into memory on a single machine

You mean, multiple gigabytes for build metadata that just says things like "X depends on Y" and "to build Y, run command Z"?


Yes, the codebases at Google and FB contain billions of files. Article from 2016 about the scale, and of course it's only grown dramatically since then: https://m-cacm.acm.org/magazines/2016/7/204032-why-google-st...


Yes. By “multiple gigabytes” I am talking about >100 GB. Maybe >1 TB.


How is this even possible? I take it that this data is highly compressible, right?


It wouldn’t be compressed in ram though, would it?


There are in-memory succinct data structures (https://en.wikipedia.org/wiki/Succinct_data_structure), but I actually don't mean that specifically: I mean that, for example, there must be tons of strings with common prefixes, like file paths (which can be stored in a trie for faster access and to compress the data in RAM) or very similar strings (like compiler invocations that mostly share the same flags), and other highly redundant data that can usually be exploited to cut down on memory requirements.
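
As a toy illustration of the path-prefix point (plain Python; not from any real build system):

  class TrieNode:
      def __init__(self):
          self.children = {}     # path component -> TrieNode
          self.terminal = False  # True if a full path ends here

  def insert(root, path):
      # Shared prefixes like "src/main/java/..." are stored only once,
      # no matter how many paths pass through them.
      node = root
      for part in path.split("/"):
          node = node.children.setdefault(part, TrieNode())
      node.terminal = True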

I highly doubt that, after doing all those tricks, you still end up with 100GB - 1TB of build data.


You could do those tricks and cut down memory, perhaps even 10x, but they come at the cost of increased CPU time. Designing the system in such a way that you only ever need to load a tiny subset of the graph at one time gives you a 1000x saving for memory and CPU.


Some of those tricks may actually decrease CPU time (by fetching less data from RAM and using the CPU cache more effectively). And you can also apply any optimizations for partial loading on top of that.

I guess the downside is that the system would be more complex overall, but you can probably get 80% of the result with relatively small changes.


I don't know if they considered waf specifically, but the team is definitely very familiar with the state of the art: https://www.microsoft.com/en-us/research/uploads/prod/2018/0...

One of the key requirements is that Buck2 had to be an (almost) drop-in replacement for Buck1 since there's no way we could reasonably rewrite all the millions of existing build rules to accommodate anything else.

Also Buck needs to support aggressive caching, and doing that reliably puts lots of other constraints on the build system (eg deterministic build actions via strong hermeticity) which lots of build systems don't really support. It's not clear to me whether waf does, for example (though if you squint it does look a bit like Buck's rule definitions in Starlark).


I truly believe any build system that uses a general-purpose language by default is too powerful. It lets people do silly stuff too easily. Build systems (for projects with a lot of different contributors) should be easy to understand, with few, if any, project specific concepts to learn. There can always be an escape hatch to python (see GN, for example), but 99% of the code should just be boring lists of files to build.


You cannot magic away complexity. Large systems (think thousands of teams with hundreds of commits per minute) require a way to express complexity. When all is said and done, you'll have a Turing-complete build system anyway, so why not go with something readable?


No no no no. The more code you have, the more you have to constrain the builds.

I understand where the sentiment comes from, having seen one too many examples of people struggling to implement basic logic in CMake or Groovy that would be a one-liner in Python. But completely opening up the floodgates is not the right solution.

Escape hatches into GP languages can still exist, but the interfaces to them need to be strict, and it's better that people see this boundary clearly rather than limping along trying to do general-purpose programming inside CMake and failing on correctness anyway. Everything else should, as the parent says, just be a list of files.

Dependencies need to be declarative and operations hermetic.

Otherwise the spaghetti of complexity will just keep growing. Builds and tests will take forever because there is no way to detect which change affects which subsystem or what can be parallelized, and it's even worse when incremental builds stop working.

By constraining what can be done, you also empower developers to do whatever they want, within said boundaries, without having to go through an expert build team. Think about containers: they allowed every team to ship whatever they wanted without consulting the ops team.


> The more code you have, the more you have to constrain the builds.

That works if you have one team, or if all teams work the same way. If you have multiple teams with conflicting requirements[1], you absolutely should not constrain the build, because you'd be getting in the way.

1. E.g. Team A uses an internal C++ lib in an online service and prefers an evergreen version of it to be applied automatically with minimal human involvement. Team B uses the same lib on physical devices shipped to consumers/customers. Updates are infrequent (annual) but have to be tested thoroughly for qualification. Now your build system has to support evergreen dependencies and versioned ones. If you drop support for either, you'll be blocking one team or the other from doing work.
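
A hypothetical sketch of what supporting both could look like in a Bazel-style WORKSPACE (the names and URLs are invented):

  load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

  # Team A: evergreen -- always the latest published archive.
  http_archive(
      name = "corelib",
      urls = ["https://internal.example/corelib/latest.tar.gz"],
      # no pinned hash: each build picks up whatever is current
  )

  # Team B: pinned -- one qualified release, bumped roughly annually.
  http_archive(
      name = "corelib_qualified",
      urls = ["https://internal.example/corelib/v2023.1.tar.gz"],
      sha256 = "...",  # hash of the qualified release goes here
  )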


On the contrary, large systems have to restrict what their build system does because otherwise the complexity becomes unmanageable. I used to work on a large codebase (~500 committers, ~10MLOC) that had made the decision to use Gradle because they thought they needed it, but then had to add increasingly strict linters/code review/etc. to the gradle build definitions to keep the build maintainable. In the end they had a build that was de facto just as restricted as something like Maven, and the Turing completeness of Gradle did nothing but complicate the build and slow it down.

And sure, maybe having a restricted build definition (whether by using a restricted tool or by doing code review etc.) moves the complexity somewhere else, like into the actual code implementation. But it's easier to manage there. The build system is the wrong place for business logic, because it's not somewhere most programmers ever think to look for it.


I seriously doubt there's a single repo on the planet that averages hundreds of commits per minute. That's completely unmanageable for any number of reasons.


According to [1], in 2015 Google averaged 25 commits per minute (250000/7/24/60). I can imagine hundreds per minute during Pacific working hours today.

[1] https://cacm.acm.org/magazines/2016/7/204032-why-google-stor...


In the case of a monorepo, that's exactly why the build system shouldn't be overly complex. If you're expecting random people to make changes to your stuff, you shouldn't be burdening them with more complexity than necessary.

The monorepo case is also a little bit outside what I was originally talking about. I was mostly referring to individual services/libraries/apps.


It wouldn't surprise me at all if some large repos at Google or Facebook now get to that many, it's easy to do once you have robots committing code (usually configuration changes).


I didn't mean on average, but the build tool has to handle the worst case and I probably am understating the worst case.

I'd bet there are more than a few repos that do get (at least) hundreds of commits as a high-water mark. My guess is lots of engineers + monorepo + looming code-freeze deadline can do that like clockwork.

Edit: Robots too as sibling pointed out. A single human action may result in dozens of bot-generated commits


IMO there's almost never a good reason to have automated commits in repos outside of two cases:

1) Automated refactoring

2) Automated merges when CI passes

Configs that can be generated should just be generated by the build.

But that's a different topic


> IMO there's almost never a good reason to have automated commits

This depends entirely on the quality of dev tools available.

Also, commit =/= shipped code: you may have automated commits and keep a human in the loop before shipping, by way of a rejectable pull request (or the proprietary equivalent).

A simple library upgrade will result in a wave of bot-authored commits/PRs:

1. Human makes a change to a core library, changing it from v1 to v2

2. Bot identifies all call-sites and refactors to v2-equivalent, creating 50 PRs for 50 different teams.

One change, 51 commits.


There are at least two other hugely important use cases you missed:

- automatic security / vendoring updates (e.g. https://github.com/renovatebot/renovate)

- automated cross-repo syncs, e.g. Google has processes and tools that bidirectionally sync pieces of Google3 with GitHub repos


The problem with build systems is their users, for exactly the reason you say. For a man with a hammer, every problem is a nail. Developers don't think of build systems in the right way. If you're doing something complex in your build, it should surely be a build task in its own right.


I agree, that’s also pretty much why Starlark exists. However, there are many cases where you do need complex build logic.

Personally, I always go for declarative CMake first, then waf as soon as I find my CMakeLists looking like something other than just a list of files.

I’ve considered before creating a simple declarative language to build simple projects like that with waf, but I don’t like the idea of maintaining my own DSL for such little benefit, when CMake works just fine, and everyone knows how to use it. I feel like I’d end up with my own little TempleOS if I decided to go down that rabbit hole.


I'd like to agree, but every significant project has something weird that its build system doesn't have built in yet, so you need some way to extend it. Useful build systems end up supporting lots of hacks.

That said, the more your build system makes easy without having to write code, the better.


While this is true, this isn't a problem for your buck/bazel/pants-likes. Between genrules and custom rules you can do this with all the power you (usually) need.


They are the bane of any DevOps/Build Engineer when trying to fix build issues.


I think I would agree as well. So I’m not sure how that makes me feel about nix.


Nix is different because no one's smart enough to figure out how to do silly things ;)


Nix is Turing complete, but it's not a general purpose language. It is designed as a DSL for building software, and I think it's pretty nice for that.

The Nickel rationale doc has some thoughts on why this might be the right call: https://github.com/tweag/nickel/blob/master/RATIONALE.md#tur...

From my (limited) experience with another deliberately limited configuration DSL (CUE), I think more power in such DSLs will pan out better in the long run. Of course, it's not all one or the other: a powerful build DSL can still enforce useful discipline, and a Turing-complete language can still be thoughtfully designed around a special purpose. I think Nix demonstrates both pretty well, actually.


Nix forces you to serialize every build step (what it calls a "derivation"), and moreover it isolates the build environment to only include things built with Nix or verified by hash. So while there is a lot of power, the only thing you can do with that power is produce derivations which themselves actually run the build.

Contrast this with Gradle, which is currently digging itself out of a hole by forcing authors to declare all inputs and outputs of their tasks so it can serialize them, but you can literally do anything Java can throughout the entire process. This is the kind of Herculean task which is neatly sidestepped by tightly controlling the DSL environment (inputs/outputs) as does Nix.


And the best part about waf? The explicit design intent that you include the build system with the source code. This gets rid of all the problems with build systems becoming backwards/forwards incompatible, and of the issues when a developer works on one project using build system v3.9 and another that uses build system v4.6.

With waf, the build system is trivially included in the source, and so your project always uses the right version of waf for itself.


I could be wrong, as I haven't dug into the waf docs too much, but I think the major difference between waf and Buck is the ability to handle dependency management between various projects in a large org.

The documentation and examples for waf seem to be around building one project, in one language, with an output of statistics and test results. I am sure this is a simplification for education and documentation purposes, but it does leave a vague area around "what if I have more than 1 or 2 build targets + 5 libs + 2 apps + 3 interdependent helper libraries?"

Buck seems to be different in that it does everything waf does but also has clear `dep` files to map dependencies between various libraries within a large repository with many, many different languages and build environments.

The key thing here being, I suspect that within Meta's giant repositories of various projects, they have a tight inter-linking between all these libraries and wanted build tooling that could not only build everything, but be able to map the dependency trees between everything as well.

Pair that with a bunch of consolidated release mapping between the disparate projects and their various links and you have a reason why someone would likely choose Buck over waf purely from a requirements side.

As for another reason they likely chose Buck over waf: waf appears to be a capable but lesser-known project in the wider dev community. I say this because when I look into waf, I mostly see it compared against CMake; its mindshare resides mostly among C++ devs. Either because of NIHS (not-invented-here syndrome) or fear that the project wouldn't be maintained over time, Meta may have decided to just roll their own tooling. They seem to be really big on the whole "being the SDK of the internet" as of late. I could see them not wanting to depend on an independent BSD-licensed library they don't have complete control over.

These are just my thoughts, I could be completely wrong about everything I've said, but they're my best insights into why they likely didn't consider waf for this.


It’s true that Waf doesn’t come with dependency management out of the box (EDIT: unless you count pkg-config), so maybe that’s why (besides NIHS). The way I handle it is with another excellent project called Conan (https://conan.io/)

However, if you’re going to build a custom package management system anyways, there’s no reason you couldn’t build it on top of waf. Again, the core is tiny enough that one engineer could realistically hold the entire thing in their head.

But I don’t think we’re going to get it right speculating here lol. I’m sure there was more to it than NIHS, or being unaware of waf.


A number of things like being written in python start to matter at big scale. I love python, but cli startup time in python is actually a concern for apps used many times daily by many engineers.

Fixing that or moving to a daemon or whatever starts to take more time than just redoing it from scratch, and if the whole thing is 10k lines of python, it's something a domain expert can mostly reimplement in a week to better serve the fb specific needs.


I've been using Waf for a couple of years, including on retro thinkpads from ~08. I've never run into issues with the startup time for waf and/or Python. Even if the interpreter were 100x slower to start and execute than it currently is, that time would be negligible next to the time spent waiting for a compiler or other build task to complete.

And if it is too slow, there's profiling support for tracking down bottlenecks, and many different ways to optimize them. This includes simply optimizing your own code, or changing waf internal behavior to optimize specific scenarios. There's even a tool called "fast_partial" which implements a lot more caching than usual project-wide to reduce time spent executing Python during partial rebuilds in projects with an obscene number of tasks.

> Fixing that or moving to a daemon or whatever starts to take more time than just redoing it from scratch, and if the whole thing is 10k lines of python, it's something a domain expert can mostly reimplement in a week to better serve the fb specific needs.

Well, considering Buck just went through a from-scratch rewrite, I would argue otherwise. Although, to be fair, that 10k count is just for the core waflib. There are extra modules to support compiling C/C++/Java/etc for real projects.

(also, waf does have a daemon tool, but it has external dependencies so it's not included by default)


> Well, considering Buck just went through a from-scratch rewrite, I would argue otherwise

Based on what, the idea that waf fits their needs better than the tool they wrote and somehow wouldn't need to be rewritten or abandoned?

> Even if the interpreter were 100x slower to start and execute than it currently is, that time would be negligible next to the time spent waiting for a compiler or other build task to complete.

This wrongly assumes that clean builds are the only use case. Keep in mind that in many cases when using buck or bazel, a successful build can complete without actually compiling anything, because all of the artifacts are cached externally.

> There's even a tool called "fast_partial" which implements a lot more caching than usual project-wide to reduce time spent executing Python during partial rebuilds in projects with an obscene number of tasks

Right, the point that this is a concern to some people, and that there's clearly some tradeoff here such that it isn't the default immediately rings alarm bells.


No offense, but I think you're reading too much into my casual comments here to guide your understanding of waf, rather than the actual waf docs. Waf isn't optimized for clean builds (quite the contrary), and neither you nor I know whether the waf defaults are insufficient for whatever Buck is being used for. I just pointed out the existence of that "fast_partial" thing to show how deep into waf internals a project-specific optimization effort could go.

But discussions about optimization are pointless without real world measurements and data.


The fact that it's implemented and not on by default is a red flag any way you slice it. Either it's implemented but unreliable, or it's reliable but the maintainers don't think it's worth turning on for some reason (why?).


Exactly. One of the key selling points of Bazel/Buck is their caching: a very high cache hit rate with no inconsistency, which allows very fast incremental builds. A zero-change build takes close to zero seconds.


Just imagine how much memory a large dependency graph would take in Python...

Especially considering how poor Python's support for shared memory concurrency is...


Waf bills itself as "the meta build system". But Buck2 is "the Meta build system". :)


waf looks pretty nice but does it have a remote cache? For me the biggest argument for Bazel is the remote caching, and not having it is a bit of a deal breaker IMO


It's probably more about better caching, but using Buck2 internally at Meta reduced my build times from minutes to seconds. A very welcome upgrade.


For what language?


Python mainly.


I'm missing some historical context here. This article goes out of its way to compare and contrast with Bazel. Even the usage conventions, build syntax (Starlark), and RBE API are the same as in Bazel.

Did FB fork Bazel in the early days but retain basically everything about it except the name? Why didn't they just...adopt Bazel, and contribute to it like any other open source project?


One thing you might be missing is that this is Buck2.

Buck (https://github.com/facebook/buck) has been open sourced for nearly 10 years now.

The lore I've heard is that former Googlers went to Facebook, built Buck based on Blaze, and Facebook open sourced that before Google open sourced Blaze (as Bazel).

The first pull to the Buck github repo was on May 8, 2013 (https://github.com/facebook/buck/pulls?q=is%3Apr+sort%3Acrea...). The first to Bazel was Sep 30, 2014 (https://github.com/bazelbuild/bazel/pulls?q=is%3Apr+sort%3Ac...).


I was visiting the Google Munich office on the day Google open sourced Blaze/Bazel. The Facebook Buck team sent a congratulations cake: https://photos.app.goo.gl/6KwE6qeD3i72kSo38

Amusingly, the cake was in German but most of the Bazel team didn't really speak German. But it was yummy.


Smells like what FB did with Caffe vs. Caffe2, the two of which have nothing to do with each other.


Blaze is very old (from 2006), the history is described here: https://mike-bland.com/2012/10/01/tools.html#blaze-forge-src...

In the years that followed folks left Google and joined other companies and created similar build systems because blaze had a lot of advantages at scale. Facebook made Buck, Twitter made Pants. Blaze was still closed source inside Google. They all used the same python looking language.

In 2012 Twitter open sourced Pants: https://blog.twitter.com/engineering/en_us/a/2016/the-releas...

In 2013 Facebook open sourced Buck: https://en.m.wikipedia.org/wiki/Buck_(software)

In 2015 Google finally open sourced most of blaze, but renamed it bazel for copyright reasons. Some might argue they waited too long because clearly there was a lot of demand for such a system. :)

After that Twitter (mostly?) migrated to bazel and Facebook sort of stalled out on Buck. But then recently they decided to rewrite it from scratch to fix a lot of the architecture problems resulting in Buck2.

Buck2 looks pretty impressive and hopefully it gets the bazel folks moving faster. For example the analysis phase in bazel is very slow even inside Google, and Buck2 shows an alternative design that's much faster.


The direction the Bazel team seems to be going in is shortening the wall clock time by allowing for concurrent analysis and execution: https://github.com/bazelbuild/bazel/issues/14057.


At the time that FB started writing Buck, Bazel was not open source. I believe it did exist as Blaze internally at Google before FB started writing Buck. Facebook open sourced Buck before Google open sourced Blaze as Bazel.

Over time Facebook has been working to align Buck with Bazel, e.g. the conversion to Starlark syntax so tools such as Buildozer work on both systems. I believe Buck2 also now uses the same remote execution APIs as Bazel, but don't quote me on that.


Blaze already existed when I was an intern in 2007.


Buck far predates Bazel, and was built by ex-googlers replicating Blaze.

Skylark was a later evolution, after the python scripts grew out of control, and a cue that fb took from Google long after Buck had been near-universally deployed for several years.


Remote Execution is just a gRPC protocol -- bazel, buck1 and others implement it.


Hrmm, it makes performance claims with regard to Buck1 but not to Bazel, the obvious alternative. Hardly anyone uses Buck1 so you'd think it would be relevant.


I have a non-toy multi-language project in https://github.com/dtolnay/cxx for which I have both Buck2 and Bazel build rules.

On my machine `buck2 clean && time buck2 build :cxx` takes 6.2 seconds.

`bazel clean && time bazel build :cxx` takes 19.9 seconds.


That's cool. I was not able to repro due to the buck2 instructions not working for me, in two different ways:

    Compiling gazebo v0.8.1 (/home/jwb/buck2/gazebo/gazebo)
  error[E0554]: `#![feature]` may not be used on the stable release channel
    --> gazebo/gazebo/src/lib.rs:10:49

Then with the alternate instructions:

  error: no such command: `+nightly-2023-01-24`
  Cargo does not handle `+toolchain` directives.
  Did you mean to invoke `cargo` through `rustup` instead?


It looks like you don't have rustup.rs. You will need to install that since Buck2 is depending on nightly Rust features.


FWIW buck2 doesn't seem to build with nightly-2023-01-24 at this time, nightly-2023-03-15 worked for me. (Nightlies from April cause an internal compiler error.)


Anyway regardless of the fact that my local Rust environment isn't cut out to repro your result, how much of that speedup is due to parallelism that Buck2 offers and Bazel does not? When I build your cxx repo with bazel and load the invocation trace, the build was fully serialized.


It's honestly hard to measure at the scale of Meta. Just making everything compatible with Bazel would be a non-trivial undertaking.

Also that seems an interesting thing an independent person could write about, but whatever claims Meta made on a topic like that would be heavily scrutinized. Benchmarking is notoriously hard to get right and always involves compromises. It's probably not worth making a claim vis a vis a "competitor" and triggering backlash. If it's significantly faster than Bazel that will get figured out eventually. If not the tool really is aimed at Buck1 users upgrading to Buck2 so that is the relevant comparison.


I wonder if it's just because they don't have the same scale of data, since FB as a company uses Buck1/Buck2 but not Bazel?

They've clearly learned from Bazel though! I like the idea of not needing Java to build my software, and Starlark is battle tested / might make transitioning off Bazel easier.


The author of Bazel came over to FB and wrote Buck from memory. In Google it’s called Blaze. Buck2 is a rewrite in rust and gets rid of the JVM dependence, so it builds projects faster but it’s slow to build buck2 itself (Rust compilation)


I believe this is an over simplification. Engineers who had used Blaze at Google reimplemented it at Facebook based on what they knew of how it worked.

Even Facebook's Buck launch blog does not offer this story of Buck's lineage, and although the author worked on the Closure compiler at Google, that is not Blaze.

https://engineering.fb.com/2013/05/14/android/buck-how-we-bu...


Does anyone know how IDE support for Buck2 is? I couldn't find anything except some Xcode config rules. Half the battle with Bazel/Buck/etc. is getting an IDE or LSP to work for C++/Java/Kotlin/Swift/etc., because those tools don't really work out of the box.


The vscode bazel plugin is basically completely abandoned. There's an issue that has been open for 3 years asking to add intellisense support for C++. Seems completely ludicrous to put in the massive effort it took to build Bazel and then fumble at the goal line by not supporting vscode.


I think the recommendation for c/c++ in Bazel is to use this: https://github.com/hedronvision/bazel-compile-commands-extra...

And use the compile_commands.json file to power clangd. I'm not a vscode person, but I would hope the vscode C++ plugin supports that.


probably use a starlark plugin?


How does that help with developing C in a Buck project?


How do the "transitive-sets (tsets)" mentioned here compare to Bazel depsets[1]? Is it the same thing with a different name, or different in some important way?

[1] https://bazel.build/rules/lib/depset


tsets are described in more detail here: https://buck2.build/docs/rule_authors/transitive_sets/. Bazel's depsets were one of the influences on their design. To users, they will seem fairly similar and would be used for solving similar problems, there's some differences in the details of the APIs.

I'm not well-versed on the internal implementation details of bazel's depsets, but one interesting thing about tsets that may further differentiate them is how they are integrated into the core, specifically that we try hard to never flatten them there. The main two places this comes up are: (1) when an action consumes a tset projection, the edges on the DICE graph (our incremental computation edges) only represent the direct tset roots that the action depends on, not the flattened full list of artifacts it represents, and (2) when we compute the input digest merkle tree for uploading an action's inputs to RE, that generally doesn't require flattening the tset, as we cache the merkle trees for each tset projection node and can efficiently merge them.
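
For readers who haven't seen them, here's a rough sketch of defining and consuming a tset, loosely following the transitive_sets docs linked above (the LinkInfo provider name is invented; details of the API may differ):

  # Define a transitive set type with a projection to command-line args.
  ArgsTSet = transitive_set(
      args_projections = {
          "link": lambda value: value,  # how one node contributes args
      },
  )

  def _impl(ctx):
      # The new node shares (rather than copies) its children's subgraphs.
      tset = ctx.actions.tset(
          ArgsTSet,
          value = ctx.attrs.name,
          children = [dep[LinkInfo].args for dep in ctx.attrs.deps],
      )
      # The projection can feed an action without flattening the set.
      link_args = tset.project_as_args("link")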


Do smaller companies (smaller than Meta and Google) use these kinds of build tools much? It seems like a system that rebuilds everything whenever a dependency changes is more suited to an environment that has very few, if any, external dependencies.

Is anyone using Buck/Bazel and also using frameworks like Spring, or React, for example?


I used Bazel for a long time, in a small team (5). No spring/react for me, but we have ~100 external dependencies across 5 languages. It was generally positive, certainly better than any system we used beforehand, and multi-language stuff was great, but the learning curve was steep and it had annoying defaults (which we soon wrapped with our own).

Regarding 3rd party packages: if you update them regularly and they are depended on by intensive builds, that's going to be the reality in any build system worth using. A build is a DAG, and if one node changes then its children must too, if you want any guarantee that the update isn't going to break fresh builds. As with any build system, if you want fewer rebuilds you need to be diligent about assigning dependencies to the things that need them (so unrelated things don't rebuild), and when things can be divided and conquered, they should be.

Bazel isn't perfect at reproducibility, and that's what made me abandon ship. It isn't perfectly sandboxed, for instance: it still uses your system compilers and still has access to your system libraries. This means you can accidentally depend on something without specifying it, which violates my requirement for true reproducibility.

These days we use Nix, which is ultimately a better environment for developing reproducible builds if you're happy to shun Windows. It is also a superior language to Starlark IMO: declarative makes much more sense than imperative in something that has to later resolve to a DAG. In terms of using it as a build system, it's lacking unless you put in the legwork. The default is to use other build systems within Nix, but that leads to a lot of unnecessary rebuilding during development, so we just spun up our own Bazel-like library for Nix to do the heavy lifting.


Bazel supports pluggable toolchains these days. We use `zig cc` via https://github.com/uber/bazel-zig-cc.


I worked at a company that was about 150 people when I joined. It's not primarily a software company but the early team had a bunch of ex-google folks, and they chose Bazel. I encountered it for the first time there. We did use React, yes.

I really liked the cross-language aspect of Bazel. Having one command that could compile everything and produce a deployable container in a highly reproducible way is great. It really cut down on "what's your compiler/tool version etc."-type back-and-forth during debugging with other engineers.

The bazel JS/TS rules were tough to work with when we first started using it for JS (2018 I think), especially since we were using create-react-app at the time, and that didn't mesh well with the way bazel wants to work. It's gotten a lot better though.

If I was making the choice from scratch in a new company/codebase, I think it'd really depend on the team. You kind of need broad-based buy-in to get the full benefits IMO.


I would heavily consider this type of system once build times become a major pain point. That often happens somewhere around 20-50 people working in one codebase, so I think this is a problem space for medium-sized companies. Truly small companies probably don't need this and should use the standard ecosystem tools, BUT if your team knows how to use it, there's little downside in starting from Buck/Bazel. Especially since you get most of the benefit if you have a nice clean DAG of your modules, and that's easy to build at the beginning and hard to refactor into later.


We are a ~10 people startup and have been using Bazel since day 1 (where I introduced Bazel and learned about it on the job).

Overall, I would say that it has been very much worth it, as it eliminates some classes of developer problems almost entirely that come up in companies of any size (e.g. "works on my machine"). I also feel when it comes to time spent setting it up, it's also a net positive over alternative systems where we would've had to spend time tuning e.g. GH Actions CI caches or make Docker build more reproducible.


Uber adopted Bazel a few years ago for their Go and Java monorepos, which was the majority of their code at the time. I don't know the state of their UI repos.


> In our internal tests at Meta, we observed that Buck2 completed builds 2x as fast as Buck1.

In my experience Buck was spending a huge amount of time in GC, so this doesn’t surprise me. It must have been (ab)using Java in such a way that massive amounts of stuff were sprayed across the heap.


Ah. The "you're holding it wrong" defense of GC :)


The dynamic dependency stuff looks very nice! It feels like a good entrypoint for systems that are "merely" wanting good build caching, and not being "so huge git falls apart" big.

My biggest gripe with Bazel is that when you're off the beaten path, it suddenly feels like the ecosystem really doesn't want you to just solve problems yourself. Meanwhile, this Buck2 documentation talks directly about adding good support for tools outside of community-provided things.

I'm still not a superfan of the awkward way that custom implementations get declared (which I think comes from needing to support super-giant projects? But it's just awkward), and all the naming suffers from Google-like "we cannot call them functions but must call them factories" NIH things... but at least there are clear docs.


I really hope the team responsible for it is called Timbuktu.


Congrats to the team! Very excited to finally get to use this.


The essential characteristics of Buck2 look very appealing - but it's hard to see this catching up with the substantial ecosystem of language support rules for Bazel.


If the language rules are all Starlark, shouldn't they be compatible?


Haven't looked at the Bazel codebase, but it would strongly surprise me if the language support rules were implemented in Starlark. More likely, I'd suppose them to be almost exclusively written in Java (and highly based on quite Bazel-specific classes), with Starlark only coming into play in the form of bindings for being used by the users for their BUILD/bzl definitions.

Edit/Add: cf the part about language rules in this comment from a person who says they're a former bazel developer: https://news.ycombinator.com/item?id=35477309


Only the C++ and Java rules are native, the rest are Starlark. I don't know Bazel internals so I can't really comment on your point about APIs.


Same language does not necessarily mean Buck2 will provide the same API as Bazel to write language rules.


As a former Bazel developer and current Bazel user, I very much like the design principles that they outline for Buck2. In particular:

* The fact that it is written in a compiled safe language is a breath of fresh air. I personally like Java the language and understand why Bazel was originally written in Java and how it has done a great job at "hiding" it, but it's still there. In particular, Java's memory and threading models have been problematic for certain scenarios. (I haven't kept up with the language advances and I believe there are new ways to fix this, but adopting them would require a major overhaul of Bazel's internals.) Plus Bazel being written in Java prevents it from being adopted in smaller projects that are /not/ written in Java--a bummer for the whole open source ecosystem.

* The complete separation of language rules from the core is great. This is something that Bazel has wanted to achieve for a long time, but they are still stuck with native C++ and Java rules (it's really hard to rewrite them apparently). Not a huge deal, but in Buck2's case, their design highlights that it's clean enough to support this from day one.

* The "single" phase execution is also nice to see. Bazel used to have three phases (loading, analysis, and execution) and later managed to interleave the first two. However, the separation is still annoying from a performance perspective, and also introduces some artifacts in the memory model.

* It's good that various remote execution facilities as well as virtual file systems have been considered from day one. These do not matter much... until they do, at which point you want the flexibility to incorporate them. Bazel used to have this in the Google-internal version (there is that ACM paper that explains this), but the external version doesn't. For example, there is a patch to support an output tree on a lazy file system courtesy of the bb-clientd project, but after years it hasn't been upstreamed yet.

* And lastly, it's also great to see that what they open sourced is what they use internally. Bazel isn't like that: Google tried to open source a "cleaner version" by removing certain warts that were considered dead ends... and that has been both good and bad. On the one hand, this has been key to developing Starlark to where it is today, but on the other, this has made it hard for certain communities to adopt Bazel (e.g. the Python rules were mostly unusable for a really long time).

Now, a question: Buck2 uses the Starlark language, but that does not imply that they implement the same Build APIs to support the rules that Bazel has. Does anyone know to what extent the rules are compatible between the two? If Buck2 supported the Bazel rules ecosystem or with minor changes, that'd be huge!


Thanks for the comments! There are two levels at which you could make Buck2/Bazel compatible:

* At the BUILD/BUCK file level. I imagine you can get close, but there are details between each other that will be hard to overcome. Buck2 doesn't have the WORKSPACE concept, so that will be hard to emulate. Bazel doesn't have dynamic dependencies, which means that things like OCaml are written as multiple rules for a single library, while Buck2 just has a single rule. I think it should be possible to define a macro layer that was common enough that it could be implemented by both Bazel and Buck2, and then open source users could just support the unified Bazck build system.

* At the rules level. These might be harder, as the subtle details tend to be quite important. Things like tsets vs depsets are likely to be an issue, as they are approximately the same mechanism, but one is wired into the dependency graph and one is not, which is going to show up everywhere.


There are a few references to NixOS on the code/issues.[0] I wonder what Meta's use case is for NixOS.

[0] https://github.com/facebook/buck2/search?q=nixos&type=issues


That person creating the issues was me. I simply know of Neil's work, was a user of his previous build system Shake, and we ran in some similar circles in the past -- so when I saw buck2's source initially get released a few months ago on GitHub, I just started using it immediately and giving feedback long before the initial release. Nobody at Meta uses NixOS for production work, from my understanding.

Actually, the current tip of trunk for buck2 can't build on NixOS right now with buildRustPackage due to a problem with prost. I should probably file a root cause issue about that soon...


EDIT: nevermind, I couldn't get it to build due to a prost issue, gave up and did `cargo build` instead. I misremembered.

crane.dev with nightly-2023-03-15 from oxalica/rust-overlay built here okay (after I figured out all the BUCK2_BUILD_PROTOC, missing Cargo.lock, etc tricks).

I did have some weird trouble with trying to import buck2 into my flake as a non-flake input, with complaints about "failed to resolve patches" for prost, but putting the flake.nix into the buck2 source tree worked.


Just using `cargo build` will work as you've seen, but building a Nix package won't work right now. If you're still keeping up with this thread, though, I have code in this repository to build a copy of buck2 with Nix -- it's several weeks out of date at this point though:

- https://github.com/thoughtpolice/buck2-nix/tree/main/buck/ni...

You should mostly be able to just 'callPackage' that and have it work if you have rust-overlay applied. Check the corresponding 'flake.nix' if you need it.

The prost issue with buildRustPackage is preventing an upgrade, so ideally the minor patches to prost can get upstreamed, but I need to (again) file a ticket about moving this along.


These were from an open source contributor - out of the box, Buck2 doesn't really have support for Nix. But because the rules are flexible, you can write your own Nix-aware version of Buck2.


Wonder if they have examples for Java where maven and groovy are the main two tools.

Also, in the case of our builds, we can benefit only so much from a faster build phase, because it is all the other bits (SonarQube scans, pushing artifacts to Artifactory, misc housekeeping, annoyingly slow Octopus deployments, etc.) that add most of the time to the long deployment cycle. Sometimes I think a dedicated Go utility that takes care of everything build-related (parallelizing when possible) would make things faster; it would have the full picture, after all. But then we would be reimplementing all the features of these various tools, which is maybe OK at FB scale but would be too much for a smaller shop.


You can push those down too so they only operate on applicable code (and if caching is working, only the changed code) per-build, with a tool like bazel and probably buck2 - in addition to getting parallel execution per-target.


Everyone says Buck and Bazel are so amazing but honestly, monorepos are unicorns. No one does this. It's useful to no one I know. I keep hearing it's useful to somebody, so it must be really useful when it is, but I've never ever seen Buck, Bazel, or monorepos in real life. And it's been my career to build stuff.


I'm sorry to break it to you, but monorepos are extremely common. They don't have to be as large, but every company I've been at had a monorepo.

And as soon as you have to manage PRs for multiple repos with a new cross-cutting feature or scheduling changes in the correct order you understand why they are so appealing.


There's quite a few well-known places listed on https://bazel.build/community/users across many industries. I think Buck and Pants and Please (and ...) are not as widely used, but if they had a list to add we'd have even more examples.


For folks that are using these kinds of tools, any regrets? How much more complexity do they add vs make or shell scripts?


Speaking of the "tools" generically, they are totally worth it because of their ability to aggressively cache and parallelize, but also because you end up with more declarative definitions of common build targets in a way that is more or less type-safe. I personally think that makes these kinds of tools a win over make. Beyond that, they make it trivial to implement repeatable builds that run build steps in "hermetic" sandboxes. You can do all that with make, but you are abusing the hell out of the tool to get there, and it will look foreign to anyone familiar with using make "the traditional way".

That said, bazel’s set of prepublished rules, reliance on the jdk, etc, make it not worth the burden, imo/e.

I think less ambitious, but similar tools are where it’s at. We use please for this reason, and are generally quite happy with how it balances between pragmatism and theory.

In any event, having your build tool be a single binary is a major win. I’d rather use make than anything written in python or Java just because I don’t have to worry about the overhead that comes with those other tools and their “ecosystems”.


I've used Buck at Meta for years and while it is technically impressive and does an excellent job with speed, remote compilation and caching, I am not a fan of all of the undocumented, magic rules used all over the place in BUCK and .bzl files.

I've yet to try BUCK on small projects though - I personally default to Makefiles in that case.

One thing I definitely wouldn't use it for is Android development. The Android Studio integration is much worse than Gradle's, and adding external dependencies means you have to make BUCK modules for each one.

I would however use it for large-scale projects or projects with more than a dozen developers.


IMO one of the nice things about Buck or Bazel is that once you learn it, switching languages doesn't require you to learn a completely new tool. Obviously the cost of learning it the first time is high and if you are used to one ecosystem may not be worth it. But I'm now on my 3rd different ecosystem that uses Buck/Bazel (Android, iOS, C++) and it's nice to not worry at all about the underlying tools.


I like to use Makefiles for project automation. Does Buck make it straightforward to run phony-target-style tasks? I have been considering transitioning to Justfiles, but Buck will obviously be getting significantly more exposure and mindshare.


There's no such thing as a phony target, but there is both `buck2 build` and `buck2 run` - each target can say how to run and how to build it separately. So you can have a shell script in the repo, write an export_file rule for it, then do `buck2 run :my_script` and it will run.
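
A minimal sketch of that pattern (the script name is invented):

  # BUCK
  export_file(
      name = "my_script",
      src = "my_script.sh",
  )
  # then: buck2 run //:my_script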


Nuts. Possible, but I would be fighting the platform a bit. Especially if I might want something like phony1 to depend on phony2.


What kinds of things are you using phony targets for?


Everything I can. Building docs, checking health of prod, deploying to dev/prod (which has real+phony dependencies), starting the development web server with watchexec, running lint, tests, etc. Some of these might have real artifacts, but I typically have more phony than not.

Make has quite a few flaws, but it is near universal and a good place to centralize all required project work, even if the entire task is defined in a shell script elsewhere. That being said, I have been looking longingly at just[0], which is not just concerned with compiling C but has task automation built in from the beginning.

[0] https://github.com/casey/just


Tests are done with `buck test`; linting and such can be done in Bazel with aspects. I'm not sure if Buck can do the same, although I wouldn't recommend it in most cases.

Deployment can be done with `run`, but again I wouldn't recommend it for deployment to real environments, starting a local devserver with run is a common pattern though.


Excuse my ignorance but what are the advantages of using such a system over the standard build systems of various languages (vite, gradle, maven, pip, etc)?


In my experience there are two main advantages: reproducibility and cross-language support.

Build systems like Maven can read from the entire filesystem (e.g. Maven reads from ~/.m2), and you might end up with artifacts that depend on the state of the machine. This makes debugging production issues harder. You can of course be careful with other build systems to stay reproducible, but it's easy to make mistakes, and Buck will enforce that you don't.

Companies like Meta and Google have huge monorepos and use multiple languages. It's common for developers to deal with multiple languages and for projects to depend on other languages. Buck handles that very naturally and saves engineers from dealing with multiple build tools.

There are other upsides and downsides listed on their website.


If you only use one language, and that language has a reliable, reproducible build system that gives you the guarantees and functionality that you require, then not much.

Here’s how I used Bazel (and how I now use Nix).

I am provided a configuration file that specifies what a given instance of my program must do. I use language A that can natively understand this configuration to code-generate a file in language B (which is significantly more suited to the performance requirements of the program)

This file is then built along with generic program code. It is used to process a lot of data.

As an interface to this program, I have a HTTP interface that can communicate with it. It needs to understand the kinds of outputs the program will produce, so some of this HTTP interface is also code generated in language A. The interface is interactive so typescript is generated and then compiled too.

In order to process the output from the program, I need to produce extensions for languages that users of the output use: Python and R. These need to understand the kinds of data being used, so are also code generated - then built. They’re then tested against requirements defined by the config (so the tests are also code generated).

Each of these stages have dependencies required when building and dependencies required when running, and there are several languages involved.

I also need to be able to express the entire process as a function, because it's something that needs to be run very frequently on different configs - sometimes on collections of configs that need everything built at once. It needs to be built on several different machines, sometimes desktops and sometimes remote servers, sometimes on clients' hardware (depending on the needs). I need confidence that something that works on my development machine will work, in its entirety, on a completely different machine with minimal prior setup. And I need it to be easy to do and easy to maintain. I don't want to mess around with many different build systems that have entirely different use cases, entirely different ideas of how build/runtime environments should be handled, and entirely different languages to configure them - and many of them are rigid and don't have a concept of functions; they're just "state your dependencies, kthxbye".

All of the above is absolutely trivial with bazel if you know where to tread lightly (e.g. surrounding Python environments). It’s also very easy with Nix once you get used to it, and you don’t need to tread lightly there - it has stronger guarantees.


That's very interesting; I've never been exposed to such a development environment. Is there maybe a GH repository or something where I can see the above in action? Thank you


How would you compare Bazel's Starlark with Nix's own functional DSL as a language for build file definitions (both in terms of what relevant difference it makes from a usage perspective, and as a qualitative evaluation of what's better/worse)?


In favour of Nix.

In Nix, the sequence of events that lead to your build are figured out on the language level. A build is this sequence of events.

In Starlark, you have to do things in a specific order - and that can get messy. It has a bunch of built-in phases that mean WORKSPACE, .bzl and BUILD files all have different levels of ability. A workspace can define a dependency (e.g. a http_archive), load a file into bazel, then call a bazel function from it. If you want to do that outside of a huge unmaintainable file, you have to break the steps down into multiple bzl files - one that does the download, then another that loads the dependency and calls it. And then you can call the latter from your workspace. BUILD files can do neither of the above and can only really define builds. So you end up with awkward file arrangements to load the right dependencies.
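
To illustrate the two-file dance (the repository and macro names are invented):

  # repos.bzl -- step 1: a macro that downloads the dependency
  load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

  def fetch_deps():
      http_archive(
          name = "com_github_owner_repository",
          urls = ["https://github.com/owner/repository/archive/v1.0.tar.gz"],
          sha256 = "...",  # placeholder
      )

  # WORKSPACE -- step 2: load the macro, call it, and only then can you
  # load files from inside the freshly fetched repository.
  load("//:repos.bzl", "fetch_deps")
  fetch_deps()
  load("@com_github_owner_repository//:deps.bzl", "repository_deps")
  repository_deps()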

In Nix, anything you evaluate is a Nix variable. A function is a variable, a derivation (target) is a variable. You can refer to it natively and do what you want with it.

In Bazel, everything is referred to in a global register via labels, which are just strings. If you want two versions of the same thing that are the result of a function call, you have to give them different labels, so you can end up screwing around with string manipulation to generate unique identifiers.

This also feeds into a deeper issue: Bazel loves its global registry. Names from your dependencies are also your dependencies, so if the author of a project decides they hate their users and want to enforce "com_github_owner_repository" as its workspace name (because who doesn't love Java-style conventions), you're stuck using "@com_github_owner_repository//:library" throughout your codebase, anywhere you rely on that dependency. And to load that dependency's own dependencies, you have to either load a special file and run special functions that the author provides, or copy a bunch of code into your WORKSPACE file, making sure you follow the naming conventions used within the dependency.

In Nix, you name things whatever you want. If something you depend on has dependencies, it manages them itself - you can override them if you want.

Bazel doesn’t have much of a registry of third party repos. The onus is on you to scour the web, find an archive you want, grab the hash, and sometimes write your own build file to explain how it gets put together.

Nix has a truly vast repository of dependencies, which can be library dependencies or programs (in the latter, it’s the largest repository of applications out there - bigger than even debian and arch provide).

Bazel is maintained by Google, and maintainers make it very clear that they are spread thinly and don’t enjoy doing it. Some projects are completely stalled due to lack of maintenance. When bugs are found in the internals, they don’t get fixed for a long time. It’s based on Blaze, just like VSCode is based on Visual Studio, and there’s a feeling that it is being held back when things that could benefit users would be incompatible with their own system. There’s always the looming apprehension that Google are going to abandon it as soon as the next shiny thing comes along, or as soon as the people who work on it get promoted. Want to submit a PR? Sign this agreement first, then we might look at your PR, then we might merge it, might say no, but (note: speculation) an eerily familiar yet unattributed push might arrive a few months later from a googler.

Nix is full FOSS, with the standard open process you’d expect.

In favour of Bazel.

In Bazel, because you’re expected to follow a certain flow, errors are friendlier and easier to figure out. Bazel knows what you’re supposed to be doing and can help. When it builds things and fails, the intermediate files are saved in an easy-to-find place within your local directory (as a symlink) so it’s easy to explore and figure out the issue.

In Nix, a derivation (target) truly resolves down to the tiny operations that put it together. It doesn't know what you're trying to do with it, so a typo at the topmost definition of a derivation can end up giving you horrific errors about low-level list operations. If an error occurs in the code you're building, intermediates can be found by copying a huge Nix store path from the terminal and navigating to it, but it's a laborious process.

Starlark is based on Python, and it's familiar to a lot of people. The overhead of learning the language is just a case of "oh, it's Python but without X, Y, Z, and… without f-strings!?".

Nix is an entire language, with its own conventions, its own library, and its own syntax. It's easy enough to learn if you're used to picking up new languages (and honestly it's one of my favourite languages now), but that's an overhead. Most people think imperatively and eagerly, and Nix is functional and lazy, so it can be hard to figure out how to accomplish something.

Bazel has an actual build system, with native support for a bunch of languages. Nix is very much DIY - if you're using make or CMake then mkDerivation can do a lot of things for you, but for a developer, using the two technologies together is unpleasant. We went as far as making our own build system in Nix that invoked e.g. gcc directly rather than try to work with CMake.


When it breaks down, do they say the buck stops here?


There was some initial fanfare about Sapling but it has pretty much waned. Is git simply good enough?


I’ve switched over to using Sapling as a client for my GitHub-based projects - since it’s conceptually similar, with all-round-better UX, and no changes on the server side, it has made my life better and yet I don’t really have anything to say about it ¯\_(ツ)_/¯


Just curious, but are you ex-Meta? Sapling looks pretty neat, but I was hesitant to introduce it at the company I work at since I wasn't sure if its UX would be well-received. It feels great to me since I got used to it at Meta, but I also acknowledge that others have built up the muscle memory for git-centric workflows.


Kind of upset they didn't leverage a corny phrase like "the buck stops here".


> In our internal tests at Meta, we observed that Buck2 completed builds 2x as fast as Buck1.

Interesting, so twice the bang for your buck.


But if you need Buck2 then you’re back to one bang per buck


wow, nobody has dropped the obligatory xkcd 927

https://xkcd.com/927/


Build systems are not standards.


> Written in Rust

stopped reading there.


It might be worth continuing to read at least up to the point where they explain why they chose Rust and what benefits it brings compared to the previous tool (which was JVM-based). In a company like Meta, while inefficiencies are bound to exist, when it comes to projects like this - where it's very easy to measure before and after - decisions to rewrite a huge piece of infra are not taken on a whim or because "rewrite it in Rust".


Rust is everywhere. It's a great language.


It failed at its own premise (statically checked object lifetime).

The only reason people use it is to have an excuse to rewrite things in something new, free from "legacy".


No one seems to know how to do practical and useful build systems, so I write my own.

In particular the idea of writing something entirely generic that works for everything is a waste of time. The build system should be tailored to building your application in the way that matters to you and making the best of the resources that you have.


This sounds like a nightmare even if you're dealing with single-digit numbers of projects. Even just among my personal spare-time hobby projects, I have ~5 nodejs projects, ~5 php projects, ~5 rust projects, ~5 python projects - and I find myself wishing for a common build tool, because right now, even with only one build system _per language_, migrating from Travis to GitHub Actions meant I needed to rewrite four separate build/test/deploy workflows...


I have 500 projects.

They're all made to follow the same conventions, so none of them specifies anything redundant.


> Build systems stand between a programmer and running their code, so anything we can do to make the experience quicker or more productive directly impacts how effective a developer can be.

How about doing away with the build system entirely? Build systems seem like something that shouldn't exist. When I create a new C# .NET app in Visual Studio 2019, what "build system" does it use? You might have an academic answer, but that's beside the point. It doesn't matter; it just builds. Optimizing a build system feels like a lack of vision - getting stuck in a local maximum where you think you're being more productive, but you're not seeing the bigger picture of what could be possible.


Have you ever worked on a project that has to combine code from different languages into one cohesive application or tool? Have you ever had to build a binary that needs to end up in an installer package with some custom install scripts and also needs to support multiple end user OSs?

It's nice that you've lived in a world where you haven't had to concern yourself with the concerns of how your code is built, but please understand that some of us actually need or want to delve into this. It can't always be magic.


Building different languages into one cohesive application or tool is a procedural process: I build the code in language A, I build the code in language B, I package them together in some way, and do something with the result. I don't need a build tool with a custom declarative language for that; I can just use a Bash script (sketched below). The problem is people want to do both high-level merging like this and low-level compilation management in the same declarative language. It should really be two things.
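
For illustration, here is that procedural flow sketched in Python rather than bash (all paths and commands are hypothetical):

    # Procedural "build system": build A, build B, package the results.
    import os
    import shutil
    import subprocess

    os.makedirs("dist", exist_ok=True)
    # Build the code in language A.
    subprocess.run(["cargo", "build", "--release"], check=True)
    # Build the code in language B.
    subprocess.run(["npm", "run", "build"], cwd="web", check=True)
    # Package them together in some way.
    shutil.copy("target/release/mytool", "dist/mytool")
    shutil.copytree("web/build", "dist/assets", dirs_exist_ok=True)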


> I package them together in some way

That's a build system!

> I can just use a BASH script

Yep you can. Many build systems start out like that. Complexity tends to creep in.


A shell script that copies files into a directory may technically be a "build system" in the same way that a hot dog is technically a sandwich. Obviously (question mark?) I'm not talking about that.


I am not sure I understand your point. I click build in VS, and whatever VS does takes 2 minutes.

Supposedly the C# compiler can compile millions of lines per second, but my nowhere-near-millions-of-lines project takes a minute to compile, so it must be wasting time doing something and could use some optimization.


Regrettably, your observation doesn't prove your point. It's common knowledge that Visual Studio is very slow. It's entirely unknown how much of that slowness is related to the build system (using that term in the same sense as buck2).


Visual Studio is a build system. And about eleven other things.


I guess you could write machine code by hand if you don't want to build it. Otherwise, how specifically do you propose to do away with build systems?


Okay you got me, I don't want to get rid of build systems. I want to stop having to think about them, because it's a waste of my time.


I think across this thread you're thinking about the run-a-graph-of-tasks aspect of build systems. I'd agree with you that's mostly uninteresting. However, that's also the least significant aspect of what these build systems do. Most of the work is _determining_ the graph of tasks that need to be run, and therefore a lot of the differentiation is in the ability to write new rules to guide that process and in the engine driving it.


I think this is a case of YAGNI. Yes, there are cases where you have a big graph of sources and products and you can't plan up-front what you'll need - e.g. you're at a megacorp and your dependencies are created by some branch of the organization that is actively hostile to yours. I dunno. But I think most people just don't have those problems.


To some, the bigger picture of what could be possible includes using multiple languages, multiple build steps, etc.

Refining individual ecosystems to be effortless to use is one thing, and it’s a goal worth pursuing for the benefit of those who are happy in that ecosystem. For those who rely on bringing different technologies together, though, there can be a lot of complexity and nuance.

The point of these projects is to take that complexity and nuance and provide an effective “super ecosystem” to piece them together in. That’s why some of the biggest proponents of these technologies are Google and Meta - they have many interacting parts in many different languages that need to come together and work effectively.


So what you're saying is this is unfit for compiling projects, but could be used in a FAANG-scale environment to allow developers to aggregate lower-level efforts in a declarative way? I like that idea.


If you look around the toolbench, and you can't figure out who the build system is ...


Why should I care what the build system is? Please use your words.


Sure:

Whatever code we write as developers is only the very tip of the iceberg of software that comprises any substantial application. Controlling what goes into that iceberg, and how it's assembled, is an essential part of the engineering of software. The details and quality of your build system determine the composition and construction of that iceberg, not to mention the reliability and velocity of your development process.

Even 'basic' local build systems like CMake, maven/gradle/ivy, sbt, lein, cargo, go, ... bridge dependency management and task execution. They decide what goes into the software artifact you ultimately distribute (or deploy), and how that's assembled.

At the scale of buck, bazel, etc., tools of that shape are necessary to make forward progress in a codebase composed of internal dependencies that are managed by different teams, written in different languages, targeting a variety of environments, and so numerous that builds require distribution to complete in reasonable timeframes - all while requiring absolute reproducibility.

I'm not a VS/C# user, but MSBuild is definitely a build system, and both it and the developer definitely have to care about these complexities, even if they come under the heading of "IDE" instead of "Build System".

Also:

As the joke in my first comment implied, if you can't identify the build system, you're probably the build system


I'm starting to understand. This is a problem at FAANGs, and I simply haven't worked at a FAANG.


(Note: I work at Meta, on the buck team)

This isn't really just a FAANG thing. I've seen this throughout my career, before the FAANGs, at much smaller places - projects needing multi-platform/multi-language/multi-arch scenarios.

You do have a point that one shouldn't have to learn the depths of a build system to do what one needs. The trade-off is in how the knobs are exposed to a developer.

Another way to think about it: domain-specific languages like Starlark and others are just where the knobs are stored. Visual Studio ends up writing the same kind of knobs to vsproj/vcproj or MSBuild files - they're just stored in another form.


To clarify, I don't think that complex build systems are only used at FAANGs, I think that their usefulness is only realized at FAANGs. From what I've seen at smaller companies, most people are using these tools because everyone else does, not because their software production suffers from the kinds of problems that these tools solve. These tools also seem to provide the mental framework that people need to think about their software production processes, so I guess there's that.


Software scales up in the size of the iceberg of your deployment, and out in the number of teams working on it. Buck (and Bazel and Pants) solves for scaling in both directions at once.

Whether you should care about the details and knobs of your build system is strictly a function of the first kind of scaling. Every team building software of appreciable size should think a bit about their build system, and even small teams might benefit from Buck or friends if their iceberg is big enough.


For C++, the #includes kind-of-sort-of tell the story of what needs to be included in the build, and morally it seems wrong that you should have to repeat that information anywhere. I recently watched a CppCon talk from the people at tipi.build, which seemed to go along these lines. They've tried to completely tear away the entire notion that you should be writing separate build files that re-describe what you're already telling the compiler with your #includes.
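
As a toy sketch of that idea (quote-form includes only - a real tool would also have to resolve <...> includes, macros, and include search paths):

    # Recover C++ dependency edges from #include lines alone.
    import pathlib
    import re

    INCLUDE = re.compile(r'^\s*#include\s+"([^"]+)"')

    for src in pathlib.Path("src").rglob("*.[ch]pp"):
        for line in src.read_text(errors="ignore").splitlines():
            m = INCLUDE.match(line)
            if m:
                print(f"{src} -> {m.group(1)}")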


Visual Studio is your build system. What if I want to use, develop, or depend on a project that wasn't built with Visual Studio's preferences in mind?

I mean, I don't know how to use Visual Studio to build software, but I get along fine - so why should we optimize for your preferred interface and not another one that might be better?


> When I create a new C# .NET app in Visual Studio 2019, what "build system" does it use?

...msbuild?


How do you think VS builds your code?


The point is, it doesn't matter. What it's doing internally isn't interesting to me, a programmer; I'd rather not think about it. Unfortunately, people have decided that micro-optimizations like writing declarative build definitions by hand are beneficial, so I've had to learn many build systems.


The way it builds your software is also a declarative build system: MSBuild. It happens to be configured in XML instead of a Python-like language, and the IDE integration is probably better than what you get from Buck or similar.

Building software can be complex sometimes, and _pretending_ it's a "microoptimization" to care about that is unnecessarily diminutive.


Well, first, there is a build system in your VS. The fact that you don't know about it doesn't magically make it disappear. Some people have to understand some of it - for instance, the person who maintains your project for you, apparently (because you'd rather not think about it).

It's like if I said "I don't really need to understand the details of how a CPU works in my daily job. What if we just removed CPUs entirely?".


How does Visual Studio convert a solution to an executable?


Using a compiler. Less snarkily: It uses a compiler, and a black box which I don't have to think about or configure. There is technically a build system in there, but again, I don't have to think about it or configure it, unless I'm doing something really weird (and in that case, Visual Studio has GUI options for that).



Exactly right, people are very confident that they need a non-abstracted build system, and that it needs to be replaced every N years. I'm calling that into question, and people are confidently telling me I'm wrong, without the ability to explain why. So your link is very appropriate.


For the same reason someone creates a new programming language every second day: some people find it interesting, some people like to learn new languages, and sometimes a new language actually brings something really valuable.

Is it worth putting those resources into a new build system? Apparently Facebook thinks so. You don't have to agree, you can use Visual Studio.

It just seems like you're saying "if I don't care about build systems, nobody should".


Well, at the top level of this comment section I see people cheering this on, and people excited to try it out in their own projects. I don't think they all work at Facebook, so maybe they're closer to me than to Facebook on this matter.


I was reading on & on, going, yeah sounds great, but when are you going to mention that it runs on Mercurial, which just puts such massive distance between FB & the rest of the planet?

They do mention that it supports virtual file systems through Sapling, which now encompasses their long-known EdenFS. I'd like to know more, but right off the bat Sapling says it is:

> Sapling SCM is a cross-platform, highly scalable, Git-compatible source control system.

Git-compatible, eh? That's a lot less foreboding. (It is, however, a total abstraction over Git with its own distinct CLI tools.)

I hope there are some good, up-to-date resources on Sapling/EdenFS. I've heard some really interesting things about code automatically getting shipped up to the mothership and built/linted proactively, not just at commit points, which always sounded neat, and there seem to be some neat transparent thin-checkout capabilities here.

https://github.com/facebook/sapling


(engineer working on Buck2 here)

Buck2 is actually used internally with Git repositories, so using Sapling / hg is definitely not a requirement.


I'm not sure how Mercurial is relevant. From reading the Buck2 getting started docs, it looks like it works just fine with Git repos.


It indeed is not the primary concern of build systems. For many folks, there are CI/CD systems checking stuff out and feeding their build tools.

Notably, though, Buck2 tries to be a good persistent incremental build system, and it needs much more awareness of things changing. It does integrate with Sapling for these kinds of reasons.

So the boundaries between SCM, build, and CI/CD tools are a bit blurrier than they used to be.


My understanding is that Buck2 uses Watchman (disclaimer: I'm one of the maintainers) so it can work with both Git and Mercurial repositories efficiently, without compromising performance.


It can use watchman, but for open source we default to inotify (and platform equivalent versions on Mac/Windows) to avoid you having to install anything.



