Buck2: Our open source build system (fb.com)
392 points by mfiguiere on April 6, 2023 | 271 comments



The fact that Buck2 is written in a statically-compilable language is compelling, compared to Bazel and others. It's also great that Windows appears to be supported out of the box [1,1a] -- and even tested in CI. I'm curious how much "real world" usage it's gotten on Windows, if any.

I don't see many details about the sandboxing/hermetic build story in the docs, and in particular whether it is supported at all on Linux or Windows (the only mention in the docs is Darwin).

It's a good sign that the Conan integration PR [2] was warmly received (if not yet merged). I would hope that the system is extensible enough to allow hooking in other dependency managers like vcpkg. Using an external PM loses some of the benefits, but it also dramatically reduces the level of effort for initial adoption. I think Bazel suffered from the early difficulties integrating with other systems, although IIUC rules_foreign_cc is much better now. If I'm following the code/examples correctly, Buck2 supports C++ out of the box, but I can't quite tell if/how it would integrate with CMake or others in the way that rules_foreign_cc does.

(one of the major drawbacks of vcpkg is that it can't do parallel dependency builds [3]. If Buck2 was able to consume a vcpkg dependency tree and build it in parallel, that would be a very attractive prospect -- wishcasting here)

[1] https://buck2.build/docs/developers/windows_cheat_sheet/ [1a] https://github.com/facebook/buck2/blob/738cc398ccb9768567288... [2] https://github.com/facebook/buck2/pull/58 [3] https://github.com/microsoft/vcpkg/discussions/19129


One side effect of all the Metaverse investment is that Meta now has a lot more engineers working on Windows. You bet there will be real world usage. ;)


> There are also some things that aren't quite yet finished:

> There are not yet mechanisms to build in release mode (that should be achieved by modifying the toolchain).

> Windows/Mac builds are still in progress; open-source code is mostly tested on Linux.

Source: https://buck2.build/docs/why.


> I don't see many details about the sandboxing/hermetic build story in the docs, [...]

Looks like local mode just inherits whatever environment the buck daemon was spawned in.

The remote execution thing is configured with a Docker image to run things in, and only specified files are copied into the container instance, so it's somewhat hermetic. Docker containers aren't really reproducible, and there's only one image per remote execution backend, so that's kinda the weakest link (especially compared to something like Nix's hermetic builds, where the build-visible filesystem only contains the things you declared as dependencies).


Internally, we don't use Docker in our Remote Execution service implementation; the Linux workers use cgroups for isolation, whereas the macOS and Windows story is still being worked on.

IIUC, the publicly available Remote Execution services out there are specified in terms of Docker images, so we chose to have OSS buck2 align with that.

As noted, local mode doesn't do anything else at this point, but we've discussed exactly this to help developers identify dependency declarations earlier.


Great to see this. I hope it takes off - Bazel is useful, but I really like the principled approach behind Buck2 (see the Build Systems à la Carte paper), and Neil is scarily good from my experience of working with him, so I'd expect that they've come up with something awesome.

One thing I find annoying with all of these general, language-agnostic build systems though is that they break the "citizenship" in the corresponding language. So while you can usually relatively easily build a Rust project that uses crates.io dependencies, or a Python project with PyPI dependencies, it seems hard to make a library built using Bazel/Buck available to non-Bazel/Buck users (i.e., build something available on crates.io or PyPI). Does anyone know of any tools or approaches that can help with that?


Regarding Bazel, rules_python has a py_wheel rule that helps you create wheels that you can upload to PyPI (https://github.com/bazelbuild/rules_python/blob/52e14b78307a...).
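For a rough flavor, a minimal py_wheel setup might look something like this (a sketch only; the load paths and attributes follow the rules_python docs but may differ by version):

    # BUILD.bazel -- hypothetical example of wrapping a py_library into a wheel
    load("@rules_python//python:defs.bzl", "py_library")
    load("@rules_python//python:packaging.bzl", "py_wheel")

    py_library(
        name = "mylib",
        srcs = glob(["mylib/**/*.py"]),
    )

    py_wheel(
        name = "mylib_wheel",
        distribution = "mylib",  # the name used on PyPI
        version = "0.1.0",
        deps = [":mylib"],
    )

Building that target produces a .whl under bazel-bin that can then be uploaded like any other wheel.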

If you want to see a Bazel-to-PyPI approach taken a bit to the extreme, have a look at TensorFlow on GitHub to see how they do it. They don't use the above-mentioned build rule because I think their build step is quite complicated (C/C++ stuff, CUDA/ROCm support, Python bindings, and multi-OS support all in one before you can publish to PyPI).


I use py_wheel to build packages to be consumed by data scientists in my company. It works well and is reasonably straightforward. Although the packages are pure Python so I haven’t had to deal with native builds.


I have a lot of respect for Neil, but I've been burned by the incompleteness and lack of surrounding ecosystem for his original build system Shake (https://shakebuild.com/). This was in a team where everyone knows Haskell.

I'm cautiously optimistic with this latest work. I'm glad at least this isn't some unsupported personal project but something official from Meta.


I think of Shake as a library for implementing build systems, and was hoping that libraries would emerge that described how to implement rules for languages like C++, and how they should compose so you can compile C++/Haskell/Python all together happily. A few libraries emerged, but the overall design never materialized. Sorry you got burned :(

Buck2 is at a higher level than Shake - the rules/providers concepts pretty much force you into a pattern of composable rules. The fact that Meta has lots of languages, and that we were able to release those rules, hopefully means it's starting from the point of view of a working ecosystem. Writing those rules took a phenomenal amount of effort from a huge range of experts, so perhaps it was naive that Shake could ever get there on only open source volunteer effort.


The “citizenship” point is really interesting. I’ve found these build systems to be really useful for solving problems in multi-language repos. They make it super easy to create all the build artifacts I want. However, in many ways, they make the source more difficult to consume for people downstream.


Bazel now has a module system that you can use.

https://bazel.build/external/module

This means your packages are just Git repos + BUILD files.
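As a sketch, a MODULE.bazel file declaring dependencies looks roughly like this (the version numbers here are placeholders):

    # MODULE.bazel -- dependencies are declared by name/version and resolved
    # from a registry, rather than vendored or fetched ad hoc
    module(name = "my_project", version = "1.0")

    bazel_dep(name = "rules_python", version = "0.27.0")
    bazel_dep(name = "platforms", version = "0.0.8")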


These kinds of tools are designed to work in monorepos so you don't really rely on package management like you do with separate repos. This works really well for sharing code inside companies/entities. Doesn't work as well for sharing code between entities.


If I'm understanding, for the rust specific case, this generates your BUCK files from your Cargo.toml:

https://github.com/facebookincubator/reindeer


> One thing I find annoying with all of these general, language-agnostic build systems though is that they break the "citizenship" in the corresponding language

I mean, this is kind of the whole point. A language agnostic build system needs a way to express dependencies and relationships in a way that is agnostic to, and abstracts over, the underlying programming language and its associated ecosystem conventions.


That is only true if the output is an application.

If the output is libraries for some ecosystem (perhaps with bindings to something written in Rust or C), one needs to be able to build packages that others not invested in that build system can consume.


The linked paper is pretty interesting, and short at 4 pages.

In plainer language, I'd say the observation/motivation is that not only do compiling and linking benefit from incrementality/caching/parallelism, but so does the build system itself. That is, the parsing of the build config, and the transformation of the high level target graph to the low level action graph.

So you can implement the build system itself on top of an incremental computation engine.

Also the way I think about the additional dependencies for monadic build systems is basically #include scanning. It's common to complain that Bazel forces you to duplicate dependency info in BUILD files. This info is already present (in some possibly sloppy form) in header files.

So maybe they can allow execution of the preprocessor to feed back into the shape of the target graph or action graph. Although I wonder what effect that has on performance.
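A toy sketch of that idea in Python (not Buck2's actual API): a memoized fetch function where the dependency set is discovered only after reading the file's contents, which is what makes the build monadic rather than applicative:

    # Toy monadic build: dependencies are discovered by scanning #includes.
    import re

    SOURCES = {
        "main.c": '#include "util.h"\nint main() { return util(); }',
        "util.h": "int util();",
    }

    cache = {}

    def fetch(key):
        """Build `key`, memoizing results; deps are found dynamically."""
        if key in cache:
            return cache[key]
        src = SOURCES[key]
        # Which headers we need is only known after reading the file --
        # an applicative system would need the full list up front.
        for header in re.findall(r'#include "(.+?)"', src):
            fetch(header)
        cache[key] = "compiled(%s)" % key
        return cache[key]

    print(fetch("main.c"))  # builds util.h first, then main.c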

---

The point about Java vs. Rust is interesting too -- Java doesn't have async/await, or coroutines.

I would have thought you give up some control over when things run with async/await, but maybe not... I'd like to see how they schedule the tasks.

Implementing Applicative Build Systems Monadically

https://ndmitchell.com/downloads/paper-implementing_applicat...


Pants v2 supports dependency inference in a manner similar to what you’re hypothesising. It indeed can benefit from the general caching mechanism. https://blog.pantsbuild.org/why-dependency-inference/

It has been remarkably convenient for our Python monorepo.


Ah very interesting, I had heard of Pants, but I didn't know about Pants 2. Having less BUILD metadata definitely seems like a big win, and I imagine with some effort you can keep the parallelism/caching/distribution too (I'd be interested in details on that).

Since Pants 2 has a core in Rust, I wonder if you considered Starlark vs. Python? I can see advantages to Python, but it seems like there are good open source implementations of Starlark now.

With Pants 2 and Buck 2, now we're on the 3rd or 4th generation of the Bazel-like systems :)


We schedule tasks mostly using tokio - without any particular care. It's mostly good, but occasionally causes some performance headaches. I think in the fullness of time we might need to invest more in careful scheduling.


> Buck2 is an extensible and performant build system written in Rust

I really appreciate tooling that is written in Rust or Go and produces single binaries with minimal runtime dependencies.

Getting tooling written in, for example, Python to run reliably can be an exercise in frustration due to runtime environment dependencies.


Your problem is that Python sucks, especially its dependency management. It sucks not because it ought to suck, but because of the incompetence of PyPA (the people responsible for packaging).

There are multiple problems with Python packaging which ought not exist, but are there and make lives of Python users worse:

* Python doesn't have a package manager. pip can install packages, but installing packages iteratively will break dependencies of packages installed in previous iterations. So, if you call pip install twice or more, you are likely to end up with a broken system.

* Python cannot deal with different programs wanting different versions of the same dependency.

* Python versions iterate very fast. It's even worse for most Python packages. To stand still you need to update all the time, because everything goes stale very fast. In addition, this creates too many versions of packages for dependency solvers to process, leading to insanely long installation times, which, in turn, prompts package maintainers to specify very precise version requirements (to reduce the time one has to wait for the solver to figure out what to install), but this, in turn, creates a situation where there are lots of allegedly incompatible packages.

* Python package maintainers have too many elements in support matrix. This leads to quick abandonment of old versions, fragmented support across platforms and versions.

* Python packages are low quality. Many Python programmers don't understand what needs to go into a package, they either put too little or too much or just the wrong stuff altogether.

All of the above could've been solved by better moderation of community-generated packages, stricter rules on the package submission process, longer version release cycles, formalizing package requirements across different platforms, creating tools such as a package manager to aid in this process... PyPA simply doesn't care. That's why it sucks.


Most of this is false. You are ignoring the best practices of using python virtual environments for managing a project's binary and package versions.


You are seriously going to preach about virtual environments to someone who maintains a couple dozen Python packages, and works and has worked in the infra departments of the largest software companies on Earth? :)

Come back in ten years. We'll talk.


"Appeal to authority" doesn't prove your point buddy, especially if that "authority" is yourself.


This is not an "appeal to authority". It means to say that I was using virtual environments before you started programming, and am acutely aware of their existence: the solution you offer is so laughable it doesn't deserve a serious discussion; too many things about your "solution" are "naive" at best, but mostly your "solution" is just irrelevant / a misunderstanding of the problem.


I'm not the original poster you replied to, just a passer by.


I'm still waiting on an actual argument here other than condescending name calling, appeals to authority (yes that is what you are doing), and casual hand-waving away any serious discussion because it doesn't "deserve" it.

Let's go through the points which I was referring to:

"Python doesn't have a package manager. pip can install packages, but installing packages iteratively will break dependencies of packages installed in previous iterations. So, if you call pip install twice or more, you are likely to end up with a broken system."

"Likely" seems like a stretch here since it's pretty damned rare that I've come across this when using virtual environments. With a virtual environment, you have an isolated system. Why are you installing packages iteratively in the first place? Use a requirements.txt with the packages you need, then freeze it. If you end up with a conflict, delete the virtual environment and recreate a fresh one, problem solved.

"Python cannot deal with different programs wanting different versions of the same dependency"

It does when you're running your applications using virtual environments. Again, you say that it's irrelevant, but this is literally what this shit solves. I come from a world where multiple applications are run in separate docker containers so this doesn't really apply anyway, but if you had to run multiple applications on the same server you can set the PYTHONPATH env variable and use the venv's binary when running each application.

"Python version iterates very fast. It's even worse for most of the Python packages. To stand still you need to update all the time, because everything goes stale very fast. In addition, this creates too many versions of packages for dependency solvers to process leading to insanely long installation times, which, in turn, prompts the package maintainers to specify very precise version requirements (to reduce the time one has to wait for the solver to figure out what to install), but this, in turn, creates a situation where there are lots of allegedly incompatible packages."

Maybe I'm misunderstanding what you are saying here, but this seems like a retread of your first point with some casual opinions thrown in. If you delete the venv and re-install all the packages at once, shouldn't it resolve dependency issues? "Insanely long installation times"? Seems to be a lot quicker than maven or gradle in my experience, and much easier to use. I get a lot of dependency issues with those managers as well, so this doesn't seem to be a unique problem for python, if it really is a problem when using virtual environments.

"Python package maintainers have too many elements in support matrix. This leads to quick abandonment of old versions, fragmented support across platforms and versions."

I admit I don't know anything about this. Maybe it's true, but I imagine this is true for community packages of just about any language.

"Python packages are low quality. Many Python programmers don't understand what needs to go into a package, they either put too little or too much or just the wrong stuff altogether."

This is not only a purely subjective opinion, it's not even one that seems to be common. Maybe it's true for less popular packages (and again, I'm not convinced it wouldn't be the same for less popular packages in other languages), but the ones most people use for common tasks I often see heralded as fantastic examples of programming that I should be reviewing to level up my own code.

"All of the above could've been solved by better moderation of community-generated packages, stricter rules on package submission process, longer version release cycles, formalizing package requirements across different platforms, creating tools s.a. package manager to aid in this process..."

I'm not familiar enough with the politics, culture, and process of maintaining Python's packages or package management system to speak to any of this. It seems like this would generally be good advice regardless of the state it's currently in. But these are broad, systemic solutions that require a revamp of the culture and bureaucracy of the entire package management system, a completely different set of tools than the ones that already exist (that would likely create backwards incompatibility issues), and no meaningful way to measure the success of these initiatives because most of your complaints are subjective opinions. Furthermore, at least half of your complaints seem to already be mitigated using virtual environments and industry best-practices, so I'm struggling to see where any of this is helpful.


s/Python/NodeJS/ and everything in this statement is multiplied by 10x


Some of it is also true for Node (e.g. poor package quality), but I think it would be hard to argue that the actual package management of Node is anywhere near as bad as Python.

Node basically works fine. You get a huge node_modules folder, sure. But it works.

Python is a complete mess.


> You get a huge node_modules folder, sure. But it works.

pnpm and other tools deduplicate that


I've tended to have the exact opposite experience - Node projects have 10x (or more) the dependencies of Python ones, and the tooling is far worse and harder to isolate across projects.

A well engineered virtualenv solves most Python problems.


I don't have enough experience with npm, but one thing I know for sure is that it can support multiple different versions of the same package. Not in the way I'd like it to do that (i.e. it also allows this in the same application), but at least in this sense it's not like Python.


Yes, just what I thought when I installed the Shopify CLI (https://github.com/Shopify/cli) a few days ago because they force you to install Ruby and Node


Personally it seems like a huge waste of memory to me. It's the electron of the backend. It's absolutely done for convenience & simplicity, with good cause after the pain we have endured. But every single binary bringing the whole universe of libraries with it offends.

Why have an OS at all if every program is just going to package everything it needs?

It feels like we cheaped out. Rather than get good & figure out how to manage things well, rather than drive harder, we're punting the problem. It sucks & it's lo-fi & a huge waste of resources.


I don't think that matters so much. For building a system, you definitely need dynamic linking, but end user apps being as self contained as possible is good for developers, users, and system maintainers (who don't have to worry about breaking apps). As long as it doesn't get out of hand, a few dozen MBs even is a small price to pay IMO for the compatibility benefits.

As a long time Linux desktop user, I appreciate any efforts to improve compatibility between distros. Since Linux isn't actually an operating system, successfully running software built for Ubuntu on a Fedora box, for example, is entirely based on luck.


There's also the issue that if a library has a vulnerability, you are now reliant on every static binary updating with the fix & releasing a new version.

Whereas with the conventional dynamic library world one would just update openssl or whomever & keep going. Or if someone wanted to shim in an alternate but compatible library, one could. I personally never saw the binary compatibility issue as very big, and generally felt like there was a while where folks were getting good at packaging apps for each OS, making extra repos, that we've lost. So it seems predominantly to me like downsides, that we sell ourselves on, based off of outsized/overrepresented fear & negativity.


the optimization you describe here is not valuable enough to offset the value provided by statically linked applications

the computational model of a fleet of long-lived servers, which receive host/OS updates at one cadence, and serve applications that are deployed at a different cadence, is at this point a niche use case, basically anachronistic, and going away

applications are the things that matter, they provide the value, the OS and even shared libraries are really optimizations, details, that don't really make sense any more

the unit of maintenance is not a host, or a specific library, it's an application

vulnerabilities affect applications, if there is a vulnerability in some library that's used by a bunch of my applications then it's expected that i will need to re-deploy updated versions of those applications, this is not difficult, i am re-deploying updated versions of my applications all the time, because that is my deployment model


Indeed. I view Linux servers/vms as ELF execution appliances with a network stack. And more and more the network stack lives in the NIC and the app, not Linux.


100% yes


Free software has a use beyond industrial software containers. I don't think most folks developing on Linux laptops agree with your narrow conception of software.

Beyond app delivery there's dozens of different utils folks rely on in their day to day. The new statically compiled world requiring each of these to be well maintained & promptly updated feels like an obvious regression.


> Free software has a use beyond industrial software containers. I don't think most folks developing on Linux laptops agree with your narrow conception of software.

the overwhelming majority of software that would ever be built by a system like buck2 is written and deployed in an industrial context

the share of software consumers that would use this class of software on personal linux laptops is statistically zero

really, the overwhelming majority of installations of distros like fedora or debian or whatever are also in industrial contexts, the model of software lifecycles that their maintainers seem to assume is wildly outdated


Again, there is no alternative. Dynamic linking is an artifact of an antiquated 70s-era programming language. It simply does not and cannot work with modern language features like monomorphization.

Linux distros are thankfully moving towards embracing static linking, rather than putting their heads in the sand and pretending that dynamic linking isn't on its last legs.


Whoa, strong opinions.

Dynamic linking on *nix has nothing to do with 70s era programming languages.

Did you consider the possibility that the incompatibility between monomorphization (possibly the dumbest term in all of programming) and dynamic linking should perhaps say something about monomorphization, instead?


> Dynamic linking on *nix has nothing to do with 70s era programming languages.

Given that dynamic linking as a concept came out of the C world, it has everything to do with them.

> Did you consider the possibility that the incompatibility between monomorphization (possibly the dumbest term in all of programming) and dynamic linking should perhaps say something about monomorphization, instead?

Yes, I considered that possibility.


The design of dynamic linking on most *nix-ish systems today comes from SunOS in 1988, and doesn't have much to do with C at all other than requiring both the compiler and assembler to know about position-independent code.

What elements of dynamic linking do you see as being connected to "70s era programming languages"?

> Yes, I considered that possibility.

Then I would urge you to reconsider.


dynamic linking is an optimization that is no longer necessary

there is no practical downside to a program including all of its dependencies, when evaluated against the alternative of those dependencies being determined at runtime and based on arbitrary state of the host system

monomorphization is good, not bad

the contents of /usr/lib/whatever should not impact the success or failure of executing a given program


Dynamic linking wasn't an optimization (or at least, it certainly wasn't just an optimization). It allows for things like smaller executable sizes, more shared code in memory, and synchronized security updates. You can, if you want, try the approach of "if you have 384GB of RAM, you don't need to care about these things", and in that sense you're on quicksand with the "just an optimization". Yes, the benefits of sharing library code in memory are reduced by increasing system RAM, but we're seeing from a growing chorus of both developers and users, the "oh, forget all that stupid stuff, we've got bigger faster computers now" isn't going so well.

There's also the problem that dynamic loading relies on almost all the same mechanisms as dynamic linking, so you can't get rid of those mechanisms just because your main build process used static linking.


it allows for all of the things you list, yes, but those things just aren't really valuable compared to the reliable execution of a specific binary, regardless of any specific shared library that may be installed on a host

smaller executable sizes, shared code in memory, synchronized security updates, are all basically value-zero, in any modern infrastructure

there is no "growing chorus" of developers or users saying otherwise, it is in fact precisely the opposite, statically linked binaries are going extremely well, they are very clearly the future


> it allows for all of the things you list, yes, but those things just aren't really valuable compared to the reliable execution of a specific binary

> smaller executable sizes, shared code in memory, synchronized security updates, are all basically value-zero, in any modern infrastructure

This highlights the fact that you're extremely focused on one particular model of development, one where a single person or group deploys software that they are responsible for running and maintaining - often software that they've written themselves.

This is, obviously, an extremely appropriate paradigm for the enterprise. Static linking makes a lot of sense here. Python's virtual environments are basically the approved workaround for the fact that Python was built for systems that are not statically linked, and I cherish it for exactly that reason. Use Go on your servers - I do myself! But that doesn't mean it's appropriate everywhere.

Sometimes developers in this mindset forget there's a whole other world out there, a world of personal computers, that each have hundreds or thousands of applications installed. Applications on these systems are not deployed, they are installed. The mechanism by which this happens (on Linux) is via distributions and maintainers, and dynamic linking needs to be understood as designed for that ecosystem. Linux operating systems are built around making things simple, reliable, and secure for collections of software that are built and distributed by maintainers.

I'm firmly on the side of the fence that says that dynamic linking is the correct way to do that. All the benefits you mention are just a free bonus, of course, but I care about them as well. Smaller executable sizes? Huge win on my 256 GB SSD. Synchronized security updates? Of course I care about those as an end user!


I hugely agree that the parent is definitely favoring one and only one kind of software model.

You raise the world of personal computers. And I think dynamic linking is absolutely a choice that has huge advantages for these folks.

There's other realms too. Embedded software needs smaller systems, so the dynamic library savings can be huge there. Hyper-scaler systems, where thousands of workloads can be running concurrently, can potentially scale to much, much higher usage with dynamic linking.

It's a little far afield, but with systems like WebAssembly we're really looking less at a couple orgs within a company each shipping a monolith or two, and potentially way more at having lots of very small functions with a couple helper libraries interacting. This isn't exactly a classic dynamic library, but especially with the very safe sandboxing built in, the ideal model is far closer to something like dynamic linking, where each library can be shared, than to static linking.


> Sometimes developers in this mindset forget there's a whole other world out there, a world of personal computers, that each have hundreds or thousands of applications installed.

it's not that i forget about these use cases, it's that i don't really consider them relevant

tooling that supports industrial use cases like mine is not really able to support end-user use cases like yours at the same time

linux operating systems may have at one point been built around making things as you describe by distribution maintainers, but that model is anachronistic and no longer useful to the overwhelming majority of its user base, the huge majority of software is neither built nor distributed by maintainers, it is built and distributed by private enterprises


> > Sometimes developers in this mindset forget there's a whole other world out there, a world of personal computers, that each have hundreds or thousands of applications installed.

> it's not that i forget about these use cases, it's that i don't really consider them relevant

Yes, exactly! It's an extremely myopic vision. You've spent this long thread arguing against dynamic linking on the basis of what is only a small fraction of total human / computer interactions! By "not relevant" you mean not relevant to the enterprise. I grant that of course - but these uses cases are (by definition) relevant to hundreds of millions of PC users.

> the huge majority of software is neither built nor distributed by maintainers, it is built and distributed by private enterprises

The overwhelming majority of the software I run is built and distributed by maintainers. Literally, there are only a few exceptions, like static-built games that rarely or never change and are (unfortunately) closed source. I daresay that's true for the majority of Linux users - the vast majority of the software we install and use is not "built and distributed by private enterprises".

This reality is what Linux-on-the-desktop is built for. There are millions of people who are going to want to continue using computers this way, and people like me will continue contributing to and developing distributions for this use case, even if shipping static or closed-source binaries to Linux users becomes common.


linux-on-the-desktop is also like statistically zero of linux installations (modulo mobile) but if that's counter to a belief of yours then we're definitely not going to make progress here so (shrug)

like i'm not sure you understand the scale of enterprise linux. a single organization of not-that-very-many people can easily create and destroy hundreds of millions of deployed systems every day, each with a novel configuration of installed software. i've seen it countless times.


I think we're arguing on multiple fronts here and that is confusing things.

1. My point about Linux on the desktop is that there are in practice users like me who are already getting the (many) advantages of dynamic linking, and don't want to give up those advantages. To the point that some of us are going to support and work on distributions that continue the traditional Linux way in this area. In your view, the ecosystem has moved to software being built and distributed by private corporations. I don't think this has happened - on Windows software was always built and distributed this way; on (desktop) Linux it never was and largely still isn't!

2. My point about the desktop in general is that this use case matters to the vast majority of computer-using human beings much more than enterprise. The number of deployed containers that get created and destroyed every day doesn't change that fact, nor does the fact that Linux users are merely a tiny fraction of this desktop use case. This is what creates the myopia I was talking about - you're thinking about metrics like "number of systems deployed" whereas I'm thinking of number of human-computer interactions that are impacted. I don't think you can just discard what matters on the desktop or paint it as irrelevant. Desktop computing shouldn't be subordinate to the technical requirements of servers!

So to summarize the argument: (a) desktop use cases still matter because they comprise the majority of human-computer interactions, (b) dynamic linking and the maintainer model are the superior approach for desktop computing, and in fact complement each other in important ways, and (c) even if most desktop users can't take advantage of this model because of the dominance of closed source software and the corporate development model, desktop Linux can and does, and will hopefully continue to do so into the future.


> Desktop computing shouldn't be subordinate to the technical requirements of servers!

i guess this is the crux of the discussion

linux desktop computing for sure _is_ subordinate to linux server computing, by any reasonable usage metric

i'm not trying to deny your experience in any way, nor suggest that dynamic linking goes away, or anything like that -- your use case is real, linux on the desktop is real, that use case isn't going away

but it is pretty clear at this point that linux on the server is wildly successful, linux on mobile is successful (for android), and that linux on the desktop is at best a niche use case

the majority of human interactions with linux occur via applications, services, tools, etc. that are served by linux servers, and not by software running on local machines like desktops or laptops

linux is a server operating system first and foremost


whether we want unobservable, ungovernable, far-off machines running the future forever, or whether we want a future where actual people can compute & see what happens seems to matter. the numbers may perhaps stack up to subordinate PC needs to industrial computing needs now, but is that the future anyone should actually want? should the invisible hand of capital be the primary thing humanity should try to align to?

and where is the growth potential? is the industrial need going to become greatly newly empowered & helpful to this planet, to us? will it deliver & share the value potential out there? PC may be a smaller factor today, but i for one am incredibly fantastically excited to imagine a potential future 10 years from now where people start to PC again, albeit in a different way.

individual PCs have no chance. it's why the cloud has won. on-demand access wherever you are, consistent experience across devices is incredibly incredibly convenient. but networks of PCs that work well together is exciting, and we've only so very recently started emerging the capability to have nice easy to manage ops/automated multi-machine personal-computing. we've only recently emerged to maturity where a better, competitive personal computing is really conceivable.

it's been the alpha linux geeks learning how to compute and industrial players learning how to compute, and the invisible hand has been fat happy & plump from it, but imo there's such a huge potential here to re-open computing to persons, to create compelling interesting differently-capable sovereign/owned computing systems, that are free from so many of the small tatters & deprivations & enshittifications that cloud - that doing everything on other people's computers as L-Users - unerringly drops on us. we should & could be a more powerful, more technically-cultural culture, and i think we've severely underrated how much subtle progress there's been to make that a much less awful, specialized, painful, time-consuming, low-availability, disconnected effort than it used to be.


>Applications on these systems are not deployed

In a way they are. You deploy to the store, and then people's computers download the update automatically.

A counter example to your claims about Linux is Android. Libraries are not shared between apps (beyond the android framework and libc). This is despite the fact that phones have limited storage.


the chorus is about the assumptions commonly found among younger devs that these old "efficiency" and "optimization" techniques don't matter any more. c.f. apps (desktop, mobile) that take forever to do things that should not take forever.

"modern infrastructure" seems like a bit of a giveaway of your mind set. yes, i know that there's a lot of stuff that now happens by having your web browser reach out to "infrastructure" and then the result is displayed in front of you.

But lots of people still use their computers to run applications outside the browser, where "modern infrastructure" means either nothing at all, or it means "their computer (or mobile platform)". the techniques mentioned in this subthread are all still very relevant in this context.


there is basically no situation in which it is important to optimize for binary size, embedded sure, but nowhere else

the infrastructural model i'm describing doesn't require applications to run in browsers, or imply that applications are slower, actually quite to the contrary, statically linked binaries tend to be faster

the model where an OS is one to many with applications works fine for personal machines, it's no longer relevant for most servers (shrug)


> there is basically no situation in which it is important to optimize for binary size, embedded sure, but nowhere else

Not disagreeing that there many upsides to statically linking, but there are (other) situations where binary size matters. Rolling updates (or scaling horizontally) where the time is dominated by the time it takes to copy the new binaries, e.g.

> the model where an OS is one to many with applications works fine for personal machines, it's no longer relevant for most servers

Stacking services with different usage characteristics to increase utilization of underlying hardware is still relevant. I wouldn't be surprised if enabling very commonly included libraries to be loaded dynamically could save significant memory across a fleet .. and while the standard way this is done is fragile, it's not hard to imagine something that could be as reliable as static linking, esp in cases where you're using something like buck to build the world on every release anyway


it was never relevant for servers. and there are probably still fewer servers than end-user systems out there, certainly true if you include mobile (there are arguments for and against that).


servers vastly, almost totally, outnumber end-user systems, in terms of deployed software

end-user systems account for at best single-digit percentages of all systems relevant to this discussion

(mobile is not relevant to this discussion)


binary size is also memory size. memory size matters. applications sharing the same libraries can be a huge win for how much stuff you can fit on a server, and that can be a colossal time/money/energy saver.

yes: if you're a company that tends to only run 1-20 applications, no, the memory savings probably won't matter to you. that matches quite a large number of use cases. but a lot of companies run way more workloads than anyone would guess. quite a few just have no cost-control and/or just don't know, but there's probably some pretty sizable potential wins. it's even more important for hyper-scalers, where they're running many many customer processes at a time. even companies like facebook though, i forget the statistic, but sometime in the last quarter there was a quote saying like >30% of their energy usage was just powering ram. willing to bet, they definitely optimize for binary size. they definitely look at it.

there's significant work being put towards drastically reducing scale of disk/memory usage across multiple containers, for example. composefs is one brilliant very exciting example that could help us radically scale up how much compute we can host. https://news.ycombinator.com/item?id=34524651

i also haven't seen the very important very critical other type of memory mentioned, cache. maybe we can just keep paying to add DRAM forever and ever (especially with CXL coming across the horizon), but the SRAM in your core-complex will almost always tend to be limited (although word is Zen4 might get within striking distance of 1GB which is EPIC). static builds are never going to share cache effectively. the instruction cache will always be unique per process. the most valuable expensive fancy memory on the computer is totally trashed & wasted by static binaries.

there's really nothing to recommend about static binaries, other than them being extremely stupid. them requiring not a single iota of thought to use is the primary win. (things like monomorphic optimization can be done in dynamic libraries with various metaprogramming & optimizing runtimes, hopefully ones that don't need to keep respawning duplicate copies ad nauseam.)

i do think you're correct about the dominant market segment of computing, & you're speaking truthfully to a huge % of small & mid-sized businesses, where the computing needs are just incredibly simple & the ratio of processes to computers is quite low. their potential savings are not that high, since there's just not that much duplicate code to keep dynamically linking. but i also think that almost all interesting upcoming models of computing emphasize creating a lot more smaller lighter processes, that there are huge security & manageability benefits, and that there's not a snowball's chance in hell that static-binary style computing has any role to play in the better possible futures we're opening up.


you're very sensitive to the costs of static linking but i don't think you see the benefit

the benefit is that a statically linked binary will behave the same on all systems and doesn't need any specific runtime support above or beyond the bare minimum

this is important if you want a coherent deployment model at scale -- it cannot be the case that the same artifact X works fine on one subset of hosts, but not on another subset of hosts, because their openssl libraries are different or whatever

static linking is not stupid, it doesn't mean that hosts can only have like 10 processes on them, it doesn't imply that the computing needs it serves are simple, quite the opposite

future models of computing are shrinking stuff like the OS to zero, the thing that matters is the application, security (in the DLL sense you mean here) is not a property of a host, it's a property of an application, it seems pretty clear to me that static linking is where we're headed, see e.g. containers


Thanks for the information about SunOS. My point still stands: the C ecosystem makes it possible in a way that other language models simply don't.

> Then I would urge you to reconsider.

Done. No change to my beliefs.


Absolutely. As soon as it started to seem like even a couple hundred JARs wouldn't put a significant strain on the filesystem having to house them, the typical deployment switched to Docker images and, on top of the hundreds of JARs, started to bundle in the whole OS userspace. Which also, conveniently, makes memory use explode because shared libraries are no longer shared.

This would definitely sound like a conspiracy theory, but I'm quite sure that hardware vendors see this technological development as, at least, a fortunate turn of events...


when someone writes a program and offers it for other people to execute, it should generally be expected to work

the size of a program binary is a distant secondary concern to this main goal

static compilation more or less solves this primary requirement, at the cost of an increase to binary size that is statistically zero in the context of any modern computer, outside of maybe embedded (read: niche) use cases

there is no meaningful difference between a 1MB binary or a 10MB binary or a 100MB binary, disks are big and memory is cheap

the optimization of dynamic linking was based on costs of computation, and a security model of system administration, which are no longer valid

there's no reason to be offended by this, just update your models of reality and move on


Wait until you use a single board computer with a 4GB emmc OS disk. And don't forget about bandwidth...


i have a few devices like that around, but the thing is that the software i put on them is basically unrelated to the software that's being discussed here

definitely i am not using buck or bazel or whatever to build binaries that go on those little sticks


sure, but people are suggesting statically linking everything, and many modern languages don't really support dynamic linking.


I never had a problem before. The people saying we need this for convenience felt detached & wrong from the start.

It's popular to be cynical & conservative, to disbelieve. That has won the day. It doesn't do anything to convince me it was a good choice or actually helpful, that we were right to just give up.


"wrong" or "a good choice" or "actually helpful" are not objective measures, they are judged by a specific observer, what's wrong for you can be right for someone else

i won't try to refute your personal experience, but i'll observe it's relevant in this discussion only to the extent that your individual context is representative of consumers of this kind of software in general

that static linking provides a more reliable end-user experience vs. dynamic linking is hopefully not controversial, the point about security updates is true and important but very infrequent compared to new installations


Sometimes things just don’t have good solutions in one space. We solved it in another space, as SSD and RAM manufacturers made memory exponentially cheaper and more available over the last few decades.

So we make the trade off of software complexity for hardware complexity. Such is how life goes sometimes.


> with minimal runtime dependencies

You’re probably thinking of static binary. I believe that OP is comparing a single binary vs installing the whole toolchain of Python/Ruby/Node and fetching the dependencies over the wire.


If it's not a statically linked binary, then the problem is just as bad as it is with Python dependencies: instead, now you need to find the shared libraries that it linked with.


We've had decades to figure this out, and none of the "solutions" work. Meanwhile, the CRT for Visual Studio is 15MB. If every app I installed grew by 15MB I don't think I would notice.


Imagine if every Qt program included all of the Qt shared libraries.


On Windows they do.


Dynamic linking is an artifact of C, not some sort of universal programming truth.


Dynamic linking originated with Multics (https://en.wikipedia.org/wiki/Multics) and MTS (https://en.wikipedia.org/wiki/MTS_system_architecture), years before C even existed. Unix didn't get dynamic linking until the 1980s (https://www.cs.cornell.edu/courses/cs414/2001FA/sharedlib.pd...).

The impetus for dynamic linking on Multics and MTS was the ability to upgrade libraries without having to recompile software, and to reuse code not originally designed or intended for, let alone compiled with, the primary program (e.g. code from different compilers or languages). Both of these reasons still pertain, notwithstanding that some alternatives are more viable (e.g. open source code means less reliance on binary distribution).


neither of those reasons still pertain, really

"the primary program" is the atomic unit of change, it is expected that each program behaves in a way that is independent of whatever other files may exist on a host system


I feel so lucky that I found waf[1] a few years ago. It just... solves everything. Build systems are notoriously difficult to get right, but waf is about as close to perfect as you can get. Even when it doesn't do something you need, or it does things in a way that doesn't work for you, the amount of work needed to extend/modify/optimize it to your project's needs is tiny (minus the learning curve ofc, but the core is <10k lines of Python with zero dependencies), and doesn't require you to maintain a fork or anything like that.

The fact that the Buck team felt they had to do a from scratch rewrite to build the features they needed just goes to show how hard it is to design something robust in this area.

If there are any people in the Buck team here, I would be curious to hear if you all happened to evaluate waf before choosing to build Buck? I know FB's scale makes their needs unique, but at least at a surface level, it doesn't seem like Buck offers anything that couldn't have been implemented easily in waf. Adding Starlark, optimizing performance, implementing remote task execution, adding fancy console output, implementing hermetic builds, supporting any language, etc...

[1]: https://waf.io/
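For flavor, a complete waf build script for a single C++ program is just a few lines of Python (a minimal sketch; see the waf book for the real thing):

    # wscript -- minimal waf build script
    def options(opt):
        opt.load("compiler_cxx")

    def configure(conf):
        conf.load("compiler_cxx")

    def build(bld):
        bld.program(source="main.cpp", target="app")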


> If there are any people in the Buck team here, I would be curious to hear if you all happened to evaluate waf before choosing to build Buck?

There’s no way Waf can handle code bases as large as the ones inside Facebook (Buck) or Google (Bazel). Waf also has some problems with cross-compilation, IIRC. Waf would simply choke.

If you think about the problems you run into with extremely large code bases, then the design decisions behind Buck/Bazel/etc. start to make a lot of sense. Things like how targets are labeled as //package:target, rather than paths like package/target. Package build files are only loaded as needed, so your build files can be extremely broken in one part of the tree, and you can still build anything that doesn’t depend on the broken parts. In large code bases, it is simply not feasible to expect all of your build scripts to work all of the time.

The Python -> Starlark change was made because the build scripts need to be completely hermetic and deterministic. Starlark is reusable outside Bazel/Buck precisely because other projects want that same hermeticity and determinism.

Waf is nice but I really want to emphasize just how damn large the codebases are that Bazel and Buck handle. They are large enough that you cannot load the entire build graph into memory on a single machine—neither Facebook nor Google have the will to load that much RAM into a single server just to run builds or build queries. Some of these design decisions are basically there so that you can load subsets of the build graph and cache parts of the build graph. You want to hit cache as much as possible.

I’ve used Waf and its predecessor SCons, and I’ve also used Buck and Bazel.


With Buck2, memory taken for the graph is a concern, but it fits into a single host's RAM.


Interesting. I know that for Buck 1, some workloads didn’t fit entirely in RAM.


I get that, but again, there's no reason Waf can't be used as a base for building that. I actually use Waf for cross compilation extensively, and have built some tools around it with Conan for my own projects. Waf can handle cross compilation just fine, but it's up to you to build what that looks like for your project (a common pattern I see is custom Context subclasses for each target)

Memory management, broken build scripts, etc. can all be handled with Waf as well. In the simplest case, you can just wrap a `recurse` call in a try catch block, or you can build something much more sophisticated around how your projects are structured.

Note, I'm not trying to argue that Google/Facebook "should have used X". There are a million reasons to pick X over Y, even if Y is the objectively better choice. Sometimes, molding X to be good enough is more efficient than spending months just researching options hoping you'll find Y.

I'm just curious to know if they did evaluate Waf, why did they decide against it.


I don’t see how using Waf as a base would help in any way. It seems like a massive mismatch for the problems that Facebook and Google are solving. You seem to be fond of Waf, maybe if you elaborated why you think that Waf would be a good base for compiling massive, multi-language code-bases, I could understand where you are coming from. Where I am coming from—it feels like Waf is kind of a better version of autotools, or something like that, and it’s just not in the same league. It’s like comparing a bicycle to a cargo ship. Like, “Why didn’t the people designing the cargo ship use the bicycle as a starting point?” I don’t want to abuse analogies here, but that’s what the question sounds like to me. This is based on my relatively limited experience using Waf (and SCons, which I know is different), and my experience using Bazel and Buck.

Having spent a lot of time with Buck and Bazel, there are just so many little things you run into where you go, “Oh, that explains why Buck or Bazel is designed that way.” These design decisions permeate Buck and Bazel (Pants, Please, etc.)

I just don’t see how Waf can be used as a base. I really do see this as a new “generation” of build systems, with Buck, Bazel, Please, and Pants, and everything else seems so much more primitive by comparison.


I’m coming from the perspective of someone who has been working with it for a while, and coincidentally very intensely hacking away at it recently.

The thing about waf is that it’s more designed like a framework than a typical build tool. If you look at the code, it’s split into a core library (that’s the <10k LOC I estimated), and additional tools that do things like add C++ or Java build support.

That’s one of the reasons I like Waf, since it becomes a powerful toolkit for creating a custom build system once you strip away the thin outer layer. There is no one-size-fits-all build system, so a tool that can be molded like waf is very powerful imo.

I guess it’s hard to get that point across without experiencing it. There are just so many good design decisions everywhere. For example, extensibility comes easily because task generator methods are “flat”, and ordering is implemented via constraints. This means you can easily slip your own functions between any built in generator method to manipulate their inputs or outputs. It’s like a sub-build system just for creating Task objects.

Also, I don’t want to give the impression that I think waf would have been a better choice for these companies. I’ve kind of been defending it a lot in this thread, but my original point/question was just to know if they evaluated waf/what they thought about it. After so many comments I feel like I might be coming off as hostile… which isn’t my intention.


I’m not trying to react to your comments as if they’re hostile, just hope to clear the air. I like defending Buck and Bazel a little bit, and at the same time, I really recognize that they are painful to adopt, don't solve everyone’s problems, etc.

Waf does seem like a “do things as you like” framework, and I think that notion is antithetical to the Buck and Bazel design ethos. Buck and Bazel’s design are, “This is the correct way to do things, other ways are prohibited.” You fit your project into the Buck/Bazel system (which could be a massive headache for some) and in return you get a massive decrease in build times, as well as some other benefits like good cross-compilation support.

One fundamental part of the Buck/Bazel design is that you can load any arbitrary subset of the build graph. Your repository has BUILD files in various directories. Those directories are packages, and you only load the subset of packages that you need for the targets you are actually evaluating during evaluation time. You can even load a child package without loading the parent—like, load //my/cool/package without loading //my/cool or //my.

The build graph construction also looks somewhat different. There is an additional layer. In build systems like Waf, you have some set of configuration options, and the build scripts generate a graph of actions to perform which create the build using that configuration. In Buck/Bazel, there is an additional layer—you create a platform-agnostic build graph first (targets, which are specified using rules like cc_library), and then there’s a second analysis phase, which converts rules like “this is a cc_library” into actual actions like “run GCC on this file”.
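To make that extra layer concrete, here is a rough sketch of what a rule definition looks like in the Buck2 style. It is loosely modeled on Buck2's Starlark API, and the exact names (declare_output, DefaultInfo, etc.) are assumptions that may differ from the real thing:

    # A declared target ("this is a my_cc_library") only becomes concrete
    # actions ("run gcc on this file") during the analysis phase.
    def _my_cc_library_impl(ctx):
        out = ctx.actions.declare_output(ctx.attrs.name + ".o")
        ctx.actions.run(
            cmd_args("gcc", "-c", ctx.attrs.srcs, "-o", out.as_output()),
            category = "compile",
        )
        return [DefaultInfo(default_output = out)]

    my_cc_library = rule(
        impl = _my_cc_library_impl,
        attrs = {"srcs": attrs.list(attrs.source())},
    )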

These extra layers are there, as far as I can tell, to support the goals of isolating different parts of your build system from each other. If they’re isolated, then you have better confidence that they will produce the same outputs every time, and you can make more of the build process parallelizable—not just the actual build actions, but the act of loading and analyzing build scripts.

I do think that there is room to appreciate both philosophies—the “let’s make a flexible platform” philosophy, and the “let’s make a strict, my-way-or-the-highway build system” philosophy.


> the core is <10k lines of Python with zero dependencies

Isn't that already a no-go, writing a performance-critical system in a slow programming language?


I am no Python fan, but I find it laughably hard to believe it could be what makes a build coordination system slow.


On clean builds the Python tax will be dwarfed by the thousands of calls to clang, yes. But that's not the scenario you need to optimize for. What's more important is that incremental builds are snappy, since that is what developers do 100 times per day.

I've seen some projects with 100MB+ ninja files where even ninja itself, which prides itself on being optimized C++, takes a second or two to parse them on each build invocation. Convert that to Python and you'd likely land in the 5-20 second range instead. Enough to alt-tab and get distracted by something else. Google's code base is likely even larger than this.

A background daemon that holds the graph in memory would probably handle it, and in the big scheme of things such a design is likely better anyway. But it needs a big upfront design and is a lot more complex than just reparsing a file each time.

Side note: for some, even the interpreter startup is annoying. Personally I find it negligible; especially after 3.11 you can almost claim it's snappy.


Codebases that big are straw men for most companies. Yes, they happen, but just as often they should be segmented into smaller pieces that don't require monolithic build setups.


The context for this thread was whether Facebook considered waf in particular, so it is very relevant.


Certainly fair. I had meandered on to "in general" way too quickly.


> They are large enough that you cannot load the entire build graph into memory on a single machine

You mean, multiple gigabytes for build metadata that just says things like "X depends on Y" and "to build Y, run command Z"?


Yes, the codebases at Google and FB contain billions of files. Article from 2016 about the scale, and of course it's only grown dramatically since then: https://m-cacm.acm.org/magazines/2016/7/204032-why-google-st...


Yes. By “multiple gigabytes” I am talking about >100 GB. Maybe >1 TB.


How is this even possible? I take it that this data is highly compressible, right?


It wouldn’t be compressed in ram though, would it?


There are in-memory succinct data structures (https://en.wikipedia.org/wiki/Succinct_data_structure), but I actually don't mean that specifically: I mean that, for example, there must be tons of strings with common prefixes, like file paths (which can be stored in a trie for faster access and to compress the data in RAM) or very similar strings (like compiler invocations that mostly share the same flags), and other highly redundant data that can usually be exploited to cut down on memory requirements.
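
As a toy illustration of the path-prefix point (plain Python; not from any real build system):

  class TrieNode:
      def __init__(self):
          self.children = {}     # path component -> TrieNode
          self.terminal = False  # True if a full path ends here

  def insert(root, path):
      # Shared prefixes like "src/main/java/..." are stored only once,
      # no matter how many paths pass through them.
      node = root
      for part in path.split("/"):
          node = node.children.setdefault(part, TrieNode())
      node.terminal = True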

I highly doubt that, after doing all those tricks, you still end up with 100GB - 1TB of build data.


You could do those tricks and cut down memory, perhaps even 10x, but they come at the cost of increased CPU time. Designing the system in such a way that you only ever need to load a tiny subset of the graph at one time gives you a 1000x saving for memory and CPU.


Some of those tricks may actually decrease CPU time (by fetching less data from RAM and using the CPU cache more effectively). And you can also apply any optimizations for partial loading on top of that.

I guess the downside is that the system would be more complex overall, but you can probably get 80% of the result with relatively small changes.


I don't know if they considered waf specifically, but the team is definitely very familiar with the state of the art: https://www.microsoft.com/en-us/research/uploads/prod/2018/0...

One of the key requirements is that Buck2 had to be an (almost) drop-in replacement for Buck1 since there's no way we could reasonably rewrite all the millions of existing build rules to accommodate anything else.

Also Buck needs to support aggressive caching, and doing that reliably puts lots of other constraints on the build system (eg deterministic build actions via strong hermeticity) which lots of build systems don't really support. It's not clear to me whether waf does, for example (though if you squint it does look a bit like Buck's rule definitions in Starlark).


I truly believe any build system that uses a general-purpose language by default is too powerful. It lets people do silly stuff too easily. Build systems (for projects with a lot of different contributors) should be easy to understand, with few, if any, project specific concepts to learn. There can always be an escape hatch to python (see GN, for example), but 99% of the code should just be boring lists of files to build.


You cannot magic away complexity. Large systems (think thousands of teams with hundreds of commits per minute) require a way to express complexity. When all is said and done, you'll have a Turing-complete build system anyway, so why not go with something readable?


No no no no. The more code you have, the more you have to constrain the builds.

I understand where the sentiment comes from, having seen one too many examples of people struggling to implement basic logic in CMake or Groovy that would be a one-liner in Python. But completely opening up the floodgates is not the right solution.

Escape hatches into GP languages can still exist, but the interfaces to them need to be strict, and it's better that people see this boundary clearly rather than limping along trying to do general-purpose programming inside CMake and failing on correctness anyway. Everything else should, as the parent says, just be a list of files.

Dependencies need to be declarative and operations hermetic.

Otherwise the spaghetti of complexity will just keep growing. Builds and tests will take forever because there is no way to detect which change affects which subsystem or what can be parallelized, and it's even worse when incremental builds stop working.

By constraining what can be done, you also empower developers to do whatever they want, within said boundaries, without having to go through an expert build team. Think about containers: they allowed every team to ship whatever they wanted without consulting the ops team.


> The more code you have, the more you have to constrain the builds.

That works if you have one team, or if all teams work the same way. If you have multiple teams with conflicting requirements[1], you absolutely should not constrain the build, because you'd be getting in the way.

1. E.g. Team A uses an internal C++ lib in an online service and prefers an evergreen version of it to be applied automatically with minimal human involvement. Team B uses the same lib on physical devices shipped to consumers/customers. Updates are infrequent (annual) but have to be tested thoroughly for qualification. Now your build system has to support evergreen dependencies and versioned ones. If you drop support for either, you'll be blocking one team or the other from doing work.
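
A hypothetical sketch of what supporting both could look like in a Bazel-style WORKSPACE (the names and URLs are invented):

  load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

  # Team A: evergreen -- always the latest published archive.
  http_archive(
      name = "corelib",
      urls = ["https://internal.example/corelib/latest.tar.gz"],
      # no pinned hash: each build picks up whatever is current
  )

  # Team B: pinned -- one qualified release, bumped roughly annually.
  http_archive(
      name = "corelib_qualified",
      urls = ["https://internal.example/corelib/v2023.1.tar.gz"],
      sha256 = "...",  # hash of the qualified release goes here
  )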


On the contrary, large systems have to restrict what their build system does because otherwise the complexity becomes unmanageable. I used to work on a large codebase (~500 committers, ~10MLOC) that had made the decision to use Gradle because they thought they needed it, but then had to add increasingly strict linters/code review/etc. to the gradle build definitions to keep the build maintainable. In the end they had a build that was de facto just as restricted as something like Maven, and the Turing completeness of Gradle did nothing but complicate the build and slow it down.

And sure, maybe having a restricted build definition (whether by using a restricted tool or by doing code review etc.) moves the complexity somewhere else, like into the actual code implementation. But it's easier to manage there. The build system is the wrong place for business logic, because it's not somewhere most programmers ever think to look for it.


I seriously doubt there's a single repo on the planet that averages hundreds of commits per minute. That's completely unmanageable for any number of reasons.


According to [1], in 2015 Google averaged 25 commits per minute (250000/7/24/60). I can imagine hundreds per minute during Pacific working hours today.

[1] https://cacm.acm.org/magazines/2016/7/204032-why-google-stor...


In the case of a monorepo, that's exactly why the build system shouldn't be overly complex. If you're expecting random people to make changes to your stuff, you shouldn't be burdening them with more complexity than necessary.

The monorepo case is also a little bit outside what I was originally talking about. I was mostly referring to individual services/libraries/apps.


It wouldn't surprise me at all if some large repos at Google or Facebook now get to that many, it's easy to do once you have robots committing code (usually configuration changes).


I didn't mean on average, but the build tool has to handle the worst case and I probably am understating the worst case.

I'd bet there are more than a few repos that do get (at least) hundreds of commits as a high-water mark. My guess is lots of engineers + monorepo + looming code-freeze deadline can do that like clockwork.

Edit: Robots too as sibling pointed out. A single human action may result in dozens of bot-generated commits


IMO there's almost never a good reason to have automated commits in repos outside of two cases:

1) Automated refactoring

2) Automated merges when CI passes

Configs that can be generated should just be generated by the build.

But that's a different topic


> IMO there's almost never a good reason to have automated commits

This depends entirely on the quality of dev tools available.

Also, commit =/= shipped code: you may have automated commits and keep a human in the loop before shipping, by way of a rejectable pull request (or the proprietary equivalent).

A simple library upgrade will result in a wave of bot-authored commits/PRs:

1. Human makes a change to a core library, changing it from v1 to v2

2. Bot identifies all call-sites and refactors to v2-equivalent, creating 50 PRs for 50 different teams.

One change, 51 commits.


There are at least two other hugely important use cases you missed:

- automatic security / vendoring updates (e.g. https://github.com/renovatebot/renovate)

- automated cross-repo syncs, e.g. Google has processes and tools that bidirectionally sync pieces of Google3 with GitHub repos


The problem with build systems is their users, for exactly the reason you say. For a man with a hammer, every problem is a nail. Developers don't think of build systems in the right way. If you're doing something complex in your build, it should surely be a build task in its own right.


I agree, that’s also pretty much why Starlark exists. However, there are many cases where you do need complex build logic.

Personally, I always go for declarative CMake first, then waf as soon as I find my CMakeLists looking like something other than just a list of files.

I’ve considered before creating a simple declarative language to build simple projects like that with waf, but I don’t like the idea of maintaining my own DSL for such little benefit, when CMake works just fine, and everyone knows how to use it. I feel like I’d end up with my own little TempleOS if I decided to go down that rabbit hole.


I'd like to agree, but every significant project has something weird that its build system doesn't have built in yet, so you need some way to extend it. Useful build systems end up supporting lots of hacks.

That said, the more your build system makes easy without having to write code, the better.


While this is true, this isn't a problem for your buck/bazel/pants-likes. Between genrules and custom rules you can do this with all the power you (usually) need.


They are the bane of any DevOps/Build Engineer when trying to fix build issues.


I think I would agree as well. So I’m not sure how that makes me feel about nix.


Nix is different because no one's smart enough to figure out how to do silly things ;)


Nix is Turing complete, but it's not a general purpose language. It is designed as a DSL for building software, and I think it's pretty nice for that.

The Nickel rationale doc has some thoughts on why this might be the right call: https://github.com/tweag/nickel/blob/master/RATIONALE.md#tur...

From my (limited) experience with another deliberately limited configuration DSL (CUE), I think more power in such DSLs will pan out better in the long run. Of course, it's not all one or the other: a powerful build DSL can still enforce useful discipline, and a Turing-complete language can still be thoughtfully designed around a special purpose. I think Nix demonstrates both pretty well, actually.


Nix forces you to serialize every build step (what it calls a "derivation"), and moreover it isolates the build environment to only include things built with Nix or verified by hash. So while there is a lot of power, the only thing you can do with that power is produce derivations which themselves actually run the build.

Contrast this with Gradle, which is currently digging itself out of a hole by forcing authors to declare all inputs and outputs of their tasks so it can serialize them, but you can literally do anything Java can throughout the entire process. This is the kind of Herculean task which is neatly sidestepped by tightly controlling the DSL environment (inputs/outputs) as does Nix.


And the best part about waf? The explicit design intent that you include the build system with the source code. This gets rid of all the problems with build systems becoming backwards/forwards incompatible, and of the issues when a developer works on one project using build system v3.9 and another that uses build system v4.6.

With waf, the build system is trivially included in the source, and so your project always uses the right version of waf for itself.


I could be wrong, as I haven't dug into the waf docs too much, but I think the major difference between waf and Buck is the ability to handle dependency management between various projects in a large org.

The documentation and examples for waf seem to be around building one project, in one language, with an output of statistics and test results. I am sure this is a simplification for education and documentation purposes, but it does leave a vague area around "what if I have more than 1 or 2 build targets + 5 libs + 2 apps + 3 interdependent helper libraries?"

Buck seems to be different in that it does everything waf does but also has clear `dep` files to map dependencies between various libraries within a large repository with many, many different languages and build environments.

The key thing here being, I suspect that within Meta's giant repositories of various projects, they have a tight inter-linking between all these libraries and wanted build tooling that could not only build everything, but be able to map the dependency trees between everything as well.

Pair that with a bunch of consolidated release mapping between the disparate projects and their various links and you have a reason why someone would likely choose Buck over waf purely from a requirements side.

As for another reason they likely chose Buck over waf: waf appears to be a capable but lesser-known project in the wider dev community. I say this because when I look into waf, I mostly see it compared against CMake; its mindshare resides mostly among C++ devs. Either because of NIHS (not-invented-here syndrome) or fear that the project wouldn't be maintained over time, Meta may have decided to just roll their own tooling. They seem to be really big on the whole "being the SDK of the internet" as of late. I could see them not wanting to depend on an independent BSD-licensed library they don't have complete control over.

These are just my thoughts, I could be completely wrong about everything I've said, but they're my best insights into why they likely didn't consider waf for this.


It’s true that Waf doesn’t come with dependency management out of the box (EDIT: unless you count pkg-config), so maybe that’s why (besides NIHS). The way I handle it is with another excellent project called Conan (https://conan.io/)

However, if you’re going to build a custom package management system anyways, there’s no reason you couldn’t build it on top of waf. Again, the core is tiny enough that one engineer could realistically hold the entire thing in their head.

But I don’t think we’re going to get it right speculating here lol. I’m sure there was more to it than NIHS, or being unaware of waf.


A number of things like being written in python start to matter at big scale. I love python, but cli startup time in python is actually a concern for apps used many times daily by many engineers.

Fixing that or moving to a daemon or whatever starts to take more time than just redoing it from scratch, and if the whole thing is 10k lines of python, it's something a domain expert can mostly reimplement in a week to better serve the fb specific needs.


I've been using Waf for a couple of years, including on retro thinkpads from ~08. I've never run into issues with the startup time for waf and/or Python. Even if the interpreter were 100x slower to start and execute than it currently is, that time would be negligible next to the time spent waiting for a compiler or other build task to complete.

And if it is too slow, there's profiling support for tracking down bottlenecks, and many different ways to optimize them. This includes simply optimizing your own code, or changing waf internal behavior to optimize specific scenarios. There's even a tool called "fast_partial" which implements a lot more caching than usual project-wide to reduce time spent executing Python during partial rebuilds in projects with an obscene number of tasks.

> Fixing that or moving to a daemon or whatever starts to take more time than just redoing it from scratch, and if the whole thing is 10k lines of python, it's something a domain expert can mostly reimplement in a week to better serve the fb specific needs.

Well, considering Buck just went through a from-scratch rewrite, I would argue otherwise. Although, to be fair, that 10k count is just for the core waflib. There are extra modules to support compiling C/C++/Java/etc for real projects.

(also, waf does have a daemon tool, but it has external dependencies so it's not included by default)


> Well, considering Buck just went through a from-scratch rewrite, I would argue otherwise

Based on what, the idea that waf fits their needs better than the tool they wrote and somehow wouldn't need to be rewritten or abandoned?

> Even if the interpreter were 100x slower to start and execute than it currently is, that time would be negligible next to the time spent waiting for a compiler or other build task to complete.

This wrongly assumes that clean builds are the only use case. Keep in mind that in many cases when using buck or bazel, a successful build can complete without actually compiling anything, because all of the artifacts are cached externally.

> There's even a tool called "fast_partial" which implements a lot more caching than usual project-wide to reduce time spent executing Python during partial rebuilds in projects with an obscene number of tasks

Right, the point that this is a concern to some people, and that there's clearly some tradeoff here such that it isn't the default immediately rings alarm bells.


No offense, but I think you're reading too much into my casual comments here to guide your understanding of waf, rather than the actual waf docs. Waf isn't optimized for clean builds (quite the contrary), and neither you nor I know whether the waf defaults are insufficient for whatever Buck is being used for. I just pointed out the existence of that "fast_partial" thing to show how deep into waf internals a project-specific optimization effort could go.

But discussions about optimization are pointless without real world measurements and data.


The fact that it's implemented and not on by default is a red flag any way you slice it. Either it's implemented but unreliable, or it's reliable but the maintainers don't think it's worth turning on for some reason (why?).


Exactly. One of the key selling points of Bazel/Buck is their caching: a very high cache hit rate with no inconsistency, which allows very fast incremental builds. A zero-change build takes close to zero seconds.


Just imagine how much memory a large dependency graph would take in Python...

Especially considering how poor Python's support for shared memory concurrency is...


Waf bills itself as "the meta build system". But Buck2 is "the Meta build system". :)


waf looks pretty nice but does it have a remote cache? For me the biggest argument for Bazel is the remote caching, and not having it is a bit of a deal breaker IMO


It's probably more about better caching, but using Buck2 internally at Meta reduced my build times from minutes to seconds. A very welcome upgrade.


For what language?


Python mainly.


I'm missing some historical context here. This article goes out of its way to compare and contrast with Bazel. Even the usage conventions, build syntax (Starlark), and RBE API are the same as in Bazel.

Did FB fork Bazel in the early days but retain basically everything about it except the name? Why didn't they just...adopt Bazel, and contribute to it like any other open source project?


One thing you might be missing is that this is Buck2.

Buck (https://github.com/facebook/buck) has been open sourced for nearly 10 years now.

The lore I've heard is that former Googlers went to Facebook, built Buck based on Blaze, and Facebook open sourced that before Google open sourced Blaze (as Bazel).

The first pull to the Buck github repo was on May 8, 2013 (https://github.com/facebook/buck/pulls?q=is%3Apr+sort%3Acrea...). The first to Bazel was Sep 30, 2014 (https://github.com/bazelbuild/bazel/pulls?q=is%3Apr+sort%3Ac...).


I was visiting the Google Munich office on the day Google open sourced Blaze/Bazel. The Facebook Buck team sent a congratulations cake: https://photos.app.goo.gl/6KwE6qeD3i72kSo38

Amusingly, the cake was in German but most of the Bazel team didn't really speak German. But it was yummy.


Smells like what FB did with Caffe vs. Caffe2, the two of which have nothing to do with each other.


Blaze is very old (from 2006), the history is described here: https://mike-bland.com/2012/10/01/tools.html#blaze-forge-src...

In the years that followed folks left Google and joined other companies and created similar build systems because blaze had a lot of advantages at scale. Facebook made Buck, Twitter made Pants. Blaze was still closed source inside Google. They all used the same python looking language.

In 2012 Twitter open sourced Pants: https://blog.twitter.com/engineering/en_us/a/2016/the-releas...

In 2013 Facebook open sourced Buck: https://en.m.wikipedia.org/wiki/Buck_(software)

In 2015 Google finally open sourced most of blaze, but renamed it bazel for copyright reasons. Some might argue they waited too long because clearly there was a lot of demand for such a system. :)

After that Twitter (mostly?) migrated to bazel and Facebook sort of stalled out on Buck. But then recently they decided to rewrite it from scratch to fix a lot of the architecture problems resulting in Buck2.

Buck2 looks pretty impressive and hopefully it gets the bazel folks moving faster. For example the analysis phase in bazel is very slow even inside Google, and Buck2 shows an alternative design that's much faster.


The direction the Bazel team seems to be going in is shortening the wall clock time by allowing for concurrent analysis and execution: https://github.com/bazelbuild/bazel/issues/14057.


At the time that FB started writing Buck, Bazel was not open source. I believe it did exist as Blaze internally at Google before FB started writing Buck. Facebook open sourced Buck before Google open sourced Blaze as Bazel.

Over time Facebook has been working to align Buck with Bazel, e.g. the conversion to Starlark syntax so tools such as Buildozer work on both systems. I believe Buck2 also now uses the same remote execution APIs as Bazel, but don't quote me on that.


Blaze already existed when I was an intern in 2007.


Buck far predates Bazel, and was built by ex-googlers replicating Blaze.

Skylark was a later evolution, after the python scripts grew out of control, and a cue that fb took from Google long after Buck had been near-universally deployed for several years.


Remote Execution is just a gRPC protocol -- bazel, buck1 and others implement it.


Hrmm, it makes performance claims with regard to Buck1 but not to Bazel, the obvious alternative. Hardly anyone uses Buck1 so you'd think it would be relevant.


I have a non-toy multi-language project in https://github.com/dtolnay/cxx for which I have both Buck2 and Bazel build rules.

On my machine `buck2 clean && time buck2 build :cxx` takes 6.2 seconds.

`bazel clean && time bazel build :cxx` takes 19.9 seconds.


That's cool. I was not able to repro due to the buck2 instructions not working for me, in two different ways:

    Compiling gazebo v0.8.1 (/home/jwb/buck2/gazebo/gazebo)
  error[E0554]: `#![feature]` may not be used on the stable release channel
    --> gazebo/gazebo/src/lib.rs:10:49

Then with the alternate instructions:

  error: no such command: `+nightly-2023-01-24`
  Cargo does not handle `+toolchain` directives.
  Did you mean to invoke `cargo` through `rustup` instead?


It looks like you don't have rustup.rs. You will need to install that since Buck2 is depending on nightly Rust features.


FWIW buck2 doesn't seem to build with nightly-2023-01-24 at this time, nightly-2023-03-15 worked for me. (Nightlies from April cause an internal compiler error.)


Anyway regardless of the fact that my local Rust environment isn't cut out to repro your result, how much of that speedup is due to parallelism that Buck2 offers and Bazel does not? When I build your cxx repo with bazel and load the invocation trace, the build was fully serialized.


It's honestly hard to measure at the scale of Meta. Just making everything compatible with Bazel would be a non-trivial undertaking.

Also that seems an interesting thing an independent person could write about, but whatever claims Meta made on a topic like that would be heavily scrutinized. Benchmarking is notoriously hard to get right and always involves compromises. It's probably not worth making a claim vis a vis a "competitor" and triggering backlash. If it's significantly faster than Bazel that will get figured out eventually. If not the tool really is aimed at Buck1 users upgrading to Buck2 so that is the relevant comparison.


I wonder if it's just because they don't have the same scale of data, since FB as a company uses Buck1/Buck2 but not Bazel?

They've clearly learned from Bazel though! I like the idea of not needing Java to build my software, and Starlark is battle tested / might make transitioning off Bazel easier.


The author of Bazel came over to FB and wrote Buck from memory. In Google it’s called Blaze. Buck2 is a rewrite in rust and gets rid of the JVM dependence, so it builds projects faster but it’s slow to build buck2 itself (Rust compilation)


I believe this is an over simplification. Engineers who had used Blaze at Google reimplemented it at Facebook based on what they knew of how it worked.

Even Facebook's Buck launch blog does not offer this story of Buck's lineage, and although the author worked on the Closure compiler at Google, that is not Blaze.

https://engineering.fb.com/2013/05/14/android/buck-how-we-bu...


Does anyone know how IDE support for Buck2 is? I couldn't find anything except some Xcode config rules. Half the battle with Bazel/Buck/etc. is getting an IDE or LSP to work for C++/Java/Kotlin/Swift/etc., because those tools don't really work out of the box.


The vscode bazel plugin is basically completely abandoned. There's an issue that has been open for 3 years asking to add intellisense support for C++. Seems completely ludicrous to put in the massive effort it took to build Bazel and then fumble at the goal line by not supporting vscode.


I think the recommendation for c/c++ in Bazel is to use this: https://github.com/hedronvision/bazel-compile-commands-extra...

And use the compile_commands.json file to power clangd. I'm not a vscode person, but I would hope the vscode C++ plugin supports that.


probably use a starlark plugin?


How does that help with developing C in a Buck project?


How do the "transitive-sets (tsets)" mentioned here compare to Bazel depsets[1]? Is it the same thing with a different name, or different in some important way?

[1] https://bazel.build/rules/lib/depset


tsets are described in more detail here: https://buck2.build/docs/rule_authors/transitive_sets/. Bazel's depsets were one of the influences on their design. To users, they will seem fairly similar and would be used for solving similar problems, there's some differences in the details of the APIs.

I'm not well-versed on the internal implementation details of bazel's depsets, but one interesting thing about tsets that may further differentiate them is how they are integrated into the core, specifically that we try hard to never flatten them there. The main two places this comes up are: (1) when an action consumes a tset projection, the edges on the DICE graph (our incremental computation edges) only represent the direct tset roots that the action depends on, not the flattened full list of artifacts it represents, and (2) when we compute the input digest merkle tree for uploading an action's inputs to RE, that generally doesn't require flattening the tset, as we cache the merkle trees for each tset projection node and can efficiently merge them.
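
For readers who haven't seen them, here's a rough sketch of defining and consuming a tset, loosely following the transitive_sets docs linked above (the LinkInfo provider name is invented; details of the API may differ):

  # Define a transitive set type with a projection to command-line args.
  ArgsTSet = transitive_set(
      args_projections = {
          "link": lambda value: value,  # how one node contributes args
      },
  )

  def _impl(ctx):
      # The new node shares (rather than copies) its children's subgraphs.
      tset = ctx.actions.tset(
          ArgsTSet,
          value = ctx.attrs.name,
          children = [dep[LinkInfo].args for dep in ctx.attrs.deps],
      )
      # The projection can feed an action without flattening the set.
      link_args = tset.project_as_args("link")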


Do smaller companies (smaller than Meta and Google) use these kinds of build tools much? It seems like a system that rebuilds everything whenever a dependency changes is more suited to an environment that has very few, if any, external dependencies.

Is anyone using Buck/Bazel and also using frameworks like Spring, or React, for example?


I used Bazel for a long time, in a small team (5). No spring/react for me, but we have ~100 external dependencies across 5 languages. It was generally positive, certainly better than any system we used beforehand, and multi-language stuff was great, but the learning curve was steep and it had annoying defaults (which we soon wrapped with our own).

Regarding 3rd party packages: if you update them regularly and they are depended on by intensive builds, that's going to be the reality in any build system worth using. A build is a DAG, and if one node changes then its children must too, if you want any guarantee that the update isn't going to break fresh builds. As with any build system, if you want fewer rebuilds you need to be diligent about assigning dependencies to the things that need them (so unrelated things don't rebuild), and when things can be divided and conquered, they should be.

Bazel isn't perfect at reproducibility, and that's what made me abandon ship. It isn't perfectly sandboxed, for instance: it still uses your system compilers and still has access to your system libraries. This means you can accidentally depend on something without specifying it, which violates my requirement for true reproducibility.

These days we use Nix, which is ultimately a better environment for developing reproducible builds if you're happy to shun Windows. It is also a superior language to Starlark IMO: declarative makes much more sense than imperative in something that has to later resolve to a DAG. In terms of using it as a build system, it's lacking unless you put in the legwork. The default is to use other build systems within Nix, but that leads to a lot of unnecessary rebuilding during development, so we just spun up our own Bazel-like library for Nix to do the heavy lifting.


Bazel supports pluggable toolchains these days. We use `zig cc` via https://github.com/uber/bazel-zig-cc.


I worked at a company that was about 150 people when I joined. It's not primarily a software company but the early team had a bunch of ex-google folks, and they chose Bazel. I encountered it for the first time there. We did use React, yes.

I really liked the cross-language aspect of Bazel. Having one command that could compile everything and produce a deployable container in a highly reproducible way is great. It really cut down on "what's your compiler/tool version etc."-type back-and-forth during debugging with other engineers.

The bazel JS/TS rules were tough to work with when we first started using it for JS (2018 I think), especially since we were using create-react-app at the time, and that didn't mesh well with the way bazel wants to work. It's gotten a lot better though.

If I was making the choice from scratch in a new company/codebase, I think it'd really depend on the team. You kind of need broad-based buy-in to get the full benefits IMO.


I would heavily consider this type of system once build times become a major pain point. That often happens somewhere around 20-50 people working in one codebase, so I think this is a problem space for medium-sized companies. Truly small companies probably don't need this and should use the standard ecosystem tools, BUT if your team knows how to use it, there's little downside in starting from Buck/Bazel. Especially since you get most of the benefit if you have a nice clean DAG of your modules, and that's easy to build at the beginning and hard to refactor into later.


We are a ~10 people startup and have been using Bazel since day 1 (where I introduced Bazel and learned about it on the job).

Overall, I would say that it has been very much worth it, as it eliminates some classes of developer problems almost entirely that come up in companies of any size (e.g. "works on my machine"). I also feel when it comes to time spent setting it up, it's also a net positive over alternative systems where we would've had to spend time tuning e.g. GH Actions CI caches or make Docker build more reproducible.


Uber adopted Bazel a few years ago for their Go and Java monorepos, which was the majority of their code at the time. I don't know the state of their UI repos.


> In our internal tests at Meta, we observed that Buck2 completed builds 2x as fast as Buck1.

In my experience Buck was spending a huge amount of time in GC, so this doesn’t surprise me. It must have been (ab)using Java in such a way that massive amounts of stuff were sprayed across the heap.


Ah. The "you're holding it wrong" defense of GC :)


The dynamic dependency stuff looks very nice! It feels like a good entrypoint for systems that are "merely" wanting good build caching, and not being "so huge git falls apart" big.

My biggest gripe with Bazel is that when you're off the beaten path, it suddenly feels like the ecosystem really doesn't want you to just solve problems yourself. Meanwhile, this Buck2 documentation talks directly about adding good support for tools outside of community-provided things.

I'm still not a superfan of the awkward way that custom implementations get declared (which I think comes from needing to support super-giant projects? But it's just awkward), and all the naming suffers from Google-like "we cannot call them functions but must call them factories" NIH things... but at least there are clear docs.


I really hope the team responsible for it is called Timbuktu.


Congrats to the team! Very excited to finally get to use this.


The essential characteristics of Buck2 look very appealing - but it's hard to see this catching up with the substantial ecosystem of language support rules for Bazel.


If the language rules are all Starlark, shouldn't they be compatible?


Haven't looked at the Bazel codebase, but it would strongly surprise me if the language support rules were implemented in Starlark. More likely, I'd suppose them to be almost exclusively written in Java (and highly based on quite Bazel-specific classes), with Starlark only coming into play in the form of bindings for being used by the users for their BUILD/bzl definitions.

Edit/Add: cf the part about language rules in this comment from a person who says they're a former bazel developer: https://news.ycombinator.com/item?id=35477309


Only the C++ and Java rules are native, the rest are Starlark. I don't know Bazel internals so I can't really comment on your point about APIs.


Same language does not necessarily mean Buck2 will provide the same API as Bazel to write language rules.


As a former Bazel developer and current Bazel user, I very much like the design principles that they outline for Buck2. In particular:

* The fact that it is written in a compiled safe language is a breath of fresh air. I personally like Java the language and understand why Bazel was originally written in Java and how it has done a great job at "hiding" it, but it's still there. In particular, Java's memory and threading models have been problematic for certain scenarios. (I haven't kept up with the language advances and I believe there are new ways to fix this, but adopting them would require a major overhaul of Bazel's internals.) Plus Bazel being written in Java prevents it from being adopted in smaller projects that are /not/ written in Java--a bummer for the whole open source ecosystem.

* The complete separation of language rules from the core is great. This is something that Bazel has wanted to achieve for a long time, but they are still stuck with native C++ and Java rules (it's really hard to rewrite them apparently). Not a huge deal, but in Buck2's case, their design highlights that it's clean enough to support this from day one.

* The "single" phase execution is also nice to see. Bazel used to have three phases (loading, analysis, and execution) and later managed to interleave the first two. However, the separation is still annoying from a performance perspective, and also introduces some artifacts in the memory model.

* It's good that various remote execution facilities as well as virtual file systems have been considered from day one. These do not matter much... until they do, at which point you want the flexibility to incorporate them. Bazel used to have this in the Google-internal version (there is that ACM paper that explains this), but the external version doesn't. For example, there is a patch to support an output tree on a lazy file system courtesy of the bb-clientd project, but after years it hasn't been upstreamed yet.

* And lastly, it's also great to see that what they open sourced is what they use internally. Bazel isn't like that: Google tried to open source a "cleaner version" by removing certain warts that were considered dead ends... and that has been both good and bad. On the one hand, this has been key to developing Starlark to where it is today, but on the other, this has made it hard for certain communities to adopt Bazel (e.g. the Python rules were mostly unusable for a really long time).

Now, a question: Buck2 uses the Starlark language, but that does not imply that they implement the same Build APIs to support the rules that Bazel has. Does anyone know to what extent the rules are compatible between the two? If Buck2 supported the Bazel rules ecosystem or with minor changes, that'd be huge!


Thanks for the comments! There are two levels at which you could make Buck2/Bazel compatible:

* At the BUILD/BUCK file level. I imagine you can get close, but there are details between each other that will be hard to overcome. Buck2 doesn't have the WORKSPACE concept, so that will be hard to emulate. Bazel doesn't have dynamic dependencies, which means that things like OCaml are written as multiple rules for a single library, while Buck2 just has a single rule. I think it should be possible to define a macro layer that was common enough that it could be implemented by both Bazel and Buck2, and then open source users could just support the unified Bazck build system.

* At the rules level. These might be harder, as the subtle details tend to be quite important. Things like tsets vs depsets are likely to be an issue, as they are approximately the same mechanism, but one is wired into the dependency graph and one is not, which is going to show up everywhere.


There are a few references to NixOS on the code/issues.[0] I wonder what Meta's use case is for NixOS.

[0] https://github.com/facebook/buck2/search?q=nixos&type=issues


That person creating the issues was me. I simply know of Neil's work, was a user of his previous build system Shake, and we ran in some similar circles in the past -- so when I saw buck2's source initially get released a few months ago on GitHub, I just started using it immediately and giving feedback long before the initial release. Nobody at Meta uses NixOS for production work, from my understanding.

Actually, the current tip of trunk for buck2 can't build on NixOS right now with buildRustPackage due to a problem with prost. I should probably file a root cause issue about that soon...


EDIT: nevermind, I couldn't get it to build due to a prost issue, gave up and did `cargo build` instead. I misremembered.

crane.dev with nightly-2023-03-15 from oxalica/rust-overlay built here okay (after I figured out all the BUCK2_BUILD_PROTOC, missing Cargo.lock, etc tricks).

I did have some weird trouble with trying to import buck2 into my flake as a non-flake input, with complaints about "failed to resolve patches" for prost, but putting the flake.nix into the buck2 source tree worked.


Just using `cargo build` will work as you've seen, but building a Nix package won't work right now. If you're still keeping up with this thread, though, I have code in this repository to build a copy of buck2 with Nix -- it's several weeks out of date at this point though:

- https://github.com/thoughtpolice/buck2-nix/tree/main/buck/ni...

You should mostly be able to just 'callPackage' that and have it work if you have rust-overlay applied. Check the corresponding 'flake.nix' if you need it.

The prost issue with buildRustPackage is preventing an upgrade, so ideally the minor patches to prost can get upstreamed, but I need to (again) file a ticket about moving this along.


These were from an open source contributor - out of the box, Buck2 doesn't really have support for Nix. But because the rules are flexible, you can write your own Nix-aware version of Buck2.


Wonder if they have examples for Java where maven and groovy are the main two tools.

Also, in the case of our builds, we can benefit only so much from a faster build phase, because it is all the other bits (SonarQube scans, pushing artifacts to Artifactory, misc housekeeping, annoyingly slow Octopus deployments, etc.) that add most of the time to the long deployment cycle. Sometimes I think a dedicated Go utility that takes care of everything build-related (parallelizing when possible) would make things faster; it would have the full picture, after all. But then we would be reimplementing all the features of these various tools, which is maybe OK at FB scale but would be too much for a smaller shop.


You can push those down too so they only operate on applicable code (and if caching is working, only the changed code) per-build, with a tool like bazel and probably buck2 - in addition to getting parallel execution per-target.


Everyone says Buck and Bazel are so amazing but honestly, monorepos are unicorns. No one does this. It's useful to no one I know. I keep hearing it's useful to somebody, so it must be really useful when it is, but I've never ever seen Buck, Bazel, or monorepos in real life. And it's been my career to build stuff.


I'm sorry to break it to you, but monorepos are extremely common. They don't have to be as large, but every company I've been at had a monorepo.

And as soon as you have to manage PRs for multiple repos with a new cross-cutting feature or scheduling changes in the correct order you understand why they are so appealing.


There's quite a few well-known places listed on https://bazel.build/community/users across many industries. I think Buck and Pants and Please (and ...) are not as widely used, but if they had a list to add we'd have even more examples.


For folks that are using these kinds of tools, any regrets? How much more complexity do they add vs make or shell scripts?


Speaking of the "tools" generically, they are totally worth it because of their ability to aggressively cache and parallelize, but also because you end up with more declarative definitions of common build targets in a way that is more or less type-safe. I personally think that makes these kinds of tools a win over make. Beyond that, they make it trivial to implement repeatable builds that run build steps in "hermetic" sandboxes. You can do all that with make, but you are abusing the hell out of the tool to get there, and it will look foreign to anyone familiar with using make "the traditional way".

That said, bazel’s set of prepublished rules, reliance on the jdk, etc, make it not worth the burden, imo/e.

I think less ambitious, but similar tools are where it’s at. We use please for this reason, and are generally quite happy with how it balances between pragmatism and theory.

In any event, having your build tool be a single binary is a major win. I’d rather use make than anything written in python or Java just because I don’t have to worry about the overhead that comes with those other tools and their “ecosystems”.


I've used Buck at Meta for years and while it is technically impressive and does an excellent job with speed, remote compilation and caching, I am not a fan of all of the undocumented, magic rules used all over the place in BUCK and .bzl files.

I've yet to try BUCK on small projects though - I personally default to Makefiles in that case.

One thing I definitely wouldn't use it for is Android development. The Android Studio integration is much worse than Gradle's, and adding external dependencies means you have to make BUCK modules for each one.

I would however use it for large-scale projects or projects with more than a dozen developers.


IMO one of the nice things about Buck or Bazel is that once you learn it, switching languages doesn't require you to learn a completely new tool. Obviously the cost of learning it the first time is high and if you are used to one ecosystem may not be worth it. But I'm now on my 3rd different ecosystem that uses Buck/Bazel (Android, iOS, C++) and it's nice to not worry at all about the underlying tools.


I like to use Makefiles for project automation. Does Buck make it straightforward to run phony-target-style tasks? I have been considering transitioning to Justfiles, but Buck will obviously be getting significantly more exposure and mindshare.


There's no such thing as a phony target, but there is both `buck2 build` and `buck2 run` - each target can say how to run and how to build it separately. So you can have a shell script in the repo, write an export_file rule for it, then do `buck2 run :my_script` and it will run.
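
A minimal sketch of that pattern (the script name is invented):

  # BUCK
  export_file(
      name = "my_script",
      src = "my_script.sh",
  )
  # then: buck2 run //:my_script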


Nuts. Possible, but I would be fighting the platform a bit. Especially if I might want something like phony1 to depend on phony2.


What kinds of things are you using phony targets for?


Everything I can. Building docs, checking health of prod, deploying to dev/prod (which has real+phony dependencies), starting the development web server with watchexec, running lint, tests, etc. Some of these might have real artifacts, but I typically have more phony than not.

Make has quite a few flaws, but it is near universal and a good place to centralize all required project work, even if the entire task is defined in a shell script elsewhere. That being said, I have been looking longingly at just[0], which is not just concerned with compiling C but has task automation built in from the beginning.

[0] https://github.com/casey/just


Tests are done with `buck test`; linting and such can be done in Bazel with aspects. I'm not sure if Buck can do the same, although I wouldn't recommend it in most cases.

Deployment can be done with `run`, but again I wouldn't recommend it for deployment to real environments, starting a local devserver with run is a common pattern though.


Excuse my ignorance but what are the advantages of using such a system over the standard build systems of various languages (vite, gradle, maven, pip, etc)?


In my experience there are two main advantages: reproducibility and cross-language support.

Build systems like Maven can read from the entire filesystem (e.g. Maven reads from ~/.m2), and you might end up with artifacts that depend on the state of the machine. This makes debugging production issues harder. You can of course be careful with other build systems to stay reproducible, but it's easy to make mistakes, and Buck will enforce that you don't.

Companies like Meta and Google have huge monorepos and use multiple languages. It's common for developers to deal with multiple languages and for projects to depend on other languages. Buck handles that very naturally and saves engineers from dealing with multiple build tools.

There are other upsides and downsides listed on their website.


If you only use one language, and that language has a reliable, reproducible build system that gives you the guarantees and functionality that you require, then not much.

Here’s how I used Bazel (and how I now use Nix).

I am provided a configuration file that specifies what a given instance of my program must do. I use language A that can natively understand this configuration to code-generate a file in language B (which is significantly more suited to the performance requirements of the program)

This file is then built along with generic program code. It is used to process a lot of data.

As an interface to this program, I have a HTTP interface that can communicate with it. It needs to understand the kinds of outputs the program will produce, so some of this HTTP interface is also code generated in language A. The interface is interactive so typescript is generated and then compiled too.

In order to process the output from the program, I need to produce extensions for languages that users of the output use: Python and R. These need to understand the kinds of data being used, so are also code generated - then built. They’re then tested against requirements defined by the config (so the tests are also code generated).

Each of these stages have dependencies required when building and dependencies required when running, and there are several languages involved.

I also need to be able to express the entire process as a function, because it's something that needs to be run very frequently on different configs - sometimes on collections of configs that need everything built at once. It needs to be built on several different machines, sometimes desktops and sometimes remote servers, sometimes on clients' hardware (depending on the needs). I need confidence that something that works on my development machine will work, in its entirety, on a completely different machine with minimal prior setup. And I need it to be easy to do and easy to maintain. I don't want to mess around with many different build systems that have entirely different use cases, entirely different ideas of how build/runtime environments should be handled, and entirely different languages to configure them - and many of them are rigid and don't have a concept of functions; they're just "state your dependencies, kthxbye".

All of the above is absolutely trivial with bazel if you know where to tread lightly (e.g. surrounding Python environments). It’s also very easy with Nix once you get used to it, and you don’t need to tread lightly there - it has stronger guarantees.


That's very interesting; I've never been exposed to such a development environment. Is there maybe a GH repository or something where I can see the above in action? Thank you


How would you compare Bazel's Starlark with Nix's own functional DSL as a language for build file definitions (both in terms of what relevant difference it makes from a usage perspective, and as a qualitative evaluation of what's better/worse)?


In favour of Nix.

In Nix, the sequence of events that lead to your build are figured out on the language level. A build is this sequence of events.

In Starlark, you have to do things in a specific order - and that can get messy. It has a bunch of built-in phases that mean WORKSPACE, .bzl and BUILD files all have different levels of ability. A workspace can define a dependency (e.g. a http_archive), load a file into bazel, then call a bazel function from it. If you want to do that outside of a huge unmaintainable file, you have to break the steps down into multiple bzl files - one that does the download, then another that loads the dependency and calls it. And then you can call the latter from your workspace. BUILD files can do neither of the above and can only really define builds. So you end up with awkward file arrangements to load the right dependencies.
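
To illustrate the two-file dance (the repository and macro names are invented):

  # repos.bzl -- step 1: a macro that downloads the dependency
  load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

  def fetch_deps():
      http_archive(
          name = "com_github_owner_repository",
          urls = ["https://github.com/owner/repository/archive/v1.0.tar.gz"],
          sha256 = "...",  # placeholder
      )

  # WORKSPACE -- step 2: load the macro, call it, and only then can you
  # load files from inside the freshly fetched repository.
  load("//:repos.bzl", "fetch_deps")
  fetch_deps()
  load("@com_github_owner_repository//:deps.bzl", "repository_deps")
  repository_deps()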

In Nix, anything you evaluate is a Nix variable. A function is a variable, a derivation (target) is a variable. You can refer to it natively and do what you want with it.

In Bazel, everything is referred to in a global register via labels, which are just strings. If you want two versions of the same thing that are the result of a function call, you have to give them different labels, so you can end up screwing around with string manipulation to generate unique identifiers.

This also feeds into a deeper issue: Bazel loves its global registry. Names from your dependencies are also your dependencies, so if the author of a project decides they hate their users and want to enforce "com_github_owner_repository" as its workspace name (because who doesn't love Java-style conventions), you're stuck using "@com_github_owner_repository//:library" throughout your codebase, anywhere you rely on that dependency. And to load that dependency's own dependencies, you have to either load a special file and run special functions that the author provides, or copy a bunch of code into your WORKSPACE file, making sure you follow the naming conventions used within the dependency.

In Nix, you name things whatever you want. If something you depend on has dependencies, it manages them itself - you can override them if you want.

Bazel doesn’t have much of a registry of third party repos. The onus is on you to scour the web, find an archive you want, grab the hash, and sometimes write your own build file to explain how it gets put together.

Nix has a truly vast repository of dependencies, which can be library dependencies or programs (in the latter, it’s the largest repository of applications out there - bigger than even debian and arch provide).

Bazel is maintained by Google, and maintainers make it very clear that they are spread thinly and don’t enjoy doing it. Some projects are completely stalled due to lack of maintenance. When bugs are found in the internals, they don’t get fixed for a long time. It’s based on Blaze, just like VSCode is based on Visual Studio, and there’s a feeling that it is being held back when things that could benefit users would be incompatible with their own system. There’s always the looming apprehension that Google are going to abandon it as soon as the next shiny thing comes along, or as soon as the people who work on it get promoted. Want to submit a PR? Sign this agreement first, then we might look at your PR, then we might merge it, might say no, but (note: speculation) an eerily familiar yet unattributed push might arrive a few months later from a googler.

Nix is full FOSS, with the standard open process you’d expect.

In favour of Bazel.

In Bazel, because you’re expected to follow a certain flow, errors are friendlier and easier to figure out. Bazel knows what you’re supposed to be doing and can help. When it builds things and fails, the intermediate files are saved in an easy-to-find place within your local directory (as a symlink) so it’s easy to explore and figure out the issue.

In Nix, a derivation (target) truly resolves down to the tiny operations that put it together. It doesn't know what you're trying to do with it, so a typo at the topmost definition of a derivation can end up giving you horrific errors about low-level list operations. If an error occurs in the code you're building, intermediates can be found by copying a huge Nix store path from the terminal and navigating to it, but it's a laborious process.

Starlark is based on Python, and it's familiar to a lot of people. The overhead of learning the language is just a case of "oh, it's Python but without X, Y, Z, and… without f-strings!?".

Nix is an entire language, with its own conventions, its own library, and its own syntax. It's easy enough to learn if you're used to picking up new languages (and honestly it's one of my favourite languages now), but that's an overhead. Most people think imperatively and eagerly, and Nix is functional and lazy, so it can be hard to figure out how to accomplish something.

Bazel has an actual build system, with native support for a bunch of languages. Nix is very much DIY - if you're using make or CMake then mkDerivation can do a lot of things for you, but for a developer, using the two technologies together is unpleasant. We went as far as making our own build system in Nix that invoked e.g. gcc directly rather than try to work with CMake.


When it breaks down, do they say the buck stops here?


There was some initial fanfare about Sapling but it has pretty much waned. Is git simply good enough?


I’ve switched over to using Sapling as a client for my GitHub-based projects - since it’s conceptually similar, with all-round-better UX, and no changes on the server side, it has made my life better and yet I don’t really have anything to say about it ¯\_(ツ)_/¯


Just curious, but are you ex-Meta? Sapling looks pretty neat, but I was hesitant to introduce it at the company I work at since I wasn't sure if its UX would be well-received. It feels great to me since I got used to it at Meta, but I also acknowledge that others have built up the muscle memory for git-centric workflows.


Kind of upset they didn't leverage a corny phrase like "the buck stops here".


> In our internal tests at Meta, we observed that Buck2 completed builds 2x as fast as Buck1.

Interesting, so twice the bang for your buck.


But if you need Buck2 then you’re back to one bang per buck


wow, nobody has dropped the obligatory xkcd 927

https://xkcd.com/927/


Build systems are not standards.


> Written in Rust

stopped reading there.


It might be worth continuing to read at least up to the point where they explain why they chose Rust and what benefits it brings compared to the previous tool (which was JVM-based). In a company like Meta, while inefficiencies are bound to exist, when it comes to projects like this - where it's very easy to measure before and after - decisions to rewrite a huge piece of infra are not taken on a whim or because "rewrite it in Rust".


Rust is everywhere. It's a great language.


It failed at its own premise (statically checked object lifetime).

The only reason people use it is to have an excuse to rewrite things in something new, free from "legacy".


No one seems to know how to do practical and useful build systems, so I write my own.

In particular the idea of writing something entirely generic that works for everything is a waste of time. The build system should be tailored to building your application in the way that matters to you and making the best of the resources that you have.


This sounds like a nightmare even if you're dealing with single-digit numbers of projects. Even just among my personal spare-time hobby projects, I have ~5 nodejs projects, ~5 php projects, ~5 rust projects, ~5 python projects - and I find myself wishing for a common build tool, because right now, even with only one build system _per language_, migrating from Travis to GitHub Actions meant I needed to rewrite four separate build/test/deploy workflows...


I have 500 projects.

They're all made to follow the same conventions, so none of them specifies anything redundant.


> Build systems stand between a programmer and running their code, so anything we can do to make the experience quicker or more productive directly impacts how effective a developer can be.

How about doing away with the build system entirely? Build systems seem like something that shouldn't exist. When I create a new C# .NET app in Visual Studio 2019, what "build system" does it use? You might have an academic answer, but that's beside the point. It doesn't matter; it just builds. Optimizing a build system feels like a lack of vision - getting stuck in a local maximum where you think you're being more productive, but you're not seeing the bigger picture of what could be possible.


Have you ever worked on a project that has to combine code from different languages into one cohesive application or tool? Have you ever had to build a binary that needs to end up in an installer package with some custom install scripts and also needs to support multiple end user OSs?

It's nice that you've lived in a world where you haven't had to concern yourself with the concerns of how your code is built, but please understand that some of us actually need or want to delve into this. It can't always be magic.


Building different languages into one cohesive application or tool is a procedural process: I build the code in language A, I build the code in language B, I package them together in some way, and do something with the result. I don't need a build tool with a custom declarative language for that; I can just use a Bash script (sketched below). The problem is people want to do both high-level merging like this and low-level compilation management in the same declarative language. It should really be two things.
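
For illustration, here is that procedural flow sketched in Python rather than bash (all paths and commands are hypothetical):

    # Procedural "build system": build A, build B, package the results.
    import os
    import shutil
    import subprocess

    os.makedirs("dist", exist_ok=True)
    # Build the code in language A.
    subprocess.run(["cargo", "build", "--release"], check=True)
    # Build the code in language B.
    subprocess.run(["npm", "run", "build"], cwd="web", check=True)
    # Package them together in some way.
    shutil.copy("target/release/mytool", "dist/mytool")
    shutil.copytree("web/build", "dist/assets", dirs_exist_ok=True)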


> I package them together in some way

That's a build system!

> I can just use a BASH script

Yep you can. Many build systems start out like that. Complexity tends to creep in.


A shell script that copies files into a directory may technically be a "build system" in the same way that a hot dog is technically a sandwich. Obviously (question mark?) I'm not talking about that.


I am not sure I understand your point. I click build in VS, and whatever VS does takes 2 minutes.

Supposedly the C# compiler can compile millions of lines per second, but my nowhere-near-millions-of-lines project takes a minute to compile, so it must be wasting time doing something and could use some optimization.


Regrettably, your observation doesn't prove your point. It's common knowledge that Visual Studio is very slow. It's entirely unknown how much of that slowness is related to the build system (using that term in the same sense as buck2).


Visual Studio is a build system. And about eleven other things.


I guess you could write machine code by hand if you don't want to build it. Otherwise, how specifically do you propose to do away with build systems?


Okay you got me, I don't want to get rid of build systems. I want to stop having to think about them, because it's a waste of my time.


I think across this thread you're thinking about the run-a-graph-of-tasks aspect of build systems. I'd agree with you that's mostly uninteresting. However, that's also the least significant aspect of what these build systems do. Most of the work is _determining_ the graph of tasks that need to be run, and therefore a lot of the differentiation is in the ability to write new rules to guide that process and in the engine driving it.


I think this is a case of YAGNI. Yes, there are cases where you have a big graph of sources and products and you can't plan up-front what you'll need - e.g. you're at a megacorp and your dependencies are created by some branch of the organization that is actively hostile to yours. I dunno. But I think most people just don't have those problems.


To some, the bigger picture of what could be possible includes using multiple languages, multiple build steps, etc.

Refining individual ecosystems to be effortless to use is one thing, and it’s a goal worth pursuing for the benefit of those who are happy in that ecosystem. For those who rely on bringing different technologies together, though, there can be a lot of complexity and nuance.

The point of these projects is to take that complexity and nuance and provide an effective “super ecosystem” to piece them together in. That’s why some of the biggest proponents of these technologies are Google and Meta - they have many interacting parts in many different languages that need to come together and work effectively.


So what you're saying is this is unfit for compiling projects, but could be used in a FAANG-scale environment to allow developers to aggregate lower-level efforts in a declarative way? I like that idea.


If you look around the toolbench, and you can't figure out who the build system is ...


Why should I care what the build system is? Please use your words.


Sure:

Whatever code we write as developers is only the very tip of the iceberg of software that comprises any substantial application. Controlling what goes into that iceberg, and how it's assembled, is an essential part of the engineering of software. The details and quality of your build system determine the composition and construction of that iceberg, not to mention the reliability and velocity of your development process.

Even 'basic' local build systems like CMake, maven/gradle/ivy, sbt, lein, cargo, go, ... bridge dependency management and task execution. They decide what goes into the software artifact you ultimately distribute (or deploy), and how that's assembled.

At the scale of buck, bazel, etc., tools of that shape are necessary to make forward progress in a codebase composed of internal dependencies that are managed by different teams, written in different languages, targeting a variety of environments, and so numerous that builds require distribution to complete in reasonable timeframes - all while requiring absolute reproducibility.

I'm not a VS/C# user, but MSBuild is definitely a build system, and both it and the developer definitely have to care about these complexities, even if they come under the heading of "IDE" instead of "Build System".

Also:

As the joke in my first comment implied, if you can't identify the build system, you're probably the build system


I'm starting to understand. This is a problem at FAANGs, and I simply haven't worked at a FAANG.


(Note: I work at Meta, on the buck team)

This isn't really just a FAANG thing. I've seen this throughout my career, before the FAANGs, at much smaller places - projects needing multi-platform/multi-language/multi-arch scenarios.

You do have a point that one shouldn't have to learn the depths of a build system to do what one needs. The trade-off is in how the knobs are exposed to a developer.

Another way to think about it: domain-specific languages like Starlark and others are just where the knobs are stored. Visual Studio ends up writing the same kind of knobs to vsproj/vcproj or MSBuild files - they're just stored in another form.


To clarify, I don't think that complex build systems are only used at FAANGs, I think that their usefulness is only realized at FAANGs. From what I've seen at smaller companies, most people are using these tools because everyone else does, not because their software production suffers from the kinds of problems that these tools solve. These tools also seem to provide the mental framework that people need to think about their software production processes, so I guess there's that.


Software scales up in the size of the iceberg of your deployment, and out in the number of teams working on it. Buck (and Bazel and Pants) solves for scaling in both directions at once.

Whether you should care about the details and knobs of your build system is strictly a function of the first kind of scaling. Every team building software of appreciable size should think a bit about their build system, and even small teams might benefit from Buck or friends if their iceberg is big enough.


For C++, the #includes kind-of-sort-of tell the story of what needs to be included in the build, and morally it seems wrong that you should have to repeat that information anywhere. I recently watched a CppCon talk from the people at tipi.build, which seemed to go along these lines. They've tried to completely tear away the entire notion that you should be writing separate build files that re-describe what you're already telling the compiler with your #includes.
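
As a toy sketch of that idea (quote-form includes only - a real tool would also have to resolve <...> includes, macros, and include search paths):

    # Recover C++ dependency edges from #include lines alone.
    import pathlib
    import re

    INCLUDE = re.compile(r'^\s*#include\s+"([^"]+)"')

    for src in pathlib.Path("src").rglob("*.[ch]pp"):
        for line in src.read_text(errors="ignore").splitlines():
            m = INCLUDE.match(line)
            if m:
                print(f"{src} -> {m.group(1)}")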


Visual Studio is your build system. What if I want to use, develop, or depend on a project that wasn't built with Visual Studio's preferences in mind?

I mean, I don't know how to use Visual Studio to build software, but I get along fine - so why should we optimize for your preferred interface and not another one that might be better?


> When I create a new C# .NET app in Visual Studio 2019, what "build system" does it use?

...msbuild?


How do you think VS builds your code?


The point is, it doesn't matter. What it's doing internally isn't interesting to me, a programmer; I'd rather not think about it. Unfortunately, people have decided that micro-optimizations like writing declarative build definitions by hand are beneficial, so I've had to learn many build systems.


The way it builds your software is also a declarative build system: MSBuild. It happens to be configured in XML instead of a Python-like language, and the IDE integration is probably better than what you get from Buck or similar.

Building software can be complex sometimes, and _pretending_ it's a "microoptimization" to care about that is unnecessarily diminutive.


Well, first, there is a build system in your VS. The fact that you don't know about it doesn't magically make it disappear. Some people have to understand some of it - for instance, the person who maintains your project for you, apparently (because you'd rather not think about it).

It's like if I said "I don't really need to understand the details of how a CPU works in my daily job. What if we just removed CPUs entirely?".


How does Visual Studio convert a solution to an executable?


Using a compiler. Less snarkily: It uses a compiler, and a black box which I don't have to think about or configure. There is technically a build system in there, but again, I don't have to think about it or configure it, unless I'm doing something really weird (and in that case, Visual Studio has GUI options for that).



Exactly right, people are very confident that they need a non-abstracted build system, and that it needs to be replaced every N years. I'm calling that into question, and people are confidently telling me I'm wrong, without the ability to explain why. So your link is very appropriate.


For the same reason someone creates a new programming language every second day: some people find it interesting, some people like to learn new languages, and sometimes a new language actually brings something really valuable.

Is it worth putting those resources into a new build system? Apparently Facebook thinks so. You don't have to agree, you can use Visual Studio.

It just seems like you're saying "if I don't care about build systems, nobody should".


Well, at the top level of this comment section I see people cheering this on, and people excited to try it out in their own projects. I don't think they all work at Facebook, so maybe they're closer to me than to Facebook on this matter.


I was reading on & on, going, yeah sounds great, but when are you going to mention that it runs on Mercurial, which just puts such massive distance between FB & the rest of the planet?

They do mention that it supports virtual file systems through Sapling, which now encompasses their long-known EdenFS. I'd like to know more, but right off the bat Sapling says it is:

> Sapling SCM is a cross-platform, highly scalable, Git-compatible source control system.

Git-compatible, eh? That's a lot less foreboding. (It is, however, a total abstraction over Git with its own distinct CLI tools.)

I hope there are some good, up-to-date resources on Sapling/EdenFS. I've heard some really interesting things about code automatically getting shipped up to the mothership and built/linted proactively, not just at commit points, which always sounded neat, and there seem to be some neat transparent thin-checkout capabilities here.

https://github.com/facebook/sapling


(engineer working on Buck2 here)

Buck2 is actually used internally with Git repositories, so using Sapling / hg is definitely not a requirement.


I'm not sure how Mercurial is relevant. From reading the Buck2 getting started docs, it looks like it works just fine with Git repos.


It indeed is not the primary concern of build systems. For many folks, there are CI/CD systems checking stuff out and feeding their build tools.

Notably, though, Buck2 tries to be a good persistent incremental build system, and it needs much more awareness of things changing. It does integrate with Sapling for these kinds of reasons.

So the boundaries between SCM, build, and CI/CD tools are a bit blurrier than they used to be.


My understanding is that Buck2 uses Watchman (disclaimer: I'm one of the maintainers) so it can work with both Git and Mercurial repositories efficiently, without compromising performance.


It can use watchman, but for open source we default to inotify (and platform equivalent versions on Mac/Windows) to avoid you having to install anything.



