I like to think of our solution as a modern version of a makefile. Many people are familiar with makefiles, and back before every CI server was configured using YAML and groovy, this was a common way to build things. We are trying to improve the simplicity of make by adding on the repeatability of docker containers. By mixing those two concepts together, a lot of cool properties fall out.
First, because each target is containerized, there are no works-on-my-machine issues, no tests failing because something isn't installed on one Jenkins runner, and so on.
Second, that isolation means that dependencies between steps must be explicitly declared, and with that knowledge, earthly can infer parallelism.
Third, containerization gives caching abilities. We can tell if nothing’s changed since a previous run, and earthly can cache steps across build runs or even across machines. You get to use a docker registry as a build cache.
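To make that concrete, here's a minimal sketch of what an Earthfile can look like (the Go project, target names, and commands are all just illustrative):

    FROM golang:1.16-alpine
    WORKDIR /app

    deps:
        COPY go.mod go.sum ./
        RUN go mod download

    build:
        FROM +deps
        COPY . .
        RUN go build -o app .
        SAVE ARTIFACT ./app

    test:
        FROM +deps
        COPY . .
        RUN go test ./...

    docker:
        COPY +build/app /usr/local/bin/app
        ENTRYPOINT ["/usr/local/bin/app"]
        SAVE IMAGE myorg/app:latest

Because +build and +test only declare a dependency on +deps, they can run in parallel, and each step is cached on its inputs, much like make targets, but inside containers.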
We shared this on here about a year ago[1], and since then a lot has changed. First of all, a year ago it was just Vlad working on this, and now there are a number of us, including me. For end users, we have shipped many great features such as multi-arch builds, docker-compose support (which is handy for running integration tests), inline and explicit caching, and an interactive debug mode. The debug mode is straightforward but neat: a failing build step pops you into a shell at the point of failure.

Let me know what you think of what we have built.

[1] https://news.ycombinator.com/item?id=22890612
I don't get it: why wouldn't I just use a Dockerfile? The Earthfiles look the same. Dockerfiles run the same way every time and Docker supports caching. I just see more complexity on top without value add.
If I want to move beyond local builds then I'd use something like Garden or Skaffold. Skaffold builds the same way every time and supports remote builders.
You can take a look at how the Phoenix framework uses earthly to simplify their builds. They are testing across OTP versions and across different databases (MySQL, SQL Server, Postgres), all in an Earthfile[1]. Ecto_sql does something similar[2].
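The general shape of that kind of test matrix is a fan-out on build args, roughly like this (versions and commands are illustrative, not the actual Phoenix Earthfile):

    all:
        BUILD --build-arg ELIXIR_VERSION=1.10.4 +test
        BUILD --build-arg ELIXIR_VERSION=1.11.3 +test

    test:
        ARG ELIXIR_VERSION=1.11.3
        FROM elixir:$ELIXIR_VERSION
        COPY . .
        RUN mix deps.get
        RUN mix test

Each combination becomes a separate branch of the build graph, so the variants run in parallel.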
I have a very hard time imagining "AWS Infinibuild(R)" that eats the "makefile + dockerfile that builds locally" use case, which is why I was curious to learn more
---
Separately, I would value GitHub's license detector gizmo using some of that sweet machine-learning sauce to identify the BSL, since it seems to be the path forward for these situations
Thank you for that; I would suggest putting that link in the PR description (or a comment) since not everyone who is similarly curious will be on a Show HN thread with the authoritative parties
I'm not trying to start trouble or get into a massive licensing thread on HN, but I want to point out that 2/5ths of those exemplar links of BSL success are in fact Apache 2.0 licensed, and not just source-available. I believe you that there are (evidently) parts of Kafka which one must pay money for, but my understanding is that the Apache Software Foundation would not tolerate hosting source-available projects on the apache.org domain nor in the apache GitHub organization.
Makes sense. Kafka is Apache licensed; Confluent Platform is open core. So I agree the post is a bit confused about that. We can update it and link it in the PR. No intention to misrepresent the facts on our part, just a small team trying to build stuff.
Hi mdaniel - just to clarify the blog post says that Confluent / Kafka follows an open core model. It doesn't claim that Kafka has a source available license. A few notable switches to source available licenses have been named in another part of the blog post though. Hope that makes it clearer.
To be fair, "Free and open" is different from "Free and Open Source".
"Open source" has a much stronger connotation than "open", and I could certainly see calling it open if the source is available for the core of the product.
Yesterday I gave up building something locally and just got CI to do it, because to run it locally you have to make a poor man's .travis.yml out of bash and then play spot-the-difference (I never figured out what was different that was breaking it locally; it resulted in a k8s deployment with a missing mount/volume).
This might sound weird but when you inherit some legacy code you’ll know what I mean.
Something where you can run the CI locally is such a boon.
Also, CI providers do have downtime. If you can't run the build locally, you can't deploy your app when they're down!
Maybe this is a stupid question, but what is the "make" part? I see a lot of docker syntax and a few targets without dependencies. But make is all about dependencies. What kind of make dependencies do you support? What kind of variables, macros, etc?
We support dependencies in a number of ways, but they are a bit different from make dependencies, because the file system of each step is separate unless you explicitly copy in or depend on other targets. This helps us do things in parallel, because you are being explicit about the DAG of the build.

Copying or using a file from another target causes your target to depend on it. Similarly for a `FROM +target`.

We also have variables, user-defined commands, and conditional logic. User-defined commands are our way to do composition.
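A minimal sketch of the two dependency forms (target names and commands are illustrative; variables, user-defined commands, and conditionals aren't shown here):

    FROM golang:1.16-alpine
    WORKDIR /app

    build:
        COPY main.go .
        RUN go build -o out/app main.go
        SAVE ARTIFACT out/app

    package:
        # depends on +build because it copies one of its artifacts
        COPY +build/app .
        SAVE IMAGE app:latest

    smoke-test:
        # depends on +build because it builds on top of its image state
        FROM +build
        RUN ./out/app --help

Since +package and +smoke-test each declare exactly what they take from +build, earthly can schedule them in parallel once +build is done.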
I looked at their main page and got the impression this tool is not very / not at all useful - except perhaps for some niche I'm not in.
> Earthly is a syntax for defining your build.
I already have syntax for defining my build.
> It works with your existing build system.
If I have a build system, or a build system generator, why do I need to redefine my build another way?
> With Earthly, all builds are containerized
Huh? What? Why? ... That's already a deal breaker. I don't want to be limited to building within containers.
> With Earthly, all builds are ... repeatable
Isn't that true for any decent build system?
> With Earthly, all builds are ... language agnostic
An ambiguous statement. Especially since a build process does depend, to some extent, on the language of the source files.
> Get repeatable and understandable builds today.
(shrug) more marketing talk.
> It's like Makefile and Dockerfile had a baby.
I don't want Dockerfile in the family, and as for Makefile - it's too low-level. Makefiles get generated these days (and have been for over 20 years for most software packages, IIANM.)
> No more git commit -m "try again" .
This has never been a problem for me.
(Well, ok, if there's network trouble, then downloading things during a build will fail; but Earthly can't magically reconnect a broken network.)
> "Earthly is a pragmatic and incremental solution to a thorny problem that all organizations face"
Which thorny problem is that? The "non-repeatable builds"? I'm confused.
Earthly is essentially a make/dockerfile syntax for a BuildKit frontend. If you understand and agree that BuildKit has value, then that's the gist of it.
Agree with all your points for sure, but people need words on their landing page
I guess there's a niche for Mac/Windows developers who want to run the build system on Linux, because that's what their CI system runs (so a build might fail for issues that don't show up on their host machines). But most CI systems cannot run nested containers (because docker-in-docker requires extra privileges), and as a result their CI system usually cannot run this thing, which totally defeats the purpose.
> If I have a build system, or a build system generator, why do I need to redefine my build another way?
There are build systems such as SBT, Maven, Webpack, etc. Other build systems such as Bazel completely replace them; what earthly offers is integration with them.
The comparison to Bazel here is really not appropriate. The core competency of Bazel is to know exactly the build graph and be able to do the minimum amount of work possible. I don't see how this (basically syntactic sugar over multi-stage docker builds) is going to scale in a large monorepo with hundreds or thousands of developers and still maintain CI build times on the order of minutes.
If you're using, say, SBT and you like it, why do you need Earthly again? And if you don't like it, why don't you replace it if it's possible to replace?
Here is an SBT example[1]. Earthly starts up the database dependencies, runs the integration tests, tears them down, and handles the other glue-type things that need to happen when a complex build pipeline has dependencies and several steps.
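The glue typically looks something like this, using Earthly's WITH DOCKER support (the compose file name and sbt command here are illustrative, not the linked example):

    integration-test:
        # assumes the +build image can run Docker; earthly/dind is the typical base otherwise
        FROM +build
        COPY docker-compose.yml ./
        WITH DOCKER --compose docker-compose.yml
            # the compose services (e.g. the databases) are up for the duration of this RUN
            RUN sbt it:test
        END

When the RUN finishes, the services are torn down along with the rest of the step's container.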
If you use both SBT and npm, for example, to build your full project, then having something to wrap them could be nice. That could be make or earthly or a random script.
Also if you’re using SBT and you do like it, but you want something it doesn’t do, say shared caching. In that case Earthly could provide that but you can keep your existing build config intact. At the cost of added complexity, though.
If they would only say _that_ on their front page, I would understand what they're offering me. Well, that, and a list of extra features they offer relative to various build systems.
I wonder what the scope of a competitive service would be, now that Earthly's license changed from MPL to BSL on 2021-02-01 [0]? Seems like Earthly Technologies Inc will launch a SaaS [1]. However, I'd assume it won't matter for most developers, except for those selling a CI/CD SaaS. In any case, I'm looking forward to seeing what they end up offering!
I like nix. If you dig into nixpkgs far enough back you can find commits from me bumping version hashes on postgres.
The caching and our future plans for distributed builds have some similarities with Nix Hydra and Bazel but creating images in earthly is closer to docker multistage builds than it is to nix.
We are not just for building docker images, though. In the real world a 'build' is more than just building an image. It will have linting, tests, and integration tests; maybe some services need to be stood up and torn down in between to make sure an API contract wasn't broken; and you might need different binaries generated for different architectures, and so on.
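In Earthly terms, that kind of umbrella pipeline is usually just a target that fans out to the others (target names here are illustrative):

    all:
        BUILD +lint
        BUILD +test
        BUILD +integration-test
        # multi-arch images: one build per platform
        BUILD --platform=linux/amd64 --platform=linux/arm64 +docker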
I could be wrong, but I think nix is focused more on just the artifact-producing part. I do love how they approach every build as a pure function though.
> I could be wrong, but I think nix is focused more on just the artifact-producing part.
That is only true to a point. Nix has an extremely broad definition of what constitutes a "build", in practice, and this process can include things like tests.
That having been said, there are undoubtedly certain tasks that Nix itself is not great at - but when you think of Nix more as a system primitive for expressing build processes rather than as a universal do-everything tool, that just means you would build tooling on top of it for those tasks, like Hydra does for CI for example.
That way everybody wins; you don't need to reinvent the build orchestration parts of the process, it interoperates with the entire existing package set, and the broader Nix community gains a new tool that helps them solve their problems more effectively.
Nix does have some limited support for a "check" phase, but I think it's meant to be for an ultra-quick Homebrew style smoke test, basically just confirming that `thing-bin --version` writes to stdout and returns 0, rather than something usable for larger scope integration.
Do you plan on having some support for hermeticity and reproducibility? That's the main thing Bazel brings to the table, and I'd like to see solutions like that which can scale to hundreds of services.
I know less about Bazel than Vlad does, so he may have a better answer. But Earthly lets you use your language's build tool, and expects, but does not enforce, that the individual steps you call are reproducible.

So it is a less granular approach than Bazel, which we think offers a lot of the same benefits but without guaranteed bit-for-bit reproducibility. You can do non-repeatable things in a `RUN`, but if you don't, then it works well and you get to use all your existing tooling rather than having to replace it.
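A sketch of what that looks like in practice (package names and versions are made up):

    build:
        FROM python:3.9-slim
        # Not repeatable: a bare "pip install requests" would resolve to whatever
        # the latest version is at build time, so two runs could differ.
        COPY requirements.txt ./
        RUN pip install -r requirements.txt   # repeatable in practice if requirements.txt pins exact versions
        COPY . .
        RUN python -m pytest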
My major gripe is that I legitimately need cross-platform build and test (Windows, Linux, and macOS). I will pay hundreds to thousands of dollars a month to someone who solves these problems (I already do, and release engineers aren't hurting!).

I guess my question is more "how tightly coupled is this to Linux containerization technologies", because if the answer is "fundamentally so" then I don't see a legitimate use for it that creates any value.

I spend a lot of time and money on build automation because nobody has solved it for cross-platform builds, and I see forcing containerization of key elements into backend services as an insane solution to that problem.
Yeah, I was sick of having to write build scripts in vendor-specific formats and then not being able to run those builds locally without jumping through a lot of hoops.

If earthly didn't exist, I think I would use a combination of make inside Dockerfiles to define my build, and then use the bare minimum of CI-specific config to just call into those. That way, when I need to switch from Travis to GitHub or Circle or whatever, I don't need to rewrite the steps.
Great to see earthly alive and kicking. I was in touch with Vlad nearly a year ago, and my main blocker to adopting it was uncertainty about the longevity of the project in quite a saturated space. Perhaps it's time to take a closer look :) It looks great!
Earthly has been great to use. My one issue is with docker. I would greatly prefer to use podman, but it seems there is no support for it yet. Every issue I have had with earthly has been some weird Docker quirk that I don't hit with podman. For the most part I can get around this by setting DOCKER_HOST, but it is still weird to depend on dockerd when I would expect podman to work fine.
I don't think it is meant to replace any particular build system like Bazel or make. Rather, it is a better dependency glue flavor for Docker image building and testing. Earthly uses Makefile-like targets which can be leveraged through more advanced references [0]... essentially it can replace a plain "copy --from=" with a new magic-strings syntax which also understands git. Build systems like Bazel would be invoked within a container.
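Those references look roughly like this (the repo paths here are hypothetical):

    # use a target from another repo as the base image; earthly resolves it via git
    FROM github.com/example-org/base-images+python-base

    app:
        # copy an artifact produced by a target in another repo, at a given branch
        COPY github.com/example-org/common:main+protos/gen ./gen
        RUN ./build.sh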
I don't think the "why" of using it instead of Bazel (and its plugins) will become clear until the Earthly SaaS gets announced that is hinted about in their TOS [1]. Build resource pool management, image release management, testing/dev environment definitions, custom URLs to test pull requests as seen elsewhere?
Arguably one could say they'd probably be able do the same thing by writing some Skylark and use remote build execution, but that could be close to saying Dropbox wasn't special in a Show HN thread. :-)
I use earthly. I would not use it instead of bazel, but with bazel. You can imagine earthly working by building a DAG of docker commands (RUN, ENV, ...), similar to how bazel/make make one on your source files. If A depends on B and B changes, B then A are built, etc.
So keep bazel for building and testing, but you can use earthly to run above it so every bazel build is in a fresh docker container. Example:
    # All image deps
    FROM ubuntu:20.04   # base image assumed here; the original snippet omitted the FROM line
    RUN apt install -y bazel gcc

    build:
        RUN bazel build //...

    test:
        FROM +build
        RUN bazel test //...

    release:
        FROM +build
        RUN bazel package
        SAVE ARTIFACT ./a.out AS LOCAL a.out
        SAVE IMAGE cool-app:latest
The bazel build runs before the other two targets, but testing and packaging run in parallel. I do this for elixir projects -- npm install assets and mix deps.get in parallel, then copy both results into the final image.
Edit: you can also inject arguments, and the dependency graph will fork when needed. So for example, I can specify multiple version targets, and earthly will only start running in parallel when it hits the build-arg line (since subsequent RUN lines will have different build arguments).
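Concretely, the fork point looks like this (versions and commands are made up):

    build:
        FROM elixir:1.11.3
        COPY mix.exs mix.lock ./
        RUN mix deps.get        # identical for all variants below: runs once, cache is shared
        ARG MIX_ENV=dev         # the graph forks here
        RUN mix compile         # runs once per MIX_ENV value, in parallel

    all:
        BUILD --build-arg MIX_ENV=dev +build
        BUILD --build-arg MIX_ENV=prod +build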
> Overall, compared to Bazel, Earthly sacrifices some correctness and reproducibility in favor of significantly better usability and composability with existing open-source technologies
Exactly how much correctness is getting sacrificed is pretty important. A key insight of Bazel is that you can’t have speed at scale without correctness.
Would be good to have Earthly described within the framework of “Build Systems a la Carte”.
Yeah, I use Azure so I can get build support for Linux, Windows, and macOS. I wish I could simplify all that, but on Windows I need to call the native C++ toolchain (using Rust).
This seems like it could be better than Docker when you need to manage dependent images. Like if I want to build a simple base image for my backend that is then used to build images for a prod deployment, a testing image, etc.
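Something like this, presumably (a hypothetical sketch):

    base:
        FROM node:14-alpine
        WORKDIR /app
        COPY package.json package-lock.json ./
        RUN npm ci

    prod:
        FROM +base
        COPY . .
        RUN npm run build
        SAVE IMAGE myapp:prod

    test:
        FROM +base
        COPY . .
        RUN npm test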
It's built on Docker's BuildKit which you can also enable with some config twerking (lol, not fixing this spell check because it's probably more accurate). That's how you get caching and DAG solving. BuildKit doesn't have a great wrapper or interface a.t.m.
Earthly is one of several companies building on BuildKit and providing a better DX
It does. You just put all your actual build logic in an Earthfile and call it from your GitHub Actions. A benefit of this is it's easy to move CIs, and you can test your build locally.
So you just have to pull in earthly and then call it. Here is an example. Note that the Travis file has almost nothing in it, because you can move all of that to an Earthfile, so that you can easily port it and run it locally.
I wonder how well will this play with [VSCode devcontainers](https://code.visualstudio.com/docs/remote/containers) -- perhaps one could create a target to build a development image, that extends the CI image by adding development tools? I'll definitely experiment with this, sounds promising.
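Presumably something like a dev target layered on the CI image would work (names here are hypothetical):

    dev:
        FROM +ci-image
        # add interactive tooling on top of what CI uses
        RUN apt-get update && apt-get install -y git vim
        SAVE IMAGE myproject-dev:latest

Then point devcontainer.json at myproject-dev:latest.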
This is AWESOME! I already have a super container heavy workflow and abuse dockerfiles to do things like this. It's great to be able to generate artifacts in a container and slurp them back out (really tricky with normal dockerfile builds, you need exterior commands).
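For reference, pulling artifacts back out to the host is a one-liner in an Earthfile (a trivial sketch):

    build:
        FROM gcc:10
        COPY hello.c .
        RUN gcc -o hello hello.c
        # copied back to the host's ./out/ directory when the target succeeds
        SAVE ARTIFACT hello AS LOCAL ./out/hello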
How well does it work with big builds in a single stage that fail? E.g. If I have a 3h CMake build that fails, does it keep the intermediate files in an image somewhere?
This looks interesting, thanks for sharing. I've been looking for a solution to some of the problems that this targets; in general our build steps are containerized, but it's annoying to have to layer a `make <step>` command on top of a dependency tree of docker/other commands inside a given CI step, and it's worse to have multi-step commands in the CI/CD config file, making the jobs harder to reproduce.
One more general idea I'd be interested in others' thoughts on -- there seems to be an artificial split between build systems (e.g. Bazel) and CI/CD systems (e.g. Gitlab). As these tools become more generic, they tend to overlap almost completely; ultimately the hard part is DAG management and latency-optimized distributed computation.

When you look at "how to use Bazel in Gitlab"[1] or "how to use Earthly in Github Actions"[2], it's basically having a single CI step that calls `buildtool <run my build DAG>`. But there could be other CI/CD steps like "run the e2es", and in general I want multiple steps in my Gitlab CI/CD pipeline so I can break open the individual steps, retry flaky runs, see which step failed, and otherwise get all the nice tight integration that Gitlab affords.

Unfortunately, I have a DAG in the CI server and then another DAG in the build job; it really seems like the CI/CD system should somehow unwrap the build tool's DAG into the CI/CD DAG. Then I could rerun individual failed steps, etc. (Though I think you might want some folding, because Bazel DAGs are probably in general too fine-grained for the way CI/CD servers tend to think about the visualization problem.)
Maybe this is just a missing feature in GitLab? It seems like the Github workflow is less well integrated with the CI/CD pipeline (always a click or two to get to the pipeline result details) so maybe users there don't see any missing features.
CI/CD systems like Concourse have a really nice model of "artifact revisions as inputs/outputs" which would play really well with this model; different parts of the pipeline accept a hash and will trigger downstream jobs only when those change.
I wonder if the Earthly team has any thoughts on integrating with the CI/CD DAG in this way?
I agree that build systems and CI/CD systems should converge. One way this could happen is to consider one side, say the Bazel build or the Earthfile, canonical, and generate the config for the other side from it, including various sub-steps and so on. Or Bazel or an Earthfile could be the basis for a CI system, but I think things can go further than that.
Building locally and building in CI should converge, and building on a single machine vs. building across a cluster of machines should also converge.
I'd love to hear other's thoughts or ideas on this. We are thinking about this as well. There is a lot of innovation to come in this space.
Good point. Looking at this, it seems that maybe you could build Earthly=>Gitlab with the current GL API. With "parent/child pipelines" [1] you can have step 1 generate the GL YAML for the rest of the graph, and then step 2 run that YAML.
One thing that might be needed on the Earthly side is some way to run only the named job, and fail if the deps aren't built ("pipeline mode"?), so that the downstream steps don't re-run the upstream ones. But it feels like the model you're considering is actually close.
Anyway, glad to get validation that others are thinking this way, and I'll follow your project to see where you guys get to!
It is frustrating that we seem to be evolving towards the nesting of one build system inside another, such as "go build" inside "docker build". (One of the projects I work on uses janky, docker compose, docker build, Go build, Make, Rake, and Bundle.) This approach is opaque, coarse-grained, inefficient, unnecessarily sequential, and doesn't materialize "the" build graph as a single entity from which we can identify our software artifacts and their relationships and derive and automate all our workflows.
Bazel addresses these problems with a single, canonical, first-class graph, and many projects that use Bazel achieve close-to-ideal build efficiency, scalability, and reproducibility, as well as a good foundation for other workflows. But switching to Bazel is a large undertaking for a project that has already become entangled in the kind of situation described above. Maintaining hand-written or generated BUILD files for dependencies can be a burden. (I led the design of Blaze in 2006, at which time Google was already using BUILD files to declare software artifacts and dependencies in its more primitive predecessor system; we simply could not have achieved Blaze even at Google's then scale otherwise.) I often dream of a future world in which it is expected that compiler writers will provide a formal definition of the toolchain's workflow in a form that allows rule-sets for Bazel and its successors to be derived.
So there is a place for tools like Earthly that try to retrofit some of the goodness of pure functional builds onto the mess of Docker files, whose natural abstraction is not "consume inputs, produce outputs", but "mutate the file system", which doesn't compose.
By strange coincidence I had just started playing last week---a mandatory vacation at my employer---with a proof-of-concept of a tool just like Earthly, so the timing of this announcement couldn't be better.
I've not tried this and I'm not very familiar with TeamCity but it should just work. If you are running the teamcity builds inside of containers, you may need to switch that off, as Earthly brings its own containerization.
After a cursory look, it seems like there is no support for parallel and dependent builds in the syntax right now. Is this correct? Are there any plans to support that? That's the advantage of CodeBuild spec files: they let me specify build dependency and order, but I have to write a file for each job. I'd rather write a single file with a simple syntax like the Earthfile.
Ah, I see. That's pretty cool. But all builds are run on the same machine, which is not a deal breaker; it'd be nice to distribute them on multiple machines, as well.
After a day of trying it out, I have to say... this is really terrific. I was fully prepared to be disappointed, but I was pleasantly surprised by it at every turn. I hope I can replace all my yml files with a single Earthly file someday.