
This is especially true of Paketo's buildpacks, which tend to do exactly one thing each. E.g., the Paketo Node.js buildpack is just a configuration file that composes other buildpacks: https://github.com/paketo-buildpacks/nodejs/blob/main/buildp...
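
As a rough sketch of what that composition looks like (the buildpack IDs are real Paketo components, but the ordering here is abridged and hypothetical; the actual file is at the link above):

    $ cat buildpack.toml
    [[order]]
      [[order.group]]
        id = "paketo-buildpacks/node-engine"
      [[order.group]]
        id = "paketo-buildpacks/npm-install"
        optional = true

Detection picks the first [[order]] group whose non-optional buildpacks all pass, so the composite buildpack itself contains no build logic.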


Additionally, Paketo's buildpacks build reproducible images given the same source code and buildpack versions.
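
For example (a hedged sketch; the IDs and versions are illustrative, and pack's `--buildpack` flag accepts `id@version`), pinning buildpack versions like this should yield the same image digest across rebuilds of unchanged source:

    pack build myapp \
      --buildpack paketo-buildpacks/node-engine@0.10.0 \
      --buildpack paketo-buildpacks/npm-install@0.2.5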


Many comments here point out how difficult it is to manage a separate dependency stack for each container when you use Dockerfiles to build them. This problem is just as difficult, time-intensive, and security-critical for microservice apps running on K8s as it is for CLI tools and graphical apps.

Worth pointing out that there is an incubating CNCF project that tries to solve this problem by forgoing Dockerfiles entirely: Cloud Native Buildpacks (https://buildpacks.io)

CNB defines safe seams between OCI image layers so that they can be replaced out of order, directly on any Docker registry (only JSON requests), and en masse. This means you can, e.g., instantly update all of the OS packages for your 1000+ containers without running any builds, as long as you use an LTS distribution with strong ABI promises (e.g., Ubuntu 20.04). Most major cloud vendors have quietly adopted it, especially for function builds: https://github.com/buildpacks/community/blob/main/ADOPTERS.m...
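
As a hedged sketch of what that looks like in practice (the image list and names are hypothetical), rebasing is one registry-side command per image:

    # Re-point every app image at the patched run image. With --publish,
    # manifests are rewritten directly on the registry and no layers are
    # pulled or pushed.
    while read -r img; do
      pack rebase "$img" --publish
    done < images.txt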

You might recognize "buildpacks" from Heroku, and in fact the project was started several years ago in the CNCF by the folks who maintained the Heroku and Cloud Foundry buildpacks in the pre-Dockerfile era.

[Disclaimer: I'm one of the founders of the project, on the VMware (formerly Cloud Foundry) side.]


I hadn't heard of Buildpacks before; this sounds very interesting.

In particular, the out-of-order layer replacement. I'm interested in switching to Buildpacks for the images I maintain for my home cluster. It would make upgrading my base image so much simpler compared to rebuilding all the other images! I've read a bunch of docs/articles since reading your comment yesterday but couldn't find any mention of this, or better yet an example. Are there some docs I missed? (I didn't look into the spec.)


Never mind, I realized that rebase is exactly that. I had misunderstood the docs.


Don't hesitate to reach out on Slack if you have more questions: https://slack.buildpacks.io

A few tips on rebase, combined in the example below:

(1) If you want to rebase without pulling the images first (so there's no appreciable data transfer in either direction), you currently have to pass `--publish`.

(2) If you need to rebase against your own copy of the runtime base image (e.g., because you relocated the upstream copy to your own registry), you can pass `--run-image <ref>`.
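
Combined, a hypothetical invocation (the registry refs are placeholders):

    pack rebase registry.example.com/team/app:latest \
      --publish \
      --run-image registry.example.com/mirror/run:focal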


I'm just using K8s (specifically: K3s) for configuration management in this case. This post hits the nail on the head: https://news.ycombinator.com/item?id=23006114

That said, NGINX can do UDP load balancing and WireGuard is stateless, so it should be possible to use this with a Service + NGINX ingress controller at scale: https://kubernetes.github.io/ingress-nginx/user-guide/exposi...

I have not tried it though.
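
For anyone who wants to try, here's a hedged sketch of the ingress-nginx side (the namespace and Service names are hypothetical; the mechanism is the ConfigMap-based UDP exposure described at the link above):

    # Map UDP 51820 on the ingress controller to the WireGuard Service:
    kubectl -n ingress-nginx create configmap udp-services \
      --from-literal=51820="wireguard/wireguard:51820"
    # The controller must be started with:
    #   --udp-services-configmap=ingress-nginx/udp-services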


This was indeed the motivation for my write-up :)


Thanks, I was meaning to look into this but your post will save me some research work.


The OCI image format is a standardization of the Docker v2 image format, and the two are generally compatible and interchangeable. That FAQ entry is misleading and slipped past the engineering team working on the project. Just removed it :)


The presentation to the CNCF TOC covers some of the technical details: https://www.youtube.com/watch?v=uDLa5cc-B0E&feature=youtu.be

Some key points:

- CNBs can manipulate images directly on Docker registries without re-downloading layers from previous builds. The CNB tooling does this by remotely rewriting image manifests and uploading only the layers that need to change, regardless of their order (sketched below).

- CNB doesn't require a Docker daemon or `docker build` if it runs on a container platform like K8s or K8s + Knative. The local-workstation CLI (pack) just uses Docker because it needs local Linux containers on macOS/Windows.
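
To make the first point concrete, here's a hedged sketch of the registry-only flow (endpoints per the Docker Registry HTTP API v2; the repo name and auth handling are placeholders):

    REPO="https://registry.example.com/v2/myapp"
    # 1. Fetch the manifest: a small JSON doc listing layer digests in order.
    curl -sH "Accept: application/vnd.docker.distribution.manifest.v2+json" \
      "$REPO/manifests/latest" > manifest.json
    # 2. Swap the digests of the layers being replaced (at any position in
    #    the list) with a JSON tool of your choice.
    # 3. Re-upload the manifest. No layer blobs move, as long as the new
    #    digests already exist on the registry.
    curl -sX PUT \
      -H "Content-Type: application/vnd.docker.distribution.manifest.v2+json" \
      --data-binary @manifest.json "$REPO/manifests/latest"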


While I really appreciate the work tonistiigi did to create the Cloud Foundry buildpack frontend for buildkit, it uses a compatibility layer[1] (which I wrote myself and no longer maintain) that only works with deprecated Cloud Foundry buildpacks that depend on Ubuntu Trusty. It doesn't work with the new, modular Cloud Native Buildpacks, and the buildpacks that ship with it are outdated (and vulnerable to various CVEs). It will stop working with new buildpack versions entirely when Cloud Foundry drops support for Trusty.

Implementing CNBs as a buildkit frontend would break key security and performance features. For instance, CNBs can build images in unprivileged containers without any extra capabilities, which buildkit cannot do. CNBs can also patch images by manipulating their manifests directly on a remote Docker registry. This means that image rebuilds in a fresh VM or container can reuse layers from a previous build without downloading them (just metadata about them), and base images can be patched for many images simultaneously with near-zero data transfer (as long as a copy of the new base image is available on the registry). As far as I know, buildkit can't do any of that yet.

That said, we do plan on using buildkit (once it ships with Docker by default) to optimize the CNB pack CLI when you build images without publishing them to a Docker registry. It's a huge improvement over the current Docker daemon implementation for sure!

[1] https://github.com/buildpack/packs


Your answer makes sense, but it actually makes me less excited about CNB.

It sounds like CNB will break compatibility with the massive Dockerfile ecosystem, in exchange for... sometimes not downloading a layer? That is not appealing to me at all, because Dockerfiles are too embedded in my workflow; losing support for them is simply not an option.

As for unprivileged builds, I don’t see any reason buildkit can’t support it since it’s based on containerd.

I think it’s a mistake not to jump on the buildkit/docker-build bandwagon. You would get a 10x larger ecosystem overnight, basically for free. Instead it seems like you’re betting on CNB as a way to “kill” Dockerfiles. But users don’t actually want to kill anything, they want their stuff to continue working. Without a good interop story, you’re pretty much guaranteeing that CNB will not get traction outside of the Pivotal ecosystem. Seems like a shame to me.


Meanwhile, some of us were already looking for alternatives to Dockerfiles for roughly the same reasons described in the blog post, and are excited about having this option. No connection to Pivotal or Heroku myself; and indeed, the blog post came from Heroku, not Pivotal.


Image rebasing and layer reuse really matter at scale. Patching base images for many images simultaneously in a registry is a huge win if you are an organization running thousands of containers and there is a CVE in one of the OS packages in your base image. Optimizing data transfer similarly matters when you add up the gains across many containers.

And devs like fast builds too :)

I also don't really see how this project is "incompatible" with anything in the Docker ecosystem. I hope Dockerfiles have a long and healthy life. CNBs are simply another method of building OCI-compatible images, one that provides a lot of benefits for certain types of users and use cases.

I expect that the pack CLI will eventually build on top of, and take advantage of, buildkit.


The incompatibility I mentioned is with Dockerfiles.

You’re right that more flexible patching and optimizing transfers are valuable. But those problems are independent of build frontends: you could solve them once for both buildpacks and Dockerfiles. In fact buildkit is well on its way to doing exactly that.

Basically I would prefer if buildpacks and Dockerfiles could all be built with the same tooling. CNB seems like a wasted opportunity to do that, because it bundles two things that should be orthogonal: a new build format, and a new build implementation. Docker is going in the opposite direction by unbundling the format (Dockerfile) from the implementation (buildkit).


> But those problems are independent of build frontends: you could solve them once for both buildpacks and Dockerfiles.

You can build an image with both technologies, but that's not the key to the argument. The key here is that every Dockerfile is unique and potentially quite different from any other Dockerfile. Small differences in layer order and layer contents multiply into very large inefficiencies at scale.

The way you tackle this problem is to make the ordering and contents of layers predictable for any given piece of software being built. You can achieve this with Dockerfiles using golden images, strict control of access to Docker Hub, complicated `FROM` hierarchies, the whole shebang.

But at that point you are reinventing buildpacks, at your own expense.

Note that this doesn't change with or without buildkit.

> a new build format, and a new build implementation. Docker is going in the opposite direction by unbundling the format (Dockerfile) from the implementation (buildkit).

Is your understanding that we invented a new image format or that we rewrote most of Docker? Or that the way we've written it prevents, for all times and all purposes, adopting buildkit as part of the system in future?

Because both of those are misapprehensions. We have extensively reused code and APIs from Docker, especially the registry API.


> Small differences in layer order and layer contents multiply to very large inefficiencies at scale.

Can you provide an example of how layer order can cause an issue?


Consider:

    FROM nodejs
    COPY /mycode /app
    RUN npm install /app
Now suppose I change my app code. In a Dockerfile situation, the change to the `COPY` invalidates the `RUN npm install /app` layer, even if I didn't change anything that NPM would care about.

An NPM buildpack can signal that there's nothing to change, allowing the overall lifecycle to skip re-building that layer.
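
As a rough illustration of how (this is not a real Paketo buildpack, and argument/flag conventions vary across buildpack API versions), a buildpack's bin/build can key a cached layer on a checksum of package-lock.json:

    #!/usr/bin/env bash
    # Hypothetical bin/build sketch for an npm buildpack.
    set -euo pipefail
    layers_dir="$1"
    sum="$(sha256sum package-lock.json | cut -d' ' -f1)"

    # Reuse the node_modules layer if the lockfile checksum is unchanged.
    if grep -q "$sum" "$layers_dir/node_modules.toml" 2>/dev/null; then
      echo "---> Reusing node_modules layer"
    else
      echo "---> Running npm install"
      # A real buildpack would install into the layer directory and link it.
      npm install
    fi

    # Record layer metadata so the next build can compare checksums.
    cat > "$layers_dir/node_modules.toml" <<EOF
    launch = true
    cache = true
    [metadata]
    lockfile_checksum = "$sum"
    EOF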

There's also the problem of efficient composition. Suppose I have this:

    RUN wget https://example.com/popular-shell-script.sh && \
        go get git.example.com/something-else@abc123 && \
        sh ./popular-shell-script.sh && \
        rm ./popular-shell-script.sh
And this:

    RUN go get git.example.com/something-else@abc123
Both of the resulting images will contain the same `something-else` binary, and in an ideal world of file-level manifests I could save on rebuilds and bandwidth consumption (NixOS has this, approximately).

But I don't get to do that, because the layers have different overall contents and different digests. Buildpacks don't get you all the way to a file-centric approach, but because they follow a repeatable, controlled pattern of selecting the contents and order of layers, they greatly improve layer reuse between many images.
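
You can see the digest problem with plain tar (a toy demonstration; the file names are borrowed from the examples above):

    mkdir -p a b && echo 'binary' > a/something-else && cp a/something-else b/
    touch a/popular-shell-script.sh   # extra file only in image A's layer
    tar -C a -cf a.tar . && tar -C b -cf b.tar .
    sha256sum a.tar b.tar             # different digests, so no registry dedup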


I'm not Stephen, but we've worked together on a few projects, including this one.

If you like Dockerfiles, you like Dockerfiles. Lots of people do. I did, until I'd used them for a while.

I'm not sure what you mean by "break compatibility". CNBs produce OCI images. They'll run on containerd just fine.

As for ecosystem: you'll note that the domain is heroku.com.


Dockerfiles require you to rebuild all of the upper layers whenever a lower layer changes, even though the OCI image format doesn't care about this. Cloud Native Buildpacks can intelligently choose which layers to rebuild and replace. Additionally, certain layers can be updated en masse for many images on a registry (using cross-repo blob mounting, with no real data transfer!), which is useful for patching CVEs quickly at scale.
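
Cross-repo blob mounting is a single call in the Docker Registry HTTP API v2 (the repo names and digest here are placeholders): it tells the registry that one repo should reference a blob that already exists in another, so no bytes cross the network:

    # Mount an existing layer blob from app-a into app-b without uploading it:
    curl -sX POST \
      "https://registry.example.com/v2/team/app-b/blobs/uploads/?mount=sha256:<digest>&from=team/app-a"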

The samples take advantage of this (as well as a separate transparent cache) in order to demonstrate the different aspects of the formal spec. A simple buildpack is not necessarily much more complicated than a simple Dockerfile.


Yes, if an underlying layer's hash changes then everything above it has to be rebuilt. But if you just change index.html, the other layers stay cached and builds are very quick.
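
For example (a minimal sketch), ordering the Dockerfile so the frequently changing file comes last keeps the earlier layers cached:

    FROM nginx
    # Cached across most rebuilds:
    COPY nginx.conf /etc/nginx/nginx.conf
    # Changing index.html invalidates only this final layer:
    COPY index.html /usr/share/nginx/html/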

My issue with Buildpacks is that they look like glorified bash scripts (a skill I am not bashing, pun not intended), whereas a Dockerfile is much more human-readable, and the idea of layers, for a guy coming from a systems background, is much more intuitive. The analogy of a very lightweight VM makes perfect sense to me, which means I'm much more productive with it.


For those who would like to better understand the technology:

Presentation to CNCF TOC: https://docs.google.com/presentation/d/1RkygwZw7ILVgGhBpKnFN...

Formal specification: https://github.com/buildpack/spec

Sample buildpacks: https://github.com/buildpack/samples


In other words: they're (typically) shell scripts that set up an OS to run a language.

Here's the node one: https://github.com/buildpack/samples/blob/master/nodejs-buil...

