Release engineering is exhausting so here's cargo-dist (axo.dev)
262 points by ag_dubs on Feb 1, 2023 | 63 comments



"Alright so I've given you The Pitch of how this should work, but how does it actually work today? Well, yes, I wouldn't announce this if not -- but there's a lot of future work to do! Actually no I take that back, it's definitely perfect. Those familiar with my broader oeuvre know this will absolutely be a perfectly crafted demo that avoids all the sharp corners of cargo-dist!"

One of the most honest paragraphs ever written.

Seriously though, great tool and great write-up. I hope something like this lands as an official cargo feature. Coming from mostly Python land at work, with crazy dependencies for TF and PyTorch GPU support (on Windows sometimes!), this makes me super jealous.


Ouch. Python on Windows? I sympathize.


I think this would benefit from an example repo: just a Cargo.toml, a simple src/main.rs with `fn main() { println!("Hello, world!"); }`, and the simplest .github/workflows/foo.yaml needed to actually use this.

If it was in the article and I missed it, I apologize.


The "way-too-quickstart" is the minimal example: https://github.com/axodotdev/cargo-dist#way-too-quick-start

A key feature of cargo-dist is that

    cargo dist init --ci=github

should simply set everything up for an arbitrary* Rust workspace and you just check in the results.

* Not sufficiently tested for all the wild workspaces you can build, but a "boring" one with a bunch of libraries supporting one binary should work -- that's what cargo-dist's own workspace is, and it's self-hosting without any configuration. Any time I want to update the bootstrap dist I just install that version of cargo-dist on my machine and run `cargo dist generate-ci github --installer=...` to completely overwrite the CI with the latest impl.
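
For anyone who wants the whole flow in one place, a rough sketch (the install method assumes cargo-dist is on crates.io, and the tag format assumes the generated CI triggers on v-prefixed version tags):

    # grab the tool and generate the CI config
    cargo install cargo-dist
    cargo dist init --ci=github

    # check in the results, then cut a release by pushing a version tag
    git add Cargo.toml .github/workflows/release.yml
    git commit -m "wire up cargo-dist"
    git tag v0.1.0
    git push && git push --tags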


Just to elaborate on this a bit: as discussed in the Concepts section of the docs[0], the core of cargo-dist absolutely supports workspaces with multiple binaries, and will chunk them out into their own distinct logical applications and provide builds/metadata for all of them.

However, this isn't fully supported by the actual GitHub CI integration yet[1]: I haven't implemented proper support for detecting that you're trying to publish a new version of only one of the applications (or none of them!), and it doesn't properly merge the release notes if you're trying to publish multiple ones at once.

I never build workspaces like that, so I'm waiting for someone who does to chime in with the behaviour they want (since there are lots of defensible choices and I only have so many waking hours to implement stuff).

[0]: https://github.com/axodotdev/cargo-dist/#concepts

[1]: https://github.com/axodotdev/cargo-dist/issues/69


Cool, thanks.

It creates a ~100-line release.yml file that does four things:

1. create GitHub release

2. upload artifacts

3. upload manifest

4. publish GitHub release
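
(Hand-rolled sketch of roughly what those steps amount to if you did them with the `gh` CLI instead -- not the actual generated steps, and the manifest filename is just my guess:)

    gh release create v1.0.0 --draft --title "v1.0.0" --notes-file RELEASE_NOTES.md
    gh release upload v1.0.0 myapp-x86_64-unknown-linux-gnu.tar.gz myapp-x86_64-pc-windows-msvc.zip
    gh release upload v1.0.0 dist-manifest.json
    gh release edit v1.0.0 --draft=false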

I like it; it's opinionated, but I don't know how much it will catch on. In my experience, if somebody needs to maintain a 100-line GitHub Actions release YAML file, they typically want to understand everything in it, in case they need to adjust it as their needs grow.

It's well done though. Curious to see how much adoption it picks up.


> Congrats kid you're A Release Engineer now and your life is hell. Enjoy debugging basic typos on a remote machine with 20 minute latency because you sure can't run those Github CI bash-scripts-in-yaml files locally!

Yes! Why is this accepted??

Gitlab has a way of running CI locally (for Docker-based builds anyway; who knows about Windows or Mac) but a) it doesn't support the same features as the "proper" one (even basic ones like the `default` key) and b) they deprecated it!

OK, in fairness, they've stated in a random comment that they won't remove it before providing an alternative... But still, how is this not a core feature of all CI systems?


CircleCI solved this many years ago: allow users to SSH into the environment where the build happens.

Workflow is something like:

- Throw together a rough CI or CD workflow; this is just to kick off the process. You can also start with an empty config.

- When something fails, get the SSH host+port to connect to

- Enter environment and manually do everything you want to be able to automatically do

- Execute `history`, copy the output, trim it into something nicer, and put it in your config


You can run GitHub workflows locally with act: https://github.com/nektos/act
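
Basic usage looks something like this (act reads .github/workflows and runs the jobs in Docker; the job name below is just an example):

    act                  # run jobs for the default push event
    act pull_request     # simulate a pull_request event
    act -l               # list the jobs act can see
    act -j build         # run a single job by name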


This is far from perfect, IME. The big problem I have (and maybe there's a solution I don't know about) is that there's no easy way to test getting event data, unless you wanna rebuild the events yourself (which is hardly reliable if the problem is something like verifying a conditional that enables/disables a stage).


Seconding this; act has a lot of potential but misses a number of features such as support for deployment environment variables (e.g. `${{ var.DEPLOY_SPECIFIC_ENV_VAR }}`) and only recently added support for reusable workflows (https://github.com/nektos/act/issues/826). It looks like fine software and the maintainers deserve praise for their work, but it's not yet a drop-in replacement for GitHub Actions.


Act is okay, but the runner image behaves quite differently than a GitHub runner would. The original image would just be too big for reasonable local workflows.

Also artifacts don't seem to be supported.


> The original image would just be too big for reasonable local workflows.

Is there a way so I can still do this? I have terabytes of SSD just waiting to be used.


What we need is a standard CI language/config format across vendors (stuffing a custom DSL inside YAML doesn't count)

That would allow for a tooling ecosystem of static checkers, offline runners, IDE integrations, etc etc, and would also cut down on the learning barrier every time you switch to a new company that uses a different vendor


> But still, how is this not a core feature of all CI systems?

Vendor lock-in, presumably.


In multiple directions, too. It's easier to build the CI system itself if you are only targeting one class of servers/one means of hosting servers/one specific "cloud".


If you just want to solve fundamental build issues (rather than, say, uploading artefacts), my approach is to open a PR, make the silly small edits/experiments using the in-browser file editor, and let it run the CI each time. If I ever get it working, I then squash all the crappy debug commits I made to get there. Miles from ideal, but a slight improvement.


If you're using nix to build your repo, it's worth adding scripts for releases etc. and running them locally and as part of the CI: https://determinate.systems/posts/nix-github-actions
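
The idea being that the exact same command works on your laptop and on the runner (the script path here is hypothetical):

    # locally, inside the flake's dev shell
    nix develop --command ./scripts/release.sh

    # the CI job then just checks out the repo, installs nix, and runs the same command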


> (Did you know the warning you get on Windows isn't about code signing, but is actually just a special flag Windows' builtin unzipping tool sets on all executables it extracts?)

My jaw hit the floor here.


A correction on this that someone else sent me:

The check of interest is for a Mark Of The Web[0] flag that Windows includes in file system metadata. The builtin unzipping utility just faithfully propagates this flag to the files it unpacks. Other utilities like 7zip are unlikely to do this propagation (effectively clearing it).

But yeah either way it has nothing to do with code signing!

[0]: https://nolongerset.com/mark-of-the-web-details/


If someone's interested in more details about MoTW, EricLaw (long-time MSFT engineer on the IE and Edge teams) has you covered:

https://textslashplain.com/2016/04/04/downloads-and-the-mark...

https://textslashplain.com/2022/12/02/mark-of-the-web-additi...


Interesting, thank you for sharing!


macOS has a similar feature with Gatekeeper, which bit me when preparing a PyInstaller binary for Mac. The flag doesn't get added when you download a file with curl, but it does when you download it through a web browser, which can cause difficult-to-debug issues with binaries downloaded from GitHub releases.

You can remove this flag with the xattr command:

    xattr -d com.apple.quarantine the_quarantined_binary

I wrote up the details of this in a PR [0] where I last dealt with it.

[0] https://github.com/splitgraph/sgr/pull/656


This is actually pretty similar. NTFS supports alternate data streams (an idea borrowed from the Mac), and Windows uses one to record what site an exe was downloaded from, or whether it came from somewhere else. Others have incorrectly called it a flag; really it works by having two different data streams for a single file, one of which is the default.

So, for example, a single file can actually contain two different "files" (two separate sets of file data).

So foo.exe will, effectively, open the file's default data stream. You could also add a piece of malware to the foo file as another data stream: foo.exe itself is legit, but if you open foo.exe:MALWARE, it will open the malware data stream.

So, tl;dr: when you get a file from a third-party source (internet, USB drive, etc.), Windows adds a new data stream in the form of a small text file. That text file contains info about the source, namely a number for the zone it came from (3 for the internet) and some more info.
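
If you want to see it for yourself, PowerShell can read (and remove) that stream directly; the filename here is just an example:

    # show the Mark-of-the-Web stream on a downloaded file
    Get-Content .\some-download.exe -Stream Zone.Identifier

    # typical contents:
    # [ZoneTransfer]
    # ZoneId=3

    # strip the mark
    Unblock-File .\some-download.exe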


Thanks for the details! Judging by your username, I assume you know this area well :)

Most surprising to me on Mac was that the "flag" (I'm not sure that's the right term here either) was preserved on files extracted from a tarball downloaded from the internet. Although I think this also required extracting it via Finder (GUI) and did not apply when using the tar command - I can't remember exactly.
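
For anyone debugging this on their own machine, you can check whether a file picked up the attribute before removing it:

    xattr -l ./the_binary    # lists extended attributes; look for com.apple.quarantine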


I'm really not a fan of the "download the prebuilt binary from github releases" workflow that's been proliferating along with the popularity of Rust. It seems like a step backward in terms of secure package management, and it's surprising to me that Rust doesn't offer a more out-of-box experience for this, instead encouraging users to build from source. I understand the arguments for this, and I even find some of them convincing - namely, that the inverse problem of opaque intermediate binaries and object files would be much worse as it would cause a cascade of brittle builds and barely any packages would work.

But the fact remains that end users want to download a binary, and the common approach to this is to simply publish them to GitHub releases. Surely Cargo could offer a better option than this, while retaining the ability to build all packages from source (maybe binaries could only be published to Cargo _in addition_ to the source files... or maybe Cargo.toml could include a link to GitHub releases, and Cargo could include a command for downloading directly from there).

In the meantime, I've been considering publishing Rust binaries to npm (in addition to GitHub releases). I got the idea from esbuild, which is written in Go but distributed via npm. What do people think of this approach? Here's a recent blog post [0] describing the setup.

[0] https://blog.orhun.dev/packaging-rust-for-npm/


I've noticed people publishing binaries to PyPI too - you can run "pip install ziglang" to get Zig, for example.

I wrote a bit about that pattern here: https://simonwillison.net/2022/May/23/bundling-binary-tools-...


we actually agree and are working on this! github releases are just an easy initial target, and makes our tool a drop-in replacement for the kinds of things people are already doing. longer-term we'd like to see something more robust, and cargo-dist is the first cog in that machine.

i have personally packaged and published many rust devtools on npm (cloudflare's wrangler, apollo's rover, wasm-pack) but that was largely because they were targeted at a javascript developer audience.

as a former npm registry engineer i'm curious what you find to be the particular value of publishing to npm? installing node is actually very unpleasant and then getting global installs to work is also... very unpleasant. i think it works well for people already in that ecosystem but i think we can build something better for a more agnostic audience that can give a similar out-of-box exp without requiring a centralized registry. would love to learn more about your perspective!


> as a former npm registry engineer i'm curious what you find to be the particular value of publishing to npm

I have to admit I find this to be an amusing sentence. :D Poor npm (maybe rightfully) does not have the best reputation.

I suppose it appeals to me mostly for reasons of personal bias, and the thought that most people downloading binaries from GitHub probably have node installed. Although I'm probably wrong about that. I also like that optionalDependencies and postinstall steps can use JS code to effectively act as an install script, which at least feels somewhat cleaner than curl install.sh | bash.

You make a good point about this being more appropriate for binaries that have some relation to JS (like esbuild). I think this is probably a compelling enough argument not to distribute binaries to npm if they're not related to JS - or at least, if there is no obvious overlap between the expected userbase of the binary and JS users.

Perhaps the solution friendliest to end users (but least friendly to maintainers) would be to distribute the binary to as many package managers as possible. I mean, why not publish it to npm, and PyPI, and maven, and rubygems...? If your Rust tool has bindings in these languages, then of course it makes sense. Otherwise, it sounds ridiculous... but is it really so different from publishing to multiple OS-level package managers like apt, brew, macports, etc? Almost certainly your users, if they're downloading a tool from GitHub, will have one of these package managers installed. (But then again they'll probably also have one of apt/brew/nix, so maybe no need to use the language specific manager.)

Actually, if this is something you're working on, maybe that is a problem you can solve - you could handle distributing the binary to multiple package registries, and I could just push it to GitHub releases and add an extra line to my workflow.yml. You wouldn't even need to store the binaries - you could just fetch them from the GitHub releases page. (Although now we've strayed quite far from my original motivation of a more secure distribution channel! And perhaps this leads to reinventing a few wheels...)


> most people downloading binaries from GitHub probably have node installed

There are millions, perhaps even billions of people who aren't js developers.

I'm one of them :)


Worth considering using nix with cargo. Of course it still involves a lot of "download from github" or "download from nix cache" but reproducibility + tight source hash pinning helps guarantee provenance.


> It seems like a step backward in terms of secure package management

what is an example of a language package manager that does this well, in your view? I think the idea is something similar to how apt-get/deb/rpm work at the os-level, but I'm interested in the details of what the improved version of this is in your mind.


Github's CI is a fine way to build binaries.

Github's CDN is a fine way to host binaries in a highly available manner which is also not easy to tamper with.

GitHub's project page gives instant access to source code, and to a rich README.

I don't see how Github in this regard is any worse than npm or pypi.

What I would appreciate is a way to sign binaries the same way commits are signed, attesting that it was built from a particular commit, and a particular state of dependencies, by a compiler isolated from the internet. GitHub's CI runner might sign it with some private key, while a public key to check signatures would be available from github.com.

Of course that would require some cooperation from code authors, but for important code that should be manageable.


with npm:

    $ npm install foo # or
    $ npx foo
with github:

    Find repository
    Click latest release, downloads to ~/Downloads
    $ tar -xvzf ~/Downloads/some.tar.gz
    $ cp foo/bin/foo /usr/local/bin


As clearly laid out in the article, the Rust equivalent to `npm install` is

    cargo install foo

The entire point of providing downloads is for people who don't have dev tools already installed.


How is distributing binaries via cargo (automatic, can’t opt out, not possible to audit, invisible) better than explicitly downloading them from github?

Just puzzled; I think binary distributions make any supply chain issues basically impossible to solve.

Vendoring them into the tool chain instead of distributing source code you can compile yourself seems the opposite of solving the problem you’ve posed.


There's so much work to do to release software. Kind of explains why everything is a website.


I think that this article and the discussion around it are more of a condemnation of the way GitHub Actions and similar software works rather than a generic “releasing software is hard”. One of the first things this guy mentioned in the article is that he went down this route because GHA sucks and he can’t run it locally (I know about act, but it ain’t a solution for everything)

I realize you’ve probably thought about this a lot since I know you as the guy who wrote “mazzle” and posted about it here a few months ago. I wish more CI systems worked closer to the thing you designed.


Thanks for remembering me :-)

I would like things to run locally by default and then be deployed to the cloud where they run.

It should be easier to debug problems if I can get the code onto my machine and investigate issues with the tools my computer has, such as "strace", "perf", and debug logging that I liberally apply to the build script.

In production we would have log aggregation and log search (such as ELK stack) and it is a good habit to get into the perspective of debugging production via tooling.

But CI/CD sits before that tooling in the pipeline. You could wire up your CI/CD to log to ELK, but I would prefer locally deployable software.

I think my focus on automating things means I want to be capable of seeing how the thing works without relying on a deployed black box in the cloud and on assumptions about how it works rather than direct investigation.

One of my journal entries is almost a lamentation of all the things that need to be done to release and use software.

This is that entry:

https://github.com/samsquire/ideas4#5-permanent-softwareplat...

I wonder if software could be deployed more like a URL that has all the information to configure a virtual machine. Docker over URL or something.


Woman, she.

In general, please don't unnecessarily gender (verb) people whose gender (noun) you don't know. Using "they" has been fine since the 13th century.


Thanks for helping clarify this. When you say woman (she), are you specifically referring to the OP or to Sam Squire (whom I was replying to)?


Talking about the author of the article, Gankra.


How does this tool differ from release-plz? https://github.com/marcoIeni/release-plz


OK, after reading through all that, I still can't tell if this can generate a Windows installer. Generating an installer is mentioned, but the examples seem to just generate a .tar file. Or maybe a .zip file. That's not what non-programmer Windows users expect.

Rust does a good job of generating Windows binaries cross-platform. But the tools for generating Windows installers are not yet cross-platform. Does this project improve that situation?


The actual project is at [1].

It looks like the closest thing you can get to an installer is an 'executable zip'.

[1] https://github.com/axodotdev/cargo-dist/


Nearly all of these problems (except for maybe the Windows one) are solved by publishing a nix flake with your application and telling people to install that instead.

Not everyone uses nix, but I'd rather push its adoption than trying to build these kind of solutions for all possible language ecosystems when a general solution already exists and works great.


Unbelievable. So if they don't use Nix they simply don't deserve to use the software?

I really wish people wouldn't get so zealous all the time about their favorite technology.


Nix builds binaries. You don't need nix to run binaries built using nix. Just like you don't need Cargo to run binaries built using cargo. This is about developer tooling.


We should generally be pushing in the direction of developer tooling being available to more developers on more platforms. One of Rust's great strengths is good Windows support. Not supporting native Windows makes Nix a non-starter in many of these discussions.


It's interesting how many people comment on nix without seeming to know anything about it. Why?

Nix doesn't replace cargo. It would use cargo under the hood. If Cargo supports Windows builds, those builds will work with nix. However, nix (probably) doesn't give you anything additional in this case.


I'm not sure what you mean then. Clearly you have to put a bunch of effort into building tooling that works on Windows, and Nix won't cut it for that (you must build on Windows in most cases rather than cross-compiling). And if you're doing that you might as well also reuse the same tooling on other platforms.

Nix flakes can still be useful on the tool user side, if your application build requires Unix anyway, but on the tool producer side it's a bit less so.


Nix normally patches /nix/store paths all over the place into the binaries it builds; those paths won't exist without nix installed & dependencies downloaded.


Is there a guide explaining how to build such a distributable binary?


If you happen to install a lot of things with cargo, check out cargo-binstall: https://github.com/cargo-bins/cargo-binstall

It'll fetch the binary release from the repo so you don't have to compile it yourself.
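
Usage is a single command (using cargo-dist itself as the example here):

    cargo binstall cargo-dist    # fetches the prebuilt release instead of compiling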


This article does a really good job of setting up the "why" right in the beginning.



definitely inspired by goreleaser! it's a great project


Awesome, thank you. I'm adding it to the cargo favorites list: https://github.com/sixarm/cargo-install-favorites

Tiny feedback:

- Can you consider changing "git add ." to be explicit e.g. "git add Cargo.toml .github/workflows/release.yml"?

- How about modifying Cargo.toml to add cargo-dist as a dev-dependency? I know it's not strictly necessary; it's just very helpful for typical collaboration.


Do you mean this kind of dep? https://rust-lang.github.io/rfcs/3028-cargo-binary-dependenc...

That's an interesting thought, I'm not sure I've ever seen someone employ that as a pattern. Actually no wait, I thought cargo bin-deps specifically gave the developer no way to manually invoke it (i.e. there's no equivalent functionality to npm's npx)? Without that, what use would the dependency be?


Yes, where that link describes "[dev-dependencies]" and "[build-dependencies]".

I use the dev-dependencies section often, and the build-dependencies rarely. And anyone here, please correct my understanding if there's a better way to do what I'm describing.

For me, these sections are an easy way to be explicit with collaborators that the project needs the deps installed in order to work on the project, and does not want/need the deps to be compiled into the release.

My #1 use case is to have a collaborator start with a git clone, then run "cargo build", and have cargo download and cache everything that's needed to work on the project and release it. My #2 use case is to deal with semver, so a project can be explicit that it needs a greater cargo-dist version in the future.

To your point about cargo bin deps being akin to npx: yes, your understanding matches mine, i.e. not available yet. I do advocate for that feature to be added because it's helpful for local environments. Cargo does offer "default-run", which shows there's an awareness of developers preferring specific local executables -- maybe Cargo can/will add more like that for deps?


Bend the curve!


Really impressive work. The description of “release engineering” really hit home in regards to the slow feedback cycle with everything CI/CD.

Also, seeing the initial attempt fail because of a missing C dependency is a common problem with a lot of language-specific package managers.


Wow, that looks extremely useful! Thanks!



