"Alright so I've given you The Pitch of how this should work, but how does it actually work today? Well, yes, I wouldn't announce this if not -- but there's a lot of future work to do! Actually no I take that back, it's definitely perfect. Those familiar with my broader oeuvre know this will absolutely be a perfectly crafted demo that avoids all the sharp corners of cargo-dist!"
One of the most honest paragraphs ever written.
Seriously though, great tool and great write-up. I hope something like this lands as an official cargo feature. Coming mostly from Python land at work, with crazy dependencies for TF and PyTorch GPU support (on Windows, sometimes!), this makes me super jealous.
I think this would benefit from an example repo: just a Cargo.toml, a src/main.rs with `fn main() { println!("Hello, world!"); }`, and the simplest possible .github/workflows/foo.yaml needed to actually use this.
If it was in the article and I missed it I apologize.
It should simply set everything up for an arbitrary* Rust workspace, and you just check in the results.
* Not sufficiently tested for all the wild workspaces you can build, but a "boring" one with a bunch of libraries supporting one binary should work -- that's what cargo-dist's own workspace is, and it's self-hosting without any configuration. Any time I want to update the bootstrap dist I just install that version of cargo-dist on my machine and run `cargo dist generate-ci github --installer=...` to completely overwrite the ci with the latest impl.
Just to elaborate on this a bit: as discussed in the Concepts section of the docs[0], the core of cargo-dist absolutely supports workspaces with multiple binaries, and will chunk them out into their own distinct logical applications and provide builds/metadata for all of them.
However, this isn't fully supported by the actual GitHub CI integration yet[1], as I haven't implemented proper support for detecting that you're trying to publish a new version of only one of the applications (or none of them!), and it doesn't properly merge the release notes if you're trying to publish multiple ones at once.
I never build workspaces like that so I'm waiting for someone who does to chime in with the behaviour they want (since there's lots of defensible choices and I only have so many waking hours to implement stuff).
I like it; it's opinionated, but I don't know how much it will catch on. If you need to maintain a 100-line GitHub Actions release YAML file, in my experience you typically want to understand everything in it, in case you need to adjust it as your needs grow.
It's well done though. Curious to see how much adoption it picks up.
> Congrats kid you're A Release Engineer now and your life is hell. Enjoy debugging basic typos on a remote machine with 20 minute latency because you sure can't run those Github CI bash-scripts-in-yaml files locally!
Yes! Why is this accepted??
GitLab has a way of running CI locally (for Docker-based builds anyway; who knows about Windows or Mac), but a) it doesn't support the same features as the "proper" one (even basic ones like the `default` key), and b) they deprecated it!
Ok in fairness they've stated in a random comment that they won't remove it before providing an alternative.... But still, how is this not a core feature of all CI systems?
This is far from perfect, IME. The big problem I have (and maybe there's a solution I don't know about) is that there's no easy way to test getting event data, unless you wanna rebuild the events yourself (which is hardly reliable if the problem is something like verifying a conditional that enables/disables a stage).
Seconding this; act has a lot of potential but misses a number of features such as support for deployment environment variables (eg `${{ var.DEPLOY_SPECIFIC_ENV_VAR }}`) and only recently added support for reusable workflows (https://github.com/nektos/act/issues/826). It looks like fine software and the maintainers deserve praise for their work but it's not yet a drop-in replacement for GitHub Actions.
Act is okay, but the runner image behaves quite differently than a GitHub runner would. The original image would just be too big for reasonable local workflows.
What we need is a standard CI language/config format across vendors (stuffing a custom DSL inside YAML doesn't count)
That would allow for a tooling ecosystem of static checkers, offline runners, IDE integrations, etc etc, and would also cut down on the learning barrier every time you switch to a new company that uses a different vendor
In multiple directions, too. It's easier to build the CI system itself if you are only targeting one class of servers/one means of hosting servers/one specific "cloud".
If I just want to solve fundamental build issues (rather than, say, uploading artefacts etc.), I open a PR, make the silly small edits/experiments using the in-browser file editor, and it runs the CI each time. If I ever get it working, I then squash all the crappy debug commits I made to get there. Miles from ideal, but a slight improvement.
> (Did you know the warning you get on Windows isn't about code signing, but is actually just a special flag Windows' builtin unzipping tool sets on all executables it extracts?)
The check of interest is for a Mark Of The Web[0] flag that Windows includes in file system metadata. The builtin unzipping utility just faithfully propagates this flag to the files it unpacks. Other utilities like 7zip are unlikely to do this propagation (effectively clearing it).
But yeah either way it has nothing to do with code signing!
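For the curious, "propagating" here is nothing exotic: the mark is just an extra named stream on the file, so an extraction tool can copy it from the downloaded archive onto each file it writes out. A minimal Rust sketch (made-up paths, Windows only):

```rust
use std::fs;

fn main() -> std::io::Result<()> {
    // The Mark of the Web lives in an alternate data stream named
    // "Zone.Identifier". Ordinary file APIs can address a stream by
    // appending ":<stream name>" to the path, so "propagating" the mark
    // is just reading it from the archive and writing it onto the output.
    // Paths here are made up for illustration.
    let motw = fs::read(r"C:\Users\me\Downloads\release.zip:Zone.Identifier")?;
    fs::write(r"C:\Users\me\Downloads\release\app.exe:Zone.Identifier", &motw)?;
    Ok(())
}
```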
macOS has a similar feature with Gatekeeper, which bit me when preparing a PyInstaller binary for Mac. The flag doesn't get added when you download a file with curl, but it does when you download it through a web browser, which can cause difficult-to-debug issues with binaries downloaded from GitHub releases.
This is actually pretty similar. NTFS has alternate data streams (an idea they stole from the Mac), and Windows uses one of them to record what site an exe was downloaded from, or whether it came from somewhere else. Others have incorrectly called it a flag, when it actually works by having two different data streams for a single file, one of which is the default one.
So, for example, a single file can actually contain two different "files" (two separate chunks of file data).
So opening foo.exe effectively opens the file's default (unnamed) data stream. You could also hide a piece of malware in the file as an extra data stream: foo.exe itself is legit, but if you open foo.exe:MALWARE, it opens the malware stream.
So, tl;dr: when you get a file from a third-party source (the internet, a USB drive, etc.), Windows adds a new data stream in the form of a small text file. That text file (the Zone.Identifier stream) contains info about the source: namely, a number for the zone it came from (3 for the internet), plus some more info.
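Since the stream is just file data, you can also inspect it with ordinary file APIs. A quick Rust sketch (hypothetical path, Windows only, assuming the file came through a browser):

```rust
use std::fs;

fn main() -> std::io::Result<()> {
    // Read the "Zone.Identifier" alternate data stream of a downloaded exe.
    // The path is a made-up example; this only works on Windows/NTFS.
    let motw = fs::read_to_string(r"C:\Users\me\Downloads\app.exe:Zone.Identifier")?;

    // Typically prints something along the lines of:
    //   [ZoneTransfer]
    //   ZoneId=3          (3 = the internet zone)
    //   HostUrl=https://github.com/...
    println!("{motw}");
    Ok(())
}
```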
Thanks for the details! Judging by your username, I assume you know this area well :)
Most surprising to me on Mac was that the "flag" (I'm not sure that's the right term here either) was preserved on files extracted from a tarball downloaded from the internet. Although I think this also required extracting it via Finder (GUI) and did not apply when using the tar command - I can't remember exactly.
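If it's useful to anyone debugging this: on macOS the "flag" is an extended attribute named com.apple.quarantine rather than an alternate stream, so you can check for it directly. A rough Rust sketch using the `xattr` crate (the path is a hypothetical example):

```rust
fn main() -> std::io::Result<()> {
    // Check whether a downloaded binary carries the quarantine attribute
    // that Gatekeeper consults. The exact attribute contents
    // (flags;timestamp;agent;uuid) vary by the downloading app.
    let path = "/Users/me/Downloads/mytool";
    match xattr::get(path, "com.apple.quarantine")? {
        Some(value) => println!("quarantined: {}", String::from_utf8_lossy(&value)),
        None => println!("no quarantine attribute (e.g. fetched with curl)"),
    }
    Ok(())
}
```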
I'm really not a fan of the "download the prebuilt binary from github releases" workflow that's been proliferating along with the popularity of Rust. It seems like a step backward in terms of secure package management, and it's surprising to me that Rust doesn't offer a more out-of-box experience for this, instead encouraging users to build from source. I understand the arguments for this, and I even find some of them convincing - namely, that the inverse problem of opaque intermediate binaries and object files would be much worse as it would cause a cascade of brittle builds and barely any packages would work.
But the fact remains that end users want to download a binary, and the common approach to this is to simply publish them to GitHub releases. Surely Cargo could offer a better option than this, while retaining the ability to build all packages from source (maybe you can only publish binaries to Cargo _in addition_ to the source files... or maybe Cargo.toml could include a link to GitHub releases, and Cargo could include a command for downloading directly from there.)
In the meantime, I've been considering publishing Rust binaries to npm (in addition to GitHub releases). I got the idea from esbuild, which is written in Go but distributed via npm. What do people think of this approach? Here's a recent blog post [0] describing the setup.
we actually agree and are working on this! github releases are just an easy initial target, and makes our tool a drop-in replacement for the kinds of things people are already doing. longer-term we'd like to see something more robust, and cargo-dist is the first cog in that machine.
i have personally packaged and published many rust devtools on npm (cloudflare's wrangler, apollo's rover, wasm-pack) but that was largely because they were targeted at a javascript developer audience.
as a former npm registry engineer i'm curious what you find to be the particular value of publishing to npm? installing node is actually very unpleasant and then getting global installs to work is also... very unpleasant. i think it works well for people already in that ecosystem but i think we can build something better for a more agnostic audience that can give a similar out-of-box exp without requiring a centralized registry. would love to learn more about your perspective!
> as a former npm registry engineer i'm curious what you find to be the particular value of publishing to npm
I have to admit I find this to be an amusing sentence. :D Poor npm (maybe rightfully) does not have the best reputation.
I suppose it appeals to me mostly for reasons of personal bias, and the thought that most people downloading binaries from GitHub probably have node installed. Although I'm probably wrong about that. I also like that optionalDependencies and postinstall steps can use JS code to effectively act as an install script, which at least feels somewhat cleaner than curl install.sh | bash.
You make a good point about this being more appropriate for binaries that have some relation to JS (like esbuild). I think this is probably a compelling enough argument not to distribute binaries to npm if they're not related to JS - or at least, if there is no obvious overlap between the expected userbase of the binary and JS users.
Perhaps the solution friendliest to end users (but least friendly to maintainers) would be to distribute the binary to as many package managers as possible. I mean, why not publish it to npm, and PyPI, and Maven, and RubyGems...? If your Rust tool has bindings in these languages, then of course it makes sense. Otherwise, it sounds ridiculous... but is it really so different from publishing to multiple OS-level package managers like apt, brew, macports, etc? Almost certainly your users, if they're downloading a tool from GitHub, will have one of these package managers installed. (But then again they'll probably also have one of apt/brew/nix, so maybe no need to use the language-specific manager.)
Actually, if this is something you're working on, maybe that is a problem you can solve - you could handle distributing the binary to multiple package registries, and I could just push it to GitHub releases and add an extra line to my workflow.yml. You wouldn't even need to store the binaries - you could just fetch them from the GitHub releases page. (Although now we've strayed quite far from my original motivation of a more secure distribution channel! And perhaps this leads to reinventing a few wheels...)
Worth considering using nix with cargo. Of course it still involves a lot of "download from github" or "download from nix cache" but reproducibility + tight source hash pinning helps guarantee provenance.
> It seems like a step backward in terms of secure package management
what is an example of a language package manager that does this well, in your view? I think the idea is something similar to how apt-get/deb/rpm work at the os-level, but I'm interested in the details of what the improved version of this is in your mind.
Github's CDN is a fine way to host binaries in a highly available manner which is also not easy to tamper with.
GitHub's project page gives instant access to source code, and to a rich README.
I don't see how Github in this regard is any worse than npm or pypi.
What I would appreciate is a way to sign binaries the same way commits are signed, attesting that it was built from a particular commit, and a particular state of dependencies, by a compiler isolated from the internet. GitHub's CI runner might sign it with some private key, while a public key to check signatures would be available from github.com.
Of course that would require some cooperation from code authors, but for important code that should be manageable.
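To make the idea concrete, the user-facing half could be as small as checking one detached signature per artifact. A hypothetical Rust sketch, with ed25519-dalek standing in for whatever scheme GitHub might actually pick (none of this exists today; names and parameters are made up):

```rust
// Illustration only: what the user-side check could look like if a CI runner
// signed release artifacts and a verifying key were published out-of-band
// (e.g. fetched from github.com). ed25519-dalek is just a stand-in scheme.
use ed25519_dalek::{Signature, Verifier, VerifyingKey};

fn artifact_is_authentic(
    artifact: &[u8],      // bytes of the downloaded binary/tarball
    signature: &[u8; 64], // detached signature published with the release
    public_key: &[u8; 32],
) -> bool {
    let Ok(key) = VerifyingKey::from_bytes(public_key) else {
        return false;
    };
    let sig = Signature::from_bytes(signature);
    key.verify(artifact, &sig).is_ok()
}
```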
How is distributing binaries via cargo (automatic, can’t opt out, not possible to audit, invisible) better than explicitly downloading them from github?
Just puzzled; I think binary distributions make any supply chain issues basically impossible to solve.
Vendoring them into the tool chain instead of distributing source code you can compile yourself seems the opposite of solving the problem you’ve posed.
I think that this article and the discussion around it are more of a condemnation of the way GitHub Actions and similar software works rather than a generic “releasing software is hard”. One of the first things this guy mentioned in the article is that he went down this route because GHA sucks and he can’t run it locally (I know about act, but it ain’t a solution for everything)
I realize you’ve probably thought about this a lot since I know you as the guy who wrote “mazzle” and posted about it here a few months ago. I wish more CI systems worked closer to the thing you designed.
I would like things to run locally by default and then be deployed to the cloud to run there.
It should be easier to debug problems if I can get the code onto my machine and investigate issues with the tools my computer has, such as "strace", "perf", and the debug logging that I liberally apply to the build script.
In production we would have log aggregation and log search (such as ELK stack) and it is a good habit to get into the perspective of debugging production via tooling.
But CI/CD sits before that tooling in the pipeline. You could wire up your CI/CD to log to ELK, but I would prefer locally deployable software.
I think my focus on automating things means I want to be capable of seeing how the thing works through direct investigation, without relying on a deployed black box in the cloud and on assumptions about how it works.
One of my journal entries is almost a lamentation of all the things that need to be done to release and use software.
OK, after reading through all that, I still can't tell if this can generate a Windows installer. Generating an installer is mentioned, but the examples seem to just generate a .tar file. Or maybe a .zip file. That's not what non-programmer Windows users expect.
Rust does a good job of generating Windows binaries cross-platform. But the tools for generating Windows installers are not yet cross-platform. Does this project improve that situation?
Nearly all of these problems (except for maybe the Windows one) are solved by publishing a nix flake with your application and telling people to install that instead.
Not everyone uses nix, but I'd rather push its adoption than trying to build these kind of solutions for all possible language ecosystems when a general solution already exists and works great.
Nix builds binaries. You don't need nix to run binaries built using nix. Just like you don't need Cargo to run binaries built using cargo. This is about developer tooling.
We should generally be pushing in the direction of developer tooling being available to more developers on more platforms. One of Rust's great strengths is good Windows support. Not supporting native Windows makes Nix a non-starter in many of these discussions.
It's interesting how many people comment on nix without seeming to know anything about it. Why?
Nix doesn't replace cargo. It would use cargo under the hood. If Cargo supports Windows builds, those builds will work with nix. However, nix (probably) doesn't give you anything additional in this case.
I'm not sure what you mean then. Clearly you have to put a bunch of effort into building tooling that works on Windows, and Nix won't cut it for that (you must build on Windows in most cases rather than cross-compiling). And if you're doing that you might as well also reuse the same tooling on other platforms.
Nix flakes can still be useful on the tool user side, if your application build requires Unix anyway, but on the tool producer side it's a bit less so.
Nix normally embeds (and binary-patches) /nix/store paths all over the place in the binaries it builds; those paths won't exist without nix installed & dependencies downloaded.
- Can you consider changing "git add ." to be explicit e.g. "git add Cargo.toml .github/workflows/release.yml"?
- How about modifying Cargo.toml to add cargo-dist as a dev-dependency? I know it's not strictly necessary; it's just very helpful for typical collaboration.
That's an interesting thought, I'm not sure I've ever seen someone employ that as a pattern. Actually no wait, I thought cargo bin-deps specifically gave the developer no way to manually invoke it (i.e. there's no equivalent functionality to npm's npx)? Without that, what use would the dependency be?
Yes, where that link describes "[dev-dependencies]" and "[build-dependencies]".
I use the dev-dependencies section often, and the build-dependencies rarely. And anyone here, please correct my understanding if there's a better way to do what I'm describing.
For me, these sections are an easy way to be explicit with collaborators that the project needs the deps installed in order to work on the project, and does not want/need the deps to be compiled into the release.
My #1 use case is to have a collaborator start with a git clone, then run "cargo build", and have cargo download and cache everything that's needed to work on the project and release it. My #2 use case is to deal with semver, so a project can be explicit that it needs a greater cargo-dist version in the future.
To your point about cargo bin deps akin to npx: yes, your understanding matches mine, i.e. not available yet. I do advocate for that feature to be added because it's helpful for local environments. Cargo does offer "default-run", which shows there's an awareness of a developer preferring specific local executables -- maybe Cargo can/will add more like that for deps?