We have used this many times during GitHub outages. It's great and does what it says.
But just one word of warning: when you run `act` with no arguments, what does it do? Displays usage? Nope -- it runs all workflows defined in the repo, all at once in parallel!
This seems like a crazy default to me. I've never wanted to do anything remotely like that, so it just seems both dangerous and inconvenient all at once.
Nice otherwise though...
This piece of software would have to handle all the intricacies of GitHub Actions while also keeping up with the latest changes...
We are moving back to a Makefile-based approach that is called by the GitHub workflows. We can handle two kinds of parallelism: make's own `-j` when running locally, and sharding by worker number when running in Actions. That way we can test things locally while still having 32 workers on GitHub to run the full suite fast enough.
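A rough sketch of how the two modes can live in one Makefile (target names are made up; the CI would pass WORKER and NUM_WORKERS):

    # Locally: `make -j8 test` runs everything in parallel.
    TARGETS := test-api test-web test-cli test-db
    .PHONY: test ci-test $(TARGETS)
    test: $(TARGETS)

    # In Actions: each worker runs only its slice, e.g.
    #   make ci-test WORKER=3 NUM_WORKERS=32
    # (recipe lines must be tab-indented)
    ci-test:
        echo $(TARGETS) | tr ' ' '\n' \
          | awk 'NR % $(NUM_WORKERS) == $(WORKER)' \
          | xargs -r -n1 $(MAKE)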
I also like that we are less bound to GitHub now, because it has been notoriously unreliable for us this past year, and we could move more easily to something else.
Take a look at https://dagger.io/. Declarative pipelines using Node, Python, or Go. Parallelism and caching are built in: steps are skipped when their inputs are unchanged.
You could implement something similar by splitting the make targets in the GitHub Action so each worker is assigned its own target, then having a make target that runs all of them for local multithreaded builds via `make -j${NUM_CONCURRENT}`.
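Roughly, with a matrix (job and target names invented for illustration):

    jobs:
      test:
        runs-on: ubuntu-latest
        strategy:
          matrix:
            target: [test-api, test-web, test-cli]
        steps:
          - uses: actions/checkout@v4
          - run: make ${{ matrix.target }}

Locally the same targets run as `make -j3 test-api test-web test-cli`.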
Finally! I've been wanting something like this for ages. Generally speaking I don't consider myself an idiot, but I'm forced to call that into question every time I test and debug CI/CD actions with an endless stream of pull request modifications.
Unfortunately act is only capable of running very simple workflows. I've found this action to be more useful against the endless PR stream: https://github.com/mxschmitt/action-tmate
You drop it in your workflow and get an SSH shell into the worker, figure things out iteratively, then push when it's working.
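For reference, wiring it in is a single step; guard it with `if: failure()` so the shell only opens when something breaks:

    steps:
      - uses: actions/checkout@v4
      - run: make test
      - uses: mxschmitt/action-tmate@v3
        if: failure()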
Can you elaborate with some examples of workflows that it is incapable of running?
So far I’ve not found any limitations or issues using Ubuntu runners on my macOS dev machine. A couple of examples from my workflows:
- building docker images
- provisioning VMs with the Digital Ocean cli / http api
- hardening VMs with ansible
- creating/managing k3s clusters with k3sup
- deploying apps with helm
I like your suggested approach of using tmate to access the worker mid-way through a run. This should make it faster to develop/debug the steps that make up the workflow. Though this doesn’t address the cycle time of push-to-GitHub/queue-workflow/watch-for-result.
I’m actually going to try combining the two techniques - use tmate to develop inside a local act runner.
Workflows that interact heavily with the GitHub API will fail, since those APIs aren't available in act -- e.g. actions like https://github.com/styfle/cancel-workflow-action. Dealing with secrets is also a bit cumbersome. You can add the following to actions that are not compatible with act in order to skip them:
if: ${{ !env.ACT }}
That said, despite its limitations, I've been using both act and tmate in combination for a couple of years. Gets the job done.
I don't consider myself a cynic, but I am forced to call that into question every time I see how many paid worker minutes are wasted by such endless streams of pull request modifications.
Name one (1) CI system, open or closed, that shares enough with another CI system, open or closed, that there is no pain when changing from one to another.
Running your jobs in Docker is what I recommend people at my work do, and I admin several GitHub Enterprise Server instances as well as GitHub Enterprise Cloud.
This is the one thing that Drone got very right: every Drone job runs in a container. Being able to run those jobs locally on your development machine is built into the Drone tooling, and requiring containers is why that works.
If you run your CI and/or CD steps from within a container, you can run that container anywhere. Writing a small script that reads your CI/CD YAML (or whatever you use) and wraps your favorite container command-line tool into a working local CI/CD system should be pretty trivial.
Using containers also makes moving to another CI/CD system which can use containers as trivial as it can currently be.
I normally create a draft PR first to test CI/CD changes and open a new one when it's working. The workflow still sucks but I at least look like less of an idiot in the eyes of my peers.
Yeah, I got so frustrated with the odd workflow at work (having no sane way to locally test new or more advanced pipelines, and having to make lots of "change .gitlab-ci" commits) that I started investigating alternatives.
At home, for some hobby projects, I've been using earthly. It's just amazing. I can fully run the jobs locally and they are _blazing_ fast due to the buildkit caching. The CI now only just executes the earthly stuff and is super trivial (very little vendor lock in, I personally use woodpecker-ci, but it would only take 5 minutes to convert to use GH actions).
I am not a fan of the syntax, but it's so familiar from Dockerfiles and so easy to get started with that I can't really complain. It's easy to make changes, even after months of not touching it. Unless I update dependencies or somehow invalidate most of the cache, a normal pipeline takes <10s to run (compile, test, create and push an image to a registry).
This workflow is such a game-changer. It also makes it fairly easy to do very complicated flows [1].
I've tried to get started with Dagger, but I don't use the currently supported SDKs and the cue-lang setup was overwhelming. I think I like the idea of a saner syntax from Dagger, but Earthly's approachability [2] just rings true.
Rather than replacing your Makefile with GH Actions, replace your GH Actions with a Makefile, and make your GH Actions run `make` in a script task.
Do you really need that GH Action for pulling Docker images / installing $language_compiler / creating cloud resources? A `docker` / `curl` / `sudo apt-get install` invocation in a Makefile / script needs to be written once and is the same in CI as on your dev machines. The Action is GH-specific: it requires you to look it up, learn the DSL for invoking it, keep it up to date, and worry about it getting hacked and stealing your secrets...
A Makefile already supports dependencies between targets. A shell script is a simple DSL for executing processes, piping the output of one to another, and detecting and acting on failure of subcommands. How much YAML spaghetti do you need to write to do those same things in a workflow file?
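For instance (hypothetical targets and scripts), the dependency graph and fail-fast behavior come for free:

    build:
        ./scripts/build.sh

    test: build
        ./scripts/test.sh

    # package runs only if build and test succeeded
    package: build test
        tar czf app.tar.gz dist/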
This is a lesson that I've learned after going all-out on actions once.
Now my makefiles, in addition to the usual "make" and "make test", also support "make prerequisites" to prepare a build environment by installing everything necessary, and "make ci" to run everything that CI should check, with the actual implementation being scripts placed under "scripts/ci".
The scripts do provide some goodies when they are run by GitHub Actions – like folding the build logs or installing dependencies differently – but these targets also work on the developer machine.
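The shape of it, roughly (the script names under scripts/ci are placeholders):

    prerequisites:
        ./scripts/ci/install-deps.sh

    ci: prerequisites
        ./scripts/ci/lint.sh
        ./scripts/ci/build-and-test.sh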
If it’s manageable – just don’t cache. Build from scratch. Make sure your build works from scratch and completes in an acceptable timeframe. If it’s painful, treat the root cause and not the symptoms.
If it’s unbearable due to circumstances out of your control, there’s nothing wrong with adding some actions/cache steps to .github/workflows – these wrap around the build: fetch the previous cache before, update the cache after if needed.
The build is still reproducible outside of GitHub Actions, but a pinch of magic salt makes it go faster sometimes without being an essential part of the build pipeline married to GitHub.
If you need to install a whole host of mostly static dependencies, GitHub Actions supports running steps in an arbitrary Docker container. Prepare an image beforehand – it can be cached too – and now you have a predictable environment. (The only downside is that it doesn’t work on macOS and Windows.)
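A sketch of both ideas in one job; the image name, cache path, and key are examples, and the build itself neither knows nor cares whether the cache hit:

    jobs:
      build:
        runs-on: ubuntu-latest
        container: ghcr.io/your-org/build-env:latest   # prebuilt image with the static deps
        steps:
          - uses: actions/checkout@v4
          - uses: actions/cache@v4
            with:
              path: ~/.cargo/registry
              key: cargo-${{ hashFiles('**/Cargo.lock') }}
          - run: make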
> If you need to install a whole host of mostly static dependencies, GitHub Actions supports running steps in an arbitrary Docker container. Prepare an image beforehand – it can be cached too – and now you have a predictable environment. (The only downside is that it doesn’t work on macOS and Windows.)
Actually I use a similar workflow for some of the other projects that I have to work with, keeping everything CI agnostic. I incrementally add various types of dependencies to the container images where the application will be built.
For example:
1. common base image (e.g. Debian, Ubuntu, or something like Alpine, or maybe RPM based ones)
2. previous + common tools (optional, if you want to include curl or other tools in the container for debugging or testing stuff quickly)
3. previous + project runtime (depending on tech stack, for example OpenJDK for Java projects)
4. previous + development tools (depending on the tech stack, typically for pulling in dependencies, like Maven, or npm, or whatever)
5. previous + project dependencies (if the project is large and the dependencies change rarely, you can install them once here and the changing 5% or so later)
6. previous + project build (including things like running tests, typically multi-stage with the build and tests, and built app handled separately)
Compared to the more "common" way to do things, step #5 probably jumps out the most here, I do a pass of installing all of the dependencies, say, every morning, or hourly in the background, so that later when the project is built the CI can just go: "Hmm, it seems like 95% of the things I need here are already present, I'll just pull the remaining packages (if any)." Clean installs only need to be done when packages are removed, which is also reasonably easy to do.
Though the benefits of this aren't quite as staggering if you use a self-hosted package repository like Sonatype Nexus, which can cache any dependencies that you've used previously and make everything faster on the network I/O side. The exception is when actually installing the packages takes up the majority of the time (e.g. compiling native code), in which case the layering above is still very useful.
So, an example of how the stages might look is as follows:
Builder: Ubuntu + tools (optional) + OpenJDK + Maven + project dependencies + project build (and run tests)
Runner: Ubuntu + tools (optional) + OpenJDK + built project from last image (using COPY with --from, typically .jar file or app directory)
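In Dockerfile form the split looks roughly like this (tags and paths are placeholders):

    # Builder: the dependency layer is rebuilt only when pom.xml changes
    FROM ubuntu:22.04 AS builder
    RUN apt-get update && apt-get install -y openjdk-17-jdk maven curl
    WORKDIR /app
    COPY pom.xml .
    RUN mvn dependency:go-offline    # step 5: pre-fetch dependencies
    COPY src/ src/
    RUN mvn package                  # step 6: build + run tests

    # Runner: just the JRE and the built artifact
    FROM ubuntu:22.04
    RUN apt-get update && apt-get install -y openjdk-17-jre
    COPY --from=builder /app/target/app.jar /app.jar
    CMD ["java", "-jar", "/app.jar"]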
Of course, things are less comfortable when you don't have all of your app's dependencies packaged statically but need them "on the system" instead, like Python packages or Ruby Gems, but then your builder and runner will simply look more alike.
A Makefile really sucks at displaying outputs/logs of commands, especially when there are lots of commands and when they run concurrently. It also really sucks at communicating what the overall progress is: how many jobs have finished, how many left, how much time has elapsed.
Heck, make could improve all of this just by prepending each output line with a colored prefix and a timestamp. But make has barely changed in 30 years and likely won't.
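(You can bolt an approximation onto a recipe yourself -- here with `ts` from moreutils -- but that's exactly the boilerplate the tool should handle. And yes, GNU make 4.0 did add `--output-sync` to stop parallel output from interleaving, but that's about it; there's still nothing for progress.)

    test:
        ./run-tests 2>&1 | ts '[test %H:%M:%S]'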
People are proud that it has "solved" things since 1976. Sure -- if your requirements haven't changed since 1976. I'm not holding my breath for it to deliver the basic usability-enhancing features one can reasonably expect nowadays.
Then you reduce make to a simple small-scale task runner, basically admitting that it's unusable for large numbers of heterogeneous tasks or concurrency.
I agree. GitHub Actions should call your scripts but your scripts should not depend on GitHub Actions API.
I also suggest Bazel as a consideration alongside Make. With Bazel, you get two advantages over Make:
1. It is easier to ensure that what GitHub Actions runs and builds is the same as what you have locally, since Bazel can fetch toolchains as part of the build process
2. Using a Bazel remote cache, you do not have to repeat work if the build fails halfway and you need to make some changes before running it again.
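For (2), pointing builds at a remote cache is a couple of lines of .bazelrc (the endpoint is a placeholder):

    build --remote_cache=https://bazel-cache.example.com
    build --remote_upload_local_results=true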
I use Makefiles anywhere I can fit them; they're a brief respite from YAML hell. This issue was solved in 1976 -- I appreciate there's a lot of VC money in reinventing the wheel (and coming full circle), but I digress.
I too use Make everywhere, but what I would give for an improved tool with better syntax and composability that was just as universally deployed. Sadly, Make is good enough, so we shall suffer forever.
This, so hard. I like to think of the make targets (build, test, install, etc.) as an API that should be consistent across repos. This really helps with cross-team collaboration. The details of how these tasks happen are free to change at will without the need to “distribute” those changes to developers. There’s no disruption to anyone’s flow. Plus, with a little documentation, onboarding new developers is so much simpler.
I’ve run into this with overly complicated Jenkins pipeline files as well. I think the root cause is just that a single-entry-point pipeline is boring -- everyone wants a CI config that sets statuses and posts results and does things in parallel and interacts with plugins, and every one of those steps is at least semi-unique to the CI execution environment.
I think the method you describe is still absolutely how it should be, but these types of interactions are why there’s gravity in the other direction.
>every one of those steps is something that is at least semi unique to the CI execution environment.
Apart from triggers and environment setup, none of those things have to be unique.
I often push complex CI logic out of YAML and into code, where it is more easily debugged and I don’t have to scratch my head figuring out how to use conditionals. Sending Slack messages should always be in code, IMHO.
You should still have a Makefile that calls your shell scripts when "make" or "make test" is run. Everyone who writes shell scripts picks different filenames and arguments; "make" and "make test" are always the same everywhere.
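i.e. the Makefile is just a thin, uniform façade; the script names vary per repo, the targets don't:

    all:
        ./scripts/do_the_build.sh      # whatever this repo happens to call it

    test: all
        ./scripts/run_the_tests.sh --verbose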
Forget make -- just put it all in a Docker container and let the container be your CI. That's what the tool linked in the post does, and if you have a Docker container you can run it unmodified in just about any CI system.
With Actions you can run multiple tasks in parallel, restart failed portions without retrying the whole CI run, and run certain sections conditionally depending on which branch you are on (different tasks for a commit to master vs. a feature branch).
> With Actions you can run multiple tasks in parallel,
If you mean "I want to run one build with the foo feature and one build with the bar feature. Actions lets me run those in parallel", then that is the "strategy" part of the workflow, not the "tasks" part. My comment was about the latter. ("and make your GH Actions run `make` in a script task.")
If you mean "I want to run two steps of a job in parallel and then run the rest of the job after they're complete", then shell is a much simpler DSL for that. Running things in parallel is literally a single `&` character.
> restart failed portions without retrying the whole CI run, and run certain sections conditionally depending on which branch you are on (different tasks for a commit to master vs. a feature branch)
> Running things in parallel is literally a single `&` character.
Yes, now try waiting for all the parallel tasks and erroring out if at least one of them fails. And try separating their output so it doesn't get interleaved into a big mess. And try hiding, by default, the output of commands that succeeded, showing only those that failed, except when the user explicitly asks for more.
Your "simple" shell script now suddenly isn't so simple anymore.
That’s what GNU Parallel is for. Or Pueue, which gives you a GH-Actions-level feature set but it’s less likely to be installed on any particular machine. Pretty sure you could fetch a pueue binary at the start of an actions script and do everything that way.
(These can’t spread their tasks over multiple GH Actions runners, e.g. for multiple OSes. For that you will need the YAML. Good compilers will already work in parallel and max out however many cores they are given, so multi-machine is the main use case. Unfortunate.)
What I mean is, if you have 50,000 unit tests, writing an actual CI config lets you split them up into 20 jobs that all run at once, and if a test fails, you can retry that 1/20th of the tests instead of the whole thing.
The CI runners aren’t multi-core VMs, I believe, so you can’t just use standard shell utilities; you have to tell the CI system that you want to run multiple jobs.
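A sketch of the sharding (the `--shard` flags assume your test runner supports them; many do, none identically):

    jobs:
      test:
        runs-on: ubuntu-latest
        strategy:
          fail-fast: false       # one failed shard doesn't cancel the rest
          matrix:
            shard: [0, 1, 2, 3]  # scale up to 20
        steps:
          - uses: actions/checkout@v4
          - run: ./run-tests --shard ${{ matrix.shard }} --total-shards 4

"Re-run failed jobs" then retries only the shards that failed.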
This. All my action files are just a `make test` call which installs dependencies on-demand and most importantly, makes the process reproducible locally which is invaluable to debug.
I use this a lot. It's not perfect, but it's better than nothing. For example, last time I checked it did not support reusable workflows, and I have had a few cases where I got an error from GitHub but act worked fine. I guess it's a hard catch-up game: not only do you have to get the semantics of correct workflows right, you also want to error out on the same errors.
I really don't understand why GitHub doesn't release a local runner for GitHub Actions. Everyone I know who has worked with CI/CD wants the same thing: some way to run and debug the pipelines/workflows locally.
This is pretty cool, but I have never managed to make it work with our AWS ECR repos. There are probably some permissions I am missing, but it's not very clear from the documentation how to set that up.
Same. I've only got it to work with the most basic of actions. Anything that requires Docker didn't work for me the last time I tried it!
There's a way to download a much larger "base image" which in theory would make it work, but if I remember rightly it was something like 60GB of containers, which I was never patient enough to download. It was always quicker to just push up to GitHub and iterate that way, unfortunately.
Until (to extend your example) your container orchestration is complex enough that it, too, requires faster or less-privileged iteration than infrastructure-as-code provides, in which case you need to reimplement the paid service locally. Hopefully the paid service is then open source, or has some good-enough-for-your-needs analog like this one.
FWIW, while the above sort of recommends kubernetes-everywhere, I'm happy to make a bet on a service like AWS Fargate because I _don't_ think I need to iterate on container orchestration much (as an application developer). Something like DynamoDB, by contrast, seems quite treacherous to build atop, given how closely an application's code is likely to be tied to its primary database.
I'm a strong advocate (after migrating between various cloud CI providers' free tiers when I worked in academia) of making your build process as agnostic as possible to the CI provider. That generally means putting things into shell scripts and calling them, which can be a bit painful, and you'll never get 100% there, but it has the added benefit of being able to debug things locally if something goes wrong with the pipeline.
The first time I used GH Actions I thought there was a strong vendor lock-in element to it, which I wasn't hugely comfortable with. I'm a big fan of using GitLab CI these days, which seems to be a good trade-off between the various considerations.
I really really want to use this but it doesn't work with podman which is unfortunately a bit of a blocker. IMO every CI system should be runnable locally, with minimum effort. Otherwise you end up testing via git push and that's just an ugly development cycle.
This does make me wonder if you could create a sort of "local-first CI" where CI is just an extension of local checks. The CI would then just verify that tests which pass locally also pass on a clean machine. Obviously we don't want to run CI locally if it'd take an hour, but on the flip side, if CI takes an hour on a (typically) beefy local dev machine, it'll probably take two hours on a remote machine.
My solution to testing GitHub actions is pretty straightforward - I've created a private repository where I push and test my actions first. Then when I'm satisfied, I go ahead and create a PR on the main repository I want the action to be in.
This could've been very useful to me when setting up some cross-compilation targets in a Rust project. I burned through almost all of my 3,000 Actions minutes in no time at all and had to put my project on hiatus until the next billing period started. It's very easy to set up a nice matrix of CPU architectures and operating systems, but phew, does it churn through a lot of CPU time fast.
I did not like configuring the GH Actions YAML files at all, but in the end it works quite nicely. The ability to do MSVC Windows (x86_64-pc-windows-msvc) and macOS builds (for "free") is kinda nice.
The downside to this approach is that you have to wait longer for things to pass compared to certain actions.
For example, linting can run in seconds with an action. If you instead clone and install dependencies before linting, you're waiting much longer to find out you have more whitespace than you should.
Couldn't you create multiple jobs that aren't dependent on each other? You could still have the main build job, but also split the rest up into multiple jobs.
This is already entirely doable: just create some executable "test" program in $language_of_choice (shell script, Python, compiled C binary if you really want to) and run that in the CI. You're still going to need a wee bit of CI configuration (usually YAML) to tell it which containers to run and whatnot, but that usually isn't all that much.
Yep. CI systems offer some extra features. If you don't use them there isn't really a lock in.
Off the top of my head:
1. Parallelization.
2. Capture of build artifacts, which can also be useful for logs of non-linear complex tests such as dependencies of e2e tests.
3. Secrets for release or artifact uploads to third party repos (e.g. docker repos)
4. Caching.
There may be more. Once you sprinkle these things left and right through your CI config, it becomes hard to move to another system, even if the bulk of the actual tests you run is just "make test".
You don't really need all that much CI-specific stuff for most of that, except maybe the secrets. You will end up duplicating some CI features if you choose not to use them, but that's usually not too hard.
A long time ago I implemented my own "CI" system. The basic idea was that by putting a make wrapper in the MAKE env variable, I would intercept recursive makefile executions and spawn tasks onto a message queue. Workers would pick up the messages and perform a fast checkout from a hot git repo cache on each node (using git clone --reference). It worked like a charm, and the exact same makefiles also worked locally out of the box.
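The trick, roughly, relies on recursive rules invoking $(MAKE) as they should (`queue-submit` is a stand-in for whatever enqueues a message):

    # make-wrapper, somewhere on PATH:
    #   #!/bin/sh
    #   queue-submit --job "cd $PWD && make $*"

    # Top-level run: every recursive $(MAKE) becomes a queued task,
    # while a plain `make` still runs everything locally.
    make MAKE=make-wrapper all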
I’ve used this a bunch; I actually don’t know how else you’d write and test a GitHub action. Do people just push them and hope they work and push more commits with “fix” as the message until it works?
- write the main job in a Python/PowerShell/Bash script that runs locally,
- write a workflow that sets up the environment (AWS login), installs dependencies (using GitHub's facilities for that) and calls the script (often passing values using GitHub Secrets),
- push the workflow with a "push" trigger on a feature branch (for fast testing),
- push 30 fixup commits to fix all the small syntax errors, logic mistakes, and GHA idiosyncrasies I hadn't thought of,
- remove the push trigger,
- do an interactive rebase to squash all the fixups,
- force-push and merge.
It works pretty well and can be easily used locally or in other CI/CD systems.
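The skeleton of such a workflow, with placeholder names:

    name: deploy
    on:
      push:
        branches: [my-feature-branch]   # temporary trigger for fast testing
    jobs:
      deploy:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - run: ./scripts/deploy.sh
            env:
              AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}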
> - write the main job in a Python/PowerShell/Bash script that runs locally,
> - write a workflow that sets up the environment (AWS login), installs dependencies (using GitHub's facilities for that) and calls the script (often passing values using GitHub Secrets),
> - push the workflow with a "push" trigger on a feature branch (for fast testing),
> - push 30 fixup commits to fix all the small syntax errors, logic mistakes, and GHA idiosyncrasies I hadn't thought of,
> - remove the push trigger,
> - do an interactive rebase to squash all the fixups,
> - force-push and merge.
> It works pretty well and can be easily used locally or in other CI/CD systems.
It saddens me that such madness is considered "works pretty well".
The only way to actually run the workflow is to push the code and trigger the workflow somehow. And you can't push code without committing it since this is git.
You could write the workflow correctly the first time. Good luck. It's not one language; you're writing PowerShell scripts inside a bunch of YAML files that reference each other by relative paths and need to call installers that were never written for unattended installs. The only way I can make it work is through trial and error.
GHA really feels like it's trying to cobble together a CI/CD pipeline from a million different pieces that were designed for entirely different purposes. It works, if you spend enough weeks on it, but it will never be pleasant to configure. I can understand why they didn't even try to make it reproducible on users' workstations; it can break if basically any part of your system's configuration, filesystem, or environment deviates from the runners'.
That's obviously not entirely GitHub's fault and the service is very practical once it's set up. My advice is to depend on GHA for their useful features (caching, Secrets, parallel jobs) and do everything else in your own scripts or in Docker.
(You don't have to open a PR / the usual history rewriting consequences don't apply if you're just pushing to Actions to see how it reacts.)
But alternatively, if you find yourself doing this a lot and it's really the code of the step that you're debugging (vs. debugging how the workflow interacts with Actions itself), I try to keep my jobs' steps pretty simple:
- run: ci/thing.sh
… specifically so that I can run the step locally. It's usually much faster. (And on an MBP it doesn't incur the costs of virtualization, at the cost of needing to port to macOS. Usually worth it.)
-C <commit>, --reuse-message=<commit>
Take an existing commit object, and reuse the log message and the authorship information (including the timestamp) when creating the commit.
In the above example, commit can simply be "HEAD", when doing the amend commit:
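    git commit --amend -C HEAD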
I NEED this RIGHT NOW - awesome and thank you so much to the submitter! I'm writing my first action and there are so many facets to interacting with the runtime environment that I was testing by pushing updates. Checking my commits, I see that I committed and pushed 19 times this morning and most of them could have been avoided using act. I should also note that my Action is a bit odd since it has the local (to Docker) checkout and then calls back to the GitHub GraphQL API to make changes.
Act is truly helpful and I appreciate its development, but it has its caveats. For instance, I've had problems passing parameters to composite actions.
It’d be great if act could build its container images locally instead of downloading them through a rate-limited pipe from Docker Hub. The full-size image (~14GB) takes the better part of a day to download; I’ve never been able to wait it out on my work laptop.
That smells like a good opportunity to set up a mock HTTP server that implements just enough of the Actions API to trick the runner into executing the job as if it were a push event.
This is one of the advantages GitLab has: the ability to do this is built into gitlab-runner, so you don’t have to maintain anything; you’re using the same code GitLab itself uses.
They like to pretend that's true, but it is for sure not, at least not unless your job is so simple it could literally be a shell script or a Makefile, as others have said.
Any use of rules, cache, includes, or ... you know, real life ... gitlab-ci constructs makes `gitlab-runner exec my-job` do absolutely nothing helpful. The circleci binary did what it said on the tin and has been my favorite "debug CI builds locally" experience.
Git also has built-in support for automated stuff (git hooks). By default it is only local to each machine, but it is possible to set it up to distribute the hooks too.
Yep. And these hook shell scripts don't need to involve Node.js projects (cough, Husky) injecting scripts and executing them without your consent.
The post is about running GitHub Actions locally, though. The built-in hooks also support running server-side, though I imagine GitHub doesn't offer the same level of interface over those as over its own services.
It was my comment you replied to. I meant running scripts locally, before pushing/pulling/committing etc. The distribution part is just that those scripts live by default in a folder that is not tracked by git (.git/hooks), but you can change that in your config file. Hooks can run any executable script file, so there aren't really any limits, outside of convenience, to what you can make them do.
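For example, to keep hooks in a tracked directory (git 2.9+):

    git config core.hooksPath .githooks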
You probably could use server-side git hooks just fine as an alternative if you self host the repo though, but I would assume if you are using github, or another hosting service, their tools are probably best/easiest/most convenient for their own platform.
Now, the asterisk to that claim is that I have had good success running GitLab locally, with local runners, such that I can do whatever crazy thing I want to the CI variables, the repo, and the branches, and can run the runners in "debug" mode (although that's usually not necessary since I can "docker exec" into them). But the idea that I can just tell my colleagues to "brew install gitlab-runner && gitlab-runner exec some-job" is for sure false.