Remote code execution in Homebrew by compromising the official Cask repository (ryotak.me)
387 points by spenvo on April 24, 2021 | hide | past | favorite | 90 comments



I feel a lot less secure now knowing there is an automated pipeline glued with yaml and ruby, that takes a stranger's code and pushes it directly to my machine.

I'm sure the Homebrew team tried hard to make it as simple and restrictive as possible. As they are doing it for free and without warranty, they are not obligated to meet any security standard. I am grateful, but I still wish it were done differently. Maybe instead of armchair complaining on HN I should donate...


For what it’s worth, Homebrew responded by entirely removing the automation, not just patching the specific vulnerability: https://brew.sh/2021/04/21/security-incident-disclosure/

To me this means they take the class of threats seriously. And in the first place, automating version bumps could actually improve availability of security hotfixes for brew-installed software, so it’s consistent with a good faith security stance.

And, to be fair, there are plenty of NPM repositories with far fewer qualms about pushing untrusted code in dependencies to dev machines...


NPM shouldn't be up for serious consideration as part of anything that needs to be secure after their series of incidents like these.

If that's the new standard, that is problematic.


I think it's less about using NPM as a good-practice model, and more just as an example of a widespread tool with worse flaws.

Not an excuse, just a comparison.


https://github.com/stripe/stripe-js strikes perhaps the most realistic balance possible, recognizing that NPM is an insecure place for their core JS logic that creates a PCI compliant iFrame, and so their NPM package is just a loader for a script tag hosted securely. And yet they encourage people to use NPM for the wrapper itself. Which is just as vulnerable to supply chain attacks as anything else on NPM. If this isn't tacit acknowledgement of a "new standard" I don't know what is. I absolutely agree that it's problematic.


This also makes the default auto-update behaviour of Homebrew itself a double-edged sword. On the one hand it ensures people run into fewer issues because Homebrew is always up to date, but on the other hand an RCE like this is propagated to every user automatically within a very short time.


I'm not sure yaml and ruby have anything to do with it. This is a common risk with all centralized package management systems. It's ultimately a tradeoff of trusting a third party for the speed and simplicity. The alternative is building everything from source, or hashing your own binaries. If anything, this is on Apple for the inexplicable fact that they haven't implemented native package management in macOS yet.


Well, if they hadn't made the trade-off between security (manual pull request review) and ease of maintenance (automatic pull request merge), this wouldn't have happened.

If instead of parsing git diffs with Ruby they had used libgit (or something equally battle-tested), this wouldn't have happened.

If Apple provided a more capable package manager, this wouldn’t have happened.

If the millions of customers of homebrew (as am I) gave them even a fraction of the monetary resources they need to build such a massive project, this wouldn’t have happened.

As you say, there is no single responsible party for the vulnerability. But it is an interesting case to ponder. If, instead of being responsibly disclosed, it had been exploited by a malicious party, this could've been very ugly. I hope it gets noticed by the community and better measures are taken.


To be fair, if Apple provided a more capable package manager, you wouldn’t be allowed to publish an HTTP library to it for fear it may make requests to lewd images.


I wish Apple would provide a package manager... but just like with Linux we’d have to rely on PPAs. If you need to control how / when you update key packages (e.g., you need the latest PHP 7.4 and not 8, or node 12 instead of 15), you can’t use the official distribution... and then you’re still relying on a third party who might not have good security practices.


Macports is a package manager that gives you that control, including having multiple versions of packages installed.


Really this is what Docker is all about. If you have environments that need certain preconditions then wrap them up in a container so they don’t break when the system updates.


Well, it's also just bad engineering tbh. What they're trying to do here is take a workflow intended for humans to make arbitrary code changes and other humans to review those changes, then restrict it to be a workflow for machines to update two keys in a database. Why not just introduce a real API for people to submit new versions instead of hacking one together from tools never meant for this use case?

Honestly I use brew and love it but the decision to make everything git and ruby is super hacky. And fundamental architectural decisions like that aren't something you can fix with donations.


Yep, it's a founding decision: that way they can use free git/GitHub infrastructure to run their entire backend, and someone else pays for it. To do what you describe would require hosting something outside of that, which would cost $.

Reminds me of at least one GitHub engineering blog post where they reference the Homebrew repo(s) as pathological examples: repos that cause them pain to operate effectively because they just don't 'look' like almost anything else.

It's what happens when you build your entire architecture around someone else's free-tier / public service.


would be interested to see that blog post if you happen to have a link, couldn’t find it googling “github engineering blog homebrew”


Not brew, but a very similar set of issues was faced by the GitHub team with the CocoaPods project, which at the time worked similarly to Homebrew in that they used GitHub as a CDN/host in a somewhat uncommon way:

https://blog.cocoapods.org/Master-Spec-Repo-Rate-Limiting-Po...

https://github.com/CocoaPods/CocoaPods/issues/4989#issuecomm...


I can't find it either, but the post I remember was from someone from Homebrew. IIRC, Homebrew used to do a shallow clone (the first time it ran each day, or something like that) of the repo with all the taps and casks, a big old git repo. I think GitHub asked them to change Homebrew, and they fixed it so that it uses regular git commands instead of trying to get tricky with a shallow clone.


Maybe, but this would require even more work, time and infrastructure.


I'd rather call it smart engineering that makes it easy for anyone to contribute. `brew bump-formula-pr` etc. are huge drivers behind Homebrew having among the largest number of contributors on GitHub.

It would help them a lot more if users like us contributed and volunteered to help maintain it. It's crazy that cask has over 100,000 PRs and gets a few hundred every day. For a team of just 20-30, that's insanely hard to manage.

Edit: spelling


They could have a convenient command line submission process for new versions without requiring every such change to be an actual PR. A simple web server that takes a POST request and converts it into a PR behind the scenes would be simpler for users as well, as there's no need then to clutter up your own github account with forks of their repo (or even to have a GH account in the first place).
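A hedged Ruby sketch of the input-validation half of such a service (the function name, parameter names, and regexes are mine, purely illustrative, not any real Homebrew API): accept only a cask name, a version string, and a sha256, rejecting anything free-form, and let the server construct the branch and PR itself behind the scenes.

```ruby
# Hypothetical validation for a "submit a version bump" endpoint.
# Only three tightly constrained fields are accepted; anything that
# could smuggle code (interpolation, quotes, shell metacharacters)
# fails the whitelist and is rejected before touching git.
def valid_bump?(params)
  params["cask"].to_s.match?(/\A[a-z0-9-]+\z/) &&
    params["version"].to_s.match?(/\A[A-Za-z0-9._-]+\z/) &&
    params["sha256"].to_s.match?(/\A[0-9a-f]{64}\z/)
end
```

The point of the sketch is that a purpose-built API can validate structured fields up front, instead of trying to recognize "safe" changes inside an arbitrary diff after the fact.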


I imagine that having an additional server for a (somewhat understaffed) project run by volunteers introduces additional costs and additional maintenance burden.

The current process is already convenient and all you need is a GitHub account. Having the account adds some identity or semblance of it at the very least. Without that? And with an additional server? Seems like a much wider attack surface compared to what's there now.


That's what their GH Action effectively _is_. They just used a poor choice of tool. They could write a program and host it on a web server and it still might screw up.


The exploit isn't a general risk of package management, but a weakness in scripted pull request handling.

GitHub is the problem, and not for the first time (remember the Homakov exploit?).

Homebrew is quite similar to FreeBSD ports, which do not have these issues. Incidentally, I find that a lot of packages in Homebrew have a very high packaging quality that puts several distributions to shame. I'm saying this as someone who had previously been suspicious of Homebrew, but then I looked at some of the packages.


The automated pipeline doesn't exist any more. More than donations I think this requires volunteering to review pull requests, as they mention in the disclosure.


Wow yikes, kudos to the great disclosure and fix process from homebrew and other involved folks.

Github Actions seems like such an easy target--the workflows people build are complex, difficult to audit, and lightly tested, or never tested at all except in prod. It's just a recipe for disaster and all these recent issues (see also: https://news.ycombinator.com/item?id=26908076) make me wonder if the Github flow model of a wild west of public pull requests just isn't compatible with how people use workflows and automation in practice. No one in their right mind would leave a Jenkins server open for anyone to trigger some workflows. It's silly that everyone effectively does that because they see some Github branding on a page and feel safer.


> No one in their right mind would leave a Jenkins server open for anyone to trigger some workflows.

Open source projects have been doing exactly that for ages.

Homebrew in particular had a Jenkins server hosted at bot.brew.sh running arbitrary pull requests since 2013 before finally fully migrating to GitHub Actions in 2020.


Running arbitrary code from users in CI is one thing. Automatically merging code is another, which is what this bug exploited.

The former can be made somewhat/reasonably safe. The latter probably can't.


Yes, but that is not what this thread (the claim that GitHub Actions is somehow more vulnerable than Jenkins because no one exposes Jenkins) is about.


> Open source projects have been doing exactly that for ages.

I'm not sure it's that common. What's common is to allow people to browse the Jenkins instance, but to disable running PRs that change the workflow/build itself when they come from people outside the organization/team that manages Jenkins.


You don’t need to make changes to workflows to run arbitrary code. Projects usually run tests, and you can just put whatever you want to execute in the code being tested, or exploit other holes in the existing workflow. TFA does not make any changes to existing workflows, for instance.

Also, you can disable workflow changes in GitHub Actions with the pull_request_target event, in which case it uses the workflow definition from the target branch.


The pull_request_target event however allows for read/write access to the base repo and access to all of its secrets. It's not the event type you'd want to use whenever dealing with code that's incoming in a PR.

See the big warning at https://docs.github.com/en/actions/reference/events-that-tri....


That’s not true at all. Unless you expose the token or secrets in steps where you run untrusted code, the token and secrets are safe. This is explicitly designed for untrusted pull requests on public repos, solving the problem of, say, adding labels with the write token, or deploying a preview build to Netlify, while keeping the secrets safe from arbitrary code execution.
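For illustration, a minimal sketch of that pattern (the file name, workflow name, label, and the actions/github-script version are assumptions, not anything from the article): with pull_request_target, the workflow definition comes from the base branch, and as long as no step checks out or executes the PR's code, the write token never touches untrusted code.

```yaml
# .github/workflows/label.yml  (hypothetical sketch)
name: label-prs
on: pull_request_target   # workflow definition is taken from the base branch
jobs:
  label:
    runs-on: ubuntu-latest
    steps:
      # Deliberately no actions/checkout of the PR head:
      # the untrusted code from the PR is never executed here.
      - uses: actions/github-script@v6
        with:
          script: |
            await github.rest.issues.addLabels({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.payload.pull_request.number,
              labels: ["triage"],
            })
```

The failure mode to avoid is mixing this event with a checkout-and-run of the PR's code in the same job, which would hand the write token to the attacker.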


Not only this, but public Jenkins instances often have their shells open to the public!


This is exactly why on a project I recently set up CI on, it is only run when a collaborator gives an approving review, or something is pushed directly to master.

I saw far too much of an attack surface from Github actions, especially with MSBuild.



The big thing I see is that it was only three hours from security report receipt to primary mitigation in place. That’s a very quick turnaround for a volunteer open source project.

Kudos to the Homebrew team for running a disclosure program to find risks like this, and for their speedy mitigation!


Thanks for the kind words, they are surprisingly motivating (and rare).

I was the person who responded to this at the weekend while my youngest kid was having a nap. I'm glad to see people recognise that this sort of turnaround on a volunteer-run project has a human cost and doesn't come automatically.


They were conducting the program (with help from HackerOne) so they were on standby.


I think this article, and the discussion about it, are outstanding.

I've been a casual Homebrew user for several years. It helps me set up an environment where it's relatively easy to access and work on my projects that involve lightweight software development, as opposed to lightweight ops, which is probably my strength.

Working solo, I don't have the bandwidth to understand the details of everything I touch. I truly appreciate that there are people in the community who like to examine deeper technical issues with package management and also have the communications skills to explain it so people like me can understand.


I started doing all my development inside a virtual machine. UTM on the M1 chip running Ubuntu ARM works very well and has minimal impact on battery life. The VS Code remote extension makes development inside the VM feel like it's local.


What I dislike in the VM approach is that the development machine and the host are somewhat disconnected and it's non-trivial for them to share files. Also, I'm concerned I'll write software with too many linuxisms that would render it incompatible with other Unixes (which are still used in far too many places). With my tooling, I even try hard to make it work on Windows, just in case.


That's why I use Parallels Desktop, even though I would prefer to use open source software. They provide great file system integration on the common OSes plus a few other things like GPU graphics (I tried with Windows 10 ARM).


If you share $HOME with your VM (which is apparently the default in Parallels), you're not gaining much in terms of security. https://zerodayengineering.com/blog/dont-share-your-home.htm...


So you run a Linux box using Parallels and use VS Code or something on the Mac side to edit files? I'm interested in adopting a similar workflow, so your guidance would be appreciated.


> and it's non-trivial for them to share files.

My approach is to have a separate dropbox and GitHub account for sharing, that both machines have access to.


You may want to look into Vagrant. It automates the setup of the VMs (and the sharing of files between them) so you can quickly test between say Ubuntu and FreeBSD.


What's the story with MacPorts these days? When I used macOS I preferred it over alternatives such as brew, and I was under the impression it was used by, and had contributors from, Apple as well.


There is no story, really. It just works. Combined with virtualenvwrapper it's my daily driver. Most of my colleagues went with the pyenv route and suffer for it.

In reality, there is no real reason why you couldn't have both (I do) and some things I install under brew, while others (python versions, for instance) I install under MacPorts.


Wait, you can have Brew and MacPorts for the same user? I read that they can't coexist due to how Brew and MacPorts handle the user folder. BTW I never tried this; I just got this MacBook Air M1 a few months ago, and I'm fearful that I will brick it.


MacPorts is fine (a coworker is using it daily). There's also pkgsrc and nixpkgs (I use the latter daily), which are great.

That kind of vulnerability being a real possibility is one of the main drivers behind ArchMac. Although I did contribute to Homebrew, I always found it way too complex and brittle for its own good (no offense; much of it is a design choice, it's just my personal opinion and why I steer clear of it).


Macports works fine. There is no major issue with it.

It was more or less blessed by Apple at some point (using Apple infrastructure for hosting and having Apple employees contributing such as Jordan Hubbard), but I don’t think they are still involved now.


Apple's macOSforge site still lists it: https://www.macosforge.org

It's boring technology that works well, which is what you want in a package manager.


Using it since around 2006. Boring old tech that works.


Why use anything else?


Why do they even use a feature like automerge without approval? I feel like homebrew needs a solid dose of KISS (keep it simple, stupid)


Automerge is one thing; it's the implementation that relied on a pure text hack. Matching text is a hard problem, and regex is not a solution but a mere tool that translates one problem into another.

Actually, it only creates more problems:

    Some people, when confronted with a problem, think
    "I know, I'll use regular expressions."
    Now they have two problems


To reduce the amount of effort by maintainers.


> Why do they even use a feature like automerge

I think so too. On the other hand that goes in line with all the CI/CD/CX craze - which involves commonly accepted complexity in its own way.

> without approval

Probably that also shows that the project might need more contributors. To me it's a core component of every macOS install but actually a voluntary project which is an unusual combination.


Ignoring the whole homebrew side of it.

The article itself is a pretty great read. It explains the technical issue well. But additionally it shows the experience of the author and their process to find and exploit the issue.


> After that, I received "I can't add a test cask just for this but you could try to make a harmless modification to an existing cask perhaps?" from the staff. Therefore, I chose a random cask and decided to make harmless changes.

Yikes. And indeed this “harmless modification” did leak into a user-facing change.

Seems like questionable judgement from the brew security team here. Surely they should have a staging/test cask for hackers to attack. The lack of concern at pointing a hacker to make a real change to an arbitrary cask is worrying.


It is likely that you'd have seen it anyway, as `brew update` loads casks whether or not they're installed. Note that this happened only to people who tapped the cask repo.

Turning off the auto-update is strongly discouraged: you don't get security updates for the software you install, and you may experience issues with Homebrew. Chances are you'd have run `brew update` before installing/upgrading packages anyway.

The best outcome here is that auto-reviewing was turned off, and I'm glad for it as a daily user. Perhaps a much larger change (for Homebrew 4.0?) would be to avoid loading these .rb files on update, and instead use an online cache/API like formulae.brew.sh.


This is a good lesson in why you should almost never use regex for parsing and validation.


What would you have used in this case?


I would have used a library that parsed the diff properly - or better yet one that applied the diff algorithm natively without going through a patch file format. E.g. something like this in Rust: https://github.com/utkarshkukreti/diff.rs (there are similar libraries for other languages).
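The comment points at a Rust crate, but the same idea can be sketched in Ruby (the function name, whitelist regex, and overall shape here are mine, not Homebrew's actual code): walk the unified diff line by line and accept it only if every changed line is a version or sha256 bump, rather than regex-matching the diff as one blob.

```ruby
# Hypothetical whitelist: a changed line may only set `version` or
# `sha256` to a plain token. Interpolation like #{...} can't match this.
ALLOWED_CHANGE = /\A[+-]\s*(version|sha256)\s+"[A-Za-z0-9._-]+"\s*\z/

def version_bump_only?(diff_text)
  diff_text.each_line.all? do |raw|
    line = raw.chomp
    case line
    when /\A(diff |index |--- |\+\+\+ |@@ )/ then true  # diff metadata
    when /\A[+-]/ then line.match?(ALLOWED_CHANGE)      # changed lines
    else true                                           # unchanged context
    end
  end
end
```

Even this sketch is only as strong as the whitelist; the real fix is to parse the cask file itself rather than its textual diff.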


Took me a bit to figure out that it's using Ruby code like this to fool the surrounding CI scripts.

  "b/#{puts 'test';}"
That happens to run the code within the braces, despite it all being enclosed in double quotes and preceded with a #. Can a Ruby person explain what's happening here? I understand it's working as designed.

Edit: Ah, okay, so the interpolation is #{code here;}; everything else is just a string in quotes. Interesting that they use the # character to denote interpolation.


It's string interpolation where you can provide an expression, not just a variable.

> String interpolation is common in many programming languages which make heavy use of string representations of data, such as Apache Groovy, Kotlin, Perl, PHP, Python, Ruby, Scala, Swift, Tcl and most Unix shells. Two modes of literal expression are usually offered: one with interpolation enabled, the other without (termed raw string). Placeholders are usually represented by a bare or a named sigil (typically $ or %), e.g. $placeholder or %123. Expansion of the string usually occurs at run time.

https://en.wikipedia.org/wiki/String_interpolation#%3A%7E%3A...
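A tiny runnable sketch of that (variable names are illustrative): the #{...} part is an arbitrary Ruby expression, evaluated when the string literal is built, and it may contain several statements separated by semicolons, with the last one supplying the interpolated value.

```ruby
# #{...} in a double-quoted string is evaluated as ordinary Ruby code.
effects = []
path = "b/#{effects << :ran; 'Casks/foo.rb'}"  # last expression is the value

puts path  # b/Casks/foo.rb
p effects  # [:ran]
```

This is exactly why a "filename" taken from an attacker-controlled diff must never end up inside a double-quoted Ruby literal that later gets loaded.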


String interpolation. It can be used for good but it’s possible to exploit if the user picks the string. Many languages have exactly the same feature:

Python:

f"Price: {get_price()}"

JavaScript:

`Price: ${get_price()}`

Notice how both Python and JS require more than just double quotes; you have to explicitly opt in with f-strings or template literals.


Ruby is also opt-in in the sense that single quoted strings do not support interpolation, and while the language doesn’t care if you use single or double quotes, most linters will enforce single quotes as the default.
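A quick sketch of that distinction (plain Ruby, nothing Homebrew-specific), including the braceless shorthand Ruby allows for global and instance variables:

```ruby
x = 6 * 7
double = "answer: #{x}"  # double quotes: interpolated
single = 'answer: #{x}'  # single quotes: taken literally

$version = "1.0.1"
brief = "v#$version"     # braces are optional for $globals and @ivars

puts double  # answer: 42
puts single  # answer: #{x}
puts brief   # v1.0.1
```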


It was Ruby's use of the # character for interpolation that was throwing me off. Apparently you can skip the braces too, if the # precedes some types of variables.


This action by GitHub will help prevent this type of exploit, which seems like a good thing. Folks who haven’t merged a pull request before will need approvals to trigger an action: https://github.blog/2021-04-22-github-actions-update-helping...


I don't think this prevents this exploit, because it was merged by a bot (BrewTestBot in this case).

And the exploit would still work if any existing contributor created a malicious update.


The case against using homebrew mounts daily. Their poor decisions (regarding all of engineering, community/social, and privacy) are numerous enough for a non-listicle blog post at this point.

I switched to using Nix on macOS to manage packages, and it's fantastic. The nix concept in general is wonderful, and there's a lot of work put in to making it work well on the mac. I haven't missed `brew` at all.


Not at all? I just tried to install mongodb through nix, and it begins by downloading the C++ compiler, because it has to compile I don't know how many files. Homebrew was a lot quicker and lighter in that respect.


I thought of Nix while reading this thread and I'm wondering what makes it unique here? As a daily NixOS user I get that it is better, but I don't know the specifics. The nixpkgs repo is superficially similar to Homebrew (lots of people submitting packages, running on GitHub, automation around commits).

What are the differences wrt to security?

1. Its language, Nix, is limited in scope?

2. No automated PR merge workflows (yet)?

3. Better community/engineering/security?


Well, a very simple answer is that homebrew embeds nonconsensual spyware into the brew tool itself, and nix does not. For me, "doesn't exfiltrate my private data to Google in the default config" is an important security benefit of nix over homebrew.

The longer answer is about the inherent benefits of the nix way of doing things; it is a horse of a different color compared to all other package managers I've seen or heard about. It is a different installation paradigm, and the nix documentation (and many blog posts) do a better job of describing its main differences than I can here.

Deterministic builds as a first class feature is probably the shortest summary. Being able to reference an entire and exact hash tree of deps is hugely valuable.


It does warn about it before sending anything though https://github.com/Homebrew/brew/blob/3e0f14083aa983c136a375...


Analytics are an invaluable resource for a volunteer-run project. To their credit, they issue a noticeable warning with the command to turn it off. It seems to me your issue is more about using Google Analytics - if there's a better alternative that is sustainable (read: free and doesn't require much effort to maintain) that should be suggested.


My issue is that the data is exfiltrated without consent. It literally does not matter where it goes if private data is transmitted without consent.

With consent, it's fine to send it anywhere they like.

Their analytics are also unauthenticated. I sort of want to make the first letter of each package in the top packages ranking spell out something funny.


Fine on the unauthenticated part. But about private data - it's literally anonymous and just counting the number of installs of a package and the number of build failures. All the data they collect (and the code that handles it) is public. They can't even isolate individual users because no individual data is collected.


It's not anonymous by any stretch, that's a false claim.

It includes a persistent unique identifier, generated on install, a sort of Homebrew supercookie for tracking you across months or years until you wipe your OS install.

It also includes the client IP address (because you can't make an HTTP connection without transmitting that). The fact that Homebrew doesn't see this information doesn't mean it's not transmitted (to Google). This leaks your city-level location.

These two things permit Google to assemble a rough location tracklog based on client IP geolocation (correlated via the unique install ID), along with which packages you installed. Google, as we know, spies for the US federal government.

You're missing the point here, though: even if it were totally anonymous (which, as I've pointed out, it's not): it's still unethical malware even in that case because it's private data transmitted without consent. The fact that you don't consider your usage data private is fine; others do and transmitting that from their systems without consent is abuse.

I mentioned that it's unauthenticated to point out that there's literally nothing stopping anyone on the whole internet from polluting the dataset with whatever bogus information they want. I wouldn't undertake this myself, but it's entirely in-bounds for an organization that feels entitled to co-opt your private computer to conduct surveillance on you without your consent. It's a public API, after all.


We’re looking into other analytics platforms. If you happen to know a good replacement with better privacy and similar features, we’d be happy to review PRs.

Please mind that knowing unique installs will remain important for us though. We have zero interest in tracking people but we do need meaningful install counts. Those numbers have been super helpful for making decisions, for example which packages are worth maintainers’ time and which aren’t.


I don't care at all about how valuable the stolen data you've obtained by violating people's consent is to you.

You're shipping unethical malware. Making the data more private, switching analytics platforms, doing IP scrubbing... it doesn't change the ethical issue at all. You're stealing data without the consent of the subject/generator of that data.

What you're doing is illegal in medicine, for example.


Nix doesn't even install on my Mac, for reasons I don't understand.

> Creating a Nix Store volume... error: refusing to create Nix store volume because the boot volume is FileVault encrypted, but encryption-at-rest is not available. Manually create a volume for the store and re-run this script. See https://nixos.org/nix/manual/#sect-macos-installation

The provided link doesn’t really tell me what to do...? Why are there so many (complicated) ways to install this thing, and why can’t it do it automatically?


I managed to get it running, but what it does is create an extra volume and a mount entry for it. If your boot volume can't be altered by the script, you have to do it by hand, and that probably requires booting in maintenance mode. The reason behind this is that they chose /nix as their store, and it's apparently hard-coded all over the place, and macOS 11 has locked /, so you can't have a simple directory there anymore.

They do explain why they chose this option, and provide you with others (each with its own disadvantages). Nix seems to be the arch of package managers.


Based on the Nix documentation, the macOS situation does not look as positive as this comment describes:

- if you have an M1 mac, it does not work (https://github.com/NixOS/nixpkgs/issues/95903)

- if you have an Intel Mac, when you install it you have to add the "--darwin-use-unencrypted-nix-store-volume" switch to the installer, otherwise it does not work (https://nixos.org/manual/nix/stable/#sect-macos-installation). I do not want unencrypted things on my computer. (And yes, there is a long section of the documentation about other approaches with other tradeoffs, and if you have a T2 chip then maybe it is OK. But why isn't the installer detecting this automatically? The point is, with Homebrew there is no such problem.)


Thanks for mentioning nix. I didn't know it had a macOS variant. I've been meaning to learn the whole nix ecosystem but hadn't found the energy yet.


Nix pills might be a good start if you want to understand how all the pieces fit together. https://nixos.org/guides/nix-pills.

There is nix, the language/package manager which can be used standalone; nixpkgs, the ports or homebrew like repo of packages that can be built/run on many systems including macOS; and nixos, the Linux+Nix distro that puts it all together.


Thanks. I've already installed nix/nix-darwin and tried installing wireshark, but it isn't working for some reason. I have some reading to do.


Had to happen someday


(Weird, I posted that yesterday, thought there was duplicate detection... but good that it gets discussed now.

https://news.ycombinator.com/item?id=26916854 )



