Backdoored images downloaded 5M times removed from Docker Hub

zimmerfrei · on June 15, 2018

On the same topic, PyPI has recently moved to a new backend, and in the process all end-to-end PGP signatures (created by the package owner upstream, proving that no tampering happened on the online servers) have disappeared from the UI, and that is seen as a "feature":

https://github.com/pypa/warehouse/issues/3356

You can still get them through some obscure API and you still need to know the right PGP key for verification, but this really signals the lack of consensus and awareness on the path toward a secure software supply chain.

EDIT: typos

egjerlow · on June 15, 2018

FWIW, here is a blog post by dstufft which might help contextualize this behaviour: https://caremad.io/posts/2013/07/packaging-signing-not-holy-...

msl09 · on June 15, 2018

That is discussed extensively in the issues related to the OP. The problem is that package maintainers of distros actually check whether the GPG signature has changed in order to repackaged python projects for their distros.

jonathonf · on June 15, 2018

Why are distros packaging from PyPI and not from the upstream project?

jwilk · on June 16, 2018

Because that's where upstream puts release tarballs.

x1798DE · on June 15, 2018

They can still do that, it's just not exposed in the UI anymore.

msl09 · on June 15, 2018

I have tried checking the REST API[1] but I only found a has_sig parameter. Where is the actual signature?

https://warehouse.readthedocs.io/api-reference/json/

Sean1708 · on June 15, 2018

It's not particularly obvious, but you find a release for which `has_sig` is true then you take the URL from that release and append `.asc` to that URL.

  $ curl -s $(curl -s https://pypi.org/pypi/cryptography/json | jq '.releases["2.2.2"][] | select(.has_sig) | .url' | sed -e 's/^"//' -e 's/"$//').asc
  -----BEGIN PGP SIGNATURE-----
  
  iQEzBAABCAAdFiEEBf2foWz3VzUNkaVgI1rl8Sn57ZgFAlq6dNgACgkQI1rl8Sn5
  7Zg0Ygf/WzulfXom9qdbCHrUJh2xkTxPqK2/SUqDqOQ1OdKJm+MxDBcMhwrCdBDh
  8+eXyPTLnnhPUcCSqVFcJeUu9KyKB2MhKi7gdBUHrDxjbufexxPC+L/KwjOq3nod
  gL4OPHGGeX2ZgSlwFPR4zPIIheUmf9kPX88qtW8DD8zmuyhci6ibac9a/3fHkDVt
  H27B+aqs+WObMjcfwZV7gMnRbZwUOBZvVFRxwfMHVuMpfbwhQC8HdBK74XKNaoTd
  Golmpa5fqRm1sNquBz9YRVElWuw1qj1CZJhRBuR7V5xyPLX8J7EVUrYa70/fVtfr
  hW7oAlNbMFYb58hGC9K20v6WX8XT2w==
  =zox2
  -----END PGP SIGNATURE-----

At least that's what I was able to piece together from the docs...

dralley · on June 15, 2018

On the package?

ddevault · on June 15, 2018

They've also repeatedly broken having predictable tarball download URLs, which makes it harder still to make Python packages for distros, and dismissed it as by design. Package managers shouldn't have to implement Python-specific API adapters just to find tarballs and signatures. The Warehouse team seems more concerned with a pretty UI than a working platform.

jwilk · on June 16, 2018

https://github.com/pypa/pypi-legacy/issues/438

There's a redirector that is supposed to provide stable URLs, although IME it doesn't work immediately after upload, which is when I need it most. :-/

llampx · on June 15, 2018

It is the appification of software development.

evancordell · on June 15, 2018

I’m not sure what the implementation status is, but PEP 458 and 480 define how to integrate TUF with PyPI. It could be that pgp is being de-emphasized in favor of TUF?

(A quick search couldn’t tell me the integration status or if there are still plans to do so, but I’m familiar with the pypi plans from the TUF side)

TekMol · on June 15, 2018

I wonder how much malicious code like this is doing its work deep down in the endless pyramid of npm dependencies.

And how much as-of-now clean code will turn into malicious code when bad guys take over npm repos in the future.

It might be possible to tackle this issue by some intelligent trust algo that combines a trust rank similar to google-page-rank and signed messages.

Say somebody pushes an update to their repo. Now the first user of it might read it and sign it with 'Looks OK /Joe'. And the next user sees the signed message by Joe in some kind of package-review-message list. Based on all the reviews and the trust of the reviewers, they then can calculate a trust score for the update.

onion2k · on June 15, 2018

...or CPAN modules, composer libs, cocoa pods, etc. Anything you use to install unchecked external code is potentially dangerous.

At least npm has auditing now, so checking for problems is relatively trivial. GitHub even does it automatically.

dvfjsdhgfv · on June 15, 2018

Well, it's much, much easier for me to audit a CPAN module than a Docker image - the latter is practically impossible.

raesene9 · on June 15, 2018

eh?

Assuming automated builds, you can just read the dockerfile (and it's hierarchy if necessary), it's less complex syntax than perl by a long way.

Even assuming no automated build, all the information is in the manifest, and using something like portainer it's pretty easy to read.

heinrichhartman · on June 15, 2018

> you can just read the dockerfile

For many images, no Dockerfile is provided. And even if there is, it's not clear that it has been used to produce the software. I tried many times to reproduce images from Dockerfiles and failed: What `apt get update` does, depends on the current time and your network configuration. You can never be suer what you get from DockerHub.

My strategy has been to copy the Dockerfiles and run `docker build` myself, instead of using the image files. Alas it's does not work very well.

raesene9 · on June 15, 2018

So it should generally be possible to go from image --> dockerfile as the information is included in the manifest. If you save an image as a .tar.gz you can extract the info from there or there are also tools to reverse engineer Dockerfiles from images

https://samaritan.ai/blog/reversing-docker-images-into-docke...

That is a good point about lack of reproducability though. I suppose an interesting attack would be to deliberately forge information in the manifest before pushing to Docker hub...

I also tend to use anything other than the official images as "inspiration" and re-implement myself.

cyphar · on June 16, 2018

> I suppose an interesting attack would be to deliberately forge information in the manifest before pushing to Docker hub...

"Forge" is a bit of a strong word. The main reason Docker included build information in the image history was so that build caching would work (from memory). It's not meant to be an authoritative source of information about how an image was built at all (not to mention that "RUN apt-get update" tells you almost nothing about what packages were installed, when they were updated, etc).

Personally I think that the current trend of using Dockerfiles is completely broken in terms of reproduciblity and introspectability (I'm working on a blog post to better explain the issue).

palotasb · on June 15, 2018

Don't a lot of OS images start by importing a non-transparent prebuilt tarball containing nontrivial binaries? I would hide the malware inside those.

ownagefool · on June 15, 2018

Sure, but so could the OS or app distributors. You need to establish a baseline of trust somewhere, and this will likely be on the official (or your cherry picked official) images, and you build from there.

raesene9 · on June 15, 2018

The official images are made by the OS project themselves, so if you can get malware in there, you can just get your malware into debian/CentOS/alpine and not even bother with Docker hub :)

onion2k · on June 15, 2018

They do, but you can easily see what's being used as a base image. If it's something odd then you should think twice about using it.

csomar · on June 16, 2018

How? The Dockerfile downloads a bunch of files/programs that it is not clear what they are? How are you going to automatically catch these?

Patrick_Devine · on June 15, 2018

You can use the `docker trust` to verify which key (if any) was used to sign a given docker image. The `docker history` command will also give you a list of each of the layers as well as the command which was used to create the layer.

MPSimmons · on June 15, 2018

It's trivial to insert software into the build pipeline without being noticed.

Run your own devpi server, build a compromised version of any dependency you want, and you can install whatever you would like, with no sign of it in the Dockerfile.

2sk21 · on June 15, 2018

Between pip, npm and DockerHub, we seem to be inviting all kinds of lurking security holes. What if the code was used in an embedded system - probably never to be updated again.

moviuro · on June 15, 2018

https://news.ycombinator.com/item?id=16084575 and article https://hackernoon.com/im-harvesting-credit-card-numbers-and...

contravariant · on June 15, 2018

Seems like we need some kind of inter-repository git. Allowing for signed commits and version control between repositories.

cat199 · on June 15, 2018

You get this type of security inherently via most os-level packagers which check upstream tarballs especially the ports-like ones which are primarily designed for from-source usage.

Unfortunately there has been a trend to ignore these and install directly of late, especially for dynamic language modules.

progx · on June 15, 2018

Or in Windows, or in Linux, or in Android, or in IOS, or in OSX, ....

npm (and i think the other repos too) implement more and more security tools and procedures to reduce the risk. But 100% security is impossible.

TekMol · on June 15, 2018

    Or in Windows, or in Linux, or in Android,
    or in IOS, or in OSX

None of these let random users upload stuff to their repos.

progx · on June 15, 2018

But they have many developers. And one developer could insert code which works and passes tests, but in Background start someday a download and start monero mining.

The point is: you have to trust the developers at this companies, that they not do anything bad and that the companies check every code of every dev. Trust.

jinglebells · on June 15, 2018

Tools like Snyk are doing good work to try and solve this problem.

https://snyk.io/

2sk21 · on June 15, 2018

This looks promising but I hope that _Snyk_ itself is not compromised. And of course, http://wiki.c2.com/?TheKenThompsonHack

ihsw2 · on June 15, 2018

npm packages are interesting in that they run both in-browser and in-server -- it stands to reason that the in-browser attack surface is far larger and its impact would be rarely noticed by users (since they're already accustomed to websites taking up inordinate amount of resources.)

fpgaminer · on June 15, 2018

Somewhat related, since this is about Docker security: I started looking at Traefik today. It's a reverse proxy that runs as a Docker container and automagically configures itself to expose your other services (that are also running in Docker containers).

Neat idea. However, to accomplish this you have to mount the docker socket into Traefik's container...

Which means that when a bug shows up in Traefik attackers can pivot out of the container and onto the host; access to the docker socket is equivalent to root on the host.

And of course Traefik is the thing you're exposing directly to the internet.

It's like giving the guards outside manning your castle's gate the skeleton key to the rest of the castle.

Of course, Traefik is quickly becoming popular because of its simplicity. But to achieve this simplicity it carves a giant hole in the security of your application.

rollcat · on June 15, 2018

Two things... it's about time we've had an official permission system for the docker API, so I can grant "inspect running containers" and nothing else and sleep at night; two - it should be possible to run traefik in a pod as two containers - one talking to the API's and tweaking the runtime config, the other serving public traffic (jwilder/nginx-proxy can do this!). It's called privsep and OpenBSD have been doing it since forever - check their remote hole count, it's still two in a lifetime.

CGamesPlay · on June 15, 2018

jwilder/nginx-proxy has been instructing users to only grant read-only access to the docker daemon for as long as I've been using it, so I know this is at least possible. It's not fine-grained, but it is read-only access. https://github.com/jwilder/nginx-proxy

fpgaminer · on June 15, 2018

According to this: https://raesene.github.io/blog/2016/03/06/The-Dangers-Of-Doc...

mounting the socket readonly doesn't help.

CGamesPlay · on June 18, 2018

Those are definitely good points. I'm curious about the part at the end: "An attacker with ro access to the socket can still create another container..." How is that possible with readonly socket access?

gnur · on June 15, 2018

The integration of Traefik with the Docker daemon should mainly just be used while developing (imho). Once you get to acceptance / production environments, you are very unlikely to run plain docker containers, if you use kubernetes you interface Traefik with the kubernetes api itself, and the service account you create for Traefik can be (and should be) completely read only. Same for Docker Swarm, Marathon, Consul and AWS ECS.

So no, Traefik is not the big security problem you make it out to be.

Sorry to be so harsh, but Traefik is one of the most amazin pieces of software I have come across in the last years that has seriously made my life much easier.

jakobegger · on June 15, 2018

If software has an insecure mode "just for development" that absolutely shouldn't be used in production, you can be certain that a large fraction of developers will use that in production nevertheless.

Security today doesn't mean that you are safe if you do everything according to best practices and follow the docs. Modern Security includes making sure that default settings are safe, and that it should be impossible or hard to set up the software in an insecure manner.

If you make it easy to shoot yourself in the foot, that's what people will do.

raverbashing · on June 15, 2018

> If you make it easy to shoot yourself in the foot, that's what people will do.

Yes they will. Just repeating it here because it bears repeating.

Yes, they don't care. And yes, there will be an attacker trying to exploit it.

gnur · on June 15, 2018

To be fair, this is a limitation of Docker, there is nothing Traefik can do about it.

heavenlyblue · on June 15, 2018

Well, then let's allow the evolution take it's place.

nemothekid · on June 15, 2018

If you use Traefik with Docker Swarm, the official docs[1] recommend you mount /var/run/docker.sock.

Sure its not a problem if you use k8s, mesos, consul, or any of the other schedulers, but the security gap is still there.

[1] https://docs.traefik.io/user-guide/swarm-mode/#deploy-trfik

styfle · on June 15, 2018

A couple years ago, I wrote a container than periodically polls the Docker Socket to check for currently running containers with exposed ports and a special label applied.

It then iterates over those containers and writes a nginx.conf file to a shared volume, then sends a SIGHUP signal to another container running nginx as a reverse proxy to the containers.

The "polling job" container doesn't expose any ports and is not reachable from the outside world and the only input into this program is reading data from the Docker Socket.

Do you think this is still vulnerable to attacks like Traefik is or does this 2-container routing protect against the attacks you're thinking of?

fpgaminer · on June 15, 2018

Very nice. Seems airtight to me.

As others have suggested, Traefik should really be doing something similar (or Docker should add ACLs to its API).

LeoPanthera · on June 15, 2018

This bug: https://github.com/docker/hub-feedback/issues/1121

raised over a year ago(!) is really interesting. It seems like many of the downloads may have been malicious - the author of the malicious images was scanning for open docker api ports and then installing their own images to mine cryptocurrency.

So they're essentially using docker as a dropper. Clever, in a way.

bboreham · on June 15, 2018

In what sense is this a “backdoor”? Seems to me the code is coming through the front door, which the victims left open.

DockerHub is just the delivery mechanism.

imtringued · on June 15, 2018

I'm scratching my head at where the /mnt mount is coming from. If you're doing "docker run -v /:/mnt <sketchy_username>/mysql" then absolutely nobody can help you.

b6z · on June 15, 2018

Same for me.

From Kromtech's article I deduced that this only happens when a docker daemon (or kubernetes interface) is exposed to the Internet and an attacker uses that to download and start a docker image on the victim's host. Then they can bind mount a host directory like described and attack the host computer.

cyphar · on June 15, 2018

It should be noted that some of the reports talk about the Docker API being publicly accessible over the internet which allowed people to run containers on their machines. This is actually not the worst thing that could have happened -- having access to the Docker API gives you root access on that machine without any authentication!

(One of the ideas of rootless containers is to remove the possibility of any privileged codepath, which helps eliminate this issue.)

avip · on June 15, 2018

Docker api exposed over internet without TLS implies a head should be removed. That’s not the default config. Not what the docs recommend. Why??

djsumdog · on June 15, 2018

I don't think that's even possible. Docker doesn't let you expose the daemon over HTTP without configuring certs. I had to write an ansible script to do that, and even then I locked down my Docker port to my VPN subnet:

https://github.com/sumdog/bee2/blob/master/ansible/roles/doc...

cyphar · on June 16, 2018

    sudo dockerd -H tcp://0.0.0.0:8080

will happily start Docker with it listening on my IP address without TLS. It will print an all-caps warning, but nothing else (you don't even need to pass a --give-the-internet-root-access flag). However, I just submitted a PR which adds the --give-the-internet-root-access flag[1] because it's pretty obvious to me that very few users do this intentionally (and with full knowledge of the consequences).

[1]: https://github.com/moby/moby/pull/37299

toopsss · on June 15, 2018

What the heck are you talking about? If dockerd is started in tcp mode, it is unencrypted and unauthenticated by default.

ccnafr · on June 15, 2018

ORiginal report: https://kromtech.com/blog/security-center/cryptojacking-inva...

The ArsTechnica article, like most AT articles, glosses over most details and focuses on a small-time cryptomining campaign

djsumdog · on June 15, 2018

I don't understand why people use other people's Docker images. Unless it comes from an official repository for the tool you're using, it's better to look at the source code/Dockerfile in the github link and just roll your own.

A lot of times you're just installing the package you want with apt-get within your Dockerfile anyway; a package you can't check for normal updates for anymore since it's in a container. So now you need a tooling system around making sure your packages in your containers don't have security issues.

Docker is kinda a mess.

y4mi · on June 15, 2018

its not really a mess for its usecase.

its immutable infrastructure at its heart, so yeah, you don't do updates on containers... what you do need is periodic rebuild of your images for upgrades and each new image needs to run all integration and system tests again.

it just makes this process easier than it is without docker. But it doesnt alleviate you of writing the system that actually keep everything updated in an automated way.

It also doesnt help you deploy unless you're already experienced with docker. and while we'Re on the topic... no, if you know how to execute 'docker run -it --rm ubuntu bash' you still don't know shit about it. sigh sorry, i'm just remembering someone from work today...

ex_amazon_sde · on June 15, 2018

For those who wonder why Linux distributions are "still" around, this is a reason. Some have a good vetting process for packages.

HankB99 · on June 15, 2018

I wonder which ones have the best vetting. And if it is adequate.

I also wonder about other packaging systems. CPAN, pip (pypy?) AUR and so on. It doesn't surprise me to see this happen. I wonder what other surprises might be in any of these packages.

FWIW, I'm running mostly Debian and some Ubuntu. I always prefer to install packages via the package manager rather than directly from some tool specific repository because I'll get automatic updates and some level of testing/vetting.

crypt1d · on June 15, 2018

>By the time Docker Hub removed the images, they had received 5 million “pulls.” A wallet address included in many of the submissions showed it had mined almost 545 Monero digital coins, worth almost $90,000.

This seems incorrect because its impossible to see wallet balances on the Monero network. So I'm assuming they just came up with the numbers based on some rough calculations.

ricANNArdo · on June 15, 2018

If you go to the pool website with the address specified on the botnet you can see how much it was mined. The main article [1] linked on the news said:

> The actor has been able to mine about 630 XMR to date, which at the current USD rate is more than $172,000 for just a little more than one year of activity.

[1]: https://www.fortinet.com/blog/threat-research/yet-another-cr...

crypt1d · on June 15, 2018

that makes more sense, thanks.

jchw · on June 15, 2018

The worst part here is definitely the timeline. NPM is often criticized about security, perhaps rightfully so, but at least the issues are handled promptly after raised publicly.

paulie_a · on June 15, 2018

There is no "perhaps" about npm, it is absolutely a shit show. It's like the creators went out of their way to build an ecosystem of security nightmares.

jchw · on June 15, 2018

I'm not making claims either way. I don't like NPM either, it just wasn't relevant to what I was saying.

outside1234 · on June 15, 2018

ok, hang on, how is npm different from maven, pypi, etc. ?

oneweekwonder · on June 15, 2018

I will take a stap at comparing npm vs pypi.

While this will be anecdotal I found my `pip freeze` packages a lot more manageable compared to `npm ls --parseable`.

My Full Stack Flask application sits at about 64 requirements. Where last time I used expressjs my dependancies inside of node_modules inside of node_modules border-lined to insanity.

I can see myself hand picking and reviewing my requirements.txt, which I did to some extent.

But I just gave up with npm.

Which is a bit of a personal dilemma for me, because some of my tooling needs npm.

ehnto · on June 15, 2018

So many packages and authors creates an impossibly large surface area to review and secure. It is my impression that most people don't even try so the issue falls on deaf ears. But for a PCI compliant piece of software for example, you would have to had reviewed every module in node_modules. As a stack it makes it a non-starter.

cup-of-tea · on June 15, 2018

I've been doing Python for over five years now. I recently installed a small command line tool using npm. I was completely stunned by the number of dependencies. At first I thought it must be something else, like maybe it's running tests? But no, hundreds of dependencies. If it were written in python it probably wouldn't even have one.

eeZah7Ux · on June 15, 2018

In Python, 64 dependencies is quite a lot. Most services can be build with less that 5 or 10 well-known, trusted libraries.

The numbers and ecosystem maturity make a huge difference in being able to vet dependencies.

inapis · on June 15, 2018

You’re not OP but you’ve move the goal posts from security to number of dependencies. I think GP was focusing on how are other package managers different from npm when it comes to security except maybe apt and pacman?

And node_modules is now a flat tree for the most part.

The number of dependencies a basic JavaScript project pulls is definitely something to be concerned about though.

paulie_a · on June 15, 2018

Because I need 90,000 stupid packages on npm which can be disabled and removed at any moment. Vs python which mostly requires 20-100 if you are really making a complex system.

Also the versioning on npm is incredibly pathetic vs pypi. I would never trust npm for anything serious or waste my time debugging that crap.

overkalix · on June 15, 2018

Not a webdev, but from what I've seen webdevs are forced to compete, not in what they can do, but in terms of how they can do it (frameworks, packages). So, an aspect of your work process becomes a proxy for the quality of your work. In a context of permanent competition, this forces an acritical and rapid adoption of frameworks that perhaps only marginally improve some aspects of some other frameworks. Then compound this with the inflow of bootcamp-trained webdevs with poor practices whose employability relies on having being recently trained on the latest hyped framework.

heavenlyblue · on June 15, 2018

No, JavaScript just doesn't have a standard library. And even if you think it does - then look at what Py std has to offer.

earenndil · on June 15, 2018

It's not a technical issue, it's a cultural one.

paulie_a · on June 15, 2018

It is both

raesene9 · on June 15, 2018

This is essentially a dupe of https://news.ycombinator.com/item?id=17303570

FWIW that headline isn't great. Docker hub pulls in no way correlate to innocent users pulling/using those images. It could be (and this is quite likely) just other malware which made use of those images and just used Docker hub as a repository.

There are official images for the software in question and I don't think it's that likely that that many people ignored the official ones and got these ones.

meuk · on June 15, 2018

“For ordinary users, just pulling a Docker image from Docker Hub is like pulling arbitrary binary data from somewhere, executing it, and hoping for the best without really knowing what’s in it,”

This is basically what you do every time you install something (except when it's via a walled garden like an 'app store'). Besides, I'm not sure I would even classify mining for someone else as 'malicious'. It hogs your CPU a little, but if that's malicious then visual studio should be considered malicious as well.

laumars · on June 15, 2018

Maybe it's not malicious in the strictest sense of the term but it's also not the same as your Visual Studio example. In your case Visual Studio is a productivity tool that brings you value and thus you chose to install it. In the case of this docker image it's not adding you value and was installed without your concent.

The JS example another commenter made is more apt however the argument there is that you still requested the site and it's content (even if you didn't really want it). Whereas many of the installs of this "dockerised" miner were remotely via exposed Docker APIs. That I think is the real crux of the potential "malice" (for want a better description) here.

_wldu · on June 15, 2018

Or when you visit web pages that run JS in your browser on your machine. No one seems to mind that, but they should.

INTPenis · on June 15, 2018

This is exactly why I never liked Ansible Galaxy, and Docker Hub came into the same category.

Screw the extra work, I'd rather write my own roles and Dockerfiles.

geerlingguy · on June 15, 2018

Better yet, fork it.

Most roles and Docker Hub images are pretty simple, and you should be evaluating them anyways before using them. If you’re concerned about the security but want to save the time in building and debugging, fork it, and maintain your fork, only pulling in changes from the upstream when you have time to vet them.

pmlnr · on June 15, 2018

Most of these things are plain text instruction files - yaml for ansible, docker's own thing for docker. It falls under the same category as random bash install scripts: download the text file, read it, use it, if it's safe.

INTPenis · on June 15, 2018

Yes read it, use it but the next step can't be update it because then you'd have to read it again. I just don't have the time to audit someone elses yaml constantly.

outside1234 · on June 15, 2018

where are the images enumerated?

rafaele · on June 15, 2018

https://kromtech.com/blog/security-center/cryptojacking-inva...

search for "Figure 7"

etaioinshrdlu · on June 15, 2018

A startup I'm aware of (not associated with) that aims to help tame this problem a bit: https://anchore.io/

yani · on June 15, 2018

This is not a backdoor. I myself have a miner on Docker hub. The image can be used by anyone with correct envars set. Should my image be removed if used by other users no matter what their intensions are?

sleepychu · on June 15, 2018

You are being facetious.

If your image is called monero-miner and a bunch of people download it, of course it's not going to be considered malicious code.

If your image is called apache-webserver and a bunch of people download it and you've stealth bundled a monero miner, of course it's going to be considered malicious code.

EDIT: even worse than that, the images are actually back doored they open up a reverse shell to allow the remote to execute arbitrary commands.

inapis · on June 15, 2018

This particular case is definitely a backdoor and malicious. The images were pretending to by mysql, mssql, Apache etc.