RancherOS: An OS for Docker Containers (rancher.com)
197 points by WadeWilliams on Feb 25, 2015 | 64 comments



Docker itself as PID 1 is… creative.

Good things they've thought through:

- boots fast, does almost the absolute minimum it needs to get up and running

- supporting user data and at least minimal config via cloud-init

- properly minimal: you have to use a Debian image to set up persistent storage with mkfs.ext4! (edit: when using the ISO version, which is not the primary use case; see the sketch after this list)

- but helpfully familiar: you can install distro flavoured "console" experiences with more than just busybox

- it's almost a better Boot2Docker than Boot2Docker! (a little bit of love for VirtualBox / VMware shared storage wouldn't go astray)
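
For the mkfs.ext4 point above, the workaround is roughly this kind of one-liner (the device path and the debian image are placeholders, not anything RancherOS documents):

    # format a data disk from a throwaway Debian container, since the base
    # system ships only busybox (device path is an assumption)
    docker run --rm --privileged -v /dev:/dev debian mkfs.ext4 /dev/sda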


I'm not sure… running a 7MB binary with tons of functionality as PID 1 sounds like a really bad idea. Possibly even worse than the usual complaints against systemd (which at least has process isolation for its various features).


Who has a 7MB binary with tons of functionality as PID 1? That doesn't describe RancherOS or systemd.

(Edit: Oh, you mean Docker itself as PID 1 on RancherOS. Yes, that's true. Certainly hairier than systemd. I haven't really dug into how much they limit what system-docker can do.)


Interesting data point:

- /usr/bin/systemd-docker = 8.4M

- /usr/bin/docker = 14.3M


From a security perspective, not really; the alternative is running Docker as root. Perhaps it is from a system stability perspective, though?


I'm leery of entrusting so much to Docker... I've had something as innocuous as pulling a remote image crash the daemon (due to a misconfigured private registry) which brought all my containers down. Imagine if this happened to all your critical system services!


That's why they have system-docker as PID 1, running essential services in containers.

One of those services is docker, for running Random Shit You Downloaded Off The Internet. :-)


"Systemd and Docker don’t work well together as they both attempt to manage control groups."

Has anybody hit actual issues with this? Having used Docker and systemd concurrently for a while, I can't say this has ever caused conflicts, any more than the fact that both my guests and I manage drinks in my fridge.


In my experience it works, but is a mess. You end up with really crazy unit files that aren't actually managing the docker daemon, they're managing the docker client. CoreOS created Rocket in part because the match between docker and systemd is so bad. Rancher seems to be going in the other direction by making Docker support fully native. I do wonder how much this really buys you over a higher level framework like Mesos or Kubernetes though.

You can see Darren's discussion of these issues here: https://groups.google.com/forum/#!msg/coreos-dev/wf7G6rA7Bf4...


I wouldn't say the unit files end up "really crazy", but I do agree it is a bit of a mismatch and we lose out on a bit of the advantages of systemd.

Re: the post you refer to, I really don't like his proposed solution of "docker attach | docker start | docker run". I usually do this:

    # the leading '-' tells systemd to ignore failures (e.g. no old container)
    ExecStartPre=-/usr/bin/docker kill %p
    ExecStartPre=-/usr/bin/docker rm %p
    ExecStart=/usr/bin/docker run --rm --name %p ...
    ExecStop=/usr/bin/docker stop %p
Of course that would not apply to any containers used to export volumes, but I tend to mount volumes from the host anyway, and I explicitly want a clean slate when I restart containers.
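
A hypothetical way to wire that up end to end, assuming the fragment above is dropped into the [Service] section of a unit at /etc/systemd/system/myapp.service with the usual [Unit] and [Install] boilerplate around it (the unit name is a placeholder):

    # reload unit definitions, start the service, and follow its logs;
    # because ExecStart runs the docker client in the foreground, the
    # container's stdout/stderr ends up in the journal
    systemctl daemon-reload
    systemctl enable myapp.service
    systemctl start myapp.service
    journalctl -u myapp.service -f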


Simple examples look fine, but try a setup where you have dozens of environment variables being set, data volume containers that need to match the lifecycle of the main container, pulling images that don't exist, ambassador containers for cross-host linking, and other sidekicks for things like load-balancer registration and health checking. If you look at how Deis does it, they've actually created tons of little binaries to help with different tasks (https://github.com/deis/deis/issues/2254) because the unit files get too crazy by themselves. You end up trying to build a system like Kubernetes out of shell scripts and unit files.

The key thing to remember with CoreOS is that it puts systemd at the center, not docker, and it only interacts with docker through the client. In fact, there's really zero docker integration besides shipping the binary.

There's also the underlying technical issue of what runs as PID 1. In your example, you aren't actually monitoring the container; you're monitoring the status of the docker client, which may not match what the docker daemon reports. For containers, the docker daemon owns the fork rather than systemd. In a system like Kubernetes you have a reconciliation loop that monitors the daemon. You could probably build a similar idea into your unit files to make them more robust, but that's even more complicated.
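
To make that concrete, a rough sketch of such a loop (container and unit names are placeholders) would poll the daemon rather than trusting the client process:

    # ask the daemon - not the client - whether the container is actually
    # running, and bounce the unit if it isn't
    while true; do
        state=$(docker inspect --format '{{.State.Running}}' mycontainer 2>/dev/null)
        if [ "$state" != "true" ]; then
            systemctl restart mycontainer.service
        fi
        sleep 10
    done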

Having gone down the systemd route, it seems like a dead end to me for distributed docker applications. If you believe in a containerized future, then choose an orchestration framework that treats docker as a first-order primitive. You can still bootstrap that framework using CoreOS/systemd, but after that I'd much prefer to just interact with a higher-level system.


I don't build any of that into the unit files because I need additional health checking anyway. That the processes are running does not really tell me much that's useful - a huge proportion of the system failures I deal with are ones where the processes are still running but something else has gone wrong.

For the same reason it doesn't really matter to me if Docker is treated as a first order primitive.


What's your setup? CoreOS host?


I've been using Docker since forever and I think I grasp the architecture here.

But, my question is, what does this bring to the table that isn't possible with Docker + Machine + Swarm?

That's a super important question to ask, because adding another layer to deployments is not something people are wont to do to a system (Docker) that's supposed to put the simplicity back into deployments. Also, since this isn't under Docker Inc.'s umbrella, people would be right to be cautious about depending on it lest it die, whether that's fair or not.

I don't use CoreOS, but I get it: it offers orchestration bells and whistles that solve some people's problems. I don't use PaaSes like Flynn either, but I understand them: they're solving problems at a completely different layer, and their ties to Docker are incidental.

But with this, the primary touted advantage seems to be that it's a slim OS for hosting Docker. For me (and I imagine others) that is a solved problem with Machine and/or boot2docker. On top of not imposing any new overarching architecture to learn, those tools are already widely deployed, supported, and trusted, and have the huge unfair advantage of being blessed by Docker core.

And if you really want to run Docker in Docker, that's been supported for a very long time, and you get that for free without installing anything.
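
Both of the usual tricks have been around for ages and need nothing special from the OS (image names below are placeholders):

    # 1) "sibling" containers: hand the host daemon's socket to a container
    #    that ships a docker client of its own
    docker run -v /var/run/docker.sock:/var/run/docker.sock some/docker-cli docker ps

    # 2) a fully nested daemon inside a privileged container
    #    (jpetazzo's well-known dind image)
    docker run --privileged -d jpetazzo/dind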

So I'm at a loss to think of a case where I'd advise someone to reach for this. Is there something I'm missing?


I've only played with RancherOS for a very short amount of time, so keep that in mind :) I don't think we quite know exactly what it will turn out to be or exactly where it will be most effective. Without going into the technical details (because I'm not the person for that - I'll leave that to Rancher), RancherOS isn't just a Docker image. It runs Docker at its core; it's not just an OS with Docker in it. This gives it the potential to have an update mechanism similar to CoreOS's.

You are correct in that it is very similar to boot2docker. boot2docker is also a very light distro with docker (and by "very light" I mean awesomely slim -- steeve is amazing :). The main difference, as I see it, is in updating / extending / packages. With TCL, you need to either find the package and include it in your build or create the package with a build chain etc., which isn't trivial. Going the RancherOS route, you could simply pull a new docker image.

The interesting thing to me is that it offers choice. If you want to run systemd or etcd, you can without changing an entire system (e.g. today, if I want Fedora with systemd, I have to configure the Docker daemon opts differently than with, say, Ubuntu).


Curious if anyone has come up with a way to run X11 itself in a container? (there are numerous articles on running GUI apps in containers that simply need a way to talk to Xauthority or X socket, or using VNC or xpra). I'd imagine it would need to run privileged with access to device files, or something like that.

I've kicked around the idea of a "workstation" set of containers to run on top of CoreOS, but this is the biggest hump I've run into.


Many people have already run X in a container with no trouble. They often just VNC it out.

The hard part is not running X, but running X and having it display on your hardware. This is a massive distinction. You need to pass through your graphics device so that X can write to its framebuffer and all that pizzazz, and that's the question you really should be asking: "How do I pass my graphics device / display device into a container?"


Right, this is what I actually meant


Jess at Docker has some examples of setting this up and running various applications using VNC: https://blog.jessfraz.com/posts/docker-containers-on-the-des.... Scroll down a bit toward the middle.


I'm starting to wrap some of my pyweek games in Docker because getting all of the pyglet/opengl/avbin dependencies to work correctly is a royal pain in the rear end.

Check out: https://github.com/pdevine/yoyobrawlah

You can use the run_docker.sh script to get things going. The image is also available on Docker Hub, but you need to pass in things like the dri device to get it working correctly.
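
For anyone curious, the invocation for that kind of setup generally looks something like this (the image name and device node are placeholders, not the actual contents of run_docker.sh):

    # share the host's X socket and a DRI device node with the game container
    docker run -it \
        -e DISPLAY=$DISPLAY \
        -v /tmp/.X11-unix:/tmp/.X11-unix \
        --device /dev/dri/card0 \
        some/pyglet-game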



I don't imagine anyone's bothered trying too seriously, because it wouldn't solve any interesting (read: security) problems. Relatively simple software distribution of X is a solved problem. ;-)



Quote: "At 20MB, RancherOS is two orders of magnitude smaller than a typical Linux distribution, and an order of magnitude smaller than even other minimalist Linux distributions."

This claim is incorrect. Linux kernels can actually be as small as around 2 MB. The rest of a working Linux system (GNU userland, GUI, etc.) does not belong to the kernel.

http://superuser.com/questions/370586/how-can-a-linux-kernel...

A Debian Linux distribution (kernel + userland) can run in just 32 MB of RAM. For instance:

http://stackoverflow.com/questions/1522146/minimum-configura...

Quote: "I've used a TS-7200 for about five years to run a web server and mail server, using Debian GNU Linux. It is 200 MHz and has 32 MB of RAM, and is quite adequate for these tasks. It has serial port built in. It's based on a ARM920T."


Let's interpret the claim precisely before calling it incorrect.

What it asserts is that there exist some minimalist Linux distros around a few hundred megabytes in size, and this can be shown true merely by example.

The claim notably does not use "all" or "every", so the existence of something smaller does not create a contradiction.


Our X "workstations" running Slackware back in the day used to have 16MB RAM. I don't remember how much disk space they had, but probably 300MB-400MB - I do remember 500MB drives were not that common yet.

I've run Linux in embedded systems with 4MB flash and 4MB RAM too.

It's harder to keep Linux slim these days without sacrificing all kinds of things we've gotten accustomed to, but yeah, it's pretty much down to how minimalist you are willing to go rather than how small it is possible to get.


Well, boot2docker is 24MB and is based on Tiny Core Linux, which is 9MB.

So yeah, not true at all.


How do you upgrade system docker when it is PID 1? Does the machine have to be rebooted?


Regarding the "Ideal for Production" tagline - how would I go about running this in production? Specifically on Linode - would I have to run a "regular" Linux and then use KVM to run the RancherOS ISO?


I don't think you can install and run arbitrary stuff on Linode, so you'd have to convince Linode to provide an OS image for RancherOS.

(Or goof around copying stuff onto a disk image.)


You can certainly run arbitrary stuff on a Linode; I'm currently running a custom-built read-only Arch image.

You do have to put it there yourself, but if RancherOS is billing itself as small and easy to update, that seems like it shouldn't be hard.


Can't be surprised that Darren is the creator. I remember seeing him at a talk this summer, and I imagined he would be the kinda guy to put this type of project together.


How does this differ from CoreOS?


Besides the guts of how the OS platform itself is built and runs, the major feature difference is that it doesn't have all the cluster co-ordination bits like etcd, fleet, and so on.


You can easily run etcd in containers. In fact, at work our etcd deployment runs entirely on the official coreos/etcd docker images. Fleet is heavily tied to systemd, so that's a different matter.
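
A single-node etcd in a container is about this much (image path and flags are from memory, so double-check against the etcd docs):

    docker run -d --name etcd -p 2379:2379 quay.io/coreos/etcd \
        -listen-client-urls http://0.0.0.0:2379 \
        -advertise-client-urls http://127.0.0.1:2379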


There's CoreOS the OS and there's CoreOS the project. The OS is just a self-updating barebones Linux distro built for hosting Docker containers.


Which is also very different to RancherOS, but I excluded how the OS is built and run from my previous answer.


Most obvious difference is swapping out systemd for Docker (which then runs another Docker).


Anyone else find the Docker logo extremely adorable? I know nothing about it but I smile every time I see that happy little barge-whale.


Now check out the narwhale: http://socketplane.io/


The Docker logo is cute because it isn't trying to be cute.


It's not quite an OS for Docker containers yet. It's a distro with special middleware. The next question is how much can be thrown out of the Linux kernel. Or can it be replaced with something more secure? What would it take to run this in Xen without Linux?


I think running linux containers without linux would be difficult! You probably want Mirage OS or OSv.

http://www.openmirage.org/

http://osv.io/


The Joyent guys are running docker on SmartOS via so-called "branded zones", which emulate the Linux kernel ABI. This works surprisingly well, they say. See: https://www.joyent.com/developers/videos/docker-and-the-futu...

The cool thing is that they open-sourced SDC in November 2014.


Docker is a user interface and policy wrapper around core Linux features. It's not, in itself, a virtualisation or container system. So... not likely!

(But you can use Docker as a user interface / API to non-Linux container implementations -- see all the stuff the Joyent folk are doing.)


I am curious about one thing: how does System Docker as PID 1 deal with reaping zombie processes?


From the discussion last night at the Docker meetup, the System Docker is responsible for reaping child processes. The comment from the presenter was basically "we sorta expect this (system docker) to be stable and not crash". That said, he also mentioned that it's early work and they're actively thinking about solutions to this problem and the ones related to security.


Interesting... I played around with CoreOS a little bit, but wasn't quite confident with it... currently I'm running Docker containers under Ubuntu Server.

I really like the idea of these micro OSes, and both CoreOS and RancherOS have some interesting aspects to them.



I only played with docker briefly on my laptop, and never in a production system. Do you think there will be any performance hits in a system like this?


Is SELinux or GRSecurity built in and enabled?


Highly unlikely, given this is a product coming out of the Docker ecosystem.


We run SELinux on all our docker hosts and are about to roll out GRSecurity as well. At present this is one of the best lines of defence you have against running code so close to root.
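
Verifying that it's actually in effect on a host only takes a minute (the inspect field is how I'd check a container's label; treat it as a pointer rather than gospel):

    getenforce      # should print "Enforcing"
    sestatus        # fuller report: loaded policy, current mode
    # the SELinux label a given container's processes were started with
    # (container name is a placeholder)
    docker inspect --format '{{ .ProcessLabel }}' mycontainer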


A hypervisor is the best defense you can have against running code so close to root


It's not quite that simple: an insecure or vulnerable hypervisor can actually make it easier to exploit a system. (Note: I'm not suggesting that running Docker as PID 1 or similar is a good idea.)


I think it's fair to say that it's easier to secure a hypervisor than it is to secure a Docker daemon. Lord knows we've had a lot more experience securing hypervisors.


That name is a dig at the "cattle vs. pets" view of servers, I take it?


I've used CoreOS, and configuring it into clusters was overcomplicated. So much so that I switched back to Ubuntu. Does anyone know how RancherOS compares?


For clustering and management, you might try running RancherOS with Rancher, the company's orchestration tool.


This is a CoreOS alternative?


Does it support NFS?


Docker-ception


I appreciate what you did there. :)


I can't help but feel that this is a clusterfuck in disguise.



