My notes on Amazon's EC2 Container Service, aka Docker on AWS (jpetazzo.github.io)
51 points by amos6224 on May 17, 2015 | 23 comments



Doesn't running containers in a Xen hypervisor (what AWS are offering) defeat the performance benefit of containers?

Also you still have to care about instances and AMIs, rather than just containers and dockerfiles, which is a second layer of management.


I think most people aren't looking to containers for performance. Containers offer a lot more flexibility.

At my work, we run containers on top of EC2. We have a system where we can note how much RAM and CPU we want a container to have. This gives us a couple advantages over running regular boxes on EC2.

First, the CPU allocations are soft. If there's idle CPU on a box, it can be allocated to containers that want it. Imagine a 4-ECU box with 4 containers on it, each wanting 1-ECU. If they're all using the CPU equally, they each get an ECU. However, if two of them are idle on the CPU, the other two can burst to 2-ECU. If they were separate EC2 instances, they wouldn't get that burst.
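Roughly what that soft CPU allocation looks like with the Docker SDK for Python; this is just a sketch, and the image name and share values are made up:

    # cpu_shares is a relative weight, not a cap: when other containers are
    # idle, a container can burst past its "fair" share of the host's CPUs.
    import docker

    client = docker.from_env()

    # Four workers weighted equally (1024 is Docker's default weight). On a
    # 4-ECU host they each get ~1 ECU under full contention, but any of them
    # can soak up cycles the others leave idle.
    for i in range(4):
        client.containers.run(
            "example/worker:latest",   # hypothetical image
            name="worker-%d" % i,
            cpu_shares=1024,
            detach=True,
        )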

Second, what happens when you want to allocate another 2GB of RAM to a container you've already given 14GB? Running containers means we can run large EC2 boxes and let an algorithm deal with the bin-packing problem of how to stuff things onto the different EC2 instances with minimal waste. Without containers, we need to step up to the next-sized instance, which might have way more RAM than we need, leading to waste.
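The placement algorithm doesn't have to be fancy; a toy first-fit-decreasing sketch in Python (sizes are made up) shows the idea of packing RAM requests onto fixed-size instances:

    # Pack container RAM requests onto instances with minimal waste. A real
    # scheduler also weighs CPU, ports, and affinity, but the idea is the same.
    def place(containers_gb, instance_gb):
        instances = []   # remaining free RAM per instance
        placement = {}   # container index -> instance index
        for idx, need in sorted(enumerate(containers_gb), key=lambda x: -x[1]):
            for i, free in enumerate(instances):
                if free >= need:
                    instances[i] -= need
                    placement[idx] = i
                    break
            else:
                instances.append(instance_gb - need)
                placement[idx] = len(instances) - 1
        return placement, len(instances)

    # e.g. containers of 14, 2, 8, 4, 6 and 2 GB fit on two 30GB instances
    print(place([14, 2, 8, 4, 6, 2], 30))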

Along the same lines, we often want to leave a little headroom in the RAM for different processes. Like, the RAM usage shouldn't go over 10GB, but if it goes up to 11GB for a short time, we'd rather let the garbage collector do its thing than kill the process. Having this headroom for everything adds up, but sharing it among lots of things on a large box reduces the waste, and in the unlikely event that all of them want the headroom at the same time, some get killed and restarted on different boxes.
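Docker has knobs for roughly this hard-cap-plus-headroom idea; a minimal sketch with the Docker SDK for Python (image name and sizes are made up):

    import docker

    client = docker.from_env()

    client.containers.run(
        "example/app:latest",      # hypothetical image
        name="app",
        mem_reservation="10g",     # soft limit, enforced only under memory pressure
        mem_limit="11g",           # hard cap; past this the container is OOM-killed
        detach=True,
    )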

Now, why use EC2 rather than physical hardware from SoftLayer or someone else? EC2 allows us to spin up instances fast. If someone needs ten 15GB containers launched, the EC2 instances can be provisioned in a couple of minutes and the containers launched on them. SoftLayer quotes 20-30 minutes for one of their standard-configuration physical servers, which is pretty good but a long time to wait if you're trying to scale up for a burst of traffic or batch process some data or whatnot. If you're looking to customize anything, SoftLayer takes 2-4 hours.

There are definitely cases where every last bit of performance counts (and costs money). I've heard stories of Google being maniacal about finding anything that is blocking in some of their systems. I also know a lot of people that worry about overhead that don't even know where most of the time is spent in their own code. I mean, if one is running a Rails app making a cross-country network call in the request tying up that process for 100ms, the overhead of Xen really isn't an issue. That's an extreme example, but for most users I think the convenience and flexibility of containers is the selling point, not eliminating a tiny bit of overhead from Xen.


Excellent post. Just wanted to say that the practical alternative to using EC2 isn't physical hardware, but all-inclusive VPS boxes or a managed colo from eg Rackspace.


These are good points. I believe that if you have a few critical apps with a reasonable load and need to scale, Beanstalk is the way to go. Otherwise you're just getting into managing your own PaaS infrastructure for what? For nothing.

If you've got a bunch of small one-off services you deploy all the time and want a bit better density and a nicer deploy story, then sure, do up a small CoreOS or ECS cluster. Or maybe you're very large and your resource requirements are so high that having tons of spare capacity around to run containers in makes sense.


On the first point, what I'm looking for is flexibility and speed of deployment.

On the second point, I guess that they'll end up doing that too. When they launched they made a point of running your containers on your instances as a security thing. Over time I think they'll trust Docker enough to ignore it.


So you still miss out on deploy speed: when extra capacity needs another EC2 instance, you wait for virtual firmware, kernels, and init before you can get a new container started.

Agreed re: 2


The fundamental reason for that is the lack of isolation needed to run containers safely in a multi-tenant environment.


It is still multiple containers per VM, not one per VM, so it is better than nothing. On the second point, Lambda seems to be their closest offering to not having to specify machines; it appears to use containers under the hood.


This is quite an old post relative to the age of ECS, closer in time to when it was announced than now.


Yes, this is based on the preview release of ECS. Some extra features were added in the GA release last month. There is now ECS support in the AWS web console. Also, ECS services can automatically register containers with an Elastic Load Balancer.
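For anyone curious what that ELB registration looks like, a rough sketch with boto3 (cluster, service, and load balancer names are made up):

    import boto3

    ecs = boto3.client("ecs")
    ecs.create_service(
        cluster="demo-cluster",
        serviceName="web",
        taskDefinition="web-task:1",
        desiredCount=4,
        loadBalancers=[{
            "loadBalancerName": "web-elb",   # classic ELB
            "containerName": "web",
            "containerPort": 80,
        }],
        role="ecsServiceRole",   # IAM role that lets ECS register/deregister instances
    )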

We launched http://force12.io on Friday which is a demo of container autoscaling running on ECS. I'll be doing a blog post this week on how we're using ECS.


I'll try and keep an eye out for the blog post - I'm working on building a CD pipeline with Jenkins, Docker, and Node that deploys to Beanstalk, and I'm sure we're doing things inefficiently.


Thanks, I'll be posting it to HN!

You've probably seen it, but for Elastic Beanstalk there is now a multi-container version which uses ECS under the hood. Previously there was a limitation of one container per VM.

http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create...


Also odd that, if you read until the end, the author professes to have not actually tried using ECS yet.

It's a good collection of the author's impressions of what ECS looks to be from the documentation, but it's a little out of date.


ECS still feels like a preview, but it's coming along! We might blog about our experiences there in the near future. Overall it has a good feel to it, other than some seemingly odd configuration deviations which turned out to be abstractions made for "future flexibility".


I'm using ECS to run a small cluster of 25-odd microservice containers of three different types, spread out over a number of instances.

I'd second your opinion that it still feels like a preview. There are some rough edges that we've encountered, such as a mistyped tag name causing ECS to hang forever in the pending state because it didn't properly catch the error condition of being unable to pull the nonexistent image tag from our Docker Hub account.

One thing that I think ECS really highlights is the need for an ELB service specifically for docker containers. I'd love to be able to run multiple web containers on the same instance and load balance incoming requests to multiple containers on the same machine. Right now because of the limitations of ELB listeners we are instead running one beefy web container per instance with lots of CPU and memory allocation, and one node.js process per core running inside the container. Then ELB load balances requests across our instances, but each instance is still running only one web container.
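The underlying problem is that a classic ELB registers whole instances against a fixed listener port, so there's no way to point it at several container ports on one host. Roughly, with boto3 (names made up):

    import boto3

    elb = boto3.client("elb")
    elb.register_instances_with_load_balancer(
        LoadBalancerName="web-elb",
        # Only instance IDs can be registered -- the listener's instance port
        # is fixed for the whole ELB, so one web container per host ends up
        # owning that port.
        Instances=[{"InstanceId": "i-0123456789abcdef0"}],
    )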

ECS still brings benefit for deploying and updating web containers, but the true power lies in its management of worker and other background service containers, and the ability to shuffle them around and disperse them wherever there are free resources with very little administration required.


Cloud Foundry does what you want -- push an app, see it run, get load balancing and log aggregation out of the box[0]. And healthchecks, and buildpacks, and so on and so forth.

CF runs on AWS.

You can also play with Lattice, which is a cut down version intended to support fast development cycles[1].

Disclaimer: I used to work on Cloud Foundry.

[0] http://cloudfoundry.org/

[1] http://lattice.cf/


It would be great to hear more about I/O performance on ECS. A while back I ran I/O tests on another Docker-based container service running on SoftLayer (Bluemix), and I could get near-native performance because that environment had Docker containers running on bare-metal hardware. More here: http://www.cloudswithcarl.com/?p=63


> for now, you can only start containers from public images hosted on the Docker Hub, but that's expected to change when the service goes out of preview

This is incorrect. You can run a private Docker registry and point ECS at images hosted there, although I do wish Amazon provided a hosted registry service rather than making everyone run their own private registry.
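Pointing ECS at a private registry is just a matter of the image reference carrying the registry host (the instances' Docker daemons still need credentials for it). A rough boto3 sketch with made-up names:

    import boto3

    ecs = boto3.client("ecs")
    ecs.register_task_definition(
        family="web-task",
        containerDefinitions=[{
            "name": "web",
            # hypothetical private registry instead of a Docker Hub image
            "image": "registry.example.com:5000/myorg/web:1.2.3",
            "cpu": 256,
            "memory": 512,
            "portMappings": [{"containerPort": 80, "hostPort": 80}],
        }],
    )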


What's the difference between ECS and Elastic Beanstalk (EB)? EB already supports deploying single and multiple docker images per EC2 instance.


There are a number of differences, but in particular the way deployments work is a good example of the difference between EB and ECS.

Due to the way EB works, the only way to get zero-downtime deploys in EB is by having two environments and swapping their CNAMEs:

http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-...

On the other hand, ECS takes a much more intelligent approach to deployment. Basically, when a service is updated it creates a new container to replace one of the old ones, waits for the old one to drain, then switches it out with the new one, and so on, until all the containers are replaced with the updated version.

http://docs.aws.amazon.com/AmazonECS/latest/developerguide/u...

Basically, if you are doing CI and want to do many zero-downtime deploys per day, ECS handles it much more gracefully. Additionally, if your service is really large and requires a lot of instance resources, EB requires you to run two copies of your environment in parallel, at least during deploys, which can be quite expensive. ECS just swaps out containers on the running hardware, so you only need a few extra instances for wiggle room when deploying; you don't have to launch another parallel copy of your stack.
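A minimal sketch of that rolling update with boto3 (family, image, and service names are made up): register a new task definition revision and point the existing service at it, and ECS replaces containers in place on the hardware that's already running.

    import boto3

    ecs = boto3.client("ecs")

    # New revision of the task definition with the updated image.
    new_rev = ecs.register_task_definition(
        family="web-task",
        containerDefinitions=[{
            "name": "web",
            "image": "registry.example.com:5000/myorg/web:1.2.4",
            "cpu": 256,
            "memory": 512,
        }],
    )["taskDefinition"]["taskDefinitionArn"]

    # ECS drains and replaces the old containers one by one.
    ecs.update_service(
        cluster="demo-cluster",
        service="web",
        taskDefinition=new_rev,
    )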


I don't believe this is accurate. When making radical changes to your application it might be required to run two copies in EB and do a CNAME swap.

But for day-to-day deploys of new versions, EB can use rolling batches within one application, as you say ECS does, if you configure it to.

Specifically, you can have EB update only a fixed number or percentage of instances in your application with new versions at a time. It does this by removing, in batches, only some of the instances from the load balancers, waiting for existing connections to complete, updating those instances, and then re-adding them to the load-balancer.

See http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-...

(I agree that the existence of this page along with "Deploying Versions with Zero Downtime" is confusing, but rolling deploys are specifically mentioned as an alternative to CNAME swap: "Batched application version deployments are also an alternative to application version deployments that involve a CNAME swap")
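Concretely, turning on batched deployments for an existing environment looks roughly like this with boto3 (environment name and batch size are made up):

    import boto3

    eb = boto3.client("elasticbeanstalk")
    eb.update_environment(
        EnvironmentName="my-env",
        OptionSettings=[
            # Update 25% of instances at a time instead of doing a CNAME swap.
            {"Namespace": "aws:elasticbeanstalk:command",
             "OptionName": "BatchSizeType", "Value": "Percentage"},
            {"Namespace": "aws:elasticbeanstalk:command",
             "OptionName": "BatchSize", "Value": "25"},
        ],
    )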

Also, regarding terminology, note that under the hood EB, when used for multi-container Docker applications, is just an abstraction layer over ECS. It sets up ECS tasks for you.


Elastic Beanstalk is a wrapper for multiple AWS services. The multi-container Docker version uses ECS for containers. Using ECS directly can be more work, but it also gives more control.

EB works the same way for VMs: it will integrate with EC2 for you, or you can use EC2 directly instead of EB.


Anyone know if the IAM roles and security groups are applied at the individual container level or at the instance level?



