This image uses the CoreOS Alpha channel, which is not supposed to be used for production[1]. It "closely tracks current development work and is released frequently" so I would be using it with the knowledge that things might break. In other words, CoreOS on DigitalOcean should only be used for trying out CoreOS and not for running production apps (for now). But if I were going to do that, there is already a Vagrant setup[2] that is super easy to use. Hopefully DigitalOcean will provide a CoreOS Stable image soon.
On the subject of DigitalOcean images: over the last month or so there was a severe Docker bug[3] that made Linux kernel 3.15 unusable for Docker hosts. Linode let me easily select a 3.14 kernel for my host OS to work around the bug, but DigitalOcean doesn't offer that level of granularity. So DigitalOcean either needs to provide more fine-grained configuration of images or provide a CoreOS Stable image before I would think of using it for production Docker containers.
Finally, CoreOS is still an enormous pain[4] to install on Linode, so I hope this gives Linode a strong nudge to make it easier to install there.
The Alpha channel contains some work to interoperate properly with the DigitalOcean metadata service. CoreOS promotes images roughly every two weeks, so in about that time the first Beta will become available on DigitalOcean, and about two weeks after that the Stable channel will be available as well.
For those experimenting, you can also switch to the stable release channel[1] (or beta, if you prefer). CoreOS will not downgrade, so you will stay on the current alpha version until it is promoted to the channel you subscribe to. That way, when the changes do get promoted, you will be tracking your chosen channel without any additional work.
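If you go that route, switching channels on a running machine is just a config change. A minimal sketch, assuming the standard CoreOS update mechanism (verify the path and GROUP values against the docs):

    # Point update-engine at the stable channel instead of alpha.
    echo "GROUP=stable" | sudo tee /etc/coreos/update.conf
    # Restart update-engine so it picks up the new channel.
    sudo systemctl restart update-engine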
The article How To Set Up a CoreOS Cluster on DigitalOcean[1] (written by a DigitalOcean employee) fails to mention what seems to me to be a serious security-related concern.
Since droplets with private networking enabled share a private network with other customers' droplets, if "$private_ipv4" is specified for "addr" and "peer-addr" in cloud-config, isn't it critical that etcd be secured with TLS and client-certificate authentication?
See: CoreOS – Etcd: Reading and Writing over HTTPS[2]
I realize that delving into that aspect of coreos/etcd configuration is beyond the scope of an introductory "how to" article, but I believe this concern deserves at least a strong mention.
I made a comment[3] to this effect on DigitalOcean's website.
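For concreteness, here is a hedged sketch of what that might look like. coreos-cloudinit translates the keys under etcd: into ETCD_* options; the certificate paths here are assumptions (you would need to provision those files yourself, e.g. via write_files), and the flag names should be checked against the etcd docs:

    #cloud-config
    coreos:
      etcd:
        addr: $private_ipv4:4001
        peer-addr: $private_ipv4:7001
        # TLS for client connections; setting ca-file should also require
        # clients to present certificates signed by that CA.
        cert-file: /etc/ssl/etcd/server.crt
        key-file: /etc/ssl/etcd/server.key
        ca-file: /etc/ssl/etcd/ca.crt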
The article you linked to provides security for etcd.
However, what is the standard approach for securing Docker container-to-container communication across hosts? For example, from an app server to a DB server.
Is IPsec set up within the CoreOS network layer, or is the security provided by Docker? If so, what are the options?
I don't think CoreOS does anything special in this regard.
It should be possible via cloud-config to change the runtime config of the docker service[1], in which case one could set "--icc=false"[2] to enforce stricter rules about inter-container communication on a particular docker host (e.g. a coreos droplet).
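Roughly, that could be done by overriding the docker.service unit from cloud-config. This is only a sketch: the unit body below is an assumption, so copy the real ExecStart from the stock unit on your CoreOS version and just append the flag:

    #cloud-config
    coreos:
      units:
        - name: docker.service
          command: restart
          content: |
            [Unit]
            Description=Docker Application Container Engine
            After=docker.socket early-docker.target network.target
            Requires=docker.socket

            [Service]
            # Stock invocation plus --icc=false (verify against your image).
            ExecStart=/usr/bin/docker -d -H fd:// --icc=false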
Okay, I see you were asking about regulating network communication between containers on separate Docker hosts, i.e. CoreOS instances.
That's a good question! I still don't think CoreOS addresses that concern in any special way at the level of iptables and routing (but I could be wrong). What it does give you is the ability to control service affinity with respect to your fleet "units". That way, you can be certain that docker containers which need to be "linked" in order to communicate properly (e.g. you have set "--icc=false") will run on the same host.
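A sketch of a fleet unit that does this, with made-up names (web.service pinned to the machine running db.service); note the co-location directive has been spelled both X-ConditionMachineOf and MachineOf across fleet versions, so check the docs for yours:

    [Unit]
    Description=Web app, co-located with its database
    After=docker.service
    Requires=docker.service

    [Service]
    # --link keeps working between these two containers even with --icc=false.
    ExecStart=/usr/bin/docker run --rm --name web --link db:db example/web

    [X-Fleet]
    X-ConditionMachineOf=db.service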
This was useful, as I couldn't figure out how to add the cloud-config files I use for my Vagrant-based CoreOS cluster.
Here's another question:
If I have a droplet up and running already, anyone know how I might change it from Ubuntu to the new CoreOS image? I'd rather change it than create a new one to maintain the same public IP, or else I have to have my DNS records updated, which takes time, and is outside my direct control.
CoreOS is not a drop-in replacement for Ubuntu; migrating to it requires re-deploying your services inside Docker containers. I would think you might want to run your service implementations in parallel and then cut over later.
Can you not move IPs between DigitalOcean droplets?
Similarly, I created a new CoreOS droplet just to play around and neglected to add the cloud-config YAML configuration. I couldn't find a setting to add it to the droplet afterwards.
Semantically, cloud-config is really not something you can add afterwards. cloud-config comes from Ubuntu, but its semantics are a virtual clone of AWS's cfn-init; the point of both is to inject your config during initial instance bring-up, when the instance provisioner first generates the system config files. You can't really run them again once you've already brought up the instance, since they'd just messily trample over the configs they previously spewed out without removing the old ones first.
With cloud-config, you're basically expected to be using ephemeral (or nearly-so) instances, particularly within the context of an autoscaling group or equivalent. The lifecycle is supposed to go "[scale up], provision, configure, start services; [crash or scale down], terminate, repeat." CoreOS adds to this soft reboots for upgrades, but definitely still assumes cattle, not snowflake, instances.
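For reference, a minimal cloud-config of the sort you would paste in at droplet creation; everything in it runs once, at first boot (the discovery token and SSH key are placeholders):

    #cloud-config
    ssh_authorized_keys:
      - ssh-rsa AAAA... user@example
    coreos:
      etcd:
        # Generate your own token at https://discovery.etcd.io/new
        discovery: https://discovery.etcd.io/<token>
        addr: $private_ipv4:4001
        peer-addr: $private_ipv4:7001
      units:
        - name: etcd.service
          command: start
        - name: fleet.service
          command: start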
This is actually really big news for anyone running or interested in running a Docker-based PaaS system such as Deis or Flynn. DigitalOcean's cheap instances are a great match for Docker containers.
As of Deis 0.8.0 it only runs on CoreOS, and I believe most other DIY PaaS systems are moving the same way.
IMO Docker + etcd is a far more sane configuration than endless Ruby Chef scripts, or worse, Amazon OpsWorks.
It's not about its shortcomings; it's a database. You're saying it replaces Chef and OpsWorks, but you're missing a few pieces to complete that picture.
E.g.: how do I update the load balancer as I add capacity to my web app? How do I set up a new DB instance and open up the firewall rules to an app on a different host? Etc.
Docker and etcd are both nice pieces, but they don't make a complete picture. Nor does fleet, IMO.
Don't forget Ansible; out of the box, it can perform pretty much any of the glue operations necessary to manage a set of containers and servers.
It's still pretty early, so much of the tooling will need more improvement before container-based infrastructure gets more mainstream (e.g. enterprise, smaller teams) uptake.
While I agree that etcd and config management systems do not solve the same problems, and especially not in the same way, the use of etcd by all of your services does enable your examples.
For instance, a tutorial on the CoreOS site shows how to use vulcand as your HTTP load balancer, driven by configuration in etcd: it walks through cleanly deploying a new version of an app as new container versions, rotating them into the LB, and rotating the old ones out.
You can also hook up a script that controls firewall rules based on etcd; its changes would be reflected almost immediately, rather than over the splayed 30-60 minute window in which changes seem to apply at random with config runs of tools like Chef and Puppet.
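As a hedged sketch of that idea (the key name, port, and rule position are all made up, and etcdctl's watch syntax varies by version):

    #!/bin/sh
    # React to etcd changes immediately instead of waiting for a config run.
    while true; do
      # Block until the key changes, then read its new value.
      etcdctl watch /services/db/allowed_source
      ALLOWED=$(etcdctl get /services/db/allowed_source)
      # Replace rule 1 of the INPUT chain with the new allowed source.
      iptables -R INPUT 1 -s "$ALLOWED" -p tcp --dport 5432 -j ACCEPT
    done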
I've been trying to figure out where tools like Chef, Fabric, and Docker/CoreOS/etcd/fleet will play together, what the boundaries will be, etc. In situations where I'm using Docker and CoreOS, I don't expect to use Chef, but I'm not sure about things like database servers, for which we typically dedicate and tune hosts.
It would be nice to see something like Chef running on CoreOS for, say, user management, so that people who want to talk to, say, fleet aren't all required to SSH in as 'core'.
etcd is better compared to ZooKeeper (SOA orchestration, service discovery, etc.), and after researching it for the last two weeks for a production environment, etcd is nowhere near as stable (i.e. non-beta/alpha) as ZooKeeper or Netflix's Eureka.
Configuration management (Puppet, Chef, Salt, Ansible) is a completely different beast than service discovery, health management, etc.
Agreed, etcd is like ZooKeeper: they're both databases that you write applications against. Neither does service discovery or orchestration out of the box; the applications you write on top of them do.
Right, and you can achieve service discovery, orchestration, and convergence driven by something like zookeeper or etcd, rather than by bulky config runs.
While I enjoy working with Chef, and have had some pretty reasonable times with Puppet, these full-run tools do have issues: sometimes you introduce a narrow bug in your user management or some other code, and all of a sudden you can't update some random conf file that happens to come after it in the run. I've also seen tools like Capistrano and Fabric bastardized to allow this sort of precise updating, but they lose the cohesion of typical config-managed systems.
With something like etcd, certain problems are solved by not relying on static configuration. That is the paradigm shift, and it is similar to Hadoop: when I've built Hadoop systems with Puppet, we simply wrote out the same configs and files on every machine, then chose which services to start, and via ZooKeeper things like primary/secondary failover take care of themselves.
Another issue is that your configuration is wide open to every node in the cluster (e.g. your web app's etcd client can read the key that stores the master database user password). Hope you never get hacked!
I'm really out of touch with Docker and CoreOS, so please forgive my ignorance if this is a ridiculous Q: could this combo be used for spinning up machines with specific apps for thin clients? Or is it more about scaling one app?
Essentially it lets you run 20 super-lightweight VMs on a single machine... Each one of those could be running different apps, and Docker makes it really easy to build a custom machine image from a single script (called a Dockerfile), with any standard Linux OS as a base.
So yes, powering 20 thin clients running different apps from a single server is a perfect use case.
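For anyone who hasn't seen one, a Dockerfile really is that small. An illustrative example (the base image and app contents are made up):

    # Build a self-contained nginx image from a stock Ubuntu base.
    FROM ubuntu:14.04
    RUN apt-get update && apt-get install -y nginx
    COPY site/ /usr/share/nginx/html/
    EXPOSE 80
    # Run in the foreground so the container stays alive.
    CMD ["nginx", "-g", "daemon off;"]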
What about isolation between apps? I have app A connected to database A, and app B connected to database B. Is there a way to deny connections from app A to database B etc?
What I want is to make groups of containers that can talk only to each other (only to containers within one group). Does CoreOS provide something like that? Maybe Kubernetes? What are the possible options?
Docker has a sophisticated system for controlling what ports are open on each container and which other containers it can "see" when it uses these ports.
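On a single host, the combination of --icc=false on the daemon and explicit links gives you exactly the A/B isolation described above. A sketch with made-up image and container names:

    # With the daemon running with --icc=false, containers can only reach
    # each other through explicit links.
    docker run -d --name db-a example/postgres
    docker run -d --name app-a --link db-a:db example/app   # can reach db-a
    docker run -d --name db-b example/postgres
    docker run -d --name app-b --link db-b:db example/app   # can reach db-b, not db-a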
> IMO Docker + etcd is a far more sane configuration than endless Ruby Chef scripts, or worse, Amazon OpsWorks.
For what use case? Doesn't using Docker for everything take you into 'golden master image' dead-ends, as outlined by the Opscode guys in comments to a Docker blog post last year [1]?
What Lamont and Joshua had to say in that thread resonated, but I haven't really looked into DevOps approaches with Docker.
I'm also not sure what you mean by 'endless Ruby Chef scripts'?
I'm still very impressed with CoreOS, but I find it sad that they were forced to move away from their initially touted "only systemd+etcd". (No blame on them; that was expected, as the Docker ecosystem is still young and very fast-moving, and plenty of interleaved problems must be solved elegantly.) When you say that Docker+etcd is saner than other solutions, I wonder how long that will remain the case as the field matures.
Well "Docker + etcd" still needs a surprising amount of tooling around it to be considered a viable production environment. CoreOS provides a lot of this and deis sits atop that.
Deis is the best thing I've seen come out of Docker's DevOps gold rush.
Indeed... Great news for everyone working with DIY PaaS and 12-factor apps. A smart move from both DO and CoreOS, and it will also benefit projects like Deis and its users.
DigitalOcean doesn't load the kernel from the current system image; instead it uses a prestored external kernel associated with the image. This means that upgrades to the kernel from within the droplet (e.g. distribution security updates) are ignored (see http://digitalocean.uservoice.com/forums/136585-digital-ocea...). There is a workaround using kexec (see https://www.alextomkins.com/2013/11/digitalocean-debian-kern...). Does anybody know if a similar approach would work for CoreOS, given their whole image update process, or whether the DigitalOcean/CoreOS team have already taken care of this some other way?
One nice unrelated thing that didn't make any of the blog posts: DigitalOcean now supports user data when launching instances via the console or the API! But it looks like they still need to update their other OS images to install cloud-init.
When we began the work to integrate with CoreOS, we saw that it was a perfect opportunity to build out the metadata service, which is why we decided to delay the initial launch until we rolled out this service.
After we've had a chance to work through some of the bugs that customers will uncover that we've missed in our testing, we'll move on to updating the rest of our images for the new metadata service and launching it publicly for production use by all customers.
I remember attending a talk given by the CEO a few months ago. The strong point of CoreOS is hosting application servers, because it does automatic restarts/updates, rather than hosting cannot-go-down systems such as databases.
This is exciting to me from a technological standpoint.
1. One of the first large public projects written in Go (after Docker).
2. One of the first large public projects using Raft (a consensus algorithm aimed at replacing Paxos).
I am really looking forward to seeing how this project turns out. Personally, I wouldn't move any of my projects onto CoreOS for at least a few years.
Other than that, I always question how they plan to make money. Consulting model?
So does this mean that when running CoreOS with Docker for your deployment on DigitalOcean, you no longer need to worry about OS-level updates? Is this now handled by DigitalOcean?
In short, CoreOS maintains two OS partitions, A and B. When an update is available, it is automatically downloaded to the B partition, and upon reboot the B partition becomes active, effectively rotating the partitions.
Pretty sure it's something related to the quality of your link. I had no issues in the past SSHing, from Australia no less, into DigitalOcean US DCs.
Also, whilst you are testing Linode, have a look into their behaviour the last few times they were hacked. They deliberately withheld information from their customers. A pretty despicable company, if you ask me.
[1]: https://coreos.com/releases/
[2]: https://coreos.com/docs/running-coreos/platforms/vagrant/
[3]: https://github.com/docker/docker/issues/6345
[4]: http://serverfault.com/a/620513/85897