Well, some notes from when I deployed this the other day.
If running in a Docker container, you'll need to mount /etc/ssl/certs: the etcd container is minimal, and the binary still wants to find some x.509 roots or something, even when running without HTTPS communication (that's what you get for using Go, I guess).
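For reference, a sketch of the kind of invocation I mean (the image tag, ports, and discovery URL are illustrative; adjust for your setup):

```shell
# Mount the host's CA bundle read-only so etcd can find x.509 roots.
# Ports 4001/7001 are the legacy client/peer ports; 2.0 also uses 2379/2380.
docker run -d \
  -v /etc/ssl/certs:/etc/ssl/certs:ro \
  -p 4001:4001 -p 7001:7001 \
  --name etcd \
  quay.io/coreos/etcd \
  -name "$(hostname)" \
  -discovery "$DISCOVERY_URL"
```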
I manually specify the proxy nodes by adding -proxy on (my VPN host was becoming a master, which was not optimal); this may not be a problem for you.
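In case it helps, this is roughly what I mean (the node name and discovery URL are placeholders):

```shell
# Start this node as a proxy: it forwards client requests to the cluster
# but never joins raft, so it can't be elected master.
etcd -name vpn-host \
  -proxy on \
  -discovery "$DISCOVERY_URL"
```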
Removing and re-adding a node was funky for me: although :7001/members is great, removing a node there does not remove it from the discovery node, making rejoining with a clean etcd from the same machine rather painful. Not much that can be done about that though.
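Something like this, if memory serves (I may have the exact path and port wrong for your version, so check the docs):

```shell
# List members to find the ID of the node you want to drop.
curl http://127.0.0.1:7001/v2/members

# Remove a member by ID. Note this does NOT clean up the entry in the
# discovery service, which is what makes a clean rejoin painful.
curl -X DELETE http://127.0.0.1:7001/v2/members/<member-id>
```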
All in all, I think I'll start writing my etcd-compatible interface to ZooKeeper :)
One new 2.0 feature you may like (discovery-wise) is SRV discovery, if you've got an internal DNS/dnsmasq or something. Set a few records for where machines can be found (and keep them appropriately up to date) and it'll do the same thing as a static bootstrap automatically.
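Roughly, the records look something like this (domain and hostnames are made up):

```
; SRV records telling etcd where the cluster peers live
_etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 etcd0.example.com.
_etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 etcd1.example.com.
_etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 etcd2.example.com.
```

and then you point each node at the domain with `etcd -discovery-srv example.com` instead of a static `-initial-cluster` list.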
Unfortunately, I run skydns for service discovery, and having multiple dns servers on the same machine is a PITA. I should note, for anyone considering skydns, that skydns + etcd + docker is circular dependency hell :)
If skydns fails to load its config from etcd, it doesn't terminate, it just continues and emits a warning.
If you are using discovery.etcd.io for your discovery URL, you need to have DNS working.
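For reference, getting a fresh discovery URL from the public service is just an HTTP request (size is the intended cluster size):

```shell
# Ask the public discovery service for a new token for a 3-node cluster.
# It responds with a URL like https://discovery.etcd.io/<token>,
# which you then pass to each node via -discovery.
curl "https://discovery.etcd.io/new?size=3"
```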
At some point, passing multiple --dns flags to docker didn't give me working failover. Or at least, DNS resolution was failing in the etcd container, despite the docker daemon having `--dns 127.0.0.1 --dns 8.8.8.8 --dns 8.8.4.4`. I don't even know.
We had some trouble with etcd at work around last summer (constant leader re-election and high CPU usage), so we switched to Consul, and so far we're happy with it. But etcd seems to be better supported by third-party apps, so maybe we should take it for another spin.
This would certainly help. High CPU was also an issue that we started to notice on 0.4.6, here with some of the NYC CoreOS guys, and that's been fixed. Chalk it up to completely redoing the internal communication.
The team has worked _very_ carefully to follow the raft state machine as described in the paper as closely as possible. For example, we have a set of tests[1] that take possible problems outlined in the original paper and implement them as unit tests.
Seriously, this was originally 0.5. However, because people had started to use 0.4.6 as a pseudo-1.0 in production, and because the internals are completely different, it's a bit of a version jump to a base we actually want to call our stable branch.
To expand: 0.4.6 uses the internal v1 API and 2.0 uses the internal v2 API. It made sense to sync up the internal and external release numbers to make things clearer going forward.
Do excuse my ignorance, but what practical advantages does etcd offer over Cassandra (or even Riak)? To me, it seems that Raft's leader-does-the-heavy-lifting style of replication will only limit the cluster size and thus the horizontal scalability of the cluster. Gossip-based Cassandra has stability and proven scalability.
etcd is designed to store app settings, data for service discovery, feature flags, distributed locks, that type of thing. It's not a general purpose data store and it isn't designed to store data in the same way you would use Cassandra.
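To give a feel for that, setting and reading a feature flag is just HTTP against the keys API (host and port assume a local etcd; 4001 is the legacy client port, newer versions also listen on 2379):

```shell
# Store a small config value under /v2/keys (etcd's HTTP key-value API).
# -L follows the redirect to the current leader if you hit a follower.
curl -L http://127.0.0.1:4001/v2/keys/flags/new-ui -XPUT -d value=true

# Read it back.
curl -L http://127.0.0.1:4001/v2/keys/flags/new-ui
```

Small, consistent, watchable values like this are the sweet spot, not bulk row storage.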
I'm not sure you know how HN works. Just posting isn't enough, you need your colleagues to upvote it if you want the front page. Unless it's super interesting.
It's not difficult to get on the front page, but it has massive PR implications. My previous startup is living proof of that.
And this isn't the first time Docker has announced something on HN and CoreOS has followed straight up with something else. That's why I was wondering.