Hacker News
Etcd 2.0 – First Major Stable Release (coreos.com)
120 points by philips on Feb 10, 2015 | hide | past | favorite | 29 comments



Well, some notes from when I deployed this the other day.

If running in a Docker container, you'll need to mount /etc/ssl/certs, as the etcd container is minimal and the binary will want to find x.509 root certificates, even when running without HTTPS communication (that's what you get for using Go, I guess).
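For reference, a minimal sketch of the mount (the image tag, ports, and paths here are illustrative, not from my actual deployment):

```shell
# Bind-mount the host's CA bundle read-only so the Go TLS stack can find
# its x.509 roots inside the otherwise-minimal container.
docker run -d \
  -v /etc/ssl/certs:/etc/ssl/certs:ro \
  -p 4001:4001 -p 7001:7001 \
  quay.io/coreos/etcd:v2.0.0
```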

My setup for command line flags is

    --name "{{ hostname }}" \
    --initial-advertise-peer-urls "http://{{ clusterip }}:7001" \
    --advertise-client-urls "http://{{ clusterip }}:4001" \
    --listen-client-urls 'http://0.0.0.0:4001' \
    --listen-peer-urls 'http://0.0.0.0:7001' \
    --discovery "{{ discovery url }}"
I manually specify the proxy nodes by adding --proxy 'on' (my VPN host was becoming a master, which was not optimal); this may not be a problem for you.
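For completeness, a hypothetical proxy-node invocation (the token is a placeholder, and the flags mirror my setup above rather than any canonical config):

```shell
# A proxy joins via the same discovery URL but never becomes a cluster
# member; it only forwards client requests to the actual cluster.
etcd --proxy on \
  --listen-client-urls 'http://0.0.0.0:4001' \
  --discovery "https://discovery.etcd.io/<token>"
```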

Removing and re-adding a node was funky for me: although :7001/members is great, removing a node there does not remove it from the discovery service, making rejoining with a clean etcd from the same machine rather painful. Not much that can be done about that, though.
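The dance I mean, roughly (host, port, and member ID are illustrative; check the 2.0 members API docs for your build):

```shell
# List current members to find the ID of the node to drop...
curl http://127.0.0.1:4001/v2/members
# ...then remove it from the cluster. Note this does NOT touch the
# discovery URL's record of the cluster, hence the rejoin pain.
curl -X DELETE http://127.0.0.1:4001/v2/members/<member-id>
```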

All in all, I think I'll start writing my etcd-compatible interface to ZooKeeper :)


One new 2.0 feature you may like (discovery-wise) is SRV discovery, if you've got an internal DNS/dnsmasq or something. Set a few records for where machines can be found (and keep them appropriately up to date), and it'll do the same thing as a static bootstrap automatically.
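Roughly like this with dnsmasq, using the record name I believe etcd 2.0 queries (domain and hostnames are made up; double-check the flag and SRV names against the docs):

```shell
# dnsmasq.conf: one SRV record per peer under _etcd-server._tcp.<domain>
srv-host=_etcd-server._tcp.example.com,node1.example.com,7001
srv-host=_etcd-server._tcp.example.com,node2.example.com,7001
srv-host=_etcd-server._tcp.example.com,node3.example.com,7001
```

Each node then starts with `etcd --discovery-srv example.com` instead of a static initial-cluster list.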


Unfortunately, I run skydns for service discovery, and having multiple dns servers on the same machine is a PITA. I should note, for anyone considering skydns, that skydns + etcd + docker is circular dependency hell :)


What problems have you had? We're using a similar setup but SkyDNS has been pretty trouble-free so far.


If skydns fails to load its config from etcd, it doesn't terminate, it just continues and emits a warning.

If you are using discovery.etcd.io for your discovery URL, you need to have DNS working.

At some point, passing multiple --dns flags to Docker didn't give working failover. Or at least, DNS resolution was failing in the etcd container despite the Docker daemon having `--dns 127.0.0.1 --dns 8.8.8.8 --dns 8.8.4.4`. I don't even know.


We had some trouble with etcd at work around last summer, with constant leader re-election and high CPU usage. We switched to Consul and so far we are happy with it, but etcd seems to be better supported by third-party apps, so maybe we should take it for another spin.


This would certainly help. High CPU was also an issue that we started to notice on 0.4.6 (here with some of the NYC CoreOS guys), and that's been fixed. Chalk it up to completely redoing internal communication.

EDIT: Master election too :)

(Disclaimer: etcd dev here)


Are you guys still per-spec Raft, or have you diverged at this point?


The team has worked _very_ carefully to follow the Raft state machine as described in the paper as closely as possible. For example, we have a set of tests[1] that takes possible problems outlined in the original paper and implements them as unit tests.

[1] https://github.com/coreos/etcd/blob/master/raft/raft_paper_t...


Keep in mind that reads do not go through Raft by default, so it's possible to get stale data.

They've added "consistent=true"/"quorum=true" URL parameters for GETs per https://github.com/coreos/etcd/issues/741 as a workaround.
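For example (key name invented, standard v2 keys endpoint assumed):

```shell
# Default read: may be served locally by a follower, so it can be stale.
curl 'http://127.0.0.1:4001/v2/keys/foo'
# Quorum read: routed through the leader, linearizable but slower.
curl 'http://127.0.0.1:4001/v2/keys/foo?quorum=true'
```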


I had problems due to high-cpu load as well. Haven't had an issue since I updated to latest etcd sometime late last year.


I had the same issues around July-August last year and kept upgrading as they released new versions; somewhere along the way it got fixed.

We also had to fine-tune some timeouts (election and compression, if I remember correctly) to get the best performance out of it.
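Something along these lines (values are in milliseconds and purely illustrative; the usual rule of thumb is an election timeout several times the heartbeat interval):

```shell
# Slower heartbeats and a longer election timeout for a higher-latency
# network, to avoid spurious leader re-elections.
etcd --heartbeat-interval 250 --election-timeout 1250 \
  --name "{{ hostname }}" --discovery "{{ discovery url }}"
```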


Pretty sure the high-cpu issue was a time.Ticker leak that was fixed earlier.


Was the 1.0 release not major, or not stable?


How about not existing?

Seriously, this was originally going to be 0.5. However, because people had started to use 0.4.6 as a pseudo-1.0 in production, and because the internals are completely different, there's a bit of version jumping to get to a base we actually want to call our stable branch.


To expand: 0.4.6 uses the internal v1 API and 2.0 uses the internal v2 API. It made sense to sync up the internal and external release numbers to make things clearer going forward.


Or just Larry Ellison versioning.


etcd 3.0 :-)


Do excuse my ignorance, but what practical advantages does etcd offer over Cassandra (or even Riak)? To me, it seems that Raft's leader-does-the-heavy-lifting style of replication will only limit the cluster size and thus the horizontal scalability of the cluster. Gossip-based Cassandra has stability and proven scalability.


etcd is designed to store app settings, data for service discovery, feature flags, distributed locks, that type of thing. It's not a general purpose data store and it isn't designed to store data in the same way you would use Cassandra.
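A flavor of that usage via the v2 HTTP API (key names and values invented):

```shell
# Store and read back a small piece of app config.
curl -X PUT http://127.0.0.1:4001/v2/keys/config/feature-x -d value=enabled
curl http://127.0.0.1:4001/v2/keys/config/feature-x
# TTL'd keys are the building block for service discovery and locks:
# the entry vanishes unless the service keeps refreshing it.
curl -X PUT http://127.0.0.1:4001/v2/keys/services/web/node1 \
  -d value='10.0.0.5:8080' -d ttl=30
```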


Any idea when we might see this included in a CoreOS release?


Excellent release notice, as it does not leave any questions as to what the software is supposed to do.


Did you guys time this post with the Docker 1.5 release post?


Note the date.


Yeah, why not post it on HN on Jan 28th? :)


You assume that people can magically decide when their stuff appears on HN.


The poster works for CoreOS...not so magical.


Again, you assume that only CoreOS got to decide when to post it. There was nothing stopping you or anyone else from posting it.

Still, if you think that getting on HN is the highest level of PR mastery, I have a hello_world.go-to-io.js transpiler to sell you.


I'm not sure you know how HN works. Just posting isn't enough; you need your colleagues to upvote it if you want the front page, unless it's super interesting.

It's not difficult to get on the front page, but it has massive PR implications. My previous startup is living proof of that.

And this isn't the first time Docker has announced something on HN and CoreOS has followed straight up with something else. That's why I was wondering.



