Flynn: first preview release

dominotw · on April 21, 2014

Could someone point out to me (non superficial) differences between various service discovery projects that have popped up recently like,

[1] etcd [2] skydock [3] consul [4] zookeeper (ok, this is not new) [5] flynn

They all seem to be doing the exact same thing to me.

1. https://github.com/coreos/etcd 2. https://github.com/crosbymichael/skydock 3. https://github.com/hashicorp/consul/ 5. https://flynn.io/

Edit: flynn uses discoverd (https://github.com/flynn/discoverd)

nemothekid · on April 21, 2014

Etcd and Zookeeper provide essentially the same functionality. They are both strongly consistent key/value stores that support notifying clients of changes. These two projects aren't limited to service discovery; a number of projects depend on Zookeeper (and theoretically could be ported to etcd) for various kinds of distributed state management, such as Kafka, Storm, and Hadoop.
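
To make "key/value store with change notifications" concrete, here is a minimal single-process sketch. `WatchableStore`, `watch`, `put`, and `delete` are hypothetical names chosen for illustration; this is not the etcd or ZooKeeper client API, and it has none of the replication or consensus that makes the real systems strongly consistent.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

Callback = Callable[[str, Optional[str]], None]


class WatchableStore:
    """Toy stand-in for a watchable key/value store (no replication)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._watchers: Dict[str, List[Callback]] = defaultdict(list)

    def watch(self, key: str, callback: Callback) -> None:
        # Register a callback to be invoked whenever `key` changes.
        self._watchers[key].append(callback)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value
        self._notify(key, value)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)
        self._notify(key, None)  # None signals that the key is gone

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def _notify(self, key: str, value: Optional[str]) -> None:
        for callback in self._watchers[key]:
            callback(key, value)


# A client watches a service key and records every change it is told about.
events = []
store = WatchableStore()
store.watch("/services/web", lambda k, v: events.append((k, v)))
store.put("/services/web", "10.0.0.5:8080")
store.delete("/services/web")
```

The point is only the shape of the interaction: clients subscribe to a key and get told when it changes, instead of polling.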

Skydock (Skydns) and Consul provide automagic service discovery, primarily through DNS. They use a strongly consistent backend (think Zookeeper or Etcd) to keep track of what servers are active and what applications they run. They then expose a DNS interface that routes through said servers. So let's say you had a client application that talks to a node application that could be on any number of servers. What you could do is hard-code that list into your application and randomly select one, in order to "fake" load balancing. However, every time a machine went up or down you would have to update that list. What Consul provides is that you just tell your app to connect to "mynodeapp.consul", and Consul will give you the proper address of one of your node apps.
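
The contrast between the two approaches can be sketched roughly like this. The `Registry` class and its methods are hypothetical and in-memory; real Consul or SkyDNS answers this kind of lookup over DNS, which the sketch does not attempt to model.

```python
import random
from typing import Dict, List

# Approach 1: a hard-coded list that has to be edited and redeployed
# every time a machine goes up or down.
HARD_CODED = ["10.0.0.1:3000", "10.0.0.2:3000"]


def pick_hard_coded() -> str:
    return random.choice(HARD_CODED)


# Approach 2: ask a registry by name. Instances register and deregister
# themselves, so the client only needs to know the service name.
class Registry:
    """Hypothetical in-memory registry, for illustration only."""

    def __init__(self) -> None:
        self._services: Dict[str, List[str]] = {}

    def register(self, name: str, addr: str) -> None:
        addrs = self._services.setdefault(name, [])
        if addr not in addrs:
            addrs.append(addr)

    def deregister(self, name: str, addr: str) -> None:
        addrs = self._services.get(name, [])
        if addr in addrs:
            addrs.remove(addr)

    def resolve(self, name: str) -> str:
        addrs = self._services.get(name, [])
        if not addrs:
            raise LookupError(f"no known instance of {name}")
        return random.choice(addrs)


registry = Registry()
registry.register("mynodeapp", "10.0.0.1:3000")
registry.register("mynodeapp", "10.0.0.2:3000")
registry.deregister("mynodeapp", "10.0.0.1:3000")  # that machine went down
addr = registry.resolve("mynodeapp")  # only the surviving instance is left
```

In the second approach nothing in the client changes when the set of machines changes, which is the whole appeal.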

Consul and Skydock are both applications that build on top of a tool like Zookeeper and Etcd. [1]

Finally, you can essentially consider all these tools "ops" tools for managing the lifetime of applications, services and servers "in the cloud". What a developer ideally wants to do is just push code and not have to worry about what servers are running what, or about failover and the like. What Flynn provides (if I get it) is a DIY Heroku-like platform that makes use of tools like Etcd, Skydns, Docker(?) and others. With Flynn, I believe the goal is to radically simplify ops. Another project that I believe may be similar to Flynn is Apache Mesos.

[1] Technically, Consul and Skydock, IIRC, don't actually use Etcd or Zookeeper; both implement an underlying protocol (Raft) to achieve the same thing, though they could use Etcd or Zookeeper instead of implementing the protocol themselves.

justincormack · on April 21, 2014

Consul actually uses Serf rather than implementing it itself.

But a good overview, thanks.

Dobbs · on April 21, 2014

Consul actually uses serf for communicating between edge nodes. The central servers (think your zookeeper cluster) use their own implementation of Raft for internal communication.

andrewmunsell · on April 21, 2014

Flynn isn't a service discovery library by itself. It's more of a self hosted Heroku. Though, it does use etcd for service discovery.

dominotw · on April 21, 2014

Are you sure flynn uses etcd for service discovery instead of its own[1]? Looks like it uses Google Omega instead of RAFT which etcd uses. https://github.com/flynn/discoverd

andrewmunsell · on April 21, 2014

You can read more about service discovery on the Flynn site: https://flynn.io/docs/architecture#service-discovery

discoverd is an API on top of another system. Right now, Flynn uses discoverd backed by etcd, though they indicate that this could be replaced with Zookeeper, etc. due to the modular design.
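
As a rough illustration of what "an API on top of another system" means here, a pluggable-backend design might look like the sketch below. Every name in it (`Backend`, `InMemoryBackend`, `Discovery`, the key layout) is made up for the example; it is not discoverd's actual interface.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Backend(ABC):
    """What the discovery layer needs from whatever store sits underneath."""

    @abstractmethod
    def set(self, key: str, value: str) -> None: ...

    @abstractmethod
    def delete(self, key: str) -> None: ...

    @abstractmethod
    def list_prefix(self, prefix: str) -> Dict[str, str]: ...


class InMemoryBackend(Backend):
    """Stand-in backend; an etcd- or ZooKeeper-backed class would go here."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self._data[key] = value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)

    def list_prefix(self, prefix: str) -> Dict[str, str]:
        return {k: v for k, v in self._data.items() if k.startswith(prefix)}


class Discovery:
    """Service-discovery API that only talks to the Backend interface."""

    def __init__(self, backend: Backend) -> None:
        self._backend = backend

    def register(self, service: str, addr: str) -> None:
        self._backend.set(f"/services/{service}/{addr}", addr)

    def unregister(self, service: str, addr: str) -> None:
        self._backend.delete(f"/services/{service}/{addr}")

    def instances(self, service: str) -> List[str]:
        return sorted(self._backend.list_prefix(f"/services/{service}/").values())


disc = Discovery(InMemoryBackend())
disc.register("web", "10.0.0.1:80")
disc.register("web", "10.0.0.2:80")
disc.unregister("web", "10.0.0.1:80")
web_instances = disc.instances("web")
```

Swapping the store then means writing another `Backend` subclass; callers of `Discovery` would not change.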

nemothekid · on April 21, 2014

As I understand it Google Omega and RAFT are two different things. Google Omega is Google's answer to Apache Mesos. Omega would need a service like Raft to understand what services are currently available. In short, Google Omega is not a service discovery service.

hendzen · on April 21, 2014

Actually, I think Google Omega is the next-generation scheduler after their original Borg scheduler, which in turn inspired Mesos.

The design of Omega is quite different than Mesos. I really do recommend reading the Mesos & Omega papers, they are quite interesting.

And yes, Raft is a consensus algorithm for keeping a set of distributed state machines in a consistent state. Mesos actually uses ZooKeeper under the hood for master election, which uses its own consensus algorithm (ZAB - Zookeeper Atomic Broadcast).

thu · on April 21, 2014

Very first line of discoverd's README:

    It's currently backed by etcd[...]

dominotw · on April 21, 2014

totally missed that. thank you.

SEJeff · on April 23, 2014

Check this out for an excellent and detailed technical comparison of consul vs zookeeper/etcd/doozerd/chef/serf/skydns/etc:

http://www.consul.io/intro/vs/index.html

Also, discoverd is pluggable, but currently relies on etcd.

sirsar · on April 22, 2014

I feel rather silly now, but...

What does Flynn do? The word "ops" is so central to the pitch and so ambiguous to me that I have absolutely no idea what the function of this software (it is software, right?) is.

benatkin · on April 22, 2014

It's a project to build something like https://github.com/progrium/dokku

Its goals are more ambitious than Dokku but it's too early to tell if it will be any good.

malandrew · on April 22, 2014

Or this:

http://deis.io/

Jemaclus · on April 22, 2014

I'm not a DevOps person. Can someone explain in less jargony terms what this is? Is it like Heroku? I have no idea.

jmspring · on April 22, 2014

Even with experience in ops, my brain hurts reading this. At first quick glance, I thought it was another HBO silicon valley spoof.

Jemaclus · on April 22, 2014

It seems to me that a startup of any kind should have some sort of tagline or side bar that explains it like I'm five. If I'm not an Ops person and I come across this, then if I can't grasp what it is or how useful it might be right off the bat, I won't recommend it to my Ops person at work, right? I understand the target audience is DevOps, but I think every startup should have a simple-yet-accurate layman's explanation on the front page. If you can't do that, maybe your service is too complicated?

This is something I think most startups miss when they do landing pages, unfortunately. Even as a frequent Hacker News reader, I have no idea what's going on... and that really shouldn't happen. :(

/rant. :)

jmspring · on April 22, 2014

The landing page hasn't dislodged me from my dinosaur ways.

Possible weekend project idea -- startup landing pages for the common man. A service to crowdsource/automate (and then index) the translation of buzz into something that at least you or I understand in 5-10 seconds, and a more lay person in, say, 30 seconds.

Jemaclus · on April 22, 2014

Do it.

erkkie · on April 22, 2014

In short yes, it's like self-hosted heroku. They use docker and the heroku buildpack interface and many different components to coordinate container management, logging, etc to provide a whole system.

cellis · on April 21, 2014

Can anyone from the Flynn project comment on this versus Deis? I've tried Dokku and thought that was pretty slick, but ultimately decided to move on to a Deis/Chef devops setup, because dokku was too lightweight for my needs and Flynn wasn't ready.

danielsiders · on April 21, 2014

Flynn definitely isn't "ready" yet. This is a preview release, production grade stability is still a few months away.

I haven't looked closely at Deis in several months, but Flynn doesn't use Chef (or anything similar) anywhere and we don't plan to.

bacongobbler · on April 21, 2014

Deis dev here. We've removed the Chef dependency as of v0.7.0 in favour of CoreOS and fleet for container/machine job scheduling. That ship set sail only recently in master. :)

fizx · on April 22, 2014

Good riddance. Oh, and Chef is still in your website's meta tags (and therefore google results).

bacongobbler · on April 22, 2014

We'll be updating the website in preparation for our next release. Most users are still on v0.7.0 or earlier. Once it's been released then our website should reflect the changes in master.

anentropic · on April 22, 2014

oh good... time to have another look at Deis then :)

akhatri_aus · on April 21, 2014

I've been waiting ages for Flynn; it's an awesome project. I used Dokku and it's lovely and so simple.

Is there a way to install it manually on a server with stuff already on it?

Also, is the source of the AWS flynn.cupcake.io tool on GitHub? It would be nice to see what's being installed onto an AWS cluster.

danielsiders · on April 21, 2014

Running it on your own server: We're currently only providing step-by-step instructions for Vagrant, but you should be able to use the script in the Vagrantfile on your own host with some modifications. Happy to help in IRC (#flynn on freenode)

Flynn Dashboard: At least some part of this will end up as open source in the not distant future, we're trying to figure out the best strategy for additional tools like the dashboard and what parts should be packaged.

ghayes · on April 21, 2014

On that note, it would be great to know what AWS `User Policy` Flynn needs / expects so I could grant it a unique IAM key with limited access for creating servers, etc.

Titanous · on April 21, 2014

We haven't nailed down the exact API calls that we're going to need long-term so there isn't a policy we're recommending. Currently EC2 and Route53 full access will work, but we recommend creating a new AWS account for security reasons if you have anything else running.
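
For anyone who wants to try that with a dedicated IAM user, a policy granting "EC2 and Route53 full access" could be generated along these lines. This is a sketch under the assumption that full access means wildcard actions on all resources; it is not a policy the Flynn team has published, and it is deliberately broad.

```python
import json

# Assumption: "full access" is expressed as wildcard actions for the two
# services on every resource. Tighten this once the required API calls
# are known.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["ec2:*", "route53:*"],
            "Resource": "*",
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The printed JSON can be pasted in as a custom policy for the user or group.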

notdonspaulding · on April 21, 2014

FWIW, the default PowerUser role that can be assigned to an IAM account was sufficient for my tests. I don't know if it could have worked with fewer permissions.

groby_b · on April 21, 2014

Very nit-picky comment, but: Can you explain in one or two sentences what the value proposition is?

I might be dense, but it took me quite a bit of clicking around, reading HN comments, and poking at the Omega paper to understand what you want to do there. (Well, I think I understand. The "private Heroku" line points in the right direction, at least)

danielsiders · on April 21, 2014

Flynn is the product ops should provide to developers.

Flynn is a single platform that runs all your services from databases to applications to individual Linux processes. With Flynn ops teams can stop being consultants and start providing a single product to their internal "customers". Basically you can deploy and scale whatever you want all in one place on any infrastructure without having to think about individual hosts.

groby_b · on April 22, 2014

Yes. I read the site, as I said. Reposting the text from there is kind of pointless, no?

I don't ultimately care one way or another - it was meant as friendly feedback that it's unclear right now. You'll notice that quite a few of the comments also said something to the effect of "I guess it does this". Which means it's not entirely clear what it does to more people than just me.

zakelfassi · on April 21, 2014

About time. One of the very few projects I'm following closely and excited about. Can't wait to compare it side-by-side with Dokku.

stavros · on April 21, 2014

Are these sorts of services good enough? I want somewhere to deploy my Django apps, ideally I should be able to get a beefy server and deploy every app as a single thing on that server, completely compartmentalized.

The problem I keep hitting is that these things have a tradeoff between ease of use and power. I want to use Docker, but it has no easy way to say "take this file that contains instructions and make everything". You can write Dockerfiles, but you can only use one part of the stack in them, otherwise you run into trouble.

I'm hoping Flynn will allow me to use a database, Redis, uwsgi, nginx, etc with a simple deployment command. Does anyone know of anything else like this?

danielsiders · on April 21, 2014

Built-in database appliances are a big part of what Flynn will offer in the next few months (right now it's only Postgres).

davidcelis · on April 21, 2014

One thing I've been wondering about with Flynn is data persistence. I have an app already running that I'd like to move to Flynn for deployment; will it be easy to bring over a populated database and have it persist, shared, across instances of the app?

Titanous · on April 21, 2014

Currently there is a very alpha Postgres appliance[0] (including automatic cluster orchestration and a database provisioning API) included with Flynn and we'll be standardizing the appliance model and adding more in the future. Flynn datastore appliances run inside of containers and are managed by Flynn just like everything else and will include support for backups, HA/replication, etc. out of the box with close to zero configuration.

[0] https://github.com/flynn/flynn-postgres

davidcelis · on April 21, 2014

Thanks. It's been pretty tough to wait for Flynn's release. Great job so far, can't wait for it to get to stability so I can migrate!

cheez · on April 21, 2014

How do people build up the experience + the time to develop this kind of thing?

danielsiders · on April 21, 2014

We actually did a crowdfunding campaign[1] last summer to pay for the development time.

[1] https://news.ycombinator.com/item?id=6058662

thu · on April 21, 2014

Actually, after Jeff Lindsay's latest blog post[0] I thought that the project was put on the back burner. What is the relationship between Cupcake and the initial author(s)?

[0]: http://progrium.com/blog/2014/02/06/the-start-of-the-age-of-...

danielsiders · on April 21, 2014

Cupcake (Apollic Software) has been the company behind Flynn (and Tent) since day 0 (we tend to downplay the branding because we think both projects can stand on their own).

Most of the development on Flynn (nearly all of the 2014 work) was done by Jonathan Rudenberg (@titanous), one of the founders of Cupcake. We brought in Jeff as a contractor and paid him out of the crowdfunding campaign. He contributed to the initial architecture and prototypes but stepped back in December.

I don't want to speak for Jeff (@progrium) but my understanding is that he'll keep evangelizing the project at conferences and may be working on a proprietary version/components at Digital Ocean.

The Cupcake team (especially Jonathan) will keep developing Flynn (of course with open source contributions) full time in the future.

progrium · on April 22, 2014

Yeah, Flynn is open source so I can contribute as necessary, but Cupcake is continuing to maintain and build the project.

My specific long-term plan is still vague, but it will involve Flynn and so far I've been finding ways to fund my time to invest in some R&D type projects for the Flynn ecosystem.

DigitalOcean's plan is not set either, but my work there should involve Flynn and that ecosystem / architecture / worldview.

thu · on April 21, 2014

Thanks a lot for your answer!

cheez · on April 21, 2014

That's really interesting, I'm glad you delivered.

necubi · on April 21, 2014

Can anybody discuss how Flynn compares to Mesos? Superficially, it seems to be solving the same sort of problems.

danielsiders · on April 21, 2014

Right now Flynn Layer 0 is very immature compared to Mesos, but yes they're trying to solve similar problems. After Flynn reaches production stability and builds out more features, we expect Layer 0 to be a valid (and much lighter weight) alternative to Mesos that we hope will be a superior solution for a broad class of users.

I feel like the projects have very different prospective user bases and communities (Mesos is an Apache project, hundreds of thousands of lines of C++; Flynn Layer 0 is run by a startup and only a few thousand lines of Go) and will likely develop in very different directions to service those communities.

That being said, we've explored creating a version of Flynn layer 1 components that run on Mesos instead of Flynn Layer 0 for users who are already deeply invested in the Mesos ecosystem.

mahler · on April 22, 2014

1. Mesos is very mature software, we take reliability, quality, and backwards compatible upgrades very seriously as there are companies currently relying on these properties.

2. At a high level, Mesos aims to provide abstraction for building distributed applications. This means "frameworks" are either built on top, like Aurora, Marathon, Chronos, etc. Or frameworks are existing distributed applications that are made to run on top, like Spark, Hadoop, Jenkins, distcc, etc. The goal being to run these distributed applications together in the same cluster in order to simplify operational complexity and gain efficiency. In this sense, Mesos is trying to build and grow the common lower level abstraction, akin to a "kernel" for the datacenter.

3. Flynn is aiming to solve a much broader set of problems, by providing a PaaS, (Mesos is more like an IaaS, PaaS should be built / run on top). Flynn is aiming to provide something that is immediately useful on its own, that means things in the layer 1 listed on the website are included. Flynn is aiming to provide some of these "schedulers" out of the box. That is my understanding from reading their website.

4. I'm not sure the authors of Flynn fully comprehend the subtlety that exists between Omega and Mesos. Unfortunately, there are some primitives in Mesos that have been discussed for quite some time and have yet to be implemented that aim to alleviate the issues brought up in the Omega paper. I think the Omega model makes sense at Google, where they have complete control over the schedulers. However, in the open source world, I think the Mesos model is more appropriate (this claim really warrants its own post). With additional primitives, like "optimistic offers", "revocable offers", and "over-subscription", many of the issues discussed in the Omega paper should be remediated.

Disclaimer: I am a Mesos PMC member. :)

hendzen · on April 21, 2014

It seems to partially duplicate the functionality of Mesos, as they are writing their own task scheduling framework [0] based on Google's Omega [1].

The Omega authors claim that Mesos' system of application specific schedulers accepting resource offers from the Mesos master is well suited toward short-lived jobs (think ephemeral Map/Reduce or MPI type workloads) but is not well suited for long lived 'service' jobs (like a Rails app or DB server). As this seems to be an important use-case for Flynn, it would seem like a valid architecture decision to not use Mesos.

[0] - https://github.com/flynn/flynn-host/tree/master/sampi

[1] - http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/S...

presspot · on April 22, 2014

Mesos is exceptionally good at managing long-running services, and that use case represents about 50% of the workloads I've seen on large production clusters.

"Scheduling" long-running services is straightforward, as they typically only need to be run "once." It's trivial to use something like Marathon [0] to do that, and you then immediately benefit from Mesos' fault-tolerance and self-healing. Marathon also makes it easy to elastically scale the long-running processes (e.g., start more Rails servers when traffic increases).

[0] - https://github.com/mesosphere/marathon

necubi · on April 21, 2014

I haven't read the Omega paper yet, but plenty of people are running long-running tasks in Mesos (Marathon [0] is a framework for doing just that).

[0] https://github.com/mesosphere/marathon

hendzen · on April 21, 2014

From the Omega paper (section 4.2):

  Mesos achieves fairness by alternately offering all 
  available cluster resources to different schedulers,
  predicated on assumptions that resources become available
  frequently and scheduler decisions are quick. As a result,
  a long scheduler decision time means that nearly all 
  cluster resources are locked down for a long time, inaccessible
  to other schedulers. The only resources available for other
  schedulers in this situation are the few becoming available
  while the slow scheduler is busy. These are often insufficient
  to schedule an above-average size batch job, meaning that
  the batch scheduler cannot make progress while the service
  scheduler holds an offer. It nonetheless keeps trying, and
  as a consequence, we find that a number of jobs are abandoned 
  because they did not finish scheduling their tasks by
  the 1,000-attempt retry limit in the Mesos case (Figure 7c).
  This pathology occurs because of Mesos’s assumption
  of quick scheduling decisions, small jobs and high resource
  churn, which do not hold for our service jobs. Mesos
  could be extended to make only fair-share offers, although
  this would complicate the resource allocator logic, and the
  quality of the placement decisions for big or picky jobs
  would likely decrease, since each scheduler could only see
  a smaller fraction of the available resources. We have raised
  this point with the Mesos team; they agree about the 
  limitation and are considering to address it in future work.

It's worth noting that Andy Konwinski was a coauthor on both Mesos & Omega, so I'd hope they (Omega authors) represented Mesos' capabilities accurately. I don't have any personal experience running Mesos in production, I'm just going off what was written.
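
The pathology in the quoted passage, and the shared-state alternative the Omega paper argues for, can be shown with a toy model. Both classes below are deliberately simplistic and hypothetical: they are not Mesos, Omega, or Flynn's scheduler code, and the "offer everything to one scheduler at a time" behavior follows the quote's description rather than any particular Mesos release.

```python
from typing import Optional, Set


class OfferAllocator:
    """Toy offer model: all free machines go to one scheduler at a time."""

    def __init__(self, machines: Set[str]) -> None:
        self.free = set(machines)
        self.holder: Optional[str] = None  # scheduler holding the offer

    def offer(self, scheduler: str) -> Set[str]:
        if self.holder is not None:
            return set()  # everything is tied up in the outstanding offer
        self.holder = scheduler
        return set(self.free)

    def accept(self, scheduler: str, chosen: Set[str]) -> None:
        if self.holder != scheduler:
            raise RuntimeError("scheduler does not hold the offer")
        self.free -= chosen
        self.holder = None


class SharedState:
    """Toy shared-state model: snapshot freely, commit optimistically."""

    def __init__(self, machines: Set[str]) -> None:
        self.free = set(machines)

    def snapshot(self) -> Set[str]:
        return set(self.free)

    def commit(self, chosen: Set[str]) -> bool:
        if not chosen <= self.free:
            return False  # conflict: someone else took a machine first
        self.free -= chosen
        return True


machines = {"m1", "m2", "m3"}

# Offer model: while the slow service scheduler deliberates, batch sees nothing.
alloc = OfferAllocator(machines)
service_offer = alloc.offer("service")
batch_offer = alloc.offer("batch")

# Shared-state model: both see the whole cluster; conflicts surface at commit.
state = SharedState(machines)
service_view = state.snapshot()
batch_view = state.snapshot()
batch_ok = state.commit({"m1"})             # batch commits first
service_ok = state.commit({"m1", "m2"})     # rejected, m1 is already taken
service_retry = state.commit({"m2", "m3"})  # retry against fresh state
```

The trade-off is visible even at this scale: the offer model never produces conflicts but can starve other schedulers, while the shared-state model keeps everyone busy at the cost of occasional rejected commits and retries.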

necubi · on April 21, 2014

Ah, interesting. My personal experience with Mesos has included clusters that are basically all transient services (like map reduce jobs) or all long running services, but not both. I can see how that might lead to pathological scheduling decisions.

gales · on April 21, 2014

Does anyone know if Flynn components can be run on CoreOS? Also, does Flynn offer the ability to use sub-domains for accessing apps, with either a random string or a custom one, e.g. //dev.example.com? Thx

danielsiders · on April 21, 2014

Flynn is designed to run on any modern Linux kernel. We haven't tested CoreOS yet, but once we put out a few fires we'll explore more fully.

You can add any route to any app in Flynn, so different subdomains, domains, and TCP ports are pretty easy.

louis-paul · on April 21, 2014

Flynn is very interesting! I've been following its development since the project was made public and the team behind it made great progress during the last months.

danielsiders · on April 21, 2014

Thanks! Let us know if you have any questions!

andrewmunsell · on April 21, 2014

It's great to see the project making progress. Are there any instructions for starting up a Flynn cluster on something other than Vagrant?

Titanous · on April 21, 2014

There are currently no tutorial-style instructions for using anything but Vagrant, but you should be able to just run these commands in whatever environment you are using: https://github.com/flynn/flynn-demo/blob/26fcd98cd7646199513...

dpritchett · on April 21, 2014

PSA to anyone trying this on Ubuntu 14: the docker command seems to be "docker.io" now.

shykes · on April 21, 2014

There is an ongoing conundrum in the downstream Debian/Ubuntu packaging of Docker. Our friends at Debian and Ubuntu are doing their best to solve it.

In the meantime, you should either a) install the latest docker from the official docker APT repos http://docs.docker.io/installation/ or b) do something like ln /usr/bin/docker{.io,}

In any case, don't leave your Docker install without a docker binary in your path. That will universally break all tutorials and scripts ever written for Docker, and is unsupported by us (Docker maintainers).

dpritchett · on April 21, 2014

Yeah, I went with the `ln -s` method. Thanks Solomon!

shykes · on April 21, 2014

Another thing to watch out for: downstream packages (the one you have installed) are out of date. The current version of Docker is 0.10.0, and it ships a ton of improvements. For that reason alone you should consider installing from upstream repos, at least until downstream catches up. Just keep it in a separate file in /etc/apt/sources.list.d to easily remove it later.

bigonlogn · on April 21, 2014

FYI, the "docker.io" repository in trusty points to an older version of docker (0.9.1). I believe the script at http://get.docker.io/ubuntu will install the latest version of docker.

andrewmunsell · on April 21, 2014

Kind of what I figured-- trying it out now. Thanks!

auvi · on April 22, 2014

My first impression was that it has to do something with TRON's Kevin Flynn. Quoting him: "I tried to picture clusters of information as they moved through the computer. What did they look like? Ships? Motorcycles? Were the circuits like freeways?"

ochoseis · on April 22, 2014

How does Flynn compare to OpenShift or Cloud Foundry? Is it trying to solve the same problems, perhaps leapfrogging them in some ways?

[Edit] I would also ask the same about Apache Mesos since that seems like it's coming up a lot in this discussion as well.

milkmiruku · on April 22, 2014

I too am interested in the comparison with such projects and the differing requirements they cater for. I'm also working on some notes on https://wiki.thingsandstuff.org/Stack if anyone is interested in bettering the order/outline/etc.

_vya7 · on April 21, 2014

Images containing text that I can't select? But.. why?

RivieraKid · on April 22, 2014

How does this relate to AWS or OpenStack? Is it somethin' like Heroku – i.e. somethin' that can run on top of AWS?

toisanji · on April 21, 2014

I'm looking forward to playing with this!

deltron · on April 21, 2014

Was the codename Walt Jr?

Goranek · on April 21, 2014

Not sure why this is downvoted... but it made my day.

pcmonk · on April 21, 2014

It was likely downvoted since it was essentially content-free. While we can all appreciate a good joke, they lower the signal-to-noise ratio. This isn't reddit. A good comment says something constructive about the product.