Can anybody discuss how Flynn compares to Mesos? Superficially, it seems to be s...

danielsiders · on April 21, 2014

Right now Flynn Layer 0 is very immature compared to Mesos, but yes they're trying to solve similar problems. After Flynn reaches production stability and builds out more features, we expect Layer 0 to be a valid (and much lighter weight) alternative to Mesos that we hope will be a superior solution for a broad class of users.

I feel like the projects have very different prospective user bases and communities (Mesos is an Apache project, hundreds of thousands of lines of C++; Flynn Layer 0 is run by a startup and only a few thousand lines of Go) and will likely develop in very different directions to service those communities.

That being said, we've explored creating a version of Flynn layer 1 components that run on Mesos instead of Flynn Layer 0 for users who are already deeply invested in the Mesos ecosystem.

mahler · on April 22, 2014

1. Mesos is very mature software, we take reliability, quality, and backwards compatible upgrades very seriously as there are companies currently relying on these properties.

2. At a high level, Mesos aims to provide abstraction for building distributed applications. This means "frameworks" are either built on top, like Aurora, Marathon, Chronos, etc. Or frameworks are existing distributed applications that are made to run on top, like Spark, Hadoop, Jenkins, distcc, etc. The goal being to run these distributed applications together in the same cluster in order to simplify operational complexity and gain efficiency. In this sense, Mesos is trying to build and grow the common lower level abstraction, akin to a "kernel" for the datacenter.

3. Flynn is aiming to solve a much broader set of problems, by providing a PaaS, (Mesos is more like an IaaS, PaaS should be built / run on top). Flynn is aiming to provide something that is immediately useful on its own, that means things in the layer 1 listed on the website are included. Flynn is aiming to provide some of these "schedulers" out of the box. That is my understanding from reading their website.

4. I'm not sure the authors of Flynn fully comprehend the subtlety that exists between Omega and Mesos. Unfortunately, there are some primitives in Mesos that have been discussed for quite some time and have yet to be implemented that aim to alleviate the issues brought up the Omega paper. I think the Omega model makes sense at Google, where they have complete control over the schedulers. However, in the open source world, I think the Mesos model is more appropriate (this claim really warrants it's own post). With additional primitives, like "optimistic offers", "revocable offers", and "over-subscription", many of the issues discussed in the Omega paper should be remediated.

Dislaimer: I am a Mesos PMC member. :)

hendzen · on April 21, 2014

It seems to partially duplicate the functionality of Mesos, as they are writing their own task scheduling framework [0] based on Google's Omega [1].

The Omega authors claim that Mesos' system of application specific schedulers accepting resource offers from the Mesos master is well suited toward short-lived jobs (think ephemeral Map/Reduce or MPI type workloads) but is not well suited for long lived 'service' jobs (like a Rails app or DB server). As this seems to be an important use-case for Flynn, it would seem like a valid architecture decision to not use Mesos.

[0] - https://github.com/flynn/flynn-host/tree/master/sampi

[1] - http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/S...

presspot · on April 22, 2014

Mesos is exceptionally good at managing long-running service and that use case represents about 50% of the workloads I've seen on large production clusters.

"Scheduling" long-running services is straightforward, as they typically only need to be run "once." It's trivial to use something like Marathon [0] to do that, and you then immediately benefit from Mesos' fault-tolerance and self-healing. Marathon also makes it easy to elastically scale the long-running processes (e.g., start more Rails servers when traffic increases).

[0] - https://github.com/mesosphere/marathon

necubi · on April 21, 2014

I haven't read the Omega paper yet, but plenty of people are running long-running tasks in Mesos (Marathon [0] is a framework for doing just that).

[0] https://github.com/mesosphere/marathon

hendzen · on April 21, 2014

From the Omega paper (section 4.2):

  Mesos achieves fairness by alternately offering all 
  available cluster resources to different schedulers,
  predicated on assumptions that resources become available
  frequently and scheduler decisions are quick. As a result,
  a long scheduler decision time means that nearly all 
  cluster resources are locked down for a long time, inaccessible
  to other schedulers. The only resources available for other
  schedulers in this situation are the few becoming available
  while the slow scheduler is busy. These are often insufficient
  to schedule an above-average size batch job, meaning that
  the batch scheduler cannot make progress while the service
  scheduler holds an offer. It nonetheless keeps trying, and
  as a consequence, we find that a number of jobs are abandoned 
  because they did not finish scheduling their tasks by
  the 1,000-attempt retry limit in the Mesos case (Figure 7c).
  This pathology occurs because of Mesos’s assumption
  of quick scheduling decisions, small jobs and high re-
  source churn, which do not hold for our service jobs. Mesos
  could be extended to make only fair-share offers, although
  this would complicate the resource allocator logic, and the
  quality of the placement decisions for big or picky jobs
  would likely decrease, since each scheduler could only see
  a smaller fraction of the available resources. We have raised
  this point with the Mesos team; they agree about the 
  limitation and are considering to address it in future work.

Its worth noting that Andy Konwinski was a coauthor on both Mesos & Omega, so I'd hope they (Omega authors) represented Mesos' capabilties accurately. I don't have any personal experience running Mesos in production, I'm just going of what was written.

necubi · on April 21, 2014

Ah, interesting. My personal experience with Mesos has included clusters that are basically all transient services (like map reduce jobs) or all long running services, but not both. I can see how that might lead to pathological scheduling decisions.