Mesos Borgs Google’s Kubernetes Right Back (nextplatform.com)
74 points by rbanffy on Sept 8, 2017 | 30 comments



I'm not one to write negative reviews of open source tech. Typically, everyone has skin in the game for one reason or the other, and diversity in tech is positively great.

But Mesosphere DC/OS is purely advertising and marketing driven.

Their "Docker support" simply means they use normal Mesos worker processes to shell out to the Docker CLI across a cluster. They tightly wrap Hashicorp Vault and label it their own solution. Marathon has terrible support for security and application deployments geared toward enterprise teams. Hell, deployments can write over each other's network volumes and setting IAM roles can be sniffed straight out of unencrypted HTTPS headers. Don't get me started with Minuteman, Mesos DNS, meshing IPTables rules, and the hundreds of hacks around missing IP-per-container/network virtualization that even Solaris has had for the past 20 years (Crossbow anyone?).

The only thing people want right now is AWS in private/hybrid cloud. All the big movers are getting off AWS. If not, they're either too small to matter or are positioning their "cloud partnership" as a buy-out to Ma'Amazon.


Almost all of this is out of date. Did you try DC/OS a long time ago, maybe?

> Their "Docker support" simply means they use normal Mesos worker processes to shell out to the Docker CLI across a cluster.

This hasn't been true for quite some time now: http://mesos.apache.org/documentation/latest/container-image...

> hundreds of hacks around missing IP-per-container/network virtualization

DC/OS does have network virtualization and IP-per-container: https://dcos.io/docs/1.9/networking/virtual-networks/ip-per-...

Also, what's an unencrypted HTTPS header?


Oh man there are decrypted HTTPS headers? I can stop using Wireshark then!

https://jimshaver.net/2015/02/11/decrypting-tls-browser-traf...


Holy moly, that is a helpful link. I had been doing it the old-fashioned way that used the server key. It's so great that browsers added support for logging session keys.
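For anyone who wants the same session-key logging outside a browser, here is a minimal sketch using Python's standard `ssl` module. It assumes Python 3.8+ built against OpenSSL 1.1.1 or newer; the helper name `make_logging_context` and the temp-file path are made up for illustration.

```python
import os
import ssl
import tempfile


def make_logging_context(keylog_path):
    """Return a client TLS context that appends session secrets to keylog_path."""
    ctx = ssl.create_default_context()
    # keylog_filename writes the per-session secrets in the NSS key log
    # format, the same format browsers emit when SSLKEYLOGFILE is set.
    ctx.keylog_filename = keylog_path
    return ctx


# Hypothetical location for the key log; any writable path works.
keylog_path = os.path.join(tempfile.mkdtemp(), "tls-keys.log")
ctx = make_logging_context(keylog_path)
```

Any connection made with `ctx` should then be decryptable in Wireshark by pointing its TLS "(Pre)-Master-Secret log filename" preference at that file, with no server key needed.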


IP-per-container is a thing in DC/OS. I don't recommend using it. What's wrong with Minuteman? You can disable it if you want.


> The only thing people want right now is AWS in private/hybrid cloud

So, Azure then. Is there really a mass migration underway?


Maybe in the future, but right now there is no clear way to manage a collection of random on-premises Windows VMs, which is what many enterprises have, with the full Azure toolset.


Spark on Kubernetes (which is still not fully production ready) will be the clearest indicator of k8s taking over the space occupied by Mesos.

Mesos/YARN is used by Spark to schedule batch jobs in a distributed manner on a Spark cluster. So they need very deep knowledge of resource availability and of the needs of the jobs being scheduled. We are not talking about static resource allocation like a container, but about scheduling a particular piece of processing in real time onto one of the nodes managed by Mesos/YARN.

Kubernetes does not come with this level of awareness today. Mesos was always built with this in mind. So for Mesos, the scheduling of containers is a subset of its functionality.

However kubernetes is getting there - https://github.com/kubernetes/kubernetes/issues/34377 https://github.com/apache-spark-on-k8s/spark/issues/4 https://issues.apache.org/jira/plugins/servlet/mobile#issue/...

Today if you want to deploy a fleet of containers and schedule a bunch of processes across them in a resource aware manner...you have to use both kubernetes and mesos.


In general, nested orchestration (k8s on Mesos, OpenStack on k8s, Docker on VMware, etc.) is an infrastructure smell due to the significant overlap and impedance mismatch between the systems.


Dunno that this is really a huge deal -- they're different tools with different purposes.

I've always viewed Mesos as a bare-metal orchestration platform. From there, you can deploy k8s, VMware, Linux KVM, whatever to the nodes under management by Mesos.

k8s is more of a container management platform that can be used for applications to tie arbitrary software architectures into a set of unified operations tooling.

So you may host your cluster on k8s, but you would use Mesos to auto-scale the active servers in your clusters for say, power management. But Mesos is something you probably won't have a need for until you're well past managing one k8s cluster.


How do you see Mesos as more bare-metal than k8s?

Also, both Mesos and k8s deploy containers. I'm always confused when people call k8s a container orchestration tool, but not Mesos. What makes a container orchestration tool? Does everyone agree on this definition or are we talking past each other? At the very least, Mesos was considered a container orchestration tool a couple years ago. Something seems to have changed in the popular discourse such that now k8s and Marathon are considered "container orchestration", yet Mesos isn't.


You seem to be confusing Mesos and Marathon and using the two names interchangeably. They are not equivalent. Marathon runs on Mesos, but you can run other things on Mesos too, many of which are not even containers.

Marathon is something on the level of kubernetes, although kubernetes is far more powerful at this point.

But you can also use mesos to run other things like deploy VMs (this is work in progress though), or actually run applications without using containers.

Kubernetes on Mesos means that Kubernetes decides which agent each container runs on, while Mesos decides which physical machines in your data center Kubernetes itself runs on.


Mesos is a cluster manager. It coordinates running tasks on a cluster. It can run Docker container tasks but can also run local programs. Most people are interested in running containers because that makes it easier to deploy and isolate programs. It can be used for container orchestration but isn't only container orchestration.


Ah, so it seems that the ambiguity lies in "container".

All Mesos tasks run in containers, i.e. cgroups. Not all containers are filesystem isolated, and not all are Docker containers. Docker is so popular that people seem to think otherwise.

Jessie Frazelle explains this well in her post: https://blog.jessfraz.com/post/containers-zones-jails-vms/
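To make "containers, i.e. cgroups" a bit more concrete, here is a small sketch that parses the content of `/proc/<pid>/cgroup` on Linux to show which cgroups a process belongs to. The function name `parse_cgroup` and the sample text (including the `/mesos/abc123` path) are invented for illustration and are not output from a real Mesos agent.

```python
def parse_cgroup(text):
    """Parse /proc/<pid>/cgroup content into {controller-list: cgroup-path}."""
    membership = {}
    for line in text.strip().splitlines():
        # Each line is "hierarchy-ID:controller-list:cgroup-path".
        # On cgroup v2 the controller list is empty ("0::/path"),
        # so that entry is stored under the made-up key "unified".
        _hierarchy_id, controllers, path = line.split(":", 2)
        membership[controllers or "unified"] = path
    return membership


# Made-up sample resembling what a task inside a container might see.
sample = (
    "4:memory:/mesos/abc123\n"
    "3:cpu,cpuacct:/mesos/abc123\n"
    "0::/system.slice\n"
)
membership = parse_cgroup(sample)
```

On a Linux box you can run `parse_cgroup(open("/proc/self/cgroup").read())` to see the same thing for your own process; no Docker or filesystem isolation is involved in that membership.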


They are the same thing.


It is not. And I am hoping that someone will correct me if I am wrong about some of this (because I don't really want to learn Mesos for myself), but here is how it is not:

In Kubernetes, you have nodes. Your containers run directly on nodes. This is very much the same as running your kernel inside of a VM. It's there, on the computer, running. One kernel, one VM. One node, many containers, but the containers are still each running on exactly one node.

There are however five or more abstractions between your containers and the nodes that they run on (ingress routes traffic into service, which points at deployment, and under deployment is a ReplicaSet, that spawns pods, which may run containers in them). Through these you may arrange to provide Highly Available service guarantees.

But if that node goes down, your container is down until the pod can be restarted on a different node, or possibly until that node comes back. So you ensure that your pods can be scaled out horizontally through some potentially painful process if your product is not new and you've made bad decisions, or if it's from a vendor and the vendor has made some bad decisions that are perhaps not easy to undo. (If you chose the wrong vendor. It happens. Time to update your resume perhaps.)
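As a rough illustration of that "restart the pod on a different node" behaviour, here is a toy reconcile loop. It is only a sketch of the idea: `Cluster`, `reconcile`, and the least-loaded placement rule are assumptions made up for this example, and real Kubernetes splits this work between the ReplicaSet controller, the node lifecycle logic, and the scheduler.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    nodes: set                                  # names of healthy nodes
    pods: dict = field(default_factory=dict)    # pod name -> node name


def reconcile(cluster, desired_replicas, prefix="web"):
    """Drive the observed pod count back to desired_replicas."""
    # Forget pods that were running on nodes that are no longer healthy.
    cluster.pods = {p: n for p, n in cluster.pods.items() if n in cluster.nodes}

    index = 0
    while len(cluster.pods) < desired_replicas and cluster.nodes:
        name = f"{prefix}-{index}"
        index += 1
        if name in cluster.pods:
            continue  # this replica survived, keep it where it is
        # Place the replacement on the least-loaded healthy node.
        load = {n: 0 for n in cluster.nodes}
        for node in cluster.pods.values():
            load[node] += 1
        cluster.pods[name] = min(sorted(load), key=load.get)
    return cluster.pods


cluster = Cluster(nodes={"a", "b", "c"})
reconcile(cluster, desired_replicas=3)   # one pod lands on each node
cluster.nodes.discard("b")               # node "b" dies
reconcile(cluster, desired_replicas=3)   # its pod is recreated elsewhere
```

Between the node dying and the next reconcile pass, the service is running with two replicas, which is why the pods need to tolerate being scaled out horizontally.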

In Mesos, you have Agents and physical machines, and the agents take care of "work units" that are roughly a share of some task in the same sense that on VMware DRS[1], your RAM and CPU cycles are drawn from a pool, and they may come from any of the physical nodes in your cluster at any time.

I think this was called "non-locality" in vSAN. Do not think about your data as being on a disk, because enough hot copies of it may be scattered across a fleet of disks that at any given time, you might even have a local copy but an agent decides that you're going to read from and write to somewhere else across a 10 Gbit backplane instead.

That's where I'll draw the parallel, and that's where I'll stop trying to draw parallels because I'm not trying to tell you about VMware, and hope that someone comes along to correct me if I'm actually telling lies about Mesos. Because I don't really know about Mesos and could be talking right out of my ass. But from what I've read, I'm thinking not too far off.

(I had access to full ESXi clustering and vSAN license for a year, and we turned on all those features before I left that company. This is why I think that I know something about VMware, at least.)

I'd assume that you can get the same kind of guarantees from Mesos that I could get from VMware's DRS and HA solutions. In other words, if a node goes down and I have enabled all the, err, checkboxes...

...then I won't need to wait for that node to be rescheduled, I won't even notice an interruption in the continuous operation of the container(s) unless the loss of a node represents enough missing CPU cycles that it puts my cluster "over the edge." I don't need to have pod replicas in the sense that Kubernetes needed me to have them in order to provide that guarantee. Because they are not down. Their shares of work are just sent to another worker from the pool seamlessly.

(They might not even have been scheduled onto one physical machine during normal operation, the agent decides where to send the shares and may change its mind about that any time based on new information about load. You're never supposed to notice that unless you're paying very close attention to it.)

I'm extrapolating how I think Mesos must work based on what I've heard about it, and on what I know you can get VMware to do if you're willing to spend millions of dollars scaling it out, or if your hardware is already big enough and you know a salesman who is willing to sell you a zero-support, no-maintenance-included contract at a 97% discount so you can use the HA and FT features when you need them. (I used ESXi for 4 years without any of these features and it was fine, but they are nice features and you need to arrange things a bit differently when you don't have them. vSAN is not RAID5.)

[1]: "What is DRS"

VMware DRS (Distributed Resource Scheduler) is a utility that balances computing workloads with available resources in a virtualized environment. The utility is part of a virtualization suite called VMware Infrastructure 3.


Mesos actually handles the lower level part (determining which machines are up and which resources they have (CPU, memory, disks, network ports, GPUs), running tasks on machines, isolating resources from each running task on each machine, either with its own containerizer or using Docker, etc), but the tasks themselves and their scheduling details (how many tasks to run, should it try to run each task on a different machine or not, etc) are determined by the framework you are using. There are some popular "meta-frameworks" (like Marathon and Aurora) that abstract away the Mesos interface and let you just say "run five instances of this Docker container, one per physical machine". They also might handle higher level details, like service registration/discovery, rolling updates and more.
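Here is a toy sketch of that split, loosely modeled on the resource-offer cycle: a stand-in "master" offers each agent's free resources, and the framework decides what to launch, enforcing "one instance per physical machine" itself. All names (`Offer`, `OnePerHostFramework`, `offer_round`) are hypothetical and this is not the real Mesos scheduler API.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    agent: str
    cpus: float
    mem_mb: int


class OnePerHostFramework:
    """Toy framework scheduler: run N instances, at most one per agent."""

    def __init__(self, instances, cpus, mem_mb):
        self.instances = instances
        self.cpus = cpus
        self.mem_mb = mem_mb
        self.placed = {}  # agent name -> task id

    def resource_offers(self, offers):
        """Return (task_id, agent) launches; unused offers count as declined."""
        launches = []
        for offer in offers:
            if len(self.placed) >= self.instances:
                break
            fits = offer.cpus >= self.cpus and offer.mem_mb >= self.mem_mb
            if fits and offer.agent not in self.placed:
                task_id = f"task-{len(self.placed)}"
                self.placed[offer.agent] = task_id
                launches.append((task_id, offer.agent))
        return launches


def offer_round(agents, framework):
    """Toy master: offer every agent's free resources to the framework."""
    offers = [Offer(name, cpus, mem) for name, (cpus, mem) in sorted(agents.items())]
    return framework.resource_offers(offers)


# Agent "b" is too small for the task, so it gets skipped.
agents = {"a": (4.0, 8192), "b": (0.5, 8192), "c": (4.0, 8192)}
framework = OnePerHostFramework(instances=2, cpus=1.0, mem_mb=1024)
launches = offer_round(agents, framework)
```

The point of the design is that the lower layer only knows about machines and resources, while placement policy (count, spread, constraints) lives entirely in the framework.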

>Because they are not down. Their shares are just sent to another worker from the pool seamlessly.

Eh, depends on what you mean by "seamlessly": basically if a node goes down, everything it was running will be rescheduled and run again on other nodes, but this might take some minutes, and ensuring this failure is not perceived from the outside depends entirely on you. Disk management is also entirely up to you, so you have to roll your own SAN if you want to have something like that (at my company Mesos disk space is completely ephemeral, and truly persistent state is always stored on S3 or external databases).


OK, thank you

That helps. I won't try to compare it to HA ESXi environment again :D

Can you tell in a sentence, or paragraph maybe, why people would get excited about running Kubernetes on Mesos? Is it simply so that they no longer have to go on running their Kubernetes alongside of Mesos, as in this diagram:

https://github.com/mantl/mantl/raw/master/docs/_static/mantl...


It is a big deal, because I (my last employer) spent ten(s?) of thousands of dollars on VMware so that I would not have to put infrastructure nodes on bare metal. They created vSAN so it would be so. (Modern infrastructure I'm sure has better solutions than vSAN, but it worked great for us, and I'm out of there now. We never honestly needed that until we had large Windows servers to take care of. I personally have a distaste and distrust for Windows servers, but when the new owner came he was the boss, and that was where they were invested. (Yeah, let's go with invested...))

Some of those infrastructure nodes were Windows machines, and most of them were Linux VMs. I have not used Mesos yet (I am using Kubernetes in dev only, but my org has not drunk any of the kool-aid yet), but it kind of sounds like I want Mesos (and Marathon?) for the same reasons that I wanted VMware when I had only one Windows server, and literally couldn't live without it when I suddenly had dozens more Windows servers.

(I heard that OpenShift will soon run Windows containers. I can't imagine it will be long after that before Kubernetes at-large also does.)

I'm not sure I need Mesos, Kubernetes does an awful lot, but it still seems like Kubernetes Admin user could royally fuck things up for me on a multi-tenant system that I admin (I learned the ways from OpenShift), if it's just Kubernetes. But on Mesos, maybe I can get some guarantees that I couldn't without it...

I'm not sure how much anyone has a "need" for Mesos. If I understand exactly what makes it different from Kubernetes, it's one of those things that you just kind of layer on top so that when your suspenders fall right off for some reason, your belt actually can still keep your pants up.

I honestly don't know any more about Mesos than what I've been told, but as a belt-and-suspenders kind of guy I think it could actually be for me. What say you to that?


One of the problems I've heard with Mesos is that it requires your application code to use its constructs, though this may be untrue or out of date now. For a K8s pod, you create a container and it doesn't need to be rewritten unless it's stateful and is not really a 12-factor application. Mesos' originating use cases come from stateful big-data-ish systems like Hadoop and Zookeeper clusters, while K8s aimed more at containerized web application infrastructures, and both are now expanding toward the other's use cases.

From a risk perspective, unless something terrible happens with K8s adoption soon, in 4 years I expect every other shop to be using K8s in production, and many, many more people will know how to run it day to day than do now. With that said, adoption is quite slow in some places and nothing is assured.


[flagged]


Would you please not post unsubstantive comments to HN?

https://news.ycombinator.com/newsguidelines.html


Does kubernetes actually work as a Mesos framework? I thought that was the point of marathon.


I don't think k8s operates as a Mesos framework, but the use cases for k8s are a lot closer to Marathon than Mesos. For most use cases k8s and Marathon satisfy the same needs. If the only framework you are running on Mesos is Marathon, then you may be better off with k8s.


"I don't think k8s operates as a Mesos framework"

Have you ever used Mesos? It is the only way k8s could run on top of Mesos. Also:

https://github.com/kubernetes-incubator/kube-mesos-framework

Also, you could maybe compare Marathon to Kubernetes Dashboard + the Kubernetes apiserver, but comparing the two directly is not remotely close.

With Mesos + Marathon, you still need a load balancer for both North <---> South AND East <---> West cluster traffic. This is built into Kubernetes via Service and Ingress objects and their associated controllers. There is also Federation, and so much more.

Mesos is just a resource scheduler, and a very good two level one at that, built to prove the theory of two level schedulers working well with interactive and batch workloads concurrently. Kubernetes is the entire package and includes a default scheduler, but allows you to plug in your own.

Disclaimer: I spent a full year working on Mesos and went to MesosCon twice. After speaking in person with Kelsey Hightower at Monitorama, he convinced me to switch to Kubernetes for purely technical reasons. The stuff that you have to DIY on Mesos is built sensibly and pluggably in Kubernetes. From a cluster operator standpoint, frankly, Kubernetes is leagues ahead.

Lots of Fortune 500s and enterprise customers find the lack of sensible RBAC and real, to-the-core auditing support in Mesos to be a serious problem.

https://twitter.com/nirajtolia/status/903437346443378688


Marathon was just Mesosphere's (not even Mesos') way of running Docker containers. It is very simple, and it gained more popularity than Mesos' native container solution, Aurora.

At this point Kubernetes has won in the container area, but that doesn't mean Mesos is no longer useful. Running containers was just one of the things it could do; its main purpose is to manage hardware.

As for your original question, I don't see any other way Mesos would run Kubernetes than as a framework. That is the whole point, otherwise there would be no point in using Mesos.

BTW there already was a Kubernetes framework, but it seemed to be abandoned since k8s 1.3 or 1.2. I'm glad it was resurrected.


>At this point Kubernetes has won in the container area

This seems like an overstatement. From where I’m standing, it seems like there’s plenty of room for many solutions to thrive in this space.

I use and like Kubernetes in local development at my org, and we’re probably going to be testing production workloads on Kubernetes soon. The learning curve is quite steep and almost entirely front-loaded. There will be a significant portion of companies for which a different solution is more desirable, based on that fact alone.


It does work as a Mesos framework now.


Mesos isn't a bare-metal platform but a cluster manager that runs on an OS (usually Linux). It coordinates clients scheduling tasks on the cluster. Tasks can be programs, containers, or KVM VMs.


"You will be assimilated. Resistance is futile."


r/titlegore



