Designing Our Serverless Engine: From Kubernetes to Nomad, Firecracker, and Kuma (koyeb.com)
148 points by eric_khun on July 7, 2021 | 45 comments



I'm hoping that this is the start of a trend away from the overly complex, overly engineered, big ball of mud that is Kubernetes towards simpler deployment/hosting/container runtimes that are easier for small and large teams alike to manage.

Don't get me started on how "devops" has become a meme. I want to deploy services and features. I don't want to spend all my time dealing with Kubernetes, containers, and all manner of complexity.

I think cloud providers generally have much better models (e.g. Azure App Service). Set up a pipeline, deploy, done. Containers are optional.

What I'm saying is, I don't think it's fair on anyone that essentially the only choices are 1) run servers and VMs yourself and manage deploying to them, 2) Kubernetes, or 3) cloud providers.

There needs to be an option between them.


There have been a ton of in-between options; they just rarely work well for anything outside of a basic use case.

Kubernetes is complex because the reality of what you’re dealing with is complex.

Kubernetes is way better than how we used to do things. It's not a perfect system, but there are a lot of good reasons why it's built the way it is.


> Kubernetes is complex because the reality of what you’re dealing with is complex.

Passing off complexity instead of reducing it isn't a net positive, though. The purpose of frameworks/libraries/etc. is to (considerably) reduce complexity. With Kubernetes, you've just replaced one group of specialist engineers with a different group of specialist engineers.


Kubernetes is not the answer to every infrastructure question.


Nobody ever said it was.

But if you have lots of containerised applications that need to be distributed across multiple nodes, then Kubernetes is definitely the answer.


There is nothing complicated about deploying a single deployment on Kubernetes; people who say otherwise either don't use Kubernetes or are just lying.

https://kubernetes.io/docs/concepts/workloads/controllers/de...
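
For what it's worth, here's a minimal sketch of such a deployment, using the official Python client rather than raw YAML (the `nginx-demo` name, image tag, and replica count are just placeholders):

    # Minimal Deployment sketch via the official `kubernetes` Python client.
    # Assumes a working kubeconfig; name, image, and replica count are placeholders.
    from kubernetes import client, config

    config.load_kube_config()

    labels = {"app": "nginx-demo"}
    deployment = client.V1Deployment(
        api_version="apps/v1",
        kind="Deployment",
        metadata=client.V1ObjectMeta(name="nginx-demo"),
        spec=client.V1DeploymentSpec(
            replicas=2,
            selector=client.V1LabelSelector(match_labels=labels),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels=labels),
                spec=client.V1PodSpec(
                    containers=[
                        client.V1Container(
                            name="nginx",
                            image="nginx:1.21",
                            ports=[client.V1ContainerPort(container_port=80)],
                        )
                    ]
                ),
            ),
        ),
    )

    client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)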

What's complicated is operating a Kubernetes cluster, but nothing forces you to do that; the major cloud providers all do it pretty well.

As for the OP's post, a lot of things don't make sense. I think the only valid reason is the one about isolation, which k8s is not very good at for now (multi-tenancy). The rest is interesting... let them build their own solution to maintain and operate and see how it goes for them in 2 years.

Things that I found strange:

- Complexity: "understanding how Kubernetes works and why it was implemented this way is hard." So instead they're building their own tools? If you have the knowledge to build a similar platform, extending and understanding k8s is easy.

- Global and multi-zone deployments: "Implementing multi-zone with Kubernetes requires deploying a full cluster per zone, with a dedicated control plane for each data center." Good luck with a single control plane controlling all of your regions; one little problem will bring down all your servers. You should always have isolated regions. Nothing prevents you from building some sort of orchestration around multi-region deployments (a rough sketch follows after this list).

- Scalability: "Kubernetes is known to have limits in terms of the number of nodes in a cluster." Yes, there are limits, but looking at your startup, I bet that 5,000 nodes in a single cluster is more than what you need in a single region. https://kubernetes.io/docs/setup/best-practices/cluster-larg...
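
On the multi-zone point above: a rough sketch of what that per-region orchestration could look like, assuming one kubeconfig context per regional cluster (the context names and 409 handling are made up for illustration; the Deployment object could be the one from the earlier sketch):

    # Fan the same Deployment out to one isolated cluster per region.
    # Context names are placeholders for per-region kubeconfig entries.
    from kubernetes import client, config
    from kubernetes.client.rest import ApiException

    REGION_CONTEXTS = ["prod-eu-west", "prod-us-east", "prod-ap-south"]

    def apply_everywhere(deployment: client.V1Deployment, namespace: str = "default") -> None:
        for ctx in REGION_CONTEXTS:
            # Each region is its own cluster with its own control plane.
            apps = client.AppsV1Api(config.new_client_from_config(context=ctx))
            try:
                apps.create_namespaced_deployment(namespace=namespace, body=deployment)
            except ApiException as e:
                if e.status == 409:  # already exists in this region: update it instead
                    apps.replace_namespaced_deployment(
                        name=deployment.metadata.name, namespace=namespace, body=deployment
                    )
                else:
                    raise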


> There is nothing complicated about deploying a single deployment on Kubernetes

Maybe, but we're not there yet. People are too busy deploying Kube itself.


Not sure about the other clouds, but GCP is just atrocious for this with both of their offerings.

App Engine seems to be no longer developed and is considered “old” technology. It works, but it's hard to do more enterprisey stuff, like virtual private cloud (VPC), with it. For example, you can configure it to be able to access a VPC, but you cannot put it “inside” the VPC so that other services can interact with it.

Cloud Run seems to be the new hot thing, and it works too; it seems to be made using kubernetes itself, so it promises future development and integration. However, there is a tiny flaw in the system, buried in fine print on an obscure doc page: Cloud Run severely throttles CPU on any container that is not currently processing an HTTP request. So you cannot build anything that is long-running - Kafka consumers, batch processes, etc. What you are supposed to do is dive head first into the whole of GCP's ecosystem and embrace their queues, message brokers, and event sourcing systems (as they're all HTTP request based); woe to those who have to interact with tech that's outside of it.

/rant


I'm currently testing our (GCP) solution to the CPU throttling you've highlighted. I've been using Vault[1] as my test case, and so far so good. Be on the lookout for the early sign-up if you're interested.

[1] https://github.com/kelseyhightower/serverless-vault-with-clo...


> seems to be made using kubernetes itself

"Cloud Run for Anthos" is, but the standard Cloud Run product is not. Throttling would only apply to the latter. I work on them.

(Not that this level of detail is germane to the conversation).


Is Anthos an abstraction on top of Kubernetes then? It's kind of hard to understand with all the marketing around it. E.g. it runs k8s, alongside other Google tech, on your hardware?

If that is the case, maybe it's going in the wrong direction from the general sentiment in this thread - trying to solve the problem of too many abstractions by adding another abstraction. I think I'd prefer tech that removes turtles from the stack rather than adding fresh ones.

Wonder how long it will take to have “Kipos - a tech to manage your complex Anthos deployments” :-D


> There needs to be an option between them.

NixOS with nixops is one such option (on bare metal or VMs).

You need to climb the Nix learning curve a bit, but you get an overall vastly less complex system than Kubernetes, with its millions of lines of code and various levels of OS and networking encapsulation. It is possible for a single human to read and understand all the code involved in a nixops deployment in a few months of side-time reading. It is possible to debug operational or configuration issues easily, with standard tools and a standard understanding of Linux systems.

You manage your entire infrastructure, including all software and how to build it, declaratively. If you use cloud VMs, nixops will also create them for you.

It is not as powerful as Kubernetes; for example, you currently don't get fully automated rolling deployments that roll back if your health checks start failing; you need to do this yourself, e.g. by running `--include machine1` through `--include machine3` to do a rolling deployment. However, this is often fine, as most small to medium teams do human-triggered deployments (and check that they work as expected) anyway.
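
A rough sketch of what scripting that manual rolling deployment can look like, assuming a deployment named `production`, three machines, and a made-up /health endpoint (this is not NixOps functionality, just a wrapper around the CLI):

    # Hypothetical rolling-deploy wrapper around the nixops CLI.
    # Deployment name, machine names, and the health endpoint are placeholders.
    import subprocess
    import time
    import urllib.request

    DEPLOYMENT = "production"
    MACHINES = ["machine1", "machine2", "machine3"]

    def healthy(host: str) -> bool:
        """Naive check against an assumed /health endpoint on each machine."""
        try:
            with urllib.request.urlopen(f"http://{host}/health", timeout=5) as resp:
                return resp.status == 200
        except Exception:
            return False

    for machine in MACHINES:
        # One machine at a time, i.e. `--include machine1` ... `--include machine3`.
        subprocess.run(
            ["nixops", "deploy", "-d", DEPLOYMENT, "--include", machine],
            check=True,
        )
        time.sleep(10)  # let services settle before checking
        if not healthy(machine):
            raise SystemExit(f"{machine} failed its health check; stopping the rollout")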

I found that this works very well for small to medium teams that want to spend as little as possible on ops overhead, while still understanding their stack entirely.


I heard about NixOS years ago and thought that an “immutable” Linux distro is an intriguing idea, and a game changer if it actually works in practice. But I never properly experimented with it.

Are there any good learning resources for using NixOS in an “infrastructure as code” scenario, or should I just RTFM?


Yes, you should read the manuals and try it out, given how easy and cheap it is to play around with smaller-instance cloud servers.

I also started writing a tutorial series for practitioners you can check out:

https://github.com/nh2/nixops-tutorial

I've only written 2 tutorials in there so far, covering basic things. In the future, as time permits, I'd like to write about how my org implemented various tasks with NixOps, such as rolling deployments, highly available database setups with automatic failover, distributed file systems, how you can patch any software in your stack, how to write reliable systemd services with correct dependencies, and so on.


Kubernetes is complex, OK, but I disagree that cloud providers offer better alternatives, especially Azure. Azure is bug-ridden and support depends on what price tier you are on. The thing with cloud providers is that when their services go down or change, you are on the hook for supporting your app, not them. At least Kubernetes (or Nomad) lets you manage your own dumpster fire, around which you can build safer deployment workflows.


Ah yes, still waiting on that infinitely scalable, simple, and cheap option too.

Edit: forgot reliable


> I'm hoping that this is the start of a trend away from the overly complex, overly engineered, big ball of mud that is Kubernetes towards simpler deployment/hosting/container runtimes that are easier for small and large teams alike to manage.

But is it easier to manage? Is it simpler?

So they went from Kubernetes to:

- Control Plane running in

- Container running on

- Firecracker with Kuma running on

- Nomad with CoreOS running on

- Some servers

No idea what half of those words are, how they are all configured, how they interact with each other, or how they fail in spectacular ways.


To be fair, many of these words would be needed for running a bare metal Kube cluster.

- Some servers: you need servers to run a bare metal cluster.

- CoreOS: This is just a minimal OS that's designed to be PXE booted. Think of it as a docker container for a bare metal machine.

- Kuma: a multi-cluster mesh network. This makes it easier to write services that talk to `foobar:8080` and have it automatically fail over to other pods in other regions. I wouldn't use Kuma myself (there are a few other options), but this is very helpful. Good service meshes will allow you to do canary deployments. If you are multi-cluster, then this sort of reliability is probably important to you.

But, an important part of this is that you don't need these if you're ok with running on another cloud's infrastructure.


> But, an important part of this is that you don't need these if you're ok with running on another cloud's infrastructure.

Same with kubernetes ;)


Kube gives you portability. You can move away if you need to. Very difficult to do that with AWS Fargate.


> Kube gives you portability. You can move away if you need to.

It doesn't, and you can't, because you don't buy into some nebulous cloud; you buy into services and capabilities. This tweet sums it up best: https://twitter.com/coryodaniel/status/1412645383876927492


In my experience I haven't bought into anything like that. The most I've done is pay for a hosted DB, which is a complexity I could offload to many different companies.

I haven't used any services that have lock in, I've been able to ship services efficiently and meet deadlines, and have an abstracted compute layer.

I've also multiple times spun up demo environments with docker-compose on a bare metal machine to demo to clients without networking.

Kube abstracts the compute and load balancing capabilities to a bare level. Everything else can be implemented within kube for at-cost prices.


> Kube abstracts the compute and load balancing capabilities to a bare level.

> Everything else can be implemented within kube for at-cost prices.

Until you need something more than a bare service with nothing in it: anything from messaging to databases to storage.

Companies don't usually run a bare-bones service in a kubernetes cluster. It needs to connect somewhere and do stuff. For example, we use BigQuery on GCP. Any idea how it "can be implemented within kube for at-cost prices"?


Don't use services that lock you in. Instead use a layer that abstracts that. In the past I think I have heard of people using Presto for similar operations, and that can be self-hosted. I would never build a company on top of tech I can only get from one place, regardless of the layer in the stack.

If it's abstracted behind an agreed upon industry standard protocol (MySQL, pg, mongo connections or S3/NFS storage, etc)? Sure. If it's special sauce? No way.

My $Job-1 is migrating the stack I set up from Azure to GCP, and it's been fairly painless from what I heard (only taking a few weeks to make an account, test, then move everything over).


> Don't use services that lock you in.

Why, if they bring value?

> Instead use a layer that abstracts that.

How can I "abstract" BigQuery?

> If it's abstracted behind an agreed upon industry standard protocol (MySQL, pg, mongo connections or S3/NFS storage, etc)?

This still doesn't answer the question of "can be implemented within kube for at-cost prices."

BigQuery is amazing at processing enormous amounts of data at a fraction of the time and cost it would take, say, MySQL to process the same data. How can I "abstract it away" and "implement it at cost with kube"?

Even if it's purely standard stuff like Pub/Sub or Apache Beam, there are value-added options that are hard to get from other providers (or implement yourself): anything from monitoring to management to scaling to...

I'm not too crazy about vendor lock-in myself, but "abstract everything away and you can easily move over if needed" is largely a myth.


May I ask why you wouldn't use Kuma?


Kuma is probably fine, but I haven't used it, so I'd probably use other tools I had used in the past, for example Cilium and stuff by Weave.


Wrt Azure App Service, that's basically just Kubernetes (well, AKS to be precise) under the hood anyway, so it's really not an argument against the perceived complexity of Kubernetes. Which illustrates nicely that offloading the complexity of managing K8s clusters to cloud providers is a very valid way of doing things, imo. Having said that, I really don't like AAS, but that's a different story :)


> I'm hoping that this is the start of a trend away from the overly complex, overly engineered, big ball of mud that is Kubernetes

The main problem with k8s in the real world is that too often it's being deployed by people who want to fuck with k8s and have that on their CV, rather than by people who want to provide something that makes it easy to deploy workloads.

A huge amount of the k8s world and chatter is dominated by people who are doing the equivalent of, having been asked for a Linux server, providing Linux from Scratch, rather than a server running a well-known distro that people can just work on.


Nice post, thanks for sharing it.

From personal experience I can only agree with their choice to pick Nomad. At the place where I work, we have been running Nomad as our main container orchestrator for around 2.5 years now. It's rock solid, very easy to set up and maintain, and overall not too complex to understand in depth.


> Global and multi-zone deployments: User workloads on Koyeb need to be able to run in multiple zones. Kubernetes doesn't support multi-zone out of the box. Implementing multi-zone with Kubernetes requires deploying a full cluster per zone, with a dedicated control plane for each data center.

I’m having trouble understanding this. K8s worker nodes should be deployable across zones, regions, etc. You can label the nodes with the zone ID and use taints/tolerations to ensure workloads are deployed in specific zones/regions (if that's what you want).
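
A minimal sketch of that labelling/taint approach with the Python client; the node name, taint key, and zone value are assumptions, and on managed clouds the `topology.kubernetes.io/zone` label is usually set for you:

    # Sketch: label/taint a node for a zone, then opt a workload into it.
    # Node name, zone value, and the taint key are illustrative only.
    from kubernetes import client, config

    config.load_kube_config()
    core = client.CoreV1Api()

    # Label and taint the node so only zone-pinned workloads land on it.
    core.patch_node(
        "worker-eu-west-1a-01",
        {
            "metadata": {"labels": {"topology.kubernetes.io/zone": "eu-west-1a"}},
            "spec": {
                "taints": [
                    {"key": "zone-pinned", "value": "eu-west-1a", "effect": "NoSchedule"}
                ]
            },
        },
    )

    # Workloads opt in with a matching nodeSelector + toleration.
    pod_spec = client.V1PodSpec(
        node_selector={"topology.kubernetes.io/zone": "eu-west-1a"},
        tolerations=[
            client.V1Toleration(
                key="zone-pinned", operator="Equal", value="eu-west-1a", effect="NoSchedule"
            )
        ],
        containers=[client.V1Container(name="app", image="nginx:1.21")],
    )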


You certainly can deploy workers across regions, however the latency to the control plane makes it quite unpleasant.


IIRC the only real issue I ran into with workers in multiple datacenters was etcd latency.

If you have a fast enough network, things will mostly be fine.


Yeah, I am not sure if they mean "region" (as in AWS region) instead of zone. If that is the case, then I agree with their assessment.


Yeah, pretty sure it's that, because k8s 100% supports multi-AZ on every major cloud provider out of the box. It also supports federation via kubefed, but that needs a separate control plane in each region.

In theory nothing should be stopping you from deploying your etcd/master nodes in different regions, but you'd probably need to tweak the cloud resource provider to handle that, and if one of the regions is partitioned away from quorum, those masters become unavailable.


> Kubelet uses between 10% and 25% of RAM ... We're more around 100MB with our new architecture

These figures are not comparable.


Highly questionable figures. Our kubelets have a 2% memory limit which I've never seen hit, as it's hugely over-provisioned.


Seems similar to fly.io, but fly.io seems better: it uses WireGuard instead of a service mesh, supports custom domains, supports any TCP or UDP service, provides volumes, etc. But it's great to see more options in that market!


Great post and a great tech stack.

As far as I know, Fly.io also went down the same path of using the Nomad stack and tools.

We did something similar at appfleet.com and also decided against Kubernetes. We opted to write a lightweight manager of Firecracker VMs.


I'm waiting for the day someone discovers a simpler alternative to k8s. It will be like the day we realized that HTTP verbs >>> CORBA/RMI for web services.


I'm curious - when a user deploys their new code to your product, does that kick off a new Nomad job, or is that managed internally, kicking off a koyeb-managed Firecracker with Kuma for service discovery?


(Off topic)

I maintain this small repo [1] and I have added today's thread to the #infrastructure section. It's there to learn "what/why people move from this to that."

It's not an _awesome_-like repo, but I've found it useful for learning about others' decisions. Feel free to keep the entries up to date (but if you know of a better place/resource, I'm happy to work with that too). Thanks a lot.

[1] https://github.com/icy/w2w


Thanks for sharing! That's currently an uncommon change but I must admit you explained it well.

Quick question for the team: did you consider or try Consul as a service mesh?


Hi!

Yann, co-founder at Koyeb, here.

Yes, we considered Consul as a service mesh, but we found that it was too strongly coupled with the network and task layer of Nomad. We were looking for something highly customizable which wouldn't get in the way.

Also, the multi-tenancy features are paid features, which might have been difficult to sustain economically for us.


If things like multi-tenancy are required and the project (Consul) is open-source would adding that functionality yourselves not be in-scope given the amount of work you've already put into your setup?

I ask as a theoretical what-if.



