• Lets companies brag about having N production services running at any given time
• Company saves money by not having to hire Linux sysadmins
• Company saves money by not having to pay for managed cloud products if they don't want to
• Declarative, version controlled, git-blameable deployments
• Treating cloud providers like cattle not pets
It's going to eat the world (already has?).
I was skeptical about Kubernetes but I now understand why it's popular. The alternatives are all based on kludgy shell/Python scripts or proprietary cloud products.
It's easy to get frustrated with it because it's ridiculously complex and introduces a whole glossary of jargon and a whole new mental model. This isn't Linux anymore. This is, for all intents and purposes, a new operating system. But the interface to this OS is a bunch of punchcards, er, YAML files that you send off to a black box and hope it works.
You're using a text editor but it's not programming. It's only YAML because it's not cool to use GUIs for system administration anymore (e.g. Windows Server, cPanel). It feels like configuring a build system or filling out taxes--absolute drudgery that hopefully gets automated one day.
The alternative to K8s isn't your personal collection of fragile shell scripts. The real alternative is not doing the whole microservices thing and just deploying a single statically linked, optimized C++ server that can serve 10k requests per second from a toaster--but we're not ready to have that discussion.
I am ready! NetBSD is running on the toaster. I think haproxy can do 10K req/s. tcpserver on the backends. I only write robust shell scripts, short and portable.
As a spectator, not a tech worker who uses these popular solutions, I would say there seems to be a great affinity in the tech industry for anything that is (relatively) complex. Either that, or the only solutions people today can come up with are complex ones. The more features and complexity something has, and the more it is constantly changing, the more a new solution gains "traction". If anyone reading has examples that counter this idea, please feel free to share them.
I think if a hobbyist were to "[deploy] a single statically linked, optimized [C++] server that can serve 10k requests per second from a toaster" it would be like a tree falling in the forest. For one, because it is too simple: it lacks the complexity that attracts the tech worker crowd. And second, because it is not being used by a well-known tech company and not being worked on by large numbers of people, it would not be newsworthy.
I can see your point for small hobby projects. But enterprise web development in C++ is no fun at all. For example: "Google says that about 70% of all serious security bugs in the Chrome codebase are related to memory management and safety." https://www.zdnet.com/article/chrome-70-of-all-security-bugs...
Developer time for fixing these bugs is in most cases more expensive than throwing more hardware at your software written in a garbage-collected language.
That's why C++ is in brackets. :) I think the reason a hobbyist might be able to pull off something extraordinary is because he is not bound to the same ambition as a tech company. He can focus on performance, the 10K req/s part, and ignore the "requirement" of serving something large and complex that is likely full of bugs. "Developer time" is "hobby time". Done for the pleasure of it, not the money. Some of the most impressive software from a performance standpoint has been written by more or less "one man teams". I am glad I am not a tech worker. Even for someone who enjoys programming, it does not sound fun at all. No wonder there is so much cynicism.
True, but the alternative to C++ with that reasoning is Rust or Go (depending on your liking), not Ruby. And with both of these you can step around a lot of deployment issues, because a single server can be sufficient for quite high loads. Avoid distributed systems as much as you can: https://thume.ca/2020/05/17/pipes-kill-productivity/
I find that the big problems that k8s solves are the usual change management issues in production systems.
What do we want to deploy, okay, stop monitoring/alerts, okay, flip the load balancer, install/copy/replace the image/binary, restart it, flip LB, do the other node(s), keep flipping monitoring/alerts, okay, do we need to do something else? Run DB schema change scripts? Oh fuck we forgot to do the backup before that!
Also now we haven't started that dependent service, and so we have to roll back, fast, okay, screw the alerts and the LB, just roll back all at once.
And sure, all this can be scripted, run from a laptop. But k8s is basically that.
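For the curious, one iteration of that hand-rolled dance looks roughly like this as a script; every host name and helper command below is a made-up placeholder for whatever your shop actually uses:

# sketch only: silence alerts, drain a node, swap the binary, put it back
silence-alerts --duration 30m                      # placeholder for your monitoring API
for host in app1 app2; do
    lb-ctl drain "$host"                           # placeholder: pull the node out of the LB
    scp build/myapp "$host":/usr/local/bin/myapp.new
    ssh "$host" 'mv /usr/local/bin/myapp.new /usr/local/bin/myapp && systemctl restart myapp'
    lb-ctl enable "$host"                          # placeholder: back into rotation
done
unsilence-alerts                                   # placeholder again
# ...and the backup, the schema migration and the dependent services are still on you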
...
And we get distributedness very fast: as soon as you have 2+ components that manage state you need to think about consistency. Even a simple cache is always problematic (as we all know from the cache invalidation joke).
Sure, going all in on microservices just because is a bad idea. Similarly k8s is not for everyone, and running DBs on k8s isn't either.
But the state of the art is getting there (e.g. the CrunchyData PostgreSQL operator for k8s).
> I find that the big problems that k8s solves are the usual change management issues in production systems.
This.
I was in a company where devops was just a fancy marketing term, developers would shit out a new release and then it was our problem (we the system engineers / operations people) to make it work on customers' installations.
I now work as a devops engineer in a company that does devops very well. I provide all the automation that developers need to run their services.
They built it, they run it.
I am of course available for consultation and support with that automation and kubernetes and very willing to help in general, but the people running the software are now the people most right for the job: those who built it.
As I said in my other comment: it's really about fixing the abstractions and establishing a common lingo between developers and operations.
If you want to be a successful indie company, avoid cloud and distributed like the plague.
If you want to advance up the big corp career ladder, use Kubernetes with as many tiny instances and micro-services as you can.
"Oversaw deployment of 200 services on 1000 virtual servers" sounds way better than "started 1 monolithic high-performance server". But the resulting SaaS product might very well be the same.
I recently talked to a non-SV software developer friend: "Yeah, I killed two days making a window show up on top of all others on Ubuntu with Qt (under KDE, so a supported business configuration...)." After telling him that this is a standard feature of most X window managers and asking whether he used Wayland (he didn't know), we got into functionality. "Yeah, you can enter a command and execute it on a remote machine." - "Isn't that what ssh is for, and something you can do with a 20-line shell script?" - "Yeah, maybe, but some management type developed it while he was still a grunt, so we have to keep it...". I bet said management type is still bragging about how he introduced convenient remote command execution by writing a Qt app and his own server (or maybe he's using telnet...).
Rust and Go could help you there, and their deployment story is as excellent as C/C++: just compile and ship.
However, their learning curve is pretty steep (particularly Rust) and most developers don’t enjoy having to worry about low-level issues, which makes recruitment and retention a problem. Whereas one can be reasonably proficient with Python/Ruby in a week, Java/C# is taught in school, and everyone has to know JS anyway (thanks for nothing, tweenager Eich), so it’s easy to pick up manpower for those.
Do those highly available developers have the same quality as the Rust developers you could get, though? (On average.) I'd wager there is a correlation between interest in deeper topics as required by Rust and good long-term design and quality. I've seen how marvelously efficient a small team of very talented people can be. To replace them with average, highly available developers you would need far more than double the people and some extra managers, just because communication scales so badly. The 10x developer is a myth, but I think the 5x development team is realistic in the right circumstances.
Disclaimer: Neither do I claim to be a very good developer, nor do I think you, the reader, are only average. Just the fact that you are reading Hacker News is a strong indicator of your interest in reflection and self-improvement, regardless of your favorite language.
The problem might be that those people interested in learning new things, solving complicated issues and building good long-term architectures are usually not too interested in sitting in a cubicle without a window.
So big companies might be forced to settle for less skilled developers, simply because the top tier is doing their own thing. I assume that's also why acqui-hiring is a thing.
Cubicles? You're too kind! Open floor plans are so much more 'creative' and 'communicative' and storage efficient. Also, they look so nice when you have visitors. (No joke: Our CEO told us he was so impressed with the office of this other company he visited – after being there once and not talking to the employees specifically about how they liked it! I'm shocked again and again how many decisions top managers like to make without listening to those it affects.)
Additionally, for larger enterprises, the operational overhead quickly grows and slows down new app development if one is tied to a traditional server approach.
Within our teams, we’ve found we can do with an (even) higher level of abstraction by running apps directly on PaaS setups. We found this sufficient for most of our use-cases in data products.
> because it is not being used by a well-known tech company and not being worked on by large numbers of people, it would not be newsworthy.
And at this point the hobbyist might wonder, "why isn't my toaster software being used by well-known companies? where are the pull requests to add compatibility for newer toaster models?"
> As a spectator, not a tech worker who uses these popular solutions, I would say there seems to be a great affinity in the tech industry for anything that is (relatively) complex.
I think you have it backwards. General/abstract solutions (like running arbitrary software with a high tolerance for failure) have broad appeal because they address a broad problem. Finding general solutions to broad problems yields complexity, but also great value.
> great affinity [...] for anything that is (relatively) complex.
That, and a mixture of the sunk cost fallacy and a lack of the ability to step back and review whether the chosen solution is really better/simpler or a hell of accidental complexity. If you've spent countless months to grok k8s and sell it to your customer/boss, it just has to be good, doesn't it?
Plus, there's a great desire to go for a utopian future cleaning up all that's wrong with current tech. This was the case with Java in the 2000s, and is the case with Rust (and to a lesser degree with WASM) today, and with k8s. Starting over is easier and more fun than fixing your shit.
And another factor are the deep pockets of cloud providers who bombard us with k8s stories, plus devs with an investment into k8s and/or Stockholm syndrome. Same story with webdevs longing for nuclear weapons a la React for relatively simple sites to make them attractive on the job market, until the bubble collapses.
But like with all generational phenomena, the next wave of devs will tear down daddy-o's shit and rediscover simple tools and deploys without gobs of yaml.
> Plus, there's a great desire to go for a utopian future cleaning up all that's wrong with current tech. This [...] is the case with Rust [...]. Starting over is easier and more fun than fixing your shit.
The domain I'm working in might be non-representative, but for me fixing my shit systematically means switching from C++ to Rust. The problems the borrow checker addresses come up all the time, either in the form of security bugs (because humans are not good enough for manual memory management without a lot of help) or in the form of bad performance (because of reference counting or superfluous copies to avoid manual memory management).
But otherwise I agree with you that if we never put in the effort to polish our current tools, we'll only ever get the next 80%-ready solution out of the hype train.
I am all for modern C++, and although I like Rust, C++ is more useful for my line of work.
However, it doesn't matter how modern we make C++; if you don't fully control your codebase, there is always going to exist that code snippet written in C style.
However I always have static analysers enabled on my builds, so it is almost as if they were part of the language. Regardless if we are talking about Java, C# or C++.
Just like most people that are serious about Rust have clippy always enabled, yet it does stuff that isn't part of the Rust language spec.
I think the popularity of Kubernetes comes from the fact that it can run all kinds of workloads on a single platform, not really from its performance. I've been in the Ops business for over a decade now; in my case every single customer has a completely different kind of application architecture, sometimes even with similar use cases (i.e. like an ecom app or chatbots etc). With Kubernetes the differences are largely irrelevant, encapsulated nicely within the container. This means in my universe I can run all of our customers on a single architecture and the differences are nicely abstracted.
Same feeling here but I came to understand that it's the enterprise and the design by committee that brings in the complexity.
K8S is developed by a multitude of very large companies, each with their own agenda/needs. All of them have to be addressed, thus the complexity. If you think about it, they probably manage to keep the complexity to relatively low levels. Maybe because it is pushed to the rest of the ecosystem (see service meshes, for example).
Being pushed by the behemoths also explains the popularity. Smaller companies and workers feel that this is a safe investment in terms of money and time familiarizing with the tech stack so they jump on. And the loop goes on.
The main business reason for all that, though, I think is the need of Google et al. to compete with AWS by creating a cloud platform that becomes a standard and belongs to no one really. In this sense it is a much better, more versatile and open-ended OpenStack attempt.
That's kind of why people use Google Go. No memory issues, statically linked, easy cross-compile, and it can handle 10k requests on a toaster.
And yes, there are less fancy companies, like the one where I work, where we don't use Kubernetes because it's kind of overkill if all of your production workload fits onto 2 beefy bare metal servers.
I can see a point in using Docker to unify development and production environments into one immutable image. But I have yet to see a normal-sized company that gets a benefit from spreading out to hundreds of micro instances on a cloud and then coordinating that mess with Kubernetes. Of course, it'll be great if you're a unicorn, but most people using it are planning for way more scaling than what they'll realistically need and are, thus, overselling themselves on cloud costs.
Yes, but with the cloud I'm also only one platform issue away from not having my virtual instances start correctly. Like the one on May 22nd this year.
Last year, my bare metal website had 99.995% uptime. Heroku only managed 99.98%.
Of course, I could further reduce risk by having a hot standby server. But I'm not sure the costs for that are warranted, given the extremely low risk of that happening.
The reason for complexity fetishisation (great term!) is to show everybody else how amazingly smart you are. It's the infrastructure version of "clever" code.
It's starting to get depressing at my age. After nearly 20 years in the industry I think I have a knack for keeping things "as simple as necessary and no more" and managing complexity. Unfortunately interviews are full of bullshit: pedantic nitpicking, or stuff I learned 20 years ago in university and have never needed to use since.
Someone else above in the thread believes there is "great value" in being able to accommodate complexity, what the commenter refers to as a "general" solution, as well as this being worth any cost in performance. God help you if these are the type of people who are doing the interviews. Despite pedantic technical questions, I doubt they are actually screening for crucial skills like reducing complexity. Rather, the expectation is that the tech worker will tolerate complexity, including the "solutions" to managing it that are themselves adding more complexity, e.g., abstraction.
Just focus on staying relevant with new languages and ignore the hype train that is k8s. It's something you can learn on the job and can easily talk your way around in an interview.
The older I get, the more I realise the less I want in my stacks.
This article listed as benefits: multiple major updates each year, frequent new features, and no sign of it slowing down. I just cringed and wondered who the fuck is asking for this headache?
I've been working a lot with WordPress lately and the stability of the framework is spoiling me rotten.
I was pretty skeptical too but then handed over a project which was a pretty typical mixed bag: Ansible, Terraform, Docker, Python and shell scripts, etc... Then I realized relying on Kubernetes for most projects has the huge benefit of bringing homogeneity to the provisioning/orchestration which improves things a lot both for me and the customer or company I work for.
Let's be honest here, in many cases it does not make a difference whether Kubernetes is huge, inefficient, complicated, bloated, etc... or not. It certainly is. But just the added benefit of pointing at a folder and stating : "this is how it is configured and how it runs" is huge.
I was also pretty skeptical of Kustomize but it turned out to be just enough.
So, like many here, I kind of hate it, but it serves me well.
> Company saves money by not having to hire Linux sysadmins
Citation? In my experience companies hire more sysadmins when adopting k8s. It's trivial to point at the job reqs for it.
> Company saves money by not having to pay for managed cloud products if they don't want to
Save money?! Again citation. What are you replacing in the cloud with k8s? In my experience most companies using k8s (as you already admitted) don't have a ton of ops experience and thus use more cloud resources.
> Treating cloud providers like cattle not pets
Again. Citation? Companies go multi-cloud not because they want to but because they have different teams (sometimes from acquisition) that have pre-existing products that are hard to move. No one is using k8s to get multi-cloud as a strategy.
> It's going to eat the world (already has?).
No, it won't. It's actually on the downtrend now. Do you work for the CNCF? Can you put in a disclaimer if so?
> just deploying a single statically linked, optimized C++ server that can serve 10k requests per second from a toaster
completely unnecessary; most of the HN audience is not creating a C++ webserver from scratch; most of the HN audience can trivially serve way more than 10k reqs/sec from a single VM (node, rust, go, etc. are all easily capable of doing this from 1 vCPU)
Your C++ example is orthogonal to the deployment aspect because it discusses the application. Kubernetes and the fragile shell scripts are about the deployment of said application.
How are you going to deploy your C++ application? Both options are available, and I would wager that in most cases, Kubernetes makes more sense, unless you have strict requirements.
Kubernetes is for orchestrating a distributed system. What I was suggesting is to (1) make a monolith and (2) make it fast, light and high-throughput. The goal of your service is to reliably serve users at scale; this is just another way of doing it, just much more esoteric.
A "C++ monolith" allows me to potentially bypass a lot of this deployment stuff because it could serve lots (millions) of users from a single box.
>Kubernetes is for orchestrating a distributed system.
No it's not. You can use it to run a bunch of monoliths too. K8s provides a common API layer that all of your organisation can adhere to. Just like containers are a generic encapsulation of any runnable code.
Sure, but you could just use plain Docker for the monolith. Containerization isn't the issue. If you're not orchestrating and connecting an array of services, what's the value add of K8s?
Standardization of deployment logic, configuration management, networking, rbac policies, ACL rules, etc etc. across on-prem or any cloud provider.
I can leave my current job, jump into a new one and start providing value in less than a couple of days. Compare that to spending weeks if not months trying to understand their special snowflake of an infrastructure, solving the same problems already solved a million times before.
> A "C++ monolith" allows me to potentially bypass a lot of this deployment stuff
No it doesn't. Let's assume you write that application as a C++ monolith. Congratulations, you now have source code that could potentially serve 10k users on a toaster... If only you could get it onto that toaster. How are you going to start the databases it needs? How are you going to restart it when it crashes, or worse: when it still runs but is unresponsive? How are you going to upgrade it to a new version without downtime? How are you going to do canary releases to catch bugs early in production without affecting all users? How do you roll back your infrastructure when there is an issue in production? How do you notice when your toaster server diverges from its desired state? How do you handle authorization to be compliant with privacy regulations? I'd love to see that simple and safe shell script of yours which handles all those use cases. I'm sure you could sell it for quite a bit of money.
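For what it's worth, once the app sits in a Deployment, the upgrade/rollback part of that list is a handful of kubectl commands (the deployment and image names here are made up):

# roll out a new image; the Deployment replaces pods gradually
kubectl set image deployment/myapp myapp=registry.example.com/myapp:v2
kubectl rollout status deployment/myapp    # blocks until new pods are ready, or the rollout stalls

# something broke in production? return to the previous revision
kubectl rollout undo deployment/myapp

# crash and hang handling come from restartPolicy and liveness probes in the
# Deployment spec rather than from a babysitting script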
What you fail to understand is that k8s never was about efficiency. Your monolith may work at 10k users with a higher efficiency but it can never scale to a million. At some point you can't buy any bigger toasters and have no choice but to make a distributed system.
Besides, microservice vs monolith is orthogonal to using k8s.
I run a Dockerized monolithic application in ECS, and I'll be switching to Kubernetes soon. I am 100% sold on this approach and will never go back to any deployment methods that I've used in the past (Capistrano, Ansible, Chef, Puppet, Saltstack, etc.)
I use Convox [1] which makes everything extremely simple and easy to set up on any Cloud provider. They have some paid options, but their convox/rack [2] project is completely free and open source. I manage everything from the command-line and don't use their web UI. It's just as easy as Heroku:
convox rack install aws production
convox apps create my_app
convox env set FOO=bar
convox deploy
You can also run a single command to set up a new RDS database, Redis instance, S3 bucket, etc. Convox manages absolutely everything: secure VPC, application load balancer, SSL certificates, logs sent to CloudWatch, etc. You can also set up a private rack where none of your instances have a public IP address, and all traffic is sent through a NAT gateway:
convox rack params set Private=true
This single command sets up HIPAA and PCI compliant server infrastructure out of the box. Convox automatically creates all the required infrastructure and migrates your containers onto new EC2 instances. All with zero downtime. Now, all you need to do is sign a BAA with AWS and make sure your application and company complies with regulations (access control, encryption, audit logs, company policies, etc.)
I run a simple monolithic application where I build a single Docker image, and I run this in multiple Docker containers across 3+ EC2 instances. This has made it incredibly easy to maintain 100% uptime for over 2 years. There were a few times where I've had to fix some things in CloudFormation or roll back a failed deploy, but I've never had any downtime.
My Docker images would be much smaller and faster if I built my backend server with C++ or Rust instead of Ruby on Rails. But I would absolutely still package a C++ application in a Docker image and use ECS / Kubernetes to manage my infrastructure. I think the main benefit of Docker is that you can build and re-use consistent images across CI, development, staging, and production. So all of my Debian packages are exactly the same version, and now I spend almost zero time trying to debug strange issues that only happen on CI, etc.
So now I already know I want to use Docker because of all these benefits, and the next question is just "How can I run my Docker containers in production?". Kubernetes just happens to be the best option. The next question is "What's the easiest way to set up Docker and Kubernetes?" Convox is the holy grail.
The application language or framework isn't really relevant to the discussion.
P.S. Things move really fast in this ecosystem, so I wouldn't be surprised if there are some other really good options. But Convox has worked really well for me over the last few years.
so you build your docker images all by yourself using just a Dockerfile and your statically linked app based on RHEL - for all that HIPAA and PCI compliance?? IIRC the current hottest shit of this shitshow (the dockerhub-using "ecosystem") was to use Ansible in your docker builds because it's oh so declarative.
No, not just for HIPAA/PCI compliance, that's just one of the many benefits. Here's some more reasons why I love Convox, Kubernetes/ECS, Docker:
* Effortlessly achieve 100% uptime with rolling deploys
* Running a single command to spin up a new staging environment that is completely identical to production
* Easily spinning up identical infrastructure in a different AWS region (Europe, Asia, etc.)
* Easily spinning up infrastructure inside a customer's own AWS or Google Cloud account for on-premise installations
* Automatic SSL certificates for all services. Just define a domain name in your Convox configuration, and it will automatically create a new SSL certificate in ACM and attach it to your load balancer.
* Automatic log management for all services
* Very easily being able to set up scheduled tasks with a few lines of configuration
* Being able to run some or all of my service on AWS Fargate instead of EC2 with a single command
* Ease of deploying almost any open source application in a few minutes (GitLab, Sentry, Zulip Chat, etc.)
well, I am not really interested in your convox ads but more in your claim that it somehow makes the typical docker workflow of running random software from the net HIPAA and PCI compliant? That's an interesting claim, especially with your description of it as zero effort.
No, Convox doesn't automatically make any application compliant. Convox makes it far easier to achieve HIPAA/PCI compliance by easily setting up compliant server infrastructure:
Note that dedicated instances are no longer required for HIPAA compliance [1]. Also note that the private Convox console is completely optional. You can achieve all of this with the free and open source convox/rack project: https://github.com/convox/rack
As I mentioned in my original comment, you still need to do a lot of work to set up company policies and make sure your application complies with all regulations.
You should also be aware that I'm comparing Convox with some other popular options for HIPAA-compliant hosting:
* Datica: https://datica.com (I think it starts around $2,000 per month, but not 100% sure)
These companies do provide some additional security and auditing features, but I think there's no reason to spend thousands of dollars per month when Convox can get you 95% of the way in your own AWS account. PLUS: If you have any free AWS credits from a startup program, you might not need to pay any hosting bills for years.
In a world where your architecture is that simple I don't think kubernetes would be the choice for long.
I think for the average application there's still something to be said for manual cross-layer optimization between infrastructure, application, and how both are deployed.
What I mean is we can't yet draw too clear a line between the application and how it's deployed because there are real tradeoffs between keeping future options open and getting the product out the door. A strength of kubernetes is that if you get good at it it works for a variety of projects, but a lot of effort is needed to get to that point and that effort could have gone into something else.
Disagree with the GUI part. Text based configuration is stable, version-controllable, and intuitive to use. Look at Xcode for a nightmare of GUI configuration.
Currently mainstream k8s is text based, because it's still too fast moving and new. Creating a great GUI would be a serious overhead and there's not enough interest/demand for it. It'll come eventually.
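In the meantime, kubectl can at least generate the YAML skeletons from flags, which already covers a chunk of the wizard use case (the name and image below are placeholders, not anything from this thread):

kubectl create deployment myapp --image=registry.example.com/myapp:v1 \
    --dry-run=client -o yaml > deployment.yaml
kubectl create service clusterip myapp --tcp=80:8080 \
    --dry-run=client -o yaml > service.yaml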
Rancher is even more approachable and streamlined than OpenShift. Yeah, we need k8s to be more streamlined. It's fragmented cluster-something right now. There is no pride in making systems more complex, nor is there pride in knowing how to operate them.
OpenShift's web interface lets you do a lot (other alternatives are of course available). And it's quite nice being able to edit the YAML of everything via the web interface as well!
> • Company saves money by not having to pay for managed cloud products if they don't want to
In some cases, the cost of a managed cloud product may be cheaper than the cost of training your engineers to work with K8s. It just depends on what your needs are, and the level of organizational commitment you have to making K8s part of your stack. Engineers like to mess around with new tech (I'm certainly guilty of this), but their time investment is often a hidden cost.
> The alternatives are all based on kludgy shell/Python scripts or proprietary cloud products.
The fact that PaaS products are proprietary is often listed as a detriment. But, how detrimental is it really? There are plenty of companies whose PaaS costs are insignificant compared to their ARR, and they can run the business for years without ever thinking about migrating to a new provider.
The managed approach offered by PaaS can be a sensible alternative to K8s, again it just depends on what your organizational needs are.
"The alternative to K8s isn't your personal collection of fragile shell scripts. The real alternative is not doing the whole microservices thing and just deploying a single statically linked, optimized C++ server that can serve 10k requests per second from a toaster--but we're not ready to have that discussion."
You are writing this while just yesterday I was thinking about how to extend my current home k8s setup even further.
I would even manage that little c++ tool through k8s.
K8s brings plenty of other things out of the box:
- Rolling update
- HA
- Storage provisioning (which makes backup simpler)
- Infrastructure as code (whatever your shellscript is doing)
I think that the overhead k8s requires right now will become smaller over the years; it will be simpler to use, and it will become more and more stable.
It is already a really simple and nice control plane.
I like to use a few docker containers with compose. But if I already use docker compose for 2 projects, why not just use k8s instead?
You still need quite a lot of stuff for that one, statically linked, heavily optimized C++ server. In a way, that's actually what k8s comes from ...
How do you manage deployments for that C++ monolith? How is the logging? Logrotate, log gathering and analysis? Metrics, their analysis and display? What happens when you have software developed by others that you might also want to deploy? (If you can run a company with only one program ever deployed, I envy you).
All of that is simplified by kubernetes by simply making everything follow a single way - "classical" approaches tend to make Perl blush with the amount of "There is more than one way to do it" that goes on.
Are there any decent GUI services for creating the YAML files?
Most of it could be managed by text boxes with selections on the front-end, which then just generates or edits the required files at the end of a wizard?
Your static app won't scale. Once your kernel is saturated with connections, once your buffers are full, you will get packets dropped. Your app may crash unexpectedly so you need to run it in an infinite loop. And of course your static example is a simple echo app or hello world. That works fine from a toaster. In the real world we use databases where every store takes a significant time to process and persist. You quickly outgrow a single server. Then you need distributed systems and a way to do versioning, so you use containers, and then you need an orchestrator, so you pick K8S because it's the most mature and there are many resources around. And then you can even do rolling updates and rollbacks. Finally you use Helm, Terraform and Terragrunt and never look back. It works surprisingly well. I lost several years learning all this stuff; it was difficult but it was worth it. I have more visibility into everything now thanks to metrics-server, Prometheus Operator, Grafana, Loki, and I have two environments so I can update deployments in one environment and test; once tested OK, I can apply to live. No surprises. No need to run 5 year old versions of software and fear updating it...
I've never coded in C++, so I'm curious - what is the feedback loop like? My understanding is that since C++ is a compiled language, you can't "hit refresh" the same way you can in JavaScript/Ruby/etc.
Is that an incorrect understanding? I know C++ is supposed to be great for performance, but in truth I've never needed anything to be that fast. And if I can get the job done just as well with something I already know, I won't bother learning something like C++ which has a reputation for not being approachable.
Technically you can have some sort of command in your C++ program to make it fork itself, that is, restart itself from disk, maybe after freeing non-shareable resources like listening sockets. One could even imagine that this command includes self-recompilation by invoking make (or whatever your build system is).
An alternative is to have your program expose its pid somewhere, and your makefile could send a signal to that pid when a new version is ready. The advantage is that if your program crashes on startup, you don't have to do something different to restart it.
If your application has to include an "auto-update" feature (like browsers), use that instead - it's certainly better to eat your own dog food. Maybe just hack it a little bit so that you can force a check programmatically (e.g. by sending it a signal) and so that it connects to a local update server.
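A minimal sketch of that signal-the-pid variant, assuming the server writes /run/server.pid and re-execs itself on SIGHUP; the paths and the make target are invented:

make release                                         # or whatever your build system is
install -m 755 build/server /usr/local/bin/server.new
mv /usr/local/bin/server.new /usr/local/bin/server   # atomic swap on the same filesystem
kill -HUP "$(cat /run/server.pid)"                   # running process re-execs the new binary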
It is true that C++ is an overly complicated language; if you don't need maximal performance, you have a lot of AoT languages that are a bit slower (something around 0.5x C) but more "user-friendly". In particular, if you are into servers and want fast edit-compile-run loops, Go might be a good choice.
The part of the equation you are missing is how little traffic the majority of business applications end up seeing.
A B2B app that has at most 10 concurrent requests can run on the smallest EC2 instance whether it's written in PHP or written in C++.
Bandwidth doesn't change with language choice and neither do the storage requirements of your app, so those billable items don't come into the equation either.
So the CPU cost can effectively be dropped off of 95% of the apps out there today. At that point your main variable cost between C++ and something like PHP/Javascript is going to be the cost of development. All I can say to that is that it's a lot harder to find developers who can write C++ web apps at the same pace as developers slinging PHP for web apps. There is a reason Facebook uses a PHP derivative for huge portions of its web backend.
So maybe the question we should ask ourselves is: why isn't there a smaller, cheaper EC2 instance (from AWS or any other provider)?
This industry is tailoring its tiers. Of course it's understandable because, well, they live on it. And they count on small instances sharing hardware resources so they can overbook said hardware.
And I don't blame them for that; I'm doing the same on my own bare metal servers, hosting multiple websites for clients and making money on it.
But I have the feeling that there is a lot of resource loss somewhere in it, just for the sake of losing it because it's easier. Maybe I'm wrong.
I only use AWS when consulting customers prefer it.
I get a lot of mileage serving a lot of content for several domains on a free Google Cloud Platform f1 micro instance. I also prefer GCP when I need a lot of compute for a short time.
Hetzner has always been my choice when I need more compute for a month or two. For saving money for VPSs OVH and DO have also been useful but I don’t use them very often.
Go will get you 90% there in performance. I recommend that instead of C++ because it compiles fast for that edit-build-refresh cycle. Go also has the benefit of producing static binaries by default which solves a lot of the problems Docker was for.
C++ is more for the extreme control over memory. An optimized C++ server can max out the NIC even on a single core and even with some text generation/parsing along the way.
If you're a ruby user looking for extra performance, try Crystal[1]. Everything you love + types compiled to native binaries. You can set up sentry[2] to autocompile and run every time you save, so the feedback loop is just as tight.
It doesn't have to be C++ of course. The main point of OP lies in C++ being fast and easy to deploy (at least it can be). Go, Rust, and to some extent C# and Java also fall in that category. This feature set becomes interesting because it has the potential to simplify everything around it. If you don't need high fault tolerance you can go a very long way with just a single server, maybe sprinkled with some CDN caching in front if your users are international.
If you do need higher availability you can go the route StackOverflow was famous for for quite some time: a single ReadWrite master server, a backup master to take over, and a few ReadOnly slave servers, IIRC. With such setups you can ignore all the extra complexities cloud deployments bring with them. And just because such simple setups make it possible to treat servers like pets doesn't mean they have to be irreproducible undocumented messes.
Working in strongly statically typed languages is quite different, because the types (and the IDE/compiler) guides you. You don't have to hit refresh that often.
Even just working in TypeScript with TSed and a few basic strongly typed concepts (Rust's Result equivalent in TS, Option or Maybe, and typed http request/response, and Promise and JSON.parse) makes a big difference.
A lot less of "okay, just echo/print/log this object (or look up the documentation, eew), look at what it looks like and figure out how to transform it into what I need." Instead you do that in the IDE.
I would speculate that for companies these two statements will be exclusive:
• Company saves money by not having to hire Linux sysadmins
• Company saves money by not having to pay for managed cloud products if they don't want to
As a developer I want to write code, not manage a Kubernetes installation. If my employer wants the most value from my expertise they will either pay for a hosted environment to minimize my time managing it or hire dedicated staff to maintain an environment.
And this is why I only give Linux like one decade more to still be relevant on the server.
With hypervisors and managed environments taking over distributed computing, whether the kernel is derived from Linux or something completely different is a detail that only the cloud provider cares about.
That doesn't really make sense in the context of docker.
It's still Linux inside the container. Even if it's some abstract non-Linux service thing running the container, what happens in the container is still the concern of the developer.
Yes it does. My application written in Go, Java or .NET doesn't care if the runtime is bare metal, running on a type 2 hypervisor, a type 1, or some other OS.
I run Docker on Windows Containers, no Linux required.
There are also the ugly named serverless deployments, where the kernel is meaningless.
or a Free Pascal server (powered by mORMot (https://github.com/synopse/mORMot))? Natively compiled, high performance, and it supports almost all OSes and CPUs
> The real alternative is not doing the whole microservices thing and just deploying a single statically linked, optimized C++ server that can serve 10k requests per second from a toaster--but we're not ready to have that discussion.
The alternative is to have an old and boring cluster of X identical Java nodes which host the entire backend in a single process... The deployment is done by a pedestrian bash script from Jenkins. It worked fine for too long, I guess, and folks couldn't resist "inventing" microservices to "disrupt" it.
k8s is popular because Docker solved a real problem and Compose didn't move fast enough to solve the orchestration problem. It's a second order effect; the important thing is Docker's popularity.
Before Docker there were a lot of different solutions for software developers to package up their web applications to run on a server. Docker kind of solved that problem: ops teams could theoretically take anything and run it on a server if it was packaged up inside of a Docker image.
When you give a mouse a cookie, it asks for a glass of milk.
Fast forward a bit and the people using Docker wanted a way to orchestrate several containers across a bunch of different machines. The big appeal of Docker is that everything could be described in a simple text file. k8s tried to continue that trend with a yml file, but it turns out managing dependencies, software defined networking, and how a cluster should behave in various states isn't the greatest fit for that format.
Fast forward even more into a world where everybody thinks they need k8s and simply cargo cult it for a simple Wordpress blog and you’ve got the perfect storm for resenting the complexity of k8s.
I do miss the days of ‘cap deploy’ for Rails apps.
> k8s is popular because Docker solved a real problem and Compose didn't move fast enough to solve the orchestration problem. It's a second order effect; the important thing is Docker's popularity.
I introduced K8s to our company back in 2016 for this exact reason. All I cared about was managing the applications in our data engineering servers, and Docker solved a real pain point. I chose K8s after looking at Docker Compose and Mesos because it was the best option at the time for what we needed.
K8s has grown more complex since then, and unfortunately, the overhead in managing it has gone up.
K8s can still be used in a limited way to provide simple container hosting, but it's easy to get lost and shoot yourself in the foot.
>Before Docker there were a lot of different solutions for software developers to package up their web applications to run on a server.
There are basically two relevant package managers. And say what you will about systemd, service units are easy to write.
It's weird to me that the tooling for building .deb packages and hosting them in a private Apt repository is so crusty and esoteric. Structurally these things "should" be trivial compared to docker registries, k8s, etc. but they aren't.
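To illustrate the "easy to write" part, a unit for a hypothetical app is about this much (the name, user and path are made up):

cat > /etc/systemd/system/myapp.service <<'EOF'
[Unit]
Description=My web app
After=network-online.target

[Service]
ExecStart=/usr/local/bin/myapp --port 8080
User=myapp
Restart=always

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now myapp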
.rpm and .deb are geared more towards distributions' needs. Distributions want to avoid multiplying the number of components for maintenance and security reasons. Bundling dependencies with apps is forbidden in most distribution policies for these reasons, and the tooling (debhelpers, rpm macros) actively discourages it.
It's great for distributions, but not so great for custom developments where dependencies can either be out of date or bleeding edge or a mix of the two. For these, a bundling approach is often preferable, and docker provides a simple to understand and universal way to achieve that.
That's for the packaging part.
Then you have the 2 other parts: publishing and deployment.
For publishing, Docker was created from the get-go with a registry, which makes things relatively easy to use and well integrated. By contrast, for rpm and deb, even if something analogous exists (aptly, pulp, artifactory...), it's much more a set of tools created over time which work on top of one another, giving a less smooth experience.
And then you have the deployment part, and here, with traditional package managers, it is difficult to delegate some installs (typically, the custom app developed in-house) to the developers without opening up control over the rest of the system. With Kubernetes, developers gained this autonomy of deployment for the pieces of software under their responsibility whilst still maintaining separation of concerns.
Docker and Kubernetes enabled cleaner boundaries, more in line with the realities of how things are operated for most mid to large scale services.
Right, the bias towards distro needs is why packaging is so hard to do internally; I'm just surprised at how little effort has gone into adapting it.
You need some system mediating between people doing deployments and actual root access in both cases. The "docker" command is just as privileged as "apt-get install." I have always been behind some kind of API or web UI even in docker environments.
You can always simplify your IT and require everyone to use only a small subset of Linux images which were preapproved by your security team. And you can make those to be only deb or rpm based Linux distributions.
The only problem with these Linux based packaging for deployments are Mac users and their dev environment. Linux users are usually fine, but there always had to be some Docker like setup for Mac users.
If we could say that our servers run on Linux and all users run on some Linux (WSL for Windows users) then deployments could have been simple and reproducible rpm based deployments for code and rpm packages containing systemd configuration.
I'm guessing they meant to say package formats, in which case they'd be deb and rpm. Those were the only two that are really common in server deployments running linux I'd guess.
dnf is a frontend to rpm, snap is not common for server use-cases, nix is interesting but not common, dpkg is a tool for installing .deb.
> everybody thinks they need k8s and simply cargo cult it for a simple Wordpress blog
docker _also_ has this problem though. there are probably 6 people in the world that need to run one program built with gcc 4.7.1 linked against libc 2.18 and another built with clang 7 and libstdc++ at the same time on the same machine.
and yes, docker "provides benefits" other than package/binary/library isolation, but it's _really_ not doing anything other than wrapping cgroups and namespacing from the kernel - something for which you don't need docker to do (see https://github.com/p8952/bocker).
docker solved the wrong problem, and poorly, imo: the packaging of dependencies required to run an app.
and now we live in a world where there are a trillion instances of musl libc (of varying versions) deployed :)
sorry, this doesn't have much to do with k8s, i just really dislike docker, it seems.
I am a big fan of using namespaces via docker, in particular for development. If I want to test my backend component I can expose a single port and then hook it up to the database, redis, nginx etc. via docker networks. You don't need to worry about port clashes and it's easy to "factory reset".
In production this model is quite a good way to guarantee your internal components aren't directly exposed too.
that's sort of my point though - namespacing is a great feature that allows for more independent & isolated testing and execution, there is no doubt. docker provides none of it.
i would argue that relying on docker hiding public visibilty of your internal components is akin to using a mobile phone as a door-stop - it'll probably work but there are more appropriate (and auditable) tools for the job.
> docker _also_ has this problem though. there are probably 6 people in the world that need to run one program built with gcc 4.7.1 linked against libc 2.18 and another built with clang 7 and libstdc++ at the same time on the same machine.
You are supposed to keep only a single process inside one docker container. If you want two processes to be tightly coupled then use multi-container pods.
Hit the nail on the head. How else could you, at the push of a button, get not just a running application but an entire coordinated system of services like you get with Helm? And deploying a kubernetes cluster with kops is easy. I don't know why people hate on k8s so much. For the space I work in it's a godsend.
Good points but I think it would be accurate to say that Docker solved a developer problem. But developers are only part of the story. Does Kubernetes solve the business' problem? The user's problem? The problems of sys admins, testers, and security people? In my experience it doesn't (though I wouldn't count my experience as definitive).
At my company we have had better success with micro-services on AWS Lambda. It has vastly less overhead than Kubernetes and it has made the tasks of the developers and non-developers easier. "Lock-in" is unavoidable in software. In our risk calculation, being locked into AWS is preferable than being locked into Kubernetes. YMMV.
> I do miss the days of ‘cap deploy’ for Rails apps.
Oh boy I do not miss them. Actually I'm still living them and I hope we can finally migrate away from Capistrano ASAP. Dynamic provisioning with autoscaling is a royal PITA with cap as it was never meant to be used on moving targets like dynamic instances.
>I do miss the days of ‘cap deploy’ for Rails apps.
Add operators, complicated deployment orchestration and more sophisticated infrastructure... It is hard to know if things are failing from a change I made or just because there are so many things changing all the time.
Kubernetes is very complex and took a long time to learn properly. And there have been fires along the way. I plan to write extensively on my blog about it.
But at the end of the day: having my entire application stack as YAML files, fully reproducible [1] is invaluable. Even cron jobs.
Note: I don't use micro services, service meshes, or any fancy stuff. Just a plain ol' Django monolith.
Maybe there's room for a simpler IaC solution out there. Swarm looked promising, then fizzled. But right now the leader is k8s [2] and for that alone it's worth it.
[1] Combined with Terraform
[2] There are other proprietary solutions. But k8s is vendor agnostic. I can and have repointed my entire infrastructure with minimal fuss.
I'm not sure "a plain ol' Django monolith" with none of the "fancy stuff" is either what people are referring to when they say "kubernetes", or a great choice for that. I could run hello world on a Cray but that doesn't mean I can say I do supercomputing. Our team does use it for all the fancy stuff, and spends all day everyday for years now yamling, terraforming, salting, etc so theoretically our setup is "entire application stack as YAML files, fully reproducible", but if it fell apart tomorrow, I'd run for the hills. Basically, I think you're selling it from a position which doesn't use it to any degree which gives sufficient experience required to give in-depth assessment of it. You're selling me a Cray based on your helloworld.
Reading this charitably: I guess I agree. k8s is definitely overpowered for my needs. And I'm almost certain my blog or my business will never need that full power. Fully aware of that.
But I'm not sure one can find something of "the right power" that has the same support from cloud providers, the open source community, the critical mass, etc. [1]
Eventually, a standard "simplified" abstraction over k8s will emerge. Many already exist, but they're all over the place. And some are vendor specific (Google Cloud Run is basically just running k8s for you). Then if you need the power, you can eject. Something like Create React App, but by Kubernetes. Create Kubernetes App.
Curious why you run it at all? The cost must be 10 times more this way. Is it mostly for the fun of learning?
I come from the opposite approach. I have 4 servers: two DigitalOcean $5 and two Vultr $2.50 instances. One holds the db. One serves as the frontend/code. One server does heavy work and another serves a heavy site and holds backups. For $15 I'm hosting hundreds of sites and running so many background processes. I couldn't imagine hitting the point where k8s would make sense just for myself, unless for fun.
If you do, the recipe is to reduce the number of components, get the most reliable components you can find, and make the single points of failure redundant.
Saying you can use Kubernetes to turn whatever stupid crap people tend to deploy with it highly available, is like saying you can make an airliner reliable by installing some sort of super fancy electronic box inside. You don't get more reliability by adding more components.
> Saying you can use Kubernetes to turn whatever stupid crap people tend to deploy with it highly available, is like saying you can make an airliner reliable by installing some sort of super fancy electronic box inside. You don't get more reliability by adding more components.
This is a bit funny, considering Airbus jets use triple-redundancy and a voting system for some of their critical components. [1]
Are you ok with your application going down for each upgrade? With Kubernetes, it's very simple to configure a deployment so that downtime doesn't happen.
If and only if your application supports it. Database schema upgrades can be tricky for instance, if you care about correctness.
On the other hand, atomic upgrades by stopping the old service and then starting the new service on a Linux command line (/Gitlab runner) can be done in 10 seconds (depending on the service of course – dynamic languages/frameworks sometimes are disadvantaged here). I doubt many customers will notice 10 second downtimes.
And that downtime can even be avoided without resorting to k8s. A simple blue-green deployment (supported by DNS or load balancer) is often all that's needed.
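A sketch of that switch with nginx doing the load balancing; the deploy helper and host names are placeholders, and it assumes your nginx.conf includes upstreams/active.conf:

deploy-to green                                  # placeholder: push the new version to the idle colour
curl -fsS http://green.internal:8080/healthz     # smoke test before it takes any traffic
ln -sfn /etc/nginx/upstreams/green.conf /etc/nginx/upstreams/active.conf
nginx -t && nginx -s reload                      # graceful reload, no dropped connections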
K8s only makes sense at near Google-scale, where you have a team dedicated to managing that infrastructure layer (on top of the folks managing the rest of the infrastructure). For almost everyone else, it's damaging to use it and introduces so much risk. Either your team learns k8s inside out (so a big chunk of their work becomes about managing k8s) or they cross their fingers and trust the black box (and when it fails, panic).
The most effective teams I've worked on have been the ones where the software engineers understand each layer of the stack (even if they have specialist areas of focus). That's not possible at FAANG scale, which is why the k8s abstraction makes sense there.
It takes a couple of minutes at most for an average application upgrade/deployment; a lot of places can deal with that. Reddit is less reliable than what I used to manage as a one-man team.
How do you do automated deployments, though? I don't like using K8s for small stuff, but I am also extremely allergic to having to log on to a server to do anything. Dokku hits the sweet spot for me, but at work I would probably use Nomad instead.
Set your pod to always pull the image and have an entrypoint shell script that clones the repo; kill the pod so that on restart you get your code deployed.
You could run an init container with Kaniko that pushes an image to the repo and then a main container that pulls it back, but for that you need to do kubectl rollout restart deploy <name>
If you are looking for pure CI/CD, GitLab has awesome support, or you could do Tekton or Argo. They can run on the same cluster.
What's wrong with logging in to a server? I love logging in to a server and tinkering with it. Sure, for those who operate fleets of hundreds it's not scalable, but for a few servers that's a pleasure.
The problem is when you need to duplicate that server or restore it due to some error, you have no idea what all the changes you made are.
Besides, it's additional hassle and a chance for things to go wrong, the way I have it set up now is that production gets a new deployment whenever something gets pushed to master and I don't have to do anything else.
But this is a solved problem since... well, at least since the beginning of the internet. I have managed 1000s of Linux & BSD systems over the past 25 years and I have had scripts that automate all of that since the mid 90s. I never install anything manually; if I have to do something new, I first write + test a script to do it remotely. Also, all this 'containerization' is not new; I have been using debootstrap/chroot since around that time as well. I run cleanly separated multiple versions of legacy (it is a bit scary how much time it takes to move something written in the early 2000s to a modern Linux version) + modern applications without any (hosting/reproducibility) issues since forever (in internet years anyway).
True; I learned many years ago that that is not a good plan. Although, I too, love it. But I use my X220 and Openpandora at home to satisfy that need. Those setups I could not reproduce if you paid me.
> The problem is when you need to duplicate that server or restore it due to some error, you have no idea what all the changes you made are.
A text file with some setup notes is enough for simple needs, or something like Ansible if it's more complex. A lot of web apps aren't much more than some files, a database, and maybe a config file or three (all of which should be versioned and backed up).
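For the "more complex" case, a minimal Ansible sketch is usually all it takes. Everything below is hypothetical (host group, package list, paths and the "myapp" service name), but it shows how little is needed to rebuild a simple host from scratch:

    - hosts: web
      become: true
      tasks:
        - name: Install packages
          apt:
            name: [nginx, postgresql]
            state: present
        - name: Deploy the app config (versioned alongside this playbook)
          copy:
            src: files/app.conf
            dest: /etc/myapp/app.conf
          notify: restart myapp
      handlers:
        - name: restart myapp
          service:
            name: myapp
            state: restarted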
Make a backup of /etc and the package list. Usually that's enough to quickly replicate a configuration. It's not like servers are crashing every day. I've been managing a few servers for the last 5 years and I don't remember a single crash; they just work.
I'm logging into a server because I need to, not because it's a 'pleasure'.
I don't hate it, but if you need to log in to a server regularly because you need to do an apt upgrade, you should have enabled automatic security updates instead of logging in every few days.
If your server runs full because of some log files or whatever, you should fix the underlying issue and not need to log in to the server.
You should trust your machines, regardless of whether it is only one machine, 2, 3 or 100. You wanna be able to go on holiday and know your systems are stable, secure and doing their job.
And logging in also implies a snowflake. That doesn't matter as long as the machine runs and you don't have that many changes, but k8s actually makes it very simple to finally have an abstraction layer for infrastructure.
Flux or Argo can help with this. The operator lives on your cluster, and ensures your cluster state matches a Git repo with all your configuration in it.
This is what I like to do. In my case, even the CI/CD host is just a systemd service I wrote.
The service just runs a script that uses netcat to listen on a special port that I also configured GitHub to send webhooks to, and processes the hook/deploys if appropriate.
Then when it's done, systemd restarts the script (it is set to always restart) and we're locked and loaded again. It's about 15 lines of shell script in total.
Do you manage to aggregate all logs on a single place? Do you have the same environment as staging? How do you upgrade your servers? Do you have multiple teams deploying on their own components? Do you have a monitoring/metric service? How do you query the results of a Cron? Can you rollback to the correct version when you detect an error at production?
remote syslog has been a thing for how many years?! As has using a standard distribution package for your app with a non-root user for each app, easily wiped and set up every deploy (hint: that's good for security too!). Monitoring was also a solved problem and I guess cron logs to syslog. Rollback works just like a regular deploy? (I wonder how well k8s helps you with db-schema rollbacks?)
Setting all of that up from scratch is not really that easy, and I wouldn't consider "monitoring" to have been a "solved problem". syslog over TCP/UDP had many issues, which is why log file shippers happened, and you still need to aggregate and analyze it all. Getting an application to reliably log remotely is IMO easier with k8s than remote-writing syslog, as I can just whack the developer again and again till it logs to stdout/stderr and then easily wrap it however I want.
Deploying as a distribution package tends not to work well when you want to deploy it more than once on a specific server (which quickly leads to the classic end result of that approach, a VM per deployment at minimum - been there, done that, still have scars).
Management of cron jobs was a shitshow, is a shitshow, and probably will be a shitshow except for those that run their crons using non-cron tools (which includes k8s).
Yes, k8s makes it easier and more consistent. But it's not like all the stuff from the past suddenly stopped working or was not possible like GP made it sound ;)
Are you using a framework (cPanel etc) for this or just individual servers talking to each other? Need to move my hosting to something more reliable and cheaper...
I'm learning Elixir now, and it's quite confusing to me how one would go about deploying Elixir with K8s, and how much you should just let the runtime handle.
How much of K8s is just an ad hoc, informally-specified, bug-ridden, slow implementation of half of Erlang.
> But I'm not sure one can find something of "the right power" that has the same support from cloud providers, the open source community, the critical mass, etc.
I totally agree. I would dearly like something simpler than Kubernetes. But there isn't a managed Nomad service, and apparently nothing in between Dokku and managed Kubernetes either.
I've been very pleased with Nomad. It strikes a good balance between complexity and the feature set. We use it in production for a medium sized cluster and the migration has been relatively painless. The nomad agent itself is a single binary that bootstraps a cluster using Raft consensus.
So are you saying that, no matter what, if you want to deploy your whole infrastructure as code (networks, DMZ, hosts, services, apps, backups, etc.), you are going to have to reproduce all of that somehow (whatever the combo of AWS services is), or just learn K8S?
Effectively, "every infrastructure as code project will reimplement Kubernetes in Bash"
Instead of a varied interface tools set, I can have one with consistent interfaces and experiences. Kubernetes is where everything is going, hence why TF and Ansible have both recently released Kubernetes related products / features. It's their last attempt at remaining relevant (which is more than likely wasted effort in the long run). They have too much baggage (existing users in another paradigm) to make a successful pivot.
Ironically for me, those two tools are part of the blessed triad that we use for all of our infrastructure as code and end-user virtual machine initial setup.
If we only got to keep two tools it would be kubernetes and terraform.
This isn’t a competition, they are tools. Ansible is widely used and will continue to be so for a long long time. Its foundations - ssh, python and yaml are also in for the long run to manage infrastructure...
Yaml is on the way out, Cuelang will replace it where it's used for infra. It's quite easy to start by validating yaml and then you quickly realize how awesome having your config in a well thought-out language is!
Well, then perhaps the growing frustration with a configuration language where meaning depends on invisible characters is an argument. And then there's what Helm and others are doing: interpolating text and adding helpers for managing indentation.
There are many experiments into alternatives happening right now, so I do believe YAML's days are numbered. I'm actively replacing it wherever I encounter it with a far superior alternative. Cue is far more than a configuration language, however; it's worth the time to learn and adopt at this point.
Kubernetes is outstanding: prove the ol' bash scripts were bad, re-write all the Bash glue as Go glue, slap 150 APIs on top (60 "new" and about a hundred "old versions" https://kubernetes.io/docs/reference/generated/kubernetes-ap...), add a few dozen open-source must-have projects so it can finally be called "cloud native" - and boom, a new cloud-native bash for the cloud is born!
Bash glue has rarely failed me. Never used k8s, but the horror stories I hear ensure I won’t feel dirty for writing a little bash to solve problems anytime soon.
Swarm is still supported and works. I have it running on my home server and love it.
Kubernetes is fine, but setting it up kind of feels like I'm trying to earn a PhD thesis. Swarm is dog-simple to get working and I've really had no issues in the three years that I've been running it.
The configs aren't as elaborate or as modular as Kubernetes, and that's a blessing as well as a curse; it's easy to set up and administer, but you have less control. Still, for small-to-mid-sized systems, I would still recommend Swarm.
> setting it up kind of feels like I'm trying to earn a PhD thesis.
The kind of person who has to both set the cluster up and keep it running, and also develop the application, deploy it and keep it up, etc., is not the target audience.
K8s shines when the roles of managing the cluster and running workloads on it are separated. It defines a good contract between infrastructure and workload. It lets different people focus on different aspects.
Yes, it still has rough edges - things that are either not there yet, or vestigial complexity from wrong turns that happened through its history. But if you look at it through the lens of this corporate scenario, it starts making more sense than when you just think of what a full-stack dev in a two-person startup would rather use and fully own/understand.
One of the things nobody liked to talk about in public when test automation was slowly "replacing" testers is that if you had the testers write automation, they brought none of the engineering discipline we tend to take as a given to the problem.
It's hard to make tests maintainable. Doubly so if you aren't already versed in techniques to make code maintainable.
I wonder sometimes if we aren't repeating the same experiment with ops right now.
There are elements of our company that want to move to Kubernetes for no real reason other than it's Kubernetes. I can't wait to see the look on their faces when they realise we'll have to employ someone full-time to manage our stack.
Do you have a recommended tutorial for an engineer with a backend background to set up a simple k8s infra on EC2? I am interested in understanding the devops role better.
I worked in networking for the longest time. When I started there were network guys and server guys (at least where I was). They were different people who did different things and who kinda worked together.
Then there were storage area networks and similar, networks really FOR the server and storage guys.... that kind of extended the server world over some of the network.
Then comes VMware and such things and now there was a network in a box somewhere that was entirely the server guy's deal (well except when we had to help them... always).
Then we also had load balancers, which in their own way were a sort of code for networks... depending on how you looked at it (open ticket #11111 of 'please stop hard coding ip addresses').
You also had a lot of software defined networking type things and so forth brewing up in dozens of different ways.
Granted these descriptions are not exact; there were ebbs and flows and some tech that sort of did this (or tried) all along. It all starts to evolve slowly into one entity.
Conversely I think the trend of writing infrastructure as yaml may be the worst part of modern ops. It’s really hard to think of a worse language for this.
Fortunately YAML is not actually necessary in k8s, and only provided as a convenience because writing JSON by hand (or Proto3, lol) is annoying verging on insane.
We can build higher level abstractions easily having a schema to target and we can build them in whatever we want. That's a big boon for me :)
For Kubernetes that's absolutely true and I think more people should do that (disclaimer: I work with this https://github.com/stripe/skycfg daily). I think it actually would be easier for people to understand the k8s system if they did.
But YAML is now everywhere in the ops space. Config management systems use it, metrics systems use it; it's the de facto configuration format right now, and that is unfortunate because it's bad.
Its typing is too weak. Its block scoping is dangerous. References, classes and attachments are all pretty bad for reuse. Schemas are bolted on and there is no standard query language for it.
> I can and have repointed my entire infrastructure with minimal fuss.
When you get to that blog post please consider going in depth on this. Would love to see actual battletested information vs. the usual handwavy "it works everywhere".
I sure will. 99% of the work was ingress handling and SSL cert generation. Everything else was fairly seamless.
Even ingress is trivial if you use a cloud balancer per ingress. But I wanted to save money so use a single cloud balancer for multiple ingresses. So you need something like ingress-nginx, which has a few vendor-specific subtleties.
I've been using Nomad for my "toy" network, and I like it. It runs many services, and a few periodic jobs. Lightweight, easy to set up, and has enough depth to handle some of the weirder stuff.
Nomad, in its free offering, cannot compete with k8s for organization-wide usage:
- no RBAC
- no quotas
- no preemption
- no namespacing
This means: everyone is root on the cluster, including any CI/CD system that wants to test/update code. And there's no way to contain runaway processes with quotas/preemption.
Can it handle networking (including load balancing and reverse proxies with automatic TLS) or virtualized persistent storage? Does it make it easy to integrate a common logging system?
Cause those are the parts that I miss probably the most when dealing with non-k8s deployment, and I haven't had the occasion to use Nomad.
For load balancing you can just run one of the common LB solutions (nginx, haproxy, Traefik) and pick up the services from the Consul service catalog. Traefik makes it quite nice since it integrates with LetsEncrypt and you can setup the routing with tags in your Nomad jobs: https://learn.hashicorp.com/nomad/load-balancing/traefik
What Nomad doesn’t do is setup a cloud provider load balancer for you.
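To sketch the Traefik route (assuming Traefik 2.x with its Consul catalog provider; the email and paths here are made up): Traefik discovers Nomad-registered services from Consul and picks up routing rules from tags on the Nomad job, with Let's Encrypt handled by the ACME resolver.

    # traefik.yml (static configuration)
    entryPoints:
      web:
        address: ":80"
      websecure:
        address: ":443"
    providers:
      consulCatalog:
        exposedByDefault: false     # only route services explicitly tagged for Traefik
    certificatesResolvers:
      letsencrypt:
        acme:
          email: ops@example.com
          storage: /acme.json
          httpChallenge:
            entryPoint: web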
Those are both the advantage and the problem of Nomad. We're using it a lot by now.
Nomad, or rather, a Nomad/Consul/Vault stack doesn't have these things included. You need to go and pick a consul-aware loadbalancer like traefik, figure out a CSI volume provider or a consul-aware database clustering like postgres with patroni, think about logging sidecars or logging instances on container hosts. Lots of fiddly, fiddly things to figure out from an operative perspective until you have a platform your development can just use. Certainly less of an out-of-the-box experience than K8.
However, I would like to mention that K8 can be an evil half-truth. "Just self-hosting a K8 cluster" basically means doing all of the shit above, except it's "just self-hosting k8". Nomad allows you to delay certain choices and implementations, or glue together existing infrastructure.
I count "glue it together with existing infrastructure" as a higher cost than doing it from scratch. It was one feature of Nomad I definitely knew about, as one or two people who used it did chime in years ago in discussion, but for various reasons that might not be applicable to everyone, I consider those an unnecessary complication :)
Depending on how big the infrastructure is and how long you want the migration to take... usually there aren't enough resources to "redo it all from scratch": millions of LoC are already in production, the people who owned key services are no longer in the company, and the business has other priorities than getting what you already have working in k8s.
The context was rather different (a home setup), but everything you mention can be used as an argument both for and against a redo, based on the situation in the company, future needs, etc.
I have actually done a "lift and shift" where we moved code that had no support, or outright antagonistic support, to k8s, because various problems reached the point where the CEO said "replace the old vendor completely" - we ended up using k8s to wrestle with the amount of code to redeploy.
- Yes, just added CSI plugin support. Previously had ephemeral_disk and host_volume configuration options, as well as the ability to use docker storage plugins (portworx)
- I haven’t personally played with it, but apparently nomad does export some metrics, and they’re working on making it better
nomad is strictly a job scheduler. If you want networking you add consul to it and they integrate nicely. Logging is handled similarly to Kubernetes.
The cool thing about Nomad is that it's less prescriptive.
With "Swarm", do you mean Docker Swarm? Why has it "fizzled"?
The way I learned it in Bret Fisher's Udemy course, Swarm is very much relevant, and will be supported indefinitely. It seems to be a much simpler version of Kubernetes. It has both composition in YAML files (i.e. all your containers together) and the distribution over nodes. What else do you need before you hit corporation-scale requirements?
I use Swarm in production and am learning k8s as fast as possible because of how bad Swarm is:
1. Swarm is dead in the water. No big releases/development afaik recently
2. Swarm for me has been a disaster because after a couple of days some of my nodes slowly start failing (although they’re perfectly normal) and I have to manually remove each node from the swarm, join them, and start everything up again. I think this might be because of some WireGuard incompatibility, but the strange thing is that it works for a week sometimes and other times just a few hours
To add another side, I use Swarm in production and continue to do so because of how good it is.
I've had clusters running for years without issue. I've even used it for packaging B2B software, where customers use it both in cloud and on-prem - no issues whatsoever.
I've looked at k8s a few times, but it's vastly more complex than Swarm (which is basically Docker Compose with cluster support), and would add nothing for my use case.
I'm sure a lot of people need the functionality that k8s brings, but I'm also sure that many would be better suited to Swarm.
This happened to me too when I was using swarm in 2017. Had to debug swarm networks where nodes could send packets one way but not the other. Similar problems as #2 where stuff just breaks and resetting the node is the quickest way I found to fix it.
Switched to k8s in late 2017 and it’s been much more solid. And that’s where the world has moved, so I’m not sure why you’d choose swarm anymore.
> What else do you need before you hit corporation-scale requirements?
Cronjobs, configmaps, and dynamically allocated persistent volumes have been big ones for our small corporation. Access control also, but I'm less aware of the details here, other than that our ops is happier to hand out credentials with limited access, which was somehow much more difficult with swarm
Swarm has frankly also been buggy. "Dead" but still running containers - sometimes visible to swarm, sometimes only the local Docker daemon - happen every 1-2 months, and it takes forever to figure out what's going on each time.
A little off topic, but why do these orchestration tools always prefer YAML? I really feel it's a harder format to understand than JSON, or better yet TOML, and it's the only thing that I don't like about Ansible.
I see Ruby kind of uses YAML often, but are people comfortable editing YAML files? I always have to look up how to do arrays and such when I edit them once in a while.
Anything new needs a certain amount of sustained practice before you get the hang of it. I think I had to learn regex like four times before it stuck. I haven’t hit that point with TOML yet so I avoid it.
I’d suggest using the deeper indentation style where hyphens for arrays are also indented two spaces under the parent element. Like anything use a linter that enforces unambiguous indentation.
I prefer YAML for human-writeable config because JSON is just more typing and more finicky. The auto-typing of numbers and booleans in YAML is a pretty damn sharp edge though and I wish they’d solved that some other way.
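To make both points concrete (the values below are made up): the indented-hyphen style keeps nesting unambiguous, and the auto-typing issue means unquoted scalars can silently change type under YAML 1.1 parsers.

    # hyphens indented two spaces under the parent key
    servers:
      - name: web-1
        region: eu-west
    # the sharp edge: these look like strings but aren't
    country: NO          # parsed as boolean false by YAML 1.1 parsers
    version: 1.20        # parsed as the float 1.2
    country_quoted: "NO" # quoting keeps it a string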
That’d be a great first step if the purpose is to learn Kubernetes. If, however, you want to set up a cluster for real use then you will need much more than bare bones Kubernetes (something that solves networking, monitoring, logging, security, backups and more) so consider using a distribution or a managed cloud service instead.
Setting up your own k8s from scratch is kind of like writing your own string class in C++: it’s a good exercise (if it’s valuable for your learning path) but you probably don’t want to use it for actual work.
Maintaining a cluster set up like that is a ton of work. And if you don’t perform an upgrade perfectly, you’ll have downtime. Tools like kops help a lot but you’ll still spend far more time than the $70/month it costs for a managed cluster.
I find that K3s is great for getting started. It has Traefik included and it's less of a learning curve to actually be productive, vs diving in with K8s and having to figure out way more pieces.
I've just started to look into it, but it seems like the project has been focusing on improving the onboarding experience since it has a reputation for being a huge pain to set up. Do you think it has gotten easier lately?
No. Not easier in my opinion. And some of the fires you only learn after getting burnt badly. [1]
Note: my experience was all with cloud-provided Kubernetes, never running my own. So it was already an order of magnitude easier. Can't even imagine rolling my own. [2]
Out of curiosity why do you feel swarm fizzled out?
I’ve deployed Swarm in a home lab and found it really simple to work with, and enjoyable to use. I haven’t tried k8s, but I often see viewpoints like yours stating that k8s is vastly superior.
According to the article you are wrong about "infrastructure as code". Kubernetes is infrastructure as data, specifically YAML files. Puppet and Chef are infrastructure as code.
Edit: not sure why the down votes, I was just trying to point out what seems like a big distinction that the article is trying to make.
Kubernetes manifests don't do loops or recursion. They don't even do iterative steps in the way that Ansible YAML has plays/tasks.
Yes, higher-level tools like Kustomize or Jsonnet or whatever else you use for templating the files are Turing-complete - but that's at the level of you on your machine generating input to Kubernetes, not at the level of Kubernetes itself. That's a valuable distinction - it means you can't have a Kubernetes manifest get halfway through and fail the way that you can have an Ansible playbook get halfway through and fail; there's no "halfway." If something fails halfway through your Jsonnet, it fails in template expansion without actually doing anything to your infrastructure.
(You can, of course, have it run out of resources or hit quota issues partway through deploying some manifest, but there's no ordering constraint - it won't refuse to run the "rest" of the "steps" because an "earlier step" failed, there's no such thing. You can address the issue, and Kubernetes will resume trying to shape reality to match your manifest just as if some hardware failed at runtime and you were recovering, or whatever.)
I think you could think of "infrastructure as code" as he described it as a superset of "infrastructure as data". Both have the benefit of being able to be reproducibly checked into a repo. Declarative systems like Kubernetes/"infrastructure as data" just go even further in de-emphasizing the state of the servers and make it harder to get yourself into unreproducible situations.
Do you have a good tutorial for doing Django or a standard 3 tier web app on Kubernetes? We are using kubernetes at my workspace, but it seems way too complicated to consider for something like that. Maybe if I can bridge the gap between architectures it will help.
Nothing. We use Terraform to provision a simple auto-scaling cluster with load balancers and certs; it does exactly the same thing but there is no Docker and no k8s. A few million fewer lines of Go code turning YAML files into segfaults.
Consistency and standardized interfaces for AppOps regardless of the hyper-cloud I use. Kubernetes basically has an equivalent learning curve, but you only have to do it once
They operate at different layers. K8s sits on top of the infrastructure which terraform provisions. It's far more dynamic and operates at runtime, compared to terraform which you execute ad-hoc from an imperative tool (and so only makes sense for the low level things that don't change often).
EKS, GKE and the like have a number of limitations. For example: they can be pretty far behind in the version of K8S they support (GKE is at 1.15 currently, EKS at 1.16; K8S 1.18 was released at the end of March this year).
The simple answer is that Kubernetes isn't really any of the things it's been described as. What it /is/, though, is an operating system for the Cloud. It's a set of universal abstraction layers that can sit on top of and work with any IaaS provider and allows you to build and deploy applications using infrastructure-as-code concepts through a standardized and approachable API.
Most companies who were late on the Cloud hype cycle (which is quite a lot of F100s) got to see second-hand how using all the nice SaaS/PaaS offerings from major cloud providers puts you over a barrel, and they don't have any interest in being the next victim. It's coming at the same time that these very same companies are looking to eliminate expensive commercially licensed proprietary software and revamp their ancient monolithic applications into modern microservices. The culmination of these factors is a major facet of the growth of Kubernetes in the Enterprise.
It's not just hype, it has a very specific purpose which it serves in these organizations with easily demonstrated ROI, and it works. There /are/ a lot of organizations jumping on the bandwagon and cargo-culting because they don't know any better, but there are definitely use cases where Kubernetes shines.
I think this is a good answer. I'll add that as soon as you need to do something slightly more complex, without something like k8s you aren't going to be happy with your life. With k8s, it's almost a 1 liner. For example, adding a load balancer or a network volume or nginx or an SSL cert or auto scaling or...or...or...
> What? Setting up a load balancer or nginx is considered complex now?
It's not complex to set up a load balancer in a given specific environment. But it's another kind of ask to say "set up a load balancer, but also make it so that the load balancer also exists in future dev environments that can be auto-set-up and auto-teared-down. And also make it so that load balancer will work on dev laptops, AWS, Azure, google, our private integration test cluster on site, and on our locally-hosted training environment, with the same configuration script." All of these things can be done in k8s, and basically are by default when you add your load balancer in k8s. They can be done other ways, too, or just ignored and not done, also. But k8s offers a standardized way to approach these kinds of things.
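For a sense of what that standardization buys you: the manifest below asks for a load balancer in the same way whether the cluster runs on AWS, GCP, Azure, or something like MetalLB on bare metal. The name and ports are hypothetical, and in practice only cloud-specific annotations would differ.

    apiVersion: v1
    kind: Service
    metadata:
      name: web
    spec:
      type: LoadBalancer      # the cluster's cloud integration provisions the actual LB
      selector:
        app: web
      ports:
      - port: 443
        targetPort: 8443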
> In my mind standardisation is how we really solve problems in much of software development.
I've been having this thought very often lately.
The only way for humans to do something faster is to use a machine. Any machine is built on some assumption that something is repeatedly true, that some things can be repeatedly interacted with in the same way.
Finding true invariants is very hard, but our world is increasingly malleable. Over time it is getting easier to invent new invariants and pad things out so that the invariant holds.
It's true not just for machines but engineering in general. Whether it's civil or mechanical or electronic or semiconductor engineering, their foundation is built on setting boundary conditions to make the natural world predictable so that it can be reliably manipulated. Things most often go wrong when those conditions are poorly understood, constrained, or modeled such as when using an unproven material, using imprecise parts, or ignoring thermal expansion when designing structural components.
Engineers have a plethora of quality control standards and centuries of built up knowledge to make this chaos manageable and the problems tractable.
I'm fine with standardization, as long as I set the standard. For standard 0, I propose that the only spelling for "standardization" will be with a 'z' (which is itself pronounced "zee").
I think the downvotes are missing the point. This is a key problem with standards: sometimes they standardize around something unreasonable. And tech is already riddled with all sorts of standards which are half-implemented in n different ways.
I think a better approach would be to have a specification for more robust negotiation protocols. When I see "standardisation," I already know that this means the same thing as "standardization" and furthermore that I should expect to see "colour"/"honour," organizations referred to in plural, "from today" rather than "beginning today" or "starting today," and even "jumpers" over "sweaters," "lorries" over "trucks," "biscuits" over "cookies," and more interrogative sentences in conversation. A British English speaker likely does the same process in reverse.
Perhaps I should clarify that "demanding American spelling is unreasonable" was specific to the context of HN (or other open discussion forums with international readership).
Within say, the volunteer-maintained documentation of MDN, the tradeoffs are quite different. There, ease of reading for reference by busy coders is much more valuable relative to ease of typing up a new contribution. Frequent switches between "color" and "colour" become a time-wasting distraction.
MDN should pick a standard and insist on it. And if Ubuntu chooses to require British spelling throughout, I'd say that's good.
Some of those requirements are far-fetched. Multiple cloud environments AND on-prem?
Ansible and Vagrant are not perfect, but I think they are far simpler than a single node k8s instance, and more representative of an actual production environment.
I’ve seen my company go multi cloud provider just to appeal to a single client. Now we’ll need multi cloud, multi continent setup to handle European clients. And I’m sure in another 2 years we’ll need our whole stack in China to support another clients requirements.
This is not my strength in any way, but hearing from those teams, Kubernetes will be a godsend
Setting up one is easy. Setting up one that gives multiple separate teams the ability to configure their services, and apply those changes to servers around the world and in different environments, repeatedly and safely, is harder.
We just spent months at my workplace working on a system to reliably define and configure a set of parallel silo'ed integrated datastores, services, and network stack within Kubernetes/ISTIO (and AWS), and to reliably upgrade new software revisions within those silos and to account for the changing "shape" of the configuration/content in these silos. It's repeatable and safe now, but it took a lot of effort.
There's a huge difference between manually setting up a load balancer - let's say HAProxy - and being able to just declare in an application that it needs "this and this configuration to route HTTP traffic to it."
The time I spent managing HAproxy for 5 services was bigger than the time I spent managing load-balancing and routing using k8s for >70 applications that together required >1000 load balanced entrypoints.
It's a lever for the sysadmin to spend less time on unnecessary work.
Anyone can ssh into a server and apt-get haproxy and tweak the configuration and get it "working" where the definition of working is accepting and routing traffic. But that's just a hobby setup. When people say setting up a load balancer is complex they are talking about professional setups, not a one off software install on a single server.
But I want to be able to update my haproxy config with a git push, and roll it back with a single command, without sshing into anything, if something goes wrong. I want my everyday administration to be simple. Not the initial setup.
Now set it up in 30 data centers around the world, with the ability for dozens of different teams to add and change their applications, and across multiple staging and QA environments.
My what and what? :P I do get your point, it is "easy" for some definition of such, but to be fair, k8s would automatically put the ip and port in for my part of it all at least.
You can use something like Dokku for most of what you cited, with much less overhead, if you're sure you're not planning to use the "highly available" part.
Just running docker-compose on load-balanced machines is pretty close to having all the k8s features (that would give you an endpoint, scaling, running pods [containers], heartbeats and nodes [VMs]). If you run Kubernetes on GCP you will see it's just a wrapper around GCP VMs, load balancers, instance groups and disks. E.g. GCP k8s autoscaling for the nodes isn't any better than just simple GCP load balancers and instance groups (it literally is the same thing). k8s' best feature (only?): specify YAML files to declare the setup. That is great! But you make edits to this 4 times a year - that is a ton of complexity for those 4 git commits.
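For comparison, roughly what that per-VM setup looks like as a compose file (the image, port and health endpoint are made up), with the instance group and cloud load balancer doing the scaling and routing outside of it:

    version: "3.8"
    services:
      web:
        image: registry.example.com/web:1.2.3
        restart: always
        ports:
          - "80:8080"
        healthcheck:                 # the cloud LB health check can hit the same endpoint
          test: ["CMD", "curl", "-f", "http://localhost:8080/healthz"]
          interval: 30s
          retries: 3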
It’s pretty close to get it up and running. It’s not close to operating. You have to roll your own load balancer, health checks, method of updating and managing the underlying nodes, you don’t have operators or guarantees about the control plane, etc...
Spin up a K8S in any major cloud provider and you get all this with a consistent API, which is where the value lies.
You have to do all those things, minus the load balancer, with Kubernetes too. It’s only if you don’t run on-premise Kubernetes that you really start to see the benefits.
Honestly, as an ops guy, I would prefer to get up at 3AM to deal with a failed VM or load balancer than to deal with any kind of failure in a Kubernetes cluster at 10AM.
I can understand wanting to be able to deploy to Kubernetes, it’s extremely flexible and relatively easy. But managing and debugging Kubernetes is still a nightmare, even just monitoring it correctly isn’t exactly easy.
With Kubernetes, you can run multiple apps in the same cluster and let Kubernetes manage the capacity on GCP for you. The nice thing is you have shared headroom for all your apps versus an app per VM in your setup (I think?).
What is wrong with actually using Google’s hosted Kubernetes that would make you want to run Compose yourself in their VMs and set up auto scaling, physical machine upgrades, etc.?
Dokku is magical. It blows my gob whenever I use it. It’s the best parts of Docker and Heroku together, and I can actually control everything that goes with my app.
Integrate it with your CI/CD pipeline. I have a Gitlab pipeline that runs kustomize and pipes the output to kubectl for every deploy to do things like setting secrets, image tags, creating ConfigMaps from config files in the repository, etc.
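A hedged sketch of that kind of pipeline job, as .gitlab-ci.yml (the overlay path and image name are placeholders, and it assumes the runner already has cluster credentials):

    deploy:
      stage: deploy
      image: registry.example.com/ci/kubectl-kustomize:latest   # hypothetical image with kubectl + kustomize
      script:
        - cd overlays/production
        - kustomize edit set image app=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
        - kustomize build . | kubectl apply -f -
      only:
        - master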
GKE uses other GCP products, but ”k8's autoscaling for the nodes isn't any better than just simple GCP load balancers and instance groups (it literally is the same thing)” isn't entirely accurate - GKE and K8S have logic that manages node pools that you won't be able to use with just instance groups.
OS for the cloud is exactly what it is. I see AWS, Azure and GCP as OEMs for the cloud, just like Samsung, Oppo, Motorola, etc. are OEMs for smartphones. Android was the open source abstraction across these devices. K8s is the open source abstraction across clouds.
The meaning of "app" on top of these two operating system abstractions is entirely different and the comparison probably doesn't extend beyond this. From a computing stack standpoint though, it makes sense.
AWS and other cloud providers provide far more reliable and simpler features/tooling than k8s. For teams that are serious about building services on the cloud, having k8s take over certain areas of orchestration makes sense, but the "operating system" K8s provides is just a tiny piece of the overall infrastructure.
Yes, people ought to do a side by side comparison of a new user learning to K8S v AWS v GCP before claiming Kubernetes adds more complexity than it returns in benefits.
Remember the first time you saw the AWS console? And the last time?
Because it is hard to manage the configuration. It's why tools like terraform exist.
Anecdote. I worked for a small company that was later acquired. It turned out one of the long time employees had set up the company's AWS account using his own Amazon account. Bad on its own. We built out the infra in AWS. A lot of it was "click-ops". There was no configuration management. Not even CloudFormation (which is not all that great in my opinion). Acquiring company realizes mistake after the fact. Asks employee to turn over account. Employee declines. Acquiring company bites the bullet and shells out a five figure sum to employee to "buy" his account. Could have been avoided with some form of config management.
> Acquiring company realizes mistake after the fact. Asks employee to turn over account. Employee declines. Acquiring company bites the bullet and shells out a five figure sum to employee to "buy" his account. Could have been avoided with some form of config management.
That is completely the wrong lesson from this anecdote.
1) The acquiring company didn't do proper due diligence. Sorry, this is diligence 101--where are the accounts and who has the keys?
2) Click-Ops is FINE. In a startup, you do what you need now and the future can go to hell because the company may be bankrupt tomorrow. You fix your infra when you need to in a startup.
3) Long-time employee seemed to have exactly the right amount of paranoia regarding his bosses. The fact that the buyout appears to have killed his job and paid so little that he was willing to torch his reputation and risk legal action for merely five figures says something.
Makes sense.
This is a classic example of bad operations management leading to an unexpected outcome. It could have been solved much more easily with proper configuration management.
Sounds like the exact nightmare a previous employer was living. AWS' (awful) web UI convinces the faint of heart to click through wizards for everything. If you're not using version control for _anything_ related to your infrastructure...you have my thoughts and prayers.
Pretty much. The lesson learned for me was to always have version control for the complete stack including the infra for the stack. I like terraform for this. Terragrunt at least solves the issue of modules for terraform reducing the verbosity. Assume things could go wrong and you will need to redeploy EVERYTHING. I've been there.
Migration of resources taking live traffic is not an easy thing. Before migrating the traffic over to a different endpoint, the time and the engineering work to make sure the new endpoint works is money and cost. Also, there could be stateful data in the original account, and doing a live migration of data with new data coming in at X TPS is absolutely hard work.
> Besides, personally I find AWS console much easier to understand. I don't get why people hate it.
The console is fine as a learning tool for deployment/management, and for occasional experimentation, monitoring, and troubleshooting, but any IaC tool is vastly more manageable for non-toy deployments where you need repeatability and consistency and/or the ability to manage more than a very small number of resources.
Google can’t migrate if the underlying hardware fails quickly enough.
I don’t think AWS has talked about live migration, but given the stability of their VMs and the rarity of “we need to restart it” notices, it seems like they have something.
I've "experienced" a hard drive failure on both platforms.
On AWS, I was getting pagerduty'd because the solo gateway box was down; we couldn't ssh in, so after an hour of no progress with debugging or support, we just hit the reset button and hoped for the best. Fortunately this worked and later we were told there had been a disk failure.
On GCP, I didn't even know and only discovered it in the logs when I was looking for other audit reasons. Turns out your long-running Google VMs are being migrated all of the time and you have no idea. They actually have a policy / SLA around it, basically saying they refresh their entire fleet of servers every 6 weeks IIRC. Honestly, if AWS is not doing something like this, I'd have increased concerns about leaky security neighbors (i.e. someone who has a VM running for multiple years without software updates. Hopefully you should be protected on shared servers, but it is software after all).
The console is supposed to just be a web UI to quickly make a change or explore a feature. Any high quality engineering team should avoid making changes through the console because it's not testable and repeatable.
For code changes, use CodeDeploy or containers through ECS. For configuration and infrastructure changes, CloudFormation should be the right tool.
How do you view all the VMs in a project across the globe at the same time?
Do you need to manage keys when ssh'n into a VM?
Do you know what the purpose of all the products are? If you don't know one, are you able to at least have an idea what it's for without going to documentation?
They have also directly opposed many efforts for Kubernetes, even to their own customers, until they realized they couldn't win. Only then did they cave, and they are really doing the bare minimum. The most significant contribution to OSS they have made was a big middle finger to Elasticsearch...
Of course everyone's experience is different, but in my case...
> How do you view all the VMs in a project across the globe at the same time?
I'm not sure what it's got to do with k8s? I can't see jobs that belong to different k8s clusters at the same time, either.
> Do you need to manage keys when ssh'n into a VM?
Well, in k8s everybody who has access to the cluster can "ssh" into each pod as root and do whatever they want, or at least that's how I've seen it, but I'm not sure it's an improvement.
> Do you know what the purpose of all the products are? If you don't know one, are you able to at least have an idea what it's for without going to documentation?
Man, if I got a dime every time someone asked "Does anyone know who owns this kubernetes job?", I'll have... hmm maybe a dollar or two...
Of course k8s can be properly managed, but IMHO, whether it is properly managed is orthogonal to whether it's k8s or vanilla AWS.
If you're running pods as root, you're doing it wrong. That was a no-no with docker, and it's still a no-no for kubernetes. People still run non-containerized services as root too...
This is getting off-topic, but I didn't understand the rationale behind that. Processes running inside docker/k8s are already isolated, so unless you're running something potentially malicious, why would it matter if it's root or not?
(Of course, if you're running untrusted user code, then you'll need every protection you can muster, but I'm talking about running an internally developed application. If you can't trust that, you already have a bigger problem.)
If the container is running as root, and you escape the container, you are root on the host.
Containers share the kernel with the host, and are only as isolated as the uid the process in the container runs as and the privileges you grant that container.
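Which is why it's worth making non-root the default in the pod spec itself. A minimal sketch (the name, image and UID are hypothetical):

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-app
    spec:
      securityContext:
        runAsNonRoot: true          # kubelet refuses to run the container as UID 0
        runAsUser: 10001
      containers:
      - name: my-app
        image: registry.example.com/my-app:1.2.3
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop: ["ALL"]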
I've spent in total a tenth as much time learning k8s and related systems, as I have spent on AWS.
Most situations I have a direct comparison, k8s takes less ops. Often thanks to helm.
The AWS console is designed for lock-in. I could use configuration management for AWS too, but the time required to go through their way of doing X is just not worth it - unless I want to become an AWS solutions architect consultant.
Sure, the most asinine thing (borrowing from Rob Pike) is to have a system where invisible characters define the scope and semantics of what you are writing. Now Helm takes this one step further (and I went one beyond that before saying "no more" to myself and discovering https://cuelang.org) and starts using text interpolation with helpers for managing indentation in this invisibly scoped language. I hacked in imports, but was like, OK, I'm making this worse.
So there's this problem and a number of experiments are going on. One camp has the idea of wrapping data/config in more code. These are your Pulumi and Darklang like systems. Then there is another camp that says you should wrap code in data and move away from programming, recursion, and Turing completeness. This seems like the right way to me for a lot of reasons, both technical and human-centric.
I've pivoted my company (https://github.com/hofstadter-io/Hof) to be around and powered by Cue. Of the logical camp, it is by far going to be the best and comes from a very successful lineage. I'm blown away by it like when I found Go and k8s.
Recently migrated my company's k8s product to Cue and it's pure bliss.
Configuration should be data, not code. Cue has just the right amount of expressivity - anything more complex shouldn't be done at the configuration layer, but in the application or a separate operator.
There is still configuration, there has to be, you've just wrapped it so much it's not visible anymore (which is even worse than Pulumi, at least they are using an existing language). You still have to express (and write) the same information...
Darklang is solidly in the Pulumi camp, that's where outsiders put it. (I have seen the insides without beta / your demo, someone with a beta account showed me around a bit)
The real problem with Darklang is they have their own custom language and IDE. What exactly are you trying to solve?
> Remember the first time you saw the AWS console? And the last time?
There was a time in between for me - that was Rightscale.
For me, the real thing that k8s bring is not hardware-infra - but reliable ops automation.
Rightscale was the first place where I encountered scripted ops steps and my current view on k8s is that it is a massively superior operational automation framework.
The SRE teams which used Rightscale at my last job used to have "buttons to press for things", which roughly translated to "If the primary node fails, first promote the secondary, then get a new EC2 box, format it, install software, setup certificates, assign an elastic IP, configure it to be exactly like the previous secondary, then tie together replication and notify the consistent hashing."
The value was in the automation of the steps in about 4 domains - monitoring, node allocation, package installation and configuration realignment.
The Nagios, Puppet and Zookeeper combos for this were a complete pain & the complexity of k8s is that it is a "second system" from that problem space. The complexity was always there, but now the complexity is in the reactive ops code, which is the final resting place for it (unless you make your arch simpler).
> The SRE teams which used Rightscale at my last job used to have "buttons to press for things", which roughly translated to "If the primary node fails, first promote the secondary, then get a new EC2 box, format it, install software, setup certificates, assign an elastic IP, configure it to be exactly like the previous secondary, then tie together replication and notify the consistent hashing."
If I understand this correctly, all of the things could have been automated in AWS fairly easily.
"If the primary node fails" Health check from EC2 or ELB.
"get a new EC2 box" ASG will replace host if it fails health check.
"format it" The AMI should do it.
"install software, setup certificates" Userdata, or Cloud-init.
"assign an elastic IP, configure it to be exactly like the previous secondary, then tie together replication and notify the consistent hashing" This could be orchestrated by some kind of SWF workflow if it takes a long time or just some lambda function if it's within a few mins.
Ansible.
You can keep your YAML and deploy actual virtual servers on your cloud provider.
Kubernetes is an introvert and this doesn't correspond to anything but its padded cell walls.
In Ansible it's an extrovert, an exoskeleton.
Kubernetes insides out makes it right again.
Make sense?
No, not following. I understand how master-based provisioning management systems had their time and place, but we've largely moved beyond that to baking images, whether containers or VMs. Running a master-based system comes with a whole host of other issues. Ansible is now relegated to being a better system than bash for installing packages and configuring the baked image. The time of booting a vanilla instance and then installing software when a scaling event happens is over.
By the way, what does ansible do to help with scaling applications?
Why do a comparison? K8S runs on AWS and GCP. They have managed services for setting up one. If you know K8S as a developer, then you simply consume the cloud K8S cluster.
I think the point is that there are people that claim that k8s adds a ton of complexity to your environment. But if you compare k8s alone with managing your infrastructure using (non-k8s) AWS or GCP primitives, you'll find that the complexity is similar.
The problem is that what the nodes in AWS are doing is fairly transparent. When my kubernetes pod does not come up, it’s always a hell of a pain figuring out why from just the events that kubernetes is giving me.
While that's true on the managing instances side, you also need to actually deploy the infrastructure to manage them (If you're not using some PaaS offering). You don't need to do this for other IaaS.
Honestly the last time I looked at k8s was like 5 years ago, but back then it looked like a pretty big pita to admin.
The last 5 years have been transformative for both cloud native development and also open source software
It is a completely different world that stretches far beyond Kubernetes, though I attribute much of the change to what has happened from / around k8s -> cncf
It's so easy, I can launch production-level clusters in 15 minutes with four keystrokes and make backups and restore to new ephemeral clusters with a few more simple commands.
> but back then it looked like a pretty big pita to admin
- well, it's also a pita to update services without downtime.
- and it sucks to update operating systems without downtime.
- sometimes you reinvent the wheel, when you add another service or even a new website
however with k8s everything above is kinda the same, define a yaml file, apply it, it works.
and also k8s itself can be managed via ansible/k3s/kops/gke/kubeadm/etc...
it's way easier to create a cluster and manage it.
That's exactly the point. You avoid lock-in to AWS or GCP by running on K8S instead. K8S becomes the "operating system": a standardized abstraction over different hardware.
Isn’t a “standardized abstraction over different hardware” just... an actual operating system? Isn’t “the operating system of the cloud” just... the actual operating system running on your cloud VMs? If you script a deployment environment atop vanilla Linux distros (e.g. ansible), you also avoid public cloud lock-in. (Side benefit: you also avoid container engine lock-in, and container layer complexity!)
Containers are a standard abstraction over the operating system, not over the hardware (or the VM, even). This has its use cases, but making it “the standard” for deployment of all apps and workloads is just bananas, in my view.
Kubernetes, when viewed as the OS for the data center, controls, manages, and allocates a pool of shared resources. When I install and run an application on my laptop, there are a ton of details I don't care about that just happen magically. Kubernetes maps this idea onto the resources and applications of the cloud.
Again, Kubernetes is far more than just deploying, running, and scaling an application. It allows so many problems to be solved at the system level, outside of an application and the developer's awareness.
Take for example restricting base images at your organization. With Kubernetes, SecOps can install an application which scans all incoming jobs and either rejects them or, in more sophisticated setups, hot-swaps the base image.
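One way that kind of policy gets wired in is through admission control. A hedged sketch (assuming a 1.16+ cluster; the image-policy webhook service itself is a hypothetical in-house component):

    apiVersion: admissionregistration.k8s.io/v1
    kind: ValidatingWebhookConfiguration
    metadata:
      name: allowed-base-images
    webhooks:
    - name: images.policy.example.com
      admissionReviewVersions: ["v1"]
      sideEffects: None
      rules:                             # send every pod creation to the webhook for review
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
      clientConfig:
        service:
          namespace: secops
          name: image-policy-webhook     # hypothetical service that approves or rejects images
          path: /validate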
What if instead they revamped their ancient microservices into modern monolithic applications?
I'm sorry, but I can't stand this kind of bullshit.
You cannot possibly take two random things, put 'modern' in front of one word and 'ancient' in front of another, to justify changing things.
The problem with Kubernetes is probably that people started drinking the microservices koolaid and now need complex solutions to deploy their software, which became more complex when they adopted a microservices architecture.
IaaS is not "the cloud". It was in 2008, when all we had was EC2 and RDS.
Today Kubernetes is the antithesis of the cloud -
Instead of consuming resources on demand you're launching VMs that need to run 24/7 and have specific roles and names like "master-1".
Might as well rent bare-metal servers. It will cost you less.
You are completely wrong. It is trivial to have your k8s control plane and your workload cluster in something like an ASG, and basically know nothing about the nodes in either group. I'm not even using GKE or EKS, just ASGs on AWS. My servers and applications are cattle, not pets.
I knew literally nothing about k8s in September and now I have multiple clusters humming along, treating the worker cluster nodes as a generic pool of compute, autoscaling the cluster as well as the pods inside it. Upgrading is a breeze, I have great observability, I can deploy experiments and new applications with a single CI step or click, in fact I have nodes that are killed and get replaced for cost savings by SpotInst in the middle of the business day and I don't even need to know about it. My load balancers and even DNS are all provisioned for me and I can use the same Helm charts to create an identical staging and production environment.
Kubernetes IS the spirit of the Cloud and 12 factor apps. It's not that scary, and with tools like Rancher and k3s you can make it even simpler.
> Kubernetes IS the spirit of the Cloud and 12 factor apps.
I would argue that Kubernetes has _nothing whatsoever_ to do with the spirit of the Cloud, and that in fact "serverless" embodies the spirit of pay-as-you-use consumption models.
This tool allows you to autoscale the cluster itself with various cloud providers. There is a list of cloud providers it supports at the end of the readme.
> Might as well rent bare-metal servers. It will cost you less.
Long term, but up front costs are what make cloud services appealing.
FWIW, it's possible to minimize your idle VM costs to an extent. For example, you could use one or more autoscale groups for your cluster and keep them scaled to one vm each. Then use tools like cluster auto scaler to resize on demand as your workload grows. You are correct that idle vm costs can't be completely avoided. At least not as far as I am aware.
> > Might as well rent bare-metal servers. It will cost you less.
> Long term, but up front costs are what make cloud services appealing.
There are no up front costs. GP said rent dedicated, not buy your own metal. If there's anything in cloud, it's the many pre-written services (queue, database etc), but GP is right: if you go k8s you aren't going to use many (or any at all), so why not just go and rent cheap servers that get deployed in two minutes instead of renting expensive virtual servers which get deployed in a few seconds?
Jonathan Schwartz, last CEO of Sun Microsystems, in March 2006:
"Frankly, it’s been tough to convince the largest enterprises that a public grid represents an attractive future. Just as I’m sure George Westinghouse was confounded by the Chief Electricity Officers of the time that resisted buying power from a grid, rather than building their own internal utilities."
If that is your definition of "cloud" then most stuff running on AWS and other "cloud" providers isn't. I agree that Kubernetes and even containers aren't the end all. I think they are a stepping stone to true on demand where you have the abstraction of just sandboxed processes compiled to WASM or something run wherever.
But as of where we are now, it is a good abstraction to get there. It provides a lot of stuff like service discovery, auto-scaling and redundancy. Yes you do need to have instances to run K8s, but that is as of date the only abstraction that we have on all cloud providers, local virtualization, and bare metal. So yes it isn't true on demand "cloud" but in order to work like that you need to fit into your service provider's framework and accept limitations on container size, runtime, deal with warm up times occasionally.
We had (have) discovery, auto-scaling and redundancy in PaaS. Most apps could run just fine in Cloud Foundry/App Engine/Beanstalk/Heroku. But the devs insist on MongoDB & Jenkins instead of using the cloud-provider solution and now you're back to defining VPCs, scaling policies, storage and whatnot.
Knative does exactly that. You can use your buildpacks and everything in Tekton and configure your application via Knative Services. No need to bother about anything else.
Outdated. We were further ahead in the abstraction ladder with PaaS, and honestly most apps could run perfectly fine in Beanstalk/App Engine/Cloud Foundry/Heroku.
But then the devs demand Jenkins, Artifactory and MongoDB, and instead of using cloud-provider alternatives you're back to defining VPCs and autoscaling groups.
The buzzword mumbo-jumbo on the first paragraph alone (which isn't really even your fault or anything, just the bogus pomp inherent to k8s as a whole) is already a scarecrow to anyone that "wasn't born with the knowledge", really.
It is pretty hard to get used to it. Brushing it away won't make it approachable.
I've yet to meet anyone who can easily explain how the CNI, services, ingresses and pod network spaces all work together.
Everything is so interlinked and complicated that you need to understand vast swathes of kubernetes before you can attach any sort of complexity to the networking side.
I contrast that to its scheduling and resourcing components, which are relatively easy to explain and obvious.
Even storage is starting to move to overcomplication with CSI.
I half jokingly think K8s adoption is driven by consultants and cloud providers hoping to ensure a lock-in with the mechanics of actually deploying workloads on K8s.
Services create a private internal DNS name that points to one or more pods (which are generally managed by a Deployment unless you're doing something advanced) and may be accessed from within your cluster. Services with Type=NodePort do the same and also allocate one or more ports on each of the hosts which proxies connections to the service inside the cluster. Services with Type=LoadBalancer do the same as Type=NodePort services and also configure a cloud load balancer with a fixed IP address to point to the exposed ports on the hosts.
A single Service with Type=LoadBalancer and one Deployment may be all you need on Kubernetes if you just want all connections from the load balancer immediately forwarded directly to the service.
But if you have multiple different services/deployments that you want as accessible under different URLs on a single IP/domain, then you'll want to use Ingresses. Ingresses let you do things like map specific URL paths to different services. Then you have an IngressController which runs a webserver in your cluster and it automatically uses your Ingresses to figure out where connections for different paths should be forwarded to. An IngressController also lets you configure the webserver to do certain pre-processing on incoming connections, like applying HTTPS, before proxying to your service. (The IngressController itself will usually use a Type=LoadBalancer service so that a load balancer connects to it, and then all of the Ingresses will point to regular Services.)
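To make that concrete, here's a rough, hypothetical sketch of a Service plus an Ingress routing one path to it (the names web and example.com are made up, and on older clusters the Ingress apiVersion and backend syntax differ):

  apiVersion: v1
  kind: Service
  metadata:
    name: web                # hypothetical name
  spec:
    selector:
      app: web               # matches Pods created by a Deployment labelled app=web
    ports:
      - port: 80             # port exposed inside the cluster
        targetPort: 8080     # port the Pods actually listen on
  ---
  apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    name: web
  spec:
    rules:
      - host: example.com
        http:
          paths:
            - path: /
              pathType: Prefix
              backend:
                service:
                  name: web
                  port:
                    number: 80

The IngressController picks that up and configures its webserver accordingly; the Service itself can stay internal and never needs to be exposed directly.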
Assuming that, like us, you spent the last 10-12 years deploying IPv6 and currently run servers on IPv6-only networks, the Kubernetes/Docker network stack is just plain broken. It can be done, but you need to start thinking about stuff like BGP.
Kubernetes should have been IPv6 only, with optional IPv4 ingress controllers.
It really feels like Kubernetes was developed by some enterprise Java developers. Nothing seems well defined, everything is done in the name of abstraction, but the rules of the abstraction are never clearly stated, only the purpose is.
I really hope someone takes the mantle of Leslie Lamport (creator of the language TLA - "the quixotic attempt to overcome engineers' antipathy towards mathematics") and replaces Kubernetes with some software with a first principles approach.
An Ingress object generates an nginx.conf for an nginx server. That nginx server has an IP address with a round-robin IPVS rule. When it gets the request, it proxies to a service IP, which then round-robins to the 10.0.0.0/8 container IP.
Ingress -> service -> pod
It is all very confusing, but once you look behind the curtain it's straightforward if you know Linux networking and web servers. The cloud providers remove the requirement of needing Linux knowledge.
I don't think this is accurate which plays into the parents point, I guess.
Looking at the docs, ingress-nginx configures an upstream using endpoints, which are essentially Pod IPs, which skips Kubernetes' Service-based round-robin networking altogether.
Assuming you use an ingress controller that does route through Services instead, and assuming you're using a service proxy that uses IPVS (i.e. kube-proxy in IPVS mode), then your explanation would have been correct.
For the most part, kubernetes networking is as hard as networking plus loads of automation. Depth in both of those skills rarely comes in the same person, but if you're using a popular and/or supported CNI and not doing things like changing it in-flight, your average dev just needs to learn basic k8s debugging, such as kubectl get endpoints to check whether their service selectors are set up correctly, and curling the endpoints to check whether the pods are actually listening on those ports.
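As a hypothetical example of what that selector check is looking at (the names are made up):

  apiVersion: v1
  kind: Service
  metadata:
    name: api                # hypothetical
  spec:
    selector:
      app: api               # must match the Pod template labels of your Deployment;
                             # if it doesn't, `kubectl get endpoints api` lists no addresses
    ports:
      - port: 80
        targetPort: 8080     # must match the port the containers actually listen on;
                             # if it doesn't, curling the endpoint IPs will fail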
It's confusing because a lot of people being exposed to K8s don't necessarily know how Linux networking and web servers work. So there is a mix of terminology (services, ingress, ipvs, iptables, etc) and context that may not be understood if you didn't come from running/deploying Linux servers.
When you are coming from the old sysadmin world and have mastered Unix systems architecture and software, what k8s does is very straightforward, because it's the same stuff you already know.
K8S is extremely complicated for the huge swarm of webdevs and Java developers that really, really don't understand how the stuff they use/code actually works.
K8S was supposed to decrease the need for real sysadmins, but in my view it has actually increased the demand, because of all the obscure issues one can face in production if you don't really understand what you are doing with K8S and how it works under the hood.
I think you're right for small clusters: you end up needing more sysadmins. But I suspect a 1000-node Kubernetes cluster can be run with less administration.
I'm managing 20,000 vCPUs of infra, both k8s and plain VMs. IMHO there is no big difference there. It all depends on the tools you are using around orchestration.
In my experience having good sysadmins is still the key to best infrastructure management no matter the size of the company.
It helps going from the bottom up, IMO. It's a multi-agent blackboard system with elements of control theory, which is a mouthful, but it essentially builds from smaller blocks up.
Also, after OpenStack, the bar for "consulting-driven software" is far from reached :)
Devops/arch here. I think Kubernetes solves deployment in a standardized way, and we get a fresh, clean state with every app deploy.
Plus it restarts applications/pods that crash.
That said, I think Kubernetes may be approaching the Plateau of Productivity on the tech hype cycle. Networking in Kubernetes is complicated. This complication and abstraction has a point if you are a company at Google scale. Most shops are not Google scale and do not need that level of scalability. The network abstraction has its price in complexity when doing diagnostics.
You could solve networking differently than Kubernetes does, with IPv6. There is no need for complicated IPv4 NAT schemes; you could use native IPv6 addresses that are reachable directly from the internet. Since you have so many IPv6 addresses, you do not need routers/NATs.
Anyhow, in a few years' time some might be using something simpler, like an open-source Heroku. If you could bin-pack the services and their intercommunication onto the same nodes, there would be speed gains from not having to do network hops and instead going straight to local memory. Or something like a standardized, serverless, open-source function runner.
1) It solves many different universal, infrastructure-level problems.
2) More people are using containers. K8s helps you to manage containers.
3) It's vendor agnostic. It's easy to relocate a k8s application to a different cluster
5) People see that it's growing in popularity.
6) It's Open source.
7) It helps famous companies run large-scale systems.
8) People think that it looks good on a resume and they want to work at a well known company.
9) Once you've mastered K8s, it's easy to use on problems big and small. (Note, I'm not talking about installing and administrating the cluster. I'm talking about being a cluster user.)
10) It's controversial which means that people keep talking about it. This gives K8s mind share.
I'm not saying K8s doesn't have issues or downsides.
1) It's a pain to install and manage on your own.
2) It's a lot to learn--especially if you don't think you're gonna use most of its features.
3) While the documentation has improved a lot, it's still weak and directionless in places.
I think K8s is growing more popular because its pros strongly outweigh its cons.
(Note I tried to be unbiased on the subject, but I am a K8s fan--so much so that I wrote a video course on the subject: https://www.true-kubernetes.com/. So, take my opinions with a grain of salt.)
My question is: Why is only k8s so popular when there are better alternatives for a large swath of users? I believe the answer is "Manufactured Hype". k8s is from a purely architectural standpoint the way to go, even for smaller setups, but the concrete project is still complex enough that it requires dozens of different setup tools and will keep hordes of consultants as well as many hosted solutions from Google/AWS/etc in business for some time to come, so there's a vested interest in continuing to push it. Everyone wins, users get a solid tool (even if it's not the best for the job) and cloud providers retain their unique selling point over people setting up their own servers.
I still believe 90% of users would be better served by Nomad. And if someone says "developers want to use the most widely used tech", then I'm here to call bullshit, because the concepts between workload schedulers and orchestrators like k8s and nomad are easy enough to carry over from one side to the other. Learning either even if you end up using the other one is not a waste of time. Heck, I started out using CoreOS with fleetctl and even that taught me many valuable lessons.
I didn't say more users, I said appropriate for more users. The alternative I mentioned is Nomad and I wish more people would give it a try and decide for themselves. The momentum behind it is Hashicorp, makers of Vault, Consul, Terraform, Vagrant, all battle-proven tools. The fact that there's one big player behind it really shows in how polished the tool, UI and documentation is.
The issue that I have with managed k8s is that these products will decrease the pressure to improve k8s documentation, tooling and setup itself. And then there's folks (like me) who want or need to run something like k8s on bare metal hardware outside of a cloud where the cloud-managed solution isn't available.
I got a bit disillusioned with k8s and looked at Nomad as an alternative.
As a relatively noob sysadmin, I liked it a lot. Easy to deploy and easy to maintain. We've got a lot of mixed rented hardware + cloud VPS, and having one layer to unify them all seemed great.
Unfortunately I had a hard time convincing the org to give it a serious shot. At the crux of it, it wasn't clear what 'production ready' Nomad should look like. It seemed like Nomad is useless without Consul, and you really should use Vault to do the PKI for all of it.
It's a bit frustrating how so many of the HashiCorp products are 'in for a penny, in for a pound' type deals. I know there are _technically_ ways for you to use Nomad without Consul, but it didn't seem like the happy path, and the community support was non-existent.
Please tell me why I'm wrong lol, I really wanted to love Nomad. We are running a mix of everything and it's a nightmare.
Nomad + Consul is the happy path. Adding Vault into the mix is nice, but not required.
Consul by itself is the game-changer. Even in k8s it's a game-changer. It solves so many questions in an elegant way.
"How do I find and reach the things running in (orchestrator) with (unknown ip/random port) from (legacy)?" being the most important. You run 5 servers, and a relatively lightweight client on everything (which isn't even outright required, but it sure is useful!), and you get a _lot_ with that.
Consul provides multiple interfaces and ingress points to find everything. It also is super easy to operate, and has a pretty big community.
If you absolutely cannot have Consul, Nomad is still a really good batch job engine, and makes a very great "distributed cron," which is more extensible, scalable, and easy to use than something like Jenkins for the same task.
My team is pretty small (was 4 people, now 6) and we manage one of the world's largest Nomad and Consul clusters (there are some truly staggeringly large users of Vault, so I won't make that claim there). Even when shit really hits the fan, everything is designed in a way that stuff mostly keeps working, and there are enough operator-friendly entry points that we can always figure out the problem.
Vault is great for just a PKI, even if you aren't using it for anything else. There are some tools that just do PKI, but Vault works a real treat at it. Any Terraform backend that supports encryption + Terraform + Vault gives you such an amazing workflow. We use a mix of short and long certs, with different roles based on what's getting a cert.
For now, we have CRLs disabled on all short-lived backends and enabled on long-lived backends, and we're actually looking at not storing short-lived certs in the storage system at all, just cranking the TTL down to really, truly short. We've tested it as low as 30m, but a more real-world max-ttl is 1 week, with individual apps setting it as low as they can handle. For reference, we run more than 10 PKI backends, and adding one (or a bunch) more is just a little Terraform snippet for us.
The way it works in HashiCorp template land is that you just plop
{{ with secret "name-of-pki/issue/name-of-role" "common_name=my.allowed.fqdn" "ttl=24h" }} {{ .Data.certificate }} {{ end }}
into your Nomad template stanza, or use consul-template directly as a binary, or use vault agent with its template capability. You can get the CA chain the same way if required, just hitting a different PKI endpoint.
Also, as of Vault 1.4, Vault's internal raft backend is now production ready, making it a snap to run.
Try running through a few of the Vault quick-start guides, and replicating them in Terraform as much as possible. There's a few things TF does not handle gracefully last I checked (initial bootstrap), but you can get around that by using a null_resource or just handling that outside Terraform.
Edit: just noticed an actual Nomad user replied as well, and I like their answer better. Consider mine an addendum. :)
Batch workloads rarely require Consul, but for deploying your standard network services on Nomad: Consul is basically required. You could likely use any number of service mesh systems instead (either as sidecars, Docker network plugins, or soon CNI), but you'll be doing a lot of research and development on your own I'm afraid.
The Nomad team is by no means opposed to becoming more flexible in the future (and indeed better CNI support is landing soon as a first step), but we wanted to focus on getting one platform right and a pleasure to use before trying to genericize and modularize it.
Thanks for reaching out! Since I have the chance I'll add - Nomad is pretty awesome, and I love the work your team is doing.
My org looked at Nomad at a time when there was a lot of pressure from above to deliver something as soon as possible. Two weeks just weren't enough to get the full lay of the land ¯\_(ツ)_/¯
Funny thing is even if I could plug in my own service discovery into Nomad, I would probably chuck it away and replace it with Consul after a few weeks anyway haha
I'm sympathetic toward the idea of a system made of interchangeable parts, but I also kinda feel like it's a bit unrealistic, maybe? Even with well-defined interfaces, there will always be interop problems due to bugs or just people interpreting the interface specs differently. Every new piece to the puzzle adds another line (or several) to a testing matrix, and most projects just don't have the time and resources to do that kind of testing. It's unfortunate, but IMO understandable that there's often a well-tested happy path that everyone should use, even when theoretically things are modular and replaceable.
Nomad isn't really feature-mature or user-friendly enough; you still eventually need 100 bolt-ons.
I think a Distributed OS is the only sane solution. Build the features we need into the kernel and stop futzing around with 15 abstractions to just run an isolated process on multiple hosts.
As the Nomad Team Lead I sympathize with your first statement, but I hope our continued efforts will dissuade you from the second.
Linux (and the BSDs) are remarkably stable, featureful, and resilient operating systems. I would hate to give up such a strong foundation. Nomad can crash without affecting your running services. Nomad can be upgraded or reconfigured without affecting your running services. Nomad can be observed, developed, and debugged as a unit, often without having to consider the abstractions that sit above or below it. The right number of abstractions is a beautiful thing. Just no more and no less. :)
I'm resisting kubernetes and might go with nomad (currently I'm "just using systemd" and I get HA from the BEAM VM)... But I do also get the argument that the difference between kubernetes and nomad is that increasingly kubernetes is supported by the cloud vendors, and nomad supports the cloud vendors.
> I still believe 90% of users would be better served by Nomad.
Well sure, but if the story just ended with "everyone use the least exciting tool", then there'd be few articles for tech journals to write.
But Kubernetes promises so much, and deep down everyone subtly thinks "what if I have to scale my project?" Why settle for good enough when you could settle for "awesome"? It's just human nature to choose the most exciting thing. And given that I do agree that there's some manufactured hype around Kubernetes, it isn't surprising to me why few are talking about Nomad.
I'd say it's down to two things. First is the sheer amount of work they're putting into standardization. They just ripped out some pretty deep internal dependencies to create a new storage interface. They have an actual standards body overseen by the Linux Foundation. So I agree with the blog post there.
The second reason is also about standards, but using them more assertively. Docker had way more attention and activity until 2016 when Kubernetes published the Container Runtime Interface. By limiting the Docker features they would use, they leveled the playing field between Docker and other runtimes, making Docker much less exciting. Now, new isolation features are implemented down at the runc level and new management features tend to target Kubernetes because it works just as well with any CRI-compliant runtime. Developing for Docker feels like being locked in.
It's confusing, but Docker images (and image registries) are also an open standard that Docker implements [1].
A lot of the Kubernetes "cool kids" just run containerd instead of Docker. Docker itself also runs containerd, so when you're using Kubernetes with Docker, Kubernetes has to basically instruct Docker to set up the containers the same way it would if it were just talking to containerd directly. From a technical perspective, you're adding moving parts for no benefit.
If you use containerd in your cluster, you can then use Docker to build and push your images (from your own or a build machine), but pull and run them on your Kubernetes clusters without Docker.
Yes. The big difference, however, is that k8s removed docker from consideration when actually running the system. Yes, you have docker underneath, and are probably going to use docker to build the containers.
But deploy to k8s? There's no Docker outside of a few bits involving "how to get to the image", and the actual Docker features that are used are also minimized. The result is that many warts of Docker are completely bypassed: you don't have to deal with the impact of legacy decisions, or try to wrangle a system designed for easy use by a developer on a local machine into a complex server deployment. And, IMHO, the interfaces used by k8s for the advanced features are much, much better than the interfaces used or exported by Docker.
Yes. But I think k8s took a lot of attention away from Docker in terms of headspace and developer interest. RedHat for example is pushing CRI-O, which is a minimalist CRI-compliant runtime which lets admins focus even more on k8s and less on the whole runtime level.
It makes a bit more sense if you see Kubernetes as the new Linux: a common foundation that the industry agrees on, and that you can build other abstractions on top of. In particular Kubernetes is the Linux Kernel, while we are in the early days of discovering what the "Linux distro" equivalent is, which will make it much more friendly / usable to a wider audience
Likewise, Linux is also a confusing mess of different parts and nonsensical abstractions when you first approach it. It does take some time to understand how to use it, and in particular how to do effective troubleshooting when things aren't working the way you expect.
But I 100% agree--I think it's the new Linux. In 5-10 years, it'll be the "go to", if not sooner.
In my humble opinion, because there is so much money and marketing behind it. If you attend the OSS summit, all the cloud players are sending evangelists and the whole conference is about Kubernetes.
Then a lot of people drink the koolaid and apply it everywhere / feel they're behind if they aren't in Kubernetes.
We are not on Kubernetes and have multiple datacenters with thousands of VMs/containers. We are doing just fine with the boring consul/systemd/ansible setup we have. We also have some things running in containers, but not much.
Funnily enough, at the OSS summit I had a couple of chats with people at the big companies (AWS, Netflix, etc.) and they themselves keep the majority of their workloads in boring VMs. Just like us.
Kubernetes is an almost necessary tech when you operate your own cloud and that’s where it came from: Google.
The smart people at Google knew that by quickly packaging their own internal tech and releasing it on open source they’d help people move from the incumbent AWS.
Helping customers switch IaaS hurts them both (lock-in is better), but it hurts AWS way more. Proof? They made it free to run the necessary compute behind the K8s control plane; until recently, that is.
Are there benefits to running your biz' web app using constructs made for a "cloud"? Sure there are; that's why people are moving to K8s. There are real business benefits, given a certain number of necessary moving parts. LinkedIn had such a headache with this that they created Kafka.
I suspect most organisations’ Architects and IT peeps push for K8s as a moat for their skills and to beef up their resumé. They know full well that the value is not there for the biz’ but there’s something in it for them.
I remember when Docker and K8s emerged and YAGNI kept everything in perspective. Unless you had a fleet of hundreds of servers to manage and spin up at a moment's notice, you just used Chef, Puppet or Ansible. Now nothing's too small for this ridiculously over-engineered technology. Got a WordPress blog? You're doing it wrong if you don't put it in a Docker container and launch it with K8s. Same with a lot of Rails projects for which Capistrano was more than adequate. Just gotta scratch that itch until you've no more skin left.
Yeah, I think people are overthinking it. The real reason is that if you do a superficial investigation you will quickly come back with the impression that k8s is near universally supported across cloud vendors and gives an appearance of providing a portable solution where otherwise the only alternative would be vendor lock-in. It makes it a no-brainer for anybody starting out with a new cloud deployment.
I host about a dozen rails apps of different vintage and started switching from Dokku to Digital Ocean Kubernetes. I had a basic app deployed with load balancer and hosted DB in about 6 hours. Services like the nginx ingress are very powerful and it all feels really solid. I never understood Dokku internals either so them being vastly simpler is no help for me. I figured for better or worse kubernetes is here to stay and on DO it is easier than doing anything on AWS really. I have used AWS for about 5 years and have inherited things like terraformed ECS clusters and Beanstalk apps. I know way more about AWS but I feel you need to know so much that unless you only do ops you cannot really keep up.
I found deploying databases with Dokku to be really intuitive. CockroachDB is great, but still a lot more steps than dokku postgres:create <db>. The whole certificates thing is quite confusing. Otherwise, k3s on-prem is great
You say that like it's a bad thing. A declarative model is infinitely better for representing complex systems than scripts and mind space. The challenge is actually being able to get to that point.
An sh script is a series of things that almost any developer who uses Linux can understand: command-line statements. We use them all the time. Suppose the complicated declarative model you've made doesn't work one day and the person who originally wrote it is gone? Even if you have someone to debug it who knows the k8s languages, you usually can't use the yml files alone; you need Terraform or something, plus maybe some other services and "sidecar" containers that do other things for you. With an sh script, you just have the script with a bunch of commands that you can understand and look up easily, in a linear order, to figure out the problem. You might not understand every command, but you can run each one until you get to the error, then focus in on that area. With k8s, you need to figure out a huge web of intermixed deps and networks and services just to start, then find the one that is failing (if that is the one failing and it's not just being masked by another failed service that you didn't know about).
Having extensively used Chef and K8s: the difference is that tools like Chef try to deal with chaos in an unmanaged way (Puppet is the closest to "managed"), but when dealing with wild chaos you lack many ways of enforcing order. Plus, they don't really do multi-server computation of resources.
What k8s brings to the table is a level of standardization. It's the difference between bringing some level of robotics to manual loading and unloading of classic cargo ships, vs. the fully automated containerized ports.
With k8s, you get structure: you can wrap an individual program's idiosyncrasies into a container that exposes a standard interface. This standard interface then allows you to easily drop it onto a server, with various topologies, resources, networking etc. handled through common interfaces.
I had said that for a long time, but recently I got to understand just how much work k8s can "take away" when I foolishly said "eh, it's only one server, I will run this the classic way". Then I spent 5 days on something that could have been handled within an hour on k8s, because k8s virtualizes away HTTP reverse proxies, persistent storage, and load balancing in general.
Now I'm thinking of deploying k8s at home, not to learn, but because I know it's easier for me to deploy Nextcloud, or an ebook catalog, or whatever, using k8s than by setting up a more classical configuration management system and dealing with the inevitable drift over time.
What a container lets you do is move a bunch of imperative logic and decisions to build time, so that there are very few decisions made at deploy time. I'm not trying to have a bunch of decisions made that are worded as statements. I've watched a long succession of 'declarative' tools that make a bunch of decisions under the hood alienate a lot of people who can't or won't think that way, and nobody really should have to, even if they can. There are so many things I'd rather being doing with my day than dealing with these sorts of systems because otherwise it won't get done, and I'm heavily invested in the outcome.
I think the build, deploy, start and run-time split is an important aspect that gets overlooked quite a bit, and is critical to evaluating tools at this point. That is why we aren't still doing everything with Chef or Puppet. Whether we continue doing it with Kubernetes or Pulumi or something else matters a bit less.
Repeatability is not the goal, as others in this thread have implied. The goal is trusting that the button will work when you push it. That if it doesn't work, you can fix it, or find someone who can. Doing that without repeatability is pretty damned hard, certainly, but there are ways to chase repeatability without ever arriving at the actual goal.
But what do you use to manage those containers and surrounding infra (networking, proxies, etc)?
I've been down the route of using Puppet for managing Docker containers on existing systems, Ansible, Terraform, Nomad/Consul. But in the end it all is just tying different solutions together to make it work.
Kubernetes (in the form of K3s or another lightweight implementation) just works for me, even in a single server setup. I barely have to worry about the OS layer; I just flash K3s to a disk and only have to talk to the Kubernetes API to apply declarative configurations.
The only things I sometimes still need the OS layer for are networking, firewall, or hardening of the base OS. But that configuration is mostly static anyway, and I'm sure I will find some operators to manage that through the Kubernetes API as IaC if I really need to.
I used to have a bunch of bash scripts for bootstrapping my docker containers. At one point I even made init scripts, but that was never fully successful.
And then one day I decided to set up kubernetes as a learning experiment. There is definitely some learning curve in making sure I understood what a deployment, replicaset, service, pod or ingress was, and how to properly set them up for my environment. But now that I have that, adding a new app to my cluster and making it accessible is super low effort. I have previous yaml files to base my new app's config on.
It feels like the only reason not to use it would be learning curve and initial setup... but after I overcame the curve, it's been a much better experience than trying to orchestrate containers by hand.
Perhaps this is all doable without kubernetes, and there is a learning curve, but it's far from the complicated nightmare beast everyone makes it out to be (from the user side at least; maybe it is from the implementation details side).
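For what it's worth, the kind of yaml file the parent describes reusing for a new app can be as small as this hedged sketch (the name and image reference are placeholders):

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: my-app                               # placeholder name
  spec:
    replicas: 2
    selector:
      matchLabels:
        app: my-app
    template:
      metadata:
        labels:
          app: my-app
      spec:
        containers:
          - name: my-app
            image: registry.example.com/my-app:1.0   # made-up image reference
            ports:
              - containerPort: 8080

Add a Service and an Ingress on top (as described elsewhere in the thread) and the app is reachable; copy the files, change the names, and the next app is a few minutes' work.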
It would mean I removed ~20% of the things that were annoying me and left 80% still to solve, while kubernetes does 80% for me, with the remaining 20% being mostly "assemble these blocks".
Plus, a huge plus of k8s for me was that it abstracted away horrible interfaces and behaviours of docker daemon and docker cli.
K3s on a few devices. Thinking of grabbing an HP MicroServer or something with a similar case for an ITX Ryzen (embedded EPYC would probably be too expensive), some storage space, and maybe connecting a few extra bits of compute power into a heterogeneous cluster. Put everything except maybe PiHole on it, with a ZFS pool exported over a bunch of protocols as the backing store for persistent volume claim support.
Kubernetes is one way to deploy containers. Configuration systems like Ansible/Salt/Puppet/Chef/etc are another way to deploy containers.
Kubernetes also makes it possible to dynamically scale your workload. But so does Auto Scaling Groups (AWS terminology) and GCP/Azure equivalents.
The reality is that 99% of users don't actually need Kubernetes. It introduces a huge amount of complexity, overhead, and instability for no benefit in most cases. The tech industry is highly trend driven. There is a lot of cargo culting. People want to build their resumes. They like novelty. Many people incorrectly believe that Kubernetes is the way to deploy containers.
And they (and their employers) suffer for it. Most users would be far better off using boring statically deployed containers from a configuration management system. Auto-scaled when required. This can also be entirely infrastructure-as-code compliant.
Containers are the real magic. But somehow people confused Kubernetes as a replacement for Docker containers, when it was actually a replacement for Docker's orchestration framework: Docker Swarm.
In fact, Kubernetes is a very dangerous chainsaw that most people are using to whittle in their laps.
Hm. Systemd already runs all your services in cgroups, so the same resource limit handles are available. It doesn't do filesystem isolation by default, but when we're talking about Go / Java / Python / Ruby software does that even matter? You statically link or package all your dependencies anyway.
Not only does systemd run your code, it also has filesystem isolation built in, runs containers (both privileged and non-privileged) and sets up virtual networking for free.
systemd-nspawn / machined make the other systems look like very complicated solutions in search of a problem.
The name may not be pretty, but it's an official feature of systemd (used to debug systemd development itself), and it is far easier to take incremental backups because the container files are just plain files in /var/lib/machines/. And you apparently already have it if systemd is on your system. (You may need an additional package from the OS package repo.)
I run nspawn instances as development environments for developers, and I can even run docker inside them.
Expectations have been set unrealistically high and online communities like this one make matters worse. Big players with dedicated devops teams use Kubernetes all the time so why shouldn't I? It's only a matter of time before "Hello world" tutorials include a chapter on container orchestration.
So we end up with a plethora of full stack developers who can barely keep up with their current development stacks willfully deploying their software on systems that they're just barely competent with.
I know this because I almost deployed a side project with Kubernetes because it was expected of me despite the fact that being mediocre at it was the best that I could hope to become and that's an easy way to chop off a leg or three.
I'm not intimately familiar with those, but I did a lot of similar things with scripts.
As far as I can tell: those are imperative. At least in some areas.
Kubernetes is declarative. You mention the end state and it just "figures it out". Mind you, with issues sometimes.
All abstractions leak. Note that k8s's insistence on declarative configuration can make you bend over backwards. Example: running a migration script post-deploy. Or waiting for other services to start before starting your own. Etc.
I think in many ways, those compete with Terraform which is "declarative"-ish. There's very much a state file.
I've only ever used cfengine and Ansible, but they are both declarative. Hell, Ansible uses yaml files too.
I would be somewhat surprised to find out Puppet and Chef weren't declarative either. Because setting up a system in an imperative fashion is ripe for trouble. You may as well use bash scripts at that point.
I've used Ansible for close to 10 years for hobby projects. And setting up my development environment. Give me a freshly installed Ubuntu laptop, and I can have my development environment 100% setup with a single command.
Ansible is YAML, but it's definitely imperative YAML - each YAML file is a list of steps to execute. It uses YAML kind of like how Lisp uses S-expressions, as a nice data structure for people to write code in, but it's still code.
Sure, the steps are things like "if X hasn't been done yet, do it." That means it's idempotent imperative code. It doesn't mean it's declarative.
CFEngine is slightly less imperative, but when I was doing heavy CFEngine work I had a printout on my cubicle wall of the "normal ordering" because it was extremely relevant that CFEngine ran each step in a specific order and looped over that order until it converged, and I cared about things like whether a files promise or packages promise executed first so I could depend on one in the other.
Kubernetes - largely because it insists you use containers - doesn't have any concept of "steps". You tell it what you want your deployment to look like and it makes it happen. You simply do not have the ability to say, install this package, then edit this config file, then start this service, then start these five clients. It does make it harder to lift an existing design onto Kubernetes, but it means the result is much more robust. (For some of these things, you can use Dockerfiles, which are in fact imperative steps - but once a build has happened you use the image as an artifact. For other things, you're expected to write your systems so that the order between steps doesn't matter, which is quite a big thing to ask, but it is the only manageable way to automate large-scale deployments. On the flip side, it's overkill for scale-of-one tasks like setting up your development environment on a new laptop.)
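To illustrate the contrast, here's a hypothetical fragment of an Ansible task list: each task is idempotent ("make sure X is true"), but the tasks still run top to bottom as ordered steps, which is exactly the concept Kubernetes refuses to give you. The names and paths are made up.

  # hypothetical Ansible tasks: idempotent, but order still matters
  - name: Install nginx
    apt:
      name: nginx
      state: present
  - name: Copy site config
    template:
      src: site.conf.j2              # made-up template name
      dest: /etc/nginx/conf.d/site.conf
  - name: Start nginx
    service:
      name: nginx
      state: started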
I agree. It is impressive how much it can orchestrate. It is also very useless in the real cloud because developers there are dealing with higher-level abstractions to solve problems for the business.
The most simplistic task - executing some code in response to an event in a bucket - makes kubernetes with all its sophisticated convergence capabilities completely useless. And even if somebody figures this out and puts an open-source project on github to do this on kubernetes, it's just going to break at the slightest load.
Not to mention all the work to run kubernetes at any acceptable level of security, keep the cost down, and do all the patching, scaling, logging and upgrades... Oh, the configuration management for kubernetes itself? Ah sorry, I forgot, there are 17 great open-source projects for that :)
> The most simplistic task - executing some code in response to an event in a bucket - makes kubernetes with all its sophisticated convergence capabilities completely useless.
That's because you're not thinking web^Wcloud scale. To execute some code in response to an event you need:
- several workers that will poll the source bucket for changes (of course you could've used an existing notification mechanism like AWS EventBridge, but that would couple your k8s to vendor-specific infra, so it kind of diminishes the point of k8s)
- a distributed message bus with a persistence layer. Kafka will work nicely because they say so on Medium, even though it's not designed for this use case
- a bunch of stateless consumers for the events
- don't forget that you'll need to write the processing code with concurrency in mind, because you're actually executing it in a truly distributed system at this point and you've made a poor choice for your messaging system
Wait, I can do all this with S3 and Lambda at any scale - for pennies :) It will probably take a few hours to set everything up with tools like stackery.io
So once again, what do developers need kubernetes for, if the most simple problem becomes an unholy mess? :)
You can save your Kubernetes manifests in any order. Stuff that depends on other stuff just won't come up until the other stuff exists.
For example, I can declare a Pod that mounts a Secret. If the Secret does not exist, the Pod won't start -- but once I create the Secret the pod will start without requiring further manual intervention.
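A minimal sketch of that Secret-and-Pod example (names, image and contents are made up):

  apiVersion: v1
  kind: Secret
  metadata:
    name: app-credentials        # hypothetical
  stringData:
    password: not-a-real-password
  ---
  apiVersion: v1
  kind: Pod
  metadata:
    name: app
  spec:
    containers:
      - name: app
        image: nginx             # placeholder image
        volumeMounts:
          - name: creds
            mountPath: /etc/creds
            readOnly: true
    volumes:
      - name: creds
        secret:
          secretName: app-credentials   # the Pod won't start until this Secret exists

Apply them in either order; the controllers reconcile as soon as the missing piece shows up.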
What Kubernetes really is, under the hood, is a bunch of controllers that are constantly comparing the desired state of the world with the actual state, and taking action if the actual state does not match.
The configuration model exposed to users is declarative. The eventual consistency model means you don't need to tell it what order things need to be done.
A combination of things, mostly related to Kubernetes' scope and use case being different from Ansible/CFEngine/etc. Kubernetes actually runs your environment. Ansible/CFEngine/etc. set up an environment that runs somewhere else.
This is basically the benefit of "containerization" - it's not the containers themselves, it's the constraints they place on the problem space.
Kubernetes gives you limited tools for doing things to container images beyond running a single command - you can run initContainers and health checks, but the model is generally that you start a container from an image, run a command, and exit the container when the command exits. If you want the service to respawn, the whole container respawns. If you want to upgrade it, you delete the container and make a new one, you don't upgrade it in place.
If you want to, say, run a three-node database cluster, an Ansible playbook is likely to go to each machine, configure some apt sources, install a package, copy some auth keys around, create some firewall rules, start up the first database in initialization mode if it's a new deployment, connect the rest of the databases, etc. You can't take this approach in Kubernetes. Your software comes in via a Docker image, which is generated from an imperative Dockerfile (or whatever tool you like), but that happens ahead of time, outside of your running infrastructure. You can't (or shouldn't, at least) download and install software when the container starts up.
You also can't control the order when the containers start up - each DB process must be capable of syncing up with whichever DB instances happen to be running when it starts up. You can have a "controller" (https://kubernetes.io/docs/concepts/architecture/controller/) if you want loops, but a controller isn't really set up to be fully imperative, either. It gets to say, I want to go from here to point B, but it doesn't get much control of the steps to get there. And it has to be able to account for things like one database server disappearing at a random time. It can tell Kubernetes how point B looks different from point A, but that's it.
And since Kubernetes only runs containers, and containers abstract over machines (physical or virtual), it gets to insist that every time it runs some command, it runs in a fresh container. You don't have to have any logic for, how do I handle running the database if a previous version of the database was installed. It's not - you build a new fresh Docker image, and you run the database command in a container from that image. If the command exits, the container goes away, and Kubernetes starts a new container with another attempt to run that command. It can do that because it's not managing systems you provide it, it's managing containers that it creates. If you need to incrementally migrate your data from DB version 1 to 1.1, you can start up some fresh containers running version 1.1, wait for the data to sync, and then shut down version 1 - no in-place upgrades like you'd be tempted to do on full machines.
And yeah, for databases, you need to keep track of persistent storage, but that's explicitly specified in your config. You don't have any problems with configuration drift (a serious problem with large-scale Ansible/CFEngine/etc.) because there's nothing that's unexpectedly stateful. Everything is fully determined by what's specified in the latest version of your manifest because there's no other input to the system beyond that.
Again, the tradeoff is this makes quite a few constraints on your system design. They're all constraints that are long-term better if you're running at a large enough scale, but it's not clear the benefits are worth it for very small projects. I prefer running three-node database clusters on stateful machines, for instance - but the stateless web applications on top can certainly live in Kubernetes, there's no sense caring about "oh we used to run a2enmod but our current playbook doesn't run a2dismod so half our machines have this module by mistake" or whatever.
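For the "persistent storage is explicitly specified" point above, a hedged sketch of what that declaration looks like for a small database StatefulSet (image, names and sizes are placeholders):

  apiVersion: apps/v1
  kind: StatefulSet
  metadata:
    name: db                     # hypothetical
  spec:
    serviceName: db
    replicas: 3
    selector:
      matchLabels:
        app: db
    template:
      metadata:
        labels:
          app: db
      spec:
        containers:
          - name: db
            image: postgres:12   # placeholder image
            volumeMounts:
              - name: data
                mountPath: /var/lib/postgresql/data
    volumeClaimTemplates:        # each replica gets its own PersistentVolumeClaim
      - metadata:
          name: data
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi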
It is common to have significant logic and complexity in the configuration management manifests, but I'd argue that it's possible to move most of that to packaging and have your configuration management just be "package state latest, service state restarted."
Check out nix for actual development environments. Huge fan of that as well.
I can buy a new laptop and be back to 100% in a few minutes. Though the amount of time I spent learning how to get there far exceeds any time savings. Ever.
I've been testing out nix and I haven't found out how to install packages in a declarative way yet. Using `nix-env -iA <whatever>` seems really imperative.
How are you doing that? Do you use something like home-manager, or do you just define a default.nix and then nix-shell it whenever you need something?
Yes, `nix-env -iA` installs packages in an imperative way. I think it is there as a kind of tool that people coming from other OSes can relate to. Purists say you should avoid using it to install packages: instead, list global packages in `/etc/nixos/configuration.nix`, use home-manager for user-specific ones, and if you just need something temporarily to try it out, use `nix-shell -p <whatever>`.
Back to your second question: you can configure the system through `/etc/nixos/configuration.nix`, and it is enough to configure the system and its services. Pretty much everything you could do through Chef/Puppet/Saltstack/Ansible/CFEngine etc.
home-manager takes it a step further and does this kind of configuration per user. It is actually written in a way that can be added to NixOS (or nix-darwin for OS X users) to integrate with the main configuration, so when you're declaring users you can also provide a configuration for each of them.
So it all depends on what you want to do. If your machine exists to run a specific service, the main configuration.nix is good enough; that's pretty much all you need. You don't care about per-user configuration in that scenario, you just create users and start services using them.
If you have a workstation, home-manager, while not essential, can be used to take care of setting up your individual user settings, stuff like dot-files (although it goes beyond that). The benefit of using home-manager is that most of what you configure in it should be reusable on OS X as well.
If you care about local development, you can use Nix to declare what is needed, for example[1]. This is especially awesome if you have direnv + lorri installed
When you do that, you magically get your CDE (that includes all needed tools, in this case the proper python version; you also enter the equivalent of a virtualenv with all dependencies installed, plus extra tools) just by entering the directory. If you don't have direnv + lorri installed, all you have to do is call `nix-shell`.
I also can't wait when Flakes[2] get merged. This will standardize setup like this and enable other possibilities.
Chef and, from what I've heard (since I didn't use it), Puppet are declarative, but since their DSL is really Ruby, it is really easy to introduce imperative code.
Ansible uses YAML, but the few times I used it, it felt like you still use it in an imperative way.
Saltstack (which also uses YAML) was the closest of that group (I never used CFEngine, but its author wrote a research paper showing that declarative is the way to go, so I would imagine he also implemented it that way).
If you truly want a declarative approach designed from the ground up, then you should try Nix or NixOS.
I use Ansible for managing my infra, and the only time my playbooks look imperative is when I execute a shell script or similar, which is about 5% of total commands in my playbooks.
One way to test whether your playbook is declarative is to try rearranging the states into a different order. If the playbook breaks with a different order, it is imperative.
For certain things like layer 4 and layer 7 routing, firewall policies, health checking and failover, network-attached volumes, etc., you have to choose software and configure it, on top of getting that configuration into your tooling. So you are writing kernel or iptables or nginx or monit/supervisord configurations and so on.
But basic versions of these things are provided by Kubernetes natively and can be declared in a way that is divorced from configuring the underlying software. So you just learn how to configure these broader concepts as services or ingresses or network policies, etc, and don't worry about the underlying implementations. It's pretty nice actually.
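For example, a network policy can be declared like this hypothetical sketch (the labels are made up, and it only takes effect if your CNI enforces NetworkPolicy), without ever touching iptables directly:

  apiVersion: networking.k8s.io/v1
  kind: NetworkPolicy
  metadata:
    name: allow-web-to-api       # hypothetical
  spec:
    podSelector:
      matchLabels:
        app: api                 # the Pods this policy protects
    ingress:
      - from:
          - podSelector:
              matchLabels:
                app: web         # only web Pods may connect
        ports:
          - protocol: TCP
            port: 8080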
I've been using Kubernetes exclusively for the past two years after coming from a fairly large Saltstack shop. I think traditional configuration management is flawed. Configuration drift _will_ happen because something, somewhere, will do something you or the formula/module/playbook didn't account for. A Dockerfile builds the world from (almost) scratch and forces the runtime environment to be stateless. A CM tool constantly tries to shape the world to its image.
Kubernetes isn't a silver bullet of course; there will be applications where running them in containers adds unnecessary complexity, and those are best run in a VM managed by a CM tool. But I'd argue using k8s is a safe default for deploying new applications going forward.
It's not just the configuration format. There is a whole 'Kubernetes runtime' (what they call the 'control loops' aka 'controllers') that runs 24/7 watching the configuration live and making appropriate changes.
Unlike Ansible (and I suspect the others) where it's really only more of a 'run once' type of thing... And sometimes if you try running it a second time it won't even succeed.
Ansible is a little special in how imperative it is, a better comparison is Puppet which is intended to do periodic "convergence" runs, although these are more typically hourly or daily than continuous.
They are not comparable. You might use ansible, salt, puppet or chef to deploy kubelet, apiserver, etc. You could, barring those with self-love, even deploy Ansible tower on Kubernetes to manage your kubernetes infrastructure.
As a sysadmin, I like k8s because it solves sysadmin problems in a standardized way: things like safe rolling deploys, consolidated logging, liveness and readiness probes, etc. And yes, also because it's repeatable. It takes all the boring tasks of my job and lets me focus on more meaningful work, like dashboards and monitoring.
Let's use Bazel, and Bazel's rules_k8s, to build/containerize/test/deploy only the microservices of my monorepo that changed.
Let's use Istio's "istioctl manifest apply" to deploy a service mesh to my cluster that allows me to pull auth logic / service discovery / load balancing / tracing out of my code and let Istio handle it.
Let's configure my app's infrastructure (Kafka (Strimzi), Yugabyte/Cockroach, etc.) as yaml files. Being able to describe my Kafka config (the foo topic has 3 partitions, etc.) in yaml is priceless (see the sketch below).
Let's move my entire application and its infrastructure to another cloud provider by running a single Bazel command.
k8s is the common denominator that makes all this possible.
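As a rough example of the Kafka-config-as-yaml idea, a Strimzi topic can look something like this sketch (the cluster name is made up, and the exact apiVersion depends on your Strimzi release):

  apiVersion: kafka.strimzi.io/v1beta1   # may differ by Strimzi version
  kind: KafkaTopic
  metadata:
    name: foo
    labels:
      strimzi.io/cluster: my-cluster     # hypothetical Kafka cluster name
  spec:
    partitions: 3
    replicas: 2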
Terraform explicitly doesn't want to deal with deployment of stuff that is inside VMs etc. and tries to tell you to use managed services or cloud-config yamls as the solution.
You can write your own providers, and you can use the provisioner support, but TF doesn't like that and it shows.
K8s is great - if you are solving infrastructure at a certain scale. That scale being a Bank, Insurance Company or mature digital company. If you're not in that class then it's largely overkill/overcomplex IMO when you can simply use Terraform plus managed Docker host like ECS and attach cloud-native managed services.
Again, the cross-cloud portability is a non-starter unless you're really at scale.
What k8s really scales is the developer/operator power. Yes, it is complex, but pretty much all of it is necessary complexity. At small enough scale with enough time, you can dig a hole with your fingers - but a proper tool will do wonders to how much digging you can do. And a lot of that complexity is present even when you do everything the "old" way, it's just invisible toil.
And a lot of the calculus changes when 'managed services' stop being cost effective or aren't an option at all, or you just want to be able to migrate elsewhere (that can be at low scale too, because of being price conscious).
We have a mature TF module library and can roll out complex, well configured infra in a matter of hours, reliably. That said it's platform specific.
Sure, managed service costs are certainly a thing, but to my point, that only really starts to become an issue at significant scale, assuming you're well configured.
The cost metrics that make "it's cheaper to use a managed service than to pay for an extra engineer to specialize in infrastructure" aren't universal. In fact, I usually have to work from the opposite direction, where hiring a senior ops specialist who can wrangle everything from shelving the physical hardware to network-booting a k8s cluster on-premises can be cheaper than Heroku/AWS/etc.
> you can simply use Terraform plus managed Docker host like ECS and attach cloud-native managed services
That's not actually simple at all, and you would need to build a lot of the other stuff that Kubernetes gives you for free.
Kubernetes gives you an industry standard platform with first-class cloud vendor support. If you roll your own solution with ECS, what you are really doing is making a crappy in-house Kubernetes.
I'd disagree - my team migrated from running containers on VMs (managed via Ansible) to ECS + Fargate (managed by Terraform and a simple bash script).
It wasn't a simple transition by any means, but one person wrapped it up in 4 weeks - now we have 0 downtime deployments, scaling up/down in matter of seconds, and ECS babysits the containers.
Previously we had to deploy a lot of monitoring on each VM to ensure that containers were running, so we got alerted when one of the applications crashed and didn't restart because the Docker daemon didn't handle it, etc.
Now, we only run stateless services, in a private VPC subnet, Load balancing is delegated to ALB, we don't need service discovery, meshes etc. Configuration is declarative, but written in much friendlier HCL (I'm ok with YAML, but to a degree).
ECS just works for us.
Just like K8S might work for a bigger team, but I wouldn't adopt it at our shop, simply because of all of the complexity and huge surface area.
k8s has a bunch of other benefits besides just scaling, and you can run a single-node cluster with the same uptime characteristics as your proposed setup and get all those benefits.
And we only have to learn one complex system, instead of learning each cloud, one of which decided that product names with little relation to what they do were a good idea.
Because Google made it. Same thing with Tensorflow. And, fun fact, both are massively overhyped and a real PITA to learn and use. But Google uses it, so hey.
This just isn’t true. I’ve never used Tensorflow but Kubernetes is great.
We moved to it from Docker Swarm because Docker Swarm still has a lot of glitches with its overlay network. Rolling upgrades would leave stale network entries, and it's impossible to reproduce: sometimes it happens, sometimes it doesn't.
With a managed solution, Kubeadm, or RKE it’s not hard to deploy anymore. All our infrastructure is in code, is immutable, and if you’re careful can be deployed into any kubernetes cluster.
Just like Docker has been great for easily deploying open source products, kubernetes is great for doing the same thing when you need to deploy horizontally. It’s easy for OSS to provide a docker image, a docker compose file for single node deploy, and Kubernetes yaml for a horizontal deploy.
I'll add my use case: we use hosted kubernetes to deploy all of our branches of all of our projects as fully functional application _stacks_, extremely similarly to how they will eventually run in production. Want to try something and show it to someone in the product owner level? Ok there will be a kube nginx-ingress backed environment up in build-time + a few minutes.
The environments advertise themselves via that same modified ingress's default backend. We stick a tiny bit of deploy YAML in our projects, and the deployments' Kubernetes tagging gives us all the details we need to provide diffs, last build time, links to git repos, web sites, etc. for the particular environment. The YAML demonstrates conclusively how an app could or should be run, regardless of OS or software choice, so when we hand it to ops folks there is a basis for them to run from.
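For illustration, a per-branch environment of the kind described above can boil down to little more than an Ingress rule like the following. This is a hedged sketch, not the poster's actual config; all names, hostnames and ports are invented, and the exact shape depends on the ingress controller and charts in use:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-feature-login              # hypothetical app/branch names
  namespace: preview-feature-login
  labels:
    app: myapp
    branch: feature-login                # labels like these carry the build/repo metadata mentioned above
spec:
  ingressClassName: nginx
  rules:
    - host: feature-login.preview.example.com   # placeholder domain
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp
                port:
                  number: 80

Tearing the environment down is then mostly a matter of deleting the namespace.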
There's nothing too novel about Kubernetes, similar patterns could be seen in Erlang many years ago, though in different abstraction levels.
However, because enterprise ops prior to Kubernetes was both costly and brittle, Kubernetes just works for enterprises.
We had a huge PowerShell codebase and it was a nightmare to maintain. Meanwhile, it was nowhere near as robust as Kubernetes.
It's just as simple as that: sure, Kubernetes seems complex, but most enterprise stuff is even worse. At the same time, despite being costly, the quality is usually pretty crappy because those scripts are written under delivery pressure.
You can tell by the volume of comments how interested in the topic the community is.
I've noticed that there are a lot of replies such as "it is overhyped" and "I can just run a VM".
Then Kubernetes is not for you, as your use case may not match what it does and solves. Kubernetes provides a standard way of running your applications. It is complex but logical. YAML sucks, but it is simple and logical. I prefer to use Terraform for Kubernetes, but it is the same thing: simple and logical. You cannot say the same about Puppet, Chef, Ansible, etc. All of those configuration tools are a big mess of different setups and scripts. With Kubernetes I can go to any company and understand how their system works quite quickly. It makes searching for answers easy too, because it is standard.
When you are running several services and there is an outage, it is a godsend. You can instantly view the status of things, how they are configured and when they changed. That is POWERFUL.
It takes a while to understand how all of the resources fit together but that is the same case with any type of deployment system and/or operating system.
p.s. I am not running that huge of a system, maybe about 5k containers total between dev, staging and prod. Maybe 500k requests a day. Running a couple kubernetes clusters is significantly nicer than running things in ECS.
TIL about KUDO, the Kubernetes Universal Declarative Operator. We've been doing the exact same things in a custom Go CLI for 2 years.
The Kubernetes ecosystem is really amazing and full of invaluable resources. It's vast and complex, but well thought out. Getting to know all the ins and outs of the project is time consuming. So many things to learn and so little time to practice...
I work on KUDO team. Would love to hear what you think about it. All devs hang out in #kudo channel on Kubernetes community slack, please don’t hesitate to join and say hi.
There's a silent majority of people that don't use k8s (or containers) - hell, there is a significant portion of servers that don't even use Linux. I find the majority of engineers my age (mid 30s) think it is nothing more than straight marketing - between said marketing-fueled VC dollars and "every company is a software company" there's a very good reason why k8s has taken off, but I'd ask the following:
Why should it have?
Many people I talk with will complain about security, performance and complexity of k8s (and containers in general). Non-practicing engineers (read: directors/vps-eng) will complain about the associated cost with administering their k8s clusters both in terms of cloud cost and devops personnel cost.
Someone earlier mentioned it was the new wordpress - I don't think that's an unfair comparison, although I would challenge the complexity/cost of it.
You don't necessarily HAVE to use K8s to take advantage of it. Use something like Knative and you're good to go. Google has Cloud Run, and Azure will soon come up with some similar abstraction on top of Kubernetes.
Honestly, at least with GKE, hosting applications on managed k8s is not that complicated, to the point that I don't think it is a poor choice even for small teams who might not need all the bells and whistles of k8s. That is, so long as that small team is already on board with CI and containers.
Kubernetes got popular because it was the first system that came along that provided a CRUD API for resources of all kinds, including custom resources (CRDs), and was immediately compatible with public artifact hubs like DockerHub and Google Container Registry. The second one is the real kicker here, and I think is why Kubernetes "won" and Mesos et al did not. With Mesos et al you had to set up your own artifact storage. As powerful as Mesos was, there was no MesosHub.
Longer term, I think the contribution of Kubernetes will be getting us used to a resource/API-driven approach to infrastructure that abstracts away cloud providers, hardware, etc. But it will probably be superseded in the coming years by something that honors similar API "contracts." Probably written in Rust (troll).
I'm using Kubernetes extensively in my day to day work and once you get it up and running and learn the different abstraction, it becomes a single API to manage your containers, storage and network ingress needs. Making it easy to take a container and getting it up and running in the cloud with an IP address and a DNS configured in a couple API calls (or defined as YAMLs).
That being said, I will also be the first one to recognize that PLENTY of workloads are not made to run on Kubernetes. Sometimes it is way more efficient to spawn an EC2/GCE instance and run a single Docker container on it. It really depends on your use case.
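As a rough sketch of what "a couple of YAMLs" means in practice (image, names and hostname here are placeholders, and the DNS part assumes something like external-dns watching Service annotations, which not every cluster runs):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.0.0   # placeholder image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: web
  annotations:
    external-dns.alpha.kubernetes.io/hostname: web.example.com   # only if external-dns is installed
spec:
  type: LoadBalancer          # the cloud provider hands back an external IP
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080

A kubectl apply -f over those two objects is the "couple of API calls" in question.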
If I had to run a relatively simple app in prod I would never use Kubernetes to start with. Kubernetes starts to pay itself off once you have a critical mass of services on it.
There are organisations with thousands of services on serverless seeing enormous benefits in reduced management overhead and reduced costs compared to the Kubernetes solution they previously ran.
My issue with serverless though is that you need to refactor your code to make it work specifically for it. If you don't start to think serverless on day one it gets more and more difficult to convert to it down the road.
Kubernetes, I think, exists in the 'hedgehog' at the middle of the diagram of various drives and fears.
There is some tech so simple that you just learn it and start using it, others that you know you can pick up when the time is right.
And software you would be happy to invest time in... as long as someone is paying you to do it, software you fear might keep you from getting a job if you don't invest in it.
There is software so simple it might be right (it isn't) and software so complicated that it must be important if people are using it/working on it.
So it's not that Kubernetes is good, it's just that it makes people neurotic enough to jump on the bandwagon. Been a few of those in my career. A few have stuck, most have not.
Kubernetes is insurance for companies against getting locked into proprietary technologies.
It also promotes immutable infrastructure and hence increases portability. While some things like load balancers and ingress are controlled by the cloud provider, almost everything else can be seamlessly migrated to another cloud provider or on-prem.
It makes dev, test, staging, and prod environments consistent and also solves a lot of the pain points of managing infrastructure at scale, with autoscaling, auto-healing and more. Istio adds a lot more to Kubernetes and makes supporting microservices even easier.
It's going to be an important piece in the hybrid world, as it brings a lot of standardization and consistency across two disparate environments.
I mean, k8s draws on 12+ years of experience from thousands of high-caliber engineers. It's like delivering modern cars to the Chinese market in the 1970s. Of course it will be popular...
I can only speak for myself as a relatively late adopter, right around early 2020, this year.
I only consider that late because I've been reading the hype around k8s for many years already.
Became a late adopter of containers just before k8s actually. Now I've migrated most of my setups both privately and professionally to containers. And setup my first k8s clusters both at work and in my homelab.
So my perspective is that containers are first and foremost an amazing way of deploying software because all that complexity I did in ansible to deploy the software has been moved to the container image.
The project itself now, be it Mastodon, Jitsi, Synapse to name a few, package most of their product for me in automatic build pipelines. All I need to do is run and configure it.
And therefore, moving on to k8s, it would stand to reason that some of those services are able to be clustered. Where better to do such clustering than k8s?
That's just an ops perspective. We also have devs where I work and with k8s they're able to deploy anything from routes down to their services using manifests in CD pipelines. What's not to like?
Only reason one might get disenchanted with k8s is if you expect it to be a one-stop solution for your aging .net application. Not saying you can't deploy that in k8s, I'm just using it as an example of something that might not be microservice ready.
It's a developer tool made originally by google. Of course it's popular. Which isn't to say it's bad, it's just not much of a question as to why it's popular.
-------
Kubernetes - kubernetes.io
Kubernetes is an open-source container-orchestration system for automating application deployment, scaling, and management. It was originally designed by Google, and is now maintained by the Cloud Native Computing Foundation.
If the question was "Why is kube getting so popular with developers," it might get a different response. I wonder how many software developers come to Kubernetes through the templated/Helm chart/canned approach made by their DevOps team. Not that this isn't a fine approach, but I find it a different conversation compared to, say, serverless, where anyone can just jump in.
After spending 18 months working on bringing kubernetes(EKS) to production, with dozens of services on it, the time was right to hand over migrating old services to the software engineers who maintain them. Due to product demands, but also some lack of advocacy, this didn't happen, with the DevOps folks ultimately doing the migration and retaining all the kubernetes knowledge.
An unpopular opinion might be that Kubernetes is popular because it gives DevOps teams new tech to play with, with long lead times for delivery given its complexity. Kubernetes usually is a gateway to tracing, service meshes and CRDs, which, while you don't need them at all to run Kubernetes, will probably end up in your cluster.
Every person I know who wants to use k8s has never had to maintain it.
"Developers love it!" Yeah, I'd love someone to drive my car for me, too. Doesn't mean it's a great idea to use technology so complex you have to hire a driver (really several drivers) to use it.
If you already have 3 people working for you that (for example) understand etcd's protocols or how to troubleshoot ingress issues or how to prevent (and later fix) crash loops, maybe they can volunteer to babysit your cluster for you, do all the custom integration into the custom APIs, keep it secure, etc. But eventually they may get tired of it and you'll have to hire SMEs.
If you're self-hosting a "small" k8s cluster and didn't budget at least $500k for it, you're making a mistake. There are far simpler solutions to just running a microservice that don't require lots of training and constant maintenance.
Complexity isn't always bad, but unnecessary complexity always is.
- Set up VMs: their dependencies and toolchain. If you use something like a package with a native component, such as image processing, you even need to set up a compiler on the VM
- Deployment process
- Load balancer
- A systemd unit to auto-restart it, set memory limits, etc.
All of that is done in K8S. As long as you ship a Dockerfile, you're done.
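To make the comparison concrete, here is roughly what the systemd-unit part of that list collapses into on the Kubernetes side (a sketch only; names and image are placeholders, and the load balancer becomes a Service or Ingress in front of it):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                                      # placeholder name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: registry.example.com/api:1.2.3  # built from the Dockerfile you ship
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              memory: 512Mi                      # the MemoryLimit= equivalent
          livenessProbe:                         # the auto-restart equivalent
            httpGet:
              path: /healthz                     # assumes the app exposes a health endpoint
              port: 8080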
How big does your company get before you need to step away from a tiny handful of very large EC2s?
If you have 16CPU EC2 for your business logic, one for your DB, and you're smartly hosting your static content elsewhere or via Cloudflare ... I mean you need to have a 'big company' before going too far beyond that.
What gives? What are all these startups doing?
This is not a story about K8s; this is entirely something else. It's about psychology, complexity, and our love of it--or rather our 'belief' that complexity = productivity, that solving 'the hard infra problem' must inherently be 'good for the company' because it 'feels difficult' and therefore must be doing something powerful, or at least gaining some kind of competitive advantage.
(Aside from the 'Docker is useful and K8s follows' point, which actually makes sense a little bit ...)
I've tried to make the argument that Kubernetes introduces a level of complexity that should make everyone think twice before diving into that eco-system. I've tried to make this argument using both detailed, factual arguments, and also by using humor and parody. I am confused why Kubernetes has so much momentum, especially when you consider that most of the things we want (isolation, security, dependency management, flexible network topologies) can be gained much more simply with Terraform and Packer. With a mix of humor and detailed factual analysis, my most recent attempt to make this argument is here:
- Most code running on k8s hasn't hit full production load yet.
- Where it has worked well, it's been managed by devs that know what they are doing.
- It's something worth putting on a backend dev resume.
- Apparent cost savings ('we just need 1 VM instead of 5', 'we can auto scale to infinity', 'we don't have to pay for AWS, we get it all on our own VMs').
Wait a few months and we will see a slurry of posts that read 'Why we moved away from Kubernetes', 'Top 5 reasons to not use Kubernetes', 'How using Kubernetes fucked us, in the ass', 'You don't need Kubernetes', 'Why I will never work on a project that uses Kubernetes', 'Hidden costs of Kubernetes' and so on.
C'mon, you know how this works. Just take the time and read the docs. They are well written (they just don't mention where k8s does not work well).
Kubernetes is great, but also very complex and almost an entirely new paradigm to learn and understand. I feel like there's a huge void between no Kubernetes and Kubernetes that isn't being filled yet. Dealing with and/or managing Kubernetes is a task on its own; I have the feeling that container orchestration doesn't have to be that complex.
Something like an easy to use (and operate!) multi-tenant docker-compose on steroids with user management/RBAC and a built-in Docker image repository that gets out of your way would be amazing for small teams / startups that don't want to deal with the complexity of Kubernetes.
- There are big names behind it.
- It will replace VM orchestration platforms.
- Fear of missing out.
Jokes aside, when you've lots of teams, all working on small pieces of a large product and shipping on their own, iterating fast... you need a platform and ecosystem on top to meet their requirements. As you reach planet-scale, you need to NOT let your cost grow exponentially. Hence it is popular.
What if you're not planet-scale? Well, it will still help (attract talent, design for scale, better ecosystem etc.). Hence it is popular.
If you're building a business however, focus on business and time-to-market, definitely not the infra, i.e. kubernetes.
I think kubernetes is to Infra what RoR was to Web. Not necessarily in terms of architectural style of MVC, but more towards standardization of similar enough problems that can be put into a mutually agreed convention.
I've been completely perplexed by how I might repeatably and reliably setup a single DigitalOcean (or similar) server.
I can't just blow away the instance, make a new one with their API, and run a bash script to set it up because I need to persist some sqlite databases between deploys.
Nix looks promising, but also seems to be a lot to learn. I think I'd rather focus on my app than learn a whole new language and ecosystem and way of thinking about dependencies.
I don't think my needs are insane here, I'm surprised there seems to be no infrastructure as code project for tiny infrastructures.
Easy: the perception of infinite overhead. The cloud itself (e.g. EC2) gives you capacity-on-demand, but the glue to make all the nodes work together is missing. And it's a really hard problem in general, because it's distributed systems. K8s fills that demand, or seeks to. The alternative is to roll your own, which is possible but expensive, error-prone, and difficult to hire for. (My last company discovered this to their detriment after sinking a LOT of time into building out a really complex Salt+Vagrant+AWS solution, and then decided to migrate to k8s).
Beats me. It doesn't correspond to either virtual servers or hypervisors. It certainly doesn't correspond to real hardware. Cloud OS my butt...
"Hey, let's take a zillion commodity cloud provider instances running on hypervisors, then install Ubuntu, then run Kubernetes on them, then run docker containers on them and fiddle about all day with yaml trying to make internal networking do insecure things to imitate real world infrastructure"
Just use Ansible if you miss YAML, and you can actually deploy to real hardware.
Everyone was trying to make a system that is simple and widely adopted, but if you want it to be adopted, it's going to need a lot of features. Also, Google worked some real magic in getting Kubernetes supported by all the cloud providers.
It's a framework that will enable you to do what you want, while being the standard.
You could write your script to do that in a simpler way, but most people already know the standard and it's easier for everybody to understand Kubernetes rather than your clever solution.
I think kubernetes is great conceptually if you're running on the cloud, but it's a very complicated domain, and has a lot of ecosystem churn. Things break a lot if you're not careful. Upgrading dependencies is a constant pain. Certainly a time sink.
This article adds nothing beyond the common knowledge that everyone with a bachelor's degree in computer science is completely aware of. Can anyone tell me what I missed? Or what the reason is that this post is trending among people whose background I cannot grasp?
Do you guys think k8s is doing a job which previously the jvm did in enterprise? i.e. if everything is on the jvm, building distributed systems doesn't require a network of containers.
Can k8s success be explained partly due to the need for a more polyglot stack?
I thought it was pretty insane yesterday when I read a YC-backed recruiting company was using Kubernetes. Absolutely insane. It's become the new, hottest, techiest thing that every company has to have even when they don't need it.
It's perfectly sane if their team already knows how to use K8s, especially if they use a hosted solution like GKE or Digitalocean K8s. (I'll admit that I'd never want to manage my own k8s cluster.)
Once you know K8s, it's not very difficult to use. Plus, it provides solutions to a lot of different infrastructure-level problems.
The main thing I like about it: configuration. It's trivial to split integration configuration from application configuration from deployment configuration, and it's trivial to version configurations.
It's not unique in what it does, but even with Puppet and the like you always had this or that exception because of networking, provider images, varying SELinux defaults, etc.
Kubernetes on its own already covers most of that ground, but ConfigMaps and Endpoints really tie it together in a super convenient package.
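As a sketch of that configuration split (names and values invented for illustration): application config lives in a ConfigMap that can be versioned and swapped per environment while the Deployment stays identical:

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"
  FEATURE_FLAGS: "new-checkout,fast-search"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: registry.example.com/app:2.0.0   # placeholder image
          envFrom:
            - configMapRef:
                name: app-config                  # deployment config and app config stay decoupled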
It's not without pitfalls--for example MS AKS steals about 2 GB from each node, so you have to be aware of that and plan accordingly--but still.
> It's not without pitfalls--for example MS AKS steals about 2 GB from each node, so you have to be aware of that and plan accordingly--but still.
This is what I hate a lot about things like k8s, Docker, etc.: the memory profile pretty much makes it a non-starter if you want to run it on anything low-cost.
Digital ocean has a managed kubernetes service that does not cost anything except the resources you use. The master node and management is free, you only pay the node pools and stuff like block storages (their version of EBS) or load balancers.
At this point, most big cloud providers cost almost the same, but in terms of maturity, Google's offering is still ahead. I have not tried out DigitalOcean's hosted solution, but they might be the cheapest.
One thing I've discovered when hiring people is that if I'm not using things like Kubernetes, I don't get (as many) candidates applying. I also don't get as high-quality candidates, either.
My opinion is that Kubernetes is the common integration point. Tons of stuff works with Kubernetes without having to know about each other, making deployments much much easier.
Because Google is an advertising company, their search engine controls what people believe in, and they also have some good engineers, though they are probably not well known. There is very little they couldn't advertise into popularity. Whenever you see overcomplicated software or infrastructure, it's always a way to waste executive function, create frustration and create unnecessary mental overhead. If the technology you're using isn't making it easier for you to run your infrastructure from memory, reduce the use of executive function and decrease frustration, then you should ignore it. Don't fall for the fashion trends.
I'm not criticizing. I've actually used Kubernetes and read the source code. It's a good tool, I just think it's too much mental overhead for most companies, since they won't use most of what it provides. If you're working on a large team with responsible parties who have clearly defined roles, it is a great tool, but I've seen two-person projects with startup infrastructure waste obscene amounts of time learning Kubernetes when they could have just stood up something basic with configuration management to get started and migrated to Kubernetes when it was reasonable to do so. People need to start with a goal and then ask what tool meets the objectives of their goal. In a lot of cases people complain about the tool they are using because they start with Kubernetes and then try to figure out how they can use it on the job.
Your point is to do something simple because k8s is hard? 1) Even small-scale dev teams and businesses still need non-simple software processes. 2) Learning Kubernetes is easier now than learning the underlying cloud. It's really about all the other things k8s provides. Maybe you haven't seen it used in enough contexts yet to appreciate those other benefits?
This is the wrong question. The question should be why are containers so popular? If you're going to use containers, kubernetes makes it easier to do so.
Not a microservices guru, but why are big companies (most famously Uber, who was sort of spearheading it) starting to abandon this architecture?
Call me biased [1] but K8s will take over the world! Yes you get containers and micro-services and all that good stuff, but now with Anthos [2] its also the best way to achieve multi-cloud and hybrid architectures. What's not to like!
Is there any benefit of Anthos over deploying straight to GKE if you're already bought into GCP? We've had this debate several times recently and can't come up with a good answer.
If you are bought in to GCP and plan to stay there, then maybe not much. OTOH, Anthos would allow you to do easier migrations from on-prem, support hybrid workloads, or consolidate multi-cloud clusters including those running on say, AWS [1] if you like.
To the first: yes, enormously so. If you know your history, it is the Linux to the Microsoft that is AWS -- except backed by a business. (Google is maybe Red Hat in that story, but the analogy is more inaccurate than accurate.)
To the second, not really. GCP is mostly turning into an ML play.
I get the feeling K8s is the modern PHP: software that's easy to pick up and use without complete understanding and still get something usable, even if it's not efficient and results in lots of technical debt.
And like PHP, it will be criticised with the power of hindsight, but will continue to be used and power vast swaths of the internet.
But languages are easy, there is the whole field of PL theory to draw from. If you're randomly throwing things together like Lerdorf was, there's a missed opportunity.
But what is the universally regarded theory that k8s contradicts? I don't think there is one.
In fact, I'd say that k8s is unusually heavily steeped in high-brow theories from both the engineering and AI space. Just not necessarily ones that enjoy hype right now.
The storage of the apiserver essentially works as the distributed blackboard in a "blackboard system", with every controller being an agent in such a system. Meanwhile, the agents themselves approach their tasks from control theory; the oft-used comparison is with PID controllers.
I don't think this is right. The reason I say that is because for the most part, teams new to k8s aren't building and managing their own clusters, they are using a managed solution. In that case, an application deployment only need be a few dozen lines of yaml. Most teams aren't really going to be building deep into k8s, and it shouldn't be hard to deploy your containers to some other managed solution.
Fair point, but then plenty of people were using hosted solutions for their naive PHP apps too. Managed solutions don't prevent poor/improper configuration in either case.
The managed hosts and/or their tools probably helped negate damage/resolve issues quicker. However I think that the idea that "all you need is a couple of dozen lines of yaml and a managed provider" is exactly why it's headed down a similar path.
For real-world examples, just look at every improperly configured S3 bucket leaking data. Every private key accidentally posted to GitHub from a careless 'git add -A'. Every API that doesn't properly check auth. None of these are within the purview of a managed host's responsibility.
I'm not even against K8s in any of this. Just making the observation that - like PHP - it is empowering entire groups of people to do things they otherwise wouldn't be able to do.
If you're on microservices, it's a no-brainer. You'd otherwise need an army of DevOps with semi-custom scripts to maintain the same thing. It's really automating a lot of stuff.
Helm + Kubernetes gave our company the ability to launch microservices with no DevOps involved. You just provide the name of the project, push to git, and GitLab CI will pick it up and do the rest by the template. Even junior developers in our team are doing that from day one.
Isn't that the future we dream about? If you have too much load it will autoscale the pods, if a node is overloaded it will autoscale the node pool, if you have a memory leak it will restart the app so you can sleep at night. I can provide a million examples that make our 100+ microservices management so much simpler. No Linux kung fu, zero bash scripts, no SSH, no interaction with the OS, and not a single DevOps role for a 15+ developer team.
Our management of the cluster is just a simple "add more CPU or memory to this nodepool", or sometimes changing the nodepool a certain service deploys to. All done via a simple cloud management UI.
For those who call microservices fancy stuff: no, we are a startup with a fast delivery and deploy cycle. We have tons of subprojects and integrations, and our main languages are Node.js, Golang and Python. Some of these are not good at multi-threading, so there's no way to run it all as a monolith; the other is used only when it's needed for high performance.
So all together, microservices + Kubernetes + Helm + good CI + proper pub/sub gives our backend an extremely simple, fast cycle of development and delivery, and, importantly, flexibility in terms of language/framework/version.
What is also good is the installation of services. With Helm I can install a high-availability Redis setup for free in 5 minutes. The same level of setup would cost you several thousand dollars of DevOps work plus further maintenance and updates. With k8s it's a simple helm install stable/redis-ha.
So yeah, I can totally understand that some simple projects don't need k8s. I can understand you can build something in Scala and Java slowly but with high quality as a monolith. You don't need k8s for 3 services. I can understand some old-school DevOps don't want to learn new things and complain about a tool that reduces the need for these guys. Otherwise, you really need k8s.
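For what it's worth, the GitLab CI side of a setup like this can be a single templated job. This is a hedged sketch with an assumed chart layout and helper image, not the poster's actual pipeline; $CI_PROJECT_NAME and $CI_COMMIT_SHORT_SHA are GitLab's predefined variables:

deploy:
  stage: deploy
  image: alpine/helm:3.12.3                 # assumed image; anything with helm and cluster credentials works
  script:
    - >
      helm upgrade --install "$CI_PROJECT_NAME" ./chart
      --namespace "$CI_PROJECT_NAME" --create-namespace
      --set image.tag="$CI_COMMIT_SHORT_SHA"
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'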
Because soon, from one program on a dev server, there is a need to run databases, log gathering, and to multiply all of the previous for parallel testing in clean environments, etc. etc.
Just running supporting tools for a small project where there was insistence on self-hosting open source tools instead of throwing money at slack and the like? K3s would have saved me weeks of work :|
In this post it might only be an example, but I don't see anything that necessitates the use of YAML. All of that could be put in a JSON file, which is far less complex.
YAML should not even be needed for Kubernetes. Configuration should be representable in a purely declarative way, instead of making the YAML mess, with all kinds of references and stuff. Perhaps the configuration specification needs to be re-worked. Many projects using YAML feel to me like a configuration trash can, where you just add more and more stuff, which you haven't thought about.
I once tried moving an already containerized system to Kubernetes to test how that would work. It was a nightmare. It was a few years ago, maybe 3 years ago. There was plenty of documentation, but it really sucked. I could not find _any_ documentation of what can be put into that YAML configuration file, or what the structure really is. I read tens of pages of documentation; none of it helped me find what I needed. Even setting everything up, just to get Kubernetes running at all, also took way too much time and 3 people to figure out, and it was badly documented. It took multiple hours on at least 2 days. Necessary steps, I still remember, were not listed on one single page in any kind of overview; instead a required step was hidden on another documentation page that was not even mentioned in the list of steps to take.
Finally, having set things up, I had a web interface in front of me, where I was supposed to be able to configure pods or something. Only I could not configure everything I had in my already containerized system via that web interface. It seems that this web interface was only meant for the most basic use cases, where one does not need to provide containers with much configuration. My only remaining option was to upload a YAML file, which was undocumented, as far as I could see back then. That's where I stopped. A horrible experience, and I wish not to have it again.
There were also naming issues. There was something called "Helm". To me that sounds like an Emacs package. But OK I guess we have these naming issues everywhere in software development. Still bugs me though, as it feels like Google pushes down its naming of things into many people's minds and sooner or later, most people will associate Google things with names, which have previously meant different things.
There were 1 or 2 layers of abstraction in Kubernetes which I found completely useless for my use-case and wished were not there, but of course I had to deal with them, as the system is not flexible enough to allow me to have only the layers I need. I just wanted to run my containers on multiple machines, balancing the load and automatically restarting on crashes--you know, all the nice things Erlang has already offered for ages.
I feel like Kubernetes is the Erlang ecosystem for the poor or uneducated, who've never heard of other ways, with features poorly copied.
If I really needed to bring a system to multiple servers and scale and load balance, I'd rather look into something like Nomad. It seems much simpler, also offers load balancing over multiple machines, and can run Docker containers and normal applications as well; plus I was able to set it up in less than an hour or so, with two servers in the system.
You absolutely can use just JSON with Kubernetes and not YAML. The K8s backend services store configuration as JSON and the API protocols use JSON. There's even a K8s configuration management tool called ksonnet that uses an extended, JSON-like language with full programmability, instead of the template mess of Helm charts.
What I can tell you, is that the unbelievable bloat in the complexity of our systems is going to bite us in the ass. I'll never forget when I joined a hip fintech company, and the director of eng told us in orientation that we should think of their cloud of services as a thousand points of light, out in space. I knew my days were numbered at exactly that moment. This company had 200k unique users, and they were spending a million dollars a month on CRUD. Granted, banking is its own beast, but I had just come from a company of 10 people serving 3 million daily users 10k requests a second for images drawn on the fly by GPUs. Our hosting costs never exceeded 20k per month, and the vast majority of that was cloudflare.
Deploying meant compiling a static binary and copying it to the 4-6 hardware servers we ran in a couple racks, one rack on each side of the continent. We were drunk by 11am most of the time.
Today, it's apparently much more impressive if you need to have a team of earnest, bright-eyed Stanford grads constantly tweaking and fiddling with 100 knobs in order to keep systems running. Enter kubernetes.
> is that the unbelievable bloat in the complexity of our systems is going to bite us in the ass.
My favorite example of this right now is Vitess. Sure, it's a beautiful piece of technology. But, for a usecase my company is looking at, we'll be replacing one (exceptionally large) DB with in excess of 80 mysql pods, managed by another opaque-through-complexity system running on the top of kubernetes (which already bites us regularly even though it's "managed").
The complexity and failure scenarios makes my head ache, even though I should never have to interact with it myself.
Oh, and my current favorite PITA - having to change the API version of deployment objects from 'v1beta1' to 'v1' in over 160 microservice charts as part of a kubernetes version upgrade. Helm 2 doesn't recognize the deployments as being identical, so we also have to do a Helm 3 migration as well, just to avoid taking down our entire ecosystem to do the API version upgrade. Wheeee!
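For context, the change itself is tiny per chart (Deployments moved from the beta API groups, removed in Kubernetes 1.16, to apps/v1, which also makes spec.selector mandatory); it's repeating it across ~160 charts, and keeping Helm's stored release state in sync, that hurts. Roughly, with placeholder names:

# before (rejected by clusters >= 1.16):
#   apiVersion: extensions/v1beta1
#   kind: Deployment
# after:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: some-service                    # placeholder
spec:
  selector:                             # required in apps/v1, often omitted under the old groups
    matchLabels:
      app: some-service
  template:
    metadata:
      labels:
        app: some-service
    spec:
      containers:
        - name: some-service
          image: registry.example.com/some-service:1.0.0   # placeholder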
> Oh, and my current favorite PITA - having to change the API version of deployment objects from 'v1beta1' to 'v1' in over 160 microservice charts as part of a kubernetes version upgrade.
How is this a problem unique to Kubernetes? Don't you have to make similar changes when upgrading a library or dependency that was in beta?
Persistent volumes rely on NFS (or a flavor thereof), which is not great for database performance.
But that's a moot point anyways, since Vitess doesn't use persistent volumes - it reloads the individual DBs from backups and binlogs when a pod is moved or restarted.
> Persistent volumes rely on NFS (or a flavor thereof), which is not great for database performance.
NFS is an option, but it's not the only option. If you need locally attached storage you can use local PVs, which went GA in Kubernetes 1.14, or any of the plethora of volume plugins that exist for various network storage solutions.
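A local PV is just another PersistentVolume object pinned to the node that owns the disk; a minimal sketch, with hostname, path and size as placeholders:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-ssd1
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1                # pre-provisioned disk on the node
  nodeAffinity:                          # required for local volumes: pin to the node that has the disk
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-1                 # placeholder node name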
I sincerely beg your forgiveness for forgetting about a feature that’s only about a year old, and a feature we can’t take advantage of in our own kubernetes environment, or with the Vitess cluster.
I work at a fintech that runs a rails monolith on k8s and you would probably puke at how cheap our hosting cost is given our AUM. Engineering is a funny thing.
I'm curious how long ago you were at that company serving 3m customers a day. I have not been in the industry very long, so I don't really know what things were like >5 years ago, and don't want to sound as if I'm pretending to be an expert.
That said, a couple thoughts that came to mind:
1. having only 4 servers in 2 locations serving 3m customers a day seems crazy to me, at least in the context of current practices regarding highly available systems.
2. not sure your cost comparisons are fair, in the first case you're talking about cloud costs (so including hardware, 3rd party services/api fees, etc), but in the second you're just talking hosting fees.
If your first company had a relatively static, hardware-heavy (gpus doing most of the work) workload, easily handled by a few servers -- then it would be crazy to pay for a cloud provider. And it wouldn't make much sense to bother with k8s or containers either (imo).
On the other hand, if the more recent company has a dynamic/spikey, software-heavy workload, with a ton of different services, orders of magnitude more infrastructure, and (being fintech) much more demanding SLAs... then it might make a lot of sense to use a cloud provider and take advantage of k8s. Especially if you're a start up that doesn't have the time/expertise to deal with datacenter design.
I agree that there's a lot of unnecessary fixation on the latest and greatest these days, but there are definitely situations where kubernetes can be very valuable.
Good questions, and you are onto something here for sure. This wasn't too long ago, around 2014-2015.
This was all for a weather radar app, and you are correct, there really weren't any SLA's, but we had to handle very high loads. We did make use of cloud services for some pieces of the system (there was a database and a small API for some minor bookkeeping, mostly around users). I included those costs in my estimate of monthly expenses. We had lots of caches, for all our JSON and for things like user authentication, which saved us from having to really figure out the database side. The caches were typically push-based, so we didn't let user requests get to the disc, if we could help it.
The vast majority of requests were for those images though, which required moving lots of clumsy geographic data into the GPUs to render map tiles (at high-def and high zooms as well), so the requests were still somewhat costly to serve, even if they didn't hit a database. We were able to get away with a small footprint in the datacenter by making heavy use of CDN caching. Cache lifetimes for the latest weather images were often measured in seconds, and getting those timings right was crucial. Screwing up cache lifetimes would rapidly swamp the system with requests, but the software was good at continuing to keep latency low under heavy load, and degrading gracefully. In fact, the vast majority of bandwidth usage in the datacenters was actually not requests, but streaming geographic data from various government sources. We regularly had 50-100MB/s coming in, and we stored all of it in memory. The GPU machines had 100-200GB of memory, and we used all of it. We had to cycle through that memory pretty rapidly as well, so making sure allocations were low and memory was freed up on time was important.
It may not sound like we had much redundancy, but with all the caches, and each machine being quite powerful, it was better than it sounds in that regard. We often took machines in and out of nginx. The way the graceful degradation worked, we would prioritize the imagery from higher zoom levels (more zoomed out), so the worst that would happen on a typical day is that some very zoomed-in images, in places few people were looking, might be slow or time out.
So, in the end, you are correct, the situations are different. The bank had to store things for a lot longer, and had to uphold more stringent SLA's and the like. That said, I still think they were flushing a lot of cash down the toilet, and making things over complicated :).
Thanks for the detailed reply, I found it very informative. I work on a small team developing 'private cloud' infrastructure for a large company, so I usually find myself on the opposite side of the arguement... trying to highlight the virtues of on-prem hardware and the downsides of 3rd party cloud providers.
We've had to work very hard to allow developers/SRE/ops folks to be able to provision VMs and bare-metal machines in our datacenters the same way they would in the cloud provider that we use. Obviously it's not as fast, seamless or feature-rich as it is with AWS/GCP/Azure et al., but I'm proud of the progress we've made.
What really kills me though, is that a huge chunk of our engineers seem to think our work is a complete waste of time in the first place. We have several physical dcs, and tens of thousands of machines... but since most engineers don't have to think about costs, or about workloads other than their own, they think of us as out of touch and clinging to the past.
Nothing worse than getting snark about our platform from an SRE who spends their days in a web app gluing together the ready-made services of Google and Amazon while acting as if they're building the world of tomorrow :)
Wow, sounds like really challenging work. I still think those skills will be valuable for a long time, especially since they are becoming more rare. I'm curious how you prepared for that job, did a specific degree help? I have only learned what was necessary to solve the problems I've been given, and picked things up from others on the job.
Network effects. Kubernetes is getting into the "no one gets fired for choosing it" category. Whatever its flaws, you can be sure it'll be around in 10 years (or more sure than with other tech). With cloud providers creating flux in their offerings, who knows if the locked-in, simpler alternative will be discontinued in 10 years, but at least with k8s you can move if they stop supporting it.
Does there exist a solution that does what kubernetes does but without the complexity?
If such a tool does not exist do any of you feel that the creation of such a tool is within the realm of possibility?
I would imagine all these knobs could have default configurations that 99% of all users would be okay with, and that a knob should only be exposed in a small number of cases.
It sounds like your first company did one thing well, and the second one was trying to do a lot of different things. So while volume was lower, complexity was higher.
Don't get me wrong. I'd still probably build that as a monolith in Java instead of a thousand NodeJS services, but I can see how you end up with Kubernetes.
Nah, we already had orchestration systems (a la Chef, Ansible, CFEngine) and VM images to manage most of that complexity. Worst case scenario, there were always works-100%-of-the-time-80%-of-the-time shell scripts.
I remember back in the day we were deploying a production video streaming server at a customer site, and a team from IBM deployed their analytics suite alongside us. We had over 8Gbps coming out of a single 1U server. The IBM team installed a separate server for every Java process that their application needed, which was something like 8. It was ridiculous.
I believe Oracle, IBM, and some others still charge per core/cpu for various products. It's a nice business model to be in if you have locked in customers.
Your argument is about complexity but it doesn’t really speak to kubernetes itself. You can have some dead simple architecture on top of k8s. Hosting your own k8s though, that’s pretty complex, but get that working (or have your cloud provider just host a cluster for you) and things can be pretty darned simple.
> I'll never forget when I joined a hip fintech company, and the director of eng told us in orientation that we should think of their cloud of services as a thousand points of light
Let's be real, if you are old enough to get that reference without Googling, you probably would not have lasted that long at a hip fintech company anyways :-P
tl;dr - Kubernetes is a good tool, but it has been marketed and evangelized to where it is today; its meteoric rise is not organic.
I am a huge Kubernetes fan, and think that it is a good and necessary tool with little accidental complexity (most concepts are there because you will likely need them and/or they address a valid concern), but my position is that the growth of Kubernetes has not been organic -- it's been heavily promoted and marketed and pushed to where it is today.
Let's compare with a project like Ansible: first released in 2012[0], with the first AnsibleFest in 2016[0]. Ansible is a very useful abstraction/force multiplier for doing ops. If a dedicated conference is a measure of community/enthusiasm reaching a fever pitch, it took 4 years for Ansible to reach critical mass. Kubernetes had its first KubeCon in 2015[1], ONE year after its initial release in 2014[2]. Did it reach critical mass 4x quicker than Ansible? Maybe, but I think the simpler explanation is that the people who want Kubernetes to succeed know that creating buzz and the appearance of widespread adoption and community is more important than it actually being there, as it becomes a self-fulfilling prophecy. Once you have enough onlookers, people motivated to work on open source (i.e. give away labor, time and energy for free) will come improve your project with you, serve as an initial user base, and become your biggest promoters, all the while strengthening your ecosystem.
Another interesting side to this is how thoroughly Kubernetes seems to be crushing its competition -- DC/OS (Mesos), Nomad and the other competition are not fighting a functionality war, they're fighting a marketing war. DC/OS and Nomad are not obviously worse in function, but certainly don't compare when you consider ecosystem size (perceived, if not actual) and brand. It's a winner-take-most scenario and tech companies are particularly good at seizing this kind of opportunity. Of course, if you compare the resources of the entities backing these projects, it's clear who was going to win the marketing war.
In a world of free tiers as a good way to get people locked in, developer evangelists who build essentially propaganda projects (no matter how cool they are), and shrinking attention spans, Kubernetes is a good tool which has marketed itself to greatness. In its wake there are efforts like the CNCF, which I struggle to characterize because it's hard to differentiate their efforts to standardize from an effort to bureaucratize. I'm almost certainly blinded by my own cynicism, but most of this just doesn't feel organic. Big, useful open source software gets world-renowned after years/decades of being convenient/useful/correct/etc., but Kubernetes (and other projects given the CNCF gold star) seem to be trying to skip this process, or at least bootstrap a reputation out of the gate.
DevOps traditionally moved much slower -- I can remember what seemed like an age of "salt vs ansible vs chef", with all three technologies having had lots of times to prove themselves useful. Even the switch to containers instead of VM/user based process isolation took more time than Kubernetes has taken to dominate the zeitgeist.
The arguments consistently used to market software:
1. It's portable
2. It's fast
3. It's declarative
4. It's fun / productive / easy
5. It's safe / automatic
6. It's an integrated framework
The opposites are also used to detract competitors.
The idea of k8s is that it will be portable to all hosting providers and Linux distributions, as opposed to developing shell scripts for Red Hat, especially across multiple versions. I don't think it's easy or fun or fast.
Because everyone chases the newest, shiniest thing in tech, and it's not cool nor fun to make boring old stuff in C then copy one binary and maybe a config to the server.
Even if one does have a single binary and config file that one can just copy to a server and run, there's more to non-trivial deployments than that. For example, how do you do a zero-downtime deployment where you copy over a new binary, start it up, switch new requests over to the new version, but let the old one keep running until either it finishes handling all requests that it already received or a timeout is reached? One reason why Kubernetes is popular is that it provides a standard, cross-vendor solution to this and other problems.
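A minimal sketch of how Kubernetes expresses that hand-off in a Deployment (names and image are placeholders): new pods must pass their readiness probe before old ones are retired, and the grace period bounds how long an old pod may keep draining in-flight requests:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0                  # never dip below current capacity
      maxSurge: 1                        # bring up one new pod at a time
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      terminationGracePeriodSeconds: 60  # time allowed for an old pod to finish in-flight requests
      containers:
        - name: web
          image: registry.example.com/web:1.0.1   # placeholder
          readinessProbe:
            httpGet:
              path: /healthz             # assumes the app exposes a readiness endpoint
              port: 8080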
Most web applications don't need any of that. Also, I didn't say k8s was useless, just that it's the new thing everyone wants (that they probably don't need).
Then you need to add management of storage for it, management of logs, integration of monitoring, healthchecks, maybe a multiple-environment setup because a UAT is a good thing to have, etc. etc.
That's more "basic farm vehicle / lorry" than Ferrari.
You always have those concerns, they're just implemented differently. A customer hurling abuse at you over the phone (or worse - in person) is a form of healthchecks and monitoring, albeit worse than the all-too-common "have someone log in to the server every day and check if it's alive".
So is frantically logging into the server to manually truncate log files that filled your one and only disk volume and caused the above abuse.
So is "we're losing customers because of how slow it is" while not having a single idea why it is slow, because it runs fast when the dev checks on their laptop.
All of the above are based on actual real-world events, sometimes involving large corporations. In fact, the large corps seem to have the most issues with manual work, because they can afford throwing cannon fodder ^W^W "experienced engineers" at the problems.
At some point it becomes a question of what is a good use of your time. I disagree heavily with people claiming that running Kubernetes is somehow orders of magnitude more complex than anything else (especially with k3s and non-etcd backing stores). The complexity is necessary complexity, which you can tackle in various ways, including YOLO.
Sometimes the YOLO approach bites in the worst moment, though, and spending time on bespoke scripts, or figuring out configuration drift, are all costs that show up as you tackle said complexity.
Personally, the reason I went with Kubernetes in the first production deployment I did with it, after being a vocal anti-Docker person at work, was... cost efficiency. Both in terms of my time (even though we had to spend a significant amount of time migrating, as it was a lift & shift of existing software), and in terms of compute costs - thanks to heavily loaded nodes, our worst compute bill never went above 20% of the previous "condition normal". I don't think we ever really had more than 10 servers on purpose. Using k8s paid for itself.