One of the issues I've often seen is that my teammates send the "right command" to the wrong cluster or context. We have a bunch of clusters, and it's always surprising to see laptop deployments land on ... the production cluster.
I'm lazy and I don't like having to remember "the right way" to run something, so my solution is directories and wrappers. I keep a directory for every environment (dev, stage, prod, etc) for every account I manage.
I keep config files in each directory. I call a wrapper script, cicd.sh, to run certain commands for me. When I want to deploy to stage in account-b, I just do:
~ $ cd env/account-b/stage/
~/env/account-b/stage $ cicd.sh deploy
cicd.sh: Deploying to account-b/stage ...
The script runs ../../../modules/deploy/main.sh and passes in configs from the current directory ("stage") and its parent directory ("account-b"). Those configs are hard-coded with all the correct variables. It's impossible for me to deploy the wrong thing to the wrong place, as long as I'm in the right directory.
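Roughly, such a wrapper could look like the sketch below. This is a minimal reconstruction from the description, not the actual script; the per-directory file name config.env and the module-argument convention are assumptions.

#!/usr/bin/env bash
# cicd.sh -- hypothetical sketch. The module path matches the description
# (../../../modules/<module>/main.sh); "config.env" is an assumed name.
set -euo pipefail

module="${1:?usage: cicd.sh <module> [args...]}"; shift

environment="$(basename "$PWD")"             # e.g. stage
account="$(basename "$(dirname "$PWD")")"    # e.g. account-b
root="$(cd ../../.. && pwd)"                 # where modules/ lives

echo "cicd.sh: Deploying to ${account}/${environment} ..."

# Every variable is pinned in these per-directory files, so the target is
# decided entirely by the directory you are standing in.
"${root}/modules/${module}/main.sh" \
  "$(dirname "$PWD")/config.env" \
  "${PWD}/config.env" \
  "$@"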
I use this model to manage everything (infrastructure, services, builds, etc). It has saved my bacon a couple of times: I might have my AWS credentials set up for one account (export AWS_PROFILE=prod) while trying to deploy nonprod, and the deploy immediately fails because the configs have hard-coded values that don't match my environment.
Very interesting solution to the problem. Pretty much everyone has their $PS1 set to show the current working directory, because the desire to know the implicit context of our commands ($PWD) has existed since the dawn of computing. Since then, we've added a lot of commands that have an implicit context, but we haven't updated our tooling to support them. That's a big problem, but I like your solution -- make the kubernetes context depend on the working directory, which your shell already prints out for you before every command.
(If I were redoing this all from scratch, I would just have my interactive terminal show some status-information above the command after I typed "kubectl "; the context, etc. That way, you know at a glance, and you don't have to tie yourself to the filesystem. And, this could all be recorded in the history, perhaps with a versioned snapshot of the full configuration, so that when this shows up in your history 6 weeks later, you know exactly what you were doing.)
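Something in that spirit can be approximated today with a shell hook. This is a rough sketch using zsh's preexec; it prints the status line just before the command runs rather than while you type, and it doesn't cover the history-snapshot idea.

# ~/.zshrc -- rough sketch: print the context and namespace just before
# any kubectl/k command actually executes.
preexec() {
  case "$1" in
    kubectl*|k\ *)
      echo "ctx: $(kubectl config current-context 2>/dev/null)," \
           "ns: $(kubectl config view --minify -o jsonpath='{..namespace}' 2>/dev/null)"
      ;;
  esac
}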
With that in mind, I do feel like the concept of an "environment" has been neglected by UI designers. I never know if I'm on production, staging, private preview, or what; either for my own software, or for other people's software. (For my own, I use "dark reader" and put staging in dark mode and production in unmodified mode. Sure confuses people when I share my screen or file bug reports, though. And, this only works if you have exactly two environments, which is fewer than I actually have. Sigh!)
That's a great idea. As long as you have that from the design stage, it's very cool; moving existing infra to support the idea is hard and quite a nightmare. In our new clusters, we apply the idea you've shared.
Agree, this is a huge pain point when dealing with multiple clusters. I wrote a wrapper for `kubectl` that displays the current context for `apply` & `delete` and prompts me to confirm the command. It's not perfect, but it's saved me a lot of trouble already — but encouraging other members of the team to have a similar setup is another story.
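A wrapper along those lines might look like this. It's a sketch, not the commenter's script; adjust the list of guarded subcommands to taste.

# ~/.bashrc -- sketch of a confirmation wrapper for apply/delete
kubectl() {
  case "$1" in
    apply|delete)
      local ctx
      ctx="$(command kubectl config current-context)"
      read -r -p "Run 'kubectl $*' against context '${ctx}'? [y/N] " answer
      [[ "$answer" == [yY] ]] || { echo "Aborted."; return 1; }
      ;;
  esac
  command kubectl "$@"
}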
My approach was to have a default kubeconfig for dev/QA environments, and a separate one for production. I had a quick wrapper script to use the prod config file - it would set the KUBECONFIG env var to use the prod file and update my PS1 to be red, a clear differentiator that reminds me I'm pointed at prod.
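A sourceable sketch of that idea (the kubeconfig path and file name are assumptions):

# kprod -- source this ('. kprod') to point the current shell at prod
# and paint the prompt red as a reminder.
export KUBECONFIG="$HOME/.kube/config-prod"        # assumed file name
export PS1="\[\e[1;41m\] PROD \[\e[0m\] $PS1"      # red banner
# Open a fresh shell (or unset KUBECONFIG) to go back to dev/QA.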
Not a perfect solution, but I add a prompt signaling both my current namespace and cluster, along with some safeguards for any changes to our production environment. In practice I haven't ever deployed something to production by mistake.
I use a custom-written script, but I've used this one in the past - it's pretty nice.
I have a prompt display as well, but to my own dismay, earlier this year I applied some QA config to a prod system. (It did not cause substantial harm, thankfully.) After that, I changed my prompt so that the names of production regions are highlighted with a red background. From what I can tell, that really helps in moments of diminished attentiveness.
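In bash, that kind of prompt can be sketched like this; the substring match on "prod" and the prompt layout are assumptions, not the commenter's setup.

# ~/.bashrc -- sketch: show the kubectl context in the prompt and give it
# a red background whenever the name contains "prod".
__kube_prompt() {
  local ctx
  ctx="$(kubectl config current-context 2>/dev/null)"
  if [[ "$ctx" == *prod* ]]; then
    PS1="\[\e[41;97m\][${ctx}]\[\e[0m\] \w \$ "
  else
    PS1="[${ctx}] \w \$ "
  fi
}
PROMPT_COMMAND=__kube_prompt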
We partially resolve this by having different namespaces in each of our environments. Nothing is ever run in the 'default' namespace.
So if we think we're targeting the dev cluster and run 'kubectl -n dev-namespace delete deployment service-deployment' but our current context is actually pointing to prod then we trigger an error as there is no 'dev-namespace' in prod.
Obviously we can associate specific namespaces with contexts and bypass this safety net, but it can help in some situations.
direnv is our magic sauce for this.
We enforce that all devs store the current context in an environment variable (KUBECTL_CONTEXT), and define the appropriate kubectl alias to always use that variable as the current context. To do stuff in a cluster, cd into that cluster’s directory, and direnv will automatically set the correct context. I also change prompt colors based on the current context.
(This way, the worst you can do is re-apply some yaml that should’ve already been applied in that cluster anyway)
We also have a Makefile in every directory, where the default pseudo-target is the thing you want 99% of the time anyway: kustomize build | kubectl apply -f -
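Concretely, the setup could look roughly like this; the context name, directory layout, and error message are examples rather than our exact files. The prompt-color change mentioned above can hook on the same variable.

# clusters/prod-eu/.envrc -- direnv exports this when you cd into the dir
export KUBECTL_CONTEXT=prod-eu

# ~/.bashrc -- kubectl always goes through the variable, and refuses to
# run at all if it isn't set (i.e. you're not in a cluster directory)
alias kubectl='kubectl --context="${KUBECTL_CONTEXT:?cd into a cluster directory first}"'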
This approach allows the convenience of short, context-free commands without compromising safety, because the context info in the shell prompt can be relied on, due to the isolation.
There are some things which don't work well inside a docker container (port-forwarding for example), but it does make it simple to have isolated shell history, specific kubectl versions, etc.
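For reference, the containerized variant looks roughly like this; the image, tag, and mount paths are assumptions, and an interactive image would be needed for the isolated-history part.

docker run --rm -it \
  -v "$PWD/kubeconfig:/tmp/kubeconfig:ro" \
  -e KUBECONFIG=/tmp/kubeconfig \
  bitnami/kubectl:latest get pods
# Pin a version tag that matches the cluster; if the image runs as a
# non-root user, the mounted kubeconfig must be readable by that user.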
When I was running the internal k8s clusters at a previous workplace, I simply got into the habit of compulsively running `kubectl config current-context` to check which one of the 50+ clusters I was currently connected to (designated test clusters for "playing with cluster infra", designated clusters for "devs playing around", designated prod clusters, with segregation between "batch-like" and "interactive" workloads, as we needed to treat the nodes differently in those, designated "run the CI/CD pipelines" clusters, as they needed different RBAC, ... and then duplicated across multiple data centres).
Thanks for starting this thread; context is a major hurdle for beginners.
I myself am quite happy with the basics, but I alias k=kubectl and have a set-context helper that, without arguments, displays the current context. Before doing anything, I rename or edit contexts in .kube/config so the target takes a minimal number of characters to type ("proj-prod"). Filtering with -l name= is another help, as are jsonpath and jq. As with database CLI prompts years ago, building up muscle memory also gave me the opportunity to grok the concepts at the same time.
After some attempts with different tooling, I came to like kubernetes for what it can do.
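A sketch of the aliases described above; the helper name, the long context name, and the label are examples.

alias k=kubectl

# set-context helper: no argument shows the context, one argument switches
kctx() {
  if [ $# -eq 0 ]; then
    kubectl config current-context
  else
    kubectl config use-context "$1"
  fi
}

# shorten an unwieldy context name once, then enjoy the short form
kubectl config rename-context gke_myproj_europe-west1_prod proj-prod

# label selector + jsonpath instead of eyeballing long listings
k get pods -l name=frontend -o jsonpath='{.items[*].metadata.name}'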
I used k9s before, and it's an awesome tool. Though it doesn't help when I send a command to my teammate and he just executes it on the wrong cluster. That's the problem I want to solve.
Create a user/role for deleting (or whatever dangerous action) resources in the prod cluster/namespace. Set up RBAC that allows your employees to impersonate that user/role using kubectl --as. This way, if you send your coworkers a command for the dev environment and they try to run it in prod, it will fail because they didn't run kubectl as that impersonated user.
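A minimal sketch of the idea, assuming the developers group otherwise only has read access to prod. All names here (prod-deleter, developers, the prod namespace, the built-in edit role) are examples, and in practice you would scope the rights much tighter.

kubectl apply -f - <<'EOF'
# Who may be impersonated ...
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: impersonate-prod-deleter
rules:
- apiGroups: [""]
  resources: ["users"]
  verbs: ["impersonate"]
  resourceNames: ["prod-deleter"]
---
# ... by whom ...
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: devs-may-impersonate-prod-deleter
subjects:
- kind: Group
  name: developers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: impersonate-prod-deleter
  apiGroup: rbac.authorization.k8s.io
---
# ... and what the impersonated user may do in prod.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prod-deleter-can-edit
  namespace: prod
subjects:
- kind: User
  name: prod-deleter
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit
  apiGroup: rbac.authorization.k8s.io
EOF

# A dev command pasted into prod now fails with Forbidden ...
kubectl -n prod delete deployment service-deployment
# ... unless the operator opts in deliberately:
kubectl --as=prod-deleter -n prod delete deployment service-deployment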
Totally agreed, this is the right way for many problems. Sometimes it's just not possible to deploy the idea: in one of my past workplaces, everyone (even newbies) was given full _root_ privileges -- the idea was to help the team learn from their mistakes (if any), and it's actually a great idea.
I'm glad to hear that this is a more common problem. When sharing kubectl commands, I always specify the --context flag explicitly so the person using it has to manually edit the context name to whatever they are using before running it.
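For example (cluster and namespace names are placeholders), the recipient has to consciously swap in their own context before the command will do anything:

kubectl --context=dev-cluster -n payments rollout restart deployment/api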
I like the spirit of this, but for dealing with multiple clusters kubectx is pretty standard: it always shows you, with highlighting, where you are, and you don't have to type the cluster name in every command. Also, avoiding "kubectl delete" seems like such a narrow case; I can still delete with "k scale --replicas=0" and probably many other ways. At this point you're better off with a real RBAC implementation.
Isn't kubectx the problem, not the solution? You think you're in one context but you're actually in another. You wanted to tear down the dev deployments but you nuked the production ones instead.
Nice list. Learned a couple neat things. Thank you!
Would like to add that my favorite under-appreciated can't-live-without kubectl tool is `kubectl port-forward`. So nice being able to easily open a port on localhost to any port in any container without manipulating ingress and potentially compromising security.
Something this guide misses that is helpful about explain is that it can explain down to primitive types. "k explain po" is great, but "k explain po.spec" will give more details about the spec and its fields. This dot-field pattern can go as deep as needed, like pod.spec.volumes.secret.items
This command describes the fields associated with each supported API resource. Fields are identified via a simple
JSONPath identifier:
<type>.<fieldName>[.<fieldName>]
Add the --recursive flag to display all of the fields at once without descriptions. Information about each field is
retrieved from the server in OpenAPI format.
Use "kubectl api-resources" for a complete list of supported resources.
Examples:
# Get the documentation of the resource and its fields
kubectl explain pods
# Get the documentation of a specific field of a resource
kubectl explain pods.spec.containers
Options:
--api-version='': Get different explanations for particular API version (API group/version)
--recursive=false: Print the fields of fields (Currently only 1 level deep)
Usage:
kubectl explain RESOURCE [options]
Use "kubectl options" for a list of global command-line options (applies to all commands).
In case you work a lot with k8s, you can also take a look at k9s; I highly recommend it. It can save a lot of typing, especially for quickly checking what pods/deployments are running, executing a command in a pod, describing a resource to understand why it failed, changing cluster/namespace, and so on.
Along those lines, this was an interesting statement:
"You should learn how to use these commands, but they shouldn't be a regular part of your prod workflows. That will lead to a flaky system."
It seems like there's some theory vs. practice tension here. In theory, you shouldn't need to use these commands often, but in practice, you should be able to do them quickly.
How often is it the case in reality that a team of Kubernetes superheroes, well versed in these commands, is necessary to make Continuous Integration and/or Continuous Deployment work?
For the read-only commands, you can obviously use them as much as you want; the issue is with the write commands. I see them as tools for troubleshooting (e.g., you are adding a debugging pod, not changing the running system) and for emergency work that would be faster on the command line than running the CI/CD pipeline, but the final state needs to be in sync with the code (tools like ArgoCD help with this), otherwise it's a mess.
All that is fine and dandy until you run a command and it spews a serialised Go struct instead of a proper error. And, of course, that struct has zero relationship to what the actual error is.
I'm pretty sure if you have the time to make a PR to fix it, it would be welcome. But I'm guessing it's non trivial or it would have been fixed by now - probably a quirk of the code generation logic.
Wow, I’d forgotten about this. The reason no one has fixed it is partially because I didn’t do a great job of describing what the fix was I expected to see (clarified now). Reopened and will poke folks to look.
> I'm pretty sure if you have the time to make a PR to fix it, it would be welcome.
Google had a net income of $17.9 billion in just Q1 of 2021.
I believe they have the resources to fix that, and I will not be shamed into "if you have time, please open a PR towards this opensource project".
> But I'm guessing it's non trivial or it would have been fixed by now - probably a quirk of the code generation logic.
I remember the "quirks of generation logic" being used as an excuse for Google's horrendous Java APIs towards their cloud services. "It's just how we generate it from specs and don't have the time to make it pretty".
For the life of me, I can't find the GitHub issue that called this out. Somehow their other APIs (for example, .NET) are much better.
Typically in my experience, the further you get away from AdWords, the more broken Google's client libraries are.
I recall a little more than half a decade ago settling on the PHP version of their Geocoding API client library for a project because it was the only one whose documentation matched how you were actually supposed to authenticate.
Fortunately K8s is _not_ a Google owned project. It's managed by the CNCF which spans many different companies. Yes, there are a lot of Google people involved, but it really is a community project. Maybe I'm being naive but that's how I see it at least.
According to Wikipedia, though, "Founding members include Google, CoreOS, Mesosphere, Red Hat, Twitter, Huawei, Intel, Cisco, IBM, Docker, Univa, and VMware." [1]
Ah yes. I just love that free community spirit. The top 10-15 contributors are all paid to work on this by Google, Red Hat, Microsoft, VMware, Goldman Sachs (and I couldn't be bothered to check the others).
That is, 18 billion net income last quarter, 15 billion net income last quarter, 141 million net income last quarter, 6 billion last quarter...
These ginormous corps solve their own problems under the guise of open source, and gullible developers fall for the community promise.
For the port it's trivial: `kubectl get pod <yourpod> --output jsonpath='{.spec.containers[*].ports[*].containerPort}'`, or if you don't remember the JSONPath, just `k get pod <yourpod> -o yaml | grep -i port`.
For the IP address, why do you need it? With k8s DNS you can easily find anything by name.
I have been using kubectl + zsh for quite a while.
But now my choice is Intellij (or other IDEs from JetBrains) + Lens, which I find more productive and straightforward (more GUI, fewer commands to memorize). Here's my setup and workflow:
1. For each repository, I put the Kubernetes deployment, service configurations, etc. in the same directory. I open and edit them with Intellij.
2. There's also a centralized repository for Ingress, Certificate, Helm charts, etc., which I also open with Intellij. Spending some time organizing Kubernetes configs is really worth it; I'm working with multiple projects and the configs get overwhelming very quickly.
3. Set shortcuts in Intellij for applying and deleting Kubernetes resources for the current configs, so I can create, edit, and delete resources in a blink.
4. There's a Kubernetes panel in Intellij for basic monitoring and operations.
5. For more information and operations, I use Lens instead of Intellij. The operations are very straightforward; I can navigate back and forth and tweak configurations much faster than I could with shell commands alone.
Every time I'm starting a new service to run internally, or reviewing something we have running, I find myself struggling to find the right instance type for our needs.
For instance, there are three families (r, x, z) that optimize RAM in various ways in various combinations and I always forget about the x and z variants.
So I put together this "cheat sheet" for us internally and thought I'd share it for anyone interested.
I would add one more important point about kubectl:
If you don't work at Google, you don't need the complexity of kubernetes at all, so better to forget everything you already know about it. The company would be grateful.
Joking aside, trying to sell the masses something that could potentially benefit only 0.001% of projects is just insincere.
Kubernetes is much simpler than what we would have to build without it, and my team is much, much smaller than anything at Google. For what it does, it offers some good opinions for what might otherwise be a tangle of devops scripts.
If what you want to deploy is best described as “an application” it’s probably not the right tool for the job. If what you want to deploy is best described as “50 interconnected applications” it’s probably going to save you time.
> If what you want to deploy is best described as “an application” it’s probably not the right tool for the job. If what you want to deploy is best described as “50 interconnected applications” it’s probably going to save you time.
This is an excellent way of looking at it. I've struggled for many years to come up with a response to hacker news comments saying you don't need kubernetes, but this sums it up about as well as I could imagine.
Maybe so, but anyone should definitely use more criteria than my few word generalization to choose their deployment infrastructure. :)
We (mostly) chose k8s over other solutions because of other tools/providers in the ecosystem that made business sense for us. But we did need something to abstract our deployment complexity.
I’m mostly suggesting that I suspect many of the people with bad k8s experience didn’t really need it.
I’ve seen a number of people wrap a simple application in a container, slap it in a deployment/service/ingress and call it a day, it works, but using k8s that way doesn’t really add much value.
K8s is an enormously complex piece of software and I haven't met a great many people who "know" it inside and out.
Basic concepts and how to write a job/service/ingress, sure. Knowing the internals and how to operate it? I'd say that's only for specialists. Most people don't need to know what a Finalizer is or does. Most people aren't going to write operators.
It is a multi-year investment of time to deeply understand this tool and it's not necessary for everyone.
Except with the kernel, you only have to be familiar with the system calls and you don't need a team of people just to run, maintain and upgrade the kernel.
That and it tries to make breaking changes on the timescale of decades rather than every other minor release (so, once or twice a year?).
> Except with the kernel, you only have to be familiar with the system calls
I think it's safe to assume that any non-trivial use of linux involves non-default configuration.
> you don't need a team of people just to run, maintain and upgrade the kernel.
My relatively small company employed linux admins before we adopted (on-prem) kubernetes. Their work has changed a bit since then, but it isn't meaningfully more laborious.
I assume that less effort is required for cloud kubernetes offerings.
My whole point is that they're not really comparable from a level of effort perspective, despite claims.
Hosted Kubernetes isn't significantly easier either, as every host is offering you different things as "Kubernetes" and has different ways that you will need to manually intervene to overcome problems.
I'm only telling you this from experience, being years down the rabbit hole already.
I also speak from experience, from an organization that has had a lot of success with kubernetes. Perhaps we're in the sweet spot where our workload is suited for it but there still isn't a huge amount of complexity in maintaining it.
Modern istio provides a lot of value to a single application. mTLS security, telemetry, circuit breaking, canary deployments, and better external authentication and authorization. I’ve seen each done so many different ways. Nice to do it once at the mesh layer and have it be done for everything inside the cluster.
This is getting downvoted for cynicism maybe, but I feel it's the most important advice here. Know /when/ to use Kubernetes.
It's very often the wrong tool for deploying our tiny app, but many of us go along with it because it ticks some management boxes for various buzzwords, compliance, hipness, or whatever. Once you pull out this hammer factory, it's a big and complicated one, so you will probably need a full-time team to understand and manage it. It's also a metric hammer factory, so you'll need to adapt all your other tooling to interoperate. Most of us can get by with lesser hammer factories; even k3s is less management.
If you just need to deploy some containers, think hard if you want to buy the whole tool factory or just a hammer.
This kind of comment is on every single HN post about Kubernetes and is tiresome. I also think it's off topic (TFA is about kubectl tricks, not about the merits of K8s).
I think it's important to have comments like those as Google, who does not use Kubernetes, is exerting a lot of pressure on the industry to adopt it. It is an extremely complicated tool to learn to use well and companies act like there aren't reasonable alternatives.
Those of us who have gone through it are often coming back with war stories saying to use something else. Some of us have invested thousands of man hours into this already and have strong opinions. At the very least, give Nomad a look. It is maybe a tenth of the effort to run for exactly the features most people want and then some.
People need to be made aware that there are options. I have friends at companies that have large teams just dedicated to managing Kubernetes and they still deal with failure frequently or they spend their entire day-to-day tuning etcd.
We get paid because we know these tools. It's why we're desired: because the company thinks they want K8s or they're one foot in EKS and they're doubling down. We don't get hired because we dare to suggest they dismantle their pilot cluster and take a sharp turn into Nomad.
Most of us aren't the engineering heads of our departments. So you'll forgive us if we continue pushing the moneymakers we have in our heads and setting up our homelab clusters. I want to be paid, I want to be paid well. It may as well be pushing the technology stack that scales to megacorps because who knows maybe I'll make it there one day.
So I wrote this https://github.com/icy/gk8s#seriously-why-dont-just-use-kube... It doesn't come with any autocompletion by default, but it's a robust way to deal with multiple clusters. Hope this helps.
Edit: Fix typo err0rs