
In the opinion of a last-semester CS student who has never written an application from scratch that needed more than a SQLite DB (so take me with half a grain of salt), it seems like premature optimization, though constantly talked about, is still very common. I see people talking about using Kubernetes for internal applications and I just can't figure out why. If it's a hobby project and you want to learn Kubernetes, that's a different situation, but in the case of making a real application that people will use, it seems like a lot of us can get away with a single DB, a few workers of our app, and maybe a cache.

I'm speaking out of little experience though. I just think that a lot of us can get away with traditional vertical scaling and not think too much about it.




In big enough organizations, it is very easy to lose track of who owns what, especially when it is those little ad-hoc internal tools. Manually managing the infrastructure for them is a recipe for them to become permanently enshrined in the wasteland of "services we think we use, but do not maintain because we don't remember who needed it or put it up or how to configure it".

K8s isn't the only answer, but if you are already using it for your large applications, it isn't much work to reuse the existing tooling and infrastructure, and now you at least have the Dockerfile as a reference if nothing else.

OTOH, if you have an existing tooling setup / pipeline that is not K8s, there isn't a good reason to bring K8s in for a small application.


Docker and k8s adoption can force a company that has, over the years, developed and perfected a standard practice for deploying applications into a half-assed solution that ends up solving the wrong problems and costing way more. The "shipping beats perfecting" mantra is very much at play here, due to the amount of time it would take to achieve parity. At the end of the day, the new solution ends up looking like a step back.

Such a practice, combined with the mentality that software engineers should do their own dev-ops, can easily lead to an environment of spaghetti applications where every developer working on the new platform does things slightly differently, because the replacement solution wasn't complete and had to be patched with countless band-aids by engineers across the talent spectrum.

Furthermore, for the features that were able to achieve parity, you're now forcing your entire organization to re-learn the new development process. The docs are abysmal, and the software engineers who developed the original solution have moved on, since only the hard work of "productionization" remains and they're not interested in that.


> Docker and k8s adoption can force a company that has, over the years, developed and perfected a standard practice for deploying applications into a half-assed solution that ends up solving the wrong problems and costing way more.

Docker and k8s adoption can also force a company that has, over the years, developed and perfected a half-assed solution for deploying applications into a single(-ish) source of truth that ends up solving the right organizational problems instead of the wrong technical ones (and costing way more, at least in the short term).


Having a Dockerfile that copies a few binary blobs into an age-old distro image isn't an improvement, it's a huge liability. And most of that stuff that no one knows anything about anymore is like that. Same as with an old VM or PM.

I'd rather have that old crap as a physical machine. Why? Because the hardware lifetime "naturally" limits the lifetime of such applications. If the hardware dies, it forces a decision to spend some money to keep it running or throw it away, which, given that hardware is expensive, usually results in throwing it away.


Set up your Dockerfile to be part of your CI so that your binary blobs are built from source with regularity? That's typically the solution I've seen work well. Manually maintained stuff (especially stuff that isn't the thing everyone is primarily doing) generally doesn't scale well without automation (speaking as someone who's seen organizations grow). This is also true of "getting started" guides. Can't tell you how much maintenance and run time I've saved converting those wikis to Python scripts.
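
For concreteness, a scheduled CI step like that can be a few lines of Python driving the docker CLI (a rough sketch; the image name and registry are hypothetical, and it assumes the docker CLI is available on the CI runner):

    #!/usr/bin/env python3
    """Nightly CI step: rebuild the image from source so no stale binary blobs ship."""
    import datetime
    import subprocess

    IMAGE = "registry.example.com/internal-tool"   # hypothetical registry/name
    TAG = datetime.date.today().isoformat()

    def sh(*args):
        # Fail the CI job loudly if any step breaks.
        subprocess.run(args, check=True)

    # --pull refreshes the base image, --no-cache forces a full rebuild from source.
    sh("docker", "build", "--pull", "--no-cache", "-t", f"{IMAGE}:{TAG}", ".")
    sh("docker", "push", f"{IMAGE}:{TAG}")

Hook that into whatever nightly trigger your CI already has and the "image nobody can rebuild" problem mostly goes away.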


Yes, of course. That would be ideal. That's what we do for everything we can control.

But as someone in the IT dept., far too often you get some container that was built either by someone who left the company long ago or by an external consultant who got paid to never return. The source code is usually unavailable, and if it is available, it will only build on that one laptop the consultant used. The IT department gets left with the instructions "run it, it was expensive" and "no, you don't get any budget to fix it". That results in the aforementioned containers of doom...

Yes, I'm bitter and cynical. Yes, I'm leaving as soon as I can :)


I hate to raise a seemingly obvious point, but this doesn't seem like a problem with Docker.


In theory that works. In practice it rarely does. Docker et al gained popularity because they made it way more practical for projects to be managed as the world works rather than as it should be, for good or ill. Before Docker it was moving old applications to VMs and before that it was running them in chroot horror shows.


> the hardware lifetime "naturally" limits the lifetime of such applications

Oh, my sweet summer child. Of all the things that "naturally" limits the lifetime of such applications, that is not it. Consider the case of the mainframes running the US banking system, for example.


I can still remember the multinational that wanted to host the web application we developed for them at their data centre. They wanted to charge us (not the business unit, for some odd reason) nearly €20.000 per year to host a RoR web application.

We ended up hosting it ourselves, and in the last year they were paying €100k per year for it, as we would just sell the same setup for each deployment for their customers. They would probably have been better off hosting it themselves.


What you're describing is called resumé driven development. It happens every few years when people want to cash in on trends/buzzwords that people believe will be disruptive to all industries but are just tools to have in the toolbox for most. New tools pop up all the time that fit this mould. Over the past ten years I can think of Hadoop (Big data), MongoDB (NoSQL), Kubernetes, "Serverless" computing, and TensorFlow. While all these tools have legitimate use cases, they are often overused due to marketing or industry hype.

Adding artificial intelligence to your recipe application is unlikely to make any sense, but people do it because they want to have "AI software engineer" on their resumé.


For artificial intelligence, I think it's more often marketing driven development. It's easier to seem disruptive if you claim to have AI in your product. Easier to get funding and have people talk about your company. I feel like it comes more often from business executives than technical people.


You reminded me of the Philips toothbrush with AI. Marketing like this makes the term AI worthless.


I think it's easier to run a small k8s cluster than it is to attempt to recreate a lot of the functionality provided manually, especially if you're running in a cloud where the control plane is handled for you.

It provides unified secrets management, automatic service discovery and traffic routing, controllable deployments, resource quotas, incredibly easy monitoring (with something like a prometheus operator).

Being able to have your entire prod environment defined in declarative yml is just so much better than running something like a mishmash of ansible playbooks.

If your application runs on a single host and you don't really care about SLAs or zero downtime deploys, sure, use some adhoc deploy scripts or docker compose. Any more than that, and I think k8s pays for itself.
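
To make the declarative point concrete, here is a minimal sketch of a Deployment expressed through the official Kubernetes Python client rather than YAML (the image name, labels and resource numbers are invented; normally you'd commit the equivalent YAML manifest):

    from kubernetes import client, config

    config.load_kube_config()  # or load_incluster_config() inside the cluster

    labels = {"app": "internal-tool"}  # hypothetical app label
    deployment = client.V1Deployment(
        api_version="apps/v1",
        kind="Deployment",
        metadata=client.V1ObjectMeta(name="internal-tool"),
        spec=client.V1DeploymentSpec(
            replicas=2,  # "a few workers of our app"
            selector=client.V1LabelSelector(match_labels=labels),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels=labels),
                spec=client.V1PodSpec(containers=[client.V1Container(
                    name="web",
                    image="registry.example.com/internal-tool:1.0",  # hypothetical image
                    ports=[client.V1ContainerPort(container_port=8080)],
                    # the resource quotas mentioned above, per container
                    resources=client.V1ResourceRequirements(
                        requests={"cpu": "100m", "memory": "128Mi"},
                        limits={"cpu": "500m", "memory": "256Mi"},
                    ),
                )]),
            ),
        ),
    )

    # Submitting the object declares the desired state; k8s keeps it running,
    # reschedules it on node failure, and rolls out changes when it's updated.
    client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)

The point isn't the client library; it's that the whole desired state lives in one declarative object instead of a sequence of imperative steps.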


Agreed. What is it that people are doing (or not doing) where a simple managed k8s cluster is more work than the minimum way to do this?

Are teams not even setting up automated builds and just sshing into a box?


For me that's Heroku. I just push my app (RoR) and I'm done. I actually moved it to k8s once for a few months and decided to move it back after I understood how easy Heroku made the ops side of the business.

Note: it’s a side project, 50 business users, $5k annual revenue, 4h per month as target for the time spent on customer support, admin and maintenance. So it’s both easy to pay heroku and important for me to not spend too much time on Ops.


k8s is not simple to learn for legacy admin teams, even those who have some container experience. It is simple to kubespray a semi-working cluster on-prem or to get by on a cloud provider's k8s, but if you need to actually learn how to deal with multiple ingress controllers, a service mesh, multiple storage providers, affinity, GPU-accelerated apps, securing k8s, and the other problems already solved (in myriad ways) in the legacy world, k8s can be regarded as a disruptive interruption in your ability to operate. All this being said, I lead both IoT edge (via k3s) and science (cloud & air-gapped 'vanilla' k8s) platform teams and appreciate the chance to sell k8s and make easy money.


> I see people talking about using Kubernetes for internal applications and I just can't figure out why.

There is benefit in having established platforms for running your code, and this is especially true for large orgs where the people who run the systems are an entirely different group from those that developed or assembled it. And people (+ their skills) are what cost the most money in any business.

It's true that many/most systems don't require a full Kubernetes stack (for instance), but if a critical mass of the business IT is going that way, doing the same with your own systems makes sense from an economies-of-scale PoV.


> There is benefit in having established platforms for running your code

You do know k8s is very new and there's a constant stream of changes and updates to it, etc.? It's not established. It's known, but that's it.


1.0 was released in 2015. There are stable LTS vendors for it.

It's pretty established. And much saner than cobbling together Ansible/Puppet/Chef playbooks for everything.


Saying that 2021's Kubernetes is established because 1.0 was released in 2015 is like saying that 1991's Linux is stable because Unix had existed for 20 years at that point. Kubernetes 1.0 and 1.20 share the same name, design principles and a certain amount of API compatibility, but it's impossible to take a nontrivial application running on 1.20 and just `kubectl apply` it on 1.0. Too much has changed.

Kubernetes is just now entering the realm of "becoming stable". Maybe in five years or so it'll finally be boring (in the best sense of the word) like Postgres.


Of course 1.20 has a myriad of additional features. But the 1.0 concepts are in 1.20; the fundamentals are stable: schedule and run containers, expose them to the external network via a load balancer (or node port).

The declarative aspect is stable. Yes, many people are writing insane go programs to emit templated ksonnet or whatever that itself has a lot of bash embedded, but that's the equivalent of putting too much bash into the aforementioned configuration/orchestration playbooks.


Playbooks are terrible. They are a replacement for expert knowledge of platform tooling. There is no replacement for expertise and knowledge of the platform.

Serious problems are always reduced to understanding the platform, not the playbook. Ansible and the python ecosystem are especially broken. I will _never_ use another playbook to replace mature ssh driven deployments.


Yep, agreed. I found that the active control loops (coupled with the forgiving "just crashloop until our dependencies are up" approach) that k8s provides/promotes are the only sane way to ensure complex deployments. (The concepts could be used to create a new config management platform, but it would be really hard, as most of the building blocks are not idempotent, and making them such usually requires wrapping them and combining them with very I/O and CPU heavy reset/undo operations, blowing caches and basically starting from scratch.)
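
A toy sketch of that control-loop idea (the names and the fixed 5-second interval are arbitrary; real controllers watch the API server and use backoff instead of polling):

    import time

    def control_loop(desired, fetch_actual, apply_change):
        """Keep nudging the world toward the declared state. Failures just mean
        "try again next tick", which is the forgiving crashloop behaviour above."""
        while True:
            try:
                actual = fetch_actual()      # observe
                if actual != desired:        # diff
                    apply_change(desired)    # act: converge toward the desired state
            except Exception:
                pass                         # dependency not up yet: back off and retry
            time.sleep(5)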


Hmm. The expectations and engineering of a platform like Kubernetes mean that 'fail till right' requires a lot of alignment work before it actually works.

There is no silver bullet. Automation and self-healing are selling points, but when they hit engineering they usually turn out to be duds in terms of incorporation into existing environments.

The real novelty would be to generate a declarative description from the customer and provide an in place deployment solution via k8s. That would be the ultimate replacement solution.


I think what they may mean is more that there is an established platform within the organization. One that you have expertise and experience with, monitoring/backup/security tools that work with it, etc. K8s might not have as long a pedigree as VMs, but if you already have a setup to run K8s, people who know how to use it, documentation and tooling that allow devs to run their apps on it securely and efficiently, etc., it's pretty reasonable to want to encourage devs to "just" use k8s if they want to stand up a new service rather than spinning up whatever random collection of technologies that dev happens to know better.


Well, firstly, the core components of k8s are pretty well understood now, and the APIs to them are not just going to change overnight.

Secondly, it doesn't really matter how old the platform is. As I said, it comes down to how familiar the operations team are with it. If you are running an on-premise or hybrid cloud, ops will want something familiar over all environments. They're not going to be happy with k8s on one, ansible on another, etc.


> but in the case of making a real application that people will use, it seems like a lot of us can get away with a single DB, a few workers of our app, and maybe a cache.

A couple of points:

1. Kubernetes can run monoliths. It's certainly not exclusive to microservices or SOA. It's just a compute scheduler, quite similar to AWS's EC2 reservations and auto-scaling groups (ASGs).

2. I can't speak for every corporation, but if you already have patterns for one platform (note: "platform" in this context means compute scheduling, e.g. AWS, GCP, Kubernetes, Serverless) then you will inevitably try to copy the patterns you already implement internally. A lot of times, for better or for worse, you don't end up with what fits best unless what fits best and what you have available are in serious conflict.

3. A lot of times "scaling" is actually code for multi-tenancy. As an industry, we should probably be explicit when we're scaling for throughput, redundancy, and/or isolation. They are not the same thing and at times at odds with each other.

4. I don't really like your use of "real application" here as it implies some level of architectural hierarchy. My main takeaway after 10+ years of professional development is that architectures are often highly contextual to resource availability, platform access, and personal preferences. Sometimes there's a variable of languages too, because some languages make microservice architecture quite easy while others make it a royal PITA.


I know one of the biggest e-commerce shops in Asia was using 1 big DB with multiple read-only slaves in a monolithic architecture for more than 5 years.

However, driven not only by DB performance but also by the need to organize hundreds of engineers, they adopted a microservice architecture. Then they slowly migrated to domain-specific DBs; it is just the classic microservice migration story.

While a single DB may take us a pretty long way, designing the system with disciplined, domain-level logical segregation will help when there's a need to move to microservices.

*Looks like HN readers are quite sensitive to microservice migration comments; this kind of comment usually gets downvoted easily.


Stack Overflow runs what is essentially a monolithic architecture. Though they do have a few services, it isn't what I would describe as a micro-service architecture.

https://stackexchange.com/performance


Monzo (UK bank) has 1600+ microservices, but mandates a common framework/library and uses Cassandra. (Which is basically a shared nothing, bring your own schema "database".)

So it makes sense to combine advantages of different approaches.


Using a non-ACID DB for financial services probably requires lots of trickery.


it is web scale though


Upvoted, but I am not sure Tokopedia is even in the top 10 in Asia.

Also, the fascination with GMV tends to make it look like high scalability is required. In another HN discussion, someone mentioned running the database for an e-commerce company that had $1 billion in GMV a few years back. Assuming a conservative $5 per order, that translates to about 6 orders per second on average.


It's resumé-driven development, and it's also entertainment-driven development. Bringing in new technologies gives you a chance to play with a new toy. That's an effective way to make your job more interesting when the thing you're supposed to be working on is boring. Which, in business applications, is more often than not the case.


In today's job market, resumé-driven development is a very rational choice. I work in medical devices, so we are pretty conservative and generally way behind the cutting edge. This makes it really hard to find jobs at non-medical companies. I would recommend anybody who has the chance to use the latest and shiniest stuff to do so, because it's good for your career.


Very good point. Seems like yet another example of how carefully optimizing all the individual parts of a system can paradoxically de-optimize the overall system.


I haven’t worked at a FAANG or any other company even close to that level of scale, so you can take me with half a grain of salt too.

But what you said is absolutely true. It’s also something you will very much experience once you start working professionally.

I’m in no position to give you advice, and I think I might be giving advice to myself...just don’t let it get to you.


I have worked at FAANGs before.

I am in firm agreement. I think that far too many people are trying to solve the problems that they wish that they had, rather than making it easy to solve the ones that they do have. Going to something like Kubernetes when there is no particular reason for it is a good example of that trend.

When you really need distributed, there is no substitute. But far more think that they need it than do.


I'm not saying this applies to all companies, but as you grow you have lots of different teams with different needs. So then you spin up a tools team to manage the engineering infrastructure, since you can't do it ad-hoc anymore (CI, source control, etc). To make that team more efficient, you let them enforce one-size-fits-all solutions. While this may feel constraining for a given problem domain, it actually makes engineers more portable between projects within the company, which is valuable. Thus having one DB or cloud thing that's supported for all teams and all applications is valuable even if sometimes it isn't necessarily the absolute best fit (and the complexity is similarly reduced, as good companies will ensure there's tooling to make those complex things easy to configure in consistent ways). Your tools team and the project team will work together to smooth out any friction points. Why? Because for larger numbers of engineers collaborating, this is an efficient organization that takes advantage of specialization. A generalist may know a bit about everything (useful when starting) but a domain expert will be far better equipped to develop solutions (better results when you have the headcount).


> I see people talking about using Kubernetes for internal applications.

I think the important issue when first starting a project is to create a "12 Factor App" so that if and when you create a Docker image and/or run the application in Kubernetes, you don't have to rewrite the entire application. Most of the tools I write run on the CLI but I am in fact a fan of Kubernetes for services, message processing and certain batch jobs simply because I don't have to manage their life-cycles.
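
A minimal sketch of the factor that matters most here, config in the environment, so the same artifact runs unchanged on a laptop, in Docker, or on Kubernetes (the variable names and defaults are hypothetical):

    import os

    # Factor III: read config from the environment instead of baking it in.
    DATABASE_URL = os.environ.get("DATABASE_URL", "sqlite:///dev.db")
    CACHE_URL = os.environ.get("CACHE_URL", "")        # empty string means "no cache"
    LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")
    PORT = int(os.environ.get("PORT", "8080"))

Locally the defaults kick in; on k8s the same values would arrive via the Deployment's env entries (ConfigMaps/Secrets), so nothing has to be rewritten.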


12-factor apps sacrifice performance and simplicity of your environment for scalability. Unless you are guaranteed to start with a worldwide audience, it's complete overkill. A better solution is to write your application with the rules in mind, with the goal of making it easy to transition to a 12-factor style app when it's needed. Scaling up, then scaling out, will result in the best performance for your users.


The 12 factors are mostly common sense that applies in pretty much any situation - they help with fancy deployments but also with single servers on DO or even on-prem servers.


The thinking on efficiency vs scalability is largely influenced by the US tech company meta where founders give up equity so they don't have to worry about profitability for a very long time.

In that case, it is preferred to burn piles of cash on AWS instead of potentially needing to sacrifice revenue because you can't scale quickly enough.

An architecture that is not scalable is considered a failure whereas one that is complex and inefficient is much more tolerated (as long as it can scale out) ... at least until the funding dries up or Wall Street activists get involved.


Very many successful applications can indeed run on a single DB server (modulo redundancy in case of failures). Vertical scaling isn't trendy, but it is effective, until it's not.

I have yet to encounter a real situation where it suddenly became impossible to run a production DB on a single, high-spec server, without knowing far enough in advance to plan a careful migration to a horizontally scaled system if and when it was necessary.


Speaking as someone with 20 years in the industry, what you say is correct. Most applications would be fine on a single server and a classic LAMP stack. But that ain't cool these days.


Kubernetes is for when you need to allocate CPU like you allocate RAM, and you don't want to be tied to a higher level API sold by a vendor.


Like sched_setaffinity didn't exist and cgroups can't be used outside some container env?


Service and cluster autoscaling. Automatically allocating and provisioning new nodes on compute demand, and releasing them when done.

It can be done without k8s, but without something similar (e.g. Mesos) you're coding to cloud vendor APIs. K8s is like a cloud operating system, it gives you portable APIs for allocating compute and other cloud resources.


qemu/kvm and libvirt are an example API that addresses your resource idiom, plus a load balancer/reverse proxy API. You are missing my point, AFAICT.


Kubernetes isn't only about scaling. The repeatability of the deployment process is a great asset to have as well.


But there are much simpler ways than K8s to achieve automated/repeatable deployments, if that is your goal.


Can you please name a few?


As I'm not a K8s user, take this with a grain of salt.

It heavily depends on your exact situation. Keeping your source code in git and letting it build on a CI server was almost always enough for me. If your build server runs Windows, this is usually just fine.

C and C++ compilers on Linux have this IMHO very unpleasant property that the operating system wants to manage your development environment, so the build output is a function not only of the repo, but also of things not under your control. I have little experience with that (I'm mostly a desktop developer, and these desktop applications are for Windows), but so far simply not installing any -dev packages seems to have eliminated that problem. Put the library source code in the repo, use a git submodule, use a per-repo package manager like NuGet or Cargo, whatever; just make sure your build input is in the repo and only there. Thankfully, no Linux distribution I'm aware of has tried to sneak in its own version of log4net so far, and the same seems true for every other language ecosystem I know, so this is a non-issue for any language other than C and C++.

Automated tests can run in a chroot to make sure all runtime requirements are contained in your build artefact. On Windows, running them in a clean VM might be a good idea, but keeping the machine clean seems enough. Don't give people the password to that machine, so they are not tempted to "fix" problems by installing software there, fiddling with system settings, or creating "magic" files, but by changing the code in the repository.
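
As a tiny illustration of the chroot idea (the path and test entry point are made up, and it needs root or user namespaces):

    import subprocess

    # The build artefact has been unpacked into ./artifact-root. Running the test
    # suite under chroot fails fast if anything silently depends on files outside
    # the artefact.
    subprocess.run(["chroot", "./artifact-root", "/bin/run-tests"], check=True)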

I used repository singular, but this works for both monorepo and polyrepo.


Ones that allow for rollback, rolling, seamless or canary transitions, and that encapsulate the entirety of a mature release cycle in a declarative operational rubric? None that I am aware of outside of k8s competitors.

However I think this value can be oversold and it is a simplification of processes that were used before k8s to achieve the same end.

Say that I have v1 ready for release and want 20% of my users to hit those endpoints in prod. Simply done in k8s. Also simply done with reverse proxying and automated nginx configs plus custom app provisioning _without_ k8s, during an announced maintenance window. None of these ideas lives in isolation.
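
For that 80/20 example, the routing decision itself is trivial whichever layer implements it; a sketch in Python with made-up upstream names:

    import random

    # Hypothetical upstreams; the weights mirror the 80/20 canary split above.
    UPSTREAMS = [
        ("https://app-v0.internal", 80),   # stable
        ("https://app-v1.internal", 20),   # canary
    ]

    def pick_upstream():
        targets, weights = zip(*UPSTREAMS)
        return random.choices(targets, weights=weights, k=1)[0]

In k8s that split typically comes from replica ratios behind one Service or from weighted ingress/mesh rules; in the nginx version it's weighted upstream entries. The automation around the split is the hard part, not the split itself.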


Micro-service architectures are an absurd overcomplexity for smaller internal apps. They only really make sense when a monolithic system becomes too large for a single team to manage, at which point the micro-service boundaries reflect team boundaries.


The author says this in the first paragraph.


Kubernetes is a container system, mostly orthogonal to 3- or 4-tier application design.


And typically a single DO droplet would suffice for a toy project or POC, for which Ansible is probably the more expedient option. But maybe they're not in a rush, and learning K8s is just another feather in their cap.


Ah, but is it? I'm not sure. One difference between these platforms is that they're very optimized for the "typical app deployment" process, whereas Ansible and such are more generic.

Just getting a simple deployment script in Ansible requires programming a bunch of steps - copying archive, verifying it, unpacking it, then atomically configuring the system to use the new version, finally cleaning it up - whereas with k8s you just let it know you want to Deploy something and it does.
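
Roughly what those manual steps amount to (a sketch; the paths and checksum handling are invented, and a real playbook would also handle service restarts and cleanup):

    import hashlib
    import os
    import tarfile

    def deploy(archive, expected_sha256,
               releases_dir="/srv/app/releases", current_link="/srv/app/current"):
        # 1. Verify the copied archive.
        digest = hashlib.sha256(open(archive, "rb").read()).hexdigest()
        assert digest == expected_sha256, "corrupt or tampered artifact"

        # 2. Unpack into a fresh, versioned release directory.
        release = os.path.join(releases_dir, digest[:12])
        os.makedirs(release, exist_ok=True)
        with tarfile.open(archive) as tar:
            tar.extractall(release)

        # 3. Atomically switch the "current" symlink to the new release.
        tmp = current_link + ".new"
        os.symlink(release, tmp)
        os.replace(tmp, current_link)   # atomic rename on POSIX

With k8s the rough equivalent is bumping the image tag in a Deployment and letting the rollout controller handle the switchover and, if needed, the rollback.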

Of course, that's only true if you already know k8s, but then again if you work with both toy and larger projects, chances are you'll have to, sooner or later.


I'm assuming that it's someone who is unfamiliar with k8s or Ansible. You can quickly cobble together something solid from geerlingguy's roles. It might need a day or two to hash out some basic YAML - not long. Or... you could just ssh onto your server and configure it manually, and cross your fingers - even quicker.

Even if you start with k8s and you want to scale, at some point you'll want to configure your images in a predictable way with something like Ansible. Whether you do that for the images you deploy, or you configure it in situ for a single node, it's still the same complexity cost.



