I don't think you can diagnose over-engineering after the fact. Unless you were in the room, or have access to written specs from a meticulously documented team, you don't understand the conditions under which the code was written.
Maybe the company was betting at the time on an integrations marketplace, or a lot of partnerships, so a robust integration platform was built. Then the company moved on to another bet after only writing one or two integrations on the new platform. Nobody over-engineered here. Everything was built to spec. But three years down the line the new dev is going to get assigned a bug in one of the integrations and lament how the code was over-engineered. Hindsight is 20/20.
Lots of stories from the trenches including many in this thread make this mistake. The same goes for 'tech debt'. If you weren't there, you don't know.
I think the lesson here is that with great product market fit everything is under engineered, with poor product market fit everything is over engineered. Statistically you are more likely to be building an over engineered product with poor product market fit than you are to be building an under engineered product with great market fit.
Now is over or under engineering a bad thing? That depends on how many resources the company can muster, your customers' tolerance for bugs/failures, and whether you can build a small subset of features well. In practice I've observed that products with fewer but well built features tend to succeed more than buggy, featureful products. Large companies may opt to overbuild everything simply so that they can prove that the product was bad rather than having to decide whether it was product or execution that failed.
I'm currently walking away from a code-base (and the company who let it be built) that was obviously built by someone whose mental model of how the code works was incorrect. At some very basic levels, it under-performs compared to literally every other example of this sort of code I've ever seen (and it's extremely common), which results in a subpar user-experience and an EC2 instance way bigger than necessary to run this sort of code. A lot of the code that runs doesn't need to run in the first place, and due to the incorrect mental model, there's a substantial amount of code that doesn't run at all.
Nobody really understands how bad it is, though, because it's barely tested (and the engineer who wrote most of the code has argued that we don't need tests, for various reasons) and all of the engineers on the team who didn't build it initially, have told me that they're not familiar enough with how this sort of product should work to understand that it doesn't work like that (again, this is a super common kind of product).
There's a lot of other anti-patterns (I started keeping a list in my first week) but I think these are the most damning. This code is in production. Nobody at the org brought up these issues earlier in development (like saying "hey, nobody else's product does [annoying thing that doesn't need to happen], why does ours?"), which leads me to believe that the whole technical org is at fault.
It sucks and I can't wait to be done with this experience.
I don't disagree, but I think that the reason that this fits here is because the poorly engineered software is unnecessarily complex (not in the "someday we might need this" way, but in ways that I'm having a hard time articulating without giving away the product because I know my coworkers are on HN). In fact, it was the ridiculous complexity that drove me to discover the disparities between the original developer's mental-model and the working-model. It just so happens that I've worked on this sort of thing before (as have thousands of others), so I was pretty quickly able to understand where the original developer went wrong (trust me, I'm just an ordinary developer, not a 10x elite hacker or anything).
Typically over-engineered software looks something like the following.
1) The product is horizontally scalable to 3 orders of magnitude more traffic than the product will ever receive.
2) Bespoke assembler/hand crafted memory management/other do not touch code to shave 5ms off a 5 ms call on an API that isn't latency sensitive.
3) Ability to handle N customer tenants while only handling 1 customer 3 years later.
4) multi-region redundant deployment for 1 customer strictly based out of the US with no availability requirements.
5) 100% coverage for integration tests, unit tests, CI/CD, for a single tenant blog that is never updated and no one reads.
6) Custom internal framework with micro-service splits to support team of 100 engineers working on app when there are 2.
7) Automated continuously trained ML ranking model which performs .001% better than sorting by popularity or other trivial heuristic on a social media app with 10 users.
The common theme in many of these cases is that these are pretty good ideas if you have unlimited people, but have fairly low impact. In some cases these may be counter-productive to successful delivery at the company/product level. A piece of software built wrongly due to poor understanding of the product domain is often considered under-engineering.
Isn't the common theme: "this is what big tech does" ? Ie, it seems a lot of small companies/startups look at "best practices" and just copy them. It's not just engineering. It happens in product, hiring, marketing, everything.
It's completely wrong given the context 99% of companies are operating in. It's no wonder a Zuckerberg can come along and build a crazy successful company in his early 20s. He likely didn't know any "best practices". Common sense was enough.
You see, overengineering is exclusively of the "some day we might need this" kind of complexity. Anything else is a different problem, very likely with very different causes, even if the consequences are similar, so it's not useful to analyze together.
Some of it makes sense when the reasons were explained (e.g. "we figured this would make it easier for [business unit] to do [thing] which turned out not to be something [business unit] actually has interest in doing").
Other examples are purely of this sort (e.g. "we'll be glad we have this in place when we finally hit [x] users or enter [market]").
Interesting angle. What if the business asked for it, the dev team delivered it, but the business subsequently gave up on it, and now it's overengineered in comparison to what remains in use. To avoid this scenario, I don't try to outthink the business team, but I do push people toward temporary manual workflows sometimes, and promise to automate later if the idea actually pans out.
Yes the business team will often ask for things they don't need but I can justify that cost. It can still impede other development but we tackle that when it becomes a problem not before.
I agree completely on pushing off features that are not core because it is better to learn in production than to daydream in meetings. I would say most phase-2 features never get implemented.
Complexity kills projects and the will to work on them. It can be caused by either over or under-engineering. With the first you end up with extraneous layers of abstraction and badly applied design patterns that make it hard to understand or reason about the code...in the second you end up with a big ball of mud that's impossible to understand or tease apart.
In both cases the code gets harder and harder to change to the point that no one really knows how it works or wants to risk touching it.
Over engineering is a form of bad engineering, at least in my book. Fun starts when some parts of a product are over engineered while others are under engineered... One of the cases good averages get you nowhere...
I think nearly everyone would agree that "over == bad" in most settings, but the reverse is being implied by the GP - that bad engineering is a form of over engineering.
No tests? Nobody knows how it works? It doesn't perform well and doesn't fit the problem it's solving? That's very much not over engineered in my book.
You've checked-out, so it's too late, but depending on the circumstances, this can be a situation for an experienced engineer to step in. The biggest challenges are political, but also triaging the issues and coming up with a migration plan. "Rewrite all the things" rarely goes well, especially if they already work, just on beefier hardware than is needed.
> and the engineer who wrote most of the code has argued that we don't need tests, for various reasons
Sounds like you made the right choice to walk away. "My code doesn't need tests" smacks of match-over-the-shoulder type of chaos creation. This person has likely never had to maintain their own code or someone else's, nor has anyone ever had to maintain their code. 0/5 would not work with.
Tests are wonderful. Tests are proof your code works. They are something to be proud of. And they defend your code against the chaos monkey of future people refactoring and rearranging.
I understand the sentiment but there are so many situations where this is not true.
All it takes is a good enough sales and marketing team, or the right executive relationships, especially if the team being “supported” is small.
You can say the company or division is presently functioning, but the time constant on the product is different, and the coupling is too loose to say anything else with accuracy.
> All it takes is a good enough sales and marketing team, or the right executive relationships, especially if the team being “supported” is small.
That's the point though. Getting caught up in engineering purity or even whether it's objectively _good_ is a waste. The product exists to drive business (financial) metrics. If it's doing that, it's working.
The one thing that I'd challenge about this line of thought is that the product isn't necessarily "done", there are expectations to add features & fix bugs.
If the product is poorly crafted, and especially if it's also poorly tested, adding features can introduce bugs, and fixing bugs in one place can cause issues elsewhere. Development then slows to handle these occurrences, which can then impact product viability.
I'm aware of technical debt and its consequences. The point is that the vast majority of developers treat their job as creating good software, when it's absolutely not.
I'd say it's creating software that is "good enough" (good enough to do what it needs to, right now, sufficiently frequently that paying customers return).
> I don't think you can diagnose over-engineering after the fact. Unless you were in the room, or have access to written specs from a meticulously documented team, you don't understand the conditions under which the code was written.
The specs and the decisions surrounding the code are usually part of the over-engineering.
> Everything was built to spec.
Even with reasonable specs, I've worked with, and later managed, engineers who had a strong drive toward over-engineering solutions. I've also done it myself, and I suspect most people reading this have done the same.
The problem is that many of us engineers became developers because we enjoy building things and we enjoy solving complex problems. It can be exciting to imagine a particularly complex version of a problem and how elegant and fun a complex solution would be.
It can be as simple as a programmer who tries to write their own CMS when the spec could be solved with a simple blogging platform. If you're an ambitious programmer, writing a CMS is fun! Installing Wordpress is blech.
I didn't truly understand the ubiquity of the problem until managing ambitious engineers who wanted to do complex things. At times it felt like 1/4 of my job was just asking "If we wanted to simplify this solution, what would that look like?" Repeat for a few rounds and eventually the team has shed a lot of unnecessary complexity without sacrificing the spec at all.
With the benefit of hindsight, I think you can conclude that the system was over-engineered. If you’re trying to answer only “was it over-engineered?” I think you can answer it with hindsight.
If you’re trying to answer “did the people who built it exhibit poor judgment resulting in the system being over-engineered?” then you need to know more than just the after-the-fact outcome.
In my experience, the situation you're describing more often results in "sloppy" code -- it seems convoluted, non-obvious, poorly thought out until you realize when the pivots happened. Overengineered code usually looks neat at first glance, maybe even impressive, especially to non-technical stakeholders. Then when you actually dive into it, you notice a bunch of wrong abstractions that introduce layers of indirection without offering any sort of valuable flexibility. "This metaclass that implements a context manager could've just been a slightly longer than average function" type stuff.
A codebase that pivoted multiple times makes more sense the more time you spend in it, an overengineered one makes less.
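To make the "metaclass that implements a context manager" example concrete, here is a rough sketch of what that tends to look like next to the plain function. Every name here (TimedMeta, ReportTimer, build_report) is made up for illustration and not taken from any real codebase.

```python
import time


def build_report(rows):
    # Stand-in for the actual work being timed.
    return sum(rows)


# Over-engineered: a metaclass that makes the *class itself* a context manager.
class TimedMeta(type):
    def __enter__(cls):
        cls._start = time.perf_counter()
        return cls

    def __exit__(cls, exc_type, exc, tb):
        cls.elapsed = time.perf_counter() - cls._start
        return False  # never swallow exceptions


class ReportTimer(metaclass=TimedMeta):
    elapsed = 0.0


with ReportTimer:
    total_a = build_report([1, 2, 3])


# The "slightly longer than average function" that does the same job.
def timed_build_report(rows):
    start = time.perf_counter()
    total = build_report(rows)
    return total, time.perf_counter() - start


total_b, elapsed_b = timed_build_report([1, 2, 3])
```

Both versions produce the same total and a timing, but the second one can be read top to bottom without knowing how `with` looks up `__enter__` on the type of the object.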
It's a question of framing. Nothing is ever over- or under-engineered. It's over- or under-engineered for a purpose.
What that distinction recognizes is that it's possible for something to have been well-engineered at one time, while still being over-engineered today. Header files in C and C++ are an example of this phenomenon. They solved a very real problem with technical constraints from 40, 50 years ago, both in terms of computers' capabilities and compiler technology, but, since those problems don't exist anymore, they function as over-engineering.
Agreed, but "nothing is ever over- or under-engineered" is a bit too bold. There is definitely over engineering just for the sake of over engineering, and under engineering caused by incompetency.
Technically speaking, for any given situation there's no way you can be wrong.
But this post describes a distinct and, by my experience, dominant industry phenomenon. And the stakes are as high as OP says (dead products).
Industry engineers in the large are massively divorced from product outcome. This should be no surprise, as it is a seller's market:
Delivering product simply and directly can be, depending on perspective, somewhat of a mundane and Sisyphean activity.
Engineering "stuff" is far more intellectually engaging, and there's no market pressure (for engineers) against over-engineering. If my company fails to deliver a product outcome, whether a viable company or startup, in this market there is no recourse at all on my salary let alone my hireability.
I say "depending on perspective" because if you can find a way to be more intellectually and emotionally stimulated by shipping stable product over time, then ime you _can_ unite your joie de vivre with market outcomes.
But that does not at all describe most engineers in industry who generally at best shed off some market value while spinning unnecessary webs of toy abstractions (ie liabilities).
My new colleague used to wonder about certain things, i.e. why is this done this way, it doesn't make any sense. As I had been there a lot longer, I would share the technical and non-technical background/restrictions we operated with. Eventually when another new colleague joined, he told the guy, "I used to wonder why some parts of the code are set up that way; now that I have the background I can say that if you are wondering about those same things, trust me, there is a reason/background - it's not because the people who did it that way were stupid".
Occasionally, it is because of that though. If not 'stupid', at least inexperienced. I've had plenty of things I've done that worked, but were, in hindsight, 'stupid' (and have been called out on that). Sometimes, people try to make a lot of post-hoc justifications for a block of code or a data/tech decision that really is, just... 'stupid'. Again, that's more likely down to inexperience than anything else, but not every decision is a 'good' one just because people 'had their reasons'.
Examples
Having a 'user info' table with 190+ boolean columns for 'country' ('US','CA','DE','IT', etc) in case someone wants to indicate they're 'from' two countries.
Joining views on views of joined views which are themselves built on joins of other views is, likely, not a terribly sound data decision. (X: "It worked fine last year - you must have broken something." Me: "well... last year you had 450 users, and now there are 37000 users - this isn't a performant way of doing this". X: "Can't be true - the person who wrote it is a DBA")
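For the country example, the usual alternative is a join table with one row per user/country pair. This is only a sketch of what the schema could have been: the table and column names are invented, and it uses an in-memory SQLite database via Python's sqlite3 module purely so it runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- One row per (user, country) pair instead of 190+ boolean columns.
    CREATE TABLE user_countries (
        user_id INTEGER NOT NULL REFERENCES users(id),
        country_code TEXT NOT NULL,
        PRIMARY KEY (user_id, country_code)
    );
    """
)
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
# A user who is "from" two countries is just two rows.
conn.executemany(
    "INSERT INTO user_countries (user_id, country_code) VALUES (?, ?)",
    [(1, "US"), (1, "CA")],
)


def countries_for(user_id):
    """Return the sorted country codes recorded for a user."""
    rows = conn.execute(
        "SELECT country_code FROM user_countries "
        "WHERE user_id = ? ORDER BY country_code",
        (user_id,),
    ).fetchall()
    return [code for (code,) in rows]
```

Adding a new country then means inserting rows, not altering the table.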
> Occasionally, it is because of that though. If not 'stupid', at least inexperienced. I've had plenty of things I've done that worked, but were, in hindsight, 'stupid' (and have been called out on that). Sometimes, people try to make a lot of post-hoc justifications for a block of code or a data/tech decision that really is, just... 'stupid'. Again, that's more likely down to inexperience than anything else, but not every decision is a 'good' one just because people 'had their reasons'.
I agree on the point regarding the decision being 'stupid' but in this case it was not because of inexperience. On the contrary, it was because of an overruling decision by an experienced manager. So the point is that it is not always technical or due to inexperience (though that does happen) - from what I've seen, it is quite common for such things to happen due to hierarchical/ego/political issues as well.
You're not wrong, but there's a whole class of problems that occur outside 'enterprise' structures. Often it's just a lone cowboy building something for a small business, and the business owner has absolutely no way to determine if what's being delivered is 'good' in any meaningful sense.
> I don't think you can diagnose over-engineering after the fact. Unless you were in the room, or have access to written specs from a meticulously documented team, you don't understand the conditions under which the code was written.
Can you prove it in a court of law? No. But you still can often easily diagnose it. And you'll usually be correct.
For example the conditions under which https://wiki.c2.com/?SecondSystemEffect happens are well-known, and the pattern has been repeated often in the half-century since it was named. If it is a second system, and you can find complicating abstractions in it that get used exactly once in practice, then with better than 90% odds it was over-engineered. And the farther you dig, the stronger the evidence will become.
Specs are often over engineered. This is part of the problem. Too much effort in rowing the boat, not enough effort in figuring out the right direction to point it first.
This is why commit messages are so important. Documentation that is external to the code it's referencing will always drift, but a commit message is a snapshot in time that is attached to the thing it is referring to. In my experience not enough people take commit messages seriously; they basically just give a summary of what changed, not why it was changed.
Commit messages get lost too easily. It’s the worst way to document reasoning around the code. There are many proposals to document code outside of code and none of them has ever worked. Just comment your code.
> Maybe the company was betting at the time on an integrations marketplace, or a lot of partnerships, so a robust integration platform was built. Then the company moved on to another bet after only writing one or two integrations on the new platform. Nobody over-engineered here. Everything was built to spec.
I would argue this is a case of over-engineering. Though, it extends beyond pure engineering.
This is poor resource management and business de-risking. An effective approach here would have been: "Let's build out a handful of integrations. Make sure they're going to deliver the value we want. Then abstract it into a repeatable pattern".
I've been in exactly this position, in fact I conceived and created the situation at my last company. I found it uncanny how on point OP was.
The trouble with "Let's build out a handful of integrations. Make sure they're going to deliver the value we want. Then abstract it into a repeatable pattern" is that when you're building an integration marketplace, you need partners, and they don't want to be your research bitches.
They just want to build their integrations once, and gtfo. If they think you're signing them up for a revolving wheel of experimentation, they'll just politely stay away until someone else has done the hard yards.
Sure if you're a Microsoft or a Google you'll have any number of willing partners who will put up with anything to be the pioneers in your integrations marketplace.
But otherwise, they're using your integrations marketplace purely for the benefits, and they don't want to be building on sand.
Well how do you know something is overengineered?
It's when you know it will never (or only too far in the future) need to scale to the level it was engineered for.
One thing that bugs me is the notion that "Software rewrites are something you should never do", which is a mantra so often repeated that it has acquired the status of self-evident truth, despite the only evidence being (usually) presented is an example of a web browser from 20 years ago! (Which incidentally spawned Mozilla, so not exactly a complete loss; especially from the POV of society rather than shareholders, but I digress).
Having rewritten a bunch of systems (sometimes several times) I can attest that it will not always lead to the death of the company. The trick is of course having modular enough systems that they can be rewritten from scratch in a reasonable amount of time.
It can also be a great way to increase the simplicity of the system, as typically the old version was designed with a very imperfect understanding of the problem and no operational experience servicing it; further learnings were usually crudely patched on top, and you often end up in a conceptual hodge-podge where words mean subtly different things depending on the context and translation layers need to be inserted between the contexts etc.
Often a (good) rewrite starts by clarifying the conceptual model. I like the saying "clear writing reflects clear thinking", and in programming there is a lemma "clear thinking produces clear code".
I suspect that the main issue with rewrites is that the users or product managers see it as an opportunity to add new features or redo old ones extensively. In the end the scope of the rewrite is no longer a rewrite but a new product that is incompatible with the original it was supposed to replace. I have seen this happen a couple of times. A straight rewrite for technical reasons and well defined scope does not suffer these issues.
This is a great point, and my successful rewrites have done the opposite, reducing scope/capabilities. "We changed other systems and no longer need to handle x/y/z in this service", especially when most x/y/z's are edge cases now eliminated.
One huge issue with a lot of technical and product debt is that any re-write gets saddled with a huge dam of expectation bursting. Many people who have been told for years that they can't have their feature, because the velocity of the software team has been close to zero for so long (hence the rewrite), suddenly push their demands onto the new product. It's hard for a re-write to focus on an MVP as a consequence.
Arguably it can be better to float a completely different boat and see if it swims but that can result in a product positioning problem where you then have two versions of the same product but with an uneven feature set.
Old developers have left. New ones come in. The systems the old devs built suck, so the new team convinces management to do a rewrite. The new devs don’t really understand the problems encountered by the old ones so they repeat all the same mistakes. New system, same problems.
The lesson we’re supposed to take away is “no one should ever do full rewrites”. It’s a stretch, but IMO the proper takeaway should be 1) really understand the old system and 2) have a very good reason before doing full rewrites.
The way I've heard it is "Evolution, not revolution".
You evolve your code with refactoring and rewriting only pieces at a time. This is opposed to revolution, also known as "the big rewrite", which replaces the entire application all at once.
Your "modular enough systems" seems to indicate that you also favor evolution over revolution.
> despite the only evidence being (usually) presented is an example of a web browser from 20 years ago!
Actually normally the evidence is lots of other companies that failed to do rewrites. It's just that one was a full scale fuck up. I'm currently working at a company that is literally echoing Netscape. The issue wasn't the rewrite; it was rushing a half-finished rewrite out the door. It was stopping product development for so long.
My current employer started a rewrite but called it a migration and gave it a 3-month deadline: 3 months to write all the features it took 7 years to write. They realised this was impossible, removed a bunch of features, and decided the rewrite would drop features they would add back later. But they still kept setting deadlines of a few months for something that has taken 18 months so far, with even more features removed. It's almost a daily thing that yet another product feature is removed to cut down time. They claimed they were feature complete in September because it had to be done under all circumstances, then found out they hadn't done 50% of the features they said they would. So, while rushing the features they hadn't written, they started talking about releasing it before it had passed QA. They announced the release date before it had passed QA. We have partners using it and saying it's broken for them. They don't have all the data. And yet they're still releasing it on Monday. Why? Because it had to be done in 2021. They're rushing a half-finished rewrite out the door to hit a target set by management. So they spent 18 months removing features, and when they release it, it will be buggy.
So, yea I mention Netscape a lot because honestly, this sounds the same. Rush out a half done rewrite while allowing the competitors to improve their product while we made ours worse.
> Having rewritten a bunch of systems (sometimes several times) I can attest that it will not always lead to the death of the company. The trick is of course having modular enough systems that they can be rewritten from scratch in a reasonable amount of time.
I would say that the trick is not to do the rewrite. You refactor each part until the entire system is rewritten.
> I would say that the trick is not to do the rewrite. You refactor each part until the entire system is rewritten.
This is exactly the right way to approach this. The best way I’ve seen to rewrite a complex system is to literally work off a branch and deploy it in QA beside the old version. The hardest part is figuring out the right way you want to direct traffic to the “new version”.
My team inherited a massive system that was the key revenue generator for a multi-billion-dollar company. It was an operational nightmare, from deployments to stability. It had at least one 24-hour outage that was nearly impossible to root cause.
We slowly chipped away at it for 8 months running in parallel in QA until we were satisfied that it was functionally equivalent. Started running traffic in prod while we tuned it to start taking real traffic and had the whole thing replaced in 18 months.
The system was replaced, is handling 2X the load in prod of the old system, and hasn’t had an outage in years.
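A minimal sketch of the "run in parallel, then direct traffic" approach, under heavy assumptions: old_handler and new_handler are hypothetical stand-ins for the two systems, and the hash-based bucketing is just one common way to pick which requests go to the new version, not necessarily what was done in the story above.

```python
import hashlib


def old_handler(request):
    # Stand-in for the legacy system.
    return request["a"] + request["b"]


def new_handler(request):
    # Stand-in for the rewritten system; meant to be functionally equivalent.
    return sum((request["a"], request["b"]))


def shadow_compare(request, mismatches):
    """Serve from the old system, run the new one alongside, record differences."""
    old_result = old_handler(request)
    try:
        new_result = new_handler(request)
    except Exception as exc:  # the new path must never break the old one
        mismatches.append((request, old_result, repr(exc)))
        return old_result
    if new_result != old_result:
        mismatches.append((request, old_result, new_result))
    return old_result


def in_rollout(request_id, percent):
    """Deterministically map a request id to a bucket 0..99 and compare to percent."""
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < percent


def route(request_id, request, percent):
    """Send roughly `percent` of request ids to the new system, the rest to the old."""
    handler = new_handler if in_rollout(request_id, percent) else old_handler
    return handler(request)
```

The shadow step is what lets you claim "functionally equivalent" with evidence; the percentage knob is what lets you ramp real traffic up or back down without a big-bang cutover.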
Having dealt with a similar setup that basically had to be re-written, I totally agree. I would also like to add that if one senses themselves to be in a situation like this where the system is a messy build-up over years, try to resist adding things that are absolutely not essential. Otherwise, the guy doing the clean-up/re-write down the line may be forced to take not-so-clean approaches to cater to those non-essential bits mainly for backwards compatibility.
Rewrites are also sometimes very good for your career. A friend who had been working at Google for about 1.5 years said a new Sr. Director of Engineering joined and their first decision was to rewrite the project, and it would take 2 years.
So now the Director has job security for about 2 years, gets the big launch near a promotion cycle they have a small chance of being considered for promo, and gets to blame the predecessor for all the problems.
Complexity kills your product. Overengineering is just one instance of complexity; technical debt, like having state and data all over the place, is another. I quit my last job and I happily blame this article for convincing me to quit: https://itnext.io/the-origin-of-complexity-8ecb39130fc - coordination causes complexity, and this killed me. We had everything not once but twice or more in different places. That's just one example; it's much more complex, but my colleagues and I were ground between an old codebase that nobody understood anymore on one side and Kubernetes, CI, etc. on the other, because management sold this without understanding that you need a process and time for digesting and applying the concepts in the team.
This sounds exactly like the last place where I worked. I quit in anger, not because I don't like Kubernetes but because management sold it as the fix-all solution.
This is like saying that water is lethal - both are true in the sense that some x (water, complexity) is ok, but too much is bad.
Complexity is an unfortunate (but necessary) side-effect of (1) adding features and (2) optimizing for performance, both of which are critical for building a product.
If you try to remove all complexity from your product, you won't have a product. Instead, you have to try to minimize complexity while still delivering the same set of features and level of performance.
The basic point is that the default tendency in many environments is to run toward ever-increasing levels of complexity -- often without even being aware that this is happening -- rather than be (as I believe one should be) by default always skeptical of added complexity.
> If you try to remove all complexity from your product,
This wasn't what the commenter was advocating, of course.
Rather it was an attempt to simplify the basic message of: "Complexity has attendant risks which are too easily ignored -- and which in short enough order, can kill your product."
Basically I just looked for an excuse to post the link to the article - which is really a great article and describes the problems inherent to software in an - for me at that time - new and fascinating way.
Of course in practice it's always messy and rightly so, but I've learned the hard way that you'd better not lose control of your state and data - because that's where the ugly bugs are.
The article is very good. Just ignore my drivel :)
I interpret “early over engineering” not as adding complexity necessarily. It could also be about decreasing complexity but by taking too much time. Early on you’re supposed to rush to your MVP and add tech debt, not spend too much time to design something pretty and modular. Later on once you know your company can survive multiple quarters, then you can spend more time refactoring.
It depends, there’s a trade off. At some point things can become very complicated if you don’t abstract them and put them in their own lib or framework.
The less you use external dependencies, the more time you have to focus on your own product instead of solving problems updating those dependencies and replacing them.
Six months ago I left a company that was working on an overengineered product. Even worse than it being overengineered was that it was also under documented. Working on anything was a pain, because the CTO wanted everything to follow his well thought out, and frankly very cleverly engineered design patterns, but he couldn't clearly communicate what those patterns were. And the entire company amounted to transforming and cleaning data sets using in-house tools, which could easily be done with existing tools too. Both myself and the other senior engineer on the team left at the same time. I felt bad leaving them, because they were trying to grow and had a ton of funding and deals with FANG companies but they were struggling to find engineers that the CTO thought were smart enough. I didn't want to burn bridges, so I didn't end up telling them that the problem wasn't a lack of qualified engineers, it was an over-engineering CTO who struggled to communicate.
> they were struggling to find engineers that the CTO thought were smart enough.
If you have to find the smartest people to keep the wheels on, you’ve already lost.
Disdain for the bell curve is the fastest way to get overengineering. Very few things have to be rocket science to create a good company. Everything else should be dead simple so that anyone can work on it. That also means you have to compartmentalize the hard bits so they don’t leak across the entire codebase.
But some people get bored with mundane code and will make things interesting by replacing a boring task with an exciting one. It’s part of the Curse of Knowledge. You can’t run a company like an action movie. Everyone gets burnt out.
Being in a similar role myself, how do I ensure that engineers stay happy working on the project that we're working on? I'm finding myself actually doing the opposite of the CTO you mentioned and pushing them towards adopting more off-the-shelf components instead of maintaining homegrown stuff but I think I'm causing a degree of upheaval by doing this. Their justifications for push back however, often smell of sunk cost fallacy to me.
It’s a real problem, if you’ve hired ‘architects’ whose skill and passion is creating new systems, rather than connecting and modifying existing tech. Video games went through this shift about 20 years ago, and it was painful. Many architects ended up moving to the game engine companies, or to other tech areas than games.
I think the best thing you can do as a CTO is define the problems/goals and desired outcomes very clearly. Think of possible solutions if you can but don't share them, and give your team the problem to solve. And let them stumble a bit, because the level of buy-in and growth you'll get is more than worth it - it'll be their solution after all.
I'd second this. Your role as a CTO is to define technical strategy and help align your team with that strategy, not to tell your team how to do their jobs. They know how to do their jobs. That's why you hired them. Share your strategy and goals with the team, and trust their judgement on the specific decisions that help you get there. There's nothing more annoying than being an engineer whose technical decisions aren't trusted. If you have a quality engineering team, most of your decisions as CTO should be about strategic direction, not specific technical choices. Specific technical choices can be entrusted to your team as long as they're in line with the strategy.
Literally the only two things leadership should do to be above the mean:
1. Clearly define a vision for the future/ goal(s) that should be achieved
2. Get out of the fucking way of your minions, and trust that you hired correctly, to let them figure out how to get to the finish line
Bonus Points:
3. Your temperament is in the Goldilocks' zone of neither being too much of a spendthrift, nor too much of a miser, when setting budgets, i.e. you're not some rando without P&L experience that was tied to your bonuses.
Sorry to point this out but I find it problematic that neither you nor the engineers are making decisions out of experience..
My recommendation is that you hire someone with a TON of experience who makes these decisions not based on a Medium post he read last week.
I am a developer and I always try to re-use a solution (preferably open source) instead of rolling my own (I'm happy to do the latter in case the need/situation demands but those cases are not that frequent). I also happen to enjoy devops (specifically CI/CD stuff and enjoy the integration aspects of connecting things to achieve a goal and see the entire pipeline producing the end-result). So I guess you'll have to find people that have the integrator mindset instead of those with NIH syndrome. At the same time, you do need some talented developers as well in case you do need to roll your own. It is a balance like most things.
As someone who is miserable on a similar project, the best thing you can do is give ample breathing room for the labor. The codebase is stress inducing, so being empathetic on deadlines and what may appear to be shifting attitudes (good days, bad days) means a lot. Give off days early, try to make sure people pace themselves.
Otherwise they’ll burnout and the resentment of the state of the codebase will be jet fuel to that brand new revolving door on your team.
Commiserate to some degree, ‘look, I know this shit sucks’. If they don’t know you empathize, they won’t trust you.
> and pushing them towards adopting more off-the-shelf components instead of maintaining homegrown stuff
Are you spending your time second-guessing the developers and micro-managing them on the exact thing they are supposed to be experts?
Your phrasing has that "feel", but it's far from a sure conclusion. Anyway, if it is the case, you may need to reevaluate both your hiring (should you get more senior people?) and management (this is a clear "should never be done", unless your people are explicitly under training) practices.
Off the shelf software isn't an instant win. You're signing up for lock-in, a set way of doing something, a boundary you can NEVER cross, and domain and language you will never be able to change. This language will leak into your software and may not be a good fit for the end user you're trying to serve.
That's a lot of trade offs to avoid maintaining the subset of features you need from said software in house, being able to leverage internal knowledge, being able to streamline all your environments and tooling to suit your needs instead of catering to the needs of said software.
This isn't an argument one way or another, it's just pointing out that there are trade offs you need to make consciously or otherwise.
In your position I would be extremely concerned about pushing off the shelf software onto Devs that either lack the clout to be comfortable pushing back or lack the communication skills to clearly articulate all the trade offs really taking place.
It's very easy for a boss figure to push through requirements with off the cuff pointed questions. The reports often need to push back with orders of magnitude more research and thoughtfulness than went into the question or suggestion.
Same been in that role and in a similar role at the moment. It's hard but this is the industry I can't change that anymore I just have to work with these people long enough so I can retire.
The most important thing is to find the right balance. This article goes into one direction. But to be honest, most of the time I see it shift into the other direction: in product driven orgs the drive to implement features and bring them to market quickly is more important than engineering quality. But in the end you end up with something where implementing new features takes so much time because of the complexity that built up because you started to implement a product where the specs were unknown in the beginning and only materialized later. That's the point when you should re-engineer your system. But yeah... it's all about balancing and finding the sweet spot.
To put it more succinctly, the least code wins. What we're talking about is 2 opposite problems that cause the same end problem: too much code.
Overengineered what-if everywhere often causes way too complex systems where even simple things become hard, de-engineering is nigh impossible because it'd break things by taking away functionality (unless an alternative manifests itself).
Non-engineering hackjobs where we basically have N copy-pastes or more or less the same functionality everywhere, cleaning up anything will require a fair bit of testing to avoid breaking anything in the process, possible but causes a lot of uncertainty and risky if you won't see it unless failing in prod (because it goes hand in hand with bad deployment practices).
As mentioned in the article and above already, you need people with experience and to give those the business knowledge to find the correct balance.
I think there's a very specific form of overengineering afflicting products currently, which I'd classify as "endless revisiting". This is where companies build something which works well enough, but then get trapped in a cycle of endlessly tweaking that one thing. Inevitably the amount of code churn in this one area combined with the need for some semblance of backward compatibility results in something that is fragile, complex and slow.
Plus the annoyance as a user that whenever you use this thing, all your tools are in a slightly different place and work slightly differently to how you left them. IMO there's a need for better balance between "it works well enough, leave it alone" and "we haven't fully optimised this workflow yet" in product development.
Yeah, I see this happening a lot as well. A prime example is Spotify. Their app is done. It has all the features it needs. Instead of just focusing on making those features work more reliably, they seem intent on doing a big UI redesign every few months, adding bloat to the app and making it more frustrating to use. It’s really rare to find companies that just build a thing and stop adding features once the core functionality is done.
The issue here isn't so much over or under-engineering, but rather "Are we building the right thing?" or "Are we building the thing right?"
In a startup you don't know if you're building the right thing so trying to build it right is premature optimization. Since you have limited resources you really have to focus on ensuring that you're building the right thing...if you aren't it doesn't matter how well designed or built it is no one is going to use it.
Once you've validated you're building the right thing you can start focusing on building it right, but by then you probably know where the pain points are and where you need to spend the effort.
The trick with this though is that everyone needs to be aligned on that approach and be honest about the fact that corners may be being cut to get something working fast. Where I've seen this go badly is where a crappy initial version was built to get to market fast, but then no time was made available to address the deficiencies in the initial release.
The "super simple code" at both ends of the graph aren't equivalent. The latter is more "simplest code possible".
Overly simple, hacky, narrowly spec'd code produces the same tech debt as overengineering. Anybody who's worked in a move-fast-break-things type startup will know how much engineering resources are wasted on rewrites/bugfixes due to this.
Ultimately, as with many things in life, you need to find the right balance.
I think the graph needs to continue a few more years, but the author hasn't lived that yet. The "simplify" mindset is also something you can take too far without experience.
On the other hand, simplifying an already over-engineered product is next to impossible simply because many jobs depend on it. Maybe I'm cynical but I'm beginning to think that software complexity grows until it justifies all the head counts in the department.
Yeah it is like this. Even simple sites require a ton of engs thrown at it. On the business side I guess they figure spending more money on tech will make them safe, but it's like if you wanted a bowl of soup and hired ten chefs to do it. But I can't complain since I am one of the soup chefs, and I help make sure we have tests that prevent the soup from becoming too salty
This is the same phenomenon as the Peter Principle, where people rise to the level of their incompetence. It sounds like a joke at first, but, of course they do. People get promoted when they excel. When you are in over your head, you stop getting promoted. Of course, over time, people grow into their roles, and regain their competence.
Devs will build software until they can no longer do so because the codebase is larger than their collective abilities to manage.
> Devs will build software until they can no longer do so because the codebase is larger than their collective abilities to manage.
I'm developing a principle of radical simplicity to attempt to combat this: always keep things absolutely as simple as they can be.
It is always easy to add complexity later, never to remove it. So the only conscientious choice you can make is to keep things as simple as possible while satisfying the requirements.
You should also critically evaluate requirements that introduce complexity.
Also, I don't think people mention this enough here: complexity == bugs.
Ability to keep things "simple" (basically produce code / architecture that is easy to understand) when solving problems in business domains (which usually are very complex in real life) is an art that few can successfully practice.
This is both interesting and disheartening - the evolution would concern itself more with the software organization than the product (hence the product death).
I'm wondering about this now. I guess it is old fashioned now to use environment variables and bare EC2 servers, managing your own APIs and websockets/DB on the same server as opposed to breaking everything out. You need to use CloudFormation and "oh did you know there is an AWS service for that?" Then you are using 5 services instead of 2. This twelve-factor app concept. Don't know when is the right time to do this/at what scale.
My last startup was spending stupid money on AWS services and horizontal scaling for an app which had, when I left, about 100 users and maybe 2 concurrent at a given time. And they had been doing this since before I joined when the numbers were much lower. The complexity was idiotic and no one but the devops guy who set it up could grok it. We still managed to have frequent downtime
my two cents: until you hit a scaling wall (and when you do, congratulations, you are either successful or a video streaming platform), a big server you can upgrade is the best way to go, so you can focus on building the product.
Once you hit performance problems, any sysadmin could help you handle 2x/4x/10x scaling with simple separation of services and maybe some hours of downtime. In the meantime you probably have weeks/months to think about going really crazy with your infrastructure.
What kind of numbers am I looking for? Thousands of users? Ops/sec? I realize it is partly due to what your thing is doing eg. website vs. a complicated app.
Yeah it is a good problem to have assuming you have cash flow.
Of course it may vary a lot but I would say a hundred thousand users (100,000).
To explain the number, I assume a user will spend 1% of their time on your app (that means the average user spends around 15 min per day (EVERY DAY) on your app; that may not sound impressive but it already is) and an nginx server on a powerful machine could handle around 1k connections at the same time.
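The back-of-the-envelope estimate above can be written out as a quick calculation. This is only a sketch of that arithmetic: the 1% activity share and the roughly 1k simultaneous connections are the commenter's rough assumptions, and the variable names are made up for illustration.

```python
# Back-of-the-envelope capacity estimate using the figures from the comment above.
# All numbers are rough assumptions from the thread, not measurements.
MINUTES_PER_DAY = 24 * 60

total_users = 100_000             # "a hundred thousand users"
active_percent = 1                # each user spends about 1% of the day in the app
server_connection_budget = 1_000  # ~1k simultaneous connections on one big machine

# Minutes per user per day implied by the 1% figure.
minutes_per_user = MINUTES_PER_DAY * active_percent / 100

# If usage were spread evenly over the day, this many users are connected at once.
average_concurrent_users = total_users * active_percent // 100

fits_on_one_server = average_concurrent_users <= server_connection_budget

print(f"{minutes_per_user:.1f} min per user per day")
print(f"{average_concurrent_users} concurrent users on average")
print(f"fits on one server: {fits_on_one_server}")
```

Real traffic has peaks, so an evenly spread average understates the busiest hour. Treat the result as an order-of-magnitude guide rather than a sizing rule.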
If you want to dig into a far less abstract example, I remember Stack Overflow published their settings:
When your service can no longer process requests as fast as they come in, you've hit a wall. Until then the simple solution is to allocate more resources to your service (i.e. scale vertically).
There's obviously a balance to be struck but the more often you do something the easier it is to do. If you have a simple app with a db, backend, frontend and proxy, vertically scaling every time you hit that wall is going to be very painful. A little complexity goes a long way - using a managed system adds a very small amount of complexity in exchange for some breathing room when you need it. The last thing you want when your service is in a death spiral is to start thinking about the practicality of migrating your db and taking the hours of downtime/working overnight/weekends to do it at a convenient time for your customers.
As others have said, it is unnecessary complexity. Over-engineering is an ambiguous term.
For example, premature optimisation is frequently mentioned as over-engineering but is it premature optimisation to use a map/dictionary instead of an array for key-based access? Nope. That is just correct. Is it over-engineering to know that if your product succeeds, you could end up with X-hundred objects which will use up all the RAM you are storing them in and therefore you might want to make them smaller/store them off-board? Of course, you can come back and refactor later but it is so much easier for the person who writes that code to understand what can be done on day 1 rather than assuming it is cheaper to refactor later only when needed.
I think a better take is for people to accept that no app lasts forever. If we build that assumption into our worldview, it will help us make better decisions. I still think some engineers think there is some perfection that means the app will live forever.
> use a map/dictionary instead of an array for key-based access? Nope. That is just correct.
Maybe, maybe not. While the map/dictionary is easier to use, if the number of items is small (which it often is) the array will be substantially faster because the CPU will pre-fetch the next element into the cache while you are still checking if the current one is the right key. Maybe - check with a cache aware profiler, and you need to design your array to be cache friendly.
Of course the above assumes your map/dictionary is on the hot path (seems unlikely), your code is too slow (probably subjectively, but in some real time applications this can be objectively measured), and you are writing in an efficient language. If all of the above are true, only then should you go to the complexity of using an array to do the job of a map/dictionary.
Also depends on expected changes to the no. of items. If you think it can shoot up, then you are better served by sticking to map/dictionary (and paying the minor penalty, assuming not in hot path) compared to starting with array for low-volume and then having to make code changes to map/dictionary to handle increased volume.
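One way to keep that later change cheap is to hide the lookup behind a single function, so the array and the map can be swapped and timed without touching callers. The sketch below is illustrative only: the function names and sample data are made up, and in CPython the dict will usually win even for tiny inputs, so the cache argument above really applies to compiled languages. The point is the shape of the experiment, not the timings.

```python
import timeit

# Hypothetical sample data: a handful of (key, value) pairs.
PAIRS = [("red", 1), ("green", 2), ("blue", 3), ("black", 4)]
AS_DICT = dict(PAIRS)


def lookup_linear(pairs, key, default=None):
    """Scan a small list of (key, value) pairs front to back."""
    for k, v in pairs:
        if k == key:
            return v
    return default


def lookup_dict(mapping, key, default=None):
    """Hash-based lookup with the same calling convention."""
    return mapping.get(key, default)


def time_lookup(fn, container, key, number=100_000):
    """Return the seconds taken for `number` lookups, so the choice is measured."""
    return timeit.timeit(lambda: fn(container, key), number=number)


if __name__ == "__main__":
    print("linear:", time_lookup(lookup_linear, PAIRS, "blue"))
    print("dict:  ", time_lookup(lookup_dict, AS_DICT, "blue"))
```

Because both functions take the same arguments, switching from one to the other when the item count grows is a one-line change at the call site.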
I've written a lot of complex stuff. In fact, I'm doing it right now.
There needs to be a "sweet spot," where we have enough complexity to achieve goals (which can be business and cultural, as well as technical), and as much simplicity as possible.
A lot of folks think that using dependencies makes the project "simpler."
It does, for the final implementation team, but that complexity still lives there; along with a whole bunch of cruft, tech debt, potential legal issues, security issues, etc.
Unfortunately, you can't do complex stuff, simply. Some level of complexity is always required. Managing that complexity is where we get to play adults.
Right, using dependencies doesn't make things simpler. They make them easier as long as the dependencies can do exactly what you need. However sometimes you come to a point where the dependency can't do something that's needed (often by the time I realized a third party lib can't do something, I could have implemented it by myself without the dependency) or where you've accumulated so many dependencies that they conflict with each other or where connecting two different libraries is the really hard part. No silver bullets.
Figuring out the sweet spot in architectural design seems to me still an art/craft, that comes from intuition and by experience.
[As far as dependencies... using dependencies may sometimes be simpler than the alternative(s), and sometimes not, sure. It depends on the nature of the dependencies, the problem, and the alternatives. :) ]
Complexity is also one of the best moats in the business, if you have a "gross" problem with many edge cases there's few people who will want to eat your lunch, and most next generation up and comers often tout their simplicity compared to your old complex product because they certainly can't argue feature parity ;)
> It does, for the final implementation team, but that complexity still lives there; along with a whole bunch of cruft, tech debt, potential legal issues, security issues, etc.
I call this squeezing the balloon, the complexity doesn't go anywhere, it's inherent to the requirements of the system. You just put it somewhere else.
This is just code factoring at a macro(-ish) level.
For well maintained deps there is the extra boon that it takes work off your hands though. For instance building a React app with Next.js saves you from ever having to deal with webpack and you get big upgrades for free.
> For well maintained deps there is the extra boon that it takes work off your hands though.
Absolutely. Nowadays, it's pretty much impossible to write anything without some level of dependency; even if just the development toolchain and standard libraries.
The problem is that a lot of outfits and people are releasing subpar dependencies that smell like the kinds of high-quality deps we are used to, but, under the hood, are ... not so good ...
Nowadays, it's fairly easy to write up a Web site with lots of eye candy, and gladhand a bunch of corporations, enough to get their brand onto your Step and Repeat Page, so it looks like your dependency is "big league," when it is not. In fact, the kinds of people that couldn't design a complex system to save their lives, are exactly the ones that are experts at putting up a wonderful façade.
What gets my goat, are proprietary SDKs, often released by peripheral companies. These can be of abominable quality, but are also the only way to access their products, so you need to load a bloated, sluggish, memory-leaking, thread-safety-is-optional, crashes-are-so-much-fun library into your app, and hand that library the keys to your sandbox, just so you can get at the device.
I've been writing exactly that kind of SDK for years, and know that we can do better. I'm a big proponent of open SDKs, but many peripheral manufacturers are absolutely dead-set against that.
These SDKs often mask significant complexity. That's what they are supposed to do. They also generally translate from the workflow of the device, to one that better suits application developers. Some SDKs are just a simple "glue" layer, that directly map the low-level device communication API to the software. That can be good, sometimes, but is usually not so good.
1. Without good development practices due to lack of Experience or plain old Rational Thought the only possible outcome is Operational Deficiency.
Every Best Practice is context-dependent, thus having Deficient Resources makes them inapplicable in certain cases. Some teams can really suck with the same tech stack while others flourish using it.
2. Basic organizational anti-patterns, like Mushroom Management, and broken retrospectives lead to ridiculous outcomes.
Even plain old Micro-services and Micro-frontends can be a basis of Stovepiping and applying Mushroom Management. Usually, again, due to Lack of Competence and Sheer Hubris.
3. “Premature Optimization” only used in context of over-engineering by those who didn’t read the book, but use Halo-effect cognitive bias to project compensated qualities onto the term itself. There are a lot of Psychological Compensational Needs under the hood.
It’s like “Why Agile has nothing to do with Discipline ?” or “Why senior developers turning the project into a sandbox due to the lack of self-fulfillment ?” or “Why most of the MVP’s lack Concise and Validated Definition of Viability ?”
Complex doesn’t mean Hard or Expensive. Simple doesn’t mean Easy or Cheap.
Too often “over-engineering” is just an organizational and psychological issue and not an Engineering one.
Stop operating on Feelings.
Six Sense of the Fifth Body Anchor Point is not a reliable Key Performance Indicator.
"Stop overengineering" was the excuse I heard back on a rough project when I insisted we add logging beyond just the returned HTTP status code to our service before shipping to 200M+ people. But that would take an extra week or two in the current system and that's just too much investment for some silly over engineering.
Had we gotten decent logging in place early, we could have saved a terrible change that got past our canary rings, which got us called to see the CEO and made news.
So, I learned early that "overengineering" can also be a management excuse for cutting corners that shouldn't be cut.
Overengineering is typically a term thrown around when a manager wants to ship a system in half the time it actually would take to make a half-decent system!
The biggest consequence for me was not mentioned: that your carefully planned design very soon becomes an obstacle to something a user actually wants done, at which point you say "we can't do that". Which is one of the worst things you can do. In this sense almost all of the systems I interact with, modify etc. are over engineered.
Interestingly, dang (HN mod) once mentioned in a feature request comment that he has a mental model of a complexity budget. This very closely fits my own perception: You can only add so much complexity to a system before it starts to break down under the load. Estimating that budget, and how much new features will consume, makes a good engineer in my opinion.
Another problem is that these overly complex systems are often very fragile when something changes. In a very theoretical situation there could be a case where someone with an overengineered client consumes your JSON API and their client breaks when you add a field to a certain service response. Something that should have been no problem suddenly causes total breakage of your software and then you'll have to alter that very complex piece of software. Something that could've been prevented if you kept things simple and robust and simply chose to ignore extra fields or headers you weren't using anyway.
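To make that concrete, here is a minimal sketch of a brittle client next to one that ignores fields it does not use. The `Invoice` type, its fields, and the sample responses are hypothetical, invented purely for illustration, not taken from any real API.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Invoice:
    """The only fields this hypothetical client actually uses."""
    id: int
    total: float


def parse_strict(raw: str) -> Invoice:
    """Brittle: any field the client does not know about raises TypeError."""
    return Invoice(**json.loads(raw))


def parse_tolerant(raw: str) -> Invoice:
    """Robust: keep the fields we need and silently drop the rest."""
    data = json.loads(raw)
    known = {f.name for f in fields(Invoice)}
    return Invoice(**{k: v for k, v in data.items() if k in known})


old_response = '{"id": 7, "total": 19.5}'
new_response = '{"id": 7, "total": 19.5, "currency": "EUR"}'  # server added a field

print(parse_tolerant(old_response))
print(parse_tolerant(new_response))

try:
    parse_strict(new_response)
except TypeError as exc:
    print("strict client broke:", exc)
```

The tolerant version is also the shorter mental model: the client declares what it needs and nothing else about the response shape.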
That's indeed the irony. The over-engineering happened in an attempt to be resilient to future change, but the outcome can often be the opposite. We've all been there, I think?
I've had this exact ticket in the past: A customer built a complex Java client for our public API which assumed no surplus fields in the responses, and as we added a new field, their entire application broke down, causing huge losses for the customer. I wasted so much time on explaining how our stability promise does not extend to added fields!
I do not understand how this is over-engineering. Over-engineering means that you worry too much about future/edge cases that might or might not come and you want them covered too early. The fields situation looks like bad engineering, unless the "no surplus field" assumption has an explanation which falls under "covering future needs" that escapes me at the moment.
Oh boy, does this ring a bell with me. I've already written 5575LOC according to cloc (and threw away 7449LOC in the process) and I'm far from finished writing the code to email the data from a contact form in PHP. But it ticks all the boxes!
It's all OOP
SOLID principles
99,6713% typed (according to psalm)
Purposely written to be unit-testable in PHPUnit strict mode (but no actual harness yet)
I suspect under-producting has killed far more products than over-engineering. The ratio of decision making power to decision making abilities is way out of balance for most Product Managers. Even at the big tech companies I feel most PMs are unimaginative MBA types that can optimize but not innovate and have no grasp of the concept of opportunity cost.
In terms of power structures, Product Managers decisions largely go unchecked in a lot of places. Engineering decisions face significantly more scrutiny, especially in places that work in short sprint cycles.
A product manager's decisions go unchecked most of the time because engineers and responsibility for business concerns are like oil and water. That's why PM exists, no? If engineers could just talk to marketing, sales, etc. then there wouldn't be a need for them, but alas, that is not the case.
No one is denying the need for PMs. OP is pointing out that PMs have too much decision making power, with too little accountability in most organizations.
Your argument is akin to, "PMs can't code, so alas we need engineers, and that's why shitty engineering exists, and there is no way to make it better"! Nope! We need product practices, akin to engineering practices, with 360° feedback and analysis, and the product management should be held accountable for their decisions!
Right. And I'm saying that is because engineers don't want the responsibility that they often request because the entire point of PM's existence is offloading that responsibility. They seem more than happy to complain about it though
This is still missing the point. The way to improve PM accountability isn't engineers fixing them. It's the organization's and leadership responsibility to ensure PMs are held accountable.
Engineers would be far happier if they don't have to do good engineering and can do away with shitty software without accountability. But there are checks and balances to improve engineering quality, and those aren't created by PMs. It's the engineering org that champions good engineering practices and accountability and post mortems. Same should be done in PM organizations.
From a UI/usability perspective, one of the dangers with overengineering is the "wall of options" issue, where all users - even the majority of them that just need something simple - still need to read and understand all these advanced options. As a product manager and UI designer you have a couple ways to deal with this. You can choose an opinionated subset of features that make sense for a given niche and target only that demographic. You can go the corporate way, keep the wall of options and just require training for users. Or you can try the balancing act - choose a sane subset of features as the default, and hide the more advanced options, so they don't bother normal users, but still have them as possible options. There are many important choices regarding how much to hide, and where, and how to make it discoverable, and how to make it possible to gradually dig deeper, and for users to self-identify as someone that needs to dig deeper - and those details are often as much art as science. But get it right, and you've got one of those rare killer apps that both newbies and experienced users enjoy.
I must not be the target user. I don't want to tinker with gradients, I just want a button with some text in it. Yet, I see options for the former, but I had to carefully align two different elements for the latter.
I'm sure this can all be solved with modeling out your design system or whatever, but as a product manager who needed to make a quick mockup, I found it a little overwhelming.
These are just fantastical musings of a product manager, who is pretty far from engineering, and thinks that their products fail because of engineers, not because they didn't get their product right, and/or think that every engineer who doesn't produce a fully functional Facebook with news feed and friends, in 39 days is overengineering their product and are far removed from users!
Fwiw, my experience is that generally a lot of blame for over-engineering tends to be lumped on engineering when in reality it's usually a wider business failure.
Every competent engineer I have met has been capable of grasping that:
- They shouldn't rely on the use of crystal balls.
- Complexity should be minimised.
- If it can be done today, it can be done tomorrow.
So if a business provides its engineers with:
- A process that helps them develop an intuitive understanding of their customers' needs.
- At least one other similarly capable person to work with so they know they're not the only person going to be looking after the project off into the future.
- A procurement process for off-the-shelf solutions that is less painful for them than rolling their own.
- Time to test and document where projects should go if the initial version is successful.
- And crucially, confidence that they'll be given enough time to do the additional work if it becomes necessary.
Then the business, at least in my experience, will be in a pretty good position to prevent almost all over engineering.
I often find myself preferring languages with few abstractions even when some more complex language might be "better".
Reason is that with "bare" languages like C, you can choose your own design more freely. Case in point: Serenity OS.
Author of that OS replicated entire libc and still refuses to use C++'s STL. He created his own abstractions that he feels confident using. That is true engineering.
You should check out golang, it’s much safer than C and has very few ways of doing the same thing. It makes every codebase look like simple code. It’s always a pleasure to read or refactor. That’s the only language that managed to do that imo.
The concept of overengineering is good on paper, but in practice it's being overused and not understood precisely.
I see devs freaking out when you use the word abstraction now...
Suggest extracting business logic from a React component and you don't know anymore if someone is gonna raise the "overengineering" flag.
If you start a new project, do you not use any library or framework at the beginning?
Obviously you take decisions according to how much the project/feature is expected to scale.
If your estimation was too low, you might cripple your development at some point, and need a rewrite or at least some significant refactoring.
If it was too high, I guess you overengineered.
What matters is asking yourself the question, and of course as engineers we like to challenge ourselves into building the best possible solution, but we equally need to consider how likely it is that such a solution is never needed.
In the enterprise I see this all the time. Step one of the project is "let's look at kubernetes" or whatever is hot lately. Even for something stupid with 10k users max. What they actually need is a 30 line terraform script and a preconfigured AMI.
This is why I used to throw things into simple kubernetes setups (that I just ran the same terraform scripts to create) and just tell the management "yeah, now it's ready for all of your features".
I argued for a while, then realized I didn't have to.
The best designs are so simple that people do not even understand that it is a design. That is one reason why it looks like senior devs write simple code.
Overly complex products developed in the absence of engineering input are also a major issue; at my last employer this was endemic to everything product threw at us. If you have no concept of how difficult something is to build, you have no reason not to require it; by the time it gets to engineering it's either too late to change or results in endless CRs to fix.
I'm seeing this a lot with several of the "devops" or cloud architects (or whatever the preferred title is these days for the guys who manage the apps running on the servers) that we've hired in the last year.
I've noticed two distinct types:
* The "AWS cert" admins who have a dozen different cloud certs, but little practical experience. Every problem is to be solved using some over-priced, over-engineered conglomeration of cloud services. It doesn't matter if it's just an internal app to be temporarily used by a few dozen users for 6 months, they immediately begin following some "best practice" guide by setting up some complex multi-region, multi-subnet, read-replicated, load-balanced, auto-scaled, content-delivery-networked, containerized, CloudFormation-templated, hourly-snapshotted, hot-swappable, OAuth secured and social media-login-enabled, automated environment that would be appropriate only for some retail giant's Black Friday operations, not a single CRUD app to be used only temporarily by 10 users in the HR dept.
* The "automation expert" who takes the requirements to set up and maintain a few environments (e.g. dev, test, and production) that might take only a few hours to re-create, 1-2x per year, and instead spends weeks crafting some monstrosity in Ansible or CloudFormation or Terraform, complete with all sorts of required infrastructure servers that are themselves bigger and more complex than the actual working environment. And what's worse is that none of these frameworks like CloudFormation are ever 100% anyway, so you can't just "push a button" and create a new environment. Instead, there are a dozen holes in the different tools that need to be manually plugged when run, so it's not like a developer or junior devops person who doesn't know the environment inside and out or understand the undocumented quirks could use it themselves anyway, if and when the original guy leaves the company.
I own an ecommerce company, and I'm using a huge (>$100M funding raised) company that's supposed to be a tech-enabled 3PL.
It was a terrible choice, and their idiotically overengineered solutions are a big reason why.
The most egregious one is that instead of weighing packages to find out their weight, they have an algorithm that estimates the weight. My packages are designed to come in at just under a pound (it's a big cutoff for shipping prices). Needless to say, slight problems with their algorithm can and do lead them to overbill me.
A ton of work when they could just use a scale to do the job much, much better (and I'm sure save time overall with all the refunds of overbilling factored in).
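A back-of-the-envelope sketch of why a small estimation error matters so much right at a price cutoff. The two rates below are invented for illustration (real carrier rate cards differ); only the one-pound cutoff comes from the comment above.

```python
# Hypothetical two-tier rate card in cents; real carrier prices differ.
CUTOFF_OZ = 16.0          # one pound
RATE_UNDER_CENTS = 450    # assumed price below the cutoff
RATE_OVER_CENTS = 800     # assumed price at or above the cutoff


def shipping_cents(weight_oz: float) -> int:
    """Price a package by which side of the one-pound cutoff it falls on."""
    return RATE_UNDER_CENTS if weight_oz < CUTOFF_OZ else RATE_OVER_CENTS


def overbilling_cents(actual_oz: float, estimated_oz: float) -> int:
    """Extra amount billed when the estimate, not a scale, sets the price."""
    return shipping_cents(estimated_oz) - shipping_cents(actual_oz)


actual = 15.8                 # package designed to sit just under a pound
estimate = actual * 1.02      # a 2% overestimate lands at about 16.1 oz

extra = overbilling_cents(actual, estimate)
```

With these assumed rates, a 2% overestimate on a 15.8 oz package tips it into the higher tier, while the same 2% error on a 12 oz package changes nothing. The error only costs money near the cutoff, which is exactly where these packages are designed to sit.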
So many things can kill your product. And I agree that having an engineering-led product can be especially prone to the dangers of over-engineering... but...
Not having users / customers can kill your product.
Not building the right features can kill your product.
Not doing enough testing can kill your product.
Doing too much testing can kill your product.
Having toxic / inexperienced / unmotivated staff can kill your product.
Having a bad marketing plan can kill your product.
Not having enough staff can kill your product.
Not having enough funding can kill your product.
Technical debt of all kinds can kill your product.
Bad data schemas can kill your product.
Under-engineering can kill your product.
Over-engineering can kill your product.
...
This is in no way a complete list, but from my experience the items on this list are ordered with the ones most likely to kill your product put at the top.
Most products/startups fail. This is part of the reason why. But all of this is survivable if you have customers and revenue. That needs to always be the primary focus.
Over-engineering also has a way of burning people out when a very vocal engineer says it is imperative that this and that be scalable and secure. It delays the actionable areas before anyone even wants to hack you or use you.
This is a true story about an entrepreneur who was suffering a bad case of perfectionism. Instead of showing his MVP to potential customers, he just kept investing in the software, chasing an almost impossible ideal of high availability.
While "overengineering" is usually defined as making things too complex, I think that is a bit misleading, because simplifying, removing, or making things cheaper is also engineering. In fact there are many well paid engineers who work full time simplifying things.
As much as I hate the cult of Elon Musk, I have to admit that's one aspect of engineering he really gets, and I suggest you listen to his interviews with Tim Dodd / Everyday Astronaut, where he gets technical, trust me, it is not the usual bullshit.
All that to say that "overengineering" being bad doesn't mean you shouldn't have a lot of engineers working hard on a problem. In fact, a lot of "overengineering" is the result of not enough engineering. They picked a (complex) solution without thinking instead of really studying the problem.
A critical job of a system designer is to see into the future and know which components need to be made custom so that the business can have the flexibility to make the changes where it needs them.
Over engineering is a term that is as useful as saying
- product requirements list was too big
- features were never communicated clearly (tree swing comic..)
- right people / people with relevant experience were not hired..
Overengineering is a non-technical problem but of course most tech companies only interview devs on their technical skills.
I've found that once a dev has an idea of how something is supposed to work it's hard to get them to think in a different way about that thing. So when that thing doesn't work out for some reason they just keep adding exceptions and adding exceptions until you have the horribly over-engineered solution.
> overengineering has killed more products than the absence of good development practices
But as the author defines overengineering, it is clearly not a good development practice.
Further, there is no evidence provided to support the claim.
Maybe a truer lesson is that the author's work suffers under too much scrutiny. We're all better off imagining a similar thesis and then imagining it is actually true.
I never understand why people criticize using interfaces for most functionality. How else do you write effective unit tests? Interfaces with dependency injection make unit tests far easier to write. Smartly designed interfaces also make code easier to read and understand. Lastly, adding new interfaces is trivially easy, so why not?
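For anyone who hasn't seen the pattern, a minimal sketch of what "interfaces with dependency injection" buys you in a unit test. Everything here (`Mailer`, `SignupService`, `FakeMailer`) is a made-up example, with Python's `typing.Protocol` standing in for an interface.

```python
from typing import List, Protocol, Tuple


class Mailer(Protocol):
    """Hypothetical interface: anything that can send a message."""

    def send(self, to: str, body: str) -> None: ...


class SignupService:
    """Depends on the Mailer interface, not on a concrete mail client."""

    def __init__(self, mailer: Mailer) -> None:
        self._mailer = mailer

    def register(self, email: str) -> bool:
        # Deliberately naive validation, just enough for the example.
        if "@" not in email:
            return False
        self._mailer.send(email, "Welcome!")
        return True


class FakeMailer:
    """Test double that records calls instead of sending mail."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, str]] = []

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))


def test_register_sends_welcome() -> None:
    fake = FakeMailer()
    service = SignupService(fake)
    assert service.register("a@example.com") is True
    assert fake.sent == [("a@example.com", "Welcome!")]
    assert service.register("not-an-email") is False
    assert len(fake.sent) == 1  # nothing was sent for the invalid address
```

The test never touches a network or a real mail server, which is the benefit being claimed. Whether every class in a codebase deserves its own interface is the part people argue about.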
The insidious thing about over engineering is that it's usually committed by very experienced engineers. Experienced engineers rarely under-engineer, that tends to be fixed very early in one's career.
As we get more competent and read more books, we have the tendency to get enamored by new fancy abstractions. We get too clever and then we get in our own way.
Best real-world example: I inherited a project that was a giant "microservice" with 50-ish endpoints, and a Mongo database and dozens of collections. After probably 3 months of wrestling with this thing I had a realization: "this whole thing can be just a simple command line tool and one collection in Mongo". That reduced the code by almost half and it became so much easier to work with. It's frustrating that it could have just started this way.
I used to work for a company whose founder (and original engineer) was both very curious about how compilers worked and paranoid about having the company's code base stolen. That came to fruition with the following idea: "if I develop my own language and compiler, even if someone steals part of the code they still won't be able to run it without the language specification and compiler!".
He designed his own programming language, compiler and, why not, the database and tools as well. It was a mess. It had all the limitations of a poorly designed side project and no long-run benefit, only huge technical debt.
15 years later, with dozens of clients' sites running the software locally, the company was locked into a horrible software stack. Moving away by gradually replacing modules was next to impossible: the programming language had zero interoperability, the original creator had retired, and no one besides him had worked on it for years. All the while the team had to keep supporting existing customers, because there were fires every day, with the same engineering headcount.
That is one of the top 5 "oh my" in my career. I'm glad I left that behind.
Clojure's Rich Hickey? No. I'm not too familiar with him or Clojure, but it seems he put a lot more thought into it than the co-founder in my story could ever dream of.
OP suggests there's a hump in the curve of experience vs over-engineering; more experience correlates to over-engineering until it gets to a certain point, at which even more experience leads to less over-engineering.
Is it true? I don't know, I think it matches my... experience.
Part of it is that the "experience" needed to avoid over-engineering is helped if it's not just engineering experience, but domain experience too. If someone is constantly jumping industries/domains, it might take longer to get there. I think they still would eventually.
Often I'll tend to over-engineer the initial solution, but as I think about all the complex edge cases that need to be handled etc I often find myself asking "does the customer really need all this".
After discussing it with the customer, a simpler solution often emerges, where they change the requirements slightly allowing for a much simpler solution that might even solve the actual needs better.
While it's not just down to experience, I'd say it has a strong influence on being able to see beyond the given requirements towards a better outcome.
My own personal learning curve matches that graph almost perfectly. I'm probably not as close to the right side as I'd like to think I am, but I'm always trying to move in that direction.
Over the last years I've been working with an audio library that makes several simplifying assumptions, like
- Audio interfaces are never added or removed over the app's lifetime
- Audio is never rerouted between interfaces, or at least the app doesn't need to know about it
- The details of an audio interface, like number of channels or supported sample rates, do not change while the interface is running
- Audio latency never changes and callback timing is always precise, so timestamps are not really needed and a single number is enough
Of course, none of these are really true and adding support for some of them would require rewriting a few APIs and many, many implementations. On the overengineering side, of course it has its own smart pointers, string class, thread implementation etc just in case someone needs to build for a target that doesn't support C++11.
In the world of hardware, which I realize is a parallel universe to the one in this thread where "product" apparently means "app", the answers are "all the time" and "no", respectively.
If someone didn't take the time to do actual "engineering" - which is to say, using mathematics to formalize design requirements (again, possibly a foreign concept to the app-building class) and just YOLOed the design based on intuition, you can end up with something so fundamentally broken in concept that "fixing" it requires a bottom-up redesign starting from fundamentals. A child's drawing of an airplane is not progress towards a blueprint of one!
quite a bit. under engineered as in 'never really tested, is known in general not to work'. usually, yes in theory it should be easy to fix - but since the parent organization doesn't value getting a handle on what's really going on, it never does.
At first I thought that the article would be about putting in so many features that “in the end it will be able to send e-mail”. Because this is what is killing products: adding, and adding, and redoing the design as if users really were begging for it. And then comes a new product that is leaner and simpler and takes over the old product, and the cycle starts from the beginning. I think that in capitalism it is not possible to not overengineer a product.
Engineering within the wrong paradigm can kill your product.
I see far more problems on a day-to-day basis with patterns and anti-patterns taken too far. For example, I never use the factory pattern, because it leads one down the Java road where everything ends up an object with mutable state. Which isn't scalable over some metric (like a million lines of code) because a human brain can't trace execution, even with a debugger. A far better pattern generally is to take a functional approach of accepting data, swizzling it, and returning the resulting data without mutability or side effects.
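A tiny sketch of the "accept data, transform it, return new data" style described above. The `Order` record and its fields are hypothetical, and a frozen dataclass is just one way to get immutability in Python.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Order:
    """Immutable input record (hypothetical fields)."""
    subtotal_cents: int
    discount_cents: int = 0


def apply_discount(order: Order, percent: int) -> Order:
    """Return a new Order; the argument is never modified."""
    discount = order.subtotal_cents * percent // 100
    return replace(order, discount_cents=discount)


def total_cents(order: Order) -> int:
    """Pure computation: the result depends only on the argument."""
    return order.subtotal_cents - order.discount_cents


original = Order(subtotal_cents=2000)
discounted = apply_discount(original, 10)
```

Because `original` is untouched after the call, there is no hidden state to trace: each function can be read and tested on its own, which is the property the comment is arguing for.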
Another really bad pattern is when execution suspends and resumes somewhere else (goto hell). Any project which uses queuing, eventual consistency, promises, nonblocking streams, even basic notifications or things as complex as monads will encounter this nondeterminism and inability to statically analyze code. Note that pretty much all web development suffers from some aspect of this due to its async and multi-signaling nature.
So what I do now, which I don't see much these days, is solve problems abstractly in a spreadsheet (pure functional programming), in the shell (the Actor model) or as a flowchart (declarative and data-driven design), and then translate that to whatever crappy language/framework I have to use for the project. I find that today, roughly 90% of developer effort goes to discovery, refactoring and testing of ill-conceived code. Only 10% is "actual work" and that's probably a stretch.
Which is incredibly heartbreaking for me to see, since I grew up on software like HyperCard, FileMaker and Microsoft Access which solved much of this in the 1980s and 90s in a no-code fashion. One of the very first "languages" I used was a visual programming environment called Visual Interactive Programming (VIP) for the Macintosh by Mainstay, which unfortunately today I can find almost nothing about, to show what an impact it had on computer science hah: https://duckduckgo.com/?q=Visual+Interactive+Programming+VIP...
With mainstream languages and frameworks like Node.js and React, and their predecessors like Ruby on Rails and Angular, I just keep thinking to myself "never have I seen so much code do so little". It's all overengineered man!
If you find yourself over-engineering due to stuff you think you should be doing, things that ‘real developers’ do, understand you are vulnerable to being pimped. It’s no different than anyone else handing over their common sense and self worth in pursuit of an abstract form of validation (so abstract that you can internalize the validation cycle even in the absence of a physical superior).
If you find yourself defending your bad over-engineering and those that sold you the lifestyle, understand you are fully pimped and are now defending the pimp, even to your own detriment.
Happens to all of us from time to time, snap out of it. Don’t get pimped by ‘faangs’ or ‘notable person of interest’. Don’t listen to everything I said either, lest you want to get pimped by me.
Unless you can wholeheartedly make an objective argument for why you used a pattern, a tech stack, a process, in plain simple words, sans ‘that’s how the big boys do it’, sans ‘this makes me a real developer’, you are simply pimped and spewing out pimped out thoughts of your overlord pimps. Be ready to be wrong and backtrack and own the mistake and correct course, but don’t you dare hold on to it, or I’ll probe to see who is pimping you out.
I was never more free from my overlord developer pimps until the day I realized they evangelized impractical solutions. Never hoe’ing for anyone ever again.
It's also important to try stuff out, fail, recover, and try again. That "Code Complexity vs. Experience" graph in the article is not completely a joke. Very few people can tunnel through the complexity hump without years of failures and successes behind them. Moreover, you might not have a choice.
You might find yourself dropped into an obstacle course of complexity that other people created and that you have to keep running following their arcane patterns and practices-- while at the same time implementing new features and refactoring it into something workable before it becomes completely intractable. I think almost everyone faces this problem (except for maybe the most orderly and elite workplaces?).
My message was mostly for those that hide behind mistakes via an appeal to authority. Trying new things is a risk, which is fine, but to own the success of the risk means you must also pay the collateral of owning the failure. The message is for those who don’t put up the collateral and hide behind ‘this is what everyone (the pros /sarcasm) is doing’ or ‘this is how it’s done’, and never reflect objectively.
No one would argue against trying things, it’s where all creativity and innovation comes from. I argue against dysfunction, the whole ‘the ship is not sinking’, when in fact it is.
Anyway, perhaps I’m speaking too personally, because I am on a literal Titanic right now, so apologies for that.