How a startup can survive technical debt (andreschweighofer.com)
136 points by fidrelity on Jan 2, 2021 | 116 comments



One of the big ah-ha moments I had while reading Start Small, Stay Small by Rob Walling was that most developers have an aversion to carrying technical debt because they've experienced managers that never allow them to pay it back later. But when it's your startup (or small software company), you can choose when to pay back technical debt.

It sounds obvious, but I hadn't thought of it that way. Once I realized I had full control of when to pay back tech debt, it made me more comfortable accruing it strategically.


> But when it's your startup (or small software company), you can choose when to pay back technical debt.

This is true but you will always feel that paying off the debt isn't growing the business or giving any real benefit to the customer. If you can fight that then it works. If you can't then you'll get more and more technical debt.


I don't know that this is true for all technical debt.

Generally there should be a motivation behind "fixing" technical debt that is expressible in terms of business value, whether that means faster page loads, more uptime, or the ability to deliver X, Y, Z faster (or at all in extreme cases).

If you can't express the trade-off of solving the debt in terms of business value then it's going to be low priority by definition.

Another strategy, for things that we as engineers know will make things better in the long run but that show little immediate value, is to ask for a small amount of time per sprint/week to invest. Ideally, though, you should still be able to articulate why something is worth changing.


> I don't know that this is true for all technical debt.

Even if there might be some exceptions for some technical debt, it's always true from the product owner's and business perspective.

Keep in mind that it takes resources, and sometimes even downtime, to address technical debt. From the POV of a product manager, addressing technical debt is an investment that promises little short-term return and carries significant risk in the form of downtime, regressions and bugs.

And more importantly, technical debt is not a "one and done" deal. There are always more rewrites and refactoring to be done, and the one you're doing today might be the topic of technical debt discussions in the future.


Actually recognizing when that statement is false is what pushes you further as a developer.

This statement tends to be true when you have accumulated so much debt that you cannot fix it anymore, but if you end up there given the premise that it's your startup, you have failed as the tech lead.


Job prior to this one was at a raw stage startup (two cofounders, both new college grads). Most of the back-end code base was written by a now-departed first employee (also a new grad), and it was a trash fire. The "tech" founder had a decent front-end code base, where they were very much doing what you describe: I could tell which parts of the code were written when, based on their skill growth, etc.

As an employee, it wasn't my call when to pay down the back-end's debt. I was very happy to get out of that.

Anyway. Yes, it's different when it's your debt, and what makes it your debt is then whether you're the one who decides when to pay it.


Wasn't the point that as a founder you're more likely to accrue technical debt? Your anecdote sounds like the opposite.


Hey, I thought I recognized your name! I enjoy your book reviews.


Oh, cool. Glad that they're useful!


The thing is you need to pay back the debt before the management culture coagulates and everything has to be broken down into dozens of Jira tickets each requiring hour-long grooming meetings. You won't have the freedom then to do the root-and-branch refactoring you need to do.


The first time I had a tech lead and product manager who understood when tech debt needed to be tackled, it was a huge relief compared to my previous roles/teams.


It's very simple; no juniors at the beginning!

They can make things work, but they don't have the experience to avoid tech debt even in the short term.

I have seen this as a CTO, unfortunately I came too late and the company broke.

The terrible management played a bigger part, but the code base was so terrible and the tech debt so big they couldn't ship even very simple features in a timely manner. Testing and release was insanely slow.


I really agree with this.

Junior developers just can't write clear, simple code. They accidentally over-complicate everything, in the most unhelpful way possible. If you have a senior developer that does that, I would consider them a mid-level developer at best.

I've seen this play out in 2 large companies I've contracted for so far.

At the start of a new project, recruitment look for developers with the lowest daily rate they can find and end up with a team of recent grads / people switching careers within the company etc. That's all fine as long as you add a good measure of developers with >5 yrs relevant experience to the mix. But as these people are twice as expensive, they often start without them.

A few years later the product "almost" works. But it's full of bugs, it takes months instead of days to make minor changes, and it still requires an enormous team of people who are all irreplaceable (they possess the "secret" knowledge about the product that usually would be embodied in clear code intentions, past jira tickets, commit history, basic documentation etc.)

Meanwhile a new competitor in the market has developed a much better product.

This situation can last for years (because some companies are a bottomless money pit, despite being tight with it)

Eventually some expensive people get hired to put the "finishing touches" on the product and they start the long slog to fix it up (or just quit in horror after a few weeks)

So far: 1) retired the entire project after years of development 2) company got bought out and the product scrapped after several years of investing money on it 3) is in progress (company has enough money to fund the situation for years, but time will tell if we manage to untangle the mess before it kills the project)


I can't emphasize this strongly enough: no juniors. Probably for the first 3+ years. They're a disaster, and it's taken my company a year -- with another year or more to go -- to recover.

It's not even necessarily their fault; they just need piles of time and mentorship and direction that you can't afford to give yet. Worse, you'll quite possibly end up firing them when you eventually start reining them in and they don't like being told how to build things.

Trust me, it's been an extremely painful experience.


"Worse, you'll quite possibly end up firing them when you eventually start reining them in and they don't like being told how to build things."

Had this exact thing happen at my company. It's tough when you have to start doing this, but even tougher if you don't. There are times I certainly think we'd have the same output with half the people if we'd kept the bar higher.


Counter-anecdote: one very successful SaaS startup I know launched with one unsupervised junior dev building everything, because they had no budget for more. He knew so little about coding he didn't even know what tech debt was when I spoke to him, a year in. But he got the v1 to market, which got them investment, and allowed them to hire in experience round him, and start paying off a mountain of tech debt. It was painful, but it worked. And, crucially, they could not have even got started if they'd not done it that way.


I don't think it's a "counter-anecdote". It worked DESPITE the junior, not because of him.

The startup I'm currently working for has the same story. A junior wrote most of the code, but we got hired at the first round. It will take months to untangle the spaghetti. It might work, but then I'm very experienced at handling even huge and terrible code bases (300-500 thousand lines of legacy code).


I ONLY know this story. In the past 15 years I've only seen "started out underfunded and as a mess" and "never went live because of overengineering", nothing really in between, but I'm in Europe, where seed funding rounds are pretty small.


Actually it did work 'because of' the junior dev. They couldn't afford a senior dev, so the only viable path to success was to hire a junior. If the point of the original anecdote was 'don't hire a junior for your startup' the point of mine is 'sometimes do'.


I've seen this a few times. Terrible code but successful company. The companies could not have afforded better programmers. Instead of seed funding they funded with debt, technical debt.

The hardest part is explaining to the owners why it's so expensive to fix everything. They don't understand how it was so cheap to build yet so expensive to maintain.


I think a team of 1 developer is going to result in some not-too-bad code eventually (as long as the project is small and that developer is diligent and cares about the project).

But a team bigger than 3 developers where no one has any solid experience is a recipe for disaster IMO


If their product was made by one inexperienced dev and they got it to market and landed investment I can't think of a better time to invest in a rewrite or v2.


IIRC this is what happened with Bird's iOS app.


I think it depends a lot on the juniors. I’ve seen some really terrible code written by experienced people who just didn’t hone their craft, and some juniors who figured out some good practices very early on


Years in the business != seniority


Yeah, I met juniors who improved at an insane rate, and they became awesome developers, but it's very rare.


I've seen a startup struggle with so much waste from a bunch of senior engineers massively overengineering an app to the point of developing something akin to their own front end framework... before being anywhere in sight of profitability or even PMF. The pain lasted for years after, even when a successful business model started to come into view.

With the exact outcome that you describe.

Senior v. Junior is not the problem here. Developing according to your stage and future is. Build things that can be thrown away next week when your product shifts (i.e. low coupling, simple, avoid abstraction adventures).

_Maybe_ "senior" devs are more likely to understand that. I'd argue it's rather devs with any multi-year experience at an early stage startup, regardless of skill level.


Someone who unnecessarily overengineers an app is not a senior engineer by my definition. They are just somebody who has been working in the field for a long time.


Honest question: how are "junior" and "senior" defined here?


Here is how I think about this usually:

- Junior: somebody who is happy that they found a working solution. Most of the code they write doesn't even work, or they don't yet know how to approach problems.

- Normal / mid-level: developers who can easily find working solutions to problems, but not always the best or the easiest to understand. Should be able to handle edge cases and errors properly.

- Senior: how to solve problems is not even a question for them. They can come up with multiple ways of solving the same problem, but their chosen solution will be flexible, easy to understand (even for juniors) and maintainable in the long term. When they take on technical debt, it is a conscious decision and not a fatal mistake.


While hiring senior developers doesn't necessarily inoculate you from the same fate, your point about the lack of experience is real. Some lessons can only be learned via the ass-kickings that come with experience.


Is the problem actually the presence of juniors, or rather the fact that in "agile" all developers are treated as replaceable, so the juniors randomly get assigned strategically important tasks where they can do long-term damage?

I imagine that after a month, when the senior developer(s) laid down the foundations, it should be possible to find work for the juniors (not all at the same time, but adding them gradually). They just cannot be assigned tickets at random.


One technique I've used as a team lead to limit tech debt that makes it into production is to have devs write prototypes in a different language than what we actually support in prod. This has a few really nice advantages.

1. Devs enjoy getting to use new languages in the real world, and it helps keep us learning.

2. The better you know a language, the more tempting it is to take shortcuts. When you don't know a language well enough to write really gross things that work "for now", you have to think carefully about the simplest possible solution to a problem that you can express clearly in an unfamiliar language.

3. You learn tons the first time you solve a particular problem. At the end of the solution, when all of your hacks and tradeoffs are fresh in your mind, you are the best possible person to tackle all the shortcomings and immediately do a rewrite in a language you are expert in.

4. Management really can't twist your arm to just go ahead and transition the prototype to MVP. All you can really do is throw it out there as a public beta with no guarantees while you build the MVP properly using everything you just learned from doing it the first time.

5. You can start collecting feedback on what users want included in v1 and get an idea of where the user base is heading with their desires and plan some of that into your design, again reducing long-term tech debt.

If your prototype sticks the landing well enough for management to decide to move forward to MVP, then it's good enough to mark your territory in the problem space while you do the rewrite. You won't lose competitive advantage while it's sitting out there in a separate VPC collecting users and activity.

In a healthy company, this isn't a difficult sell to management. They'll understand the short and long-term value to the company. In more toxic environments, it will look more like you're throwing a poison pill into your dev process—which you kind of are. So, you know, tread carefully. But when people buy in to this process, it works really nicely and has far better long-term outcomes.


> The better you know a language, the more tempting it is to take shortcuts. When you don't know a language well enough to write really gross things that work "for now", you have to think carefully about the simplest possible solution to a problem that you can express clearly in an unfamiliar language.

This has not been my experience. Much of the worst code I've run into was written by engineers that were new to a language or framework (including myself).


To be clear, I am not suggesting that the resulting code will be good. But it tends to be more basic and not rely on weird tricks that deeper knowledge of the language will allow you to do.

My larger point is that the first attempt to solve any problem in code is going to suck, so you might as well suck in a different language that won't ever have a chance of being deployed into prod. You learn some things, you get some exp with a new language, and a better frame of mind for solving the problem in a better way.


Moreover, what's idiomatic in one language isn't in others. Non-idiomatic code is one type of technical debt, and you can get that when translating code from one language to another.


That's because developers see early-stage startups in the same way musicians see dingy pubs: a badly-paid and generally unpleasant venue to learn your craft and make all your beginner mistakes with minimum responsibility and damage to reputation. You don't want to be playing in dingy pubs all your career, though.


> The better you know a language, the more tempting it is to take shortcuts. When you don't know a language well enough to write really gross things that work "for now", you have to think carefully about the simplest possible solution to a problem that you can express clearly in an unfamiliar language.

I have the opposite experience. The worst performance bottlenecks I have had to troubleshoot came from principal engineers (and tech leads) who use medium- to high-level abstractions in Ruby on Rails without actually understanding how many SQL commands they are triggering by writing clever one-liners that taco my DB, often using a Rails 2.4 method that has a much better Rails 5 alternative they don't know about.
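
To make the "clever one-liner" failure mode concrete, here is a minimal sketch in Python with the standard-library sqlite3 module rather than Rails. The schema, the `CountingDb` wrapper and both function names are made up for illustration. The only point is that an innocent-looking loop issues one query per parent row, while a single join fetches the same data in one round trip.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a tiny in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO authors VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
        INSERT INTO posts VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
        """
    )
    return conn


class CountingDb:
    """Thin wrapper that counts how many queries the caller issues."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.count = 0

    def query(self, sql: str, params: tuple = ()) -> list:
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


def titles_by_author_lazy(db: CountingDb) -> dict:
    """One query for the authors, then one more query per author (N+1)."""
    result = {}
    for author_id, name in db.query("SELECT id, name FROM authors ORDER BY id"):
        rows = db.query(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        result[name] = [title for (title,) in rows]
    return result


def titles_by_author_joined(db: CountingDb) -> dict:
    """The same data fetched with a single LEFT JOIN."""
    result = {}
    rows = db.query(
        "SELECT a.name, p.title FROM authors a "
        "LEFT JOIN posts p ON p.author_id = a.id "
        "ORDER BY a.id, p.id"
    )
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:  # authors without posts yield a NULL title
            result[name].append(title)
    return result


if __name__ == "__main__":
    lazy_db = CountingDb(make_db())
    titles_by_author_lazy(lazy_db)
    joined_db = CountingDb(make_db())
    titles_by_author_joined(joined_db)
    print("lazy queries:", lazy_db.count)
    print("joined queries:", joined_db.count)
```

With three authors the lazy version issues four queries and the joined version one. The gap grows linearly with the number of parent rows, which is why this pattern tends to stay invisible in development and only hurts against production-sized tables.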


I'm confused by your example. Sounds like someone who knows (or thinks they know) a language well writing code that is "good enough for now" and causing problems down the road?


Maybe I'm plain wrong, but this sounds like "CV-oriented development" (which is motivated by adding as many languages and stacks as possible to your CV).


> The better you know a language, the more tempting it is to take shortcuts.

I used to work with experienced and passionate Scala developers. On the one hand, their expertise helped them to avoid common language and runtime pitfalls.

On the other hand, the desire to produce elegant, pure functional code and to experiment with and utilise powerful abstractions slowed development down and led to code that was hard for new team members to understand and jump into.


How does the runway allow for such practices?


The time-consuming aspects of new product dev are normally not about the time it takes to write the code. They are translating product requirements into solvable problems and logic flows that solve those problems, bumping into unseen edge cases in the product concept, coming to consensus about what compromises are acceptable in the prototype, and bumping into sharp edges while you flesh solutions out because you didn't anticipate a thing when you started your base design or didn't understand the relationships between objects and functionality when you kicked the project off.

These problems that take the most time in any project are mostly human, conceptual, product, and communication problems. Not code problems.

Once you have sorted out all of these problems and solved them, a rewrite in a different language with a better design moves very quickly. You don't have to double the runway or the time to MVP. Time estimations are always wrong anyway. But in my experience when I've gotten buy-in for this approach I would guess it adds 25-30% of actual time overhead to a roadmap. Prototypes always take longer than expected, and an immediate redesign/rewrite takes less time than expected.

Selling this approach to leadership really boils down to clearly identifying the value of developer time. It's not code; it's solving business problems. Once those are conceptually identified and solved, the design and code tend to fall into place without a ton of trouble. The problem is that no one really knows what the business problems are until you try to solve them and really dig into the details.

Prototypes should be understood as the process of defining the problem space and uncovering all the hidden issues that people haven't really thought through just yet. MVPs should be an actual product based on that exploration and problem definition and solving. What I'm suggesting here is really just a process boundary that reflects the difference between the two things.

"Can runway handle that?" is really a lot like asking if you can afford to build a product at all.


It really does depend on the management and environment of the company as you’ve stated. I might be more cynical about this, but I see it more likely that management would just leave the project at the prototype stage, release it, and move dev resources to other things.


This is a definite risk if you haven't planned for it and got buy-in. But it's a risk that's also fine with me: I don't want to support and develop shitty prototypes that I've whipped up and know to be awful from a maintenance perspective. So if my garbage gets sidelined and I go work on something else, that's also a win and also not my problem anymore.


How many startups are actually killed by tech debt?

I saw many successful companies with shitty software.

I had the impression the business part of things killed many more companies if they didn't get it right.


You could argue that Netscape was killed by technical debt. [1] But I think it's rare to get a case that's clear enough to say. Technical debt gradually reduces product agility and raises costs, so it can be hard to tell the difference between a business failure and excess technical debt. A lot of startups run out of money while trying to find product-market fit, and careful management of technical debt plus the right product processes can get you way more product experimentation.

[1] https://www.joelonsoftware.com/2000/04/06/things-you-should-...


Except the citation doesn't really argue that technical debt killed Netscape, but rather that it was killed by a rewrite from scratch - the article ends:

> If you are writing code experimentally, you may want to rip up the function you wrote last week when you think of a better algorithm. That’s fine. You may want to refactor a class to make it easier to use. That’s fine, too. But throwing away the whole program is a dangerous folly, and if Netscape actually had some adult supervision with software industry experience, they might not have shot themselves in the foot so badly.


Why do you think they rewrote from scratch?

The from-scratch rewrite is a classic response to declining productivity. In the metaphor of technical debt, it's equivalent to declaring bankruptcy.

As Jamie Zawinski wrote in 1999: "We never distributed the source code to a working web browser, more importantly, to the web browser that people were actually using. We didn't release the source code to the most-previous-release of Netscape Navigator: instead, we released what we had at the time, which had a number of incomplete features, and lots and lots of bugs. [...] The code was just too complicated and crufty and hard to modify [...] By being a cleaner, newly-designed code base, so the theory went, it was going to be easier for people to understand and contribute. And this did get us more contributors. But it also constituted an almost-total rewrite of the browser, throwing us back six to ten months. Now we had to rewrite the entire user interface from scratch before anyone could even browse the web, or add a bookmark." -- https://www.projectseven.com/grafitti/october_2001/zawinski....

He doesn't use the term "technical debt", because that didn't start becoming popular until 2004 or so: https://books.google.com/ngrams/graph?content=%22technical+d...

but if you look at Ward Cunningham's original 1992 description, it sounds pretty congruent: "Shipping first time code is like going into debt. A little debt speeds development so long as it is paid back promptly with a rewrite... The danger occurs when the debt is not repaid. Every minute spent on not-quite-right code counts as interest on that debt. Entire engineering organizations can be brought to a stand-still under the debt load of an unconsolidated implementation [...]"


I think you're missing the fundamental point made in your citation - whether you suffer from technical debt or not, you can solve the problem without rewriting from scratch. Again, quote from the article cited above:

> There’s a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong. The reason that they think the old code is a mess is because of a cardinal, fundamental law of programming:

> It’s harder to read code than to write it.

> [...] Even fairly major architectural changes can be done without throwing away the code. On the Juno project we spent several months rearchitecting at one point: just moving things around, cleaning them up, creating base classes that made sense, and creating sharp interfaces between the modules. But we did it carefully, with our existing code base, and we didn’t introduce new bugs or throw away working code.

> A second reason programmers think that their code is a mess is that it is inefficient. The rendering code in Netscape was rumored to be slow. But this only affects a small part of the project, which you can optimize or even rewrite. You don’t have to rewrite the whole thing. When optimizing for speed, 1% of the work gets you 99% of the bang.

> Third, the code may be doggone ugly. One project I worked on actually had a data type called a FuckedString. Another project had started out using the convention of starting member variables with an underscore, but later switched to the more standard “m_”. So half the functions started with “_” and half with “m_”, which looked ugly. Frankly, this is the kind of thing you solve in five minutes with a macro in Emacs, not by starting from scratch.
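
As a rough illustration of how mechanical that kind of rename is, here is a sketch in Python rather than an Emacs macro. The regular expression and the `rename_members` helper are assumptions made for this example, not anything from the article. A plain text substitution like this will also touch matches inside strings and comments, so treat it as a starting point to review rather than a safe refactoring tool.

```python
import re

# Matches identifiers that start with exactly one underscore followed by a
# letter (e.g. `_count`). The lookbehind skips underscores in the middle of a
# name (`m_total`), and requiring a letter next skips dunder names (`__init__`).
LEADING_UNDERSCORE = re.compile(r"(?<![A-Za-z0-9_])_([A-Za-z][A-Za-z0-9_]*)")


def rename_members(source: str) -> str:
    """Rewrite `_name` identifiers to `m_name` across a source string."""
    return LEADING_UNDERSCORE.sub(r"m_\1", source)


if __name__ == "__main__":
    before = "int _count; _count = other._size + m_total;"
    print(rename_members(before))
```

Running it over a file and reviewing the diff is closer to minutes of work than to a rewrite, which is the point the quoted passage is making.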


I don't think I'm missing that point.

Once the technical debt is so deep that bankruptcy seems like a good option, digging out is also not a great option. It's certainly possible now if it wasn't then; people like Michael Feathers have done great work charting paths out of an accumulation of garbage code. But at the time of JWZ's resignation letter, Martin Fowler's book Refactoring had been out for all of 3 months.

In particular, Netscape was faced with a choice: they could either do a total rewrite quickly, or take much longer to do a slow rewrite while still pushing forward, but at a much reduced development speed. That's fine if you're some large enterprise in a stable field with a solid revenue stream. But a startup competing in the then-fast-evolving world of the Internet? Competing against the monopolistic giant of Microsoft, who was desperate to dominate the space?

There's little reason to think a slow cleanup would have been any better than a fast rewrite for Netscape. Especially if their solution involved open-sourcing things, which a) requires a non-awful code base, and b) reduces the pace of refactoring and cleanup. So I think it's fair to say that either way Netscape was killed by technical debt.


> It's certainly possible now if it wasn't then; people like Michael Feathers have done great work charting paths out of an accumulation of garbage code. But at the time of JWZ's resignation letter, Martin Fowler's book Refactoring had been out for all of 3 months.

Nothing has fundamentally changed between now and then. Sure, the information may be more readily available, and we can argue that it's unfair to judge Netscape by today's standards, but even so, I think the article you linked makes a strong argument that technical debt did not kill Netscape, and that their approach to managing it did. This is why it goes on to discuss other, more successful approaches.

> In particular, Netscape was faced with a choice: they could either do a total rewrite quickly, or take much longer to do a slow rewrite while still pushing forward, but at a much reduced development speed.

Were they actually faced with this choice? The article you linked points out there are at least 3 reasons people choose to do rewrites. This is one of them ("architectural problems"). Even still, the article's argument is that this is a false dichotomy, and that there are better alternatives.

I also think "total rewrite quickly" is an oxymoron and a bit disingenuous. Your article points out it took them three years. In my experience, they typically take full-blown products at least two years - if they are successful at all.

If you are worried about Microsoft beating you because you are slowed by technical debt, you certainly need to be worried about Microsoft beating you because you didn't ship anything for three years. That is the entire point the article makes. The very first sentence reads:

> Netscape 6.0 is finally going into its first public beta. There never was a version 5.0. The last major release, version 4.0, was released almost three years ago. Three years is an awfully long time in the Internet world. During this time, Netscape sat by, helplessly, as their market share plummeted.

I find your next point surprising, given your citation:

> There's little reason to think a slow cleanup would have been any better than a fast rewrite for Netscape.

The article you cite makes this exact argument:

> You are putting yourself in an extremely dangerous position where you will be shipping an old version of the code for several years, completely unable to make any strategic changes or react to new features that the market demands, because you don’t have shippable code. You might as well just close for business for the duration.

It then goes on to list alternatives, including a "slow" rewrite, during which time you can still ship code.

None of this is to say that too much technical debt can't slow you down to the point that you cannot keep up with your competitors - but it is certainly not the argument your citation makes. Its argument is tangential, and related only in the sense that technical debt is presumably what pushed Netscape to make the decision it did.

I have to say, I am fascinated by our difference in takeaways from this article.


I'm not sure why you think I'm responsible for the content of Spolsky's 20-year-old article, or agree with it deeply. I don't know that he would even agree with it. It was just the first example I recalled of somebody arguing that a technical-debt-driven rewrite killed Netscape.

A total rewrite, if done well, is going to take less resources and way less calendar time than a refactor-and-cleanup approach done equivalently well. That's what I mean by quick. Was Netscape's rewrite done well? I doubt it. But there's no reason to think their refactor-and-clean up would have been any better. Which again leads back to my point: they were probably fucked either way at that point. Note that in the ensuing years, nobody has made major direct revenue from selling a browser.

The thing that has fundamentally changed versus 20 years ago is major advances in dealing with existing code. Back then, writing tests was extraordinarily rare. Now if it's not the norm, at least it's common. Things like continuous integration are common. Builds and tests are much faster, and the amount of computing power available to an individual dev is hugely improved. The tooling advances are incredible as well, from debuggers to languages to runtimes to IDEs with built-in understanding and refactoring of the code. All of this makes incremental cleanups much easier than was possible in 1999 with a 1994-era mishmash of C and C++.


I don't think you're responsible for the content, nor have I stated you must agree with it deeply. In fact - I am arguing the opposite: The cited article does not make an argument for technical debt killing a product. As you stated above, it's an argument that "a technical-debt-driven rewrite killed Netscape."

> A total rewrite, if done well, is going to take less resources and way less calendar time than a refactor-and-cleanup approach done equivalently well.

This is totally up for debate, and largely what Joel's article is talking about. Does "less resources" include opportunity cost? You may be able to get something out in less calendar time, but will it be as feature complete? Will it get rid of the old bugs that plagued the product? Will it avoid introducing new ones? There are many reasons to believe that a refactor would be better - for most projects this is probably true (at least in my opinion, and evidently in Joel's too!)

I, like many others, have seen the disastrous results rewriting from scratch can produce - if they produce results at all.

I'm not sure that the time builds take has any impact on which approach is faster. Certainly tooling advances might - but it's unclear to which approach the advantage is given. And while tests certainly make refactoring far easier, this can often be done as part of the refactoring.

And let's also remember that the article cited was written 20 years ago - the timeframe you're referring to. This idea that rewriting from scratch is dangerous and costly is not predicated on recent advancements in tech.

We can argue whether Netscape's approach to rewrite from scratch was the right approach to take, and we can also argue about whether rewriting from scratch is generally appropriate, but the intention of my original comment was to point out that Joel specifically makes the argument that it was rewriting from scratch that killed Netscape - not technical debt itself - and that there are alternatives which would've been far less risky (from his POV.)


Ok. You have achieved your aim. You pointed it out. I obviously disagree with that conclusion. Seems like a fine place to end this discussion.


Could one argue that the rewrite being necessary was the result of too much technical debt? Thus the debt did contribute to the failure.

But I wouldn't go so far as to call it the sole reason the project failed - that just sounds too easy an explanation.


The article cited makes a strong argument for why you should not write from scratch, even with immense technical debt - and offers alternatives.


> But throwing away the whole program is a dangerous folly, and if Netscape actually had some adult supervision with software industry experience, they might not have shot themselves in the foot so badly.

It’s difficult to imagine Firefox having the success it had (being a browser for the modern web) if it was based on the Netscape 4.7 source code. I really think they had no other choice.


The article cited makes a strong argument for why you should not write from scratch, even with immense technical debt - and offers alternatives. For example:

> Even fairly major architectural changes can be done without throwing away the code. On the Juno project we spent several months rearchitecting at one point: just moving things around, cleaning them up, creating base classes that made sense, and creating sharp interfaces between the modules. But we did it carefully, with our existing code base, and we didn’t introduce new bugs or throw away working code.

> A second reason programmers think that their code is a mess is that it is inefficient. The rendering code in Netscape was rumored to be slow. But this only affects a small part of the project, which you can optimize or even rewrite. You don’t have to rewrite the whole thing. When optimizing for speed, 1% of the work gets you 99% of the bang.

Quite frankly, Netscape was very good for its time. Firefox today is also drastically different than Firefox 1.0 ("It’s difficult to imagine Firefox having the success it had if it was based on the Firefox 1.0 source code").


Tech debt rarely kills a company, but it does kill teams, in descending hierarchical order. It usually goes like this:

    Day 1
    Stakeholder: "The app is down, what is happening?"
    Engineering lead: "We are working on it. Do not worry".

    Day 2
    Stakeholder: "The app is down, what is happening?"
    Engineering lead: "Do not worry, we got this".

    Fast forward to Day 15:
    Stakeholder: "Can you explain why the app is down again?"
    Engineering lead: "We got this"
    Stakeholder: "That is what you said 2 weeks ago. We will send someone to take a look at it"
    External auditor: "wtf wtf wtf. These guys do not know what they're doing."

    Day 16:
    Stakeholder: "Say hello to Bob, he is your replacement. Please work with him making sure he has all the context he needs during this transition".
    Engineering lead: "But, but... it's just technical debt"
    Stakeholder: "You may leave now"


The engineering lead should be fired. If the problem is technical debt, he should do a proper post mortem and alert the stakeholder to what is required to solve the problem for good.


Sounds like the reasonable thing to do. But some people are brainwashed to love tech debt:

1) Some people mistakenly interpret "maximizing the work not done" as maximizing tech debt.

2) Power is the ability to make people do what they do not want to do. Saying "no" is a way to exercise that ability. Some leaders become addicted to the feeling of exercising their power.

3) Cohesive groups of people tend to prioritize cohesion over rationality during decision making (i.e.: "groupthink"). Groupthink makes teams unable to make correct decisions or assess risks in a rational manner.

Many startups suffer from issues 1, 2, 3. And some even perceive those weaknesses as strengths. When that happens, the only way out is to decapitate the team, punish groupthink and reindoctrinate the team members.


Why would the app be down that long though? Just take on more technical debt to get it back up and running. If Bob gets the app up and running then that’s all that matters to the stakeholder.


It doesn't have to be continuously down. It only has to be down often.


To the end users and management the two are essentially the same outcome: customers leave your non-functional service.


“It works for me.”


Technical debt can slow down product development, which means you can’t iterate quickly enough to compete or to get product market fit. I’m working on a game right now and there’s some debt in there just to get content ready to demo, but everything’s optimized to be easy to change because, otherwise, the whole project will stall. It’s scary trying to find the right balance


Technical debt is a co-morbidity. It's not necessarily the thing that kills you, but it increases the chance that problems become fatal.


"How many startups are actually killed by tech debt?"

Basically 0. If you have a good idea - you can overcome. If you have a crappy idea and an amazing engineering team you are doomed.[1]

[1] https://randsinrepose.com/archives/1-0/


The article you've linked is not relevant for this thread. It points out that if you have Pitch (the idea), People, Process, Product, you will probably fail anyway.

Technical debt, if it's relevant at all, is relevant at much later phase than that.


No, it points out that if you have a crappy pitch (idea) you are doomed. I would reread.


A part of a product I worked on was killed by the business and outsourced instead. I’m 100% sure this was in fact because of technical debt, even though my team members would disagree.

It wasn’t even technical debt in the sense that it was bad code, it was because they prematurely tried to optimise everything, so adding a feature would take twice as long. We also couldn’t add new devs to the project because of the team lead’s decision to not document the code, because “code should document itself”.

Bugs could only be identified by the dev who wrote that feature because the code was so obscure and undocumented.


> A part of a product I worked on was killed by the business and outsourced instead. I’m 100% sure this was in fact because of technical debt, even though my team members would disagree.

I've also had this experience, where a product accumulated so much tech debt (in part, due to my decisions) that it was simpler for the business to outsource the functionality. That wasn't the only reason, but it was a big one. When tech debt sand gets in the software delivery gears, the allure of vendors increases. Sure, the business won't get things exactly like they want them, but they won't have to deal with missed deadlines and degraded functionality either.


That’s exactly what happened. I’m sure the vendor code is riddled with spaghetti code, poorly optimized and undocumented. But as long as they deliver, the business doesn’t have to care about it. They can also just say what they want, and it gets done, without any real pushback from developers.

I can’t blame them for doing this tho, I often felt like we spent most of our time discussing implementation, doing code reviews and optimizing code for the sake of optimizing it. Towards the end it was also clear the business just didn't want to deal with our team anymore, they were annoyed we couldn’t just build the feature they wanted.


> When tech debt sand gets in the software delivery gears, the allure of vendors increases

Well put. Could not have said this better myself. I've experienced situations where the tech debt sand is a consequence of fairly unique business and internal system constraints. The allure of vendors did increase and the company did eventually research vendors, but nothing addressed our snowflake use cases perfectly.

I don't think that's necessarily a positive reflection on our product though. If anything, the lesson I take from that is to build your business processes and teams to reduce as much accidental complexity as possible.


At this point in my career, I just run away from a company that operates that way.

Code tells me what the coder was doing. It does not tell me why. If the why is not blindingly obvious, it deserves a code comment.


What are some good projects that have the right balance of documentation, in your opinion?

I am looking for projects that are very small.


I think this depends on the project. In some projects the code can speak for itself: any competent developer would be able to look at the code and understand what’s going on.

In projects with a lot of business logic, or projects that require understanding of a specific subject, it’s often better to write comments to clarify what’s going on. Sometimes code alone isn’t enough.


I agree and to that list I'd add:

- Performance optimizations (always worth mentioning why you ugly'd up some code for a necessary speedup)

- Kludges where you're working around something in an external library or service
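A minimal sketch of what those two kinds of "why" comments might look like, in Python. The function names, the vendor behavior, and the pricing numbers are all made up for illustration:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def shipping_cost_cents(weight_grams: int) -> int:
    # WHY the cache: (hypothetical) this is called once per line item and the
    # same few weights repeat constantly, so memoizing avoids recomputation.
    # The uncached version is simpler; drop the cache if that stops being true.
    base_cents = 500
    per_kg_cents = 120
    whole_kg = (weight_grams + 999) // 1000  # round up to the next whole kg
    return base_cents + per_kg_cents * whole_kg


def parse_vendor_amount(raw: str) -> int:
    # KLUDGE: the (hypothetical) vendor API sometimes appends a currency code,
    # e.g. "12.50 USD". Strip it here until the vendor fixes their response.
    number = raw.split()[0]
    dollars, _, cents = number.partition(".")
    # Pad or truncate the fractional part to exactly two digits.
    return int(dollars) * 100 + int((cents + "00")[:2])
```

Neither comment restates what the code does; each records the reason the code looks odd, which is the part a later reader cannot recover from the code alone.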


can you give me any examples


I think I've worked for several - I couldn't say for sure because the first I know about it is when the company has that unscheduled all-hands meeting that I've learned to recognise. The one where they break the news that they've run out of money and your contract is terminated.

Too much debt slows you down, and if you don't move quick enough as a start up, the money stops. It's that simple.


I work in a fear-based culture right now. The CTO never saw a kludge he didn’t want into production stat, and in his dual role as CEO won every disagreement. There are many pieces on tech debt that are true and valid, this one IMHO covers more salient issues than most.

If you’re reading this comment before you read the article: it’s about how to avoid getting caught too badly in the first place, not about magic bullets to get you out when you’re basically bankrupt.


I will never again work in a company with a dual CEO/CTO. No matter how well-intentioned and talented that person is, it always leads to a toxic environment. It's an instant red flag that the person is incapable of delegating responsibilities. Separation of duties at the C-level is important. There needs to be some amount of healthy friction in the leadership team to make appropriate compromises. Hope you are able to get out of this situation soon.


> I work in a fear-based culture right now

That sucks. I have been there. Awful. For me, the silver lining is that I no longer ignore stinky red flags


Yeah, I think it can be painfully helpful to go through that miserable experience at least once. But my very strong advice to others:

Do NOT put up with that.

You don't have to. There are so many companies, even now, where you can get decent work with much more sensible norms.

There's always kind of a gravitational pull toward excess debt because "get the [fix/feature] out there in front of [users/client/exec]" is such a tempting drug for management, even if they're generally good. But the amount and toxicity of the debt varies widely. Find a company that's sane about this if you value your own sanity imo.


Thank you for the support! Doing my best, and leaving when the right opportunity appears.


Yep, same, and it hurts the most loyal and hardworking the most. Do what's best for yourself and find a better job.


I enjoyed this article and I think it's helpful for technical leaders in early stage startups. I feel like a lot of these companies will continue having problems mostly because the people that are attracted to smaller/early-early companies are the same kinds of people that don't want to work in structured (more corporate/deliberate) environments.

My experience is a bit more on the agency side where technical debt is caused by developers/project managers not really caring about the longevity of an application. The idea of going back and fixing anything that isn't specifically called out and invoiced for is so far away from the priorities of these businesses.


> My experience is a bit more on the agency side where technical debt is caused by developers/project managers not really caring about the longevity of an application. The idea of going back and fixing anything that isn't specifically called out and invoiced for is so far away from the priorities of these businesses.

This resonates with me. I consider myself a better fit for product focus vs project focus, preferring to find something meaningful and nurture it in terms of value to customer, and value to developers (code quality).

Recently I accepted a job at an agency, which a coworker described as a "chop shop". I became depressed. The app I would work on got a frenzy of activity from Oct - March. It had to be released in March no matter what. Outside that window there was no activity, no refactoring, no addressing tech debt. In fact, one week in December the client was unable to pay for some technical budgeting reason and we all just sat around for that week despite the looming deadline.

Well, over the years this thing accumulated tech debt. Even though it was a cash cow for the agency: reliable income at reliable times from a good customer. One would think the health of the product would somehow invite extra effort. Every year new devs would start working on it (new devs, because turnover at these places is high). They would look at the mess and begin resume driven development to use the shiniest new things that were left out, hoping to fix and compensate for the problems from just sheer neglect, and then move on for the cycle to repeat.

That said, there are people that just prefer hopping from project to project to experience new things.


I suspect the tech debt that many start ups encounter is a matter of execution. They likely require original solutions to tough problems in very short time. The way to reduce tech debt in this case is to make deliberate time at great expense to frequently refactor complexity out of the system.

In the corporate world, on the other hand, the problem is an astonishing fear of originality. Most places I have worked have an "invented here" mentality baked into every decision. Every 5 to 8 years they start over completely with a different tool, framework, or language hoping it magically solves for the group’s prior poor implementation.


I've found it helpful to consider the type of technical debt too. Get the data models right and most other technical debt is a lot easier to pay down. If you keep adding work on top of bad data models, things get more and more out of hand. I'll always fix data modelling as soon as the abstraction is found to be incorrect, even though that can cause a delay.

But, if you have repeated functions, or two similar react components which could be merged, that's a far lower risk to current and future development velocity.


Code is like a performance by musicians. They play together typically based on a plan known as "score".

But when they play a piece for the first time the performance is typically much lacking. They have to rehearse the symphony or rock-opera or whatever.

Removing technical debt is like playing the song again, this time better. You learn what works and what doesn't, and can do it better. The challenge is how to know the code-performance is good enough.


The title is good, and contains an often overlooked point: startups need to survive tech debt precisely because they do not need or want to avoid tech debt. Startups generally don't have the resources to do things properly. They don't have the people, the cash, or the time. So they trade on their future value: get cash from investors on the promise of future returns. Pay below-market salaries but promise quick promotions and options packages. Take on tech debt to get to market now, and hope to pay it off later. And, when it works, this is a great strategy. Tech debt is good if you're a startup. But only if you survive it.

Personally I've never seen a startup that was actually killed by tech debt, although it does sometimes cause good employees to quit, and of course it causes product delays. It's like any form of debt: a dangerous but oftentimes necessary tool.


Every bug fix is paying down technical debt. We don't squash every bug, we fix the most important ones. Otherwise, we'd be swatting bugs all day.

If there is a bug in the woods, and the bug takes a shit, but no one is around to hear it...

Now, is it going to kill a startup? Hardly. At that point, an MVP, it's probably still small enough that the better decision is to rewrite it altogether. If you split the server and the UI to begin with, you can re-design the front-end and make a party out of it, try to make it pretty this time.

If you're at the point where you need more than 1 person to handle the whole app, then you're not a startup anymore, imo. You have a proven small business.


"Revenue solves all known problems"


One tactic I have used with moderate success is to explicitly carve out time for fixing tech debt. A few things we’ve tried:

1. Pledge to schedule 20% (pick your percentage) of the sprint as tech debt payback rather than stories. The problem here is that this stuff tends to drift to the bottom of the sprint and then slip if anything else is delayed. This is where most teams start as you can start at an arbitrarily small percentage and scale up gradually.

2. Fixit Friday; the middle Friday of a sprint is for fixing tech debt, not working on sprint tasks. Engineers are encouraged to advocate for tasks they think would speed us up, and we also have done themed pushes where e.g. the whole team chips in to migrate the codebase to a new linter. One issue here is that it’s hard to make progress on larger initiatives, which leads to:

3. Fixit Sprint. At the beginning of the quarter there is often a bit of downtime as the leadership digests the last quarter and plans the new one. This can be a good time to take a sprint off product work, and really dig into some substantial re-tooling or refactoring.

Depending on how much resource you’re allocating to tech debt, you can mix and match these, I’d say 1 & 2 are probably mutually exclusive, but you could combine either with 3, or just do 3 instead.

I think it’s important to be clear about when you are taking on tech debt, and when you have cleared the hurdle and are paying it back, and clear rules like these help to communicate that message.
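To make the trade-off between the three options concrete, here is a rough back-of-the-envelope sketch in Python of how many engineer-days per quarter each one sets aside. The team size, sprint length, and sprints per quarter are assumptions picked for illustration, not figures from a real team:

```python
def debt_days_per_quarter(engineers: int, sprint_days: int, sprints: int) -> dict:
    """Engineer-days per quarter nominally reserved for tech debt under each scheme."""
    return {
        # Option 1: a fixed 20% of every sprint.
        "20% pledge": 0.20 * engineers * sprint_days * sprints,
        # Option 2: one Fixit Friday (one day per engineer) in every sprint.
        "fixit friday": 1 * engineers * sprints,
        # Option 3: one whole sprint per quarter.
        "fixit sprint": engineers * sprint_days,
    }


# Assumed: 5 engineers, 10-working-day sprints, 6 sprints per quarter.
budget = debt_days_per_quarter(engineers=5, sprint_days=10, sprints=6)
for scheme, days in budget.items():
    print(f"{scheme}: {days:g} engineer-days")
```

With these assumed numbers the 20% pledge reserves the most time on paper (60 engineer-days versus 30 and 50), but it is also the option most likely to slip when other work runs late, so the nominal figure overstates what actually gets done.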


I feel when you have to do this, it's already a lost cause.

The company either thinks strategically or it doesn't. If it doesn't then it will only hurt you if you do. Just deal with it, work in the way they expect (i.e. create even more tech debt for some fast results) and once it all crashes, tell them the whole thing needs a full rewrite (which you know will fail anyways). Not really much you can do here except for leaving.

If the company thinks strategically, then no need for something like a 20% schedule. You should be empowered to make these decisions yourself and you should sometimes create a lot of strategical tech debt and sometimes take a month to fix something if you know it will pay off. You are the expert - you need to understand the business for these decisions and then you communicate back (as you already mentioned), but that's about it.


> You should be empowered to make these decisions yourself

Very few tech leads or devs are empowered enough to tell upper management they aren't going to work on the tasks assigned to them and will instead work on fixing a technical debt issue. And I can relate to this, because if I asked a dev on my team to work on building out a new report a user asked for and they respond they're going to fix technical debt instead, I wouldn't be happy.

Also we already know how this ad hoc approach to technical debt works in most organizations. You get 0% to work on technical debt.


> Very few tech leads or devs are empowered enough to tell upper management they aren't going to work on the tasks assigned to them and will instead work on fixing a technical debt issue

But everyone is already doing that _to some degree_. Every time you do a mini refactoring, even if it's just some code formatting, you aren't strictly "working on the tasks assigned". Unless you define "working on the tasks assigned" a bit wider, which you should.

> And I can relate to this, because if I asked a dev on my team to work on building out a new report a user asked for and they respond they're going to fix technical debt instead, I wouldn't be happy.

But you, as the lead developer (or however you call it), should have some insight on the longer roadmap. And if you see that there will be a lot of new reports necessary in the future and the current reporting-system has a lot of tech debt, you should decide if this tech debt should be tackled and then align with your team to do it. And this is what my point is: you should not even have to tell your manager (unless he is the one to decide if the tech debt is tackled or not).

You should have all the necessary information and decision power regarding resources, priorities and technological&business strategy to make this decision. If you don't have that, your manager needs to have that _and_ needs to have the competence to make the decision about refactoring without having to rely on you or other technical people. If he can't do that (which I've seen is often the case) then this is where the problem lies: you are not empowered enough and you can't make the decision yourself, but you should be the one and not your manager.

In the same sense, you should not be the one to decide how your developer implements the new report. Maybe he will decide to pull in a new library, maybe not. Maybe not doing it would create some new tech debt or the other way around. But that should be his concern, not yours. Your job is to tell him what the priorities are and that's about it.

> Also we already know how this ad hoc approach to technical debt works in most organizations. You get 0% to work on technical debt.

If the decision makers don't care and you can't educate them about that you should be empowered, then I believe there is no point in fighting. It will just make everyone unhappy. Leave or create a lot of technical debt while moving fast until the company is stuck. Who knows, maybe that's better? The market will make the decision.


The approaches I outlined above are partly intended to be a framework for educating stakeholders, and partly a way to get alignment within the dev team on how much time to spend on tech debt fixes; at an early stage startup it is very easy to get tunnel vision on adding more features, and forget to step back and make improvements. Or for some engineers it’s hard to do the converse, and stop at the 80/20 solution.

I think by talking explicitly about tech debt you can get the non-technical stakeholders on board. (It helps I think if you were warning them about tech debt when you were taking it on, too, and not just starting to educate when it comes time to pay back.)

In my experience teams run better when everyone is somewhat aligned on how much tech debt to be taking on or paying back.

I am sure there are other ways to get alignment too; I’m interested to hear what other leads have implemented. I could believe for some team compositions the “let each engineer figure it out” approach would work well, but I think that is kind of like the Agile “self organizing team” — it’s rare that you get to work in a team where everyone is experienced enough for a free form approach to work, especially when you are growing fast.


Not sure who you mean when you say "stakeholders". But if you mean CEO / the board or other (non-technical) departments/teams, then I don't think this is a good idea.

Your CEO (or whatever stakeholder you mean) should not care about _how_ you work, they should trust you to do the right thing and enable you to do so. They should also give you feedback and correct your course, but only in terms of what you work on and with what priority. They should not care how you do it.

In the same way your lawyer does not tell you what kind of legal internal documentation they do and if they go to court X or Y to ask for something, you should not tell that to your stakeholders. You tell them only high level things that are relevant. Tech debt is not one of them, progress and estimated dates is one. And just like how the lawyer recommends you a certain path, you should recommend one to your stakeholder. E.g.: "Yes we can build this feature, but we could also build this other feature, which is almost as good and takes half the time. I suggest to do it this way".

> I think by talking explicitly about tech debt you can get the non-technical stakeholders on board. (It helps I think if you were warning them about tech debt when you were taking it on, too, and not just starting to educate when it comes time to pay back.)

Let's say you are in the position to do so and they trust you. Do you think it is productive to "warn" them? Let's be honest: even for developers it is very hard to understand the direct and indirect impacts of tech debt. And that is even though they can directly experience the effects. How difficult do you think it is for people for whom this is just an abstract term?

TLDR: I think it is easier for you to understand the business goals and make a technical decision than it is to make business people understand the technical constraints and educate them well enough to take the right decision.


Why would you not want business people to understand the technical constraints? Maybe easier in the short term to make the decision for them but longer term that just builds up barriers and an “us” and “them” culture, which becomes very unhealthy.


Oh, they should understand technical constraints on an appropriate level! I just think that technical debt should not be part of that, just as a lawyer spares you most of the constraints they experience.

I like your second sentence though! If explaining technical debt helps with the "us" culture, it should be done.

It's just that, from my experience, it usually does the opposite. And I mentioned the reasons: it is too difficult to explain and to understand, even for developers. Also, the most important aspects/impacts will almost never be measured - not talking about development speed here (which is already too difficult to measure), but motivation, attrition, broken window behavior and so on.

A better way to create an "us" culture as a developer is to make sure to be involved in product decisions from the very beginning, including contributing ideas. And being held to the exact same KPIs/OKRs as other departments. That also means, as a developer you need to have the discipline to heavily sacrifice code quality and create technical debt sometimes and you might have to push this against other developers. That is the price to pay.


"If the company thinks strategically, then no need for something like a 20% schedule. You should be empowered to make these decisions yourself... You are the expert."

I fundamentally agree with this approach and there are companies that do as well. What is the term that encapsulates this sort of culture? Is there one?

If there is one, is there a reliable / repeatable way to determine during the interview phases if the company you're interviewing with is this type of culture?


I don't know, but I'm also interested in the answer.

I would say this is one of the strengths of the culture at my company and we spend a lot of time trying to get that across to engineering candidates. I'd love to improve our messaging there if we can, though.

We talk about a culture of trust, that we hire good people and then trust them to do their jobs. We talk about avoiding hierarchy and bureaucracy where it doesn't help us. We talk about engineers driving their own work forwards and we make it clear that we're looking for people who can do that.


Sounds like the opposite of my current situation, luckily I'll be moving on soon to hopefully something more similar to your company.


You also need to get comfortable taking out tech debt as a startup. A lot of times you can get away with it at little or no cost.

I wasted lots of time prematurely paying back tech debt only to have that code later deleted or rewritten.

It is important to properly prioritize tech debt and pay back things that the business is currently paying high interest on.

Once the product matures it makes sense to be more proactive in doing things right, but in the early stages, things are changing so fast that you just need to get out a rough prototype.
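One rough way to put "high interest" into numbers is to compare what a fix costs with the time the debt is burning each week. This is only a sketch under that assumption; the `DebtItem` fields, the payback heuristic, and every figure in the backlog are invented for illustration:

```python
from dataclasses import dataclass


@dataclass
class DebtItem:
    name: str
    fix_cost_hours: float           # one-off cost to pay the debt back
    interest_hours_per_week: float  # time the team currently loses to it


def payback_weeks(item: DebtItem) -> float:
    """Weeks until a fix pays for itself; infinite if the debt costs nothing today."""
    if item.interest_hours_per_week <= 0:
        return float("inf")
    return item.fix_cost_hours / item.interest_hours_per_week


def prioritize(items: list[DebtItem]) -> list[DebtItem]:
    # Shortest payback period first: these are the high-interest debts.
    return sorted(items, key=payback_weeks)


# Hypothetical backlog; all numbers are made up.
backlog = [
    DebtItem("duplicated report helpers", fix_cost_hours=8, interest_hours_per_week=0.5),
    DebtItem("prototype code likely to be deleted", fix_cost_hours=40, interest_hours_per_week=0),
    DebtItem("flaky deploy script", fix_cost_hours=16, interest_hours_per_week=4),
]
ranked = [item.name for item in prioritize(backlog)]
```

Code that costs nothing today sorts last, which matches the point above: paying it back early is wasted effort if it gets deleted or rewritten anyway.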


Exit before the technical debt comes due.

Also, if you're growing fast enough that it matters, you'll need to redesign the whole thing anyway. Google did that on their search engine at least five times. In the early days, it took weeks to update the index.


There's another way: Move fast enough to justify rewrites. This even happens at larger companies like Facebook when they rewrote React. https://engineering.fb.com/2017/09/26/web/react-16-a-look-in...


I wonder how many startups were killed solely due to tech debt repayment. I have committed this sin too. It's not that there was too much debt; an unbalanced wish to repay it, and getting stuck on that, led to a never-finished refactoring that took enough time to kill everything.


There is no right kind of technical debt. By that I mean write code intentionally.


but the sprint ends tomorrow and I'll get a bad performance review if my tickets fall over to the next sprint!!!

/s obviously but this is more real than it should be


Be like America, thrive by incurring debt.



If you're a startup, there is no way you will move fast if you don't incur tech debt. Why is it seen as a bad thing? It should be a given.

Once you know that your company is there to stay, you hire experienced people and create a strategy to pay it back.


From my experience with lots of VC money.



