Managing Technical Debt (2012)

vinceguidry · on June 25, 2016

Technical debt's main difference from financial debt is that it can't really be quantified, because it's hard to really identify. Most software has a lifetime, initiatives don't maintain economic value indefinitely.

At the end of that lifetime, the lights get shut off and the code mothballed, never to see the light of day again. At that point, the balance sheet hits zero and you never have to pay it back.

Software where this doesn't happen eventually has the kinks filed away. Old Fortran codebases and the like have fixed maintenance costs and are worked on by savvy greybeards who are excellent at providing their own job security. Nobody wants to rock that boat, and nobody does. The greybeards are the ones who pay off the debt because it's their livelihoods that get threatened by instability. Management just keeps happily writing checks, devoting their attention to squeakier wheels.

So it's a tradeoff. Initial greenfield development by one dev or a small team eventually gives way to the bread-and-butter of keeping projects responsive to economic needs. What we call technical debt is really just architectural problems that arise both from inexperience and from not specifying the requirements of the project well enough.

Decisions made by the greenfield team add to the friction that the maintainers experience in steering the ship. The problems need to be identified and corrected in the course of the maintenance. Unfortunately, most devs are not really capable of remaining productive while fixing architectural issues, so management perceives this as prioritizing the needs of the devs over the needs of the business and tends to balk at putting resources into paying off debt.

Spooky23 · on June 26, 2016

The article uses an infrastructure example. That's a much more predictable type of technical debt.

Even then, orgs are terrible at managing risk and costs.

In my first job, I was a DBA managing a big reservation system somehow (the how is a funny story too). They had an issue where someone fubared the initial database config, leaving us with a time bomb. Critical tables were beyond the number of tablespace extents supported, and there was a 100% probability that the database would essentially freeze up to avoid data loss.

We came up with a plan that would eliminate the risk, but would require an extended maintenance window, which translates into a 4-hour SLA hit. Cost: $20k or less. The SVP said no, there's no budget for an SLA hit, wait for Christmas holiday downtime (6 months away).

Two weeks later, on the busiest business day of the year, the database froze at 9AM. It took us 48 hours to restore, as we needed to restore everything and fix some corruption. Each hour of downtime cost $7,000 in wasted labor in the business, and the SLA penalty was harsh.

It's an egregious example, but I've seen lots of similarly dumb decision making in my career.
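To put the numbers from this story side by side, here is a rough sketch in Python. It uses only the figures quoted above ($20k or less for the planned fix, 48 hours of downtime, $7,000 per hour in wasted labor). The SLA penalty is not stated, so it is left out, which means the real outage cost was higher than what this computes. The constant and function names are made up for illustration.

```python
# Figures taken from the comment above. The SLA penalty is not given,
# so it is deliberately excluded from this comparison.
PLANNED_FIX_COST = 20_000     # upper bound quoted for the maintenance window
OUTAGE_HOURS = 48             # time it took to restore after the freeze
LABOR_COST_PER_HOUR = 7_000   # wasted labor per hour of downtime


def outage_labor_cost(hours: int, cost_per_hour: int) -> int:
    """Wasted-labor cost of an unplanned outage, excluding any SLA penalty."""
    return hours * cost_per_hour


outage_cost = outage_labor_cost(OUTAGE_HOURS, LABOR_COST_PER_HOUR)
ratio = outage_cost / PLANNED_FIX_COST

print(f"Planned fix (upper bound): ${PLANNED_FIX_COST:,}")
print(f"Outage labor cost:         ${outage_cost:,}")
print(f"Outage / planned fix:      {ratio:.1f}x")
```

Even without the penalty, the labor cost alone works out to $336,000, roughly 17 times the planned fix.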

vinceguidry · on June 26, 2016

> someone fubared the initial database config

Utter incompetence. Developers seem to have this almost-magical ability to blame management for their fuck-ups. Guess I can't blame them too much, there's practically no decent software engineering training anymore. All we get is irrelevant science-y shit.

> We came up with a plan that would eliminate the risk,

You don't call it a risk if there's a 100% chance. Managers are not going to understand scenarios you put in front of them, you have to guide the conversation. You say, "this is going to happen if we don't fix this right effing now." Then you threaten to quit if they don't take you seriously. But again, professional pride is something nobody teaches developers.

Spooky23 · on June 26, 2016

It was a mess that I had inherited, this was 16 years ago... my guess is that they weren't aware of some of the platform differences as they transitioned to Sun. But yeah, total fuck up.

Keep in mind I was a newb, 4-6 weeks out of college. A dramatic walkout wouldn't have proven anything. They'd just blame me and dirty up my name. The management made a dumb call.

Today, I'd probably be able to stop a situation that ridiculous, but I'm also a more respected professional with more conference room time.

hackits · on June 26, 2016

I would say that threatening to quit because they don't take you seriously kind of reinforces their view of why they don't take you seriously in the first place.

vinceguidry · on June 26, 2016

If that's the case then they're only hiring you for grunt labor and you should quit anyway.

weinzierl · on June 25, 2016

> Most software has a lifetime, initiatives don't maintain economic value indefinitely.

Most software is rewritten many times before its initiative loses its economic value: most of the time for political reasons, sometimes because it accrued too much technical debt.

hackits · on June 26, 2016

Mostly it's rewritten because the business needs change. Software is there to serve the business, not the other way around.

49531 · on June 25, 2016

I worked on a team once where tech debt was "avoided" at all costs. It actually had a weird effect on the code base. We had a practice of writing a pull request 3 times before shipping it, kind of refactoring everything before we merged it.

We ended up with a really complicated codebase as a result. It was tricky to dive into the project because everything seemed so layered with complexity.

Nowadays I try to take the most naïve approach to a bug or feature; that way, if something breaks, it's straightforward to fix. I only abstract when a process has slowed me down a few times.

This is all done with your normal tech debt fighting tools though, like linting, style guides, CI and unit tests.

PaulHoule · on June 25, 2016

This is part of the problem with quantifying "technical debt": what some people think is "good code" is overengineering.

njharman · on June 26, 2016

Your team failed to avoid technical debt. Needless complexity and overengineering are debts.

ktRolster · on June 26, 2016

> It was tricky to dive into the project because everything seemed so layered with complexity.

That's technical debt. Good code makes things easier to add; work gets done faster, not slower.

ScottBurson · on June 26, 2016

"Perfection is finally attained not when there is no longer anything to add, but when there is no longer anything to take away." -- Antoine de Saint-Exupéry

SatvikBeri · on June 25, 2016

The problem with the "debt" terminology is that debt is actually fairly predictable and easy to manage. But bad code tends to have very volatile effects: often the cost of working with it is 0, and sometimes it's enormous. "Unexploded Land Mines" might be a better analogy.
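A toy calculation makes the "predictable debt" versus "land mine" contrast concrete. This is only a sketch with made-up numbers: a fixed payment of 10 units a month for 12 months, against a 5% monthly chance of a 200-unit blowup. Both are chosen so the expected total is the same, and each month is assumed to be independent.

```python
import math

MONTHS = 12

# Hypothetical "debt": a fixed, known payment every month.
FIXED_PAYMENT = 10.0

# Hypothetical "land mine": usually free, occasionally very expensive.
MINE_PROBABILITY = 0.05
MINE_COST = 200.0


def fixed_total(months: int, payment: float) -> tuple[float, float]:
    """Mean and standard deviation of the total cost of fixed payments."""
    return months * payment, 0.0


def land_mine_total(months: int, p: float, cost: float) -> tuple[float, float]:
    """Mean and standard deviation of the total cost when each month
    independently costs `cost` with probability `p` and 0 otherwise."""
    mean = months * p * cost
    variance = months * p * (1 - p) * cost ** 2
    return mean, math.sqrt(variance)


debt_mean, debt_std = fixed_total(MONTHS, FIXED_PAYMENT)
mine_mean, mine_std = land_mine_total(MONTHS, MINE_PROBABILITY, MINE_COST)

print(f"debt:      mean={debt_mean:.1f}  std={debt_std:.1f}")
print(f"land mine: mean={mine_mean:.1f}  std={mine_std:.1f}")
```

With these assumed numbers both cases have the same expected yearly cost, but the land mine's standard deviation is larger than its mean, which is what makes it so hard to budget for.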

undergrowth54 · on June 26, 2016

"Unhedged call option" http://higherorderlogic.com/2010/07/bad-code-isnt-technical-...

mathattack · on June 25, 2016

Debt is predictable, but perhaps not the ability to pay. My company may know how much it has to pay each month for the next 20 years, but it won't know the financial conditions then.

I find the hard quantification part is explaining it to non-technical executives. Saying "You can't have feature X because of long term investment Y" is very hard. The one thing I've seen work is telling engineering "You can use 25% of your time and resources to pay this back."

jldugger · on June 26, 2016

The key thing about technical debt is quantifying its interest rate. If you tell an executive "I have a method for the business to take out a huge loan that expedites time to market and has no observable interest rate", of course they'd take it.

Declaring technical debt bankruptcy and requesting a do-over is rarely persuasive, particularly for a profitable service or product. If proponents of paying down technical debt want to win these corner office debates, they need to start developing quantitative models for predicting the size of the debt and the interest rates. And that may well be impossible, because it's not clear-cut that the next design will actually be better. I've seen simple user account management shell scripts replaced with piles of error-prone Java, requiring a large set of new DAOs for every new integration.
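For what it's worth, the crudest possible version of such a model might look like the sketch below. Everything in it is an assumption: the `DebtItem` class, the item names, and the hour estimates are invented for illustration, and the estimates themselves are exactly the part that may be impossible to get right. It treats the effort to fix something now as the principal and the extra effort it causes each sprint as the interest.

```python
import math
from dataclasses import dataclass


@dataclass
class DebtItem:
    """One hypothetical piece of technical debt, measured in engineer-hours."""
    name: str
    principal_hours: float            # estimated effort to fix it now
    interest_hours_per_sprint: float  # estimated extra effort it causes each sprint

    def interest_rate(self) -> float:
        """Interest per sprint as a fraction of the principal."""
        return self.interest_hours_per_sprint / self.principal_hours

    def break_even_sprints(self) -> float:
        """Sprints until the accumulated interest equals the cost of the fix."""
        if self.interest_hours_per_sprint == 0:
            return math.inf  # never pays for itself
        return self.principal_hours / self.interest_hours_per_sprint


# Made-up backlog with guessed estimates.
backlog = [
    DebtItem("legacy account scripts", principal_hours=40, interest_hours_per_sprint=2),
    DebtItem("flaky deploy pipeline", principal_hours=16, interest_hours_per_sprint=4),
]

# Highest interest rate first: these are the loans worth repaying soonest.
ranked = sorted(backlog, key=lambda item: item.interest_rate(), reverse=True)

for item in ranked:
    print(f"{item.name}: rate={item.interest_rate():.0%} per sprint, "
          f"break-even after {item.break_even_sprints():.0f} sprints")
```

A model like this ignores the risk that the replacement turns out worse than what it replaced, so it is a conversation starter for the corner office rather than a forecast.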

coredog64 · on June 26, 2016

There's a good chance that the do-over will be subject to "second-system syndrome": All the unicorns and rainbows that were put into the backlog suddenly become part of your MVP.

0xmohit · on June 25, 2016

Several points are fairly obvious but seldom practiced in the real world:

  Technical debt can be viewed in many ways and can be caused by all levels of an organization.

  Of particular importance is helping nontechnical parties understand the costs that can arise from mismanaging that debt.

On a lighter note, as @sadserver says [0]:

  you cannot pay off technical debt with tears

[0] https://twitter.com/sadserver/status/723504636615778305

amelius · on June 25, 2016

My biggest problem with technical debt is that sometimes very simple things get really hard to do. And as a consequence it is really hard to explain to my boss (who can do a little programming himself) that it is so difficult. Ugh.

davidgerard · on June 27, 2016

One of my favourite technical debt essays is the one by Denise from Dreamwidth about how they cleaned up the LiveJournal codebase - suitable for non-technical readers: https://denise.dreamwidth.org/57248.html

ktRolster · on June 25, 2016

Technical debt is a code smell.

Refactoring happens sometimes, but if you are accruing technical debt every cycle, there's a problem and you should spend your retrospective trying to figure out how to fix it.

rhizome · on July 1, 2016

So, write your best code first or else you're below standard?