Fundamentally, poor code leads to destruction of business value. The next genera...

onion2k · on Oct 20, 2021

> looking back on it, that was a good candidate for a total rewrite

It's virtually impossible to rewrite a large scale application where the only documentation is the code, especially if the code is bad enough to warrant a rewrite. You will break things, lose features, potentially corrupt data. Total rewrites of unspecified, untestable, poorly understood systems should be the last resort.

That said, the system was still running with customers using it 2 and a half years later, which is a very strong signal that it didn't need a rewrite after all. It sounds like you got it right - refactor as you go is usually the best way out of a bad codebase.

adam_arthur · on Oct 20, 2021

Yeah, definitely a good rule of thumb :)

But in this case, looking back on it, a total rewrite would have saved time and improved velocity. It was impossible to really know at the time, though.

Fortunately this project was more presentational than business logic heavy, so missing functionality would have been easy to notice when testing (all visual).

I wouldn't agree that running is a sign of success necessarily. There are a lot of running legacy systems that get beaten out and replaced by competitors due to low development velocity.

Of course a lot of that comes down to the culture of the org and so on. But in a lot of cases architectural flaws are at the root of it.

The real question with a rewrite is a cost-benefit one. And it's based on estimates so there should be a lot of room for error.

quickthrower2 · on Oct 20, 2021

A way around this is a side-by-side test of a lot of the functions. Maybe generate tests from system 1 outputs and use them to test the greenfield system. Where they differ, you then decide where the bug is.
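A rough sketch of that idea, sometimes called characterization or "golden master" testing. Everything here is hypothetical: `legacy_price` stands in for system 1, `rewritten_price` for the greenfield system (with a deliberate boundary bug), and the inputs are made up for illustration.

```python
import json
from typing import Any, Callable, Dict, Iterable, List, Tuple


def legacy_price(quantity: int, unit_cents: int) -> int:
    """Hypothetical 'system 1' behaviour: 10% bulk discount from 10 items up."""
    total = quantity * unit_cents
    if quantity >= 10:
        total -= total // 10
    return total


def rewritten_price(quantity: int, unit_cents: int) -> int:
    """Hypothetical greenfield version with a deliberate boundary bug (> instead of >=)."""
    total = quantity * unit_cents
    if quantity > 10:
        total -= total // 10
    return total


def record_golden(func: Callable[..., Any],
                  inputs: Iterable[Tuple[Any, ...]]) -> List[Dict[str, Any]]:
    """Run the old system on each input and capture its output as the expected value."""
    return [{"args": list(args), "expected": func(*args)} for args in inputs]


def replay(func: Callable[..., Any],
           golden: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Run the new system on the recorded inputs and collect every difference."""
    mismatches = []
    for case in golden:
        actual = func(*case["args"])
        if actual != case["expected"]:
            mismatches.append({**case, "actual": actual})
    return mismatches


inputs = [(1, 250), (9, 100), (10, 100), (11, 100)]

# Round-trip through JSON to mimic storing the recorded cases in a file.
golden = json.loads(json.dumps(record_golden(legacy_price, inputs)))

mismatches = replay(rewritten_price, golden)
for mismatch in mismatches:
    print(mismatch)
```

Each mismatch is only a difference, not a verdict: a human still has to decide whether the old or the new system has the bug.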

adam_arthur · on Oct 20, 2021

Yeah this would work very well for certain types of apps.

An API would be easy to validate with this approach, for example.

honkycat · on Oct 20, 2021

Seconding this: Poor code quality can eat a business alive.

We cannot change ANYTHING, ANYWHERE, with our current stack, due to years of product micro-managing engineering and never having room to re-build or improve things. Also just bad practice from inexperienced engineers with 0 leadership.

We tried to do the "microservice" thing but we never ACTUALLY got the bandwidth in our sprints to do all the work that is needed, and product never backed us up to deprecate old APIs that were too tightly coupled with a ton of other functionality to adequately refactor. (Also, the code quality was so poor that it is hard to reverse-engineer the API.) So we have the new APIs sitting dark.

Here is another fun knock-on effect: Everyone who joins quits as soon as possible, because their job is nothing but fucking around with shitty code written by someone half-way to Tahiti by now.

Heck: I have had to convince myself to stay several times now[0]. I ask myself: Is what I am doing worth my time? Am I growing? What am I going to tell my next employer? Will saying: "I spent a year nursing a dying monolith built on a mountain of bad practices" get me my next job? I hope I can help heal this place and get it to a better position, but better people than I have failed.

0: I had a run of crappy gigs and I don't want to switch since this is tolerable so far.

ToJans · on Oct 20, 2021

I think you did a great job and are selling yourself short. Your story implies that you were unable to architect for the requirements upfront, because they probably were too volatile.

You provided business value ASAP, and, if tech debt was ignored in the beginning, it's probably because there were good reasons to do so.

Your job is to architect a good enough system based on the requirements you have and the assumptions that you make. It might help to document these using something like an ADR (architecture decision record), but this depends on your context. I found out about a decade ago that documenting these assumptions and estimating impact/risk and potential mitigations might help to motivate your choices.

In hindsight it's always easy to tell why an existing architecture was the wrong choice, but this is a fallacy, as architecture is based on a snapshot, and you have to find a delicate balance between changing architecture and providing business value all the time.

I usually advise against big rewrites. In my own code I try to aim for "making parts easily disposable or replaceable"; if the mess is abstracted away, it doesn't seem to matter in 90% of the cases. In the 10% where it does matter, you'll end up refactoring anyway because it's probably your core domain and your insights and learning are evolving faster there, and you will inflict a lot of pain on yourself if you don't respect proper development practices in those parts.
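A minimal sketch of what "disposable or replaceable" could mean in practice, assuming the mess can be hidden behind one narrow interface. All names here (`InvoiceNumberer`, `LegacyNumberer`, `CleanNumberer`, `issue_invoices`) are invented for illustration.

```python
from typing import List, Protocol


class InvoiceNumberer(Protocol):
    """The narrow seam: callers depend on this, not on any implementation."""

    def next_number(self, year: int) -> str: ...


class LegacyNumberer:
    """Hypothetical messy implementation, left untouched behind the seam."""

    def __init__(self) -> None:
        self._counter = 0

    def next_number(self, year: int) -> str:
        self._counter += 1
        # String slicing tricks instead of format specifiers: ugly but working.
        return "INV" + str(year)[2:] + "-" + ("0000" + str(self._counter))[-4:]


class CleanNumberer:
    """Hypothetical replacement that can be swapped in without touching callers."""

    def __init__(self) -> None:
        self._counter = 0

    def next_number(self, year: int) -> str:
        self._counter += 1
        return f"INV{year % 100:02d}-{self._counter:04d}"


def issue_invoices(numberer: InvoiceNumberer, year: int, count: int) -> List[str]:
    """Calling code only knows the interface, so the mess stays contained."""
    return [numberer.next_number(year) for _ in range(count)]


old_numbers = issue_invoices(LegacyNumberer(), 2021, 2)
new_numbers = issue_invoices(CleanNumberer(), 2021, 2)
print(old_numbers, new_numbers)
```

As long as both implementations agree on the interface, replacing the ugly one is a local change rather than a rewrite.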

TeMPOraL · on Oct 20, 2021

> That's why I always cringe a bit when people advocate putting out something quick and dirty for business reasons. Usually the damage you cause in technical debt is difficult to reverse, and you end up wasting a lot more time in the long run.

> Of course, this moreso applies to SaaS where your product lives forever.

The thing is, though, SaaS products don't live forever. In fact, they have a ridiculously short half-life :). Sometimes it's the effect of the problems of code quality (in which case the company dies prematurely, or the product gets rewritten in a new tech stack). But often enough, it's the cause.

Plenty of startups are eyeing an exit from day one. They fly their "rocket ship" the Kerbal way: building the vehicle out of old oil tanks welded together and filled with rocket fuel, and having a bunch of green men on the outside desperately plugging every leak as the rocket tries to break out of the atmosphere, praying it'll get to the Moon before it explodes. There's a top-down pressure to move fast, instead of worrying about code quality, because there's only one goal: exit or bust.

orev · on Oct 22, 2021

It sounds like that “bad” code was good enough to provide enough business value to keep the company going and you employed for at least 2 years, so I’m not sure your assessment is accurate. Maybe you mean “destruction of value” in terms of opportunity cost, but that’s much more abstract and very hard to truly measure.

Businesses run on cash flow like blood. Once the cash stops, it dies right then and there. It doesn’t matter if a better design will only take a few more months if you can’t make it to tomorrow.

quickthrower2 · on Oct 20, 2021

It feels like for that to work you need to be managed by people that appreciate this all the way to the top, to the point where this affects hiring, onboarding, etc. You also need a compatible business model.

Quality code, where you can get away with it not being quality, is a fragile thing (in the fragile/antifragile sense).

I wonder if the hill to die on is getting a good architecture that minimises the impact of bad code.

adam_arthur · on Oct 20, 2021

Yeah, the strength of a business is tightly coupled to the principles and culture underlying that business.

I've found many speak to caring about code quality, but in the startup scene it seems quite rare for people to actually live it.

People obviously talk about linting and syntax, but appropriate separation of concerns, abstractions, coupling and so on are way more impactful. Unfortunately those things are also difficult to quantify without knowing the context. It's hard to train people on these things; it takes a lot of thinking and experience. But common threads around best practices and principles start to appear.

I've mentored/managed many people and always enforced quality as a cultural element of our teams, and you can see that those kinds of principles become "viral" in a sense. They get adopted by those you work with, and they impress those principles upon others down the line.

To your last point, I do think proper separation of concerns is probably the most important principle, such that you can easily fix any mistakes that are made.

Microservices are a good vessel for this in theory, but it's possible to build a distributed monolith in microservice form, e.g. by exposing low-level APIs and having a high level of chattiness. So yeah, those boundaries need to be well defined haha.
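To make the chattiness point concrete, here is a toy sketch. `OrderService` is a made-up in-process stand-in for a remote service, and each method call is assumed to cost one network round trip; the data is invented.

```python
from typing import Dict, List


class OrderService:
    """Hypothetical service; every public method call counts as one round trip."""

    def __init__(self) -> None:
        self.calls = 0
        self._orders: Dict[int, List[int]] = {1: [101, 102, 103]}
        self._prices: Dict[int, int] = {101: 500, 102: 250, 103: 125}

    # Low-level API: leaks the internal data model to every caller.
    def get_item_ids(self, order_id: int) -> List[int]:
        self.calls += 1
        return list(self._orders[order_id])

    def get_item_price(self, item_id: int) -> int:
        self.calls += 1
        return self._prices[item_id]

    # Coarse-grained API: one call that answers the caller's actual question.
    def get_order_total(self, order_id: int) -> int:
        self.calls += 1
        return sum(self._prices[item_id] for item_id in self._orders[order_id])


def chatty_total(service: OrderService, order_id: int) -> int:
    """Caller reassembles the answer itself: 1 + N round trips."""
    return sum(service.get_item_price(i) for i in service.get_item_ids(order_id))


chatty_service = OrderService()
chatty_result = chatty_total(chatty_service, 1)

coarse_service = OrderService()
coarse_result = coarse_service.get_order_total(1)

print(chatty_result, chatty_service.calls)
print(coarse_result, coarse_service.calls)
```

Both paths give the same total, but the chatty one makes a call per item and forces the caller to know how orders are stored, which is the coupling a well-defined boundary is supposed to prevent.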