Red Flags Signaling That a Rebuild Will Fail

nostrademons · on July 11, 2018

#5 has a converse - oftentimes, the only way to get a rebuild to succeed is to drop features, and it's a major red flag if management insists on 100% feature parity.

The way to distinguish this from the #5 situation in the article is to ask if you're dropping features because they're hard or because nobody uses them. The former is a red flag; the latter is a green flag. Before you embark on a rebuild, you should have solid data (ideally backed up by logs) about which features your users are using, which ones they care about, which ones are "nice to haves", which ones were very necessary to get to the stage you're at now but have lost their importance in the current business environment, and which ones were outright mistakes. And you should be able to identify at least half a dozen features in the last 3 categories that you can commit to cutting. Otherwise it's likely that the rewrite will contain all the complexity of the original system, but without the institutional knowledge built up on how to manage that complexity.
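
As a rough illustration of the "solid data backed up by logs" idea, here is a minimal Python sketch. The log format, the feature names, and the `max_users` threshold are all made up for the example; real logs will need their own parsing. It counts events and distinct users per feature and lists low-usage features as candidates for discussion, not as an automatic cut list.

```python
from collections import Counter

# Hypothetical log format: "<date> <user_id> <feature_name>"
SAMPLE_LOG = """\
2018-07-01 u1 export_csv
2018-07-01 u2 dashboard
2018-07-02 u1 dashboard
2018-07-03 u3 dashboard
2018-07-03 u2 legacy_fax_report
"""

def usage_by_feature(log_text):
    """Return {feature: (event_count, distinct_user_count)}."""
    events = Counter()
    users = {}
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip malformed lines
        _date, user_id, feature = parts
        events[feature] += 1
        users.setdefault(feature, set()).add(user_id)
    return {f: (events[f], len(users[f])) for f in events}

def cut_candidates(usage, max_users=1):
    """Features used by at most max_users distinct users, sorted by name."""
    return sorted(f for f, (_n, u) in usage.items() if u <= max_users)

usage = usage_by_feature(SAMPLE_LOG)
candidates = cut_candidates(usage)
print(usage)
print(candidates)
```

The distinct-user count matters as much as the raw event count: a feature hit often by one script looks very different from one touched occasionally by everybody.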

lordofmoria · on July 11, 2018

> Before you embark on a rebuild, you should have solid data (ideally backed up by logs) about which features your users are using, which ones they care about, which ones are "nice to haves", which ones were very necessary to get to the stage you're at now but have lost their importance in the current business environment, and which ones were outright mistakes.

This is so important. I've been on many a project where, 3 months in, we wish we had historical tracking data on user activity to back up our instincts to cut a particular feature that seems worthless. The worst part? Even if you add it immediately, you'll have to wait 2-4 weeks to get a sufficient amount of data.

manicdee · on July 12, 2018

Also important to realise that a feature that is rarely used (view history, remove user) might be more important than one used more often (dashboard widget that nobody pays attention to)

Cthulhu_ · on July 12, 2018

Yup; statistics are only part of the picture and value of a story. Compliance is another one, for example; sure, few people will use the 'download all my data' and 'delete my account' options, but they're mandatory for GDPR compliance and not offering them may cause a huge fine. There are a lot of these compliance features.

kornish · on July 12, 2018

> The worst part? Even if you add it immediately, you'll have to wait 2-4 weeks to get a sufficient amount of data.

I think this was the problem a product like Heap [1] was designed to solve: just track all user actions, forever, and then assign pipelines after the fact based on what you want to check up on.

Don't work at Heap or anything, just love the team and product.

[1]: https://heapanalytics.com/

civilitty · on July 12, 2018

Any solutions (technical or procedural) that are capable of maintaining user privacy?

I don't think "just track all user actions, forever" is going to be a legally defensible solution for much longer, even in the US.

kornish · on July 12, 2018

Tracking events without user IDs would still allow for aggregate feature usage tracking.

Out of interest, what makes you think that an application won't legally be able to record the ways in which a user interacts with that application?

Obviously I'm not speaking for Heap; just curious.
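
A minimal sketch of what "tracking events without user IDs" could look like, in Python. The class and method names are hypothetical, and whether a given setup counts as anonymous in the legal sense is a separate question this sketch does not answer. The point is only that the store never receives a user identifier, so per-feature totals are all it can report.

```python
from collections import Counter

class AggregateUsage:
    """Counts feature events per day; never stores a user identifier."""

    def __init__(self):
        self.counts = Counter()

    def record(self, day, feature):
        # Only the day and the feature name are kept.
        self.counts[(day, feature)] += 1

    def total(self, feature):
        return sum(n for (_day, f), n in self.counts.items() if f == feature)

tracker = AggregateUsage()
tracker.record("2018-07-12", "view_history")
tracker.record("2018-07-12", "dashboard")
tracker.record("2018-07-13", "dashboard")
print(tracker.total("dashboard"))
```

The trade-off is that questions like "how many distinct users touched this feature" cannot be answered from these counts.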

CalRobert · on July 12, 2018

We need case law to settle the matter but in general, the GDPR indicates that if you don't need to collect the data in order to perform the requested activity, you need explicit consent for collecting it, and will be held to a high standard in court if this ever comes into question.

chriswarbo · on July 12, 2018

Yes, but like the "cookie law" before it, it's absolutely fine to go ahead and do it if it's required (in the case of something like logging aggregate usage counts of APIs, that's easy to justify as a requirement for maintaining a reliable service; it's basic server monitoring).

Things like online stores using cookies to track a user's shopping cart across requests are completely fine, yet it seems like legal departments decided to be overly cautious and treat all cookies as potentially infringing. GDPR may be triggering similar reactions.

I wouldn't have a problem with that if marketing departments became equally cautious, but they seem to just slap on a banner and carry on as before :(

methyl · on July 12, 2018

> if you don't need to collect the data in order to perform the requested activity

It's about data that can identify a user, not any data. A collection of actions with anonymized user IDs will not allow identifying the user (in most cases), so it's fine to keep it.

kornish · on July 12, 2018

Very good to know.

Correct me if I'm wrong - seems like anonymizing the usage data complies with the GDPR, and thus the grandparent post still stands.

noir_lord · on July 12, 2018

As long as you anonymise in a way that you can't de-anonymise it should be OK.

wierd0 · on July 12, 2018

>>it seems/should

GDPR, I'm hoping that I don't have to bother my users with a "do you consent to" popup when the only thing I want to do is to log server-side the API calls so that I can see patterns in usage and such. If I were to show such a "do you consent to" popup users might mistakenly think I'm one of those techcrunchers with hundreds of data partners that all get to see your PII. I do not want to affiliate myself with those type of actors.

Anonymously of course. Should be fine, yeah?

icebraining · on July 12, 2018

Recital 26:

"The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable. This Regulation does not therefore concern the processing of such anonymous information, including for statistical or research purposes."

As long as it's not linked to a particular profile ("pseudonymous" doesn't count, it could still be linked), it's fine.

https://gdpr-info.eu/recitals/no-26/

throwaway2048 · on July 11, 2018

One thing to be careful not to fall afoul of when you choose to remove features is assuming there is some kind of meaningful average user.

A good example is MS Office: there are a huge number of features that only 5% of users might ever use, but the majority of users are likely to use quite a few of these niches individually, and if you remove all the low-use features, you piss off basically everyone.

I think the mistaken idea of an average user is why a lot of metrics driven software seems to get more and more useless with every update.

(I can't see the present/away status of contacts in the newest Skype, really guys?)
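
The arithmetic behind the "no average user" point can be sketched in a few lines of Python. This assumes, purely for illustration, that each niche feature is used by 5% of users and that usage of one feature is independent of the others; real usage is correlated, so treat the numbers as a rough intuition only.

```python
def share_using_any(niche_features, usage_rate):
    """Share of users who use at least one niche feature, assuming independence."""
    return 1 - (1 - usage_rate) ** niche_features

for n in (1, 10, 20, 50):
    print(n, round(share_using_any(n, 0.05), 2))
```

Under those assumptions, 20 such features already touch roughly 64% of users, so cutting "everything under 5%" affects most of the user base.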

perlgeek · on July 12, 2018

> And you should be able to identify at least half a dozen features in the last 3 categories that you can commit to cutting.

Ideally, you disable them in the old software, and observe how many people complain.

Too often, product management commits to cutting a feature, and then caves in when paying customers complain. It's best to know in advance which category a feature really falls in.

wink · on July 12, 2018

Ideally, disabling features in the old software is not so complicated that a rewrite suddenly sounds even more enticing. /s

munk-a · on July 11, 2018

I think it's important to separate feature improvements from a technical rewrite, ideally in the rewrite you mostly just make things work the way they did, sometimes you might fold a feature improvement into it but if you come out of the rewrite with a more stable product that has about the same usage stories you should consider it a success.

Sometimes you will want to fold features into a rewrite (remove prompting the user to confirm X twice) sometimes this will ease development and be worth it but other times it'll pay off to just retain the old functionality but add it to a list to be user tested later.

Once the tech is solidly over then take a swing at updating the poor UI, do it in an agile way so you can back out of changes that the user base rejects since (at least within my more modest usage studies) not everything people depend on comes up or gets reported. I'd much rather roll back a design feature branch than have users get change fatigue when you're forced to roll back your new shiny rebuild and the whole project ends up being shelved.

mehh · on July 12, 2018

If only my current project had done this we would have saved millions!

maxxxxx · on July 11, 2018

I have seen that before. You kill yourself refactoring a feature only to find out it's never or barely used. Deleted code and features are the best.

hinkley · on July 12, 2018

You hamstring the product to make a feature work one way then find out that what they really wanted would have been easier to implement but they never asked because they thought that would be harder.

gregmac · on July 13, 2018

Almost all feature requests are asking to implement a particular solution rather than asking to come up with a solution to solve a particular problem.

The way I try to solve this is to ask "why?" as many times as it takes to get to a fundamental business problem. Then it becomes easier to have a user story (as opposed to a specific feature request) and come up with other solutions that can be measured against the story. It also helps to keep the product focused, as it's easier to tell when a story is not for your target market vs a feature request -- and then you can make a conscious decision to either stay away or deliberately expand to that market.

watwut · on July 12, 2018

That is why social skills and good analysts are important.

kwhitefoot · on July 12, 2018

I've been fighting this problem for my whole software development career (more than 35 years now and no end in sight).

hinkley · on July 12, 2018

When this happens a couple times you start sounding like Honey from the Incredibles.

It’s difficult not to sound combative when they say they want a convertible but you have to wheedle out of them that they want to take a proverbial road trip through monsoon season. No, you get a Land Rover with a snorkel or you wait, pal.

So bossy and difficult. Why won’t you just give us what we asked for? These meetings would go so much faster.

philliphaydon · on July 12, 2018

I once worked on a feature that apparently lots of clients were asking for. It took 4 weeks to implement. Went to production. Never heard anything of it. 2 years later we were asked if we could modify the feature to work for another use case. I looked at the database. The feature had never... ever... been used.... rows returned = 0

maxxxxx · on July 12, 2018

That's why it's important not to believe what customers and product managers say about which features they want. I have had a ton of occasions where it turned out that what they really wanted was totally different from what the devs were told.

philliphaydon · on July 13, 2018

Yeah, I always ask our user stories to have a 'background' section explaining the problem and reason for the feature request so it can help us understand the importance and purpose of the feature.

brazzledazzle · on July 12, 2018

There have been a couple times where I’ve tried to use a feature that should have been awesome but was terrible then it got pulled in a newer version of the product. It was incredibly frustrating to wait for a fix that never came. Data on what’s used is good but you need to get feedback about what sucks to go along with it.

Cthulhu_ · on July 12, 2018

Feature parity is the reason why some of the projects I've worked on caused #2 - can't get customers to switch if there's no parity yet. The MVP for some of those projects took a year to get to. Mind you it'd probably have been 6 months if they didn't opt to go for a microservices architecture.

micheljansen · on July 12, 2018

A bigger red flag, in my experience, is an unwillingness to even consider dropping any features. Often combined with a desire to add new features during the rebuild. Always goes wrong.

eecc · on July 12, 2018

100% feature parity sounds like the advice in #4, involve Marge, without actually having a Marge to call. That’s supernatural development;)

maxxxxx · on July 11, 2018

"Red Flag #4: You aren’t working with people who were experts in the old system."

I think this is most important. A lot of people want to rewrite because they don't understand the current system and don't want to bother learning. Before you rewrite you really should understand the current state deeply.

majormajor · on July 12, 2018

The way I've phrased something similar before is "don't do a full rewrite if you couldn't write up a plan for refactoring in place to fix the problems with the old system."

If you can build that plan, and make the case that it will be easier to do the full rewrite, go for it. But if you couldn't put together the fix-in-place plan, you might not understand everything the old system does well enough to actually estimate the size of a rewrite...

(This isn't solely for full-parity rewrites: if you're dropping features, what does that look like dropping from the old system?)

gwbas1c · on July 12, 2018

I was involved in a rewrite where it would have been much easier to refactor the old system.

A year into the process one of the c-level leaders pulled me into a room and asked why I couldn't fix the legacy code, and I basically told him that he should have pushed back on it. I couldn't fix the legacy code because that would be months of refactoring that should have been done instead of the rewrite.

Context: the legacy code had some design flaws that required major refactoring, but the legacy code "worked" except for very large deployments. The only problem was that the legacy system wasn't modular, so it didn't have unit tests and wasn't cross platform. All of those problems are easier to tackle via refactoring instead of a full rewrite.

dataflow · on July 12, 2018

> The way I've phrased something similar before is "don't do a full rewrite if you couldn't write up a plan for refactoring in place to fix the problems with the old system."

Hmm... there have been a number of times when I've banged my head against the wall trying to figure out how to make my own code do something, until I finally bit the bullet and decided to rewrite the entire chunk from scratch and suddenly it took a fraction of the time I had spent trying to fix it to get it written and working. Not sure how to reconcile this with the advice you gave.

miceeatnicerice · on July 12, 2018

I agree, rewriting with a clear head works wonders - but, to be fair to the op, when you rewrite your own code you'll be very appreciative of all the challenges and possibilities.

It's a very different kettle of fish to rewrite from scratch strange code you've not properly explored and given a chance to - which is the usual situation.

dataflow · on July 12, 2018

Great point, I didn't realize that aspect!

mannykannot · on July 12, 2018

Knowing what the system should do, in sufficient detail that there is nothing of significance to be discovered with regard to its requirements, while simultaneously not actually knowing enough about how it works to the point where you could plan how to refactor it, is quite a corner case in the field of legacy systems (the latter is quite commonplace, but the former is almost unheard of.)

kwhitefoot · on July 12, 2018

Rewriting a chunk is much easier than rewriting the whole application.

In fact rewriting a chunk sounds rather like refactoring.

zelos · on July 12, 2018

Rewriting "a chunk" of code kind of is refactoring, though.

fgonzag · on July 12, 2018

But rewriting one chunk at a time could be considered refactoring.

dataflow · on July 12, 2018

Yes, two other people have said the same thing already. I don't personally agree with it but I don't have a response beyond that.

ebikelaw · on July 12, 2018

Yep, seen that. I worked on a system where the company did not really want a reimplementation but they destaffed a project in one site and reconstituted it with all new people at another site. The new people decided to rewrite from scratch. A year and a half later I start getting questions by email from the new people, questions indicating that not only do they not understand the implementation of the legacy system, they also do not clearly understand the business requirements that resulted in that implementation. Meanwhile, the maintenance of the old system had been neglected to such an extent it had fallen behind critical company-wide mandates. This was more of a lesson about why you shouldn’t destaff a project over some petty geographical squabbles, but also quite clearly about why you should always incrementally reimplement software rather than rewriting it.

antsar · on July 12, 2018

Also known as Chesterton's Fence.

https://en.wikipedia.org/wiki/G._K._Chesterton#Chesterton's_...

_asummers · on July 11, 2018

Even having the entirety of the original dev team there, time takes its toll on recollection of reasoning behind some of the strange decisions made in something that would warrant a rewrite. Much preferable to not having them, of course.

DiabloD3 · on July 12, 2018

Something I do is if the code looks weird or is rather small for how much work went into it, I leave a comment that says why this was done... just so I can remind myself in 6 months when I go "who the fuck wrote this garbage... oh, me."

mcguire · on July 12, 2018

On the other hand, experts will frequently demand that the new system do the same thing as the old system, in the same way.

You can't blindly listen to the experts.

munk-a · on July 11, 2018

#4 is sort of terribly worded, the summary line is something that is important and pretty independent, make sure you're working with expert users of the system... then the explanation brings in a Senior Dev as a good resource to tap. This is the wrong direction, you really want to consult with the system experts to see their rationale for requesting what might seem like odd functionality in the first place.

#4 also mixes a good deal with #5 in that any changes you make (even purely good ones in your view) will require retraining of users and cause a kerfuffle when rolled out to your user base, people _hate_ change.

arendtio · on July 12, 2018

I can't state it better. If you don't understand the decisions made during the development of the old system you are unlikely to come up with something much better.

pbreit · on July 11, 2018

This strikes me as dangerous. Didn't the experts build the first system? Don't you want to deliver a fresher system? Won't the experts be attached to the old way of doing things?

zbentley · on July 12, 2018

> Didn't the experts build the first system? Don't you want to deliver a fresher system? Won't the experts be attached to the old way of doing things?

With all respect, that means you should not be in a position to rewrite legacy code, or to commit others to such a rewrite.

If all the experts you have worked with have been, in your eyes, overly attached to the old way of doing things, you have one of two issues:

- You have not had enough experience in the field, and have not worked with experts that actually have perspective about when/how to rewrite, abandon, or rework their code.

- You have dogmatically condemned people who think that the latest-and-greatest tech may not be a good solution to the problems at hand to the "old fogey" bin.

Either issue means you're not ready to make decisions at this level. Learn more. Research more. Watch more. Listen more.

Weirdly, gaining this perspective has less to do (in my experience) with years on the job, and more with diversity of team/business environments worked in.

maxxxxx · on July 11, 2018

Keep in mind that you will be one of these people in a few years for whatever you are doing now. The previous people most likely weren't dummies but had to deal with the technology and constraints at the time they built the system in the same way you are doing it now.

grantism · on July 11, 2018

Not necessarily. It depends on what has led to the need for a rebuild. Sometimes there weren't previously the resources to "do things properly", sometimes a feature might only have been added for a specific client, etc.

You need that previous knowledge to know the "why" of things & if that why is still valid.

IMHO it's more dangerous if you're working with experts who don't want to improve the system.

pjc50 · on July 12, 2018

Ah, "the public have had enough of experts", the attitude responsible for most of our present political disasters.

tomelders · on July 12, 2018

I’ve carved a career out of rebuilds. I’m working on a rebuild right now. There’s a ton of companies out there who’ve done very well with their home grown antiquated systems from the late 90’s and early 00’s that are now facing stiff competition from young upstarts who had feature parity from day one and are knocking out new features at breakneck pace because they’re leveraging the latest and greatest in tools, technology, and thinking.

I’ve always been a big believer in rebuilding your product from the ground up. I think it’s something you should always have going on in the background. Just a couple of devs whose job it is to try and rebuild your thing from scratch. Maybe you’ll never use the new version. But I think it’s a great way to better understand your product and make sure there’s no dark corners that no one dare touch because they don’t understand what it does, how it does it, or why it does it the way it does.

And I’ve always believed that if you don’t want to rebuild your app from scratch, then don’t worry, a competitor will do it for you.

So I agree with every point raised in this article. And I think it does a great job of articulating the issues that often go unspoken. But I’d like to add one more. And for me, this is the biggest issue for any company wanting to rebuild its product.

If your sales team has more clout than your designers and developers, then you’re fucked. And in the enterprise software world, this is the norm. An unchecked sales team that gets whatever it wants has already killed your product and made it impossible to rebuild. Their demands are ad-hoc, nonsensical, and always urgent. So urgent that proper testing and documentation are not valid reasons to prevent a release. Their demands are driven by their sales targets, and the promises they make to clients are born out of ignorance of what your product does, and how it does it.

This is not true of all companies. Many companies find a reasonable balance between the insatiable demands of a sales force and the weary cautiousness of their engineers. But if your company submits to every wish and whim of your sales team, and you attempt to rebuild your product, then you’re screwed.

flukus · on July 12, 2018

> I’ve carved a career out of rebuilds

What's your learning process? If you don't do maintenance how do you know your rebuilds aren't creating the same problems that led to the systems needing replacement?

I've got a very well founded distrust of people that only work on green field projects, they're generally responsible for the systems that need rebuilding.

omeid2 · on July 12, 2018

I have also come to believe that people who jump to rebuilds also tend to have very shallow technical skills and are not keen or capable of studying and analyzing a system at depth.

tomelders · on July 19, 2018

Complete nonsense.

tomelders · on July 18, 2018

Well there's little else you can do when your app is a Java applet and your runtime has just vanished from the web. These things still exist, and people like me are rebuilding them.

I don't appreciate the snark in your comment.

darkerside · on July 12, 2018

It's very hard to get a man to understand something when his salary depends on his not understanding it. By the same principle, as someone who has built a career out of rebuilds, we shouldn't be surprised that you'll recommend this solution for a majority of hypothetical problems. I don't think you are intentionally misleading people, and I'm sure that you want the best for your clients and that you believe that's what you're providing. It's just that, for anyone else reading this thread, please realize that you're getting one side of the story.

Incremental rebuilds are not sexy. Adding unit tests to legacy code (thereby making it not legacy code according to Michael Feathers) is not sexy. Sticking with the tried and true technology is not sexy. But they are typically the most successful approaches for those not compensated for changing things for change's sake.

hvidgaard · on July 12, 2018

> I’ve always been a big believer in rebuilding your product from the ground up. I think it’s something you should always have going on in the background. Just a couple of devs whose job it is to try and rebuild your thing from scratch.

Their time is much better spent working on improving the "legacy" codebase. Simple refactoring and splitting the codebase in a modular fashion means you can work on limited parts of the system in isolation. This makes incremental improvements and switching to new tech much easier, and certainly less risky than a rewrite.

spronkey · on July 12, 2018

Depending on how heavily coupled the legacy codebase is, "Simple refactoring" really may not cut it.

I mean, you can write a bunch of pinning tests, then try to prise out various bits and pieces, sure.

But what if all the stuff you're trying to prise out can now be accomplished with a few open source libraries that didn't exist way back, with a very simple rewrite of your business logic on the top?

That's a situation I've encountered quite a few times - a lot of legacy code that's largely boilerplate, with business logic drizzled over the lot, oozing into the little cracks.
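
For readers unfamiliar with pinning (characterization) tests, here is a minimal Python sketch. Both functions and the fee rules are invented stand-ins: the idea is to record what the legacy code returns today for a set of inputs, then check that a rewritten version reproduces those outputs before swapping it in.

```python
def legacy_shipping_fee(weight_kg, express):
    # Stand-in for legacy code whose rules nobody fully remembers.
    fee = 4.0 + 1.5 * weight_kg
    if express:
        fee *= 2
    if weight_kg > 20:
        fee += 10
    return round(fee, 2)

# Step 1: pin the current behaviour for a set of representative inputs.
CASES = [(1, False), (1, True), (25, False), (25, True)]
PINNED = {case: legacy_shipping_fee(*case) for case in CASES}

# Step 2: the candidate replacement (hypothetical rewrite).
def new_shipping_fee(weight_kg, express):
    base = 4.0 + 1.5 * weight_kg
    surcharge = 10 if weight_kg > 20 else 0
    return round(base * (2 if express else 1) + surcharge, 2)

def mismatches(candidate):
    """Inputs for which the candidate disagrees with the pinned outputs."""
    return [case for case, expected in PINNED.items()
            if candidate(*case) != expected]

print(mismatches(new_shipping_fee))
```

In practice the pinned outputs would be saved to a file once and reused, so the tests keep working after the legacy function is deleted.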

konschubert · on July 12, 2018

> Just a couple of devs whose job it is to try and rebuild your thing from scratch.

That may be good value for big established corporates, but for startups and smaller companies I don't think it is.

Djvacto · on July 12, 2018

Well (on a relative scale), won't most startups or smaller companies be more in the phase of "writing" as opposed to "re-writing"? I think the advice above would in theory apply to companies big enough to have legacy codebases.

esdkl22 · on July 12, 2018

> If your sales team has more clout than your designers and developers, then you’re fucked. And in the enterprise software world, this is the norm. An unchecked sales team that gets whatever it wants has already killed your product and made it impossible to rebuild. Their demands are ad-hoc, nonsensical, and always urgent. So urgent that proper testing and documentation are not valid reasons to prevent a release. Their demands are driven by their sales targets, and the promises they make to clients are born out of ignorance of what your product does, and how it does it.

Well said. This is easily my #1 biggest pain point as a developer.

omeid2 · on July 12, 2018

> I’ve always been a big believer in rebuilding your product from the ground up. I think it’s something you should always have going on in the background. Just a couple of devs whose job it is to try and rebuild your thing from scratch.

Hahaha. Just a couple of devs?

tomelders · on July 12, 2018

Their job isn’t to build the whole thing. Their job is to research and explore how new ideas and tools might be useful to your business.

It’s just R&D. It’s not an exotic idea.

zbentley · on July 12, 2018

Some of the best colleagues I've had, and teams I have worked with, have made this position (formally or informally) a rotating one. It's a great way to learn.

Corollary: this position needs to be at least two devs. Otherwise, you're rotating in people for redundant discoveries rather than mentorship.

omeid2 · on July 12, 2018

The problem with this idea, to put in terms of the main article, is that it raises Red Flag #1.

It may be a great way to learn, but I think that is better achieved with something like Google's famous 20% program not some vague rewrite attempt with no direction.

zbentley · on July 13, 2018

I completely agree. Having a pure-research team with a mandate of "all your research must be geared towards totally reimagining the entire product" is a dumb idea. Having more granular (and collaboratively driven) goals from "some of your research must be geared towards totally rewriting an area of our application that is a major pain point, and for which all previous attempts to do incremental changes have failed for technical reasons" to "look into better tools or strategies we could use to tune performance of, or write better tests for, swaths of existing code" is more realistic and more useful.

Obviously, other, not-pure-research devs should be given time to do some of that work as well, otherwise the research team becomes the "saviors that are always about to come back over the hill" for every other team while they kick their respective cans down the road.

repolfx · on July 12, 2018

That's pretty different to what you were talking about originally. You started by talking about rebuilds and said, have two guys just rebuilding the product. Now you're saying they're doing R&D. Those are very different tasks.

Ensorceled · on July 11, 2018

Red Flag #6: Key stake holders keep moving the goal posts.

If your goal moves from feature comparable but on a modern platform, to new features, to a complete reinventing of the product all without actually shipping ... you might be in trouble.

I had a rebuild go 6 months over. In the heated executive meeting at t+3 months I was called to defend my team and pointed out that the VP Product had just delivered “final” specs literally the day before. How could we be on track with development if PM is 3 months past “end of development” with design specifications. The fact that the specs were changing weekly because “we’re agile” is a whole other issue.

whatshisface · on July 11, 2018

People sometimes complain about how developers like to "write the operating system and then a language" when it comes to handling every foreseeable permutation of what the program might ever be desired to do, but we're all so used to unstable requirements that sometimes the metaphorical programming language research is the only thing that will be general enough to find a use next week.

oneplane · on July 12, 2018

I almost had a few similar situations, but after pointing out that being agile doesn't just mean changing requirements but also changing time paths or simply different deliveries after each change, it got a whole lot clearer what agile (and scrum) is good for, and what it's not good for (i.e. agile process but expecting waterfall results doesn't work).

Cthulhu_ · on July 12, 2018

> The fact that the specs were changing weekly because “we’re agile” is a whole other issue.

The article touches on that too; simplified it's stating that if you're not live within 6 months, you're doing waterfall.

Ensorceled · on July 13, 2018

That’s not waterfall. Waterfall you don’t start dev before specs are final.

Waterfall isn’t just a synonym for “the wrong way to do it” :-)

alkonaut · on July 11, 2018

The truth I think is more often that the legacy system is too old and brittle to improve, and customers are demanding ever more complicated features from it.

So you rebuild as a new system as a gamble, because even though it shows all the traits described, the new system is at least one that anyone is willing to develop, and one where features can be added, and to which people can be recruited.

We know big rebuilds have small chances of success. But that doesn’t mean you shouldn’t do big rewrites. You are in a bad place if you even consider one. Maybe the big rewrite means the company has an 80% risk of going under. It could still be the safe bet.

bkovacev · on July 12, 2018

This is yet another article where there's a clear managerial-only approach. Sorry, but I don't dig this.

As a developer you're constantly fighting managers who want to rush things to get them out and who will eventually blame you for a bug/non-defined behavior once you hit a certain milestone.

To me it seems the author of the article doesn't understand the tech debt. If you've ever worked in a startup you'd know that the requirements are ever-changing, thus that if a certain payment system is put in place, it might evolve to the point where you really need to refactor it and in order to enable the refactor you have to refactor the whole business flow as well. If there's more than 2-3 features affected by a new feature, a big refactor is definitely needed.

Only one solution is offered, which I don't think is adequate: why would I leave something in that was only meant to provide value for the short term and then build on top of it till I kill the old system?

LolNoGenerics · on July 12, 2018

His argument is against rewriting a whole codebase. Refactoring is surely an alternative.

gaius · on July 12, 2018

Missing the biggest red flag of all, engineers wanting to just play with new toys and pad their CVs. Ask the engineers why they want to rebuild and listen carefully to the answer and if it’s vague handwaving and buzzwords (microservices! Containers! New JS framework!) and no hard numbers to justify it, just say no.

For example “we spend X/year on AWS but if we spend Y to rewrite in C++ we need fewer VMs and can cut that to Z/year” is simple calculations. If your engineers can’t even do that, their motives are suspect.
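
A minimal sketch of that calculation in Python, where X, Y and Z are the figures from the sentence above. The function name and the sample numbers are made up for illustration and do not come from any real project:

```python
def payback_years(current_annual_cost, rewrite_cost, new_annual_cost):
    """Years until a rewrite pays for itself, or None if it never does.

    current_annual_cost: X, what is spent per year today
    rewrite_cost:        Y, the one-off cost of the rewrite
    new_annual_cost:     Z, the expected yearly spend afterwards
    """
    annual_savings = current_annual_cost - new_annual_cost
    if annual_savings <= 0:
        # The rewrite does not reduce running costs, so it never pays back.
        return None
    return rewrite_cost / annual_savings


# Hypothetical figures: X = 600k/year, Y = 450k once, Z = 300k/year.
print(payback_years(600_000, 450_000, 300_000))  # 1.5
```

The hard part is not the arithmetic but producing Y and Z estimates that can be defended.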

ebiester · on July 12, 2018

On the other hand, “we cannot hire anyone to work in COBOL/Perl 5.8/Tcl/other outdated language” is a very real problem. It turns out that in 2018, developers are judged for working too long in old technologies even when we know as an industry that a developer can learn a new language.

gaius · on July 12, 2018

I wonder if that’s really true. I bet loads of people would be delighted for the chance to go on using their old favourites.

wffurr · on July 12, 2018

It's absolutely true. People with enough experience to have "old favorites" tend to be very senior and expensive or retired.

New grads and junior engineers can end up trapped in a career dead end if their first job is on seriously old legacy tech.

https://medium.com/@csixty4/pick-was-post-relational-before-...

I almost fell in the same trap, but quit a similar job to go back to grad school and get my Master's in CS.

shoo · on July 12, 2018

http://boringtechnology.club

adrianN · on July 12, 2018

The problem is that Y and Z are just numbers you make up. Reliably estimating them is impossible without at least building a prototype.

gaius · on July 12, 2018

Sure, but prototypes cost orders of magnitude less than Y. And your engineers can scratch their new toy itch at zero risk.

adrianN · on July 12, 2018

In my experience getting any funding at all is about as hard as getting the whole project approved.

pspeter3 · on July 11, 2018

I think people also deeply underestimate the time it will take. We've undergone an incremental rewrite for ~4 years at Asana.

jupake · on July 12, 2018

Used your software once before. Loved it! You guys should do a blog post about your rewrite experience. Would love to know what your tech stack was and what your new one looks like.

toshaga · on July 12, 2018

This could be relevant: https://blog.asana.com/2017/08/performance-asana-app-rewrite...

pspeter3 · on July 13, 2018

That is definitely the best out there. I'm hoping to write another about what our current stack is.

solox3 · on July 11, 2018

With this good article I think I have a good question.

The reference to Martin Fowler’s strangler pattern (https://www.martinfowler.com/bliki/StranglerApplication.html) was mentioned in the article to grow the new system in the same codebase until the old system is strangled. In my case (Ionic 1 to 2) however, both the entire framework and the language are different. How should the strangler pattern work in this case?

twunde · on July 12, 2018

For webapps you would use a reverse proxy such as nginx or haproxy and replace your application page by page. Then configure the reverse proxy to send all requests to /home to go to the new stack and all other requests go to the old stack. Then flip the switch for every page you finish converting. For backend work, it's similar. You can have an api built in a new stack and it can just have a different endpoint or use a reverse proxy. Backend workers can pick up work from a different queue or you can switch the old job worker off and turn on the new one, and then monitor that everything is working as planned. The really important thing about the strangler pattern is that you need some easy way to turn on bits of functionality while turning off the corresponding old parts. It can be feature flags, it can be routing middleware. You can rip out the guts of the angular routing mechanism and use that to flip the switch.
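
As a rough illustration of the "flip the switch per page" idea, here is a sketch of the routing decision in Python. In practice this logic would live in the nginx/haproxy config or in routing middleware; the upstream URLs, the `MIGRATED_PREFIXES` set and the `pick_upstream` function are all hypothetical names for this example:

```python
# Hypothetical upstream addresses for the two stacks.
OLD_STACK = "http://legacy.internal:8080"
NEW_STACK = "http://rewrite.internal:9090"

# Path prefixes that have already been converted to the new stack.
# "Flipping the switch" for a page means adding its prefix here.
MIGRATED_PREFIXES = {"/home"}


def pick_upstream(path, migrated=MIGRATED_PREFIXES):
    """Return the upstream that should serve `path` (path only, no query string)."""
    for prefix in migrated:
        # Match the prefix itself or anything below it, but not
        # unrelated paths that merely share leading characters.
        if path == prefix or path.startswith(prefix + "/"):
            return NEW_STACK
    return OLD_STACK


print(pick_upstream("/home"))      # served by the new stack
print(pick_upstream("/billing"))   # still served by the old stack
```

Keeping the migrated set as plain data is what makes rollback cheap: removing a prefix sends that page back to the old stack.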

wink · on July 12, 2018

Seconded. Took part in a moderately big rewrite with this strategy and it worked pretty well.

Identify key components and subsystems and rewrite them one by one. From the outside you seem to be switching over one REST endpoint after the other, but of course internally it's a bit more difficult, but applications often enough have enough parts that are not SO intertwined that you can do stuff like this. It's a bit related to how you break up a monolith. Find bigger, less coupled parts and shave them off and just touch the glue code.

ronpeled · on July 11, 2018

There's no super easy way here. One way to get this done is to find independent areas of the app that can be replaced without coupling. Then start building up as you go with the new system. Once you're about 70% through, you can decide if you want to make the jump and focus your efforts on completely uprooting the old one.

Sorry for the abstract reference here, but it applies to almost any replatforming out there. In most cases it is a very expensive operation for a business and needs some major reasons in order to justify such a move.

omegaworks · on July 11, 2018

Are there any examples out there for how to do this with React in an existing AngularJS codebase?

gknoy · on July 12, 2018

Instead of a Single Page Application, make it an MPA (multi-page application), each of which is basically a separate SPA. You get latency when swapping between sections of your app, but on some codebases (such as for an internally-used app), that's less of a problem.

We did something similar to this when we broke up our Ember application so that we could code new things in React. We still maintain our Ember codebase, but are rewriting parts of some routes in React, and adding all new things in the React app.

We deploy ours as separate pods in a Kubernetes cluster, but you could even host them on the same server with separate nginx routes.

The initial ramp up of this is a little frustrating, as it seems you're adding extra overhead to everything, but the long term goal is to have infrastructure and workflow that supports having part of your app in The Old Proven Thing, and part in The New Hotness. This is valuable whether you're switching to React, or upgrading from Ember 2 to 3, etc, as it lets you upgrade a smaller set of dependencies, and experiment with things.

Forge36 · on July 11, 2018

The company I'm working at is doing this currently. The new product is on the web and the old one is a full client Windows program. The biggest hurdle will be to find the balance between largest/smallest pieces which can be transitioned as seamlessly as possible.

curyous · on July 11, 2018

I'm surprised at what gets called a pattern these days. Fowler didn't describe it as a pattern, but just because Mr. Guru said it, it is now a pattern?

pbreit · on July 11, 2018

The way it's referred to as a "StranglerApplication" in this post [1] does suggest more than "just saying it".

1: https://www.martinfowler.com/bliki/EventInterception.html

lyqwyd · on July 12, 2018

This article really captures the risks of a rebuild. I’ve been through a number of them, all but 1 abject failures. The one success was driven by the executive understanding that the company would fail without a rebuild, and it was still 6 months late, resulted in one of the cofounders being fired, an extremely painful rollout, and the company still failed, due to other problems.

My firm belief is that when you need a rebuild, you are already well into a fail state as a company. Not to say there can be no recovery, but it is an indication of some deep problems for the company, beyond anything the engineering department alone can resolve... and if the rebuild is not coming from the executive leadership, it is an even bigger issue as it will more likely lead to bigger problems than it will solve.

ellimilial · on July 12, 2018

This is gold.

I've become a member of a team the company scrambled to deal with a `legacy` python/SQL - based ingestion/storage system in an effort to 'harden' it. Despite my best efforts, we are going for a full rewrite into java/spring/avro/mongo/es. We have internal users talking SQL and utilising the system at the moment, a fair amount of data is relational.

I have run out of ideas how to convince the team and stakeholders, will have a one-shot chance to talk to VP. Any ideas how to voice the concerns about the full re-design (perhaps I'm just being difficult)?

sonnyblarney · on July 12, 2018

1. Given the risk, cost and limited upside, the onus is on the refactor team to prove that it needs to be done. Where is the ROI, factor in the risk. Where is this in the stack of things to do? Are there better ROI things?

2. Consider 'what the point' is in the first place, because the entire world could be run on python/SQL and it would be 'hard'. I don't think anyone would consider 'Mongo' to be 'hard' usually people use it because it's fast and easy, not hard. Consider maybe only replacing one part at a time, i.e. Java-SQL.

3. Consider a simple clean up or refactor. No need to learn new languages and tools when maybe you just need a house clean.

4. People seem to be going back to SQL because of its inherent standardization - so many reporting and analysis systems use SQL as an interface, to the point where even NoSQLs are starting to use SQL.

pedalpete · on July 12, 2018

I'm a big supporter of "replacing one part at a time", and wish I had done that on a rebuild I'm just completing.

In fact, I thought I was. We split our app into 3 parts, rebuilt part 1, then part 2, but part 1 couldn't be released to customers until part 2 was done, and we kept our legacy system supporting the majority of our users until we are done with part 3, which is nearing completion now.

I thought that was "replacing one piece at a time", but it isn't: most users aren't touching it until part 3 is done, and at that point, they are experiencing a new system from scratch.

mratzloff · on July 12, 2018

Without knowing the performance requirements and where the current system is failing, it's hard to know if the technology stack will work for your needs—with one exception.

If users speak SQL, they will reject Mongo. The users of the system are the ones who will determine project success or failure.

Think about the data analysts, product owners, etc. who use the system. Interview them. Find out exactly how they use the system currently. Do they query in an ad hoc way? Do they rapidly iterate on their queries? Watch them interact with the system. If it's any way other than through dashboards that an engineer updates on request, you are in for rough seas.

Users must always determine the contours of a new system. There are big data solutions that speak SQL. Some are cloud-based, some are not. Some are faster than others. The team should be able to show you why they rejected those as solutions.

rwmj · on July 12, 2018

Is "rebuild" new jargon for "rewrite", or does it mean something different? I thought the article was going to be about builds failing.

ConceptJunkie · on July 12, 2018

Yeah, I did too until I started reading the article.

Using the normal sense of "rebuild" didn't make sense.

wellpast · on July 12, 2018

Red Flag #1 should be that you’re doing a rebuild.

borplk · on July 12, 2018

This is so frequently true that people are tempted to make it a strong NEVER. But that is also a mistake.

There are some legitimate cases where you really should be rebuilding.

You may not have seen such a case since they are rare, but they do exist.

A good rule of thumb is to try your absolute best to avoid a rebuild. If at the end of your hard work you still feel defeated and forced to go with the rebuild option, you probably should rebuild.

realusername · on July 12, 2018

Sometimes you just have to. In one previous company, the "system" we were trying to trash was an unmaintainable VBA CRM homegrown mess which was creating lots of internal issues in the company due to the nature of spreadsheets. It took almost a year to replace but it was 100% worth it.

CydeWeys · on July 12, 2018

I'm potentially looking at a situation like this right now at work. We're on a NoSQL DB and it's just not working too well for us anymore, so we would like to transition to something that provides more relational semantics (Postgres, Spanner, something like that). Migrating the backend between one kind of DB and another is non-trivial, especially because the whole ORM needs to be ripped out as well. It's not a full rebuild of the application but it's definitely substantial in effort level.

Sometimes a rebuild is just necessary, because you are on a tech stack that is no longer working for you, for whatever reason. How would you solve that kind of problem?

grey-area · on July 12, 2018

I'd definitely vote for PostgreSQL, it can handle large loads effortlessly, it's reliable, and yet they keep adding great features.

It could also function pretty much like a nosql db initially, to ease your transition, then you could migrate gradually to using it as a relational db. You need strong checks on data integrity before you start - you could consider double writing (to old orm using nosql + new orm using psql), and comparing data stored to be sure you don't miss anything at first, before you switch?

wellpast · on July 12, 2018

As incrementally as possible. Eg, does your entire data model need to move at once? (Probably not.)

hvidgaard · on July 12, 2018

The first thing you do is refactor with the existing DB, so you have a clear DataStore component. Then you make your shiny Relational DB implementation of that DataStore. Now you run both side by side and for everything you do in the old DB you do the same in the new DB, and you compare the results. At some point you can turn off the old DB with confidence and sleep well knowing that the new DB behaves the way you expect.
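
A minimal sketch of that side-by-side setup in Python, assuming a simple key/value `DataStore` interface. All class and method names here are invented for illustration, and the in-memory store stands in for both the old and the new database:

```python
from abc import ABC, abstractmethod


class DataStore(ABC):
    """The single seam the rest of the application talks to."""

    @abstractmethod
    def put(self, key, value): ...

    @abstractmethod
    def get(self, key): ...


class InMemoryStore(DataStore):
    """Stand-in for a real backend, so the sketch is runnable."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value

    def get(self, key):
        return self._rows.get(key)


class ShadowedStore(DataStore):
    """Old store stays the source of truth; the new one is written and compared."""

    def __init__(self, old, new):
        self.old = old
        self.new = new
        self.mismatches = []  # (key, expected, actual) records to investigate

    def put(self, key, value):
        self.old.put(key, value)
        try:
            self.new.put(key, value)
        except Exception as exc:
            # A failing new store must not break production traffic.
            self.mismatches.append((key, "write failed", repr(exc)))

    def get(self, key):
        expected = self.old.get(key)
        try:
            actual = self.new.get(key)
        except Exception as exc:
            self.mismatches.append((key, "read failed", repr(exc)))
            return expected
        if actual != expected:
            self.mismatches.append((key, expected, actual))
        return expected  # callers only ever see the old store's answer


store = ShadowedStore(old=InMemoryStore(), new=InMemoryStore())
store.put("user:1", {"name": "Ada"})
print(store.get("user:1"), store.mismatches)
```

Once the mismatch list stays empty under real traffic for long enough, switching the application to the new store alone is a one-line change.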

jpeeler · on July 12, 2018

Firefox seems to be doing pretty well with their incremental rewrite into rust. I do wonder how long it will take to complete the transition versus doing a complete rewrite instead.

nerdponx · on July 11, 2018

Another red flag not mentioned here: the old system doesn't have an end-to-end suite of functional test cases you can rely on.

lgleason · on July 11, 2018

I recently left a project that demonstrated most of these traits. Usually these things are the top of the ice-burg.

teddyh · on July 12, 2018

Know your burgs and bergs. A “burg” (or burgh) is a fortification, or more usually refers to a city built around (or inside) that fortification. A “berg” is a mountain, or a large hill. Therefore, an iceberg is an “ice mountain”, and a “burgermeister” is a “city master”; i.e. a mayor.

de_watcher · on July 12, 2018

You forgot to mention that you should use "tip" instead of "top" in this idiom.

Here is a video with more detail: https://www.youtube.com/watch?v=dQw4w9WgXcQ

kazishariar · on July 12, 2018

¯\_(ツ)_/¯:'Dual commits' to the rescue! -pun intended

Chyzwar · on July 11, 2018

The rewrite usually comes when it is too late for the project. The need for a rewrite means that project maintenance was ignored and technical debt reached critical levels.

I would start by firing people that led to this situation.

clintonb · on July 11, 2018

If you fire those people, you remove your source of expertise on the old system. Yes, they did a poor job of maintaining the old system, but their knowledge may be valuable to understanding the old system and creating requirements for the new system to reach parity.

maxxxxx · on July 11, 2018

"I would start by firing people that led to this situation."

You are one of those blessed people who can architect a system and the architecture holds up for decades. From my experience most systems will end up in a big mess over time if features get added. There is almost no way around it.

flukus · on July 12, 2018

> You are one of those blessed people who can architect a system and the architecture holds up for decades.

This is exactly why maintenance is needed. Proper maintenance that includes things like updating the architecture and gradually migrating the whole system to that architecture, rebuilding small unwieldy components, updating and migrating database schemas as the product evolves, removing unused features.

If a product is just getting bugs patched and nothing else then it isn't really being maintained, it's being deprecated. Unfortunately as an industry we still think that there are distinct build and maintenance phases and that the latter can be done with less resources.

bokonon12 · on July 12, 2018

Yup. So much of the time the system starts out at one thing and morphs to another. That can easily lead to core problems with your architecture

JumpCrisscross · on July 11, 2018

> I would start by firing people that led to this situation

Thereby fomenting Red Flag #4, not "working with people who were experts in the old system.”

Chyzwar · on July 11, 2018

I am not saying to fire everyone. I am just saying that someone needs to be responsible. If you keep the same people in power they will repeat the same mistake. You need to keep domain knowledge but clueless management is just a burden.

zbentley · on July 12, 2018

That's utopian. The reason cautionary rules like those in TFA exist is because "just undo/revoke all the bad shit that led to your current situation", while certainly appealing in concept, is often impossible in practice. Instead, litmus tests like these, which can be practiced at the dev team level, prove useful. Who knows, if enough dev teams at a company arrive at the same conclusions using reasoning like this, perhaps they really can put enough pressure on management to induce firings of prominent debt-incurring people. Even (and more likely) if not, that understanding at the engineering level will help mitigate future damage/mistakes.

watwut · on July 12, 2018

New people will repeat most of the original people's mistakes too - many bad designs look good before you've tried them.

But also, wtf is it with people that the first instinct in any kind of situation is to fire everyone.

ConceptJunkie · on July 12, 2018

I think it's the phrase used by someone else in the comment chain: "clueless management".

You can have the best developers and architects in the world, but clueless management will sabotage anything they do, whereas good management can accomplish plenty with teams that aren't the best possible.

Buttons840 · on July 11, 2018

They're already gone, almost certainly.

lovich · on July 11, 2018

I've found that they are usually still there but as they are the CEO/CTO it's difficult to get them fired

bumholio · on July 11, 2018

Hey, a bunch of shell scripts glued to some DOS executables were good enough in my time. We had no fancy schmancy github back in those days, yet we built this business on nothing but hard work and pizza.

Why, the sources of the DOS exes were long gone by second year, lost in the crash of that old Windows Millennium machine that used to sit in our dorm room and was uniquely configured to compile them using Turbo Pascal - we figured it was a safe option to use as a source repository. But that still didn't stop us - we implemented the remaining features by patching assembly.

nerdponx · on July 11, 2018

That, or you are their replacement.

a_imho · on July 12, 2018

People write legacy systems from day 0, especially in resume driven development.

borplk · on July 12, 2018

Often those people are high up enough that they are not going anywhere.

For example an executive/management team that over-commits the organisation and creates a culture of rewarding technical debt and punishing maintainers.

Rather than fixing these issues they will continually search for a superhero employee who is going to come in on a white horse on Monday and fix it all up in two weeks.