Refactor vs. Rewrite (remesh.blog)
198 points by ntietz on June 2, 2020 | 118 comments



In my experience, refactor vs. rewrite is the wrong framing. What matters is that knowledge of the business is maintained.

If the knowledge lives in the team because the company is stable, rewrite. 37signals is a clear example of this.

If the knowledge is in the code, because multiple people or teams worked on the code base over the past few years, refactor. A rewrite will be a recipe for disaster.

I never found an exception to this rule.


I have been responsible for an exception to this rule. We were new people coming in to take over, decided to go for rewrite, and were successful.

But part of the reason for the rewrite was that the logic (the translation of business domain knowledge into code) in the old version was wrong, resulting in unstable behavior. It could have been fixed in the old version (in very many places), but the lack of a test suite also tipped the scales toward a rewrite (with a test suite this time).

You can say that, after coming in, we learned the domain (at least) as well as the authors of the original system, but won't that be the case for any successful rewrite?


It seems to me this still doesn't contradict the GP: the domain knowledge wasn't in the code, it was outside it. The thing didn't work in all cases, so you had to get the spec from something other than the code; rewriting according to this external source was the sensible thing to do.


If you are able to get the domain knowledge fast enough, sure.

Or as you said, the domain knowledge has changed from the original.


Great heuristic!


IME whether you rewrite or refactor, the lesson is the same: You have to grind your way into the good architecture. It doesn't become good because the code is fresh, but because you have battle scars to show.

And a success story in that case comes from having a complete learning loop. In a lot of orgs the learning itself is argued against for one reason or another - development proceeds with as little feedback on quality as can be gotten away with.


Here's a nice quote from Jony Ive to support your point

Much of the design process is a conversation, a back-and-forth as we walk around the tables and play with the models. [Steve Jobs] doesn't like to read complex drawings. He wants to see and feel a model. He's right. I get surprised when we make a model and then realize it's rubbish, even though based on CAD renderings it looked great.

He loves coming in here because it's calm and gentle. It's a paradise if you're a visual person. There are no formal design reviews, so there are no huge decision points. Instead, we can make the decisions fluid. Since we iterate every day and never have dumb-ass presentations, we don't run into major disagreements.


>IME whether you rewrite or refactor, the lesson is the same: You have to grind your way into the good architecture.

I think that only applies to things nobody on the team's built before. From my domain, HFT, if we hire a senior dev who's built blazing-fast futures arbitrage systems working for other companies, chances are if he rewrote our system it'd have a very nice architecture from the get-go, because he's already solved a similar problem multiple times before (and learned from the mistakes in those architectures).


And then that senior dev will pick a newer, better CPU, on-board bus, or network topology, or your setup will simply have a different scale, and hard-learned heuristics like "don't use more than x foos for y" or "the network is the bottleneck; use huge machines" become obsolete, and your dev will get some new battle scars.

I guess that will happen in HFT all the time, as you don’t want that blazing-fast setup, you want something even faster.


"Grind" your way into good architecture? Good architecture is designing, not trial-and-errored. The only grinding I do is mental before I type a single letter of code.


So all the code you’ve ever written was perfect in every way from day one?

You’ve never made a suboptimal choice?

And never made a mistake?


Have you heard of our lord and savior, Domain Driven Design?

Here are the blue and red books.


Are these books Evans(blue) and Vernon(red)?


Yes. I think Vernon's is the best as it has more emphasis on the process and organisation instead of the code.


Interesting. I’ve only read the blue one, will get a copy of red. Thank you!


I feel like you have denied the existence of people who are capable of writing down a good architecture on the first try, without all the grinding.


No good plan survives first contact with the enemy...

Less pithily, the good architecture will almost certainly end up with arbitrary patches and madness after 2 years of active use, if anyone cares about it at all... And then somebody will join and say 'This is a mess, we should refactor and/or burn it to the ground.'


I don't doubt that this could accurately reflect your experience in the industry, but it is not a universal truth. I've worked on sustaining engineering of systems that were years old, the stewardship of which involved refactoring only, and which are still in production today. There are lots of systems that are more or less right in version 1.


> There are lots of systems that are more or less right in version 1.

While true, in my experience this usually happened when at least one - if not all three - of these conditions were true:

1. The problem domain is relatively static, unmoving, and self contained

2. The system in question is relatively small

3. The developers in question have written substantially similar systems before, or ground their way there locally, or used 0.x version numbers first (in any of these cases, is it really "version 1" per se?)

...game development is admittedly weak in all 3 categories, and high on time pressure. Oh sure, there's some solid code that will probably last decades in many codebases, but even the best programmers will occasionally make a system that - while good in isolation - needs reworking when combined with other systems - to say nothing of the content and rendering pipelines that vary wildly by decade.


4. A system that never needs to scale much beyond its original projected load.


First read as "original project lead", but I think that could apply too.


That's certainly true in my experience. I've seen some great, readable, codebases written by solo developers that had lots of previous experience writing code in larger teams.

Caveat: it's never something that's too large or complex, of course.


0. The requirements were known in advance of starting the design.


> the stewardship of which involved refactoring only

I've known several such systems, and the reason they stuck with the original architecture is they wrote it in C, and it's very hard to iterate architecture in C programs. The programmers will just keep bashing the C code so it works well enough.

Yes, I know this is a provocative statement.

The reason behind it is that C's ability to encapsulate behavior is just not very good. Even simple things, like "is it an array or a linked list" leak out all over the place.

Even simpler, "is it a value or a reference type" means one has to swap "." and "->" everywhere. C++ half-fixed it by introducing reference types; D fixes it all the way by noting that "." can be used unambiguously for both operations.


To some degree that's a matter of luck. Sometimes all your assumptions are correct from the start and the requirements don't change much. But often a lot of things change over time until the first architecture doesn't fit anymore. I would agree there are better and worse architects, but luck definitely plays a role.


Yeah, it's always important in such discussions to remember how vastly different development can be. Developing embedded software for an appliance has vastly different constraints compared to a web-app startup that pivots every three months.


It's funny to say, but I can't agree. I've designed and implemented many custom systems for which there was never a predecessor, and not only did they work splendidly at first release, they have been extended and enhanced many times over the years with no refactoring. Many of these systems have been extended enough to start taking over roles that were never intended.

Design a system to be modular from the very beginning with only the most fundamental of API signatures and you win. Many of these signatures may even no-op for years, but because they're fundamental to the problem, you know there could be a case where they're needed.

I've never had a failed or late project and most of my projects are ones that were assigned to me because no other team would touch them due to all of the attention and required SLAs.


Nonsense. Competent designers with 8 or 10 years of experience should absolutely be capable of architecting a solid system that is cleanly extensible for foreseeable business requirements.

Probably one huge confounding factor, however, is the continual drive by juniors (or supposed seniors who think like juniors) to use the latest and shiniest technologies. These, of course, are unknown to the team, of questionable long-term suitability, and introduced without knowledge of the "right" way to use them.

In the article the OP describes using Elixir without A) needing it or B) being able to handle its downsides. Plus many random versions of Angular and React. Plus the overheads of an excessive focus on SOA.

My emphasis in architecture is on using simple & strong tools, and delivering great architecture in the problem space. Configurability, extensibility and DSLs are my forte. I don't need a new language -- I know how to use the ones I've got.


I have yet to know any such people :-)


Agreed. Every new design is informed by past mistakes. Programming is not an a priori field.


Just avoid touching anything complex :-)


Or a deadline.


Certainly possible to make a good one on the first try. A great one? Probably not, unless you are doing something absolutely trivial and done several times before, like a CRUD app.


"Everyone has a plan, until they get hit" - Mike Tyson


I'd suspect anyone who claims that is a bit full of himself. Maybe their work isn't so great to other people.

Or maybe it is. I won't rule it out.

But I also think that anyone who does this probably works too hard. It's so much easier to build things incrementally.


>It's so much easier to build things incrementally.

I would say it's much easier if you know what you are building in advance. Building incrementally is usually more about exploring ambiguous ideas without a clear vision, in my experience.


I'm always building something I've never built before. And the requirements always change as you go.

So knowing what I'm building in advance never really happens.


So they got off of Elixir and microservices, but their introduction of Typescript and GraphQL didn't pay off like they hoped, and now they need to get off GraphQL. I think the lesson of "Choose boring technology" applies here. One of the reasons rewrites are risky is that very often the new tech doesn't live up to expectations.


Yeah, this is a good point. This is the first time I've seen a rewrite to "older": older architecture, older language, older methodology. And the only parts that seem not to have met expectations are the ones using new stuff (GraphQL and maybe TypeScript).

Maybe there's a rule here: a complete rewrite is OK if you're rewriting to "older" tech.


If the business model is boring, you’ll need complex technology to compensate. If your business model is complex like in a lot of enterprise software, you should strive to keep the technology as simple as possible. Simple doesn’t mean to not use complex or cutting edge things IMO but to be conservative in using things that push the cognitive load of your collective team. A 10 person company probably doesn’t need a K8S cluster for a mobile app but I’d be surprised if there isn’t one for a 10k+ software company.


"If the business model is boring, you’ll need complex technology to compensate." Could you elaborate on this? What is the complexity compensating for? Boredom of the developers?


I suppose the boring business model needs some secret sauce, i.e. advanced technology, to stay competitive -- since it's boring, many people are likely to do it, the margins are thin, you need to optimize.


Compensating for competitive factors both for customers and for investors to care that you're worth putting money into. If you don't have something that makes you special from your competitors, what's the point of you existing? That's just a small business like a mom & pop grocery store or a boutique web design agency (both of which are endangered species). If you don't have any competitors, then you'd better have something of value, correct? Or is one's company Theranos et al? In enterprise, the competitive factors can be things like relationships to certain vendors and this can supersede technology secret sauce and is as equally valuable as a trade secret. You get overly-complicated business models that become a house of cards like Enron or WeWork which make unsophisticated investors think they know what they're doing, and as engineers we are attracted to our own biases toward shiny new things that might solve all our headaches.

But you're also right in that some technology companies basically hire developers to give them resume / vanity projects that don't necessarily help the business be competitive directly. However, that is another ball game that's into global brand territory which follows rules more of religions and politics than of capitalism.


I am also in the middle of moving off GraphQL to a REST API. It's a five-year-old project, and the GraphQL part was implemented against a very old spec. Since it has not been well maintained, it could not catch up with features offered by newer GraphQL, and users needed to write their own queries by hand due to spec compatibility.

Rewriting it as REST allows us to "rebrand" these APIs. At least it does not scare our API users the way the word "graphql" did.


Yes but they were able to add lots of keywords to their resumes so it wasn't a complete loss.


I rewrote a codebase for a major stock exchange in about four months. It was originally in Ruby, which is my most proficient language, and my implementation was in Python, my second most proficient language.

The original Ruby codebase had been worked on for years and was a complete mess. Not because of Ruby, but because the people that developed it were a little sloppy or junior or time pressured. Who knows?

It's totally possible to do a rewrite and to do it well. The advantage of having a pre-existing implementation is that it's far, far easier to write tests before you implement because most edge-cases are covered in the existing implementation. Make sure you do a feature freeze before you start, then once you're ready for production do both shadow testing and canary testing and things will go well.


> The advantage of having a pre-existing implementation is that it's far, far easier to write tests before you implement because most edge-cases are covered in the existing implementation. Make sure you do a feature freeze before you start, then once you're ready for production do both shadow testing and canary testing and things will go well.

There are two ways I've seen it go vastly over schedule when these things bite you.

1) There might be edge cases covered in the original implementation that neither you nor current stakeholders remember or know about. So discovery can either take way longer than expected (until you're confident you know 100% of what the code does) or implementation can get constantly delayed by "oh, yeah, we do still need that."

2) Feature freezes are a rare gift. Usually it's "we need this new thing" which leads to engineering saying "that would be hard to do in the current system, we should rewrite it" so now you have to rebuild all the old stuff plus build the new thing. And its requirements will probably keep changing on you.


I'm not sure it's super common that many companies would agree to a 4-month feature freeze.

I don't think having an existing implementation makes it easier to write tests. I think having a ton of domain knowledge about the system and what inputs it could possibly get is what helps you here. And all the better if the existing implementation also has a comprehensive test suite that can feasibly be "ported" to the new language.

The existing implementation helps you if you decide to come up with new tests, because you can compare the result of those tests with the old and new implementations. But if you don't even know what new tests to write (which requires domain knowledge), you're stuck.


> Make sure you do a feature freeze before you start

Easier said than done. Oh, it is easy to get the deciding people to do lip service to it when planning the rewrite originally, but a lot harder to enforce it a few months down the line.


God bless whoever wrote tests before you. Whether it's a rewrite or work on the current code base, those who have the time and resources and don't write tests are inexcusable and bad stewards.


Sometimes you can produce tests (not unit tests, but integration and such) from the existing system by treating it as an oracle. That's what I did once; it worked really well, and in the end, of the differences between it and mine (where mine technically "failed" the tests), we found the old system had dozens of critical bugs and mine had a handful (the old system gave many false negatives, which was a major no-no; mine gave some false positives, which were tolerable, since manual inspection can clear up the latter but not the former).


I'm the person the comment you're replying to responded to.

This is exactly what I went through during that four-month period. The existing tests weren't very useful and I was completely changing how the program was architected, but the input vs. output could be measured. I, like you, also discovered critical bugs in the existing implementation, because I needed to cover the entire combinatorial set of possible inputs, whilst the original implementers presumably went through them as they came up.


If you could do it in four months, it seems to be a relatively simple system. It's also not clear how your new code will adapt to changes in the next few years. It may end up a big mess again.


I constantly refactor my code, even the ancient stuff. Here's my most recent one:

https://github.com/dlang/dmd/pull/11202

which changes a data structure from a linked list to an array.


I was about to comment something sarcastic, then I googled you and instead I'll just say thanks!


I don't mind if you say something sarcastic, but I am happy to say "you're welcome!"


My team inherited a 70%-complete AngularJS application, the user interface for passengers and crew on a new series of airplanes. It worked fine enough on modern tablets and phones, but there were also some interface devices hardwired into the planes, which would be very challenging to update at this point, and on those devices the software ran very slowly.

Through several experiments, we determined that the sluggishness was unfortunately coming from AngularJS itself. We tried some performance improvements, but we weren't getting very far.

We built some prototypes using other Javascript frameworks, and found VueJS gave us the best performance.

Management was concerned about us rewriting. The end customer was concerned about us rewriting. So risky! But we saw no other way to deliver adequate software performance.

We got done on time, with great system performance, to the accolades of a very happy end customer.


In my career, UI rewrites have been by far the most successful.

It may just be anecdotal, but my assumption has been that most of the edge cases are more easily identified because UIs are by nature going to have most of their business logic visible to the human eye.


When a developer says "I am going to rewrite ..." you have to question why? If you were not doing this "rewrite" then what else would you be doing? Refactor is usually a term used to minimally modify legacy code to allow new functionality to be added.

Rewriting implies rewriting legacy code (regressions) and wasting time. If a developer says "it's the only way" then you need to be sure you can trust their judgement.


I tend to agree. Part of the problem is that when developers get thrown into unfamiliar code bases and can't read the code well, they may default to just doing a rewrite.

But really, thoughtful refactoring is better. And as another comment noted, in many cases good architecture comes from learning from the past mistakes of yourself and others.

OTOH, there are edge cases. I'm currently finishing up a rewrite, but it was due to the existing software being written in a language that most people aren't proficient in and isn't really the norm in our company.

I've witnessed, but did not participate in, two rewrites. In one case, a company wanted to switch languages and databases (they got sick of Oracle licensing).

In another case, the company had a pretty good justification for a rewrite; the system was a monstrosity of .NET on top of VB6. In fact, in this case after the initial rewrite attempt failed and there was a change in management we pivoted strategy. Instead of trying to rewrite the world we would just continue to refactor things as we made changes to those modules, and only hard-rewrite the VB6 code. I don't know if they're still doing that, but it was a -lot- more productive than the rewrite team spending a year producing nothing that ever made it to production.


> Rewriting implies rewriting legacy code (regressions) and wasting time.

This isn’t necessarily true. A few years ago, we rewrote a large part of a web-based UI. The original implementation was a Java applet, a reasonable choice for an interactive diagram at the time it was first written. However, over time the applet had become a liability because browsers were planning to drop support, and in the meantime some useful alternative technologies had become available. In a situation where you are porting an established system to run on a new platform, rewriting really is the only way and therefore isn’t a waste of time.


This is definitely an interesting case where your product suffers through external factors. Things like a language reaching end of life (Python 2), technology becoming obsolete (like you mention here), a vendor forcing change through API and security changes in an OS upgrade (iOS), and CVE security issues all force sustaining work on older code in unexpected ways.


I frequently write code that is publicly available on GitHub and used by other people. Sometimes my first attempt at structuring the code works out well (more and more often as I grow in experience). Sometimes... not so much. Usually a refactor is enough, but occasionally I have to buckle down and admit that, to be ready for public consumption, a rewrite would be faster.


I often suggest rewriting (or large-scale refactoring) to modularize big balls of mud to permit automated testing. I can always justify this by pointing out how much time is wasted doing manual testing, as well as how much time is wasted starting up a full-blown instance of an app and then navigating to where the problem is just to troubleshoot something.

In my experience, though, the "rewriting is wasting time" types reject that reasoning because they don't really have a good handle on software engineering concepts.


> In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, "I don't see the use of this; let us clear it away." To which the more intelligent type of reformer will do well to answer: "If you don't see the use of it, I certainly won't let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it."


Refactor is a good word to use if you don't want to upset the coworker whose code you completely replaced.


I found the move away from GraphQL interesting, maybe because it supports some of my concerns about it. I never looked into GraphQL in detail, but just from the general principles I understand, I found it rather scary to decide to use it.

It's a big complicated thing directly between the data and the frontend. If anything goes wrong there, now I have to debug a much more complex piece of software than if I use a REST API. And just like in the article, based on my superficial knowledge I was concerned that it could be easy to create bad queries, and hard to fix performance issues.

I'm not saying GraphQL is bad, I really don't know enough about it. But given the central position it takes, I would not feel comfortable deciding to use it without a serious amount of prototyping and getting to understand how it works. It often feels weird to me that such a complex piece in this kind of critical position is often casually recommended for any kind of web application.


> it could be easy to create bad queries, and hard to fix performance issues

You can do a complexity analysis of the query before you run it and deny requests whose complexity exceeds a certain value. Some GraphQL libraries provide this out of the box. Of course it won't eliminate all possible bad queries, but it catches most of them.

For me, GraphQL simplifies life quite a bit. We have an internal library that plugs a lot of things in automatically but leaves full customizability, which makes it quite convenient to build APIs. On the front end there are very useful libraries that again make life simpler. In short: the biggest benefit is that I have more time to spend on things that actually matter, instead of inventing my own protocol that in the end is just going to be a worse GraphQL.


Maybe controversial but I think original authors should be able to rewrite, i.e. they can learn from previous mistakes and incorporate those lessons into the rewrite.


Oh yeah, they definitely should; after all, they will also have a lot of domain knowledge which new developers would have to relearn (I'm in that situation myself at the moment).

But, they need to have actually learned since then. They need to prove that if given another shot, they can do better.

I've been able to do that. We built an application in BackboneJS, then Angular became big; we used that in a separate application, then decided to gradually rewrite the original in Angular (we were able to ship it to customers during the rewrite, rewriting things on a per-page basis without doing much with the design). It took a lot longer than the original, but also because new features were added and we were doing things much more neatly, with proper unit tests and the like.

We learned that we had underestimated the complexity of a single page application, that it's not just a matter of "show whatever the back-end is returning so make sure the back-end is tested, front-end doesn't really matter much".


It's always a judgment call. But in general, there are a few things that make the choice easy.

Any UI project, I tend to look at the expected shelf life of the thing knowing full well that most UI projects get discarded pretty quickly. Stuff like backbone, angular 1, etc. were pretty popular a few years ago and you don't hear a lot about people whining about refactoring those. Mostly those code bases have already been changed; several times probably or abandoned.

So, if we have a bit of UI and it's important for it to be right and it is clearly not and I have a team of developers leaning towards unceremoniously dumping the old code base, I'm not going to object very long or hard. The reality is that it probably can be fixed but all the people interested in doing that have gone.

For backend systems it's different. I regularly see code bases with long histories. Often a good sign of how good they are is if people can work on them with confidence. In those cases, I lean towards doing more incremental changes. Update dependencies, modernize a few things even if they've been fine, etc. If that's not the case, it's a net cost sink and something needs to be done.


A good rewrite looks like a refactor.

Strangle that monolith and none will be the wiser.


I love that answer. Strangling for me is the best of both worlds: you can do it iteratively/cautiously but at the same time you can start "fresh" if necessary/possible.


Conversely, a good refactoring process eventually results in a complete effective rewrite.


I think we need to focus on an even more basic point when deciding between refactor vs. rewrite: whether the basic DB schemas/data structures are well designed and suited to the current business understanding. The rest of the structure stands on top of them.

I do not see how you can refactor stuff if the majority of the DB schema/data structures are built incorrectly and/or not suited to the current understanding/layout of the business. You will have to rewrite.

The only case where a refactor can work is when the basic DB schema/data structures are 90% good and changes are only required at the business-logic level.


I feel there's room for something in between (reimplementation? resurrection?): start with a clean slate, bring in large swaths of functionality and code, and then rebuild other parts. I think this is useful when you have a lot of work to upgrade an older version of your core framework or many dependencies, when you're changing a large piece of your infrastructure (like a database), or when you have a lot of work to "fix" your test suite.


I think rewriting is a great option, provided both options are weighed up appropriately.

However, saying this, we made the decision to rewrite a legacy system around a year ago, and the project got parked due to "urgent" projects being fed to our team. This was frustrating, but we could not account for this external factor when weighing up the options.


I think changing priorities is a risk that has to be taken into account whenever you are considering any task that will extend for a long time -- rewriting code is just one example.

If you have to go on for many weeks / months on a project without the possibility of showing intermediate results, it is naive not to think that there is a high risk of the project being cancelled.


I completely agree with you; it is just a complicated metric to factor in.

I guess one of the frustrations was the fact that our team had been assigned with the assurance that we would not be affected by changing priorities.


I'm in the middle of a rewrite, but because my employer also wants me to do work on the existing application, I end up doing a lot of refactoring as well - if only for my own sanity and to help me understand the existing code.

We're talking about an app that's only eight years old (continuously developed during that time), built in PHP and JS, but by someone who never bothered to learn web development properly. The result is a lot of copy/pasted code, string concatenation to produce HTML, etc. The problem isn't so much that; it's that the original developer was also highly productive, so the codebase exploded to nearly 200K LOC.

They recognized the existing UI was not going to work, so after the previous developer left, they put out a job advert through which I was recruited. The job: to rewrite the application with modern technologies, with things like structure and tests built in this time.

I was okay with that; I mean, I could probably do some things with the existing UI, but it would be very labor intensive, and if it was ever finished, it would still be a mid-2000s-looking Dojo app.

I've (for now) decided on Go and React, deploying it as a self-contained application (our distribution is via RPM packages on bare metal or virtual servers). Not sure about Go at the moment; most work is straight data manipulation (think REST API, database and writing XML files), and I'm struggling a bit with things like data consistency (because Go won't force you to set all struct fields; mind you, neither does Java, but there you can have Lombok generate constructors and the like).

Anyway, my problem right now is that I was hired under the premise that maintenance of the old app would be like a day a week at most. Of course, in practice they're asking me to add new features to the existing application, so I've been spending over half my time on it instead, not making much headway with the rebuild.

My fear right now is that they'll scrap the new UI in favor of just keeping the old one around. But there was an article a while ago that pointed things like that out, I believe it was titled something like "why rewrites fail". Very close to home.

I mean I still believe a rewrite will be beneficial, but it's going to take at least another year and a half I think before it's feasible to replace the existing UI.

Should have a chat with the boss whether we can hire more people to work on it and / or the existing application.


If it hits the same database, build the new features in the new stack, and launch them. Users should click from the old to the new without even realizing. You have to figure out some details like sharing sessions, but this way avoids adding onto the old application. Otherwise you'll never catch up.


My last 3 clients wanted me to rewrite their React Native apps back to native iOS and Android, in separate codebases. I'm not sure if this is a trend, but their common complaint was that they were sick of the performance of the app and that each platform had a unique bug. Also, one of these projects written in RN took about a year to develop.


I think everyone is missing the big point here: the current implementation was in an untyped language, so refactoring was just too hard. Use languages with good type systems and then refactoring is easy, and always the correct technical choice.

Let me say that again, the killer app of type systems is that no mistake can "total" the code.


The lack of static typing does not make refactoring incredibly hard or difficult. In fact, I would argue it's even easier, due to the way dynamically typed languages pass around data. Really, as long as you follow some basic rules and don't propagate too much complexity into your system, refactoring is a breeze. When I do it, the types are almost inconsequential.

The things I do worry about when refactoring is developer-induced complexity. E.g. "This random corner case in this function returns a null, but I don't see the consumer handling nulls." At that point, it's hard to tell their intention and whether that null blowing up is handled way further down the line in some unrelated function as part of the normal execution flow. I.e. sometimes that behavior is expected, and my "fixing" of it would introduce a bug. That is what causes regression bugs when refactoring, not the types. Sure, one or two "type typos" do slip through the cracks, but that's inevitable, and one quick run-through of the system picks that up relatively easily.
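A minimal sketch of that kind of trap (hypothetical names, not code from any real system): the helper silently returns None in a corner case, and it's up to the consumer to decide whether that is expected flow or a latent bug.

```python
def lookup_discount(code):
    """Return the discount rate for a promo code."""
    discounts = {"SUMMER": 0.10, "VIP": 0.25}
    # Corner case: unknown codes return None instead of 0.0.
    # Nothing in the signature tells a refactorer whether
    # callers rely on that or simply never hit it.
    return discounts.get(code)

def final_price(price, code):
    # A consumer that *does* handle None. If this check were
    # missing, an unknown code would raise a TypeError far from
    # the real source of the problem.
    rate = lookup_discount(code)
    return price * (1 - rate) if rate is not None else price
```

Whether removing or "fixing" that None is safe depends on intent that the code alone doesn't record.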

Trust me, once you spend a non-insignificant amount of time developing, refactoring and debugging software in a dynamically typed language, you start realizing that static types are more of a crutch for the compiler than an aid to the developer. Especially in more object-oriented languages with complex types and inheritance hierarchies. Think C#, Java, et al. with their "abstract base classes", interfaces, interface-inheritance, virtual override methods, method overloading, type casting, etc.


Any halfway decent static type system would capture that nullability and prevent accidentally using a null value without checking, so you seem to have slightly undermined your own argument here.


Static type systems don't necessarily prevent accidental null value usage. Not sure where you get that impression?

However, for initialized variables, null (or None) was the example I used, as it's the common one in Python and conveys the point rather than language details. All of the "decent static type" systems I'm aware of have the same issue with undefined values that break behavior, e.g. zeroes as integers, empty strings.

Heck, even in C#, where you have a Nullable type, it gets abused more often than used legitimately and is seen as a nuisance by most developers. Not only that, but even with that static type system you mention, uninitialized variables are a big problem. Hence all the null-reference exceptions that are all too common.


> Static type systems don't necessarily prevent accidental null value usage.

The better ones do.

Some languages with static type systems, notably C and its descendants, have reference or pointer types that are nullable by default. With the wisdom of hindsight, that design decision is regrettable; Tony Hoare himself famously called inventing null references his “billion-dollar mistake”.

There are safer alternatives. For example, you can have a type that makes optionality explicit, so it contains either nothing or a single value of some known type. Before you can work on the contained value, if there is one, you must deliberately extract it; the type system will prevent you from accidentally using the optional value in place of the contained value. In Haskell, this type is called Maybe a. Rust has Option&lt;T&gt;. In OCaml, it's 'a option.

> All of the "decent static type" systems I'm aware of have the same issue with undefined values that break behavior. E.g. zeroes as integers, empty strings.

Again, with a sufficiently expressive type system, you can encode properties such as a list being non-empty in your types. This lets you prevent illogical actions like trying to take the head of a list with nothing in it. You sometimes see these techniques if you’re working on high reliability systems with formal verification.

You can also handle edge cases safely by replacing a partial function that is undefined for certain inputs, such as dividing by zero or taking the head of an empty list, with a total function that gives you back an optional value as described above.
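As a rough sketch of that last point in Python (a gradually typed setting, assuming a checker like mypy is enforcing the `Optional` annotation): division becomes a total function by returning an optional value instead of raising on zero.

```python
from typing import Optional

def safe_div(num: float, denom: float) -> Optional[float]:
    """Total version of division: defined for every input,
    signalling the undefined case with None instead of raising."""
    if denom == 0:
        return None
    return num / denom

result = safe_div(10, 4)
# Under a strict checker, using 'result' directly as a float
# (e.g. 'result * 2' with no check) would be flagged; the None
# case must be handled before the value can be used.
doubled = result * 2 if result is not None else 0.0
```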


> "This random corner case in this function returns a null, but I don't see the consumer handling nulls."

Modern type systems tend to incorporate handling of null values in the type system though.

I can recommend looking at Crystal. You will find that the overhead of providing sufficient types is pretty small.


I mainly work in languages with good type systems. But I've also contributed a bunch elsewhere: https://github.com/mesonbuild/meson/pulls?q=is%3Apr+author%3... I can assure you that, despite all the unit tests, refactoring just takes way, way longer.


More expressive type systems might make refactoring safer, but they don’t necessarily make it easier.

The former happens because it’s harder to change something encoded in the types accidentally.

For the latter, it should be easier to change something encoded in the types deliberately, but often the opposite is true.


Having a type system enables zero-risk automated refactorings, even complex ones, and makes them a no-brainer. Not having that builds a reluctance to do even simple refactorings.

A good example is the simplest possible refactoring: renaming things. I was doing this in PyCharm on a simple Python project the other day, and it proposed modifying just about all dependencies on the classpath because it couldn't tell apart things that were in scope and out of scope of the refactoring. I've seen similar things happen on JavaScript and Ruby codebases. Renaming things is a PITA in those languages. Not safe at all.

On any Kotlin or Java code base I do this all the time without thinking twice. I rename stuff, I move stuff, extract variables, auto fix things, etc. It just happens. A rename is a complete non issue for that. Doesn't matter if it's a local variable or the package name of your entire code base. You can trivially modify thousands of lines of code with a keystroke without breaking stuff.


It seems to me that what you’re talking about there has more to do with having clear rules in the language for scope and modularity than to do with the type system.


Those clear rules are called the type system. The fact that it's static means the same stuff the compiler uses to tell what is what may also be used to build syntax trees that transform your code base from one valid state to another. It's impossible to do that with dynamically typed languages; at best you get some partial guarantees combined with some string replacing.


A crude but correct algorithm for renaming all instances of an identifier that refer to the same entity (variable, function, type, etc.) could be something like:

1. Locate all occurrences of that identifier in your code base.

2. In each case, determine whether this is the place the underlying entity is defined or a reference to an entity defined elsewhere. If defined elsewhere, locate that definition.

3. Change all occurrences of the identifier that relate to the same definition as the one you started with.

If you have clear rules for things like the scope of an identifier and how identifiers may be imported and exported across modules, there is nothing in that algorithm that is necessarily specific to static or dynamic types.

It’s true that knowing the types statically can make a difference in some cases. A common example would be object.method notation, because there the context matters: the method being identified depends on what type of object you have. If you can’t identify the type of the object in some way, via a static type system or otherwise, then maybe you can’t identify the method and its underlying definition either.

However, it’s worth noting that in these sorts of late-binding environments, the operation of renaming all occurrences of an identifier that relate to the same definition probably isn’t well-defined anyway. Before you can automate a refactoring operation, you need to specify exactly what it means, and in a situation like this, the specification is ambiguous.
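A small Python sketch (hypothetical classes) of that ambiguity: the call site below doesn't commit to either definition of `close`, so a rename of just one of them has no well-defined effect on it.

```python
class FileHandle:
    def close(self):          # candidate for renaming to, say, 'release'
        return "file closed"

class NetworkSocket:
    def close(self):          # same name, unrelated definition
        return "socket closed"

def shutdown(resource):
    # Duck typing: 'resource' may be either class (or anything
    # else with a .close method), so this single occurrence of
    # 'close' relates to two different definitions at once.
    return resource.close()
```

A rename tool must either touch both definitions, guess, or ask; there is no one correct answer to automate.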


No, it's really easy: change a definition's type, and then fix each type error. The errors give you a guided tour of the codebase that's impossible to get without static type checking.


You still need to fix all of those type errors, though. As we encode more information within our types, the effort to maintain them naturally tends to increase as well.

Your profile indicates that you work with Haskell. In Haskell, we often encode possible effects explicitly in types, while many other languages do not restrict effects to the same degree. This provides a degree of safety in Haskell that those other languages lack. However, it also means that if refactoring moves the place where some effect can be caused, that may require a change to the types that propagates widely through the system.

For example, suppose we have a system where the high level code is wrapped in some logging monad. We decide to refactor so that some of the log writes move to a much lower level, perhaps so we can then add more detailed information to the logged messages. At this point, the entire call chain down to where the logging will be done is infected by the logging monad. This is perfectly correct in terms of type safety. It is also work that would be entirely unnecessary if we were performing an equivalent refactoring in a language that did not encode so much information in its types in the first place.


Yes, I work in Haskell. As you say we have lots of types---and type errors. But refactoring is still insanely easier. I'm kinda infamous for bundling a refactor with every feature, in fact, because the extra work is just so minimal I cannot help myself.


This blog post doesn't seem to mention how long the process took - that's what I'm most interested in!


Yes, this is the key warning of Spolsky’s post [1]: The rewrite team had invested years of work and still had nothing shippable.

[1] https://www.joelonsoftware.com/2000/04/06/things-you-should-...


Rewriting has a concave payoff. It requires high levels of certainty to be valuable. The bigger the rewrite, the more likely you're less certain than you think.

Refactoring is the same as rewriting, just on a smaller scale, which increases certainty.

Joel's advice assumes you will overestimate your certainty.


Some rewrites can be easy because a better technology exists that makes it easy. For example, you might have a NoSQL DB with lots of hand-rolled queries, and then you get a query language for it, so now you can rewrite a great deal of code. You still hand-roll transactions because eventual consistency, CAP, and all that, but then you get a SQL DB that... gives you all of that anyways but with good ole SQL (with eventual consistency, not ACID, but hey, CAP), so now you can rewrite the queries and the transactions, and maybe that's the lion's share of the app, so you've essentially rewritten the app. What about the REST layer? Well, you get inspired by PostgREST and write a similar tool for your new DB and there's nothing left of the original.


I would argue that a rewrite is slightly more difficult than a refactor. Refactoring means, after all, making the current system better, while a rewrite means no updates/changes/features on the current system and more pressure to get it done.

I would also say that a good team with experienced people will succeed in both cases, while a less experienced team will probably have a higher chance of pulling off a refactoring than a rewrite.

I like refactorings better. It allows me to optimize a system slow and steady.

I like rewriting better if you don't need to support the old environment while doing the rewrite and you have experience in the old system.


> a handful of engineers focused on machine learning (ML). [...] We leaned into a singular moderator talking to a group of people.

I am curious about why an early chat/conference company needs so many ML Engineers? What are you using them for?

(the Remesh mentioned in the article seems to be https://remesh.chat/ and not https://remesh.ai/, based on previous articles, the logo and blog domain)


More often I hear things like "oh I don't have experience with X, so let's not use it". I mean, programming languages are easy; frameworks are hard. Learning how to use Django (which is pretty easy as far as frameworks go) seems harder to me than using React (assuming I know that already) with TypeScript instead of JavaScript. I would probably be really happy to use TypeScript instead.

If Elixir was a really bad match because it was hard to use the required libraries, OK I can understand that.


> programming languages are easy, frameworks are hard

That's been my observation as well, but I find myself surrounded by folks who insist that any framework should be a ten-minute tutorial away from mastery (even people who ought to know better).


I don't know if this is off-topic or not. More times than I care to admit I've lost hours of coding due to crashing without saving first. Without exception when I redo the lost code it's far better designed than the original and usually (not always) in a lot less time. I'd almost be inclined to recommend always throwing away your first completed effort and rewriting it while your mind's fresh.


A really undervalued idea in software development is architecting software for rewriting it in the future. If you make your software small and well defined, then it becomes a reasonable investment to simply rewrite it from scratch.

This is a great talk about this idea: https://vimeo.com/108441214


Optimise for Deletion! I always say this, but it's difficult to get people to understand just how important a concept this is.


The decision should probably also consider alignment with the business strategy. Developers tend to rewrite because of tech challenges. But I agree with the conclusion: a rewrite should be considered if the current system is impacting the strategy.


> ... We had very poor test coverage in the areas most in need of refactoring because they were the oldest code, written before we established good test practices.

That's probably not a coincidence. I like to ask those most strongly opposed to test-first development how they refactor code. I have yet to get an answer that leads to anything other than a rewrite at some point.

Of course, the rewrite might be coming for the test-first codebase as well. But in that case, the test suite can be leveraged to great effect as a detailed specifications document. This is one of the benefits of test suites that doesn't get enough attention.


How is this a test-first argument? You can write tests at any time; bad discipline is bad discipline. I've never worked somewhere that didn't write tests because "they didn't write them first"; they didn't write them because they were an immature organization.

My current job leaves it up to the developers, and most don't write tests first. It's no miracle that, despite us not writing tests first, all changes have their unit tests, integration tests, and end-to-end tests. This is due to the fact that it's baked into our SDLC and enforced in code review.


> You can write tests at anytime

Actually, every single Java enterprise application I've worked on for the past 20 years has been test-resistant: the developers declare everything "public static" so none of the application can be run without running all of the rest of it (since each class imports another that transitively imports everything else, all with static initializers that read configuration files and connect to live database instances). Every refactor/rewrite I've ever suggested has been specifically to allow tests to be written.


Wait, this is confusing. By test-first do you mean TDD? Even if you don't do TDD you gotta have tests before refactoring, I've never seen anyone writing tests after refactoring. I don't think this is controversial at all. Anyone please correct me if I'm wrong.

(I've seen refactoring without testing but that's another matter)


AngularJS seems like the number one culprit causing people to throw up their hands and go with the rewrite over the refactor approach.


Refactoring/rewriting is the MOST rewarding task that can be undertaken with both legacy and new code. I have done it since the start of my career (when I didn't even know the name for it).

And that includes, regularly, rewriting to different langs (for example FoxPro -> C#, C# -> Python -> F# -> Swift (aborted) -> Rust (noob) -> Rust (now I know Rust, almost!)) <-- yes, the same project, years on it.

A lot of FUD exists against this. Is it risky? Is it hard? Can it be messed up? Is everything stacked against you? Yep, LIKE ANY OTHER CODING ACTIVITY.

----

Things that help make this a success (assuming, of course, an average, half-decent developer. But I have done it when I didn't even know anything, so who knows if it works?):

CRITICAL:

- Have source control

- Have a task manager (mine: pivotal)

- Split stuff in small tasks (couple hours max)

- Prioritize knowing the data (schemas) and the business's REAL priorities (because all the time, everything will be declared of imperative importance). The code? Less so. If the code were good you wouldn't be considering this at all.

- This can be done in the small or in the large. So, a full rewrite can totally be done in hours. If everything is so tangled that this is not true, then the code is beyond salvation: REWRITE.

Nice:

- If possible, have the old code deployable in one go

- The database (or data in general) is kind of easy to grasp (if not, everything is a mess: REWRITE). With a good database, a rewrite is far more profitable than refactoring messy logic.

- Have good tests that can be used to validate it? Great. I rarely have that luxury.

To become good:

- Rewrite/Refactor is a muscle. The more you do it, the better you become.

- Do A LOT of MICRO side projects or experiments. Trash them, rewrite them, explore them. Get your mind used to it (note: this is not "ruin my life, only code yo!"; it's to start a major idea as a micro side project before committing bigly).

Everyone knows the second time your code is better. Imagine if you do 5 rewrites in succession. That code will be celestial!

- Be confident in being able to read code. Read code. And read more. Much more if it's a new lang.

- Ask questions, and ask the same question differently.

- Pick a project that you can do in your sleep (mine: pseudo micro ORMs) and start with that when facing a new lang or framework. In a side project. It must be small (a few files, at most).

- If changing lang or framework, be aware that some are BETTER than others. Learn to know why, and when to start using them.

- Dedicate a few minutes learning useful stuff.

- Do the HARDEST stuff first, or at least a significant piece of it. If the hard part is done, the rest is easier!

- Have a solid selection of companion tools that you RARELY need to change (mine: Sqlite, Postgres, nginx, ubuntu (deploy))

The MORE messed up things are, the BIGGER the payoff:

- No tests, or they're all useless? You start writing tests and everything is tangled? Rewrite.

- Like a stupid idiot you are 4 months into a refactoring (doing stuff properly and all) and all paths lead to doom? Rewrite (you probably felt it 3 months ago. Idiot me).

- Your testing (manual, automated, whatever) takes X times longer than writing the code? Rewrite.

- The customer is on fire, nothing works, everything is delayed? Rewrite.

- Everyone in the path of the requirements is even more lost than you? Rewrite.

---

Surely some stuff is conveniently ignored in this list: having a customer/team that accepts the rewrite.

Look, if the customer/boss/team is THAT bad, then NO MATTER what you do, you ARE SCREWED.

But I'm lucky, and having worked in situations with serious trouble (but with decent humans around, even if they couldn't help much and I had to figure out a lot on my own) has taught me that people are more accepting of rewrites than some might think. Even delayed projects can be tolerated if there is a good flow of improvements.

P.S.: This is my experience in my sector, and the few I have done in a mid-size startup...


I think Google even has a policy that almost all parts of the code in their mono-repo need to be rewritten (not just refactored) every 2 years or something like that.


I guess nobody felt like rewriting Google Reader!


Not "significantly technically challenging" enough, so no promotion opportunities there.



