Cute little story, but the moral the author takes away from it is actually wrong.
The factory didn't remove the onions from the recipe because no one could remember why they were there. They were removed because Levi asked why they were there, and when no one remembered, he investigated the original purpose of the onions and determined after that investigation that they were no longer needed.
I'm sure many of us have been bitten by this as developers. You see a few lines of code or a feature and you have no idea why it's there, so you remove it. What happens? A random, seemingly unrelated application starts failing, or an angry customer calls wondering why some feature is no longer available. Whoops...
So my takeaway from this story was actually quite different from the author's. I took: "Before you remove the onions, make sure you understand why they were added in the first place."
Closer examination revealed that the switch had only one wire running to it! The other end of the wire did disappear into the maze of wires inside the computer, but it's a basic fact of electricity that a switch can't do anything unless there are two wires connected to it. This switch had a wire connected on one side and no wire on its other side.
It was clear that this switch was someone's idea of a silly joke. Convinced by our reasoning that the switch was inoperative, we flipped it. The computer instantly crashed.
Before you remove the onions, make sure you understand why they were added in the first place.
Unfortunately, while this is correct, it is nowhere near strong enough.
Just because you know the initial hypothesis that drove someone to throw the onions in doesn't mean you know what the onions are actually doing. The onions are interacting with all the other ingredients. Onions are more complicated than you think.
Before you remove the onions you must understand what they are actually doing in your recipe, not merely what you think they are doing... or you must be prepared to find out. Because even very tiny systems are generally too complicated for you to fully understand, you usually have to settle for science: You must have good enough test coverage to detect any breakage as soon as possible after the onions go out, so that you may frantically throw the onions back in and then go back to the drawing board.
And the longer the onions have been in the recipe, the longer the recipe has evolved in the presence of onions. Take the onions out and all the other changes that have happened since the onions went in might have to be tweaked. Oh, the subtle things you could find.
Here is what can happen: You'll take out the onions, and then five years later your customers will complain that your company's red paint has started peeling, ten years ahead of the fifteen-year warranty period. Oh, no. You call an all-hands meeting of your QA team. They spend months doing expensive research, while your product engineers fly from city to city trying to pacify the increasingly irate customers. Eventually the scientists find that the paint can be fixed by mixing in some kind of protein or other. But why did the paint used to work? Oops, the onions had protein in them! Sure enough, if you throw in some special Essence Of Onion Protein compound the problem goes away. Great. Now all you have to do is issue a field recall of about six years worth of your product.
You will bring this story to your CEO and the shareholders, and they will ask why you didn't just pay for some onions? Or, if removing the onions was critically important, why you didn't run a small set of onion-free test batches on a parallel process line and then put those batches through lots of tests, both in the lab and in the field, before making the change?
I've worked as a semiconductor engineer, specifically in charge of diagnosing problems that arose in the field, so believe me when I tell you that I've seen "simple", "well understood" little tweaks in a semiconductor processing recipe cost companies millions of dollars and one hell of a lot of stress. I've also lost six months of my grad student career to one specific bad step in a recipe. Once you get a working recipe you do not change a thing without a specific plan to thoroughly measure and document the effect of that change. Find someone from Intel and ask them how this works: I've heard that you can't so much as touch a knob in Intel's fabs without a signed change order.
And this is why we should write our recipes (code) carefully to convey intent, so that dependencies like this are not created unnecessarily. Consider:
"Add one onion", vs
"Dip piece of onion in mixture until it begins to fry"
The latter clearly indicates that no part of the recipe should depend on the onion or inherit properties from it, since the onion will be removed before runtime (consumption).
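A rough sketch of the same distinction in code, with made-up names (nothing here comes from any real recipe or codebase): in the first style the temporary ingredient joins the shared state, so later steps can quietly come to depend on it; in the second it is scoped so tightly that nothing outside the block can.

    from contextlib import contextmanager

    def add_one_onion(recipe: dict) -> dict:
        # Style 1: "Add one onion." The onion lives on in the mixture,
        # so any future step may silently come to rely on it.
        recipe["ingredients"].append("onion")
        return recipe

    @contextmanager
    def onion_as_thermometer():
        # Style 2: "Dip a piece of onion in until it begins to fry."
        # The onion exists only inside this block and is gone before
        # "runtime", so nothing downstream can depend on it.
        yield "onion"

    recipe = {"ingredients": ["linseed oil", "resin"]}
    with onion_as_thermometer() as probe:
        hot_enough = probe is not None   # stand-in for "has it started to fry?"
    print(recipe["ingredients"])         # the onion never entered the mixture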
This still wouldn't be sufficient in mechanical_fish's example. Someone might still remove the onions (and the needed proteins) after the introduction of thermometers with the reasoning that this would be a "simple" and "well understood" tweak.
With very old code bases, you need to start removing things, otherwise it becomes unmaintainable. Sure, you need good testing and code coverage to make sure you did it properly, but that applies to any change.
It doesn't mean that the people implementing it 15 years ago were idiots, no, it all made sense at the time. But after years of maintenance some parts grew into an abomination (losing any touch with the high-level design), and other parts are not needed because the specific gadget it was supposed to support no longer exists.
If you don't refactor, you'll eventually have a landmine-ridden place where every change has an impact on 10 different unrelated places. A company I worked at had this problem. They had to keep adding developers to handle bugfixes and feature requests, and it only got worse instead of better because they never made time available for refactoring.
"Removing code that you don't understand" is indeed wrong and shouldn't happen, but at least if you use SCM can always look back to understand why it was added. And then remove it anyway.
Believe me, that process makes sense for a factory but for source code you eventually end up with voodoo programming. The source is too complex for any human to understand, and it is impossible to teach new developers which didn't "grow into it".
Oh, I agree that you'll eventually be unable to maintain or improve a codebase built on coincidence, just as you will eventually reach the limits of a chemical process that uses onions. For example, your onion process will be inconsistent at some level, precisely because onions are a hack with a lot of potential side effects. What breed of onions, again? How big should the individual onions be? Don't onions vary according to the soil you grow them in?
So you'll probably have to take the onions out, sooner or later. But: You've got to be prepared for the real costs of that project. It is very, very tempting to tell yourself that your simple change is "obviously" going to be cheap. Particularly in software, which is not chemistry, let alone biology [1], and which is so close to being deterministic and consistent and provable -- after all, the individual components often are that simple (at least in the absence of cosmic rays and lightning strikes), and every complex program starts out simple and easy to change, and the complexity can grow so slowly that you barely perceive the day-to-day increase. So you can convince yourself that you know what's going on in a software system. And then, whoops, you make a change, and it has a side effect, which has another side effect, and then the bug reports come in, from users you may not have even realized you had.
---
[1] This is why it's great to spend a little time moonlighting as a biologist: You can practice living in a world without certainty. Biologists really don't know more than a small fraction of what's going on. Even individual bacteria are mysterious. If you've been raised on physics you will be shocked by biology. You really have to learn what controls and statistics and experimental design are about when you're trying to measure a biological system.
> I'm sure many of us have been bitten by this as developers. You see a few lines of codes or a feature and you have no idea why it's there, so you remove it. What happens? Random, seemingly unrelated application starts failing or angry customer calls wondering why some feature is no longer available. Whoops...
Yep. Tells me OP is not a developer, and has not read the article he linked. Because the title makes no sense at all.
If the priority is to not break anything, sure, don't fix what's not broken.
Now every piece of code which is there for no known reason remains a problem. If you want to keep the code base simple, better to remove either the code or your confusion (add a comment, refactor, whatever).
Whenever your product is in it for the long run. For those cases, better to temporarily break it than to eventually be unable to manage it at all. That line of code no one understands is technical debt. Sometimes it is better to pay the debt right now.
I'm currently working on a 2 million LOC program which never paid its damned debts, and I weep every day before this holy Big Ball of Mud.
If our highest priority was always "Don't break anything", we would never ship to production after the first move. "Make it better" is often a higher priority.
I think you mean this as part of the investigation into what this code does.
That is, after setting up a test/dummy system, you remove the code, run your tests and try to break it. Perhaps sticking in breakpoints or print lines or whatever..
I doubt you really mean to experiment on the business-critical live system.
That's unlikely to be very effective anyway. The most obscure code is often a bug fix for some really obscure bug. Maybe it fixes some weird bug in Swedish XP SP1, or something else you won't find just by removing it and testing. Depending on the situation, you will need to actually read and understand the code and investigate properly.
Obviously, it is not a tactic to use on a live production system. On the other hand, doing no programming at all on a live system is the prudent way to go about it.
Good test coverage helps a lot here, though: remove something and you'll get a set of failing tests to investigate.
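For what it's worth, a minimal sketch of the kind of test that gives you that safety net (the function and the legacy-feed quirk are invented for illustration): a characterization test that pins down what the mysterious line currently does, so removing it fails a test immediately instead of failing production months later.

    import unittest

    def normalize_price(raw: str) -> float:
        # This strip looks redundant... until you learn that one old
        # upstream feed still sends prices with a trailing "kr" marker.
        raw = raw.rstrip("kr ").strip()
        return round(float(raw), 2)

    class NormalizePriceTest(unittest.TestCase):
        def test_plain_number(self):
            self.assertEqual(normalize_price("12.50"), 12.5)

        def test_legacy_feed_with_currency_suffix(self):
            # The "onion": this input only ever comes from one legacy feed.
            self.assertEqual(normalize_price("12.50 kr"), 12.5)

    if __name__ == "__main__":
        unittest.main()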
Or you notice nothing breaks, so you leave it out, joyfully congratulating yourself that you reduced the total LOC count by one. Then six months later things start breaking and you have no idea why.
Refactoring and removing code needs courage and tact, but sometime it needs to be done. If you have a database that is not properly normalized, the more client code you add to it, the harder it is to normalize later.
The biggest problem with refactoring and removing code is explaining it to non-techy project managers. If you say that the code or the design is bad, you are saying the guys behind it are bad (you or the previous team), which is not very tactful. I usually try to explain that a software system is a living thing that needs some regular washing up. It also helps to explain that the next features or optimizations will take less time to implement after refactoring.
I tend to blame myself. "I didn't fully understand the problem space when I first wrote it, and now I have to rewrite it because it is a horrible mess and too complex to maintain".
"You should never remove code that you don't understand."
Never is a long time.
It's fairly easy for low-skill developers to write code that's time-consuming for high-skill developers to understand. In fact, making simple things hard to understand is pretty much the definition of bad code.
So, you use your judgement. If it's a particularly subtle and important part of the code, spend the extra time to make sure you're not missing anything. If it's not, then just rip it out and don't waste time in a maze of strange control flow, redundant code, useless invariants, and confusing assumptions.
> If the priority is to not break anything, sure, don't fix what's not broken.
The issue generally arises because things are broken: investigation leads to a piece of code which creates the breakage (everything is perfect before it, things are broken after it), nothing seems to use it (the project is naturally pretty much devoid of automated tests), you remove the code, it fixes the issue, and then you learn that prod has started failing hard.
I heard a similar one, about a woman who had learned from her mother always to cut the ends off the roast before putting it in the oven. One day, teaching her daughter about how to cook a roast, her daughter asked why, and she realized she didn't know - so she went to her mother.
Her mother said, "I always thought it was for the taste, but I learned it from your grandmother; we should ask."
So they asked the grandmother, who said, "Oh, when your father and I got married, all we had was a very small roasting pan, so I cut off the ends of the roast so it would fit."
This is basically the origin of all religions, especially 'modern' monotheistic ones.
Good, sensible ideas (at the time!) get codified. Throw in a charismatic leader of some kind who (apparently) goes around trumpeting these smart ideas. Voila, religion.
Any dead religion is by definition not a modern religion. If you look at religious movements started within the last 50 years in the U.S., there are few if any that fit that pattern.
Back to that two page function. Yes, I know, it's just a simple function to display a window, but it has grown little hairs and stuff on it and nobody knows why. Well, I'll tell you why: those are bug fixes. One of them fixes that bug that Nancy had when she tried to install the thing on a computer that didn't have Internet Explorer. Another one fixes that bug that occurs in low memory conditions. Another one fixes that bug that occurred when the file is on a floppy disk and the user yanks out the disk in the middle. That LoadLibrary call is ugly but it makes the code work on old versions of Windows 95.
That doesn't sound like it's at odds at all. The point of the article is that you should investigate why something was done in the past so that you don't keep repeating the same thing even when it isn't necessary anymore. All of the things in your quote are still required and shouldn't be removed.
Well, "If you don't remember why onions are in there, investigate carefully as to why they were added in the first place, then take 'em out if they're no longer relevant" wouldn't fit in the title!
GK Chesterton tells a similar, but sort of contradictory, story:
In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, "I don't see the use of this; let us clear it away." To which the more intelligent type of reformer will do well to answer: "If you don't see the use of it, I certainly won't let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it."
In most technology fields, it seems like Onion Theory works: we can gradually get rid of the hacks we used to get round limitations. But when you're dealing with people and institutions, Fence Theory dominates: human interactions have lots of complicated nth-order effects.
Onion Theory and Fence Theory are deeply similar. In both cases, the point is: be aware of practices that don't make sense. Either they illustrate long-ago hacks and compromises, or they tell you something important but non-obvious about the nature of what you're doing.
The advice I took from Levi's story is: consider the possibility that things you take for granted as necessary actually aren't. They could be mistakes, or artifacts of limitations that have since disappeared.
(It's particularly useful for startups to notice the latter, incidentally. A lot of existing businesses turn out to be artifacts of limitations that have since disappeared. E.g. hotels are to some extent a hack to get around the fact that it used to be impossible to search all the available space in the city you wanted to visit. If you built a whole building full of bedrooms, you could afford to pay to advertise it and to pay someone to work full time taking reservations.)
There was a This American Life episode a little while back that had a counterexample to this anecdote: sometimes there is something in your process that makes your product special, but you don't even know it until it disappears.
Audio here[1] at 34:30. A brief summary found here[2]:
"Jim Bodman, Chairman of Vienna Sausage Co. in Chicago tells the story of how the company built a brand new, state-of-the-art facility in 1970, replacing their old factory, which was actually a warren of buildings on Chicago's south side that was built up by gradually buying up buildings over the course of 70 years, until the factory complex occupied an entire city block. Once they moved into their fancy new digs, however, they faced a problem: the hot dogs weren't coming out the same. They didn't have the same distinctive red color or desired snap. They couldn't figure out what was wrong, since the ingredients, spices, cooking time, everything was the same.
After a year and a half, they still haven't figured it out...until one night, when some guys from the plant are out at a bar, reminiscing over drinks about the old days in the former plant. They start talking about Irving, a fixture at the old plant who knew everyone, whose job was to take the uncooked sausages to the smokehouse. But, given the "Rube Goldberg" layout of the old factory, it took Irving half an hour on a circuitous route to get from A to B. And they realized: Irving & his trip was the missing secret ingredient."
I think the moral of the story for programmers should be:
If you're putting something in your code that looks like a vegetable, please put a comment on it.
Ten months down the road you won't remember why that silly-looking hack is there, and neither will anyone else. It might not be applicable anymore and might be safe to remove, or it might be holding together something critical.
Several weeks ago on HN I came across this slideshow: http://www.slideshare.net/olvemaudal/deep-c which sparked some debate as to the merits of deep, complete understanding of one's environment. I believe that those who do not have a broad understanding invent their own miniature cargo cult to get by -- or perhaps it is that they are willing to cargo cult and therefore do not develop a deep understanding?
Whatever the case, it's incredibly common in the world of software.
There's a dangerous flip side to this: In my experience just about all programmers want to remove things from code that they don't understand or see the reason for. Why is that weird check there? Who thought it was a good idea to allocate memory that way? Very often there ARE good reasons for those decisions, and they can stem from bug fixes or odd special cases that may not be obvious..
Right, which is why you don't just remove the onion because you feel like it. In that story, he found out why they used to put the onion in and determined it was no longer necessary.
Moral: if you don't understand code, don't change it, understand it.
This has been said in some replies in this tree, but sometimes you can treat the piece of code you don't understand as a black box and bridge its behavior with new code, comparing both outputs for a few iterations, then remove the old code.
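A sketch of what that bridge can look like, with made-up functions standing in for the real ones: run the black box and the rewrite side by side, keep returning the black box's answer, and log any divergence; once the log stays quiet for long enough, the old code can go.

    import logging

    logger = logging.getLogger("migration")

    def legacy_discount(order: dict) -> float:
        # The code nobody understands, treated purely as a black box.
        total = order["total"]
        return round(total * 0.97, 2) if total > 100 else total

    def new_discount(order: dict) -> float:
        # The rewrite we *think* is equivalent.
        total = order["total"]
        return round(total * 0.97, 2) if total > 100 else total

    def discount(order: dict) -> float:
        old, new = legacy_discount(order), new_discount(order)
        if old != new:
            logger.warning("mismatch for %r: old=%r new=%r", order, old, new)
        return old  # keep trusting the black box until the comparison stays clean

    print(discount({"total": 150.0}))  # 145.5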
To me, this is the real point of the story. If you are doing anything at all non-obvious, then eventually it will make no sense at all, so write down why you did it.
I think it would be good to ask yourself two questions every time you write a piece of code.
1. Is it obvious what this code is doing? (For example, i+=1 is always obvious, at least in regards to what it is doing.)
2. Is it obvious why I'm doing this? (If you're in a loop, i+=1 is pretty obvious as to the why as well, but elsewhere, it might require explanation... and you probably shouldn't be using the variable name i outside a loop, but that's not the point I'm trying to make here.)
For some idea of what this looks like, I've been working on a piece of code today that uses an external program through its API to load a file and make some changes to it. I didn't document the 3-4 lines it takes to load the file and get the object I need. Why? Because it's just how you load a file. Anyone could figure that out from the documentation. It's already clear what I'm doing: loading a file. And why I would want to do that is obvious given the intent of the program: the whole program is created to perform some operations on files.
I do, however, document why I'm changing the working directory of the program to the folder the file is in. Because the program always saves to its working directory when you call Save from the API, which is not at all obvious or documented in the API.
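A tiny sketch of that distinction, with the vendor tool invented for illustration: the first comment merely restates the "what" the code already says, while the second records a "why" you could never recover from the code alone.

    import os

    def load_and_edit(path: str) -> None:
        # Pointless "what" comment: change to the file's directory.
        # Useful "why" comment: the vendor tool always saves into its
        # *current working directory* when Save() is called, and that
        # behavior is documented nowhere in its API.
        os.chdir(os.path.dirname(os.path.abspath(path)))
        # ... tool.open(path), make edits, tool.save() would go here ...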
For one rule (certainly not the only one): anything you had to figure out through trial and error is non-obvious.
It might be interesting to collect some examples of things programmers decided to document or not to document, along with explanations for those decisions.
My experience over a long career of fixing bugs is that it just doesn't matter. Comment-free code doesn't tell you why it does what it does. "Well"-commented code devolves into lies over time. Hard bugs are just hard.
What does help is keeping the voodoo out in the first place by reducing the "non-obvious" bits to an absolute minimum. Which of course is isomorphic to saying that "great code has fewer bugs". Not exactly profound. But notably it has nothing to do with commenting or documentation, per se.
Funny story, the 3rd job I had was at this large telecommunications company. I was doing consulting, but one of the jobs I had to do was make sure that these very important reports were generated. They had to be printed out and sent over the interoffice mail.
I figured, why not generate the reports programmatically and stick them on an intranet website (this was 1999 so "intranets" were becoming a buzzword). I asked my boss, he said okay go contact the people who need the report and see if they are okay with that.
I called person A, who normally got the reports, and they said "Oh, I don't need them, I pass them off to B". So then I called B and they said they passed it to C. This went on for about 5 people, and then it turned out that no one needed the reports. It was just something that was ingrained in the process, but that no one actually needed.
The funny thing is that because no one had ownership, I could neither change the process nor stop the reports from being printed in the first place. Truly Office Space-esque.
This piece of advice does not apply to legacy systems. There is a lot of stuff in there that nobody knows how it works or why it is in there, yet attempting to throw it out has led to so many disasters in the past that no one dares to even think about removing it. Of course there is no budget or political will to investigate - money comes in just fine even with the onion in the varnish.
What if you can't remember why you decided to adopt "If you can't remember why onions are in there, take 'em out" as one of your personal guiding maxims?
So true. The hardest part is taking away features from your own product. We need to learn to let go of non-crucial features. Build. Launch fast. Iterate faster.