> If the two pieces of code are likely to change in different ways, for different reasons, that is a strong reason not to extract it into a function even if they happen to be character-for-character identical for the moment.

The future is much more malleable than your immediate needs.

But even if your prediction about the future turns out to be true, the code you need to change is by then already extracted into a function, so you can easily duplicate that function, make your localized changes, and point the relevant callers at the new function. So this is still the best route.
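
Concretely (a made-up sketch): once the shared helper exists, forking it later is cheap.

    # Shared helper, extracted while the two call sites still agree.
    def format_receipt(total):
        return f"Total: {total:.2f}"

    # When one path diverges: duplicate, make the localized change,
    # and point the relevant callers at the copy.
    def format_invoice(total):
        return f"Total due: {total:.2f} (net 30)"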

> Less code means fewer bugs, but that doesn't mean I should be working on the gzipped representation.

Don't be absurd. gzipping doesn't preserve your program in human-readable form. Extracting code into a reusable function makes your program more human-readable, not less.


In the parent comment, you said there was no reason. I maintain that it's a strong reason. It may or may not be a sufficient reason, weighed against other considerations. In particular, if it harms readability that's also a strong consideration.

But sometimes it can help readability, too. "DRY" as a principle was originally formulated in terms of repetition of pieces of knowledge rather than code, and I think in those terms it's far more useful. If this code represents "how we frob the widget" and that code represents "how we tweak the sprocket" and there's no reason for those to agree, they should probably be separate functions. Pulling them out into a "tweaking_sprockets_or_frobbing_widgets" function is making things less readable, because it's conflating things that shouldn't be conflated. If there is not some underlying piece of knowledge - some statement about the domain or some coherent abstraction that simplifies reasoning or some necessary feature of the implementation - combining superficially similar things is just "Huffman coding".
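
To make that concrete with a toy sketch (names invented): the two bodies below could be character-for-character identical today, yet merging them would conflate two pieces of knowledge.

    def frob_widget(widget):
        widget.level += 1     # "how we frob the widget"

    def tweak_sprocket(sprocket):
        sprocket.level += 1   # "how we tweak the sprocket"

    # A merged tweak_sprocket_or_frob_widget(thing) would couple two
    # things that have no reason to change together.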


> Extracting code into a reusable function makes your program more human-readable

When done properly, yes. When done to the point where a five-line function is created with ten inputs (yes, this is real), no. But DRY tells us that the five lines of duplication are unconditionally worse.
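
Something like this hypothetical sketch, where every difference between the call sites has leaked back in as a parameter and the calls read worse than the duplication did:

    def process(path, mode, retries, timeout, encoding,
                strict, logger, on_error, headers, dry_run):
        ...  # five lines of "shared" logic steered by ten knobs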

Hell, I've even seen things like logging/write tuples (i.e. log the error, write it to a socket) encapsulated, even though the only non-parameter code ends up being the two function calls.
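
Roughly this shape (made-up names):

    # The "abstraction" is nothing but its own parameters.
    def log_and_write(logger, sock, msg):
        logger.error(msg)
        sock.sendall(msg.encode())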

Anything taken to extremes is bad. The problem with DRY is that it encourages that extremism.


> But DRY tells us that the five lines of duplication are unconditionally worse.

I agree that's often how DRY is understood, and that it can be a problem.

It is not how DRY was originally formulated, which was "Every piece of knowledge must have a single, unambiguous, authoritative representation within a system." This differs from blind squashing of syntactic repetition in two important ways. First, as under discussion here, if things happen to be the same but mean different things, combining them is not "DRY-er". Second, there can be repetition of knowledge without repetition of code. For instance, if we are telling our HTML "there is a button here", and our JS "there is a button here", and our CSS "there is a button here", we're repeating the same piece of knowledge three times even though the syntax looks nothing alike.
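
The same distinction holds even within one language. A minimal Python sketch (the example is invented): without the constant, the knowledge "uploads are capped at 10 MB" would be repeated in the size check, the error message, and the client-side check, with no two copies looking syntactically alike.

    MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # single authoritative representation

    def validate_upload(data):
        if len(data) > MAX_UPLOAD_BYTES:
            raise ValueError(f"uploads are capped at {MAX_UPLOAD_BYTES} bytes")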

I make no claim as to whether the flawed, more common understanding or the original intent is what "DRY really means", but I think the latter is more useful.


This is correct, and what follows should not be taken as an argument against it.

DRY as a guiding principle sometimes has a secondary beneficial effect that was not discussed. Two pieces of code that happen to be the same but "mean different things" should not automatically be deduplicated by dumb extraction. However, the fact that those two things share code may, when viewed through the lens of "prioritize-DRY-ness", hint that the two share a common underlying goal, which can be abstracted out into functionality that can be used by both.

Put another way: if the code to control a nuclear reactor circuit and the code to turn on a landing light on a plane happen to be the exact same, they shouldn't be blindly deduplicated into some library function, but the fact that they're the same may indicate a need for a more accessible, easily-usable-without-mistakes way of turning that kind of circuit on and off.
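
As a sketch (everything here is hypothetical): the shared piece isn't one toggle used by both, but a safer primitive that each side builds on independently.

    KNOWN_CIRCUITS = {"reactor-scram", "landing-light"}

    def _drive(circuit_id, energized):
        ...  # hardware access elided

    def set_circuit(circuit_id, energized):
        # The common, hard-to-misuse primitive both domains rely on.
        if circuit_id not in KNOWN_CIRCUITS:
            raise ValueError(f"unknown circuit: {circuit_id}")
        _drive(circuit_id, energized)

    def scram_reactor():         # owned by the reactor code
        set_circuit("reactor-scram", True)

    def landing_lights_on():     # owned by the avionics code
        set_circuit("landing-light", True)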


> When done properly, yes. When done to the point where a five-line function is created with ten inputs (yes, this is real), no. But DRY tells us that the five lines of duplication are unconditionally worse.

I'm not convinced by your example. There are plenty of mathematical calculations taking numerous parameters that I think should be in a distinct function.

Even for non-mathematical calculations, 5 lines of code that are used repeatedly as some sort of standard pattern in your program should also get factored out. Take your logging example: if you consistently log everything in your program the same way, then sure, refactor that into a shared function. Then if you suddenly find you need to log more or less information, you can update it in one place.
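
E.g. a minimal sketch (the names are invented):

    import logging

    def log_event(logger, event, **context):
        # One place to update if we later need more or less information.
        logger.info("%s %s", event, context)

    log_event(logging.getLogger(__name__), "user_login", user_id=42)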

Of course, I understand your meaning that sometimes factoring out doesn't make sense, but if you find the same code repeated more than twice, then as per DRY, refactoring seems appropriate.


I have another example.

In web frameworks, there is usually a little bit of boilerplate for each view.

You could refactor this completely away, but not without an almost total loss of flexibility and a good amount of readability too. Often the views will look very similar, then start diverging as the project grows.
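
For instance, in a Flask-style app (a rough sketch; the routes are invented), the views share a few boilerplate lines today but are exactly the kind of code that diverges as the project grows:

    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/widgets")
    def list_widgets():
        if not request.headers.get("X-Token"):   # boilerplate...
            return jsonify(error="unauthorized"), 401
        return jsonify(widgets=[])

    @app.route("/sprockets")
    def list_sprockets():
        if not request.headers.get("X-Token"):   # ...identical today,
            return jsonify(error="unauthorized"), 401
        return jsonify(sprockets=[])             # ...likely to diverge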

With you on refactoring common patterns out, and yes, some people don't do this enough. But really, the important thing there is that those patterns are truly common to a large degree and should stay in sync - so it's worth it to introduce and maintain a new concept to keep them that way.


You DRY up code by writing indirection. That's the expense of all abstractions. You can't possibly believe that all indirection is worth it at any cost, so I'm not sure what point you're belaboring.


> You can't possibly believe that all indirection is worth it at any cost

I think I've been pretty clear about the costs and when this is worth it, particularly in my first post in this subthread, which I'll quote here:

> That said, the moment you have to start adding parameters in order to successfully factor out common code, i.e. parameters to select the correct code path depending on caller context, that's when you should seriously question whether the code should actually be shared between these two callers. More than likely in this case, only the common code path between the two callers should be shared.

Or if you want a more concise soundbite: refactor if your indirection is actually a clear and coherent abstraction.
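
The smell in the quote, as a toy sketch (hypothetical names):

    def strip_interactive(document): ...
    def layout(document): ...

    def render(document, for_print):
        # The flag exists only to pick a code path per caller: a hint
        # that only layout() is genuinely shared.
        if for_print:
            strip_interactive(document)
        layout(document)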


> When done properly, yes. When done to the point where a five-line function is created with ten inputs (yes, this is real), no. But DRY tells us that the five lines of duplication are unconditionally worse.

I work with people I would describe as... junior at best (lots of boot campers) and I see this all the time. Functions that just return an anonymous function for no reason, functions that take a single string and return a mostly-constant JSON blob instead of just repeating the blob in the code, etc.
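
For example (invented, but representative of the pattern):

    # An anonymous function returned for no reason...
    def get_greeter():
        return lambda name: f"hello {name}"

    # ...where a plain function says the same thing:
    def greet(name):
        return f"hello {name}"

    # And the "half JSON blob" helper taking one string, where inlining
    # the dict at each call site would read more clearly:
    def make_payload(kind):
        return {"version": 1, "type": kind, "flags": []}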


>The future is much more malleable than your immediate needs.

http://wiki.c2.com/?YouArentGonnaNeedIt


You Aren't Gonna Need It.

That’s a popular claim. I wonder how many failed projects could have it as their epitaph.

Have you ever worked on a project where the requirements changed so fundamentally from one day to the next that you truly, honestly had no idea where you were going next?

I haven’t. I’m not aware that I’ve ever met anyone else who has, either.

The claim that requirements always, or even usually, change so dramatically within such short timescales that it isn’t worth laying any groundwork a little way ahead simply doesn’t stand up to scrutiny, in my experience. Any project that was so unclear about its direction from one day to the next would have far bigger problems than how the code was designed.

At the same time, there is always a risk that by being too literal, ignoring all of your expectations about future development regardless of your confidence in them, you climb the mountain by walking to one small peak, then down again and up the next slightly higher peak, and so on. This could be incredibly wasteful.

Of course requirements often change on real world projects. Of course I’m not advocating coding against some vaguely defined and mostly hypothetical future requirement five years in advance. But often you will have some sense of which requirements are going to be stable enough over the next day or week or month to base assumptions on them, and insisting on ignoring that information for dogmatic reasons just seems like a drain on your whole development process.


>That’s a popular claim. I wonder how many failed projects could have it as their epitaph.

Far fewer than the over-ambitious projects that died because of things they didn't need, immortalized in lots of classic comp-sci literature, from Fred Brooks' books to Dreaming in Code.

There's a reason it's a popular claim. Granted, "popular" just means repeated by many -- but this claim (or an analogous one, e.g. the KISS principle, "Do the simplest thing that could possibly work", etc.) is repeated by the most experienced and revered programmers, from the Bell Labs guys to the most celebrated programmers today.

>Have you ever worked on a project where the requirements changed so fundamentally from one day to the next that you truly, honestly had no idea where you were going next? I haven’t. I’m not aware that I’ve ever met anyone else who has, either.

Welcome to my life :-)

Not being snarky -- rapidly changing requirements are the number one complaint in my kind of work.


Well, I didn’t say there was only one way a software project could fail! My point is simply that I believe anticipating and allowing for future requirements is a matter of costs and benefits. It’s about comparing the cost of making a wrong step and then having to backtrack with the cost of following a circuitous route to the final destination instead of a more direct one. Both are bad if we make the wrong choice, and we can’t see the future to make an informed decision about the right choice, but we can at least look at the expected cost either way and make an intelligent decision in any given case.


If it's not actually re-used then it's just making me jump around to see what's actually happening rather than reading straight through the code.


> If it's not actually re-used then it's just making me jump around to see what's actually happening rather than reading straight through the code.

Firstly, you only actually create a function either when it is being reused, or because its functionality is a logically separable responsibility, and so you factor it out for understandability.

Either way, the function should also have a meaningful name describing its purpose so you don't have to jump around to understand what's actually happening.


> The future is much more malleable than your immediate needs.

In practice I'm not really sure what you mean?

A good rule of thumb on large-scale software projects is that complexity begets complexity.

It then takes a lot of effort to return to simple code.

> Extracting code into a reusable function makes your program more human-readable, not less.

Is that a given?

I've come across plenty of small functions that I couldn't understand without checking the calling functions for context.


> In practice I'm not really sure what you mean?

Meaning, your future needs are ever-changing and often unclear. Your present needs are immediate and usually obvious. Meet your present needs first and foremost without sacrificing the flexibility to meet future needs. Factoring code into functions accomplishes this.

> I've come across plenty of small functions that I couldn't understand without checking the calling functions for context.

Sure, happens to me too when I don't assign meaningful names, or the functions don't actually encompass a small, meaningful set of responsibilities, or the functions use deep side-effects that require reasoning within larger contexts.
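
E.g. a toy sketch of that last case:

    _state = {}

    def bump():
        # Two lines of body, but meaningless without knowing which
        # callers touch _state and in what order.
        _state["n"] = _state.get("n", 0) + 1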

The problem with such programs isn't factoring into functions though. If anything, this step reveals latent structural problems.

