For bonus points - add robust code folding to this, especially when the folding persists across edit sessions.
I worked on a large embedded system where all the principal developers used such an editor (there was some occam heritage). One incautious copy and paste of 'a handful' of lines, and you had accidentally replicated a complete subsystem! It made debugging 'interesting'.
Every time I think 'Oooh - code folding - what a great idea', I retrieve those memories.
More generally, any time editor cleverness is interposed between the creative idea and the source code, I get a twitch. It should be possible to work on the source code without requiring anything more than a bare-bones editor - that is all I can assume about what my favourite editor shares with whatever future developers (incl. myself) happen to be using.
Yes, that's why in my previous job I simply forced all my team to use Vim, with almost the same vimrc. Before me, some were using Eclipse, jEdit, even Dreamweaver; it was a mess, encodings included.
But to complete Jacques' rant, I'd say that while duplication is bad in the parts of the code handling logic and actions, it is not bad, and often "normal" or even "better", in configuration or content parts (e.g. templates). Factoring a list of configs to replace some duplication with logic (loops, conditionals) can be a very bad idea.
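To illustrate the configuration point, here is a small sketch in Python. The server names, addresses and ports are all invented for the example. Both versions build the same data, but the second trades repetition for logic the reader has to execute mentally.

```python
# Explicit, duplicated configuration: repetitive, but each entry can be
# read and changed on its own.
SERVERS_EXPLICIT = {
    "web1": {"host": "10.0.0.1", "port": 8080, "tls": True},
    "web2": {"host": "10.0.0.2", "port": 8080, "tls": True},
    "admin": {"host": "10.0.0.9", "port": 9090, "tls": False},
}

# The "factored" version: no repetition, but you have to run the loop and
# the special case in your head to learn what "admin" looks like.
SERVERS_FACTORED = {}
for name, last_octet in [("web1", 1), ("web2", 2), ("admin", 9)]:
    is_admin = name == "admin"
    SERVERS_FACTORED[name] = {
        "host": f"10.0.0.{last_octet}",
        "port": 9090 if is_admin else 8080,
        "tls": not is_admin,
    }
```

Every new exception to the pattern adds another condition to the loop, whereas the explicit table just gets another line.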
> Think about that ctrl-C, ctrl-V sequence next time you are about to use it.
Many people use "cut and paste" as a generic term that refers to either copy and paste or cut and paste. You have to figure out which they really mean from the context.
I know what he meant, but that's not what he said.
And that is one of the primary reasons why bugs exist: you mean one thing, but program something else.
Great article. Code expands to fully use all available editing tools. If you can see 1000 lines of code on your screen at once, you'll work with 1000 line chunks of code. If you have great tab-completion, you'll make your APIs big enough to make the tab completion useful. If you get a million lines of boilerplate every time you create a file, every file will be more than a million lines long.
What's great about ed is that you keep most of the program in your mind. There is no tab completion. There is your brain. And when you keep the whole program in your brain, you write less code and the code you write is better thought out. I wish we could go back to the days of ed, but all we can do now is consciously resist the desire to work at our tools' maximum capacity.
Then came the full screen editors. They were a bliss compared to line editors. Really. If you don't believe me try 'edlin' for a day or so. We'll see how you like that. Then you'll be wondering how come any software got written at all for the first 2 decades of computing.
I have been wondering that, actually.
On the topic of the article, luckily, the Rails community (and Python, and others) seem to have embraced this idea, under the heading "DRY principle" (Don't Repeat Yourself). There might be a connection between that embrace of DRY and the fact that Ruby/Python aficionados tend to prefer "simpler" text editors (TextMate, Vim, Emacs, Sublime) rather than IDEs.
Worth noting though that this can go too far the other way. I've definitely written things where, in my refusal to repeat myself, I've made a hard-to-read mess.
Here's an example from when I was first starting out with Ruby:
This is a pretty hard to read piece of code, despite my rambling comments. All done to avoid copying and pasting the same easy-to-follow method ~10 times.
But the problem with this code isn't that there is too little duplication! The trouble was that your grasp of the language wasn't that good and you didn't have any idea of how to organize the code.
So, would you have solved it by a more readable form of metaprogramming? Seems to me it's either that, or copypasta.
(Could still be me being a noob, but I can't see some other architectural means of avoiding a choice between either duplication or tricky-to-read metaprogramming in that case.)
Although I'm not from the punch card era, I overall agree with the sentiment of this article. I learned to code as a hobby when I was a kid and then took a CS degree. During the CS degree I was indoctrinated out of "bad habits" that I developed as a self-taught programmer. One of these bad habits was being terse. I took my degree during the golden age of createFactoryOfServerProcessGenerators() type things. Now, with more experience and wisdom, I realize I was right all along. A development environment that comfortably fits in your brain makes programming more effective and fun. And maybe I'm too idealistic, but I strongly believe the two are related.
Copy-and-paste is bad, I agree, but on the other hand, I hate it. And by 'it' I mean all of it. I have never seen a large system that could, for more than a moment in time, be seen as anything other than elephantiasis.
It's shameful and I'm sick of it. I want out. Right now, I'm paid quite well as a contractor to make some enhancements to a project developed by a hundred underpaid developers in a distant land, who did a remarkable job, considering. But it's a rat's nest -- they didn't stand a chance. And I can't look myself in the mirror anymore and say this is what I do. It's grotesque. I'd rather run a coffee shop (and indeed I've been talking about it) because at least I could go home at the end of the day having seen a little beauty here and there, instead of spending all day staring into the face of the Elephant Man.
Thanks for your concern but I'm not sure it's burnout. It's waking up. Here's my process:
1) Sure it's ugly, but look at what it can do!
2) Ok, it's ugly, but with good practices, good people, good management, you can ameliorate the worst of it.
3) Fuck it.
I've decided that this is my last contract. I have a project called Kayia, and my wife has a site called Kongoroo. I'll continue to work on those but otherwise that's it. If I'm not coding in Kayia, I'm not coding.
I completely understand your #3. We keep making excuses and rationalizing, for years and years, the fact that the overwhelming majority of software projects are fucked.
It doesn't sound like burnout to me either. I went through something similar. The solution was to admit that I had taken a wrong turn in my programming career, and commit to working only on things I believe in. It was either that or get out of the software business altogether.
I recently stumbled on them. The second one especially has some entries that are very inspiring if you are feeling burned out or largely discontent with programming.
Thanks, I appreciate the sentiment. I think it's more like this: imagine you are a house builder in a place where they make houses out of plastic spoons and masking tape. You make a great living out of building these houses, but it just doesn't feel professional, and finally you quit just because it seemed too much of a farce to continue anymore.
It's not just where I'm working, etc. -- it's all based on C. It's all a farce. (And don't mention Lisp, which is an idiot savant. Sure it can count the matches on the floor, but you can't introduce it to anyone.)
Language is clearly not the whole solution, only part of it. The greater part is, I suspect, psychological and social. We have yet to come to terms with the relationship between software development and how humans function. Given that code is written by humans for humans, that's a fatal omission. Yes, there have been attempts to talk about this (Weinberg, Peter Naur, the better parts of agile) but they're the tip of an iceberg.
(p.s. Edited for brevity. I always do that and never bother saying it, but in this case there was a concurrency conflict because someone downvoted the bloated version. Mea culpa.)
It's a very interesting point but personally I couldn't disagree more. While being human means we'll never get rid of the chaos, I think a new paradigm can take us a huge way towards managing that chaos and abstracting it away to a great degree.
I remember a lunch I had with Kent Beck in Zurich in 1998 (I think) and I was all set to pounce on him with this idea I had of a new approach and he instead pounced on me first with XP. I was utterly unconvinced, not because he didn't make a fantastically convincing argument, but I was -- and am -- convinced the failure stems from somewhere else.
You're more optimistic than I am. I think the limiting factor in software development is the human brain. It has two critical limitations: an individual mind can only handle so much, and different minds don't compose easily. Yeah, XP was absolutely an attempt to address that. Did it work? Not in my book. It doesn't take the creative process seriously enough - not even close. Instead it regards creative minds as something to be harnessed as a resource (a nicer way of saying exploited). My body rejects that.
Another way of putting this critique is that XP is designed to work inside existing companies that are incapable of tolerating the forms of organization needed to produce software well, when what we ought to be doing is starting new and much smaller organizations, which in the local dialect I believe is called "startups". Of course it's not that easy, because once startups begin to grow, they bring back the old organizational assumptions. But that just means we need more deviant startups.
You, on the other hand, think there exist paradigms of abstraction that can fit the complex systems we're trying to build well enough to make the process tractable. I hope you're right. Certainly some such forms are better than others, so others could be better still.
That's a great argument. We seriously, seriously need to go for a beer.
You've summarized my argument nicely.
How to demonstrate it? Are you asking me to put my money where my mouth is? :-) Good question. I'm working on it. For example, we need to query code. I want to see all aspects of this portion of the UI. We should only look at code in layers, as accounting systems do. For one. I could go on, but I'll leave that for sometime over a beer.
Thanks for that, and I feel horrible with my cheap shot, just hours before the legend himself passed. Lisp is brilliant and has been one of the very few lights we've had to guide us. My heart breaks at the news of John's passing.
As usual, there is merit to the opposite side too. Don't create some huge abstraction with class hierarchies and virtual methods and stuff, just to avoid cutting and pasting a ten-line function somewhere.
Often it's a good idea to start by cutting and pasting, and then afterwards figure out what ended up being sharable.
Agreed. I recall a useful rule-of-thumb that states you should wait until the third time you need a bit of code before packaging it into something reusable. Otherwise, you might be wasting your time.
This sounds like a rule-of-thumb for making sure bugs get fixed in one place but not another. :)
Seriously though, this article got me thinking: why would one live with the smell of copy-and-paste when it's just so darn easy to write a single reusable function?
I agree with you in the sense that there's no need to go from what might be a few lines of copied code to a full-blown library or subsystem. But for a small bit of common code, not refactoring it out immediately seems to me a really bad practice.
If you factor too early you end up with a function that takes ten parameters. Sometimes cut+paste functionality evolves into truly divergent code, and you have to have a "feel" for when it will happen.
You're positing a false choice; it's not either-or. The proper abstraction for a cut-and-pasted 10-line function is generally an 11-12 line function, with just enough extra complexity (i.e. an extra argument or two) to capture the differences in the two implementations.
The huge abstraction with class hierarchies is an entirely different kind of idiocy. It's orthogonal to this problem. You can (and I've seen it done) cut and paste giant class hierarchies too.
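A minimal sketch of the "one function with an extra argument or two" idea, in Python. The report functions are hypothetical, invented only for this illustration: imagine two pasted copies that differed solely in their title and separator.

```python
def format_report(rows, title, separator=", "):
    """Shared body of the two former copies.

    `title` and `separator` are the extra arguments that capture the
    only differences between the pasted versions.
    """
    lines = [title, "-" * len(title)]
    for row in rows:
        lines.append(separator.join(str(cell) for cell in row))
    return "\n".join(lines)


# The two former copies shrink to thin call sites.
def format_csv_report(rows):
    return format_report(rows, "CSV export", separator=",")


def format_tab_report(rows):
    return format_report(rows, "Tab export", separator="\t")
```

If the two variants later diverge for real, the call sites are the natural place to split them again, which is cheaper than untangling a class hierarchy.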
I really disagree with the premise of the article that IDEs encourage copy&paste. Full screen editors do. IDEs do the opposite, by making it super-easy to reuse existing code through code completion and docblock tooltips. In a decent IDE, copy&paste is often more work than finding the right method and calling it.
My first programs ran on a programmable calculator. I wish I could have kept the habits I had back then. Drawing block schemata on paper before coding seemed tedious but it was in fact valuable. Damn you lord Sinclair and your ZX Spectrum for providing a full screen "editor" which let me skip the design phase. And it didn't really have cut'n'paste in the modern sense.
I'm not sure the original ZX Spectrum editor qualifies as "full screen" as selecting a line and visually editing that line are separate tasks. It has more in common with RegEdit than with Notepad.
Does anyone know if there are tools out there that look at a code base and find "similar" snippets of code?
I would expect such a tool to parse the language into AST form and find branches that are the same except some identifiers and a few other details. It is probably intractable in general, but I think it is feasible for most code bases.
Yep; we've just had exactly this problem with a contractor. Some functionality needed tweaking, which was done on one page - but on testing we found the exact same functionality hadn't been tweaked on all the other pages.
Huh?
Turned out it was a copy/paste job... which in turn became only the start of the rabbit hole. :)
It's gratifying to get something up and running quickly. I do a lot of "copy and paste programming" to get a functional prototype doing stuff. It's helpful when I need to show something to a non-technical person (e.g. my boss) in a convincing way. ctrl-c + ctrl-v is my best friend.
Most of it is throw-away code: just playing with ideas and different implementations. Exploratory programming is cheap these days, and I really think it's for the better.
If it looks like I'm narrowing down on something I'm actually gonna use, I refactor, rewrite, simplify, and delete a lot of code. ctrl-x becomes my new best friend. (I write in natural languages much the same way.)
I understand the analogy, but on the other hand, the barrier to using a gun for killing people is nearly infinitely higher (I hope) than the barrier to using copy & paste to "just fix this little thing here".
So many times, when you need a quick solution for a small problem and you know that you have fixed this exact same problem a few weeks/months ago at a different spot, there's a huge temptation to just go and copy&paste those lines.
By doing so, you have just created debt. What if that code is using some API that you want to change a few years later? Whoever is going through the old code now has to fix up all instances of your copy & paste action.
What if the initial code contained a bug that somebody else fixed? They might not know about the copy pasta. It's very likely that they only fix the initial instance of the code and not all other places where it was pasted to, so the bug partially remains, or, worse, is later classified as a regression (which it's not).
So, coming back to your analogy, hating copy & paste is analogous to hating guns not only for their potential for killing people but also for their potential for causing accidents and for being used in any kind of potentially non-violent crime.
As the guy who is usually doing the refactorings to our 7 year old codebase, I'm the guy who is suffering from copy pasta and I'm telling you: I totally agree with the original article. I've yet to see a single instance where copy & paste of more than a single line of code didn't cause me non-insignificant amounts of additional work.
That is spot on. But blaming text editors and IDEs is a missed shot. If the author blames the editor, I'm tempted to think that he himself resorts to copy-paste because he has his editor to blame.
There is some great advice here; however, it's impractical in the real world. (I am not defending copy and paste; the DRY principle should be your number one rule.) Most (good) contractors work this way, but it's impossible to lay out your solution in its entirety and build something beautiful without specs being changed on you, or without you realizing that what you are building is not something the client wants.
What he really hates is his lack of willpower when it comes to doing things the wrong way. Cut-and-paste enabled him to be lazy, but it certainly didn't force him to be.
Correction: BAD copy/paste is one of the cancers of our profession.
Like all tools, it has its place, and there are people who tend to misuse it. Just because it's misused by bad programmers is no reason to deprive GOOD programmers of its benefits. You might as well ban screwdriver heads in power drills because lots of clumsy, inattentive people tend to strip screw heads with them. Similar misguided calls-to-arms have been raised over other useful tools such as preprocessor macros and goto.
Could you give your estimate of the properly used/badly used ratio of copy/paste, preprocessor macros, and goto? Mine would be "too low for me to worry about".
Okay, there are some (corner) cases where they really are a good idea. But "never ever use this Cthulhu Abomination" still is a damn good heuristic.
Good copy/paste: Anything boilerplate such as manually initialized structures/arrays of stuff, language-specific boilerplate (Java comes to mind here) that is essentially the same repeated shit but you gotta do it anyway, unrolling loops, setting up switch statements. Anything unavoidable (or too expensive to refactor) that's repetitive.
Preprocessor macros: Code generators, compile-time switchable code (such as logging) without filling your source files with #ifdefs
Goto: Managing complex resource allocation/deallocation within a function:
I disagree that dismissing a tool out of hand as an "abomination" is a commendable approach. If you take that attitude (or instill it in others), you'll probably never learn the proper use of such a specialized tool, leaving you ill-equipped to deal with the situations those tools handle well.
Boilerplate: your language is bad. Switch to a better one, or modify the front-end of your compiler (they probably won't let you, though, and that sucks). I mean it, the cost of boilerplate is really high.
Unrolling loops: I never saw an instance where that was necessary. Plus, the compiler can often do it for you. I know we often use high performance applications (video decoders, 3D games…), but very, very few of us write ones.
Switch statement: the syntax of the construct is heavy, we should lighten it. The rest is hardly boilerplate any more:
Preprocessor macros: I agree (I mean, I back-pedal), they are more useful than the rest. However, I still avoid them by default, as they make really good foot-guns.
Goto: your example shows exceptions (try…catch finally here). Goto makes much less sense when you have them.
Now my point isn't to never do those things at all. Only to think of them as last resorts. The "Cthulhu Abomination" metaphor helps me do that.
All languages have boilerplate somewhere. It's unavoidable. Switching languages just because there are pain points is not a solution, because you'll simply be exchanging one problem for another.
"or modify the front-end of your compiler"
This is most definitely pie-in-the-sky. In the real world of real business, you can't do this.
"I mean it, the cost of boilerplate is really high."
Oh, I agree wholeheartedly. But you need to work in the languages that programmers understand today. I'm not going to have nearly as much success hiring Smalltalk programmers as I would have hiring Python programmers, for example.
"Unrolling loops: I never saw an instance where that was necessary."
I have, but then again I've been at this for 20 years.
"Plus, the compiler can often do it for you."
You can't be sure until you look at the disassembly. Often, it does it wrong.
"Switch statement: the syntax of the construct is heavy, we should lighten it."
That's not going to happen within the next decade. Meanwhile cut & paste is my tool of choice to cut through the boilerplate.
"Goto: your example shows exceptions (try…catch finally here). Goto makes much less sense when you have them."
But it makes LOTS of sense when you don't have them. And it's still useful in C++ and Objective-C, where you still end up interfacing with C libraries. If you don't know about the goto "poor man's exception", you'll either end up with repeated and buggy deallocation code, or a monstrosity of nested scopes, or a bunch of check-return-and-throw constructs, which in the case of Objective-C will slow your program way down because it implements try/catch using longjmp.
Goto is also useful for breaking out of inner loops when the language doesn't support that (some languages support "break [label]", to make it seem less like a goto).
I don't consider these to be last resorts; I consider them specialized tools. Much like design patterns, they're for mitigating some deficiencies in a programming language, but they're only effective if you know how to use them.
However, I'd like to challenge the belief that modifying the front-end of a compiler is too hard, or unreasonable. Even in the so-called "real world" where screwing up means you're fired.
First, the point is to tweak the language, not the compiler. For instance, we may want to lighten the switch() syntax before GCC does, but we do not want to modify GCC itself if there's a simpler way.
More often than not, there is a simpler way: just write a parser and a printer for your language, so you can do source-to-source compilation by chaining them. Printers are easy. Parsers are almost as easy, except for C++.
Then tweak your parser (lighten some syntax, add some keywords…), do some pre-processing between the parser and the printer (yeah, true macros), whatever.
Now there are some caveats: such a pre-processor may confuse IDEs, and may screw up error reporting (where errors don't track back to the actual source code). I personally don't care much about the former, but to solve the latter, the base compiler needs to provide a way to be told where a given line of "source" code actually comes from (very useful for tools such as Lex/Yacc). Unfortunately, taking advantage of this will greatly complicate your pre-processor.
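For a feel of the parser + tweak + printer chain, here is a small sketch in Python, where the standard `ast` module already supplies the parser (`ast.parse`) and the printer (`ast.unparse`, Python 3.9 or later). The `swap(a, b)` macro is invented for this example and is not part of any real language; the point is only the shape of the pipeline, including carrying the original line number across the rewrite.

```python
import ast


class ExpandSwap(ast.NodeTransformer):
    """Rewrite the statement `swap(a, b)` into a tuple assignment.

    `swap` is a made-up macro name used only for this illustration.
    """

    def visit_Expr(self, node):
        call = node.value
        if (isinstance(call, ast.Call)
                and isinstance(call.func, ast.Name)
                and call.func.id == "swap"
                and len(call.args) == 2
                and not call.keywords):
            a = ast.unparse(call.args[0])
            b = ast.unparse(call.args[1])
            # Re-parsing the generated text is the lazy way to build the
            # new node; it raises SyntaxError if a or b is not assignable.
            new = ast.parse(f"{a}, {b} = {b}, {a}").body[0]
            # The fresh node claims to be on line 1; shift it so it
            # reports the line the programmer actually wrote.
            ast.increment_lineno(new, node.lineno - 1)
            return new
        return node


def translate(source):
    """Parse, expand the macro, and print plain Python back out."""
    tree = ExpandSwap().visit(ast.parse(source))
    return ast.unparse(tree)
```

Calling `translate("x = 1\ny = 2\nswap(x, y)\n")` yields ordinary Python in which the third statement is a tuple assignment. Only line numbers are tracked here; column offsets are not, which hints at how quickly the bookkeeping grows once you want accurate error messages.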
I'm confused. Isn't this just a discussion of code duplication, which most try to avoid in programming because it means the code should probably be refactored into something reusable? Isn't it more about copy/paste, then, than cut/paste?
Yes, the problem is exposed correctly. But he blames his tools rather than taking the blame himself. Or maybe he is just trying to blame his coworkers using some euphemism, still lame.
Tools have much, much more power over their users than you seem to think. Take Subversion vs Git for instance. How often would you merge branches with either tool?
We tend to follow the path of least friction. If a tool you use changes that path, and unless (even if?) you consciously fight it, you will change your behaviour, whether this is a good thing or not.
So I make a tool to make my work easy, then I go and misuse the tool to make my work shitty. Then I go around blaming the tool for my shitty code since it does not hinder me writing shitty code. I never thought an editor was supposed to hinder writing of shitty code.