Some things I've learnt about programming (jgc.org)
248 points by jgrahamc on July 12, 2012 | 124 comments



Programming may be a craft, but researchers have published tons of studies about this craft. Many of these studies contradict anecdotal evidence. For example, copying code isn't as bad as you might think: http://www.neverworkintheory.org/?p=102

Another example is TDD. People espouse the benefits, then some study comes along (http://www.neverworkintheory.org/?p=139) saying the benefits are largely illusory and that code reviews are more effective.

Instead of listening to the experts at programming, listen to the experts on programming. Read some studies about the effectiveness of various tools and methods. Try new things. Programming is a craft, and like many crafts it contains significant amounts of dogma passed from teacher to apprentice.


> Instead of listening to the experts at programming, listen to the experts on programming

Wow. What strange advice. All it takes to be an expert on anything is labeling yourself as such. I'm an expert on a lot of things.

Being competent at something, on the contrary, means actually going there, doing the stuff and learning the craft.

If the guy trying to teach me mechanics doesn't have dirty hands and nails, I'm not very interested in what he has to say.


If the guy doesn't know how he does it, what he has to say will not be very illuminating. To be able to teach, you have to understand why you are doing things the way you are doing them. It's surprising how many people are good at what they do, but still don't know why they do things a certain way, have not considered other ways and noticeably slow down when they leave their comfort zone.


However, you should listen to an expert on Astrology, not an expert in Astrology.

Sometimes outside perspective is important.


Agree with you there. Replace "programming" with "management", and you discover the biggest pain point in the majority of our lives.


"An expert is a man who has made all the mistakes which can be made, in a narrow field."

  --  Niels Henrik David Bohr


Do you read the papers you link to?

The Copy & Paste one is rubbish. We say copy & paste is bad because 99% of the time you see it, it is bad.

The authors blithely ignore this to make an intellectual point that there are occasional uses to cloning code. Of course there are. A complete waste of words.


Yea. From the summary quote:

"For example, one way to evaluate possible new features for a system is to clone the affected subsystems and introduce the new features there, in a kind of sandbox testbed. As features mature and become stable within the experimental subsystems, they can be migrated incrementally into the stable code base; in this way, the risk of introducing instabilities in the stable version is minimized."

One might say, branching? Indeed, the paper mentions "forking" and boilerplate code. Many of the examples are poor ones, where a better language would be able to abstract at a higher level and not require "cloning". One example required "cloning" because the developer didn't have write access to the section he wanted to fix.

As far as real "copy and pasting": "Common examples include the initial lines of for loops".

But hey, I don't have a good survey to back up the fact that most of the real "copy and paste" I see in programming is laziness or poor platform limitations that end up being a pain in the ass and introducing more bugs.


As far as real "copy and pasting": "Common examples include the initial lines of for loops".

Makes it sound like another candidate for new abstraction facilities (not really all that new -- see APL and its descendants).


I agree one should try new things and relentlessly test one's own dogmas. But regarding TDD, one study of "junior and senior computer science students" (described in an article I have to pay $19 to read) doesn't make somebody an expert on anything relating to how experienced professional programmers handle code bases that last years to decades.


You can't just say "This is bullshit 'cuz they're two students who don't know crap about the real world". Any John Doe can come and affirm whatever he wants on any matter, and as long as he has proper evidence to support his claims, you should not discard what he comes up with based only on the fact that he's a nobody, a student, the President or Donald Knuth. The only thing that matters is the evidence John Doe brings.

But further than that, the truth is they are not the first to come up with those results. Following is a blatant selection of near-verbatim quotes from Code Complete (the author does a great job of covering the subject).

"Microsoft's applications division has found that it takes three hours to find and fix a defect by using code inspection, a one-step technique, and 12 hours to find and fix a defect by using testing, a two-step technique (Moore 1992)."

"Collofello and Woodfield reported on a 700,000-line program built by over 400 developers (1989). They found that code reviews were several times as cost-effective as testin - a 1.38 return on investment vs. 0.17."

"[...]the Software Engineering Laboratory found that code reading detected about 80 percent more faults per hour than testing (Basili and Selby 1987). "

"A later study at IBM found that only 3.5 staff hours were needed to find each error when using code inspections, whereas 15-25 hours were needed to find each error through testing (Kaplan 1995)."

    Table 20-2. Defect-Detection Rates

    Removal Step                            Lowest Rate  Modal Rate  Highest Rate
    Informal design reviews                     25%         35%         40%
    Formal design inspections                   45%         55%         65%
    Informal code reviews                       20%         25%         35%
    Formal code inspections                     45%         60%         70%
    Modeling or prototyping                     35%         65%         80%
    Personal desk-checking of code              20%         40%         60%
    Unit test                                   15%         30%         50%
    New function (component) test               20%         30%         35%
    Integration test                            25%         35%         40%
    Regression test                             15%         25%         30%
    System test                                 25%         40%         55%
    Low-volume beta test (<10 sites)            25%         35%         40%
    High-volume beta test (>1,000 sites)        60%         75%         85%

    Source: Adapted from Programming Productivity (Jones 1986a),
    "Software Defect-Removal Efficiency" (Jones 1996), and
    "What We Have Learned About Fighting Defects" (Shull et al. 2002).

Really, I'm not saying that TDD is bad, not at all. What I'm saying is that this sentence:

"one study of "junior and senior computer science students" (described in an article I have to pay $19 to read) doesn't make somebody an expert on anything"

... is really wrong.

[Edit] Formatting nightmare


You've missed wpietri's point. He didn't question the study because students wrote it (they didn't). He questioned the generalization from CS students doing a toy exercise to teams maintaining production software.

Your additional citations are irrelevant because "testing" and "TDD" are not the same thing.


Actually you're right, after re-reading both the submitted link and the comment, I see I misunderstood his comment in regard to the context. Please disregard my last post.


Like gruesom says, you've missed my point entirely.


Studying this stuff is hard. Off the top of my head, I'd want any study to take into account

* The size of the company something is being developed for. IBM is not the same environment as some startup, and what works for one may well not work for the other.

* The economic goals and risks of the system: in other words, like someone else mentions, flight control systems have different incentives than a web page. What works for one doesn't work for the other (if you develop a web page as slowly and carefully as the flight control thing, your competitors will eat you alive).

* A long term look at how the code lives and evolves. Perhaps some things are quicker to code up. How do they stand up to maintenance, and adding new features with time? And as new team members are added?


You have to be a little careful to make sure that the conditions of the study are matching your conditions.

As an example, the TDD study you mentioned compared the defect rates of new software developed once with TDD and once with code review. At work we do TDD mostly to help us while developing (faster feedback on whether the code does what I want, since running it on the target takes 10+ minutes), to have an example of how the code should be used, and to know when a refactoring broke something unrelated.

If it helps us with refactoring or reduces the defect rate, that is a nice benefit, but not the main reason why we use TDD.

So whether a study is applicable or not depends on what you do.


> Another example is TDD. People espouse the benefits, then some study comes along (http://www.neverworkintheory.org/?p=139) saying the benefits are largely illusory and that code reviews are more effective.

The study is much more thoroughly debunked in the comments than I will attempt to reproduce here. Important points: 1) The study's hypothesis was that TDD would produce software with equal or fewer defects than inspection. That would be one benefit, and depending on who you talk to, by no means the most significant benefit of TDD. 2) The actual data collected by the researchers showed no statistically significant difference between the two approaches.

Reading studies about the effectiveness of different tools and methods is a great idea. Treating them uncritically is not. :)


The article you link doesn't really support your conclusion that copying code isn't that bad. The only example it cites of when copying is acceptable (and even good) is one that's already been solved by version control software and branches.


I highly recommend the book "Making Software: What Really Works, and Why We Believe It". It's a collection of essays showing statistical studies on certain software techniques.


I highly recommend the book "Making Software: What Really Works, and Why We Believe It".

John Graham-Cumming, the author of the article submitted here, has a review of this book on Amazon.com:

http://www.amazon.com/Making-Software-Really-Works-Believe/d...

"This isn't a book about evangelizing the latest development fad, it's about hard data on what does and does not work in software engineering."


100% agree. So many blogs and books talk about how X is better than Y, or how Z is bad, but it is usually just a gut feeling with very little data to back it up. This book on the other hand is chock full of actual data on many topics that are debated all of the time. If you only read one programming book this decade make it this one. For me it replaces mythical man month as one of those books I now expect others have read if they want to debate the topics it presents.


Over time it has become clear to me that there are craftsman-type and engineer-type programmers; they can be utterly different, and suited to different applications.


This quote by Groucho Marx might be relevant:

"Are you going to believe me, or what you see with your own eyes?"


Um... "your eyes can deceive you, don't trust them".

Besides the assorted cognitive biases that lead people to be convinced of the (non-)existence of $DEITY (and the absolute superiority of $EDITOR), there's the risk of deliberate trickery by salesmen and consultants who've been studying stage magicians.


I don't see why TDD and code reviews are mutually exclusive choices. You can easily use both (or neither) on a project, or mix and match, or use one on one part of a project and the other elsewhere. Knowing what each of them is is the easy part; knowing which to use when, where, and how much is the hard part.


People selling methodology are selling methodology, NOT software. Remember the old adage: those as can, do; those as can't, teach.


One startup I was at was having trouble getting software done on time, with reasonable quality. Management started thrashing, searching for any solution that might possibly work [1].

One fine day a consultant arrived and announced that she was going to institute a new development paradigm. It was going to be world-class cool and whizzy, and improve our productivity and reduce our bug count.

The process? It consisted of a whiteboard mounted in the engineering area, with everyone's name and some hieroglyphs by the names, and some dates.

"Huh?" we said.

"It's the new (whatever her last name was) software development process. We put your name up here, with these symbols that tell us whether you're behind or ahead of schedule that get updated every day by me."

"WTF?", we said.

"I'm doing this with you guys, for free, because I want to get a business process patent out of it."

She lasted two days.

[1] Any solution that might work, except good software engineering and project management practices, that is. Sigh. There are no silver bullets.


The process? It consisted of a whiteboard mounted in the engineering area,

Up to this point, I thought you were going to tell us the new process turned out to be a genuine improvement.

IME, there are few ways to improve productivity in a small development team with a better ROI than providing vast amounts of whiteboard space right next to where people work.

A bonus point is awarded for each different colour of pen.

Ten bonus points are awarded for any technology that can immediately capture the current state of the board and save it in a standard graphics file format for future reference or wider circulation.


Whiteboards are awesome. We have them everywhere. Engineering would be impossible without them.

I just take photos with my phone and email them around. No need for expensive captures.

[Within minutes of our first meeting with the aforementioned short-timer consultant we were calling the whiteboard a "wall of shame"]


Judging from the responses to your comment, it seems to have struck a lot of religious nerves here.


Not religious, raw due to repeated rubbing. I understand the idea that software engineering should be made more rigorous. I don't understand the people who think it already has and you just have to follow the studies. The studies are almost uniformly terrible, if you actually want to apply their "result". There's no particular reason to think that what works for sophomores in an extremely artificial environment has any impact at all on professionals, and the vast, vast bulk of studies are exactly that. A couple of exceptions, but far fewer than the number cited.

Then these people get up on their high horse about how they've got the solutions to programming and would you idiots just listen to this wisdom and why aren't you listening and come on man up this needs to be made scientific come on. Sorry, no, your wisdom is paper thin to the point that it can't even support its own weight when you pick it up, let alone try to actually apply it to anything. It's orders of magnitude of value away from being strong enough to support you standing on it and preaching.

Yes, the preaching makes people a bit grumpy.


The studies are almost uniformly terrible ... orders of magnitude of value away from being strong enough to support you standing on it

I find that too. Most of them seem so incommensurate with what they purport to be studying as to be nearly trivial. The researchers rarely seem to address (or even be aware of) the assumptions they're making, and their assumptions are usually significant enough to dominate the data.

That's not to say one can't extract value from such studies, but what the value is is so open to interpretation that everyone ends up relying on their pre-existing preferences to decide the issue, which defeats the purpose.

Edit: that study on code cloning that AngryParsley cited above is an example. They make some good distinctions among reasons why programmers duplicate code. But their empirical findings are dominated by their own view of what's valuable vs. not. They admit as much:

Rating how a code clone affects the software system is undoubtedly the most controversial aspect of this study, and also the most subjective.

I have mixed feelings about this study. On the one hand, it's good to see people working diligently to study real codebases. At least someone is trying to look at data. On the other hand, how they're interpreting it is no different than what we all do when we argue this shit online or over beers or – more to the point – when hashing out a design decision. This isn't science, it's folklore with benefits. The problem is that it's being shoehorned into a scientific format it can't live up to.

Their title, by the way, is a straw man. What they're really arguing is that not all forms of code duplication are equally bad, that some are good choices under certain circumstances like platform and language constraints. That's reasonable (if bromidic) and even interesting, but it's just musing. It's not at all up to the authoritative status that AngryParsley gave it; it merely looks that way because it was published in a journal. The reality is that they have an opinion and looked at some code. At least they did look at some code.

Nobody is going to change their mind because of such work, nor should they. It isn't nearly strong enough to justify throwing out one's own hard-won opinions-based-on-experience-to-date. The net result is that everyone will look at it and see what they already believe. For example, I look at it as a Lisp programmer and the examples seem almost comedic. It's obvious that in a more powerful language, you could eliminate most if not all of that duplication, so what the paper really shows is that language constraints force programmers into tradeoffs where duplication is sometimes the lesser evil. Exactly what I already believed.


You imply that those who find fault with the studies are acting from "religious" motives instead of good faith. I don't find that suggestion constructive. Can you defend the substance of the studies to the extent that others are attacking it?


"You have to ask yourself: "Do I understand why my program is doing X?"" "I almost never use a debugger." "And if your program is so complex that you need a debugger [simplify]"

To me, being afraid of a debugger is like being afraid of actually knowing exactly what is going on - being lazy and just reading logs and guessing what might have gone wrong, instead of letting the debugger scream in your face all the idiotic mistakes you have made.

I would argue that using the debugger is being lazy in an intelligent way, instead of spending hours reading endless logs trying to puzzle together logic the debugger can show you directly.


Agreed.

Logs are good. They're effective. They're easy to use. You can filter, search, and aggregate them. Without printf()s and log statements, my world would be chaos and darkness.

But a debugger gives you superpowers:

  break blah.c:180
  run
  thread apply all bt
You can stop time. You can get backtraces. You can see and modify every aspect of the process. Many debuggers even let you attach to a running process and do these things.

Changing log statements means stopping and re-running your program. If startup time is large, this can hurt productivity.

It's rare, but logs can mislead. With async stuff, logs don't always get printed out in the right order (hello, Node.js and Twisted). A debugger is crucial for figuring out that sort of unintuitive behavior.


There are Node.js debuggers? Oh, seems so: http://github.com/dannycoates/node-inspector


Node has a built-in debugger. Just run `node debug blah.js`. There are some issues with it though. The commands are different from any other debugger, and it has problems handling signals: https://github.com/joyent/node/issues/3167


This is an old argument, and the problem tends to be that each side reads the extreme position even when it was neither written nor intended.

I seldom use debuggers, but sometimes it's the right tool. Others I know and respect use debuggers much more. Even among people who are great at what they do, people work differently.

For those of us who prefer to use debuggers sparingly, the worst abuses stand out: coders wasting hours playing with breakpoints and stepping through code, eventually finding the point where it breaks, and then still not understanding the actual problem or how it should be fixed. This kind of situation is certainly not the case for good developers, but it's depressingly common. Good developers who use debuggers also see these abuses, but they respond with "you're doing it wrong" rather than "put away the debugger." IOW, anyone/everyone tends to correct someone by showing them how they do it.


If you are spending endless hours puzzling things out, you already have no idea what's going on. So sure, go for the debugger if that helps. But that's not what he's talking about.

Like him, I use a debugger rarely. Not because I'm opposed; they're great when they work. But it means I don't understand what my software is up to. Which for me is a sign of design and code quality issues. Or just ignorance. Both of which are solved by working to clean things up.


Cleaning up is fine and commendable if you're working on a piece of software from scratch.

My code is often just a class or two plugged into a behemoth. I know exactly what my code is doing, but not exactly how it's being called by the platform, or what responses it's getting.


Sure. In my view, that's another context where I don't know what's going on. Ergo, sometimes I give in and use a debugger.


I don't think that using a debugger = you don't understand what's going on in your code. I've had many situations where it helped me understand a great deal about what was going on. Plus, it's a built-in tool, so I don't have to waste time writing logging code or anything of the sort. I've also had situations where I worked with another programmer who rarely used it, and who oftentimes got his understanding wrong when he puzzled things out. He eventually figured it out, but it was debatable whether that saved time over using a debugger.


A situation where a debugger helped you understand a great deal has to be a situation where you don't understand what's going on.

I agree a debugger can help you figure out mysteries. But when I find myself using one, I try to ask: how could I have avoided having a mystery in the first place? Common answers: better tests, cleaner code, better design.


>> Common answers: better tests, cleaner code, better design.

I wouldn't be so keen to say that these help "avoid mysteries" in the first place. Many algorithms with better time and space complexity are much harder to understand than algorithms that sacrifice those qualities but can be understood at first glance. For example, is it easier to understand code that performs a lot of bitwise operations, which is probably better designed and cleaner, than a piece of code that performs operations on strings and objects?


The more experience I gain, the more I tend to prefer configurable logging over interactive debugging, for much the same reasons that I like automated tests and shell scripts. For a simple, one-off job it’s fine to do things manually, but a small investment in a systematic, repeatable approach quickly pays off.
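
For a rough idea of what I mean, here's a minimal Python sketch (the LOG_LEVEL variable and the "payments" logger are just illustrative names, not anything from the article): verbosity comes from the environment, so the same program runs quietly in production and chattily when you're chasing a bug, with no recompile and no debugger session.

  # Minimal configurable-logging sketch: the log level is read from an
  # environment variable, so changing verbosity needs no code change.
  import logging
  import os

  logging.basicConfig(
      level=os.environ.get("LOG_LEVEL", "INFO").upper(),
      format="%(asctime)s %(levelname)s %(name)s: %(message)s",
  )
  log = logging.getLogger("payments")  # hypothetical subsystem name

  def charge(account_id, amount):
      log.debug("charging account=%s amount=%d", account_id, amount)
      if amount <= 0:
          log.warning("ignoring non-positive charge for %s", account_id)
          return
      # ... the real work would go here ...
      log.info("charged %s: %d", account_id, amount)

  charge("acct-42", 1500)  # run with LOG_LEVEL=DEBUG for the extra detail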


Using logs wins against debuggers in a few cases though:

- Your program runs on a client system and logs help you understand or reproduce the system without having to do a remote session (which might be impossible on some firewalled envs).

- Your code crashes without a stack trace and you want to understand where to begin the search.

Agree with your overall statement though.


Logs beat debuggers in asynchronous multiple workers/consumers contexts.


I rarely reach for debuggers, but when I do, they are invaluable. I tend to rely on traces, and I do not consider that "lazy." Traces give me end-to-end understanding of behavior. Usually, that's enough for me to understand what is happening in the moment. When it's not, I use debuggers to understand exactly what is going on in a moment of time. But debuggers are not good at giving me an end-to-end understanding.


This was the only point I disagreed with. Is it really so uncommon for programmers to work on code written by others? Yes, it's true that I don't understand programs written by others. With a debugger, you can get the actual state of the program at the point you want. With a log, you'll probably have to first understand the program and then instrument it. For fixing bugs in big unfamiliar systems, I think rational ignorance applies. A surgical strike with a debugger will probably be faster.
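
As a concrete (hypothetical) illustration in Python - the legacy_parse function below is a made-up stand-in for somebody else's unfamiliar code - post-mortem debugging drops you into the frame where the program blew up, with its real state, before you understand the code well enough to instrument it:

  # Post-mortem sketch: inspect the actual state at the point of failure in
  # code you don't know, without adding any logging first. legacy_parse is
  # invented for illustration.
  import pdb
  import sys
  import traceback

  def legacy_parse(record):
      fields = record.split(",")
      return int(fields[2])        # blows up on short records

  try:
      legacy_parse("id,42")        # IndexError: list index out of range
  except Exception:
      traceback.print_exc()
      pdb.post_mortem(sys.exc_info()[2])  # poke at record, fields, the stack...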


I avoid debuggers too, not because I am afraid of them but because I find logging etc works better. However, the fact is that I wouldn't recommend being afraid of them. It is extremely helpful to know how to use a debugger.

In fact, I would say that if you don't know how to use a debugger, you really have no reason to avoid one. Part of the point of knowing the debugger, especially with more dynamic languages, is to get better at debugging in your head.


I've found that debuggers actually help me understand a program. Even when I'm not necessarily trying to fix something, having a debugger track all the local variables is a lot easier than trying to keep it all in my head. (Caveat: I'm less of a programmer and more of a data guy who writes some code here and there.)


I've found that, due to the prevalence of race conditions in the codebase I'm working in (due to lots and lots of async calls; it's Javascript), the debugger often exacerbates and hides the problem more than it illuminates. As a result, the debugger has ended up as a desperation tool rather than as a first-choice... and even then often doesn't tell me anything useful.


My reaction to this point was that the author is essentially proclaiming, "All you need to do is just have a comprehensive and unabridged knowledge of every single external library and API your code touches, no matter what kind of project it is!" Which, to me, comes across as a bit on the unrealistic side.


A lot of those things apply to many human activities.

I don't build bridges but I would be very surprised if an architect described his work as "pure science and no craft at all" (how would it be possible, then, to build beautiful / ugly bridges?)

I do a little woodworking and have many tools; friends sometimes look at my shop and ask if I really need all that -- yes, I do. In the course of a project you get to use many different tools. You can work around a missing one, but it takes far longer to work without the exact tool. (Same thing with photography.)

I'm learning to fly, and the most important word regarding human factors is "honesty". The way to fly is not to avoid mistakes, it's to detect them and minimize the consequences; if you feel you can do no wrong you'll eventually kill yourself.


> 0. Programming is a craft not science or engineering

Unless, of course, you are software engineer-ing.

Flight guidance-and-control systems, among many other things, are precisely engineered software systems. In a world of web apps and mobile apps, people tend to forget this kind of software exists.

Sure, working on your web app, writing some JQuery widgets, or coding up some python scripts is a craft.


Can you explain more about how high-quality engineered software projects do not rely on programmers' experience and intuition, but instead follow formulaic rules and achieve excellence?

From my limited reading, it seems that most "mission critical" software is achieved by applying a lot of resources, especially testing, to the project. Not to mention having a very well-defined (and relatively unchanging?) problem space.

Surely, if there were engineering principles that enabled folks to reliably create high quality software, we wouldn't see the horrible failure rates across all sorts of software projects.


I work in the area of critical infrastructure software.

There's no special software engineering sauce that gets used. But there is a dedicated commitment to careful code review, and exhaustive testing, as well as a very rigorous process for defect handling. We have a dedication from the top down to ensure our stuff doesn't break our customers. We don't have a "software pirate ninja rock star" culture, we have a culture of careful work.

Doesn't mean null pointers don't get accessed, or that weird code doesn't show up. It just means that hey, we worked real hard to ensure that these issues are minimized. Quality is a journey, and every day we have to work on it.

This is a great article about how this sort of stuff is done and the kind of culture that you want to cultivate. It's a bit dated, but still solid.

http://www.fastcompany.com/magazine/06/writestuff.html


This is mostly semantics, but in general attributes like "dedicated commitment" are hallmarks of a "craft" practiced by individuals. "Engineering" tasks are things like "processes" and "models". So really, I think you're agreeing more than disagreeing.

And the "software pirate ninja rockstar" thing is a shameless strawman.


Processes are nothing without people dedicated to following them. Models are unreliable without taking care to craft a reliable system.

I don't really consider disciplined programming to be a branch of engineering. While I don't have a sophisticated metaphysics of code, it seems that there is an essential ontological difference between "engineering" a software system and "engineering" a bridge or a chemical process.

Richard Gabriel once suggested the idea of a MFA in software, and I think that he is onto something.

http://www.dreamsongs.com/MFASoftware.html


I have done both hardware and software engineering, and I see no difference.

It is difficult to put into words, but I would say that the heart of engineering is the discipline of understanding how and why something is useful, as distinguished from feelings or hopes about its utility.

An MFA in software is pretty much the opposite of engineering. Engineering is not a matter of taste or opinion, it is about creating such hard sparkling truths that opinion would be superfluous.


From what I saw, most mission critical software is done in such a way that a single programmer can understand the whole system starting from microcode instructions up to the control parts.

Lots of static analysis also helps, as well as systematic human review of any code, test, or test result.

The best example I witnessed was an engine control system (FADEC software) written in a dialect of Ada called SPARK. The code was so clear that it was self-explaining, and a requirements database could explain any given statement in the program. SPARK has some nice properties (e.g. no recursion, to make it possible to check stack depth limits statically...)

So while more manpower is an important element, it is not the only one. Simplifying the problem space to an extreme is also essential.


You seem to be saying that engineering does not require experience and intuition? I would think that in this case becoming an engineer would be a bit simpler than degree + exam + several years experience[1] + another exam.

[1] http://en.wikipedia.org/wiki/Engineer_In_Training#EIT_design...


I never noticed until now, but I never use a debugger either. An employer urged me to start using one while I was working on their code, but it's just not natural for me. If I need to debug a crash, I just print the 1-2 variables of the state involved, see what's wrong, and fix it.

Is there anyone who uses a debugger for more than inspecting state?

EDIT: I guess lower level languages and more involved applications use debuggers much more extensively.


Debugging compilers: there may be over 1GB of heap (so you can't reasonably dump / log everything), with promiscuous pointers connecting dense blobs of data (one or two variables don't cut it, you need on the order of 1000s); and the code is 20+ years old, and the people who originally wrote it have long since moved on - nobody knows all the code.

One example. Sometimes what you want is a time machine: figuring out how a particular variable reached its value. So you swap out the Windows memory allocator (which randomizes initial heap addresses) for one with predictable addresses, run the program until you find the dodgy value, take its address, then restart the program with a hardware breakpoint setup to monitor and log the stack whenever modifications are made to that memory address.

This kind of "backward tracking" takes no more than 5 minutes on a project that's set up for it (i.e. with the appropriate allocator available and switchable). Solving the problem by guessing locations, dropping printfs in the code, etc. is rather less productive.

Other uses: debugging code you don't have the source for, OS code, binary compatibility issues etc.

(Crash bugs are usually not a big deal; usually, when the code crashes, the crash location is relevant. Bugs that corrupt state are much worse (heap corruption, races / concurrent modification, etc.). The original bug resulting in the final bad behaviour may be completely different from where it appears.)


Aren't there debuggers that step backwards? That sounds useful in many cases, and not impossible to create, at least for special cases.


Yes, they're usually called "Omniscient Debuggers". IntelliTrace in Visual Studio provides this kind of functionality.


Also GDB has supported reverse-execution on some environments (i386-linux and amd64-linux included) since version 7.


Some debuggers let you "drop frames". This brings you back up the call stack and you can step back down into the beginning of the method call.

http://stackoverflow.com/questions/2367816/when-using-the-ja...


OCamlDebug does this. Magic :-)


Umm...

  * Tracing code execution paths to load the code into your mind
  * Modifying state inline so you get a different path without re-running
  * Modifying state like FirstName to trace just how the bugger gets into the output
  * Some sophisticated debuggers, like Java's, can allow you to change code, then hot-recompile and deploy code LIVE.
  * Suspend threads when you reach a breakpoint so you can dig around at all the multi-threaded state at that point
     - Although multi-threaded debugging is where printfs shine
All of that is much more difficult with just printfs.


Sometimes the logging is not enough, and you cannot isolate the case to prepare an appropriate unit test.

Just few days ago I had a strange problem with the order of imports in Python at the border of my code and external library (Celery). There were import hooks involved but they didn't seem to be executed properly in certain conditions. I could reproduce them quite reliably but I needed to pinpoint the exact import (inside Celery itself, mind you) that was causing the problem. pdb (Python debugger) was indispensable while solving it.

On the other hand, though, it was probably the first time in many months that I used pdb for more than 5 minutes, and for something more complicated than checking why a particular test fails.
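
For anyone curious, the general shape of that kind of pdb session is simple; this is just a generic sketch (the suspicious_import function is invented for illustration, not the actual Celery internals):

  # Pause exactly where you suspect trouble and poke around interactively.
  import pdb

  def suspicious_import():
      import json               # stand-in for the import you want to observe
      return json

  pdb.set_trace()               # drops into the debugger here; `step` into the
  result = suspicious_import()  # call, `p locals()`, inspect sys.modules, etc.
  print(result)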


I agree, you certainly have to trace execution flow somehow, and a debugger is the best way to do that. However, my realization was more that the debugger wasn't useful for the vast majority of cases. Until now, I somehow thought I used a debugger a lot, and that it was a big part of my debugging (after all, how do you debug without a debugger?).

This post made me realize that 95% of the time I fix errors from just the stack trace, 4.9% from logging, and 0.1% from actually debugging.


That's my experience as well - I rarely use a debugger, but when I do I cannot imagine any other way of finding out what is going on.


I don't know about the nature of the problem you were working on but in my experience there are some cases where the problem is far too complicated to be fixed with debug print statements. I couldn't live without my debugger.


For example, when debugging operating system code running before printf (or printk) is available. I've barely used debuggers on high-level code, but in low-level code it's an altogether different story.


I can say the same for high level code, but I use debuggers when the source code is not available or I'm coding a device driver or something in assembly.


Yes, I've done that sort of programming (embedded stuff) and debuggers and other exotic tools are very useful.


For my own code, not so much. For resolving problems or understanding complex control flow in someone else's code, more frequently.


I use a debugger. If you've got some input that gets transformed into unexpected output, it's great to follow the code along. Of course, placing debug output works too, but the debugger is usually faster in this situation.

It's good for white box testing. It's also good for compiled languages. You know something is wrong, but you don't know why it's wrong.


I often feel that comfort with debugging procedures is what separates an awesome developer from a codemonkey. I've long wanted to find a good way to teach debugging concepts to people. It often makes the difference between being prolific and being competent, in my opinion.

...just based on years of observation


"I'm not young enough to know everything"

Having recently started mentoring/managing the first really junior engineer on our team (self-taught, <1 year programming experience), boy does this ring true. Luckily I'm of the temperament to find the "advanced beginner" stage of learning more funny than annoying.

I think it's possible to understand as little about your code when using loggers as when using debuggers, so I have a hard time agreeing with him there. I think his general point about having tools and knowing when to use them applies just as much to that as it does to language, so he contradicts himself.


A large part of the post can be rewritten as "don't be lazy".

1. Don't be lazy and just do something that works without taking the time to learn why it works.

2. Don't be lazy and just stop when you have something that works. Go through the code again and see if you can make it better.

4. If you find yourself writing the same thing twice, don't be lazy and carry on, put the code in a single place and call it from where you need it.

Or at least that's how I see it. I do all of the things I shouldn't do, largely because doing things the wrong way is so much easier!

Edit: Rather than rewritten, I meant "falls under the general category of". The article was great!


The hardest obstacle to not being "lazy" is deadlines

Is it lazy to accept something that works, and meet the deadline?

Is it OK as a craftsman to miss your deadlines, because you want to know why something works?

These decisions are like a craft in themselves - sometimes we can just trust a library works. Sometimes, we need to understand more. And sometimes we need to re-negotiate deadlines. And sometimes we lose clients.


Though point 4 should be rewritten as "If you find yourself writing the same thing twice, DO be lazy and carry on, put the code in a single place and call it from where you need it."


There is usually an upfront cost of time and/or effort. The choice comes down to copy-pasting what you need everywhere or put effort into making a generic function that you can put in one place. If you are working against an impossible deadline, option 1 seems very enticing. In the long run, you are right, the effort is much less to maintain a well thought out codebase.


I don't really understand why people think that copy pasting takes less time than making a new function.

Your function may need some refactoring to make sense, but just cutting that block of text out and calling it as a function can usually be done in seconds.

Copy-pasting: Copy the code. Paste it in the new place. Change variable names to fit the new place.

New function: Cut the code. Place it inside a function declaration. Create calls to the function in the old and new places.
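
A tiny before/after sketch, in Python just for illustration (the report functions and field names are invented):

  # Before: the same block pasted twice with the names changed.
  def report_orders(orders):
      total = sum(o["amount"] for o in orders)
      print("orders: %d items, %d cents" % (len(orders), total))

  def report_refunds(refunds):
      total = sum(r["amount"] for r in refunds)
      print("refunds: %d items, %d cents" % (len(refunds), total))

  # After: cut the block once, wrap it in a function, call it from both places.
  def report(label, items):
      total = sum(i["amount"] for i in items)
      print("%s: %d items, %d cents" % (label, len(items), total))

  def report_orders_v2(orders):
      report("orders", orders)

  def report_refunds_v2(refunds):
      report("refunds", refunds)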


It doesn't usually happen like that for me. More likely, I stumble on a point where I have to reuse part of some old code, but with slightly different goals, or with some steps added/removed in the middle.

Copy-pasting and just dropping the useless parts is easier than making a new function, because you have to think about how to make the new function apply to both cases (the easiest way is "well, I'll just put a switch in the parameters and if statements", but it doesn't really lead to better code).


> New Function Cut the code. Place inside function declaration. create call to function in old and new places.

And also: decouple the code from its context. Rename variables. Rename the function. Possibly add a new module for it.

Moving repeated code to a function is rarely just about relocating a piece of code. Quite often (especially if you have cross-file code repetition) you're identifying an abstraction, which is Serious Business (TM), requiring you at least to think about where to introduce it, how to fit it into the existing mental model of the program, and, while you're at it, how to make it useful for the code you'll be writing next week, because it would be a waste not to do that now.

It's not that difficult and it's quite rewarding, but it also takes significantly more time than just copy-pasting and carrying on.


1) and 2) seem to define lazy as "avoiding work now". Can't being lazy also involve avoiding work later? (see also: The first great virtue of a programmer)


Writing it like that would have been lazy, though.


It's nice to see number 5.

>Be promiscuous with languages

>I hate language wars. ... you're arguing about the wrong thing.

It's easy to take this for granted, but it's a concept that is very important to stress to new coders. If you spend too much time focusing on one language you run the risk of the form becoming the logic. This is a dangerous place where your work can be better analogized to muscle memory than to logical thought.

At least in a college environment, I think this lack of plasticity causes discomfort with different representations of similar logic - and so flame wars abound.


> Programming is much closer to a craft than a science or engineering discipline. It's a combination of skill and experience expressed through tools

You seem to be implying that the latter statement doesn't apply to the disciplines of science and engineering. "Skill and experience expressed through tools" is highly important in both watchmaking and bridge building. I would advise anyone who says otherwise to reconsider.

I understand your point, but why create a hugely false dichotomy between a craft discipline and the science and engineering disciplines?

---

I strongly concur with points 2 and 6.


Bridge Building vs Software Engineering http://www.codinghorror.com/blog/2005/05/bridges-software-en...

There is something different between software and other engineering methods. I'm not sure he expresses it properly, but it's definitely there.


The core of programming is creativity.

No two developers develop the same way, and even though there are some obvious approaches, there seldom exists an absolute best way for all cases.

It is the thesis of The Mythical Man-Month (F. Brooks), and I do like this theory since its corollary, the "no silver bullet" syndrome, is quite accurate.

The essence of programming is creativity, thus no tools can improve software productivity in its essence.

The problem with school is studious, dull boys with no imagination who think they're worth something in programming by incanting mantras of pseudo-tech gibberish. They have a $90K loan, no gift, and they pollute the ecosystem because otherwise they'd become hobos. At least most of them are hired as Java, C++ or PHP developers, where they fit best.


> The essence of programming is creativity, thus no tools can improve software productivity in its essence.

Completely not true. Creativity is a fragile thing, and anything that stands between you and expressing your thoughts may break the creative process altogether. Good tools can also enhance the process[1], for example by allowing you to see the thing you're working on in real time (see e.g. Bret Victor's "Inventing on Principle").

Also, software productivity is a function of both creativity AND being able to turn the idea into reality efficiently. Good tools do a great job on the second part.

[1] I do have the feeling though that most of the creativity still happens on paper and/or whiteboard, not inside computer programs.


Most painful bugs are in the conception, thus at the paper level.

And most breakthroughs are also in simple, efficient conception.

Delivering is what most people call craft.

I do pride myself on delivering; however, any monkey coder can deliver.


Status for one and pay and benefits for another.

Engineers (and I mean real ones) have enough trouble with people thinking they are blue-collar craft workers.


These are all golden lessons that people who think about writing code generally learn.

One thing I would add though is that there are many times when there is time pressure and a kludge works. The right thing to do here is to document that it is a kludge so that if/when it bites you later you have a comment that attracts your attention to it.

"I don't understand why this fixes the problem of X but this seems to work" is a perfectly good comment. It's great to admit in your comments what you don't know. (That's why questions relating to commenting are great interview questions IMO.)

Finally, I think it's important in the process of simplification to periodically revisit and refactor old code to ensure it is consistent with the rest of the project. This should be an ongoing gradual task.

Anyway, great article.


The debugger part doesn't look generic enough to me. As a Smalltalk programmer, I can only say usage of debuggers depends _a lot_ on which language you code in.

In Smalltalk, you practically live inside the debugger. Also, if you are an ASM programmer, the debugger is indispensable.


> I almost never use a debugger. I make sure my programs produce log output and I make sure to know what my programs do.

I used to do precisely that. Sprinkle code with log messages, recompile and run. When I finally learned how to use gdb, my debugging productivity increased tenfold.

I mean, just the ability to stop your program at any given point gives you an enormous advantage. You can not only examine the local state of your program, but also see how the state of systems outside of your program (e.g. a database) changes, and all of this without polluting the code with tons of useless debug messages.

Often when I had new ideas during bug hunts, testing a hypothesis without a debugger meant going back to add new logs, then recompiling, then running (and making sure it reached the same state as before!) - lots of wasted time. With a decent debugger it's as easy as typing an expression.

And I don't think debuggers lead to lazy thinking. The process of finding the problem is the same whatever method you use - you analyze the code, have an idea about what could be wrong, change one thing, then see what happens. Debuggers just make it easier.


Good article, though I don't agree with it all.

It is harder to grow software than it is to initially build it. Preconceptions bite you on the ass, data structures don't allow for new features, side effects multiply.

You don't need to learn the layers. In fact, if you're learning all the layers, you're probably an ineffective coder. This is not to say that you shouldn't investigate the layers or have a poke around them. But software's about reuse, and reuse is about reusing other people's work via known interfaces without worrying overmuch about what goes on under the hood.

I'm actually more of a debugger than a profiler, and as much as I'd like to believe that my way is as valid as his, I suspect that he's probably right on this and I'm probably wrong.


I find preconceptions bite ass way harder when you try to design big things all at once. If you grow them you can pivot and refactor more easily while they're small.


Growing sucks with large codebases. I worked with about a dozen people on a huge app, that would have been much easier to make if we had had a better idea of what we were building when we started instead of making stuff up as we went along.


#7. Learn the layers. Is this even possible anymore? Seems like with apis, frameworks, there are layers upon layers just within code. Then you have the OS, the hardware, the network layer (whoa! 7 layers right there!)...


A big part of this is actually learning some hardware and networking basics. If you understand how CPUs and memory works, if you understand how TCP/IP works, if you understand how compilers (or JIT/GC environments) work, you can reason out a lot of functionality that will cut across all sorts of frameworks. Otherwise, you get back to #1, where you are doing stuff but you don't really know why it works.

It appears to me that a lot of folks are actually incapable of stopping and thinking how stuff might even plausibly work. As soon as you rephrase the question in terms of fundamentals, it becomes clear, but people allow themselves to get confused by all the high-level whizbang stuff, without remembering that there is no magic.


Well, you are referring to Wheeler's Corollary to Lampson's Law, but it doesn't have to be like that. It's actually less work to just write the program you want in the base language, than trying to shoehorn it into a framework, or wrestle the framework around your code.

I've recently rediscovered the joy of writing GEM apps in m68k ASM...


For people's information:

"All problems in computer science can be solved by another level of indirection." - Butler Lampson

And, from what limited sources I could find:

"[above] ... Except for the problem of too many layers of indirection" - David Wheeler


It's a shame Wheeler isn't better known. He invented the subroutine back in the '50s. Calling a subroutine used to be called a "Wheeler jump" (relevant to #4 in TFA).


His group with the EDSAC did some absolutely tremendous and groundbreaking work.

Wilkes, Wheeler, and Gill wrote a book, "Preparation of programs for an electronic digital computer", that pretty much describes the core software engineering precepts we use today - in 1951.

Gill wrote "The Diagnosis of Mistakes in Programmes on the EDSAC", which gives a good snip of what they were doing at the time. I think that it can be obtained for free online.


We need to educate people that there was computing before the advent of Web 2.0. There are riches in the past to be mined by those who will take their blindfolds off.


As a sysadmin, I used to abhor levels of abstraction. As a new dev, I love it. Esp after I started using jquery. :)


9. You count from 0


Is "learnt" a real word?


Yes. It is used instead of "learned" in nearly every English-speaking country outside the U.S.


>>a printf that's inserted that causes a program to stop crashing.

Huh?


printf can/does have side effects. Perhaps one of the parameters to the call is obscured by a macro, which has some other effect. Or perhaps stdout was redirected and the printf was required to prevent line buffering from hanging a downstream consumer. The point is, if it seems very unlikely that such-and-such could cause a problem, it probably isn't the true cause. (printf itself isn't the fix, it is just involved in the fix.)


interesting, thanks for the explanation.


printf() introduces delay that is often enough to hide / "fix" race conditions.
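
A contrived, timing-dependent Python sketch of what that can look like (a sleep stands in for the delay a printf or log call would add; everything here is invented for illustration):

  # The consumer may read shared state before the producer has written it.
  # A small delay on the consumer side (the "printf") makes the bad
  # interleaving far less likely -- the bug is still there, it just hides.
  import threading
  import time

  def run_once(extra_delay):
      shared = {}
      seen = []

      def producer():
          shared["value"] = 42

      def consumer():
          if extra_delay:
              time.sleep(extra_delay)     # stand-in for the I/O cost of printf
          seen.append("value" in shared)  # False: the read raced ahead of the write

      c = threading.Thread(target=consumer)
      p = threading.Thread(target=producer)
      c.start(); p.start()
      c.join(); p.join()
      return seen[0]

  for delay in (0.0, 0.001):
      misses = sum(not run_once(delay) for _ in range(200))
      print("delay=%gs: %d/200 runs read before the write" % (delay, misses))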


I like that you started numbering at 0 - very apropos.


Taking personal pride in not using a debugger is a bad idea. Sometimes it's the right tool for the job, and if your picking it up makes you feel dirty, you're only handicapping yourself.


But sometimes it isn't the right tool for the job. In particular, it enables you to deal with a confusing, poorly factored code base. The right tools there are the ones that help you clean the mess up, rather than making the mess more tolerable.


If you have a rational argument that proves that debuggers are only useful on poorly designed code then I would certainly be interested in hearing it. It is true that the worse the code the more frequently you need to debug, but that's a quite different proposition.


I'm not saying they're only useful there. But we both agree that they're very helpful in understanding a bad codebase. That's because they make bad code easier to handle. Which for some people removes the incentive to clean it up. Basically, they use debuggers as crutches.


Quite interesting



