
Hey software engineers, write some m*ther f!cking documentation! Don't tell me it goes out of date, at the very least a module level, architectural overview is better than nothing, and should remain relevant past your tenure.

/rant




How do you feel about the saying that code itself should be as good as documentation?

I personally prefer to read the documentation while skimming the code as well. But when I am under pressure to deliver something, I absolutely despise not having proper documentation, so I tend to agree with you.


Code doesn't capture intent in many critical cases, so figuring out what a piece of code is supposed to do is different from figuring out what it does. This is true in part because there are very different levels of abstraction involved.

To take a trivial example:

    norm = sqrt(x[0]**2 + x[1]**2 + x[2]**2)
    x[0] /= norm
    x[1] /= norm
    x[2] /= norm

This could be described as "take the square root of the sum of three values and then divide each value by the result" or "renormalize a vector". The latter is by far the more meaningful and useful description because it is presented at the level of abstraction that the user is likely interested in.

You could say "well why not create a function called 'renormalize_vector' so it would be self-documenting?" Fine, but now you have a function call per renormalization and that has a cost that may be unacceptable. For many simulations renormalizing vectors with a norm near unity is a big overhead, to the extent that I've written custom code to handle that special case and implement it as a macro that I could call "FAST_RENORM_NEAR_UNITY"... but what does "near unity" mean? And what trade-offs went into the design choices? What code isn't there because I tried it and it didn't work well?
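To make those questions concrete, here is a minimal sketch of what a documented special case might look like. Everything specific in it is an assumption for illustration: the tolerance value, the name `fast_renorm_near_unity`, and the choice of a first-order Taylor expansion are hypothetical, not the custom code described above. It is written in Python to show the documentation pattern, not to claim any speedup in that language.

```python
import math

# Hypothetical definition of "near unity": the squared norm is within
# this distance of 1. The real threshold would depend on the simulation.
NEAR_UNITY_TOL = 1e-3


def fast_renorm_near_unity(x):
    """Renormalize a 3-vector whose norm has drifted slightly away from 1.

    Intent: cheap drift correction inside a hot loop.

    Method: with s = |x|^2, use the first-order expansion
    1/sqrt(s) ~= (3 - s) / 2 around s = 1, which avoids the square root.
    The relative error in the norm is roughly (3/8) * (s - 1)^2.

    Trade-off: the result is only approximately unit length, so vectors
    outside the tolerance fall back to the exact computation.
    """
    s = x[0] * x[0] + x[1] * x[1] + x[2] * x[2]
    if abs(s - 1.0) > NEAR_UNITY_TOL:
        scale = 1.0 / math.sqrt(s)  # exact path for vectors far from unit length
    else:
        scale = 0.5 * (3.0 - s)     # approximate path, no sqrt
    x[0] *= scale
    x[1] *= scale
    x[2] *= scale
    return x
```

The name alone says nothing about the tolerance, the error bound, or the fallback. Those live in the docstring, which is the point being made here.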

People who advocate self-documenting code generally talk as if self-documenting techniques come at zero cost (adding a function call is an unacceptably high cost in some cases) and that the code that exists adequately captures all the thinking that went into it (it does not and cannot.)

So while I'm all for as much self-documentation as possible, any non-trivial code is going to require additional documentation to a) describe the purpose in high-level terms and b) capture the alternatives that were rejected and why.

Unfortunately, for open source projects especially, there is a law of documentation that says power*documentation=constant, so the most powerful code has the worst documentation, and there are projects with great documentation that simply don't do much.


This a hundred times. Comments should always be about intent; never about what's actually happening (the mechanistic description). I don't need help understanding the code as read; I need to know WHY the code was written. So I can debug, follow code paths, skim.

A further advantage: such comments have longer half-lives. A rewritten method may still have the same purpose long after all the details have changed.


> but now you have a function call per renormalization and that has a cost that may be unacceptable.

I would go for the function, and pass along Knuth's advice about premature optimization. If you're writing at such a low level that function calls actually aren't acceptable, go with a comment "// renormalize vector." Your instinct should be the function though. I bet there is more than one vector normalization going on in this hypothetical codebase, and that line looks pretty typo-prone.
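A minimal sketch of the function version, in Python for illustration. The name `renormalize_vector` comes from the parent comment; the zero-vector check is my own assumption about what a caller would want.

```python
import math


def renormalize_vector(x):
    """Scale a 3-component vector in place so its Euclidean norm is 1."""
    norm = math.sqrt(x[0] ** 2 + x[1] ** 2 + x[2] ** 2)
    if norm == 0.0:
        # Assumption: failing loudly beats silently dividing by zero.
        raise ValueError("cannot renormalize a zero vector")
    x[0] /= norm
    x[1] /= norm
    x[2] /= norm
    return x
```

The typo-prone index arithmetic now exists in exactly one place, and every call site reads as the intent rather than the mechanics.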


> How do you feel about the saying that code itself should be as good as documentation?

Not him, but I'll take a crack at it: adorable, high-minded, and wrong. Most people are not going to walk through your code to grasp it in its entirety; design, and document, accordingly.

At the very least, you should have exhaustive, well-considered, commented unit and integration tests to demonstrate its use cases and (foreseeable) failure modes.
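As a rough sketch of what I mean, here are commented tests for the vector renormalization example from upthread. The function body and the test names are hypothetical; the point is that each test states one use case or failure mode in plain words.

```python
import math
import unittest


def renormalize_vector(x):
    """Scale a 3-component vector in place so its Euclidean norm is 1."""
    norm = math.sqrt(x[0] ** 2 + x[1] ** 2 + x[2] ** 2)
    if norm == 0.0:
        raise ValueError("cannot renormalize a zero vector")
    for i in range(3):
        x[i] /= norm
    return x


class RenormalizeVectorTest(unittest.TestCase):
    def test_result_has_unit_length(self):
        # Use case: any nonzero vector comes back with norm 1.
        v = renormalize_vector([3.0, 4.0, 12.0])
        self.assertAlmostEqual(math.sqrt(sum(c * c for c in v)), 1.0)

    def test_direction_is_preserved(self):
        # Use case: only the length changes, never the direction.
        v = renormalize_vector([3.0, 4.0, 0.0])
        self.assertAlmostEqual(v[0], 0.6)
        self.assertAlmostEqual(v[1], 0.8)
        self.assertAlmostEqual(v[2], 0.0)

    def test_zero_vector_is_rejected(self):
        # Failure mode: a zero vector has no direction, so we refuse it.
        with self.assertRaises(ValueError):
            renormalize_vector([0.0, 0.0, 0.0])
```

Saved in a module, these run with `python -m unittest`.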


>> How do you feel about the saying that code itself should be as good as documentation?

The truth is, 95% of developers do not write readable code. But I've been doing this for a while, so I can follow pretty much anything. It's the sheer volume of legacy code in the typical code base that's the problem.

What kills me is dead code that you don't know is dead: hundreds of class files, DTOs, booleans passed around to control processing that are always false now because that alternate path is no longer used. And protocol messages, oh god, the hundreds of protocol messages, when we only use 10 now.

Edit: I've been the hero more than once documenting stuff like the above.


The code doesn't document your intent. There's no way to tell if something weird is a bug, or a feature for an edge case that was never written down.


Variable and method names can do that, and are more likely to be kept up to date than comments.


Standard HN car analogy time. "Look at the code" is like telling a car mechanic who is trying to replace a part to look real closely at a high-res picture of the factory instead of reading the Haynes manual.


I agree. Funny how people worry about documentation going out of date, but not code going out of date.

I guess it is more acceptable to have a bug in the code than a mistake in the documentation.


Exactly the opposite. The documentation gets out of date because people update the code without updating the documentation.


Yes, that is a problem. However, the question is which you prefer: no documentation, or documentation that may be out of date? As a detective (which, let's face it, is what an enterprise programmer's job is to an extent), it is nice to have some information you can scrutinize rather than nothing.



