
I contribute to the Linux kernel and one key part of it is its extensive documentation - one of the patches I made was to correct some invalid offsets there - there is so much that can be done there that isn't code.

If even the kernel, which is pretty specific low-level stuff, can benefit from that, then even more so can pretty much any other open source project, and it's extremely valuable for them! Good technical descriptions are surprisingly hard to find.




> Good technical descriptions are surprisingly hard to find

Totally agree. It seems many developers don't like writing documentation. Good documentation can save tons of people hours of work. It's a great way to make a positive impact on a project.

I'd like to hear counterpoints, but I feel that lack of documentation led to the demise of many open source projects.


The argument against is just about effort. Maintaining documentation takes a stupendous amount of effort. Largely due to the language it's written in: dynamically linked and typed, tons of overloading, no formal specification, and 7 billion different interpreters.

Writing documentation is relatively easy. Writing clear documentation is hard. Writing clear documentation and ensuring it stays up to date is, if not an order of magnitude, at least several times harder than doing the same thing in code. To the point where it's less effort to re-answer the same questions over and over in an issue tracker (and others can help keep track of your responses) than to maintain up to date docs.

My preferred approach, on smaller projects with limited numbers of people, is to push documentation as close to the code as possible. Prefer comments, tests, and types over standalone docs, in that order. Prefer not to duplicate the code (explain why, not what, is being done; consider using intermediate variables with descriptive names rather than a comment on "what"), unless it is a public interface and you want to restrict the intended behavior to some subset of the actual behavior. Your standalone docs are now about the high level stuff, filling in the gaps between file-level comments and directing the reader to the right files/code to look at. Generated documentation from the types and comments helps make this approach more readable.
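To make the "why, not what" point concrete, here is a rough sketch in shell. Everything in it is made up for illustration: the `is_stale` function, the 30-day threshold, and the reason given in the comment are assumptions, not from any real project.

```shell
#!/bin/sh
# Hypothetical example: the function name, the threshold, and the stated
# reason are invented for illustration.

# Instead of a "what" comment on an opaque test such as
#   [ $(( ($2 - $1) / 86400 )) -gt 30 ]
# the intermediate variables below name what is being computed.
is_stale() {
    modified_epoch=$1
    now_epoch=$2
    seconds_per_day=86400
    age_days=$(( (now_epoch - modified_epoch) / seconds_per_day ))

    # Why (assumed): the data is refreshed monthly, so anything older
    # than that is known to be outdated.
    max_age_days=30
    [ "$age_days" -gt "$max_age_days" ]
}
```

The only comment left inside the function is the one the code cannot express by itself, which is why the threshold is 30 days.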

Here's an example: a project I'm working on now has several shell scripts to automate common tasks (e.g. setup). Each script starts with:

    #!/bin/sh

    usage()
    {
    cat >&2 <<'USAGE'
    Description here
    USAGE
    }

    set -eu
    trap 'test $? -eq 0 || usage' EXIT

In other words, an explanation of how the script is intended to be used is embedded at the top of the file and printed any time there is an error (e.g. invalid arguments). And the readme clearly states that if you want to understand the scripts, you should go read them. This is so much easier to maintain than a comment in a standalone doc, which you KNOW I'm going to forget to update when making tweaks to the script. And although I'm not doing it now, it is machine-parseable if I wanted to generate docs from it.
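For the "machine-parseable" part, something like the following would probably do. It is only a sketch: `extract_usage` is a hypothetical helper, and it assumes every script opens its usage text with a line ending in `<<'USAGE'` and closes it with `USAGE` alone on a line, as in the snippet above.

```shell
#!/bin/sh
# Hypothetical helper: print the usage text embedded in a script.
# Assumes the heredoc opener line ends in <<'USAGE' and the terminator
# is the word USAGE alone at the start of a line.
extract_usage() {
    # The dots in the first pattern stand in for the single quotes,
    # which cannot appear inside this single-quoted awk program.
    awk '
        /<<.USAGE.$/ { inside = 1; next }   # opener: start printing after it
        /^USAGE$/    { inside = 0 }         # terminator: stop printing
        inside                              # print lines in between
    ' "$1"
}

# Demo on a throwaway sample script.
sample=$(mktemp)
cat > "$sample" <<'SAMPLE'
#!/bin/sh
usage()
{
cat >&2 <<'USAGE'
Description here
USAGE
}
SAMPLE
extract_usage "$sample"
rm -f "$sample"
```

Looping that over the scripts directory and redirecting into a file would give a generated doc page that cannot drift from the scripts themselves.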


That's a very good habit to be in. However, when it comes to external-facing stuff, e.g. APIs, or stuff that needs a lot more detail (and especially examples), you're going to need more than self-documenting code.

Of course getting out of sync is a real massive issue and a total pain. But that's another reason for people to contribute to such things :)


For APIs, there are tools to parse out comments (that the linter requires) into the API description. Doesn't give examples or much of a "when to use this" usually, but I've found it's better than nothing.

For me personally, even though I dislike these auto-generated docs since they're quite terse (and tend to display niche parts of the API just as prominently as the parts you likely want to use), I tend to find them much more reliable than more detailed guides precisely because they tend not to desync, so I often don't even bother with well-written guides anymore, since they are much more likely to waste my time (or at least be inaccurate).

I'd expect many other developers have the same level of trust issues I do, to the point that those shitty auto-generated docs are in some ways better than really nice, detailed docs, because they are easier to trust. I'm unlikely to break the habit of ignoring detailed docs/guides unless it's coming from some big project where they're uniformly good. Otherwise, reading them tends to be a waste of time IMO.


Good documentation is so important that I do not merge pull requests to my open-source projects unless they're accompanied by documentation (with the exception of minor cosmetic changes). If you can take the time to code up something, you can take the time to document what you changed -- and why.

It's a travesty that this is not the norm for major open-source projects.


I understand the sentiment, but for me, I actually prefer writing the documentation myself on my projects. It forces me to make sure that I understand the PR that's coming in, since I'm going to have to maintain it into the future anyway. I also want to maintain a consistent organization and style in my documentation that is difficult to explain to other people. So it's far easier to write it myself, than to ask the contributor to go through review cycles on the documentation.



