Best practices for writing code comments (stackoverflow.blog)
193 points by nsoonhui on Dec 24, 2021 | 188 comments



More often than not, I see code without any comments. There's this idea of writing self documenting code that really changed the commenting world.

And that whole thing was evangelized by Uncle Bob and the Agile wrecking crew. Before long it was bad to use comments, switch statements, or new up an object. This, in turn, led to the TDD movement, Agile only movement, enterprise patterns for all projects movement, and I'm sure there are others I'm forgetting.

Please comment your code. Tell me what you're trying to accomplish with this block of code. The function name doesn't always suffice. And I don't want to stare at it for 15 minutes, or re-format your 160 column LINQ statement, or Google your regex so I can read what it does on StackOverflow.

Even commenting pseudo code would be fine.


Yeah it's often really worth giving some examples of what you're trying to match (and not match) next to regex code.
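
As a rough sketch of what that can look like (the date pattern and the `is_iso_date` helper are made up for illustration, not taken from any real codebase):

```python
import re

# Hypothetical pattern for illustration: dates written as YYYY-MM-DD.
# Should match:      "2021-12-24", "1999-01-01"
# Should NOT match:  "2021-13-01" (month 13), "21-12-24" (two-digit year),
#                    "2021-12-32" (day 32)
# Known gap: days per month are not checked, so "2021-02-30" still matches.
ISO_DATE = re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")


def is_iso_date(text: str) -> bool:
    """Return True if the whole string looks like YYYY-MM-DD."""
    return ISO_DATE.fullmatch(text) is not None
```

The match and non-match lists double as a starting point for test cases, and the "known gap" line saves the next reader from discovering it the hard way.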


That's one of the real world mysteries for me.

* If you ever had to maintain a code base you didn't design (who hasn't?) how can you not want to have as much paper trail as possible? Especially when the original authors move on.

* I don't remember discussing documentation (or even testing) styles/preferences during interviews. I don't remember managers making it a priority or educating the unwilling, even in large companies. So it's clearly not important from the business perspective. The closest I can remember was a mad rush to create runbooks after a particularly nasty prod incident.

We live in a world of microservices. So you routinely work with multiple repositories. In larger companies developers routinely move to different teams in a couple of years. There's always another AWS service or non-relational data storage. It's interesting to notice that in other domains dealing with this kind of complexity and constant change it's expected to have written notes all the time. Think of medical doctors or aircraft maintenance crews.

When I was younger I kind of trusted that "self documenting code" promotion. As much as the Refactoring book was right this idea proved to be just wrong. Think about "business logic". Including the classical "converting XML/JSON to other protobuf/JSON". There's no grand theory here, just multiple confusing details influenced by previous versions, legacy ontologies, dependencies on other teams. Naming conventions won't help much, not to mention "the two most difficult problems in CS".

I think there's some correlation between poor documentation and missing tests. And the lame excuse is always the same - not having enough time. Even for obvious error-prone things such as calendar calculations or parsing deeply nested data received from the outside world. Or how to build and run a service.

Another correlation is between clear/structured thinking and how easy it is to explain the results. A reasonable functional decomposition and popular idioms/patterns/libraries documented elsewhere enable terse descriptions.

I see documentation as fungible. There are multiple somewhat interchangeable places information can be stored in. JIRA descriptions, commit messages, "javadocs", MD files (previously known as GOOG docs, wikis, Word documents). From what I've seen the people who have enough discipline to use one usually have others in place too.

Software development is very much a learning process. So other people will have to repeat it unless you summarize your findings for them. For some low-level details you could be one of them after not working on a particular component for long enough.


> Rule 6: Provide links to the original source of copied code

I learned this rule viscerally early in my career. Back in the golden age of Experimental Flash Art there was an enigmatic site called "flight404", and one day the site's author released source files for some of his more popular projects. All the Flash devs in my office started poring through them, and soon after my boss called me over to show me my own name in one of the comments!

Apparently the author (Robert Hodgin - whom I very much looked up to) had asked for help anonymously in a forum, and I had helped him out so he credited me in a comment (just for his own reference - in those days there was no Flash open source community and designers rarely distributed their source). That experience made me pretty obsessive about crediting outside influences in my source code whenever I get the chance.


I write my comments in commit messages because those are valid forever. A lot of times somebody will write a code comment, the code will be changed, but the comment not. This is a huge waste of time on so many fronts: writing the comment in the first place, and then confusing the subsequent developers with the wrong information. If you truly want to understand something, you can always check the change log, and find out why things are how they are.

Exceptions are e.g. if it's something exceptionally tricky or a hack of some kind that is kind of important. It doesn't happen all that much because the stuff I work on is simple. If I was going to do a lot of "commenting" I would prefer to write and update good documentation that gives an overview of how different things work together. The nitty gritty changes too often and is not that important in the grand scheme of things.


> I write my comments in commit messages because those are valid forever.

Don’t they disappear when someone squash-merges a branch where a file is both renamed and changed (a lot)? Or, at least, when somebody decides to move the code to another repo and doesn’t bother bringing the git history along.


Yep. I make a commit on every little change and every team I've been on makes me squash them on merge.

Comments in Git commits are bad. Just comment the code and make sure the comments are updated while you're in the code. You can also look at them in a code review. The argument that they get outdated is easily remedied, but people just want to keep claiming they write 'self documented code.'


Git commit messages and code comments serve very different purposes. I don't see any reason to try to use one for the other, or vice versa.

To me both are important but for different reasons. I don't want to search the blame history of a line of code if a simple comment had been enough, neither would I want to primarily use code comments when bisecting.


> make sure the comments are updated while you're in the code. You can also look at them in a code review.

People will invariably forget to update comments and unless the comments show up in context lines of the diff associated with a change, it's likely that reviewers will overlook the need to change them.


Your comment is a standard explanation for not having comments. It always makes me wonder: why aren't stale comments flagged during code review? Seriously, failure to update comments should eventually lead to constructive dismissal.


I don't think "constructive dismissal" is the phrase you want there. (That's normally associated with employer wrongdoing or, at a minimum, mischaracterizing the terms of an employee's departure.)


I don’t understand. Can you not squash your commits and rewrite the message before you merge? Wouldn’t that preserve the message? (Trying to think of a scenario where it wouldn’t…)


I dunno if he really meant that, or just said he is explaining the added functionality in detail. I do explain in the commit message the feature I have just added.


Everyone has to be onboard for that to work. Lots of people just never look at comments because they don't trust them.


> … when someone squash merges…

It depends on who the “someone” is I suppose.

I typically squash my own commits, and as part of that I aggregate all of the commit messages into a single message with all of the relevant details (and leave out the “fixed typo” type stuff that’s not relevant.)

From the complaints that people are raising here there must be dev shops where someone else decides to squash a bunch of commits and throw away the messages.


squash merge is evil


The author should squash useless commits. A readable series of commits is put to review. At merging to a commonly used branch no more squashing.


Why?

Why doesn't the author make commits that are not useless?

Why squash everything into a single commit? Why not 2 commits, 3, 5?


I feel differently, why don’t you like it?


The above-posted scenario is a great reason to hate it. It's destroying developers' documented rationale for each change.


it destroys information for no good reason


> I write my comments in commit messages because those are valid forever.

I try to write good comments and commit messages. Comments are typically along the lines of explaining what was done and how for a particular block of code or method. Header comments include a list of parameters, return values, side effects, class variables, etc. Commit messages explain what was done or changed and why.


> Header comments include a list of parameters, return values, side effects, class variables, etc.

These are only useful for public stuff, and for other cases only in special circumstances. Sadly a lot of companies make stylecheck require them on every little private method, which makes each file 50% longer for no reason and drowns the really useful comments in noise.

> Comments are typically along the lines of explaining what was done and how for a particular block of code or method.

The problem with that is that the block of code that you explain will often call other code, and then that other code will change for other reasons (preserving the correctness but not the initial design that was in the comment) and then that explanation higher in the callstack won't be true anymore. It's nice to pretend we always check everything that calls our code all the way up when we change stuff, but in reality if tests pass and the code works - we often don't check for comments up the callstack. So the comments will drift away from the truth with time.

Comments in commit messages are much more likely to be true than static comments in code (for example if you refactored some method signature as part of the change - every call site will have the updated commit message automatically). If you comment in the code manually - you will probably not notice that you have to change a comment block 5 lines above the changed function call - it won't even show in the diff - so the comment block won't be true anymore.


This sounds like a recipe for write-only comments.

The point of a good comment is to guide later development - to say "here's something you should know before editing this code". If you put information like that in commit logs, then that implies that anybody who updates any part of that codebase must first read every log for every commit near what they're editing. If they don't (and they presumably won't), they'll never see that "here's why we're not doing X" comment before they change the code to do X.

Meanwhile, it's true that comments can drift away from truth over time, but putting them in commit logs doesn't change that in any way. An inline comment that's stale can be updated (and doing so is part of the job!). But a commit log comment that's no longer accurate will stay there forever, waiting to mislead anyone who finds it. The promise that it was accurate when originally written is of no practical use later on.


Same thing here. At work I can see the history going back to 1995. And that's after migrating from something to Mercurial, and from Mercurial to git. Maybe that's not the case in all companies, but in the one I work at, the commit history is the longest-lasting information trail.


> I write my comments in commit messages because those are valid forever.

Only until someone moves a file to a new directory. Now this file shows up as a new file with no history.

Also, you hope to never change your version control system because that change will erase the history.

Relying on commit messages is sometimes just not good enough.


> Only until someone moves a file to a new directory. Now this file shows up as a new file with no history.

Git will usually be able to link the two back, unless the move was combined with a lot of changes, as it does not record moves but infers them.

Even if it can't link them, the "creation" commit will visibly remove the old file, at which point you can... log / blame on the previous location and keep going.

> Also, you hope to never change your version control system because that change will erase the history.

Of course not, there are conversion tools between basically all VCS, and anyone tasked with such a migration who is not a complete goober will use them in order to maintain the historical record.

At $dayjob we've got history spanning over 3 different VCS and more than 15 years, and that's including weird stuff like splitting and merging repositories.


That's why renames should be a discreet commit and why squashing is bad


Making a rename in a discrete commit is less discreet (and therefore preferred).


No it doesn't. At least git tracks and follows renames.


So if you added several classes and functions, you describe them all in a single commit message? Probably you don't document them at all.


That's a great idea. I've been doing this just to rationalize my laziness, but now it makes sense. However, I'm not sure those are valid forever as you can simply delete the .git folder, can't you?


> However, I'm not sure those are valid forever as you can simply delete the .git folder, can't you?

You can also format your hard-drive, yes.


and burn down the stupid remotes


Commit messages don't necessarily last as long as the code that they comment though.

Changing revision control system or copying code from one package to another can lose them.

The only durable documentation I've seen is in the source code


That’s horrible because the git commit messages are easily lost, disconnected or hard to find in any reasonably active codebase. For example as soon as you do a change and move a file it almost always disconnects from the previous change history.

What's even more difficult is searching through a code base when the documentation isn’t in or near the code. I don’t know any IDE or editor that makes it easy to search through git commit messages and source code at the same time.

On top of that, do you review git commit messages in code review? Do you ask people to improve descriptions, typos and language in commit messages?


I don't understand how they are hard to find. It's trivial to find where the file was moved from.

What I truly care about is why something is the way it is, what is the rationale behind it, how it works with other parts of the codebase, what problems it solves, what is tricky about it, what to pay attention to and so on. I don't care at all about the code that was written and an explanation to it because this I can read myself in the code. The best place I've found for this is a code commit because it can tie different parts of the codebase together and add a lot of context to a change. I commit heavily and don't squash. A long comment in a commit that contains all file changes related to a certain feature, bug or whatever adds a lot more information than a comment in one file. When other people do it, it helps me a lot more than chasing their (outdated) code comments throughout the codebase.

But if that doesn't work for you, then don't do it. Just don't be dogmatic and dismissive. I accept there might be situations and codebases where this doesn't work.


I'm lost. Are you talking about meta-level comments, like "here's why this change is being made", that aren't directly tied to any particular line of code? If so, certainly putting that stuff in commits makes sense.

But for regular "explanatory note about this variable/function/etc" comments, how does one work if those things are in commit logs? If you're reading code and something is unclear, do you look back through the commit messages of every commit that's ever touched that line, just in case one of them has something relevant?


IDEs can show commit history. But you're right that it can get messy tracing back to the original commit when a file has been changed many times.

But for the most part I feel like comments should be automated tests. If a line is there for an edge case it should have a matching test for that edge case.

The only exceptions I see are for performance optimizations or some other situation where you can't easily test.


> IDEs can show commit history.

How you check the commit history is beside the point - the question is, do you read the entire commit log history for a chunk of code before editing it?

If not, any comments there might as well not exist - functionally speaking you're maintaining an uncommented codebase.


Tests are not a replacement for comments or documentation. Tests check that A() returns x but don't explain what is A, x and why it should return it.


That goes in the test name. You'll have something like

Feature A

"It should return X in [some edge case]"

That's where any link to a use case should be.
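
A minimal sketch of that idea using Python's `unittest`. The `shipping_cost` function, its rates, and the use case in the comment are all hypothetical; the point is only that the test name states the edge case being protected:

```python
import unittest


def shipping_cost(weight_kg: float) -> float:
    """Hypothetical function under test: flat fee plus a per-kilogram rate."""
    if weight_kg <= 0:
        # Edge case: non-positive weights cost nothing instead of raising.
        return 0.0
    return 5.0 + 2.0 * weight_kg


class ShippingCostTest(unittest.TestCase):
    def test_returns_zero_for_non_positive_weight(self):
        # Use case (hypothetical): orders that contain no physical parcel.
        self.assertEqual(shipping_cost(0), 0.0)
        self.assertEqual(shipping_cost(-1), 0.0)

    def test_adds_flat_fee_to_per_kilogram_rate(self):
        self.assertEqual(shipping_cost(2), 9.0)
```

Note that even here the link to the use case ends up in a comment inside the test, so this moves the explanation next to the check rather than removing the need for it.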


Lots of IDEs and external tools do very basic git integration ¯\_(ツ)_/¯

I think JetBrains is probably the best of the IDEs


Your comment applies very little to commit messages, and much more so to comments.

> That’s horrible because the git commit messages are easily lost, disconnected or hard to find in any reasonably active codebase. For example as soon as you do a change and move a file it almost always disconnects from the previous change history.

Learn your tools or get better ones: `git log --follow` has no issues with renames, and when files get munged in ways it can't handle (e.g. content is split out or merged) it's easy enough to stitch back. Good annotate UIs (JetBrains' is stellar and one of the few things I don't use magit for) make flitting through a snippet's history trivial.

Meanwhile finding removed comments is nearly impossible (VCS are nowhere near as good for finding when something was removed than when it was added), and comments can easily drift apart from their point of origin as developers aren't too careful about maintaining them when adding unrelated comments.

> On top of that, do you review git commit messages in code review? Do you ask people to improve descriptions, typos and language in commit messages?

Bet your ass I do.


> I don’t know any IDE or editor that makes it easy to search through git commit messages and source code at the same time.

Any editor from JetBrains with the GitToolbox plugin does that.


Can second that. Having the commit message as a virtual comment on the end of the current line is immensely helpful in tracking changes. It’s like git blame on steroids.


I also like seeing the commit messages in this way! In the case of VSCode this comes with the GitLens extension.


  git log --follow -p file
edit: here's another

  git log -p -L:show_commit:builtin/rev-list.c


> On top of that, do you review git commit messages in code review? Do you ask people to improve descriptions, typos and language in commit messages?

Absolutely I do. The commit message is part of the commit just like the code; why would it be excluded from review? The number of times that a good commit message has helped me when dealing with a bug, and that a poor one has stymied me, have firmly convinced me that they are just as important as any other project documentation. They should be clear and informative, and I ask for those aspects to be improved when needed.


I’ve been misled by comments so often that I now literally don’t see them. They’re like banner ads on websites. My brain just doesn’t register them anymore.

They get orphaned by slightly wonky merges. The underlying code gets updated or refactored, but the comments remain. When they are correct, they’re useless 90% of the time (at least in codebases whose linter requires doc comments). Even the accurate comments tend to drift with age and become inaccurate unless they’re carefully maintained (which they almost never are).

It’s really hard for me to figure out the balance.

For exceptionally good codebases (Redis, SQLite come to mind), the comments are a godsend.

For mediocre codebases, the comments are largely a waste of time at best, misleading and time-wasting at worst. And most of us, I suspect, are working on mediocre codebases.


Rust libraries use comments for generating documentation and testing. Here is an example of such documentation: [1]. It is difficult to believe that this code would be better without comments.

At least classes and public methods should have comments (except trivial ones).

[1] https://docs.rs/chrono/0.4.19/chrono/naive/struct.NaiveDateT...


Haskell does this as well, using Haddock [0] — e.g. all content on this page is generated from comments in the standard library: https://hackage.haskell.org/package/base-4.16.0.0/docs/Prelu...

[0] https://haskell-haddock.readthedocs.io/en/latest/


What is generating this HTML page? I'm always looking for something like this that can help automatically generate documentation from code comments; it's very easy to get buy-in for "update the docs when you update the code," much less so for "go through the comments for every piece of code you touch and double check they're still relevant." It's one of those things everyone nods at then promptly ignores.

Something like this with a discrete task that can be checked on code review would help immensely, I think.


It's rustdoc: https://github.com/rust-lang/rustc-dev-guide/blob/master/src...

> Rustdoc actually uses the rustc internals directly. It lives in-tree with the compiler and standard library. This chapter is about how it works. For information about Rustdoc's features and how to use them, see the Rustdoc book. For more details about how rustdoc works, see the "Rustdoc internals" chapter.


`cargo doc`

docs.rs runs it for every crate (rust library) uploaded to crates.io.

When actually working with rust, I recommend running `cargo doc --open` semi frequently. It generates that documentation from all your dependencies, and your own code, links it all together (with search no less) and opens it in a browser.

Note that it only uses doc comments, which are distinct from normal comments (/// instead of //, only allowed in certain places, //! instead of /// to apply to the parent item instead of the next item).


I'm interested to know the answer as well. I've used org mode in the past for literate programming and document generation, but I'm starting to think that generating docs from comments is also a good option.

Adding a mechanism to publish the docs (maybe part of CI/CD?) would make maintenance way easier, it's something I would love to test out.


I’ve been working on a tool to do this based on a simple Python file watcher, web server and https://casual-effects.com/markdeep/ It just turns regular code into markdown code blocks and /** comment contents pass through to the markdown —> HTML pipeline. A little JS websocket and you get an auto-updating preview from any editor to any browser every time you ctrl-s.

Not much time to work on it lately. But, I’ll Show HN when I put it on Github.


Heavily used code will evolve good comments, as a product of multiple smart people using and improving the same code.

Bad codebases are ones that basically are throwaway or depreciating assets. As you say, we practically all work on these except the lucky few.


I miss the most important one. Explain business reasons.

Code does what it's written for. But it does not explain intent. Write that in a comment, and link to the relevant ticket and document.


This is what source control blame is for. It’s very hard to estimate what business reasons will be important to readers of code in the future.

I very often nudge people to remove their comments entirely. Less experienced devs often write comments to explain code, instead of spending time on making the code itself readable/understandable. I often ask: “Can you modify the code such that the comment will become obsolete?”


Hard disagree with this.

Yes, code should be easy to understand.

But well-written comments that explain assumptions and intent help as code evolves over time.

Also, comments are quicker to scan than code itself.

A good comment can indicate code that can be ignored for the purposes of certain troubleshooting.

Finally, junior developers usually write code that is difficult to understand and don't comment.

I've rarely ever seen a junior developer that both writes unreadable code and takes the time to write comments.

Usually comments are a sign of seniority, someone who has pity on those who will come after him/her.

And thus that person tends to also write understandable code.

Perhaps there's an uncanny valley in between where comments are a band-aid for complexity, but I have never observed that in years of working with many developers.


Exactly, it's better to document on the VCS.

Comments there will be tied to that version of the code, can never go out of sync.

It's also good to learn how that piece of code evolved and what reasons to change. Avoids repeating past mistakes.


Oh no, never link to tickets or docs that are not version-controlled in the same repo. Usually code tends to outlive the tools for organisation.


So should we live in darkness now because the link might break 3, 5 or 10 years later?

It's also a misguided tools upgrade if they have no way to redirect, convert or otherwise handle old links.


Link to things in the same repo, like a wiki or external document (if you're linking to external stuff anyway). You have to assume whatever you link to will disappear without notification. At least if it's in the repo, you can check out the old commit and get the file back, and hopefully find the commit where it was removed and see why it was removed.

If you link to Sharepoint or something it's useless as soon as that link changes.


In my experience, then, most tools upgrades are misguided.


No we should live in light!

But make the boss understand...


But even if the link eventually dies, you have the business explanation.

Normally the comment is an explanation of the intent but the referenced ticket has the full discussion and backing data that led to the decision


I always say more generally “explain the why”, business OR technical reasons.

Bigger architecture decisions should go in ADRs (again, explain the why) but smaller stuff like explaining why you monkeypatched a library can save future devs a lot of lost investigation time and pain. Maybe by the time they are reading your comment/code, the patch you wrote is supported in the main library!


One of the most transformative features we added to the D language was Ddoc, which is a documentation generator for functions. It has a modestly standard format. The result is it applies significant pressure on the coder to add the Ddoc comments, and a routine change request review of a PR is "please add function ddoc comment".

Before Ddoc, the D standard library documentation was inadequate, totally wrong, or missing entirely. After Ddoc, it became reasonable (though no documentation is perfect). A further improvement was the ability to actually run the example code in the Ddoc comment as a unit test.

The end result is the entire documentation of the D runtime library is generated by Ddoc from the source code.


I decided to go looking for this to check it out. To save someone else that websearch: https://dlang.org/phobos/index.html


Doesn't every language have a feature like this these days? If not an official standard, then a commonly used third-party tool.

Though embedding unit tests in API documentation is less common; the only other languages I'm aware of that support it out of the box are Python and Rust. A Google search turns up some implementations for other languages, like C++ and Haskell, but I don't know how widely used they are.

Regardless, I definitely agree that it's a very important and useful feature.
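
For reference, this is roughly what the Python version looks like with the standard library's `doctest` module. The `clamp` function is just a made-up example; the `>>>` lines in the docstring are both the documentation and the test:

```python
import doctest


def clamp(value: int, low: int, high: int) -> int:
    """Restrict value to the inclusive range [low, high].

    >>> clamp(5, 0, 10)
    5
    >>> clamp(-3, 0, 10)
    0
    >>> clamp(42, 0, 10)
    10
    """
    return max(low, min(value, high))


if __name__ == "__main__":
    # Runs every ">>>" example found in this module's docstrings and
    # reports any whose actual output differs from the documented one.
    doctest.testmod()
```

If someone changes `clamp` without updating the examples, the run fails, which is exactly the pressure to keep docs current that is being described here.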


Doxygen existed for D at the time we developed Ddoc, and some people used it, and so I was constantly asked "Why not just use Doxygen?"

The problem with Doxygen was, it wasn't installed automatically! It was extra work to go get it and install it, so it never happened. What a builtin Ddoc does is:

1. It's always installed

2. It's always matched to the current compiler (one never has a mismatched set)

3. It is standardized (code using DocX is not mismatched with other code using DocY)

4. It can take advantage of the compiler's available semantic information

5. Don't have to beg documentation tool vendor to add features we need. For example, Markdown support was recently added. Didn't have to ask or beg. A Ddoc user simply added it

These advantages are enormous and transformative. Minimizing friction matters a great deal.

C and C++ do not have builtin documentation generators. I ask you, of professional C/C++ code you've seen, how many consistently and properly documented the function interfaces? In my experience, it's rare. It's much more common in D, and that's entirely due to Ddoc.


One that's missing: comments should explain why a piece of code exists or is written in a certain way (and implicitly, when it can be changed or removed). This overlaps with "explain unidiomatic code in comments", but there can be idiomatic code whose purpose isn't obvious.


+1 for this. It also links to my #1 rule for comments: why not. It applies when:

1. There's a chunk of code that, on first reading, could be clearer/simpler/more idiomatic.

2. There's a good reason not to use the obvious approach, and do something else instead (maybe performance).

Then comment to explain why the obvious path wasn't taken. No matter how well written, code alone can never explain "why not". I've found this invaluable, even looking back at my own code.
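
A small sketch of a "why not" comment. The function, the default value, and the meaning of a zero timeout are assumptions invented for the example:

```python
from typing import Optional

DEFAULT_TIMEOUT_S = 30.0  # hypothetical default, for illustration only


def effective_timeout(timeout_s: Optional[float]) -> float:
    """Return the timeout to use, falling back to the default if unset."""
    # Why not `return timeout_s or DEFAULT_TIMEOUT_S`? It is shorter, but
    # `or` treats 0 as "not set". Here 0 is assumed to be a valid value
    # meaning "do not wait at all", so only None falls back to the default.
    if timeout_s is None:
        return DEFAULT_TIMEOUT_S
    return timeout_s
```

Without the comment, the explicit `is None` check looks like something a tidy-minded reader could safely collapse into the one-liner, which would quietly turn every zero timeout into a 30 second one.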


“Why not” is one of the top reasons to write a comment. It should be discussed more. Many times I've encountered a piece of code which could have been much simpler or idiomatic. Upon rewriting, I discovered that it didn't work for some obscure reason. If there's no comment explaining “why not”, many others (including future me) could lose time trying to do the same.


Man, I've been bit a few times with this. I'll read code and think, "Why the hell did they do that?" Later, when I'm almost done re-writing it, I see the edge case they were working around.

At the very least, comment these situations.


I think "why not" is one of those things which definitely does not belong in comments, if you regularly write up such justification, your code becomes mostly comments. Even more so than "positive" comments, "negative" comments belong in the commit message.

Possibly unless your entire codebase is literate, and code is secondary to comments.


honestly it needs both, but if you only put it in 1 place put it in code.

During development you are most likely looking at commits or PRs, so that makes sense.

But if it's a long-lived piece of code, people will get to your code by following function/method chains or just browsing the source, not commits. While you can use git blame, figure out the commit, and then read the last few commit messages, putting a comment on the code is easier on everybody.


I once worked in a codebase full of such comments, they were all like

// Adding this because XYZ said so


Sounds like a plain-language version of "git blame"!


I've also seen it where every comment was just who made the code change and the date. "John K. 10/13/1994"


When I started many aeons ago, and did not know what I was doing, I was tempted to document the language itself:

inc al ; add one to al register


I like these rules, but if I were writing my own set, the first one would be that the most important comments you write are often those describing persistent mutable state. Often the point of keeping an object's members private is to preserve the parity among them, and explaining it where they're declared can save everyone a lot of trouble. Also, if you've got a state machine, document the semantics of all the different states.
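As a rough sketch of that idea in Python (the Connection class, its fields, and its states are hypothetical, invented only to show where such comments would go):

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"        # no connection; _socket_id is None
    CONNECTING = "conn"  # handshake sent; _retries counts attempts so far
    READY = "ready"      # handshake done; _retries has been reset to 0


class Connection:
    def __init__(self):
        # Invariant: _socket_id is None if and only if _state is IDLE.
        # Invariant: _retries > 0 only while _state is CONNECTING.
        self._state = State.IDLE
        self._socket_id = None
        self._retries = 0

    def connect(self, socket_id):
        self._state = State.CONNECTING
        self._socket_id = socket_id
        self._retries += 1

    def handshake_done(self):
        if self._state is not State.CONNECTING:
            raise RuntimeError("handshake_done() outside CONNECTING")
        self._state = State.READY
        self._retries = 0

    def close(self):
        self._state = State.IDLE
        self._socket_id = None
        self._retries = 0
```

The comments sit at the declarations, so anyone adding a new method sees which relationships between the fields must survive it.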


I write comments first, then fill in the code later, adjusting the comments as I learn better ways to do things. That way, the comments are like a guide to anyone as to the goal of a section of code. I comment about every 2 or 3 lines of code, or more sometimes. I even comment on things that everyone would easily understand. My comments are basically a plain English version of the code. Functions and such have comments or docstrings that explain the function and the basic steps of its functionality, so that’s about 2 times that I explain things in my code. I’ve never had anyone say I comment too much. When reading uncommented code, I wish there were more comments sometimes to explain what each variable does or is. My variables are often named with 4 or 5 words, connected by underscores or title case. I often see variables that aren’t named well and try to avoid that myself.


Are you concerned that you are introducing tech debt into the codebase? Anyone who refactors your code later will also have to refactor your comments (but likely won't).


If they don’t refactor my comments then that’s their lack of care, I’m not responsible for that even a little bit.


Kernighan's law is fun.

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.

If effort is linear then, by definition, you should never put more than 50% brainpower in to your code.

I'm not sure I want to increase the effort I put in though.


Only if complicated is considered clever. Clever might actually make things simpler.


Reminds me of this quote by Alan Perlis [1].

> Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it.

1. http://www.cs.yale.edu/homes/perlis-alan/quotes.html


Only in the mind of the person writing it.


I believe "cleverness" refers to ingenuity; it's outside-the-box, and trades simplicity in one plane for sophistication in another.

However most "clever code" we come across is lopsided in this tradeoff, and might even require cleverness to understand, which it turns out is not very clever at all.


All these discussions about comments miss the most important point of why we need comments: To aid us in understanding the code.

My only rules to write comments are:

- Add “why” comments when you write the code

- Add all the other comments when you read the code, and don't understand


I wish I could make this the accepted answer


"Rule 6: Provide links to the original source of copied code."

StackOverflow looking to get more backlinks from GitHub/GitLab :)


Also, find the real source of what you're copying. The formula they cite for brightness is a rounded version of BT601 luminance. Citing the standard in the comment is way better than just linking a random SO answer.

Links are mutable, links die, SO answers can be edited. Include any information needed to understand the code into the comment and proper copyright acknowledgements if you copied it (assuming the license allows it).


Yes! The comment may not even have been necessary, if you extract into a function calculateBT601Luminance(red, blue, green).

Then you can link the standard, or at least the Wikipedia page, and I would lean towards this if I don't expect readers to know a bit about the domain. But if you don't, someone can still find an authoritative source with a single search.
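A sketch of that extraction in Python, assuming the inputs are gamma-encoded R'G'B' components on whatever scale the caller uses. The function name is illustrative, and the argument order here is red, green, blue:

```python
def bt601_luma(red, green, blue):
    """Return ITU-R BT.601 luma for gamma-encoded R'G'B' components.

    The weights are the BT.601 luma coefficients: 0.299, 0.587, 0.114.
    """
    return 0.299 * red + 0.587 * green + 0.114 * blue
```

With the standard named in the function and its docstring, a bare link to an answer is no longer the only trail back to the source.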


What search engine prioritizes links in comments or code blocks as any sort of signal?

I have no particular attachment to SO, especially after the way they handled their public drama recently.

That said I put a link to SO any time I have to look something up there and it’s not immediately obvious from the naming/docs why it does what it does. I also try to sum it up in a sentence or two if I can and if it doesn’t distract from understanding the larger goal of that section of code.


Sure, but also it may be worth noting that the content license on Stack Overflow posts requires this. And it's the poster's content, not Stack Overflow's, so there's an element of respecting a fellow coder who helped you. In fact, most open source licenses require attribution at a minimum.


That was my first cynical thought. It is a good rule regardless.


When you solve something in a weird way, leave a link to SO in a comment.

Write TODO and NOTE and use a tool to find all your special comments that show unfinished features and investigations. Scan through regularly to make sure the comments still make sense.

Comments not containing special strings should just be "why" explanations, e.g. "we sort the bids in the reverse order to the asks because the best bid is the highest price". So generally something where the code has special cases that are explained by the domain.
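The "tool to find all your special comments" can be as small as a regex scan. A minimal Python sketch, assuming markers look like `# TODO: ...` or `// NOTE(name): ...` (that format is an assumption, not something the thread specifies):

```python
import re

# Matches "# TODO: ..." or "// NOTE(name): ..." style markers.
MARKER = re.compile(r"(?:#|//)\s*(TODO|NOTE)(?:\(([^)]*)\))?:?\s*(.*)")


def find_markers(source):
    """Return (line_number, tag, owner, text) for each marker comment."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            tag, owner, text = match.groups()
            found.append((number, tag, owner, text.strip()))
    return found
```

Running something like this over the tree on a schedule gives you the list to scan through regularly.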


Within the domain of electronic trading it should be obvious that prices are ordered from best to worst, and that it means increasing order for asks and decreasing order for bids.

You shouldn't comment on things that are obvious within your domain.

Now you may put a comment if instead you sort things in the reverse of the usual order, for example so that you can implement adding/removing a price level at the top more efficiently with std::vector (which is only efficient for additions/removals at the back).
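A rough Python sketch of that situation, with a Python list standing in for std::vector (both only add and remove cheaply at the end). The AskLevels class and its methods are hypothetical:

```python
import bisect


class AskLevels:
    """Ask prices for one instrument."""

    def __init__(self):
        # Stored as NEGATED prices in ascending order, i.e. worst ask
        # first and best (lowest) ask LAST. This is the reverse of the
        # usual best-to-worst order, on purpose: most updates hit the
        # top of the book, and a list only adds/removes cheaply at the end.
        self._negated = []

    def add(self, price):
        bisect.insort(self._negated, -price)

    def best(self):
        return -self._negated[-1]

    def pop_best(self):
        return -self._negated.pop()
```

The ordering itself is domain knowledge; the comment only covers the part that deviates from it.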


> Now you may put a comment if instead you sort things in the reverse of the usual order, for example so that you can implement adding/removing a price level at the top more efficiently with std::vector (which is only efficient for additions/removals at the back).

Though in that case you should probably have a comment explaining why you didn't use an std::deque.


I'm not sure what such a comment would say?

The advantages of vector over deque should be well known to any C++ programmer.


It's more that there's an asymmetry and sometimes that raises a question. You're right I could have picked a better example.


There are three kinds of comments:

1/ API documentation, which is a must unless you can cover everything with examples, which are better.

2/ Internal comments explaining how things fit together. I don’t do any of this any more. If my code doesn’t make this obvious, my code is wrong and gets refactored and functions get better names.

3/ Warning signs. Invaluable! “You might think that this is wrong and change this to use / instead of //. Nope! Don’t make that mistake!” kind of thing. Few and far between, hopefully.
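A warning sign of the third kind might look like this in Python (the helper is made up; the `/` versus `//` behaviour is standard Python 3):

```python
def middle_index(low, high):
    """Return the midpoint index between low and high (inclusive)."""
    # WARNING: keep "//" here. "/" looks equivalent but returns a float
    # in Python 3 (e.g. 7 / 2 == 3.5), and a float cannot index a list.
    return (low + high) // 2
```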


When I find APIs that show a few examples but miss detailed specs I always miss the latter.

I agree examples are great for a variety of purposes, but they're no substitute for detailing the API endpoints, authorization mechanism, data types, etc.


I often include links to the section in the online D ref spec that defines the behavior the code is implementing. This turns out to be very handy.

I wish that could be done for C and C++. Too bad it can't, because of copyright issues. I don't link to online descriptions of the C std library, because I've found errors in those online rewrites (rewrites because of, again, copyright issues).

So, for C and C++, I just cite the paragraph number in the standard.


If you are parsing/manipulating a particularly hairy data structure, try to simplify it. If you cannot, put a comment with a simplified example of the input/output data structure(s) so that the next developer (who may be you a few months down the line) has something visual to match the code against instead of having to imagine everything.
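For instance, a small Python sketch where the docstring carries a simplified input/output example. The payload shape and the function name are invented for illustration:

```python
def emails_by_team(payload):
    """Group member emails by team name.

    Input (simplified):
        {"teams": [{"name": "ops",
                    "members": [{"email": "a@example.com"},
                                {"email": "b@example.com"}]}]}
    Output:
        {"ops": ["a@example.com", "b@example.com"]}
    """
    return {
        team["name"]: [member["email"] for member in team["members"]]
        for team in payload["teams"]
    }
```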


Yes! I love the data input/output example in a comment. So few developers do this


If you’re not a native English speaker, do you always write comments in English, or in your native language? I personally see the latter as bad practice - in the 21st century it’s almost impossible to expect that your code won’t ever be read by a foreigner. But I wonder how you feel about it?


I am not an English speaker, but everything that ever went into my code was always in English, including names of variables, types, etc. Programming languages are in English and there is something that irks me in having another language mixed in.


Agreed. But I also discovered that in some domains local jargon doesn't correspond well to its English translation, or is not available at all, which is likely the case in law and accounting. In such cases, I go with the local language.


Then it’s important to add a good English comment explaining why the local jargon is used, right? :-)


Right :)


Exactly. I spoke about comments only. I can turn a blind eye to an occasional non-English comment or short note here and there. But naming variables and functions in a local language is an inexcusable sin. I am from the small country of Czechia with 10 mil people. Naming a function in Czech is utterly stupid. Even if you’re the sole developer, there’s a good chance you’ll want to paste a snippet to stackoverflow, sooner or later have a collaborator from nearby Ukraine or Pakistan (very common), or god forbid! - your product will be so successful someone will want to buy it. Good luck if the code base is littered with a language nobody speaks :-)


The problem with Czech is our extended alphabet and, more importantly, conjugated forms of verbs and nouns. You can't make the code sound good without proper conjugation, but of course no mainstream programming language supports it.


> do you always write comments in English, or in your native language?

Same language as the codebase, so usually english.

It can make sense for the codebase to use local naming conventions e.g. for legal, accounting, or administrative concerns: the ideas and concepts don't necessarily translate easily (or at all) and all the reference documents are in the local language in which case the codebase will probably be better off using the local language, and both comments and commit messages should match.


At least here the working language is English: comments, Jira tickets, code and so on. Even when the original team is all native Finnish speakers, the next guy might not be.


Code (variables, ...) are always in english, they are most of the time shorter than german words.

Comments are in my native language, if I am absolutely sure that this code will not be used by any other people.


You can never be sure :-) Don’t you paste snippets to SO, GitHub issues or forums, when seeking help?


In Rule 6 (Provide links to the original source of copied code) the article says:

> People copy a lot of code from Stack Overflow questions and answers. That code falls under Creative Commons licenses requiring attribution. A reference comment satisfies that requirement.

This is incorrect (or, more accurately, not enough). The license is CC-BY-SA: the BY part requires attribution, but the SA part also requires that you share your own code.


I'm not sure why this has rule 8. Don't do this. This is handled by git blame and PRs.

In regard to rule 5, I've found it's a bit more nuanced than:

> Without the comment, someone might “simplify” the code or view it as a mysterious but essential incantation. Save future readers time and anxiety by writing down why the code is needed.

What is idiomatic? Well that depends on nested organizational requirements merged with some community merged with developer experience.

I have some methods:

    public void doSomething() {
        myType foo = createType();
        foo.monitor();
    }

    public myType createType() {
        return new myType();
    }

There are no comments. What's idiomatic about this? Well the doSomething tests needed a mock, so we get a random create method. Why did the doSomething tests need a mock? Because the organization wants code coverage this way. You have to assume, because of company policy, there's tons of these things everywhere. I hate the term "idiomatic" when it's more subjective than anything else.


Rule 8: Add comments when fixing bugs.

>I'm not sure why this has rule 8.

It's oriented toward maintainers. Hence little attention to larger architectural questions or business strategy.


John Ousterhout spent a (short) chapter on writing comments in “A Philosophy of Software Design”. The whole book is in my opinion a must read and gives advice rather than claiming it has all the answers. Back to the original topic: it includes how and when to write comments.


I believe random comments often become extra clutter and diversions from actual code. Code should have short, succinct names that explain in context.

What we should do is document the code: short and concise descriptions of classes, methods and functions. This will then act as a reference when names inevitably fall short.

From this discipline, comments above code-blocks should explain what's missing in code to a future reader - probably yourself even. But a basic explanation of "What the heck is this? What is it for?" might be in order, if not already given.

How to implement this depends on needs and tooling.

The bigger picture belongs in design documents, with references to components.


One instance where I've found writing lots and lots of comments helpful is for functions with code that I'm writing for the first time. It's like rubber-duck troubleshooting - making myself explain what I'm doing in plain language actually helps me reason about it. I tend to leave those comments in because then the next person who works on it understands why I did something in a particular way. This perhaps results in superfluous comments, but you're never going to get it exactly right, and I think erring on the side of too many comments is probably better than too few.


Are you familiar with Literate Programming? https://en.wikipedia.org/wiki/Literate_programming


Thanks for the pointer - I had read about it, but hadn't put two and two together. The kinds of things I work on (cacheable websites) don't need to be highly tuned for back-end performance, so I guess it's easier for me to do that.


Lit style probably doesn't have much impact on performance; it's handled by the parser, mostly by ignoring the comments.

It can impact the memory of runtime-oriented languages, especially if they keep the text around (e.g. for reflection or whatever) but that's about it.


I do this in a markdown file, and then check that file into a /docs folder. That way, it doesn’t muddy the code, and even though it becomes inaccurate over time, it’s still handy for a “what was I thinking?” check.


The problem with lots of comments is not primarily clutter, but as quoted from the article

Writing and then maintaining comments is an expense. Your compiler doesn’t check your comments so there is no way to determine that comments are correct. You are, on the other hand, guaranteed that the computer is doing exactly what your code is telling it to.

Unless your team has the discipline to maintain them, don't litter the code with comments. Put it in the commit message. Obsolete and incorrect comments are just confusing.

What I sometimes do is write comments before I implement something. This provides a clear idea of what I need to do and where. Then when it's done I cut it into the commit message or a note.


My rule of thumb: comments should tell you WHY; code itself should tell you HOW.


Say what you want, but I'll continue to write comments as much as I can. They help me think about what I wanted to code and in some cases even helped me realize a simpler way or even find unexpected cases. I also find it very useful when reviewing code to know what I wanted to make and quickly jump to the block I'm looking for thanks to the header comments. Reading uncommented code is like having all the comments here on HN one after another without any separation or formatting, hard to search and understand.


My favorite article of all time about code comments: http://antirez.com/news/124

I regularly recommend it to colleagues who make the bold claim that "No one needs comments. Good source code can and should document itself". The article covers pretty much all the types of comments and rules discussed in the OP and others not mentioned there.


"Best practice" is a horrible term. Best for what mix of goals and situation? Is it REALLY the best EVAR and every single alternative has been checked and evaluated?

It's weasel words, intended to give more credence to the advice than it has proven.

As for these recommendations, these aren't bad or good, even worse, these are mediocre.

Want my advice? Have your logging statements double as comments. Logging and comments both should concentrate around difficult code. So why double them up?
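Roughly what that could look like with Python's standard logging module. The discount rule, the logger name, and the function are hypothetical:

```python
import logging

log = logging.getLogger("billing")


def apply_discount(total, customer_years):
    """Return the total after the loyalty discount, if any."""
    if customer_years >= 5:
        # The log line states the reason, so no separate comment is needed.
        log.info("Applying 10%% loyalty discount: customer for %d years (>= 5)",
                 customer_years)
        return round(total * 0.9, 2)
    log.debug("No discount: customer for %d years (< 5)", customer_years)
    return total
```

The message explains the branch both to whoever reads the source and to whoever reads the logs later.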

Javadoc was the last "good" idea in comments I ever saw: generate documentation from comments. Unfortunately it didn't provide enough autogeneration abilities or semantics to track evolving code. Nor do comments and javadocs integrate git history or help indicate heavily modified and evolved code, something I think could also be done.

I believe Rust enables some use of markdown in comments as well, that is a good idea.

IDEs, even intellij-level, don't really help out with comments and doc-comments much either.


It's a great article.

For myself, and my approach, I'm always a bit leery of "hard and fast" rules. I prefer a heuristic approach to almost everything that I do. Also, I've found that code comments are only part of the mix. As the article indicates, the code, itself, should be written in a clear fashion, and supporting materials (which can include seminars, tutorials, examples, unit tests, and test harnesses) are an important ingredient.

I wrote my own approach to documentation in this post: https://littlegreenviper.com/miscellany/leaving-a-legacy/

It's a long read. I don't think many people really give it much of a gander.


Function comments should explain what it does, and comments inside functions should explain the how.

  // Receives n >= 0 and returns the n'th fibonacci number
  int fib(int n) {
      // We use the memoized version of the algorithm.
  }
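A filled-in version of the same idea, sketched in Python rather than C, with `functools.lru_cache` doing the memoization. The docstring says what, the inline comment says how:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """Receive n >= 0 and return the n'th Fibonacci number (fib(0) == 0)."""
    # How: plain recursion, memoized by lru_cache so each n is computed once.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
```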


The code should explain what it does. In-line comments should provide explanation for any non-intuitive code that can’t be refactored to an intuitive state. The function comments should be about the intent of the code.


IMHO those aren't best practices at all. It's just a vague list picked for the sake of writing this article.

To me, the most important thing is that a comment isn't code. So what is it, and why do we write it? Seen that way, a comment becomes normal writing.

So the rule is? Know your audience.

Just think about who you are writing this comment for and explain it to them. It helps a lot to guide people through what the code does. Especially in event-driven code.

Imagine this pseudo code:

send_event({name: 'a', props: {name: 'a'}})

Why does name show up twice there? Without a comment no one knows why, except people with business visibility. Apparently some downstream consumer needs the name in `props` and cannot access the root of the object.

So my best practice is: know your audience when writing a code comment.
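A small Python sketch of that event with the comment written for its audience. The builder function is made up, and send_event itself is left out:

```python
def build_signup_event(name):
    """Build the payload for a hypothetical send_event() call."""
    return {
        "name": name,
        # Audience: whoever wonders why name appears twice. A downstream
        # consumer only receives "props" and cannot read the root of the
        # event, so the name is duplicated there on purpose.
        "props": {"name": name},
    }
```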


I agree with basically all of the article. But I have a probably-misguided idea for better comments: Change text editors so that they don't put comments in dark grey (which for most of us, is on a black background)!

Why does my text editor think that, e.g., a function call should be blue and keyword arguments to it should be bright orange, but a comment should blend in to the background? Have them be bright green or something! If something merits a comment, it should be the most visually important thing in that block. At least that'll make bad (pointless, redundant, or outdated) comments stand out enough to annoy people.

If anyone knows how to make Sublime Text do this I'll give them a big virtual hug. :)


Usually any programming text editor will have themes for syntax highlighting. You can pick a theme that does what you want or modify one that suits you.


Ditto. I always edit themes to make comments brighter than code. Fairly easy in sublime text, more-so now there’s a built-in “edit theme” function.


I think comments should be used much less than we have propagated for years. First, idiomatic code often doesn't need any comments at all. Second, every project requires some level of documentation, but the documentation should be a high-level abstraction of the code.

So the documentation is the easy way for humans to understand what is going on and when I want to know the details, I can jump into the code. And just if the code itself is so complicated or its implications are not easy to understand (which should rarely be the case), then comments should be used.

However, the problem with this mantra is, that the original authors often don't know/want to know when their code is not simple enough ;-)


It goes both ways. As a C programmer I'm inclined to make certain streamlines or optimizations that, when I show them to my peers, exemplify that I am NOT writing idiomatic code. But to me and my niche online C community the approaches are completely obvious and even viewed as elegant.

Do I add comments or not? Should the person reading my code "just learn" how it works? Or should I "just realize" that I am using advanced patterns that few people know?


In my long experience of writing and reading code I think that comments have mostly been absent.

I think that lists of best practice for comments are mostly irrelevant because most developers simply don't write them.

So, unlike Peter Vogel, I would rather have some bad comments than no comments if that is the price I have to pay for worthwhile comments.

What is it about software development that makes people think that yet another list of things to do will make things better?

If I were to create a rule regarding comments it would be this: code review should include reviewing the comments.


> So, unlike Peter Vogel, I would rather have some bad comments than no comments if that is the price I have to pay for worthwhile comments.

IME people who write bad comments never write worthwhile ones, so that doesn't seem like a tradeoff, unless you mean a binary choice between allowing or forbidding comments.

> If I were to create a rule regarding comments it would be this: code review should include reviewing the comments.

Isn't that usually the case? And it's not that hard. The issue I usually hit is that code review should include reviewing the commit messages, and while others may (I really have no idea) github has even less support for reviewing commit messages than they do PR contents.


Notice that this only makes sense if a "bad" comment is somehow still "worthwhile." And we'll incur the cost of writing and maintaining (and, apparently doing code reviews on) bad comments to get there. And that ignores the cost of programmers reading and attempting to process bad comments.

Personally, I would rather have well written code than bad comments...and I think programmers can actually create well written code. I question whether we should reasonably expect our programmers, after creating well written code, to suddenly acquire the skillset for writing "worthwhile" comments.

Peter Vogel


> IME people who write bad comments never write worthwhile ones

It takes a lot of expertise to extract which information (mostly the "why", sometimes a reference you used while coding) is actually useful as a comment.

I also think you need to come across comments that helped _you_ understand other code to learn what good comments are.


Sure, so people with more experience should be demonstrating good practice here, as in other areas, by writing good comments for the newer people to see and gain experience with.


Your exhaustive test suite is the technical documentation.


Too often I write comments that I believe to be clear and descriptive only to find out much later on (when revisiting the code) they are not as clear and descriptive as I thought they were.

Therefore my top advice would be to review comments a few times after writing them to see if they are indeed the breadcrumbs that I hoped they are, or even have someone not involved in the development review the comments for clarity.


Rule 10. If you can - comment in commit messages instead of statically in code.

When you do git blame you get commit message for each line of code. You can see what ticket changed it (there will be reasons why it changed hopefully), what it was before, what other things changed with it. This is incredibly useful, much more useful than static comments in code. But you have to help yourself by writing good commit messages and making small commits that don't mix many changes into one. If you fix whitespace or some stylistic stuff - don't commit it in the same commit as your business logic changes.

The best thing about comments in commit messages is that they are automatically changed when you change a line of code in the next commit. They are never outdated and are never lying to you (regular static comments often are). Git blame should answer "why is each line of code here right now". Static comments answer "what somebody at some point thought was important in this general region of the code".

Often companies want you to adhere to their commit message standards that make it harder to write good commit messages. For example with git they will say "[task number] one line 80 char description"

You can still write a good description and have git show it correctly in one-line format if you do:

    [task number] One line 80 char desc.
    
    The rest of a good descriptive commit message.
    As many lines as you like....


I don’t think this is a good rule.

1) I can’t see “git blame” in my code review. Don’t make me check out the branch to review your changes.

2) If we refactor the code (say, extract a new class wrapping the functionality) then the explanation in git becomes much harder to find. If it’s a comment you can just copy it over.

You should explain the change set in your git comments too, but at a different level of abstraction - why is this whole set of changes being made? What problem is being solved? What’s left to do? Etc. This stuff can’t usually be associated with a particular line of code.

So inasmuch as you are arguing for good commit messages I strongly agree. But I disagree that this should be at the expense of good comments in your code.


> code review

You shouldn’t need this. The commit message should clearly explain any non-obvious changes (and ideally there should only be one). If there is a historical context, the reviewed commit message should include it (including links to past commits in whatever repo hosting service is used).

> refactor

If there is a refactor the reader can skip past the change in blame. This does become slightly more complicated when code is moved between files, but most repo hosting services make this fairly easy.


I don't like this idea. Nobody will read your comment in this case. I prefer comments to be visible by default without having to run extra tools. Also, usually a commit adds many functions and classes and it's difficult to describe them in one message.


This idea does not work when you need a different comment on each line. You're obviously not going to commit each line of code separately.

Also, not all code is being read in an IDE, and not all code is stored in git. The code may outlive the repo and valuable context could be lost.


What's wrong with commenting on the end of a long block? It helps knowing where you are. I find them quite useful.


The article misses part of the point, I think. All the points it describes are symptomatic fixes.

The real issue is that code should read like a report as best as humanly possible: it should have a logical structure, be well signposted, local context clear at all times, and easy to spot and correct mistakes.


A comment at the end of a long block can be helpful if you have to scroll up to figure out what block just finished.


If your code block does not fit in the window, it's too long. You don't fix that with a comment. You should break it apart into smaller units.


Or you need a bigger monitor ;-)


Or use smaller fonts!


Or use code folding.


  // NOTE: At least in Firefox 2, if the user drags outside of the browser window,
  // mouse-move (and even mouse-down) events will not be received until
  // the user drags back inside the window. A workaround for this issue
  // exists in the implementation for onMouseLeave().

In Firefox 2, mouse-move events cease after dragging outside window.

  // TODO(hal): We are making the decimal separator be a period,
  // regardless of the locale of the phone. We need to think about
  // how to allow comma as decimal separator, which will require
  // updating number parsing and other places that transform numbers
  // to strings, such as FormatAsDecimal

TODO: allow commas as decimal separator.

Long comments disrupt visual flow. No comment is better than a bad comment.


Comments are an anti-pattern. The more comments you write, the less code gets deployed to production. Only code that's deployed to production counts.


I may have been hanging around on Hacker News too much, but I can't decide whether your comments are sincere or an attempt at parody (I hope it's the latter, or I pity Mr Krumins' employees).


They are doing great. They love deploying! Deploy or die is the mantra we have.


hahaha amazingly I actually heard this in real life a couple of days ago... something along: "you write more comments than code! that's not productive!"


And it's 100% true. The job is to get it out to production not to write essays in comments. Always be deploying.


I always thought my job was thinking, not writing? It should be about finding a good solution to the problem, not cranking out code like a madman.


No, you have to crank the code out to production as fast as possible. Ship it all the time, every day, tens of times a day. It's a war zone, not academia where you can think all day.


Nah, sorry, I don't buy that. I'm not in a "war zone", I'm building a product for humans, and would like to earn a little so I can enjoy life. Going to war doesn't spark joy for me, but if that's your kink, I'm not stopping you.


Running a business is always a war. Either you deploy and win the war, or you are still thinking about the code you will be deploying some day and lose the war to someone who deployed.


If someone is just one deployment behind you, it may be time to work smarter, not harder.


Just one deployment? To win the war you need to be a hundred deployments ahead.


You can still ship early and often while adding a comment here and there so tomorrow you'll understand what you wrote and won't have to figure it out all over again


Agree here but:

   // not a longer comment than this


May I use your last sentence? It is great.


Absolutely!


Comments explain why the code exists. When you write a comment, you see sometimes that the code is not necessary.

Not necessary and removed code is the best code.


The best code is the one in production. All other code is useless.


The best code is the code that solves the user's problems


The best code is in production independent of anything else.


Comments are often a sign of code smell. If you have to explain the code, is it too long or too complex? Short functions, better naming, and so on. Other than that those rules seem reasonable. Specify intent over implementation. The code implements, the comments explain the reasons, if any.


To you as the author it might be crystal clear, but the next person who has to modify might not have the full context. A lot of good comments in this thread mention that you should write comment about intention (“why”), not implementation details (“how”), although also the latter might make a significant difference to the next person. And even if your original code might have been clear to another person, it might have been modified in the meantime.


> A lot of good comments in this thread mention that you should write comment about intention (“why”), not implementation details (“how”)

Erm, that's what I said above. I don't believe that's controversial.

And neither is the fact that if there's a 30 line comment above a 100 line function, perhaps the function should be reduced in size because it's clearly complex. In fact, IDEs such as Intellij will flag it for complexity

Commenting for the sake of it, especially due to poorly named functions and variables is a code smell. Code is for the reader. The compiler doesn't care if your variables are two characters or twenty.


Breaking up a function doesn't always make it easier to follow. Instead it sends you bouncing around, trying to keep track of all the values passed back and forth. Some things you need to do are just complex because there are a lot of complex rules and mathematics.



