More often than not, I see code without any comments. There's this idea of writing self-documenting code that really changed the commenting world.
And that whole thing was evangelized by Uncle Bob and the Agile wrecking crew. Before long it was bad to use comments, switch statements, or new up an object. This, in turn, led to the TDD movement, the Agile-only movement, the enterprise-patterns-for-all-projects movement, and I'm sure there are others I'm forgetting.
Please comment your code. Tell me what you're trying to accomplish with this block of code. The function name doesn't always suffice. And I don't want to stare at it for 15 minutes, or re-format your 160 column LINQ statement, or Google your regex so I can read what it does on StackOverflow.
* If you ever had to maintain a code base you didn't design (who hasn't?), how can you not want to have as much of a paper trail as possible? Especially when the original authors move on.
* I don't remember discussing documentation (or even testing) styles/preferences during interviews. I don't remember managers making it a priority or educating the unwilling, even in large companies. So it's clearly not important from the business perspective. The closest I can remember was a mad rush to create runbooks after a particularly nasty prod incident.
We live in a world of microservices. So you routinely work with multiple repositories. In larger companies developers routinely move to different teams in a couple of years. There's always another AWS service or non-relational data storage. It's interesting to notice that in other domains dealing with this kind of complexity and constant change it's expected to have written notes all the time. Think of medical doctors or aircraft maintenance crews.
When I was younger I kind of trusted that "self documenting code" promotion. As much as the Refactoring book was right this idea proved to be just wrong. Think about "business logic". Including the classical "converting XML/JSON to other protobuf/JSON". There's no grand theory here, just multiple confusing details influenced by previous versions, legacy ontologies, dependencies on other teams. Naming conventions won't help much, not to mention "the two most difficult problems in CS".
I think there's some correlation between poor documentation and missing tests. And the lame excuse is always the same - not having enough time. Even for obvious error-prone things such as calendar calculations or parsing deeply nested data received from the outside world. Or how to build and run a service.
Another correlation is between clear/structured thinking and how easy it is to explain the results. A reasonable functional decomposition and popular idioms/patterns/libraries documented elsewhere enable terse descriptions.
I see documentation as fungible. There are multiple somewhat interchangeable places information can be stored in. JIRA descriptions, commit messages, "javadocs", MD files (previously known as GOOG docs, wikis, Word documents). From what I've seen the people who have enough discipline to use one usually have others in place too.
Software development is very much a learning process. So other people will have to repeat it unless you summarize your findings for them. For some low-level details you could be one of them after not working on a particular component for long enough.
> Rule 6: Provide links to the original source of copied code
I learned this rule viscerally early in my career. Back in the golden age of Experimental Flash Art there was an enigmatic site called "flight404", and one day the site's author released source files for some of his more popular projects. All the Flash devs in my office started poring through them, and soon after my boss called me over to show me my own name in one of the comments!
Apparently the author (Robert Hodgin - whom I very much looked up to) had asked for help anonymously in a forum, and I had helped him out so he credited me in a comment (just for his own reference - in those days there was no Flash open source community and designers rarely distributed their source). That experience made me pretty obsessive about crediting outside influences in my source code whenever I get the chance.
I write my comments in commit messages because those are valid forever. A lot of times somebody will write a code comment, the code will be changed, but the comment not. This is a huge waste of time on so many fronts: writing the comment in the first place, and then confusing the subsequent developers with the wrong information. If you truly want to understand something, you can always check the change log, and find out why things are how they are.
Exceptions are e.g. if it's something exceptionally tricky or a hack of some kind that is kind of important. It doesn't happen all that much because the stuff I work on is simple. If I was going to do a lot of "commenting" I would prefer to write and update good documentation that gives an overview of how different things work together. The nitty gritty changes too often and is not that important in the grand scheme of things.
> I write my comments in commit messages because those are valid forever.
Don’t they disappear when someone squash-merges a branch where a file is both renamed and changed (a lot)? Or, at least, when somebody decides to move the code to another repo, and doesn’t bother bringing the git history along.
Yep. I make a commit on every little change and every team I've been on makes me squash them on merge.
Comments in Git commits are bad. Just comment the code and make sure the comments are updated while you're in the code. You can also look at them in a code review. The argument that they get outdated is easily remedied, but people just want to keep claiming they write 'self documented code.'
Git commit messages and code comments serve very different purposes. I don't see any reason to try to use one for the other, or vice versa.
To me both are important but for different reasons. I don't want to search the blame history of a line of code if a simple comment had been enough, neither would I want to primarily use code comments when bisecting.
> make sure the comments are updated while you're in the code. You can also look at them in a code review.
People will invariably forget to update comments and unless the comments show up in context lines of the diff associated with a change, it's likely that reviewers will overlook the need to change them.
Your comment is a standard explanation for not having comments. Always makes me wonder why stale comments aren't flagged during code review? Seriously, failure to update comments should eventually lead to constructive dismissal.
I don't think "constructive dismissal" is the phrase you want there. (That's normally associated with employer wrongdoing or, at a minimum, mischaracterizing the terms of an employee's departure.)
I don’t understand. Can you not squash your commits and rewrite the message before you merge? Wouldn’t that preserve the message? (Trying to think of a scenario where it wouldn’t…)
I dunno if he really meant that, or just said he is explaining the added functionality in detail. I do explain in the commit message the feature I have just added.
I typically squash my own commits, and as part of that I aggregate all of the commit messages into a single message with all of the relevant details (and leave out the “fixed typo” type stuff that’s not relevant.)
From the complaints that people are raising here there must be dev shops where someone else decides to squash a bunch of commits and throw away the messages.
> I write my comments in commit messages because those are valid forever.
I try to write good comments and commit messages. Comments are typically along the lines of explaining what was done and how for a particular block of code or method. Header comments include a list of parameters, return values, side effects, class variables, etc. Commit messages explain what was done or changed and why.
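As a rough illustration of that split, here is a minimal Python sketch. The `Account` class, its `withdraw` method, and the overdraft rule are hypothetical names invented for this example. The docstring plays the role of the header comment (parameters, return value, side effects), and the inline comment explains one particular block.

```python
class Account:
    """Hypothetical bank account, used only to illustrate header comments."""

    def __init__(self, balance_cents: int) -> None:
        # Instance attribute documented where it is declared:
        # balance in integer cents, never negative.
        self.balance_cents = balance_cents

    def withdraw(self, amount_cents: int) -> int:
        """Withdraw money from the account.

        Parameters:
            amount_cents: positive amount to withdraw, in cents.

        Returns:
            The new balance in cents.

        Side effects:
            Mutates ``self.balance_cents``.

        Raises:
            ValueError: if the amount is not positive or exceeds the balance.
        """
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        # Check before mutating so a failed withdrawal leaves the balance intact.
        if amount_cents > self.balance_cents:
            raise ValueError("insufficient funds")
        self.balance_cents -= amount_cents
        return self.balance_cents
```

The "why" behind adding `withdraw` in the first place would then go in the commit message rather than in the file.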
> Header comments include a list of parameters, return values, side effects, class variables, etc.
These are only useful for public stuff, and in other cases only in special circumstances. Sadly, a lot of companies make stylecheck require them on every little private method, which makes each file 50% longer for no reason and drowns the really useful comments in noise.
> Comments are typically along the lines of explaining what was done and how for a particular block of code or method.
The problem with that is that the block of code that you explain will often call other code, and then that other code will change for other reasons (preserving the correctness but not the initial design that was in the comment) and then that explanation higher in the callstack won't be true anymore. It's nice to pretend we always check everything that calls our code all the way up when we change stuff, but in reality if tests pass and the code works - we often don't check for comments up the callstack. So the comments will drift away from the truth with time.
Comments in commit messages are much more likely to be true than static comments in code. For example, if you refactored some method signature as part of the change, every call site will have the updated commit message automatically. If you comment in the code manually, you will probably not notice that you have to change a comment block 5 lines above the changed function call (it won't even show in the diff), so the comment block won't be true anymore.
This sounds like a recipe for write-only comments.
The point of a good comment is to guide later development - to say "here's something you should know before editing this code". If you put information like that in commit logs, then that implies that anybody who updates any part of that codebase must first read every log for every commit near what they're editing. If they don't (and they presumably won't), they'll never see that "here's why we're not doing X" comment before they change the code to do X.
Meanwhile, it's true that comments can drift away from truth over time, but putting them in commit logs doesn't change that in any way. An inline comment that's stale can be updated (and doing so is part of the job!). But a commit log comment that's no longer accurate will stay there forever, waiting to mislead anyone who finds it. The promise that it was accurate when originally written is of no practical use later on.
Same thing here. At work I can see the history going back to 1995. And that's after migrating from something to Mercurial, and from Mercurial to git. Maybe that's not the case in all companies, but in the one I work at, the commit history is the longest-lasting information trail.
> Only until someone moves a file to a new directory. Now this file shows up as a new file with no history.
Git will usually be able to link the two back, unless the move was combined with a lot of changes, as it does not record moves but infers them.
Even if it can't link them, the "creation" commit will visibly remove the old file, at which point you can... log / blame on the previous location and keep going.
> Also, you hope to never change your version control system because that change will erase the history.
Of course not, there are conversion tools between basically all VCS, and anyone tasked with such a migration who is not a complete goober will use them in order to maintain the historical record.
At $dayjob we've got history spanning over 3 different VCS and more than 15 years, and that's including weird stuff like splitting and merging repositories.
That's a great idea. I've been doing this just to rationalize my laziness, but now it makes sense. However, I'm not sure those are valid forever as you can simply delete the .git folder, can't you?
That’s horrible because the git commit messages are easily lost, disconnected or hard to find in any reasonably active codebase. For example as soon as you do a change and move a file it almost always disconnects from the previous change history.
What's even more difficult is searching through a code base when the documentation isn’t in or near the code. I don’t know any IDE or editor that makes it easy to search through git commit messages and source code at the same time.
On top of that, do you review git commit messages in code review? Do you ask people to improve descriptions, typos and language in commit messages?
I don't understand how they are hard to find. It's trivial to find where the file was moved from.
What I truly care about is why something is the way it is, what is the rationale behind it, how it works with other parts of the codebase, what problems it solves, what is tricky about it, what to pay attention to and so on. I don't care at all about the code that was written and an explanation to it because this I can read myself in the code. The best place I've found for this is a code commit because it can tie different parts of the codebase together and add a lot of context to a change. I commit heavily and don't squash. A long comment in a commit that contains all file changes related to a certain feature, bug or whatever adds a lot more information than a comment in one file. When other people do it, it helps me a lot more than chasing their (outdated) code comments throughout the codebase.
But if that doesn't work for you, then don't do it. Just don't be dogmatic and dismissive. I accept there might be situations and codebases where this doesn't work.
I'm lost. Are you talking about meta-level comments, like "here's why this change is being made", that aren't directly tied to any particular line of code? If so, certainly putting that stuff in commits makes sense.
But for regular "explanatory note about this variable/function/etc" comments, how does one work if those things are in commit logs? If you're reading code and something is unclear, do you look back through the commit messages of every commit that's ever touched that line, just in case one of them has something relevant?
IDEs can show commit history. But you're right that it can get messy tracing back to the original commit when a file has been changed many times.
But for the most part I feel like comments should be automated tests. If a line is there for an edge case it should have a matching test for that edge case.
The only exceptions I see are for performance optimizations or some other situation where you can't easily test.
How you check the commit history is beside the point - the question is, do you read the entire commit log history for a chunk of code before editing it?
If not, any comments there might as well not exist - functionally speaking you're maintaining an uncommented codebase.
Your comment applies very little to commit messages, and much more so to comments.
> That’s horrible because the git commit messages are easily lost, disconnected or hard to find in any reasonably active codebase. For example as soon as you do a change and move a file it almost always disconnects from the previous change history.
Learn your tools or get better ones. `git log --follow` has no issues with renames, and when files get munged in ways it can't handle (e.g. content is split out or merged) it's easy enough to stitch the history back together. Good annotate UIs (JetBrains' is stellar, and one of the few things I don't use magit for) make flitting through a snippet's history trivial.
Meanwhile, finding removed comments is nearly impossible (VCSs are nowhere near as good at finding when something was removed as when it was added), and comments can easily drift apart from their point of origin, as developers aren't too careful about maintaining them when adding unrelated comments.
> On top of that, do you review git commit messages in code review? Do you ask people to improve descriptions, typos and language in commit messages?
Can second that. Having the commit message as a virtual comment on the end of the current line is immensely helpful in tracking changes. It’s like git blame on steroids.
> On top of that, do you review git commit messages in code review? Do you ask people to improve descriptions, typos and language in commit messages?
Absolutely I do. The commit message is part of the commit just like the code; why would it be excluded from review? The number of times that a good commit message has helped me when dealing with a bug, and that a poor one has stymied me, have firmly convinced me that they are just as important as any other project documentation. They should be clear and informative, and I ask for those aspects to be improved when needed.
I’ve been misled by comments so often that I now literally don’t see them. They’re like banner ads on websites. My brain just doesn’t register them anymore.
They get orphaned by slightly wonky merges. The underlying code gets updated or refactored, but the comments remain. When they are correct, they’re useless 90% of the time (at least in codebases whose linter requires doc comments). Even the accurate comments tend to drift with age and become inaccurate unless they’re carefully maintained (which they almost never are).
It’s really hard for me to figure out the balance.
For exceptionally good codebases (Redis, SQLite come to mind), the comments are a godsend.
For mediocre codebases, the comments are largely a waste of time at best, misleading and time-wasting at worst. And most of us, I suspect, are working on mediocre codebases.
Rust libraries use comments for generating documentation and testing. Here is an example of such documentation: [1]. It is difficult to believe that this code would be better without comments.
At least classes and public methods should have comments (except trivial ones).
What is generating this HTML page? I'm always looking for something like this that can help automatically generate documentation from code comments; it's very easy to get buy-in for "update the docs when you update the code," much less so for "go through the comments for every piece of code you touch and double check they're still relevant." It's one of those things everyone nods at then promptly ignores.
Something like this with a discrete task that can be checked on code review would help immensely, I think.
> Rustdoc actually uses the rustc internals directly. It lives in-tree with the compiler and standard library. This chapter is about how it works. For information about Rustdoc's features and how to use them, see the Rustdoc book. For more details about how rustdoc works, see the "Rustdoc internals" chapter.
docs.rs runs it for every crate (rust library) uploaded to crates.io.
When actually working with rust, I recommend running `cargo doc --open` semi frequently. It generates that documentation from all your dependencies, and your own code, links it all together (with search no less) and opens it in a browser.
Note that it only uses doc comments, which are distinct from normal comments (`///` instead of `//`, only allowed in certain places; `//!` instead of `///` to apply to the parent item instead of the next item).
I'm interested to know the answer as well. I've used org mode in the past for literate programming and document generation, but I'm starting to think that generating docs from comments is also a good option.
Adding a mechanism to publish the docs (maybe part of CI/CD?) would make maintenance way easier, it's something I would love to test out.
I’ve been working on a tool to do this based on a simple Python file watcher, web server and https://casual-effects.com/markdeep/ It just turns regular code into markdown code blocks and /** comment contents pass through to the markdown —> HTML pipeline. A little JS websocket and you get an auto-updating preview from any editor to any browser every time you ctrl-s.
Not much time to work on it lately. But, I’ll Show HN when I put it on Github.
This is what source control blame is for. It’s very hard to estimate what business reasons will be important to readers of code in the future.
I very often nudge people to remove their comments entirely. Less experienced devs often write comments to explain code, instead of spending time on making the code itself readable/understandable. I often ask: “Can you modify the code such that the comment will become obsolete?”
But well-written comments that explain assumptions and intent help as code evolves over time.
Also, comments are quicker to scan than code itself.
A good comment can indicate code that can be ignored for the purposes of certain troubleshooting.
Finally, junior developers usually write code that is difficult to understand and don't comment.
I've rarely ever seen a junior developer that both writes unreadable code and takes the time to write comments.
Usually comments are a sign of seniority, someone who has pity on those who will come after him/her.
And thus that person tends to also write understandable code.
Perhaps there's an uncanny valley in between where comments are a band-aid for complexity, but I have never observed that in years of working with many developers.
Link to things in the same repo, like a wiki or external document (if you're linking to external stuff anyway). You have to assume whatever you link to will disappear without notification. At least if it's in the repo, you can check out the old commit and get the file back, and hopefully find the commit where it was removed and see why it was removed.
If you link to Sharepoint or something it's useless as soon as that link changes.
I always say more generally “explain the why”, business OR technical reasons.
Bigger architecture decisions should go in ADRs (again, explain the why) but smaller stuff like explaining why you monkeypatched a library can save future devs a lot of lost investigation time and pain. Maybe by the time they are reading your comment/code, the patch you wrote is supported in the main library!
One of the most transformative features we added to the D language was Ddoc, which is a documentation generator for functions. It has a modestly standard format. The result is it applies significant pressure on the coder to add the Ddoc comments, and a routine change request review of a PR is "please add function ddoc comment".
Before Ddoc, the D standard library documentation was inadequate, totally wrong, or missing entirely. After Ddoc, it became reasonable (though no documentation is perfect). A further improvement was the ability to actually run the example code in the Ddoc comment as a unit test.
The end result is the entire documentation of the D runtime library is generated by Ddoc from the source code.
Doesn't every language have a feature like this these days? If not an official standard, then a commonly used third-party tool.
Though embedding unit tests in API documentation is less common; the only other languages I'm aware of that support it out of the box are Python and Rust. A Google search turns up some implementations for other languages, like C++ and Haskell, but I don't know how widely used they are.
Regardless, I definitely agree that it's a very important and useful feature.
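For readers who haven't seen the Python flavor of this, here is a minimal sketch using the standard library's `doctest` module. The `clamp` function is a made-up example; only `doctest.testmod()` is a real API here.

```python
import doctest


def clamp(value: int, low: int, high: int) -> int:
    """Restrict ``value`` to the inclusive range [low, high].

    The examples below are documentation and tests at the same time:

    >>> clamp(5, 0, 10)
    5
    >>> clamp(-3, 0, 10)
    0
    >>> clamp(42, 0, 10)
    10
    """
    return max(low, min(value, high))


if __name__ == "__main__":
    # Runs every ``>>>`` example found in this module's docstrings and
    # reports any whose actual output differs from the documented output.
    print(doctest.testmod())
```

Running `python -m doctest yourfile.py` does the same check without the `__main__` block, so it can be wired into CI fairly cheaply.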
Doxygen existed for D at the time we developed Ddoc, and some people used it, and so I was constantly asked "Why not just use Doxygen?"
The problem with Doxygen was, it wasn't installed automatically! It was extra work to go get it and install it, so it never happened. What a builtin Ddoc does is:
1. It's always installed
2. It's always matched to the current compiler (one never has a mismatched set)
3. It is standardized (code using DocX is not mismatched with other code using DocY)
4. It can take advantage of the compiler's available semantic information
5. We don't have to beg a documentation tool vendor to add features we need. For example, Markdown support was recently added. We didn't have to ask or beg; a Ddoc user simply added it.
These advantages are enormous and transformative. Minimizing friction matters a great deal.
C and C++ do not have builtin documentation generators. I ask you, of professional C/C++ code you've seen, how many consistently and properly documented the function interfaces? In my experience, it's rare. It's much more common in D, and that's entirely due to Ddoc.
One that's missing: comments should explain why a piece of code exists or is written in a certain way (and implicitly, when it can be changed or removed). This overlaps with "explain unidiomatic code in comments", but there can be idiomatic code whose purpose isn't obvious.
+1 for this. It also links to my #1 rule for comments: why not. It applies when:
1. There's a chunk of code that, on first reading, could be clearer/simpler/more idiomatic.
2. There's a good reason not to use the obvious approach, and do something else instead (maybe performance).
Then comment to explain why the obvious path wasn't taken. No matter how well written, code alone can never explain "why not". I've found this invaluable, even looking back at my own code.
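A small Python sketch of what such a "why not" comment might look like. The function name and the ordering requirement are hypothetical, invented purely to give the comment something to justify.

```python
from typing import Hashable, Iterable, List, TypeVar

T = TypeVar("T", bound=Hashable)


def unique_recipients(addresses: Iterable[T]) -> List[T]:
    """Return the addresses with duplicates removed; first occurrence wins."""
    # Why not `list(set(addresses))`? It is shorter, but a set does not
    # preserve order, and (in this hypothetical scenario) callers rely on
    # recipients being notified in the order they were added.
    # dict keys keep insertion order (guaranteed since Python 3.7).
    return list(dict.fromkeys(addresses))
```

Without the comment, the `dict.fromkeys` call looks like an odd way to spell `set`, which is exactly the kind of code someone "simplifies" later.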
“Why not” is one of the top reasons to write a comment. It should be discussed more. Many times I've encountered a piece of code which could have been much simpler or idiomatic. Upon rewriting, I discovered that it didn't work for some obscure reason. If there's no comment explaining “why not”, many others (including future me) could lose time trying to do the same.
Man, I've been bit a few times with this. I'll read code and think, "Why the hell did they do that?" Later, when I'm almost done re-writing it, I see the edge case they were working around.
I think "why not" is one of those things which definitely does not belong in comments; if you regularly write up such justification, your code becomes mostly comments. Even more so than "positive" comments, "negative" comments belong in the commit message.
Possibly unless your entire codebase is literate, and code is secondary to comments.
Honestly it needs both, but if you only put it in one place, put it in code.
During development you are most likely looking at commits, or PRs, so that makes sense.
But if it's a long-lived piece of code, people will get to your code by following function/method chains or just browsing the source, not commits. While you can use git blame, then figure out the commit, and then read the last few commit messages, putting a comment on the code is easier on everybody.
I like these rules, but if I were writing my own set, the first one would be that the most important comments you write are often those describing persistent mutable state. Often the point of keeping an object's members private is to preserve the parity among them, and explaining it where they're declared can save everyone a lot of trouble. Also, if you've got a state machine, document the semantics of all the different states.
I write comments first, then fill in the code later, adjusting the comments as I learn better ways to do things. That way, the comments are like a guide to anyone as to the goal of a section of code. I comment about every 2 or 3 lines of code, or more sometimes. I even comment on things that everyone would easily understand. My comments are basically a plain English version of the code. Functions and such have comments or docstrings that explain the function and the basic steps of its functionality, so that’s about 2 times that I explain things in my code. I’ve never had anyone say I comment too much. When reading uncommented code, I wish there were more comments sometimes to explain what each variable does or is. My variables are often named with 4 or 5 words, connected by underscores or title case. I often see variables that aren’t named well and try to avoid that myself.
Are you concerned that you are introducing tech debt into the codebase? Anyone who refactors your code later will also have to refactor your comments (but likely won't).
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
If effort is linear then, by definition, you should never put more than 50% brainpower in to your code.
I'm not sure I want to increase the effort I put in though.
I believe "cleverness" refers to ingenuity; it's outside-the-box, and trades simplicity in one plane for sophistication in another.
However most "clever code" we come across is lopsided in this tradeoff, and might even require cleverness to understand, which it turns out is not very clever at all.
Also, find the real source of what you're copying. The formula they cite for brightness is a rounded version of BT601 luminance. Citing the standard in the comment is way better than just linking a random SO answer.
Links are mutable, links die, SO answers can be edited. Include any information needed to understand the code into the comment and proper copyright acknowledgements if you copied it (assuming the license allows it).
Yes! The comment may not even have been necessary, if you extract into a function calculateBT601Luminance(red, blue, green).
Then you can link the standard, or at least the Wikipedia page, and I would lean towards this if I don't expect readers to know a bit about the domain. But if you don't, someone can still find an authoritative source with a single search.
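A minimal Python sketch of that extraction. The function name is illustrative, and the exact rounded formula from the article is not reproduced here; the weights below are the luma coefficients defined in ITU-R Recommendation BT.601.

```python
def bt601_luma(red: float, green: float, blue: float) -> float:
    """Approximate perceived brightness of an RGB colour.

    Uses the luma weights from ITU-R Recommendation BT.601:
        Y = 0.299 R + 0.587 G + 0.114 B
    Input and output share the same scale (e.g. 0-255).
    """
    return 0.299 * red + 0.587 * green + 0.114 * blue
```

The name plus the one-line citation of the standard gives a reader enough to search for the authoritative source, even if no link survives.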
What search engine prioritizes links in comments or code blocks as any sort of signal?
I have no particular attachment to SO, especially after the way they handled their public drama recently.
That said I put a link to SO any time I have to look something up there and it’s not immediately obvious from the naming/docs why it does what it does. I also try to sum it up in a sentence or two if I can and if it doesn’t distract from understanding the larger goal of that section of code.
Sure, but also it may be worth noting that the content license on Stack Overflow posts requires this. And it's the poster's content, not Stack Overflow's, so there's an element of respecting a fellow coder who helped you. In fact, most open source licenses require attribution at a minimum.
When you solve something in a weird way, leave a link to SO in a comment.
Write TODO and NOTE and use a tool to find all your special comments that show unfinished features and investigations. Scan through regularly to make sure the comments still make sense.
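The "tool" can be very small. Here is a minimal sketch in Python, assuming `#`-style comments and only the `TODO` and `NOTE` tags; the function names and the `*.py` glob are choices made for this example, not an existing utility.

```python
import re
from pathlib import Path
from typing import Iterator, List, Tuple

# Tags to look for; extend the alternation to suit your team (an assumption).
TAG_PATTERN = re.compile(r"#\s*(TODO|NOTE)\b[:\s]*(.*)")


def find_tags_in_text(text: str) -> List[Tuple[int, str, str]]:
    """Return (line_number, tag, message) for each tagged comment in text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        match = TAG_PATTERN.search(line)
        if match:
            hits.append((line_number, match.group(1), match.group(2).strip()))
    return hits


def find_tags_in_tree(root: str, glob: str = "*.py") -> Iterator[Tuple[str, int, str, str]]:
    """Yield (path, line_number, tag, message) for matching files under root."""
    for path in sorted(Path(root).rglob(glob)):
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_number, tag, message in find_tags_in_text(text):
            yield str(path), line_number, tag, message


if __name__ == "__main__":
    for path, line_number, tag, message in find_tags_in_tree("."):
        print(f"{path}:{line_number}: {tag}: {message}")
```

This is a plain regex scan, so it will also match a `# TODO` that happens to sit inside a string literal; for a periodic review pass that is usually an acceptable trade-off.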
Comments not containing special strings should just be "why" explanations, e.g. "we sort the bids in the reverse order to the asks because the best bid is the highest price". So generally something where the code has special cases that are explained by the domain.
Within the domain of electronic trading it should be obvious that prices are ordered from best to worst, and that it means increasing order for asks and decreasing order for bids.
You shouldn't comment on things that are obvious within your domain.
Now you may put a comment if instead you sort things in the reverse order than usual, for example so that you can implement adding/removing a price level at the top more efficiently with std::vector (which is only efficient for additions/removals at the back).
> Now you may put a comment if instead you sort things in the reverse order than usual, for example so that you can implement adding/removing a price level at the top more efficiently with std::vector (which is only efficient for additions/removals at the back).
Though in that case you should probably have a comment explaining why you didn't use an std::deque.
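To make the exchange concrete, here is the same idea transposed to Python, where a `list` behaves like `std::vector` (cheap at the back only) and `collections.deque` is cheap at both ends. The `AskLadder` class and the stated reason for avoiding a deque are assumptions made up for this sketch.

```python
from typing import List


class AskLadder:
    """Ask price levels for one instrument (hypothetical and simplified)."""

    def __init__(self) -> None:
        # Stored in DESCENDING price order, i.e. reversed from the usual
        # best-to-worst convention for asks, so the best (lowest) ask is the
        # LAST element. Most activity happens at the best price, and a list
        # only appends/pops cheaply at the end.
        #
        # Why not collections.deque? It is cheap at both ends, but this
        # sketch assumes other code indexes into the middle of the ladder,
        # which is slow on a deque.
        self._prices: List[float] = []

    def add_new_best(self, price: float) -> None:
        """Add a level that improves on the current best ask."""
        if self._prices and price >= self._prices[-1]:
            raise ValueError("price does not improve on the best ask")
        self._prices.append(price)

    def remove_best(self) -> float:
        """Remove and return the best (lowest) ask."""
        return self._prices.pop()

    def best(self) -> float:
        """Return the best (lowest) ask without removing it."""
        return self._prices[-1]
```

Both comments are needed: one for the unusual ordering, one for the container that a reviewer would otherwise suggest.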
1/ API documentation, which is a must unless you can cover everything with examples, which are better.
2/ Internal comments explaining how things fit together. I don’t do any of this any more. If my code doesn’t make this obvious, my code is wrong and gets refactored and functions get better names.
3/ Warning signs. Invaluable! “You might think that this is wrong and change this to use / instead of //. Nope! Don’t make that mistake!” kind of thing. Few and far between, hopefully.
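A warning-sign comment of the third kind might look like this in Python, where `/` is true division and `//` is floor division. The `page_count` function is a hypothetical example chosen for this sketch.

```python
def page_count(total_items: int, page_size: int) -> int:
    """Number of pages needed to show ``total_items`` items."""
    # WARNING: the `//` below is deliberate. You might think this should be
    # `/`, but true division returns a float, and for very large counts a
    # float cannot represent every integer exactly. Keep it integer-only.
    # Negating twice turns floor division into ceiling division.
    return -(-total_items // page_size)
```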
When I find APIs that show a few examples but miss detailed specs I always miss the latter.
I agree examples are great for a variety of purposes, but they're no substitute for detailing the API endpoints, authorization mechanism, data types, etc.
I often include links to the section in the online D ref spec that defines the behavior the code is implementing. This turns out to be very handy.
I wish that could be done for C and C++. Too bad it can't, because of copyright issues. I don't link to online descriptions of the C std library, because I've found errors in those online rewrites (rewrites because of, again, copyright issues).
So, for C and C++, I just cite the paragraph number in the standard.
If you are parsing/manipulating a particularly hairy data structure, try to simplify it. If you cannot, put a comment with a simplified example of the input/output data structure(s) so that the next developer (who may be you a few months down the line) has something visual to match the code against instead of having to imagine everything.
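For instance, something along these lines. The function, field names, and sample values are invented for illustration.

```python
from typing import Dict, List


def group_fills_by_order(fills: List[dict]) -> Dict[str, List[int]]:
    # Input (simplified):
    #   [{"order": "A1", "qty": 5},
    #    {"order": "B2", "qty": 1},
    #    {"order": "A1", "qty": 2}]
    #
    # Output:
    #   {"A1": [5, 2], "B2": [1]}
    grouped: Dict[str, List[int]] = {}
    for fill in fills:
        grouped.setdefault(fill["order"], []).append(fill["qty"])
    return grouped
```

The real records would carry far more fields; the comment keeps only the ones the code touches.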
If you’re not a native English speaker, do you always write comments in English, or in your native language? I personally see the latter as bad practice: in the 21st century it’s almost impossible to expect that your code won’t ever be read by a foreigner. But I wonder how you feel about it?
I am not an English speaker, but everything that ever went into my code was always in English, including names of variables, types, etc. Programming languages are in English and there is something that irks me in having another language mixed in.
Agreed. But I also discovered that in some domains local jargon doesn't correspond well to its English translation, or is not available at all, which is likely the case in law and accounting. In such cases, I go with the local language.
Exactly. I spoke about comments only. I can turn a blind eye to an occasional non-English comment or short note here and there. But naming variables and functions in a local language is an inexcusable sin. I am from the small country of Czechia with 10 million people. Naming a function in Czech is utterly stupid. Even if you’re the sole developer, there’s a good chance you’ll want to paste a snippet to Stack Overflow, sooner or later have a collaborator from nearby Ukraine or Pakistan (very common), or, god forbid, your product will be so successful someone will want to buy it. Good luck if the code base is littered with a language nobody speaks :-)
The problem with Czech is our extended alphabet and, more importantly, inflected forms of verbs and nouns. You can't make the code sound good without proper inflection, but of course no mainstream programming language supports it.
> do you always write comments in English, or in your native language?
Same language as the codebase, so usually English.
It can make sense for the codebase to use local naming conventions e.g. for legal, accounting, or administrative concerns: the ideas and concepts don't necessarily translate easily (or at all) and all the reference documents are in the local language in which case the codebase will probably be better off using the local language, and both comments and commit messages should match.
At least here the working language is English: comments, Jira tickets, code and so on. Even when the original team is made up entirely of native Finnish speakers, the next person might not be.
In Rule 6 (Provide links to the original source of copied code) the article says:
> People copy a lot of code from Stack Overflow questions and answers. That code falls under Creative Commons licenses requiring attribution. A reference comment satisfies that requirement.
This is incorrect (or, more accurately, not enough). The license is CC-BY-SA: the BY part requires attribution, but the SA part also requires that you share your own code.
I'm not sure why rule 8 is on this list. Don't do this. This is handled by git blame and PRs.
In regard to rule 5, I've found it's a bit more nuanced than:
> Without the comment, someone might “simplify” the code or view it as a mysterious but essential incantation. Save future readers time and anxiety by writing down why the code is needed.
What is idiomatic? Well that depends on nested organizational requirements merged with some community merged with developer experience.
I have some methods:
public void doSomething() {
    myType foo = createType();
    foo.monitor();
}

public myType createType() {
    return new myType();
}
There are no comments. What's idiomatic about this? Well the doSomething tests needed a mock, so we get a random create method. Why did the doSomething tests need a mock? Because the organization wants code coverage this way. You have to assume, because of company policy, there's tons of these things everywhere. I hate the term "idiomatic" when it's more subjective than anything else.
John Ousterhout spent a (short) chapter on writing comments in “A Philosophy of Software Design”. The whole book is in my opinion a must read and gives advice rather than claiming it has all the answers. Back to the original topic: that chapter covers how and when to write comments.
I believe random comments often become extra clutter and diversions from actual code. Code should have short, succinct names that explain in context.
What we should do is document the code: short and concise descriptions of classes, methods and functions. This will then act as a reference when names inevitably fall short.
From this discipline, comments above code-blocks should explain what's missing in code to a future reader - probably yourself even. But a basic explanation of "What the heck is this? What is it for?" might be in order, if not already given.
How to implement this depends on needs and tooling.
The bigger picture belongs in design documents, with references to components.
One instance where I've found writing lots and lots of comments helpful is for functions with code that I'm writing for the first time. It's like rubber-duck troubleshooting - making myself explain what I'm doing in plain language actually helps me reason about it. I tend to leave those comments in because then the next person who works on it understands why I did something in a particular way. This perhaps results in superfluous comments, but you're never going to get it exactly right, and I think erring on the side of too many comments is probably better than too few.
Thanks for the pointer - I had read about it, but hadn't put two and two together. The kinds of things I work on (cacheable websites) don't need to be highly tuned for back-end performance, so I guess it's easier for me to do that.
I do this in a markdown file, and then check that file into a /docs folder. That way, it doesn’t muddy the code, and even though it becomes inaccurate over time, it’s still handy for a “what was I thinking?” check.
The problem with lots of comments is not primarily clutter, but, as quoted from the article:
> Writing and then maintaining comments is an expense.
> Your compiler doesn’t check your comments so there is no way to determine that comments are correct.
> You are, on the other hand, guaranteed that the computer is doing exactly what your code is telling it to.
Unless your team has the discipline to maintain them, don't litter the code with comments.
Put it in the commit message.
Obsolete and incorrect comments are just confusing.
What I sometimes do is write comments before I implement something.
This provides a clear idea of what I need to do and where.
Then, when it's done, I move it into the commit message or a note.
Say what you want, but I'll continue to write comments as much as I can. They help me think about what I wanted to code and in some cases even helped me realize a simpler way or even find unexpected cases. I also find it very useful when reviewing code to know what I wanted to make and quickly jump to the block I'm looking for thanks to the header comments. Reading uncommented code is like having all the comments here on HN one after another without any separation or formatting, hard to search and understand.
I regularly recommend it to colleagues who make the bold claim that "No one needs comments. Good source code can and should document itself". The article covers pretty much all the types of comments and rules discussed in the OP and others not mentioned there.
"Best practice" is a horrible term. Best for what mix of goals and situation? Is it REALLY the best EVAR and every single alternative has been checked and evaluated?
It's weasel words, intended to give more credence to the advice than it has proven.
As for these recommendations, these aren't bad or good, even worse, these are mediocre.
Want my advice? Have your logging statements double as comments. Logging and comments both should concentrate around difficult code. So why double them up?
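A rough sketch of that idea using Python's standard `logging` module. The discount rule, threshold, and function name are hypothetical; the point is only that the log message states the "why" a comment would otherwise carry.

```python
import logging

logger = logging.getLogger(__name__)

LOYALTY_THRESHOLD_CENTS = 10_000  # hypothetical business rule


def apply_discount(total_cents: int, is_returning_customer: bool) -> int:
    if is_returning_customer and total_cents >= LOYALTY_THRESHOLD_CENTS:
        # No separate comment here: the log line explains the special case.
        logger.debug(
            "Returning customer with order of %d cents (threshold %d): "
            "applying the 5 percent loyalty discount",
            total_cents,
            LOYALTY_THRESHOLD_CENTS,
        )
        return total_cents * 95 // 100
    logger.debug(
        "No loyalty discount: returning=%s, total=%d cents",
        is_returning_customer,
        total_cents,
    )
    return total_cents
```

One trade-off: unlike a comment, the explanation now also shows up at runtime, so it has to stay accurate for whoever reads the logs.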
Javadoc was the last "good" idea in comments I ever saw: generate documentation from comments. Unfortunately it didn't provide enough autogeneration abilities or semantics to track evolving code. Nor do comments and javadocs integrate git history or help indicate heavily modified and evolved code, something I think could also be done.
I believe Rust enables some use of markdown in comments as well, that is a good idea.
IDEs, even IntelliJ-level ones, don't really help out with comments and doc-comments much either.
For myself, and my approach, I'm always a bit leery of "hard and fast" rules. I prefer a heuristic approach to almost everything that I do. Also, I've found that code comments are only part of the mix. As the article indicates, the code, itself, should be written in a clear fashion, and supporting materials (which can include seminars, tutorials, examples, unit tests, and test harnesses) are an important ingredient.
The code should explain what it does. In-line comments should provide explanation for any non-intuitive code that can’t be refactored to an intuitive state. The function comments should be about the intent of the code.
IMHO those aren't best practices at all. It's just a vague list picked for the sake of writing this article.
To me, the most important thing is that a comment isn't code. So what is it, and why do we write it? Seen that way, commenting becomes normal writing.
So the rule is? Know your audience.
Just think about who you are writing this comment for and explain it to them. It helps a lot to guide people through what the code does, especially in event-driven code.
Imagine this pseudo code:
send_event({name: 'a', props: {name: 'a'}})
Why does name show up twice there? Without a comment, no one knows why except people who have business visibility. Apparently some consumer down the line needs the name in `props` and cannot access the root of the object.
So my best practice is: know your audience when writing code comments.
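Here is roughly what that comment could look like, sketched in Python. The function name `build_event` and the details of the downstream consumer are assumptions for illustration.

```python
def build_event(name: str) -> dict:
    # `name` appears twice ON PURPOSE. A downstream consumer only receives
    # the `props` object and cannot read the root of the event, so it
    # needs its own copy. Don't deduplicate without checking with them.
    return {"name": name, "props": {"name": name}}
```

The audience here is the next developer who has no view of the business side and would otherwise "clean up" the duplication.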
I agree with basically all of the article. But I have a probably-misguided idea for better comments: Change text editors so that they don't put comments in dark grey (which for most of us, is on a black background)!
Why does my text editor think that, e.g., a function call should be blue and keyword arguments to it should be bright orange, but a comment should blend in to the background? Have them be bright green or something! If something merits a comment, it should be the most visually important thing in that block. At least that'll make bad (pointless, redundant, or outdated) comments stand out enough to annoy people.
If anyone knows how to make Sublime Text do this I'll give them a big virtual hug. :)
Usually any programming text editor will have themes for syntax highlighting. You can pick a theme that does what you want or modify one that suits you.
I think comments should be used much less than we have propagated for years. First, idiomatic code often doesn't need any comments at all. Second, every project requires some level of documentation, but the documentation should be a high-level abstraction of the code.
So the documentation is the easy way for humans to understand what is going on, and when I want to know the details, I can jump into the code. Comments should be used only if the code itself is so complicated, or its implications so hard to understand (which should rarely be the case), that it needs them.
However, the problem with this mantra is, that the original authors often don't know/want to know when their code is not simple enough ;-)
It goes both ways. As a C programmer I'm inclined to make certain streamlines or optimizations that, when I show them to my peers, exemplify that I am NOT writing idiomatic code. But to me and my niche online C community the approaches are completely obvious and even viewed as elegant.
Do I add comments or not? Should the person reading my code "just learn" how it works? Or should I "just realize" that I am using advanced patterns that few people know?
> So, unlike Peter Vogel, I would rather have some bad comments than no comments if that is the price I have to pay for worthwhile comments.
IME people who write bad comments never write worthwhile ones, so that doesn't seem like a tradeoff, unless you mean a binary choice between allowing or forbidding comments.
> If I were to create a rule regarding comments it would be this: code review should include reviewing the comments.
Isn't that usually the case? And it's not that hard. The issue I usually hit is that code review should include reviewing the commit messages, and while others may (I really have no idea) github has even less support for reviewing commit messages than they do PR contents.
Notice that this only makes sense if a "bad" comment is somehow still "worthwhile." And we'll incur the cost of writing and maintaining (and, apparently doing code reviews on) bad comments to get there. And that ignores the cost of programmers reading and attempting to process bad comments.
Personally, I would rather have well written code than bad comments...and I think programmers can actually create well written code. I question whether we should reasonably expect our programmers, after creating well written code, to suddenly acquire the skillset for writing "worthwhile" comments.
> IME people who write bad comments never write worthwhile ones
It takes a lot of expertise to extract which information (mostly the "why", sometimes a reference you used while coding) is actually useful as a comment.
I also think you need to come across comments that helped _you_ understand other code to learn what good comments are.
Sure, so people with more experience should be demonstrating good practice here, as in other areas, by writing good comments for the newer people to see and gain experience with.
Too often I write comments that I believe to be clear and descriptive only to find out much later on (when revisiting the code) they are not as clear and descriptive as I thought they were.
Therefore my top advice would be to review comments a few times after writing them to see if they are indeed the breadcrumbs that I hoped they were, or even have someone not involved in the development review the comments for clarity.
Rule 10. If you can - comment in commit messages instead of statically in code.
When you do git blame you get commit message for each line of code. You can see what ticket changed it (there will be reasons why it changed hopefully), what it was before, what other things changed with it. This is incredibly useful, much more useful than static comments in code. But you have to help yourself by writing good commit messages and making small commits that don't mix many changes into one. If you fix whitespace or some stylistic stuff - don't commit it in the same commit as your business logic changes.
The best thing about comments in commit messages is that they are automatically changed when you change a line of code in the next commit. They are never outdated and are never lying to you (regular static comments often are). Git blame should answer "why is each line of code here right now". Static comments answer "what somebody at some point thought was important in this general region of the code".
Often companies want you to adhere to their commit message standards that make it harder to write good commit messages. For example with git they will say "[task number] one line 80 char description"
You can still write a good description and have git show it correctly in one-line format if you do:
[task number] One line 80 char desc.

The rest of a good descriptive commit message.
As many lines as you like....
1) I can’t see “git blame” in my code review. Don’t make me check out the branch to review your changes.
2) If we refactor the code (say, extract a new class wrapping the functionality) then the explanation in git becomes much harder to find. If it’s a comment you can just copy it over.
You should explain the change set in your git comments too, but at a different level of abstraction - why is this whole set of changes being made? What problem is being solved? What’s left to do? Etc. This stuff can’t usually be associated with a particular line of code.
So inasmuch as you are arguing for good commit messages I strongly agree. But I disagree that this should be at the expense of good comments in your code.
You shouldn’t need this. The commit message should clearly explain any non-obvious changes (and ideally there should only be one). If there is a historical context, the reviewed commit message should include it (including links to past commits in whatever repo hosting service is used).
> refactor
If there is a refactor the reader can skip past the change in blame. This does become slightly more complicated when code is moved between files, but most repo hosting services make this fairly easy.
I don't like this idea. Nobody will read your comment in this case. I prefer comments to be visible by default without having to run extra tools. Also, usually a commit adds many functions and classes and it's difficult to describe them in one message.
The article misses part of the point, I think. All the points it describes are symptomatic fixes.
The real issue is that code should read like a report as best as humanly possible: it should have a logical structure, be well signposted, local context clear at all times, and easy to spot and correct mistakes.
// NOTE: At least in Firefox 2, if the user drags outside of the browser window,
// mouse-move (and even mouse-down) events will not be received until
// the user drags back inside the window. A workaround for this issue
// exists in the implementation for onMouseLeave().
In Firefox 2, mouse-move events cease after dragging outside window.
// TODO(hal): We are making the decimal separator be a period,
// regardless of the locale of the phone. We need to think about
// how to allow comma as decimal separator, which will require
// updating number parsing and other places that transform numbers
// to strings, such as FormatAsDecimal
TODO: allow commas as decimal separator.
Long comments disrupt visual flow. No comment is better than a bad comment.
I may have been hanging around on Hacker News too much, but I can't decide whether your comments are sincere or an attempt at parody (I hope it's the latter, or I pity Mr Krumins' employees).
hahaha amazingly I actually heard this in real life a couple of days ago... something along: "you write more comments than code! that's not productive!"
No, you have to crank the code out to production as fast as possible. Ship it all the time, every day, tens of times a day. It's a war zone, not academia where you can think all day.
Nah, sorry, I don't buy that. I'm not in a "war zone", I'm building a product for humans, and would like to earn a little so I can enjoy life.
Going to war doesn't spark joy for me, but if that's your kink, I'm not stopping you.
Running a business is always a war. Either you deploy and win the war, or you are still thinking about the code you will be deploying some day and lose the war to someone who deployed.
You can still ship early and often while adding a comment here and there so tomorrow you'll understand what you wrote and won't have to figure it out all over again
Comments are often a sign of code smell. If you have to explain the code, is it too long or too complex? Short functions, better naming, and so on. Other than that those rules seem reasonable. Specify intent over implementation. The code implements, the comments explain the reasons, if any.
To you as the author it might be crystal clear, but the next person who has to modify it might not have the full context. A lot of good comments in this thread mention that you should write comments about intention (“why”), not implementation details (“how”), although the latter might also make a significant difference to the next person.
And even if your original code might have been clear to another person, it might have been modified in the meantime.
> A lot of good comments in this thread mention that you should write comments about intention (“why”), not implementation details (“how”)
Erm, that's what I said above. I don't believe that's controversial.
And neither is the fact that if there's a 30 line comment above a 100 line function, perhaps the function should be reduced in size because it's clearly complex. In fact, IDEs such as IntelliJ will flag it for complexity.
Commenting for the sake of it, especially due to poorly named functions and variables is a code smell. Code is for the reader. The compiler doesn't care if your variables are two characters or twenty.
Breaking up a function doesn't always make it easier to follow. Instead it sends you bouncing around, trying to keep track of all the values passed back and forth. Some things you need to do are just complex because there are a lot of complex rules and mathematics.
And that whole thing was evangelized by Uncle Bob and the Agile wrecking crew. Before long it was bad to use comments, switch statements, or new up an object. This, in turn, led to the TDD movement, Agile only movement, enterprise patterns for all projects movement, and I'm sure there are others I'm forgetting.
Please comment your code. Tell me what you're trying to accomplish with this block of code. The function name doesn't always suffice. And I don't want to stare at it for 15 minutes, or re-format your 160 column LINQ statement, or Google your regex so I can read what it does on StackOverflow.
Even commenting pseudo code would be fine.