Hacker News
Best Practices of Variable & Method Naming (codebuild.blogspot.com)
37 points by javinpaul on Feb 25, 2012 | 35 comments



> Use short enough and long enough variable names in each scope of code. Generally length may be 1 char for loop counters, 1 word for condition/loop variables, 1-2 words for methods, 2-3 words for classes, 3-4 words for globals.

---

A variable, function, or class name should be exactly as short as it can be while still usefully conveying its purpose to someone who is not the code's author--no longer, no shorter. Their "best practice" was backed into by counting how many words it usually takes to convey this information, but the horse goes in front of the cart.


Best practices are best automated. Syntax highlighting helps tell apart vars and funcs so you don't need to offload it to naming conventions.

Some new approach of tags, where every variable, class name, function name can have multiple tags describing it as Factory, Adapter, Handler, whatever, can also remove this often unnecessary information from the shorthand name. Only display it in some deep editing mode or something.

Capitalization can also be handled by tags, each word in CapitalizedCamelCase or underscored_lower_case is just a tag. The IDE should handle this, so you never have to read about inconsistent naming conventions in various languages.

Each item like a var or function gets a unique ID, so you don't have to do search replace by name and find a bunch of other things with similar names in the process.

A lot of these problems stem from still using plain text as the medium.


16. No variable name should be fully self-contained within any other variable name:

Wrong: Cust & CustCtr

Right: CustID & CustCtr

You should be able to do a global search with any tool and get every instance of that variable with no instance of any other variable.
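
To make the containment problem concrete, here is a small Python sketch. The source snippet and the helper `plain_search` are made up for illustration, and it assumes the "global search" is a plain substring match, as in a basic find dialog.

```python
# Hypothetical source text; only the names Cust and CustCtr come from the rule above.
SOURCE = """\
Cust = load()
CustCtr = 0
CustCtr = CustCtr + 1
print(Cust)
"""


def plain_search(text, name):
    """Return the 1-based line numbers where `name` occurs as a plain substring."""
    return [n for n, line in enumerate(text.splitlines(), start=1) if name in line]


# "Cust" is contained in "CustCtr", so the search also hits the CustCtr lines.
hits_for_cust = plain_search(SOURCE, "Cust")

# "CustCtr" is not contained in any other name, so only its own lines match.
hits_for_custctr = plain_search(SOURCE, "CustCtr")
```

Renaming `Cust` to `CustID` would make both searches return only their own lines.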

This pretty much kills "1. 1 char loop counters", which is a good thing.


> This pretty much kills "1. 1 char loop counters", which is a good thing.

Why do you think this is a good thing ?


"for node in nodes" is preferred, at least IMO, to "for i in nodes"


"for node in nodes" implies you are returning the node itself and not simply a counter. This would seem to fall more under his 1 word for loop variables idea.

* NB: I am not certain whether I agree with such rules or not, but that is another discussion point.


You must be a functional programmer :) My specific example was Python-related; but yes, if I'm using a functional language (or comprehensions in Python) I'm much more inclined to just use i/j/k/x as the counter. But for loops in Python don't return anything of course except in `break` cases.


"i" as in a counter is almost a standard for any code in any programming language. "i", "j" and "k" (comes from math). If you have the loop counter as "node", then you're doing it wrong, but if you have an actual node, then it's ok.


Or, if you want to be sure that a search/replace of a name won't go awry, you could use names such as customerIdx, nodeIdx, accountIdx. It also helps keep things clear in the few cases where you find yourself with nested iterations (so that you don't accidentally call customers[j] when it should have been customers[i]).
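
A rough Python sketch of that idea, keeping the customerIdx/accountIdx spelling from above. The data and the `totals` dictionary are invented for the example.

```python
# Hypothetical data: each customer owns a list of account balances.
customers = [
    {"name": "Ada", "accounts": [100, 250]},
    {"name": "Grace", "accounts": [75]},
]

totals = {}
for customerIdx in range(len(customers)):
    customer = customers[customerIdx]
    total = 0
    for accountIdx in range(len(customer["accounts"])):
        # With i/j here, writing customers[j] by mistake would be easy to miss;
        # customers[accountIdx] looks wrong at a glance.
        total += customer["accounts"][accountIdx]
    totals[customer["name"]] = total
```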


If I've got an iterable of Foo object instances, I'll call it "foos". When I iterate, I'll write "for foo in foos" without exception. Calling it i, j or k is fine, too, but I try to be as obvious as possible, because I believe that is a virtue.


I no longer use i as a counter because a few times in my career I've mistaken i for 1 or l.


This is not necessary. Just search for regex \bVarName\b when doing a replace. If looking for function calls look for \bFuncName\( etc. The chances that a large multi-developer codebase will have overlapping variable or function names are extremely high.
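
As a sketch of the word-boundary approach, using Python's `re` module. The source string and the helpers `rename` and `find_calls` are hypothetical; the names Cust and CustCtr are borrowed from the example earlier in the thread.

```python
import re

# Hypothetical source text to search and rewrite.
SOURCE = "Cust = load()\nCustCtr = 0\nCustCtr = CustCtr + 1\nshow(Cust)\n"


def rename(text, old, new):
    """Replace whole-word occurrences of `old`, leaving longer names alone."""
    return re.sub(r"\b" + re.escape(old) + r"\b", new, text)


def find_calls(text, func_name):
    """Return the start offsets of places where `func_name(` appears."""
    return [m.start() for m in re.finditer(r"\b" + re.escape(func_name) + r"\(", text)]


# Only the standalone Cust is renamed; CustCtr is untouched.
renamed = rename(SOURCE, "Cust", "Customer")
```

This still matches the name inside strings and comments, and it cannot tell apart two unrelated variables that share a name in different scopes.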


I agree that you should never use 1 character loop counters, but a lot of the lazy programmers out there don't agree. They've never had to do maintenance programming (or heaven forbid make old code actually work/bugfree/do something new) that has the following variable names:

i ii iii j ij jj ji jjj jij jjj

One letter loop counters and their ilk are always a bad idea. But saying so angers so many people because they can't possibly ever be wrong.


> You should be able to do a global search

I would offer that you should have tools that are smarter about your code, with a contextual awareness of what things are and how they are used. Many purported coding "best practices" are based upon such a tool deficiency.

Let me give one example that is a serious peeve of mine -- C# allows you to define regions. I have to interact with code that uses regions to group members by accessibility or type.

#region private members

#endregion

This was communicated as a "best practice" because it allows you to hide/show appropriate parts of the code.

Yet the primary tool -- Visual Studio -- has fantastic tools for exactly that use. Tools that don't muddy up the code with unvalidated, quickly out of date metadata (e.g. nothing stops me from putting a public in there).

This holds true for many, many coding conventions. We discarded Hungarian notation for the same reason that many of these other naming guidelines are of dubious merit.


>Use specific names for variables, for example "value", "equals", "data", ... are not valid names for any case.

I have to disagree. As parameters to short functions/methods, names like "value" and "data" can be perfectly appropriate, especially if those names correspond to types.


> Don't start variables with o_, obj_, m_ etc. A variable does not need tags which states it is a variable.

I disagree on this one -- it's helpful to be able to quickly recognize whether a given variable in a method is a local variable or a member variable of the class.


I agree with you 100%. I also agree with Joel Spolsky on the goal of making wrong code look wrong, which is why I hate my company's style guide. For C++ they name member variables with a trailing _, as in name_, address_. The problem I have with that is that in a complicated expression it's easy to mistake one for the other. With m_ or some other more obvious prefix that wouldn't happen.


From http://mindprod.com/jgloss/unmainnaming.html

a naming convention from the world of C++ is the use of "m_" in front of members. This is supposed to help you tell them apart from methods, so long as you forget that "method" also starts with the letter "m".


Tagging your variables isn't really needed anymore thanks to syntax highlighting in modern text editors.

I'm pretty much on the fence about it actually. It's not really necessary because of syntax highlighting, but it is convenient to just type m_ and get a list of all the member variables from IntelliSense.


Which modern text editors do this?


Something I started doing a while ago that was surprisingly helpful was naming arguments to methods or functions like so ... params_durations or params_user etc.

This helps when I'm scanning code, to quickly know whether a variable was passed in vs being local, global or part of a class.

In Rails/Ruby programming I will also usually just write self.this_method where most people would just write this_method, because I instantly know where to look for that particular method (since a method definition could be in so many different places in Rails).

Works really well for me but YMMV.


Most of the Ruby code I've read through almost never uses camelCase for variables; it's almost always some_variable. I prefer to use lowercase + underscores for method names and camelCase for variables. I find this particularly helpful when dealing with argument-less methods that return strings since () is not conventionally used.


Yes, Ruby follows the Perl tradition of CamelCase for Module::OrClassNames and underscores in variable_names.

Interestingly, Perl6 introduces the option of hyphenated $variable-names à la Lisp. Damian Conway gave this great comment on underscores vs hyphens on the Perl6 mailing list - http://www.nntp.perl.org/group/perl.perl6.language/2010/04/m...


I think using Unicode is fine if your language guarantees support for it. (E.g. it is in the language standard.)

Ultimately, something like τ -> τ' is much more readable than tau -> tau' and is about as easy to type in a decent text editor (e.g. \tau -> \tau'). This is especially true for heavily mathematical code.
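
For instance, Python 3 accepts non-ASCII letters in identifiers (PEP 3131), so a sketch like the following runs as written. The function `rotate` and the names `τ`, `Δθ` and `θ` are made up for illustration; the prime in τ' has no Python equivalent, since an apostrophe cannot appear in an identifier there.

```python
import math

# Greek letters are valid identifier characters in Python 3.
τ = 2 * math.pi   # one full turn, in radians
Δθ = τ / 8        # an eighth of a turn


def rotate(x, y, θ):
    """Rotate the point (x, y) counter-clockwise by θ radians about the origin."""
    return (x * math.cos(θ) - y * math.sin(θ),
            x * math.sin(θ) + y * math.cos(θ))


# Rotating (1, 0) by a quarter turn lands approximately on (0, 1).
x_new, y_new = rotate(1.0, 0.0, 2 * Δθ)
```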


If you're feeling uncharitable, you can go look at the source code of PHP and see how many times they run afoul of these guidelines.

(not like the items in the article are Holy Writ, but they're sensible guidelines, and it's kind of sad that a widely-used programming language can screw up on nearly all of them)


> 12. Use meaningful names for methods. The name must specify the exact action of the method and for most cases must start with a verb. (e.g. createPasswordHash)

This depends. My classes often have side-effect-free methods that just return a value based on the state of the object. They are just properties the storage of which would be redundant. I never put "get" in front of the names of these methods. For instance, in a math Vector class, I don't have getNorm(), getUnit(), etc., I just have norm() and unit().

I struggle with what to name methods that compute, for instance, a dot product of two vectors. I don't like a.getDotProduct(b), nor a.dot(b) for that matter. The way the implicit parameter ("this") is special-cased in the notation makes everything so ugly. Nowadays I usually just have a class V with static methods that act on double[]s (so I end up with V.dot(a, b), which still sucks but at least isn't so confusing).
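
Roughly what that looks like, sketched in Python rather than the Java-style `V` class described above. The `Vector` class and the free function `dot` are illustrative stand-ins, not the actual code.

```python
import math


class Vector:
    """Minimal vector type; hypothetical, for illustration only."""

    def __init__(self, *components):
        self.components = tuple(float(c) for c in components)

    def norm(self):
        # A derived, side-effect-free value, so it is named as a noun
        # instead of get_norm().
        return math.sqrt(sum(c * c for c in self.components))

    def unit(self):
        # Returns a new Vector scaled to length 1; the original is unchanged.
        n = self.norm()
        return Vector(*(c / n for c in self.components))


def dot(a, b):
    """Dot product of two plain sequences; neither operand is a special 'this'."""
    return sum(x * y for x, y in zip(a, b))
```

Here `Vector(3, 4).norm()` reads like a property, while `dot(a, b)` treats both operands symmetrically.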


I've recently been jumping around in my naming conventions. I'm a student and haven't quite settled on one particular convention. For the current project I'm working on (C++) I've been using function overloading. So for example, I have a class called BankAccount. I have a private and a public method that will set the balance or return the balance (set is private, get is public) of the account. I've overloaded this so that if no parameter is passed it returns the balance, and if a parameter is passed it will set the balance accordingly. I use this in my deposit and withdrawal methods. I do the same thing for other setters/getters. I picked up this convention from seeing it used in other languages like Ruby and JavaScript.

Is this a bad practice? It seems logical to me, but I'm wondering what other people's thoughts are and if it may be confusing to someone else.


These are all great tips but still won't help me figure out how I should name my variables. Sometimes it's obvious but sometimes I struggle. I hope it isn't an indicator of bad code :)


In my experience if you're struggling with naming something (variable, function, class, etc.) in your code, it often indicates that you're not entirely sure about the role of that particular structure. When this happens to me, I ask myself: what is the role of this variable? What does this method do? If you can't figure that out sometimes that's indicating that it's time for refactoring.


Get the book "Code Complete", it has a large section dedicated to this problem.


obligatory:

    There are only two hard things in Computer Science: cache
    invalidation and naming things.
    -- Phil Karlton


I like this version: "There are two hard things in CS: cache invalidation, naming things, and off-by-one errors."


My students often miss the 0th element of arrays.


> use Camel Case (aka Upper Camel Case) for classes: VelocityResponseWriter

This is also called Pascal case.


RunTimeException or RuntimeException? As a non-native English speaker, I hate CamelCase.



