Yeah, I used to go by this advice, and I found the maintainability of my code dramatically increased when I typed out full names. I don't even use "i" for loop variables anymore. If the length is a problem, invest in an editor with autocomplete.
Well-known abbreviations are fine, like "iter" and "prev", but single-letter variable names notoriously impede readability for me.
If the name is more than a few characters long, it starts to become non-instantaneous to recognize it. Things get much easier to follow with visually-instantly-recognizable symbols.
So in conditions where a variable is used over a short area in the code (or where it's used _constantly_ over a wide area), I prefer short variables.
I think you're identifying the core issue here, that is, frequently vs infrequently read code. I'd argue that if you've spent enough time in Go, variables like `i`, `conf` and `ctx` will become very recognisable and easy to skim over, if they're always used in the same context.
If it's less "boilerplate" code, I'd go for more descriptive names. I think.
I like the sorta described "rule" in this document; the longer the variable is used, the more descriptive it should be.
Totally agree. If you have to scroll up to be reminded what some variable means, a longer name is good. But if its usage is a few lines away from its declaration then there's no reason to add visual noise to your code.
It's easier said than done, though. We started following this a few years back, but while the usage was "a few lines away" originally, the code evolved and now there are parts where it's 20+ lines away. This means that short variable names need to be continuously "enlarged". Well, then why not start with a "medium" name and avoid all that headache? ctx is a good enough compromise between c and context. (c can be client, config, context, certificate ... you get the point).
Rarely do you have a case where a variable outgrows its simple name (and value) and you keep it the same. Even when a function adopts new characteristics, it's advisable to change its name or make a new one. If you initially had a variable `i`, created and destroyed within 5 lines of code, and that has grown to 20+ lines, I'd recommend you (1) not add the 15+ lines if they could be put in a function (you get to name that section of the code), or (2) rename the variable. But a 5-line block that becomes 20+ lines is both strange and interesting.
Well as the article states, one-lettered variable names should only really be used in tight loops; `ctx` is a better name for e.g. a function argument.
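A rough sketch of that split in Go, as I read it: a one-letter index that lives for a couple of lines next to a `ctx` argument that lives for the whole function. The names `totalLength` and `demo` are made up for illustration and are not from the article.

```go
package main

import (
	"context"
	"fmt"
)

// totalLength sums the lengths of words, stopping early if ctx is cancelled.
// ctx is visible for the whole function, so it gets a recognizable
// three-letter name; i lives for a few lines, so one letter is enough.
func totalLength(ctx context.Context, words []string) (int, error) {
	total := 0
	for i := range words {
		if err := ctx.Err(); err != nil {
			return total, err
		}
		total += len(words[i])
	}
	return total, nil
}

// demo runs totalLength with a background context (hypothetical helper).
func demo() (int, error) {
	return totalLength(context.Background(), []string{"short", "names"})
}

func main() {
	total, err := demo()
	fmt.Println(total, err)
}
```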
This wouldn’t likely be hard to write a linter for: is the name 1-2 letters other than a method receiver, “i” or maybe “j” and is referenced more than 5 lines from where it’s declared? Lint warning.
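Something like the following might be a starting point, using only the standard `go/ast`, `go/parser` and `go/token` packages. It is a minimal sketch under assumptions: the 5-line budget and the `i`/`j` allowance come from the comment above, the function name `lintShortNames` and the sample source are invented, and it leans on the parser's older syntactic object resolution (`ast.Ident.Obj`), which is deprecated in favor of `go/types`. A real linter would more likely be built on the type checker.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// maxDistance is the line budget suggested above; the value is a judgment call.
const maxDistance = 5

// sample is hypothetical input: c and the receiver t are both used far
// from their declarations, but only c should be reported.
const sample = `package demo

type T struct{ total int }

func (t T) sum(items []int) int {
	c := 0
	for i := range items {
		c += items[i]
	}
	// filler
	// filler
	// filler
	// filler
	return c + t.total
}
`

// lintShortNames reports 1-2 letter variables (other than i, j and method
// receivers) that are referenced more than maxDistance lines after their
// declaration. Each variable is reported at most once.
func lintShortNames(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return nil, err
	}

	// Collect receiver objects so they can be skipped.
	receivers := map[*ast.Object]bool{}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Recv == nil {
			continue
		}
		for _, field := range fn.Recv.List {
			for _, name := range field.Names {
				receivers[name.Obj] = true
			}
		}
	}

	var warnings []string
	reported := map[*ast.Object]bool{}
	ast.Inspect(file, func(n ast.Node) bool {
		id, ok := n.(*ast.Ident)
		if !ok || id.Obj == nil || id.Obj.Kind != ast.Var {
			return true
		}
		if len(id.Name) > 2 || id.Name == "i" || id.Name == "j" {
			return true
		}
		if receivers[id.Obj] || reported[id.Obj] {
			return true
		}
		declLine := fset.Position(id.Obj.Pos()).Line
		useLine := fset.Position(id.Pos()).Line
		if useLine-declLine > maxDistance {
			reported[id.Obj] = true
			warnings = append(warnings, fmt.Sprintf(
				"%s: declared on line %d, used on line %d", id.Name, declLine, useLine))
		}
		return true
	})
	return warnings, nil
}

func main() {
	warnings, err := lintShortNames(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, w := range warnings {
		fmt.Println(w)
	}
}
```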
I tend to agree here, but I limit that to block scope as a general rule. I also only bother with really common concepts:
id: identifier
i: loop counter
j: inner loop counter
c: general-purpose non-loop counter
a, b: comparator methods
k, v: key/value setters, array iteration
e: event, error, or exception. Contextual rule, but rarely collides.
Other than those, I pretty much always write full word names. If it's abbreviated or shorthand, it likely sounds funny in my head, or it leads to ambiguity. If I need to ask or recall even for a split second, it's not good enough.
Programmers incorrectly place far too much weight on time to type. I write like ten lines of code on a good day. It's just not important. I want readable code that is as clear and concise as possible. Every branch, block or shorthand variable raises a question I have to think about before moving on.
Clever code is why I call ten lines of code a good day. It sure as shit isn't my typing speed.
Dense code that benefits from dense variables (which describes a lot of "pure algorithms" work) I often approach by aliasing the variables to shorter ones. It's all in the same body, so the context is not especially easy to lose.
IMHO short is OK as long as the name makes sense and is easy to connect to the real meaning. For instance, "msg" is a perfectly valid name for a message var, and even "m" in some short snippet of code is obvious, but calling it "a" or "x" or something completely unrelated like that is a bad idea. As a matter of fact, calling it something too neutral like "data" can be just as bad if it's not clear which data you mean. Context is everything in making it easy to understand.
For me, Config just isn't one of those cases. HackerNewsOnYCombinatorMessageBoard, on the other hand, becomes an opaque mess to the eyes when it is mixed with other variables of similar length and complexity.
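To make the aliasing idea concrete, here is a small sketch in Go. Everything in it is hypothetical (the `Particle` type, the `step` function, the constant-gravity update); the only point is that long, descriptive names are kept at the boundary and short aliases are used for the arithmetic in between.

```go
package main

import "fmt"

// Particle uses descriptive field names because it is read across a program.
type Particle struct {
	PositionX, PositionY float64
	VelocityX, VelocityY float64
}

// step advances the particle by deltaSeconds under constant gravity.
func step(particle Particle, deltaSeconds, gravity float64) Particle {
	// Alias to short names for the dense arithmetic below.
	x, y := particle.PositionX, particle.PositionY
	vx, vy := particle.VelocityX, particle.VelocityY
	dt, g := deltaSeconds, gravity

	vy -= g * dt
	x += vx * dt
	y += vy * dt

	return Particle{PositionX: x, PositionY: y, VelocityX: vx, VelocityY: vy}
}

func main() {
	start := Particle{PositionX: 0, PositionY: 10, VelocityX: 2, VelocityY: 0}
	fmt.Printf("%+v\n", step(start, 1, 10))
}
```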
Regarding single-letter variables, mathematical functions might be an exception. I think writing func gcd(a, b int) int {...} is better than other alternatives. There is simply no need to assign any more meaning to the arguments other than their type.
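For reference, one way the body could look, assuming Euclid's algorithm (the comment above leaves the body elided, so this is just one possible filling):

```go
package main

import "fmt"

// gcd returns the greatest common divisor of a and b using Euclid's algorithm.
// The arguments carry no meaning beyond "two integers", so a and b suffice.
func gcd(a, b int) int {
	for b != 0 {
		a, b = b, a%b
	}
	return a
}

func main() {
	fmt.Println(gcd(12, 18)) // prints 6
	fmt.Println(gcd(18, 12)) // prints 6: argument order does not matter
}
```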
Maybe if you - and more importantly, everyone that will read the code in the future - are comfortable with domain-specific expressions like that. It depends on the audience really.
As an extreme example, scalaz is similarly a very specialized DSL. Or in my personal experience, functional constructions like map, flatMap, foldLeft and reduce (which I never learned in school).
For GCD? I would find those names very misleading since semantically the order of the arguments to GCD is irrelevant (even if in the implementation you typically mod by b there's no reason you couldn't mod by a).
I generally find if you need multi-letter variable names it means your function or scope is too long, or manipulating too many things. It's a nice little red flag for me.
Up to four or five single-letter variables is pretty trivial to remember. Especially when three of those are i, j, and k. More than 6 or 7 rapidly becomes painful. But if you are manipulating 6 or 7 variables _in the same scope_ you are doing too much.
If you have to go "cross-reference" elsewhere, you are modifying something "too far away" from you. That's a giant sign of spaghetti code.
Also, names vary with their contextual scope. Larger scopes mean longer names generally. Russ Cox gives a succinct description here: https://research.swtch.com/names
A name's length should not exceed its information content.
For a local variable, the name i conveys as much information as
index or idx and is quicker to read. Similarly, i and j are a
better pair of names for index variables than i1 and i2
(or, worse, index1 and index2), because they are easier to
tell apart when skimming the program. Global names must convey
relatively more information, because they appear in a larger
variety of contexts. Even so, a short, precise name can say more
than a long-winded one: compare acquire and take_ownership.
Make every name tell.
Variables should generally have short scope (we don't want a lot of global, or even package level variables). So _variable names_ in particular should be short.
> For a local variable, the name i conveys as much information as index or idx and is quicker to read
This is only true because i is a specific, common abbreviation for index. When writing arbitrary glue code, a single letter variable would be a meaningless abbreviation without shared context. If you encounter "i" and it doesn't mean "index of a for loop", you're going to be taking additional time parsing meaning.
An example where I disagree, and use 1-letter names daily: arrow functions in both Java and JS.
`usersList.stream().forEach(u -> someSet.add(u))`
It's immediately obvious that u is a user in usersList. I realize that it's debatable if u is really more readable than spelling out user, but I prefer it, and I don't think anyone is going to be confused by it. If the chain does get really long, also, I will spell it out explicitly.
I disagree. I am pretty sure that there's some code that is short and simple where you want descriptive names anyway, such as finance-related computations.
> I generally find if you need multi-letter variable names it means your function or scope is too long, or manipulating too many things. It's a nice little red flag for me.
Many times your function should be longer, rather than shorter.
Short methods and functions written just for the sake of being short only move the complexity into the interactions between them, making an algorithm's logic harder to follow than if it had all been in the same place (for related functionality, of course; I don't advocate having a function do two unrelated things).
Agreed. Step through the code in a debugger if you really want to understand what it's doing. Even unreproducible problems with production code can succumb to raw understanding gained from stepped test cases.
In short Go functions I tend to like single letter variable names. Maybe because it makes me closely consider the behavior of the program instead of trying to assume by the variable names.
Well, for most things OK, but i is so well established that if you don't use it for loop indexes, you probably impede readability for those reading your code.
I am fully there with you; the only exceptions are still using "i" in short for loops, e for WPF/Forms event handlers due to convention, and typical math symbols like x, y, z.
> I don't even use "i" for loop variables anymore.
Yeah, I noticed that for myself, too. So instead of i I might use frame_index, or whatever it "actually is". Up to a certain length it seems faster to just read what is there, without an additional mental translation step. But to be honest, I just do it because I like it.
> Functions should do one thing only. ... In addition to be easier to comprehend, smaller functions are easier to test in isolation, and now you’ve isolated the orthogonal code into its own function, its name may be all the documentation required.
Using single-caller functions as a substitute for comments makes the workings of a specific operation much harder to follow, as you have to jump around the source to understand its effects.
A long function is easier to understand than an exploded one.
Also tests should target specific operations (aka functional tests), not every single function in the program.
EDIT: Every function you add becomes part of your internal API. Any API, exported or not, should comprise a cohesive collection.
> A long function is easier to understand than an exploded one.
This is a pretty controversial position, and quite situational in my opinion. I absolutely agree that having to hop all over the source to understand something is frustrating, but that doesn't mean that very long functions are the right solution. Some combination of reasonably named helper methods and a function flow that makes the logic easy to parse should be the goal; either end of the spectrum is a problem.
> If a function is only called from a single place, consider inlining it.
> If a function is called from multiple places, see if it is possible to arrange for the work to be done in a single place, perhaps with flags, and inline that.
> If there are multiple versions of a function, consider making a single function with more, possibly defaulted, parameters.
> If the work is close to purely functional, with few references to global state, try to make it completely functional.
> Try to use const on both parameters and functions when the function really must be used in multiple places.
> Minimize control flow complexity and "area under ifs", favoring consistent execution paths and times over "optimally" avoiding unnecessary work.
This is one reason I like nested functions. They’re not available to the surrounding scope so they don’t succumb to these weaknesses, while also allowing you to organize your very long function internally by task. I use ‘em in Python all the time.
It’s a bummer that more languages don’t support them, though you can get there with lambdas too, sometimes at the cost of more syntax.
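For what it's worth, Go gets close with function literals assigned to local variables. A minimal sketch, assuming a made-up `wordCounts` function; the helpers `normalize` and `record` are closures, so nothing outside the function can call them:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// wordCounts is a hypothetical example: its helpers are closures declared
// inside it, so they never become part of the package's internal API.
func wordCounts(lines []string) string {
	counts := map[string]int{}

	// normalize lowercases a word and strips surrounding punctuation.
	normalize := func(word string) string {
		return strings.ToLower(strings.Trim(word, ".,!?"))
	}

	// record captures counts from the enclosing function.
	record := func(word string) {
		if word != "" {
			counts[word]++
		}
	}

	for _, line := range lines {
		for _, word := range strings.Fields(line) {
			record(normalize(word))
		}
	}

	// Sort the keys so the output is deterministic.
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, fmt.Sprintf("%s=%d", k, counts[k]))
	}
	return strings.Join(parts, " ")
}

func main() {
	fmt.Println(wordCounts([]string{"Go, go!", "Names."})) // prints go=2 names=1
}
```

One trade-off: a Go function literal has to be assigned to its variable before the line that calls it, so the helpers end up at the top of the function rather than in order of use.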
One of the (few) things I like about Javascript is the ability to define a closure anywhere in the containing function, so it appears in the order of operations:
    function f() {
        setTimeout(fDing, 2000);
        g();
        function fDing() {
            ...
        }
    }
Disclaimer: I have no Golang programming experience.
I am curious about this approach, though... you're nesting behavior and/or logic. Doesn't this further obscure the meaning of the code and contribute more to the need to jump around the source in order to figure it out?
Now write a unit test for the inline block of code within the longer function which would have become the nested function???
You write the test (where any is needed) for the outer function.
Edit: the start of the discussion was about using nested functions to decompose what otherwise would have been “unpartitioned” long functions. Such blocks of code nested within a long function would not be possible to unit test, either.
Unit tests, rather than integration tests, are usually bogus, anyway, though.
You write unit tests for units of code. A function with nested functions inside of it is a single unit of code; that's essentially what those functions being nested, and hence not directly invokable from the outside, indicates.
Make private functions that are only visible to the current module. Then write your unit tests directly in that same module, next to the functions, so they can access them even when the rest of the world cannot. Of course, this requires sensible language support.
Hierarchy vs list. I don’t want a list (of subroutines). I want a tree of self contained routines. Only your containing routine uses you. Which “private“ routines use which? (I know the IDE will tell me about this routine, and that routine, but I don’t want to have to ask)
Your employer doesn’t want you testing getters, anyway, but rather features. Unit test fanatics need to stop.
I guess unit tests were useful for C++, when it was constantly crashing everything :-(
C++ and its legacy need to ride off into the sunset, already.
You are still leaking the details of the function that is the sole caller of those other functions. It's not leaking across a translation unit boundary, but that's not the only boundary that matters.
An excellent balance is to try to make a function operate at only one level of abstraction at a time. It's a bit of a flexible guideline, but basically you shouldn't call `isUserActive`, do some complex arithmetic, extract data from a complex data structure, and call a templating engine in one function, since those are all at different levels.
As long as what you are doing is approximately the same type of thing, it is fine to do a lot of things without breaking readability.
Longer functions are much more prone to causing errors, and errors that are harder to find. It is honestly much better to have functions that do one thing and one thing only. Might not always be possible, but it is always the best way to code.
I think “doing one thing” and “getting called one time” are getting confused.
GetData()
FormatData()
UploadData()
Those could do one thing but could be called at several points within a larger program, which I think you are fine with. Or they could only be called once, in which case I think you are saying it might make sense to just inline them.
I don't necessarily agree with "one thing only". At least, for things that start simple and grow as needed, I split things into functions either to not have to copy and paste the same code, or for readability/structure purpose. But not out of principle and always, until I can't divide any further. I can still split things out into functions later, should I need it, but doing it "just in case" and then not even having a use for it doesn't save me time and just adds overhead.
Though it also depends on whether I'm doing something familiar or something new, if I'm doing something new I might split things up more to help me conceptualize the problem. But when I'm just making a quick CLI tool, I might put it all in main first and only split it up as needed.
The argument is that it is not fundamentally better to use the longer name in the given context, so why make it longer? I'd say it isn't about how long the variable takes to type either, but how long the code takes to read and parse.
I'm wrestling with this at work right now. The short names don't really bother me; they are right that if you have the type, the shorter name can make the code slightly easier to read. What annoyed me is that with single-letter names I always got collisions, so my naming was inconsistent. I personally ended up using medium-length names (like conf), except where there was only one local variable (and sometimes two or three), because in that case there were no collisions and what the variable was is very clear.
The first version was too long to fit on the screen for me. Reading the second I missed the swapping of `i` and `j` -- only noticing it when I went back to the first version to figure out which parts of line I would assign to something if I were to rewrite the code.
It says so in the signature, but Go code tends to have long functions, so it's kind of a fallacy to say that it's going to be easily recognizable 50 lines in because it's in the signature. Now if this were Haskell and it was a single small expression, it would be different.
The issue w/ calling it "config" is that you end up with the confusing scenario where "config" is the object and "Config" is the type, differing only in casing.
Sorry. To clarify: it's not that I'm confused by the distinction between capitalized vs uncapitalized, it's that visually, "config" and "Config" look quite similar at a glance, whereas "c" and "Config" are clearly visually distinct.
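To show what I mean side by side, here is a tiny sketch with a made-up `Config` type (the fields and function names are invented for illustration). Both functions behave identically; only the parameter name differs.

```go
package main

import "fmt"

// Config is a hypothetical settings type.
type Config struct {
	Addr    string
	Retries int
}

// describeLong names the argument after its type, so "config" and "Config"
// differ only in the case of the first letter.
func describeLong(config Config) string {
	return fmt.Sprintf("%s (retries=%d)", config.Addr, config.Retries)
}

// describeShort uses a one-letter name, which is visually distinct from the type.
func describeShort(c Config) string {
	return fmt.Sprintf("%s (retries=%d)", c.Addr, c.Retries)
}

func main() {
	settings := Config{Addr: "localhost:8080", Retries: 3}
	fmt.Println(describeLong(settings))
	fmt.Println(describeShort(settings))
}
```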
> In this case consider conf or maybe c will do if the lifetime of the variable is short enough.
This seems petty. Is it really that problematic to type out a few extra characters?