I very much disagree. The in-lined version can easily be confusing, harder to ch...

alephr · on Nov 15, 2019

I don't think it's a preference issue. The problem you have when you split up functions is you're making decisions that have consequences in a fairly arbitrary way(not backed by an understanding of the system, just how you feel things should be broken up).

On top of arbitrarily pushing the system in directions, you yourself note that you hide the actual context of the code when you split it up. It might make you feel better in the moment but I don't think it's the right response to feel emboldened by making decisions with less context. Premature abstraction is at the heart of a lot of bad design and complexity.

With even primitive dev tools you can get a lot of the benefits (grouping, naming) with comments and braces. More could definitely be done on this front but dev tool progress is sadly pretty slow. Going this route you can have organization while not throwing away the all-important context of what the code is actually meant to accomplish.

horsawlarway · on Nov 15, 2019

I suspect this is mostly preference.

Personally, I really dislike pulling everything out into small methods - it literally forces you to remember a crap load of names. I find remembering names to be extremely high cognitive load.

Worse, the names WILL lie to you. Especially if you weren't the one doing the naming.

So not only are you forced to remember a bunch of labels that better fit someone else's mental model, you're still not off the hook for understanding what the code in those tiny methods is doing, and considering how it might impact the task at hand.

So now you're stuck with twice the work - Remembering which name goes with which functional piece of the task at hand, and then remembering how that name actually accomplishes the task (the actual code).

Worse, because functions are moved out of line, I find it much harder to jump between relevant bits of actual "doing things" code.

Basically - The ONLY time I want to split code off into named chunks is when the alternative is copying/pasting code somewhere. I look for code which is getting reused and break that out.

The in-lined version is faster to read, faster to understand (and by understand I mean REALLY understand, not some hand-wavey "I trust this name" understand, but as in you actually know the operations and changes to the system that the call will result in)

The downside to inline code is you have to actually read it. I find a lot of the folks who really like short methods struggle to parse the language they're working in and fall back on the name without actually understand what the chunk of code does. But I think they're also the folks who have a better memory for names.

davvolun · on Nov 15, 2019

Reducing cognitive load by breaking things into digestible pieces can be hard, and at times it's hard to say if option A is better than option B, but it is not preference.

If `extractId` is an eight-step process that's only used once, but I step over it and check locals to find that the variable was correct before the call, and wrong after, then I just found my problem. That would be faster than stepping over each step to find the problem -- basically the debugging version of divide and conquer (for finding the solution, optimal would be binary search, but for readability and maintainability, that should not be the goal).

Another example is if I have a boolean condition that incorporates extremely complex logic but is only used once, if I assign it to a variable called `nameWasPreviouslySet`, that is easier to read and understand, and if I'm debugging and expect `nameWasPreviouslySet == true` at this point, and it's not, then I've figured out the bug is in setting `nameWasPreviouslySet`. So DRY shouldn't be the only reason to refactor -- refactor for readability also.

horsawlarway · on Nov 15, 2019

Ok, so lets take your example at face value. You have an extremely complex step of code that you only call once.

Already, you have invariants that are easy to break. You're assuming it's only called once. That's relatively safe if you're inline, that's bogus as soon you've broken it out into a named function.

Some other dev WILL come along and re-use your helpfully broken out code somewhere, and that invariant no longer holds.

Even assuming no one else has re-used it, what ensures that the problem is actually in that method? Particularly if that method itself calls out to many small named methods.

So you have some complex code that does something, but the "things" it does are call out to other small chunks of code.

so now you have

DoComplexThings => { DoSubThingA DoSubThingB DoSubThingC DoSubThingD DoSubThingE return }

So you're back to binary search - The problem is in there somewhere, but "in there" is actually calls out to many other functions again. Which one is broken?

And wait! DoThingD got changed at some point and will now fail if you run it before DoThingC, but there's nothing in the names that tells you that.

So you have hidden deps between chunks of code that are all trying to accomplish a single goal.

If you're debugging that, you almost certainly have to go through all the code in that method again, except you have to page back and forth in your editor to get the meaningful pieces into view. All so some OTHER dev could give it name that was meaningful to them, but likely not that helpful when you're debugging because the mental model they have of the system resulted in the bug in the first place.

---

Now, rant aside - I think we agree more than we disagree, I think good names matter, and I think there's a lot of wiggle room around when a thing should be broken out.

But if the code is trying to do a single thing, and it's not re-used, I'll take a single cohesive method over 8 tiny abstractions ANY day.

davvolun · on Nov 15, 2019

> if the code is trying to do a single thing

What is a single thing? Login? Initialize db? One line of code, 3 lines of code, ...

If you have 100 lines of code and 6 nestings, even if none of that is or, as far as you can tell, should ever be re-used, you should break that into smaller, digestible chunks of DoSubThingA, etc.

It's kind of an aside, but you mentioned e.g. `DoThingD` got changed or "some other dev WILL come along" -- if some other dev comes along and doesn't bother to check where and in what context `DoThingD` is being used, and changes it in a way that breaks `DoThingC` or `DoThingE`, they aren't doing their job. Particularly, if `DoThingD` wasn't written to be re-used, and they just use re-write it and use, particularly in a way that breaks its original invariance... I would be having strong words with that developer.

horsawlarway · on Nov 15, 2019

Oh, and I'll add in a separate comment because it's sort of unrelated. Code structure should have very little to do with how you narrow down failing code.

You can do a binary search by just shoving some log statements into the application and seeing what prints out (or hell, use the debugger for your toolset). That doesn't change if I have 100 lines in a single method, or 500 lines divided out into 100 different "tiny" methods.

I think it's actually harder when the methods are all split, because again - more paging back and forth in the editor to add the required debugging info.

dragonwriter · on Nov 15, 2019

You are talking about what is (or is equivalent to) printf-style debugging; if you use an actual debugger, you can step over function calls, you’ve got to do more work manually setting breakpoints to do the equivalent with undifferentiated streams of code in a long function.

hollerith · on Nov 15, 2019

So, you add a command to your debugger that lets you step over a "block" (a sequence of statements surrounded by braces -- the analog to Lisp's PROGN and Scheme's BEGIN) and before you debug the undifferentiated stream of many statements, you factor it into a handful of statements -- without introducing any new function names -- some of which are blocks.

horsawlarway · on Nov 15, 2019

No, what I'm talking about is how to narrow a problem down. The medium you use to accomplish that is flexible.

I'm a little confused as you why you think clicking step-over 20 times to get out of the std::lib is better than just moving the mouse down 20 lines and adding another break.

So again, this feels like we're back at "this is a matter of preference".

oblio · on Nov 15, 2019

I kind of agree with you. If fold markers were universal, I'd say that they would have been the best of both worlds. Longer linear functions with logical blocks foldable.

Folds get a bad rap because they aren't that widespread, outside of Emacs/Vim and amusingly, the .NET world, and because some people abuse them to write God-classes and such.

dgb23 · on Nov 15, 2019

I understand what you mean. I think a nice middleground between our preferences would be scoped blocks with explicitly passed vars. This would keep the readability you want but still have the other benefits I mentioned. Plus it would be simpler to refactor when needed.

horsawlarway · on Nov 15, 2019

This is a good point, and nicely matches a pattern I find myself using in JS where I just define the sub methods inline in the larger method.

You get the niceties of a name for the folks who want it, but the code is still grouped nicely and it's easy to walk through the whole operation.

I see this much less often in languages where scoping is less flexible though.

shultays · on Nov 15, 2019

Personally I find all those function calls harder to debug/review. I am perfectly capable of interpreting most non-complex small code blocks with a glance. If someone takes that and replaces with a function it usually forces me to step into function to understand what it really does. Function name wont be as meaningful as the actual code.

Forcing stuff to sub functions just hiding information away from the people that reads code.

nikita2206 · on Nov 15, 2019

If you’re talking about scopes of variables then if language supports it, the best of both worlds would be to have an inlined version where each block has its own scope so that you can easily track which variables are actually shared among those scopes and which only belong to certain scopes.

olejorgenb · on Nov 15, 2019

As long as the sub-procedure don't need 10+ input parameters and 3+ output parameters/return values I can agree. Otherwise it's a sign that the code/logic simply too interrelated to be broken up well.