I think about that email a lot at work (where we have some very long functions). I particularly like this excerpt, which is applicable outside of performance-critical situations:
> Besides awareness of the actual code being executed, inlining functions also has the benefit of not making it possible to call the function from other places. That sounds ridiculous, but there is a point to it. As a codebase grows over years of use, there will be lots of opportunities to take a shortcut and just call a function that does only the work you think needs to be done. There might be a FullUpdate() function that calls PartialUpdateA(), and PartialUpdateB(), but in some particular case you may realize (or think) that you only need to do PartialUpdateB(), and you are being efficient by avoiding the other work. Lots and lots of bugs stem from this. Most bugs are a result of the execution state not being exactly what you think it is.
in general I don't think it's worthwhile to split a long function into several static helpers just to get under an arbitrary maximum function length target. I don't think it leads to a net improvement in readability, since I now have to go back and forth between the helpers and the main function to see in what order the helpers get called (what if someone swaps the order of helperA and helperB but not the order of their definitions?). imo this is only worth doing if you're also willing to think long and hard about what happens if someone uses your helpers in a different context.
This exposes one of the weaknesses of C and C++. Nesting functions inside functions is actually very useful for avoiding precisely what's described, while still allowing the code to be more readable. The inner functions can always be extracted later if deemed appropriate. If C permitted it, you could define PartialUpdateA and PartialUpdateB inside FullUpdate itself.
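A sketch of what that could look like (hypothetical; this is not standard C, though GCC accepts nested functions as a non-standard extension, and the names come from the quoted email):

```
void FullUpdate() {
    /* only callable from inside FullUpdate */
    void PartialUpdateA() { /* ... */ }
    void PartialUpdateB() { /* ... */ }

    PartialUpdateA();
    PartialUpdateB();
}
```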
So that function is only visible in this one scope, but where it's used it can greatly improve readability (especially if it's called multiple times).
Now, C++ can halfway get there with private methods in classes. So anyone outside the class has to really try to break that encapsulation and access the function. C and C++ can get there by not exposing the functions in headers, so they remain file local.
But that doesn't prevent something like (within a file):
// should only be called from FullUpdate
void PartialUpdateA() { ... }

void AnotherFunc() {
    ...
    PartialUpdateA();  // nothing prevents this call
    ...
}
C++ lambdas do solve the problem; I'm not sure why I didn't include that.
However, unit testing them is not the only concern. There are reasons for using nested functions, class methods, file-global, or program-global functions.
If you make them global or methods, you lose control of how and when they're called, which can break invariants. So some functions can be hoisted up, but others oughtn't be (in particular, any pure function can be made global with no concern beyond occupying a name; side-effecting functions should be considered more carefully).
The interface to the functions may change if they capture any variables. If they capture nothing in the local scope, then hoisting them doesn't impact their interface. If they do capture something, hoisting them means adding parameters (complicating the interface) or making them observe variables either in the class (complicating the class) or file/program globals (bad practice).
Regarding unit testing: nested functions (or lambdas) are essentially a wash. You were, hopefully, testing the host function to begin with, so nothing changes if you use nested functions as your first-pass refactoring approach. You can then examine those functions, consider which should be moved out and why, and add tests for any that have been pulled out of the host function.
C++ gives you enough access control so you can expose things just for testing if you want (make the test a friend, encapsulate the functions into another class). You can do this without breaking encapsulation.
If the functions are capturing a lot of variables, then I would try to reconsider the design. Usually things aren't irreducibly complex.
I don't exactly follow your last paragraph, but tests aren't just something you throw away when they pass. So if I were to write tests for the helper functions, I wouldn't delete those tests on a refactoring pass in order to use nested functions.
> C++ gives you enough access control so you can expose things just for testing if you want (make the test a friend, encapsulate the functions into another class). You can do this without breaking encapsulation.
If tests can do this, then so can anyone else. Consequently encapsulation is broken and your invariants aren't invariant anymore.
> I don't exactly follow your last paragraph, but tests aren't just something you throw away when they pass. So if I were to write tests for the helper functions, I wouldn't delete those tests on a refactoring pass in order to use nested functions.
I didn't write clearly because I didn't re-present the context of that paragraph.
I'm not talking about throwing away tests after they're run. Keep in mind my original post's context: manually inlined functions for access control to that functionality. You already can't test those separately because they aren't exposed. By moving to nested functions you regain some ability to reason about the code (versus 1k+ line functions with who knows how many levels of nested blocks), and the compiler can still do the inlining (for performance). But it has zero net effect on testing; it's a wash, because the public interface is the same (only the primary function's interface is accessible to a tester).
If nested functions are available (and in C++ they are, via lambdas), the refactoring would (or could) go something like: 1k-line function => 500-line function with several lambdas => 3-8 functions totaling ~500 lines, with some lambdas remaining.
Only those that make sense to move out for separate testing would be, and only if you also wanted to expose them for others to call.
If I had an image loader class, I could make a test function a friend, which would allow it to call private functions on the class. This grants access only to the test function (or test class). And there are stronger ways to hide things, like the PIMPL idiom, private headers, and opaque pointers.
I find it interesting we're in such different schools of thought here.
> Only those that make sense to move out for separate testing would be, and only if you also wanted to expose them for others to call.
There are so many things that you might want to hide from an interface, yet still test. Imagine if you took that to an extreme and only tested the public interface of a library. I'm all for trying to independently test any bit of code that fills a screen.
I suppose you're not implementing safety-critical code then, as AV rule 1 [1] demands that "Any one function (or method) will contain no more than 200 logical source lines of code (LSLOCs)."
Personally, I just find it frustrating when functions don't fit on the screen anymore (and I already use portrait mode). Further, sub-functions, when named appropriately (definitely not like helperA and helperB), can aid readability (as would comments on code blocks, but who writes those, and who maintains them?).
you are correct; I do not work on safety-critical code. I have worked with a lot of legacy code where functions depend on and mutate global state, often in fairly subtle ways. as much as possible, I want to see exactly which globals are being read/written and in what order. from this perspective, the "whole function" doesn't fit on one screen, whether or not sub-functions are factored out, and the name of a function cannot possibly tell me all I need to know about it.