Hacker News new | past | comments | ask | show | jobs | submit login
Refactoring in C++: Top Techniques and Best Practices (jetbrains.com)
47 points by signa11 26 days ago | hide | past | favorite | 19 comments



This reads like nothing more than an advertisement of Clion’s refactoring features. Some of the tips are even quite dangerous and highly context dependent, like creating a base class when two classes share common functionality. That sounds like a refactoring nightmare to happen.

I’m currently working through a difficult refactor and the most important tips I can give are:

- test everything after each change,

- make (very) small adjustments.

- Use version control.

- sometimes make large sweeping adjustments, copy the main logic of it to a scratch file and then create small commits that bring you to that larger change.

- don’t be afraid to reset to your last “checkpoint”, sometimes you just go a wrong direction / overengineer something. Just throw it away and use it as additional guidance to build the right thing.


> This reads like nothing more than an advertisement of Clion’s refactoring features.

Yes, I think that's the whole point.

> Some of the tips are even quite dangerous and highly context dependent, like creating a base class when two classes share common functionality. That sounds like a refactoring nightmare to happen.

I don't see where that hypothetical danger lies. Deduplicating code is a basic refactorizarion step.

> I’m currently working through a difficult refactor and the most important tips I can give are:

I'm afraid you're going the wrong way about this. No wonder you associated "danger" with refactoring.

The basic steps you take to refactor legacy code are:

- identify code paths that are affected if you touch a specific component,

- identify invariants and expected state changes through that code path. If there is none, add your own (see code seams)

- add tests to verify those invariants (see software vise tests)

- proceed with refactoring.

If you want to learn more, I recommend you pick up the book "working effectively with legacy code" by Michael C Feathers.


> I don't see where that hypothetical danger lies. Deduplicating code is a basic refactorizarion step.

Deduplication is one thing and fine, but creating inheritance hierarchies should not be used for simple code reuse, but for representing relationships.


> Some of the tips are even quite dangerous and highly context dependent, like creating a base class when two classes share common functionality.

Yeah, I am a pretty big fan of inheritance, but that makes me cringe. Don't do this unless you understand the semantics of the class, and even then probably don't.


This is a little light on actual content.

The best practices are: 1) do it in small steps 2) use version control 3) use automated tests 4) make sure the code is readable.

Pretty straight forward.


5) Use "asserts" liberally to check program state.


here's one that a teammate taught me - const everything that can be const and then slowly undo the consts when necessary. it'll reveal lots of stupid implicit state spaghetti. const methods for the OOP flavor of spaghetti.

another one is move variables as close to their use as possible. so many people somehow inherited FORTRAN style (from where! you're younger than me!) and declare variables at the top of a function. that's a brilliant way to either forget to use/remove/initialize them and/or shadow names. drives me utterly insane (because it's like who the hell is teaching people this).

also probably -Werror or -Wall depending on the size of the code base (at least -Werror=return-type, -Werror=unused-variable, -Wunused-but-set-variable, -Werror=dangling).

> use version control

yea lots and lots of incremental commits with decent commit messages purely for you to remember what ground you covered. also lots of branches and rebasing. what a chore but it's the only way to stay sane.

> make sure the code is readable

debatable but my first move is to undo all the spaghetti OOP and "please break out into a function" bullshit. less readable to the junior maybe but much more readable to a senior that needs to actually understand what the entire module is doing.

C++ is fine language - most people just are really bad at it.


> so many people somehow inherited FORTRAN style (from where! you're younger than me!) and declare variables at the top of a function.

I thought this was a C 89 requirement too, and I thought until recently that some Linux distros lifted the build requirement to C 99.

I've worked on a few code bases with all the variables declared at the top of a function though and you're absolutely right - plenty of variables that end up unused or shadowed.


Also from those who learned with Pascal. Traditional Pascal requires that all variables be declared at top of procedure/function in a 'var' section.


> I thought this was a C 89 requirement too

TIL


It's not a C89 requirement. C89 requires variables to be declared at the top of a block, not a function. So, you could write things like this:

  while (...) {
      int a = 1, b = 2, c = 3;
      printf("%d\n", a + c - b);
  }


Ah okay - I misremembered - I only had to work with strict C89 once and that was close to 15 years ago now, and at a similar time I inherited a codebase that did things the C89 way but it was compiled as C++ so with no enforcement of variables at the top of blocks I didn't really pay attention to where things had been declared when it was first written.


As for "Top": The best refactoring tool I ever used was http://structure101.com, for both C++ and Java.

Unfortunately it was just bought by Sonar. Their former 30-day free trial was mind-blowing. Now it seems to be reduced to architecture enforcement.

It creates an internal database of names and relationships and shows you graphically the dependencies, so you can detangle cycles, impose layering, extract common utilities, etc. It helps you make and execute an overall plan.

Yes, work incrementally with tests, but you're blind if you can't see the whole landscape and have no plan (esp. no vision of exactly why the result will be better).

The strict meaning of refactoring is not any re-design, but a wholesale change that is guaranteed by language features to be correct (i.e., to not change semantics). With structure-101 (in java at least) you can lay out a whole series and execute them all.

(Absent that, Eclipse offers a host of refactoring steps for Java.)

It's too bad given the dynamics of worse-is-better free or cheap IDE's that syntax- and semantic-aware tooling like structure101 isn't widely prevalent.


> Refactoring code in C++ is challenging because there are roughly ten different ways to do one thing.

When learning C++ I was not yet proficient in English, so I learned the version that was taught in the books translated to my language. I only later realized how much the language evolved over time, so while I can do C++, I feel I am writing in old style.

I wonder what is a good way to refactor code for keeping it conforming to updated best practices.


You can check out the clang-tidy "modernize-" set of checks [0]. I think some IDEs like clion or visual studio (both the "full fat" and VS code with a plugin) have options to integrate this. Or even just as a build pass - if you use cmake you can just set the CMAKE_CXX_CLANG_TIDY variable [1] to the path of the clang-tidy executable and it'll run every build.

They will likely pick out the "low hanging fruit", the easy like-for-like transforms like for loops or std::array, but more "structural" differences that are considered "bad" in modern C (like raw pointers or non-trivial users of those some of those same "replaced" features) will likely require much more significant manual input. In many ways the changing environment of more modern c++ changes how you holistically should design code and data structures, and no tools will "fix" that.

But it's a start, and you need to start somewhere. I mean it's not like there's an "end target" for the "perfect modern c++" anyway. I also wouldn't treat the output of clang-tidy as gospel 100% truth either, look at the changes it suggests and decide if they are actually making clearer code before blindly applying them, it might even help with understanding the new features if you can understand why they were changed and what problems they are trying to solve too.

[0] https://clang.llvm.org/extra/clang-tidy/checks/list.html

[1] https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_CLAN...


I haven't worked on a large C++ codebase yet, but with Rust my refactoring technique has been 1. Roughly model the data flow on paper so you don't have to fight the borrow checker or make design changes later. 2. Make the obvious changes related to the refactor. 3. Listen to Rustc and Clippy. 4. Everything usually works with surprisingly few issues.


> 1. Roughly model the data flow on paper so you don't have to fight the borrow checker or make design changes later.

So Rustacians have to defer to paper and pencil just to avoid the tooling huh, something seems counter productive about that.


What's proposed in the article is mostly what I would call, low level refactoring. It's what's done almost all of the time. Group some things together, split some things, use better names, move to other files etc.

Technical debt on the other hand originates more from a design that does not fit the application anymore, because of added features or change of goals over a latge time span. That's where what I would call 'real refactoring' needs to take the stage.

Document/draft the current architecture as well as an architecture that fits better the purpose of the application. Work out a path to move from one to the other. Usually but not necessarily, this path is executed in steps, while keeping everything functional. New skeleton / middleware code is needed, some parts need rewriting, some parts need moving, some parts need deleting.


I am yet to find a tool to unwrap/unroll macros, specially monsters like these:

    #define loop(v,m) for(int v = 0; v < int(m); ++v)
    #define loopi(m) loop(i,m)
    #define loopj(m) loop(j,m)
    #define loopk(m) loop(k,m)
    #define loopl(m) loop(l,m)
    #define looprev(v,m) for(int v = int(m); --v >= 0;)
    #define loopirev(m) looprev(i,m)
    #define loopjrev(m) looprev(j,m)
    #define loopkrev(m) looprev(k,m)
    #define looplrev(m) looprev(l,m)
JetBrains failed miserably on that codebase.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: