The magical -- and not harmful -- rebase (Git) (jeffkreeftmeijer.com)
58 points by jkreeftmeijer on Oct 11, 2010 | 24 comments



Rebase makes long-lived branches much more manageable.

Everyone who's worked with one knows that long-lived branches are a bad idea - the performance-improvements branch that the intern started last year and has been running on beta for a while but hasn't really kept up to date with the new features we added on production. The time required to merge one of these seems to grow faster than linearly with the time since it diverged, and a year-old branch is basically impossible to merge.

However, long-lived branches happen (features get shelved and then picked up again, refactoring gets postponed due to deadlines...). Rebase doesn't make long-lived branches a good idea, but it makes them much easier to deal with:

* you can rebase once a week and keep your branch up to date (to avoid the superlinear growth of merging effort). You can just about do this with frequent "latest changes from master" merges, but it clutters the branch history, and merge commits complicate things if you ever need to rebase. (Also, anyone who tried to do this with Subversion probably got shivers down their spine at the thought of merging the branch back in after all those merges from master.)

* when the branch inevitably conflicts with master, it's much easier and less risky to resolve the conflicts a commit at a time, rather than resolving them all at once as a merge would force you to.
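A minimal sketch of that weekly catch-up, run in a throwaway repository so it is safe to try. The branch and file names are made up for illustration; in a real project the catch-up step would more likely be `git fetch origin` followed by `git rebase origin/master`.

```shell
#!/bin/sh
set -eu

# Throwaway repository; nothing outside this temp directory is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example"
git config user.email "example@example.com"

echo base > app.txt
git add app.txt
git commit -q -m "base"

# The long-lived branch (hypothetical name) gets two commits.
git checkout -q -b perf-improvements
echo perf1 > perf.txt
git add perf.txt
git commit -q -m "perf: step 1"
echo perf2 >> perf.txt
git commit -q -am "perf: step 2"

# Meanwhile master moves on.
git checkout -q master
echo feature > feature.txt
git add feature.txt
git commit -q -m "new feature on master"

# The weekly catch-up: replay the branch's commits on top of the latest master.
# If a commit conflicts, rebase stops at that commit so it can be resolved alone
# (fix the files, git add them, then git rebase --continue).
git checkout -q perf-improvements
git rebase -q master

# The branch now sits on top of master, with no merge commits in its history.
merge_count=$(git rev-list --merges --count master..perf-improvements)
branch_commits=$(git rev-list --count master..perf-improvements)
echo "merge commits: $merge_count, branch commits: $branch_commits"
```

The two counts at the end are just a sanity check: the branch still has its own two commits, and none of them is a merge.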


I would say that your problems were not caused by a long-running branch per se, but rather were caused by the intern not pulling from production often enough. Linus talked about this in one of his Git talks. Responsibility for merging should lie with the team working on the new feature in the side branch, not some other team that knows nothing about it.


Agree (the intern was hypothetical, but agree that long-lived branches are usually symptoms of human rather than tooling problems). However these things happen, and if you are in a hole, it's nice if your tools let you dig yourself out :)


By the way, the technique of rebasing once a week to keep merge/rebase conflicts manageable can of course also be applied virtually: suppose you have a year-old branch; instead of doing one mega-merge, you can do 52 small merges.

If the merge conflicts really grow faster than linearly, divide-and-conquer will save you time.


I assume you mean divide-and-conquer the feature branch history into segments, and merge a segment at a time into master? That can work well.

Unfortunately, often the reason you have a long-lived branch is because it's some stop-the-world change (e.g. "refactor data model") which is pretty much all-or-nothing.

Of course it's always possible with enough discipline to turn a stop-the-world change into a series of incremental steps, and usually it's well worth that effort (for risk management etc) - but sometimes that's not feasible.


By divide-and-conquer I meant: when you are finished with your long-lived branch, don't merge it into (or rebase it onto) your current master branch outright.

Instead, keep rebasing it forward through last year's master history, one week at a time. (Of course this doesn't happen in real time, since you have to catch up.)

So a stop-the-world change is still doable with this model; your commit history will just pretend that you finished your branch within a week.
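A rough sketch of that catch-up loop in a throwaway repository. Everything here is illustrative: the branch name is made up, and each master commit stands in for one week of work. On a real history you would pick one checkpoint per week instead, for example with something like `git rev-list -n 1 --before=<date> master`.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo base > app.txt
git add app.txt
git commit -q -m "base"

# The long-lived branch (hypothetical name): two commits made right after the branch point.
git checkout -q -b refactor-data-model
echo step1 > model.txt
git add model.txt
git commit -q -m "refactor: step 1"
echo step2 >> model.txt
git commit -q -am "refactor: step 2"

# Master moves on; each commit stands in for one week of work.
git checkout -q master
for week in 1 2 3; do
    echo "week $week" > "week$week.txt"
    git add "week$week.txt"
    git commit -q -m "master: week $week"
done

# Catch up one checkpoint at a time, oldest first, instead of in one big jump.
# The checkpoint list is computed once, before the first rebase.
# A conflict would stop the script here; resolve it, run
# git rebase --continue, then carry on with the remaining checkpoints.
git checkout -q refactor-data-model
steps=0
for checkpoint in $(git rev-list --reverse refactor-data-model..master); do
    git rebase -q "$checkpoint"
    steps=$((steps + 1))
done

ahead=$(git rev-list --count master..refactor-data-model)
behind=$(git rev-list --count refactor-data-model..master)
echo "rebases: $steps, ahead of master: $ahead, behind master: $behind"
```

Each iteration only has to absorb one checkpoint's worth of upstream change, which is the whole point of the divide-and-conquer approach.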


Ah: yes, very good point. That's another option rebase gives you that you pretty much can't do with merges. (Throwaway merges plus rerere will give you some of the benefit, but it's not as good as incrementally bringing your branch up to date.)


Why? Shouldn't the process be the same with merges?


Using rebase this way is not harmful, but the author didn't explain the situation when it is!

As usual, the Pro Git book does a great job explaining: http://progit.org/book/ch3-6.html

In short, if your team develops with an old-school, SVN-type model, where there is one reference repo (the remote) that all developers pull from and push to, rebase to your heart's content. But if your team members pull from each other, you really want to be careful with rebasing. I believe this is how the Linux kernel is developed, which explains all of the warnings about git rebasing. After all, it's logical to assume that you should do exactly as the Linux kernel developers, for whom git was created, do!


It should also be mentioned that if you are going to be doing some rebasing, you should look into enabling git rerere. What isn't mentioned in his article is that if you rebase the same branch multiple times, you will have to resolve the same conflicts over and over unless rerere is enabled.

    git config --global rerere.enabled 1


You could simply merge that into your feature branch, but that would result in one of those nasty merge commits...It’s as if you didn’t start working in the feature/login branch before the commits you pulled in were made. Nice, huh?

I'm genuinely not sure how I feel about git rebase. Two questions. First, what is so nasty about such a merge commit? People complain about these merges a lot, but is the objection just aesthetic? Why is such a merge bad? Second, from a historical point of view, you did "start working in the feature/login branch before the commits you pulled in were made." So why would you want the history to look otherwise?

I completely understand the use of rebase or --amend to fix a typo or a forgotten file commit, but in a case like this, I'm not sure what to think. I'm inclined to prefer the history to show what actually happened, rather than a tidied up version of the past.


A merge is totally fine if you really are merging something. But with feature branches you often have the situation that you work on a very isolated thing. Simultaneously, someone works on the branch from which you branched your feature off. You want to get the current state for some reason. By using a merge you would add complexity that is not necessary at this point, because it would just originate in the fact that you branched off some time earlier. It would not really be a merge in "content". I find rebasing on this occasion perfectly suitable, since it removes this layer of complexity that no one needs at this point. You just pretend to have branched off later, which is all right. When you are finished with your feature, you use a regular merge to get back into your dev branch, since now you really ARE merging something.


By using a merge you would add complexity that is not necessary at this point, because it would just originate in the fact that you branched off some time earlier. It would not really be a merge in "content". I find rebasing on this occasion perfectly suitable, since it removes this layer of complexity that no one needs at this point.

What exactly is the complexity you have in mind here? (I'm not trying to be contentious: I honestly don't see it.)


When you merge the dev branch into the feature branch, you create a new commit with two ancestors. While git handles this very well and allows for perfect inspection, it nonetheless makes the history of your feature branch harder to follow, since you now have two sources of changes. I just find it easier to make the history linear in that case, because it is easier to analyse in retrospect.


I actually prefer having two ancestors, because you can analyse different kinds of change in isolation. If your branch was to add a new widget to the UI, and there's a regression I'm trying to find in the backend data munging code, it's better if I can trivially ignore all the changes from the new-ui-widget branch, because I know that's very unlikely to have broken the data munging. That's harder to do if you've linearised the history.


Not really, because every changeset from the new-ui-widget branch shows up after the point at which the rebase happened, so there is a clear cutoff in the git log between branch changes and master/upstream changes after you've rebased.


You will thank me later:

    git log --graph


what is so nasty about such a merge commit?

From the article:

Merge can be used when you want to merge a feature branch back into your development branch. That way, you’ll be able to see when you merged in what in the future because you have that merge commit I called “nasty” before. It isn’t, really.

I've come to prefer merges for this case. It reflects reality - you really were developing on a separate history for a while, you're not interweaving bugfixes from master with feature work, etc. And it visually (and logically) groups the commits for the feature, so it's very easy to review or revert the feature as a whole, instead of having to figure out exactly which commits in the big linear history are relevant (or having to religiously tag every time you do anything interesting). The current git visualisation tools (gitk, gitg, gitx) aren't very good at displaying histories with interesting merge structures, but I confidently predict someone will release an awesome dotviz-based history visualiser soon which makes it easy to navigate forests of branches and merges.

That said, as the article describes, rebase does handle conflicts (syntactic and semantic) better than merge does. Because it rebases a commit at a time, you have to think about how the upstream change affects each change that you made in the feature branch, which means you're much less likely to resolve the conflict in a way which breaks some subtle assumption you baked into the code two weeks ago.

I've sometimes taken the approach of first rebasing my feature branch onto the latest master to resolve any conflicts, and then doing a 'merge --no-ff' to explicitly create a merge commit to get the visual history marker.
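For what it's worth, here is a small sketch of that rebase-then-`--no-ff` sequence in a throwaway repository. The file names and commit messages are invented for the example; `feature/login` is the branch name from the article.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo base > app.txt
git add app.txt
git commit -q -m "base"

# Feature work: two commits on the feature branch.
git checkout -q -b feature/login
echo form > login.txt
git add login.txt
git commit -q -m "login: add form"
echo validation >> login.txt
git commit -q -am "login: add validation"

# Master moves on in the meantime.
git checkout -q master
echo fix > bugfix.txt
git add bugfix.txt
git commit -q -m "bugfix on master"

# Step 1: rebase the feature onto the latest master, resolving conflicts
# one commit at a time if there are any.
git checkout -q feature/login
git rebase -q master

# Step 2: merge with --no-ff so a merge commit is created even though a
# fast-forward would have been possible.
git checkout -q master
git merge -q --no-ff -m "Merge branch 'feature/login'" feature/login

merges=$(git rev-list --merges --count master)
mainline=$(git rev-list --first-parent --count master)
total=$(git rev-list --count master)
echo "merge commits: $merges, first-parent commits: $mainline, all commits: $total"
```

The first-parent count is the "grouping" benefit in numbers: following only first parents (as `git log --first-parent master` does) skips the individual feature commits and shows the feature as a single merge, which is also the commit you would revert to back the whole feature out.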


One way to get really used to rebase is to use git-svn, which pretty much requires it.


I heart rebase, but I haven't figured out how to keep a remote branch in sync. I rebase, and when I try to push to the remote branch it fails; I sometimes end up just creating a new local & remote branch :/


    git push --force remote local-branch-name:remote-branch-name

(or if local-branch-name and remote-branch-name are the same:)

    git push --force remote branch-name

(N.B. if there's a chance that anyone else is working on the same branch, then think twice before doing --force, because you may spoil their afternoon. Read 'git help push' for more info.)
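A note for anyone reading this later: newer versions of Git (1.8.5 and up, so after this thread was written) also have `git push --force-with-lease`, which refuses to overwrite the remote branch if it has moved since you last fetched or pushed it. A small sketch using a local bare repository as a stand-in remote; the paths and the `topic` branch name are made up:

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
git init -q --bare "$work/remote.git"    # stand-in for the shared remote
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git remote add origin "$work/remote.git"

echo base > app.txt
git add app.txt
git commit -q -m "base"
git push -q origin master

# Publish a topic branch.
git checkout -q -b topic
echo topic > topic.txt
git add topic.txt
git commit -q -m "topic work"
git push -q origin topic

# Master moves on, and the topic branch is rebased onto it.
git checkout -q master
echo more > more.txt
git add more.txt
git commit -q -m "more work on master"
git checkout -q topic
git rebase -q master

# A plain push is refused: the rebased branch is not a fast-forward of origin/topic.
if git push -q origin topic 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi

# Overwrite the remote branch, but only if it still points where we last saw it.
git push -q --force-with-lease origin topic

remote_tip=$(git ls-remote origin refs/heads/topic | cut -f1)
local_tip=$(git rev-parse topic)
echo "plain push: $plain_push"
```

The same caveat about other people working on the branch still applies; the lease only protects against overwriting commits you have not seen yet.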


This relates to a question I asked on Stackoverflow a while back: http://stackoverflow.com/questions/457927/git-workflow-and-r...

Reading the O'Reilly git book (the one with the bat) was immensely useful for me to understand rebasing (and merging and cherry-picking). It makes it clear when and where it's harmful, when it might cause problems, and when it's perfectly safe.

It's taken me several years, but I've come to appreciate rebasing, and I recommend it to others now.


Good, if short, article. Commendations on having a healthy relationship with merge, which is highly useful. Recently it seems people have gone way overboard with rebasing.


This is one of the better explanations of `git rebase` I've seen. Hardcore git users seem to have a hard time relating git concepts to non-git users. Nice write-up, Jeff.



