Rebase has become so integral to my workflow, it's hard to imagine living without it. I intentionally avoid merge commits, including configuring git pull to rebase rather than merge. I find it so much easier to commit a bunch of tiny iterations while doing local testing, rebase and squash them, then post them for review. As a bonus, I frequently push after each of those small commits to a feature branch so I don't have to worry about losing work if my drive dies.
Personally, I actually stopped using rebase or advocating its use (because noobies mess up their history all the time with it). I just commit whatever, then when the code is merged to trunk/dev/master/whatever I use `merge --squash`.
Reviews very rarely happen at the commit level, so having lots of small "meaningless" commits (lint fixes, fix spec, add thing) hasn't been an issue, but having squashed merge commits (effectively a rebase that squashes everything) has made release management so much easier for our team.
It also has the benefit of just less cognitive overhead of micro-managing my feature branch commits that are WIP or currently in-review.
This is my workflow as well, it's great. Github is configured so merging a PR into master will automatically produce a squashed commit.
My rule of thumb: always rebase unless you hit a series of merge conflicts while rebasing. Then either merge, or squash your branch and re-run the rebase.
We require linear history on our shared mainline branches so the squash workflow works great for us using Github as well. Another benefit is the "show changes since last review" option when reviewing a PR - it's been a while but I'm not sure if this is possible with a rebase-based workflow.
> It's a great way to turn the commit history of a project into one of logical changes, rather than a simple temporal log of every change.
This seems a bit like an unhealthy obsession. I would be curious to know if the benefits of having a perfect git history outweigh the costs. A temporal log of changes is good enough for me.
It's not an obsession, and the commit history isn't perfect by any means.
But don't underestimate the value of a good commit history, especially if you are using git blame to understand a change from several years ago. Wading through small typo fixes gets in the way.
Git encourages you to use branches, and rebasing allows you to squash simple typo fixes and minor changes, so you can review a change with a few commits that are easier to understand instead of a dozen.
Used with cherry picking, you can also move commits to other branches. I often find that bigger feature changes turn into multiple branches and PRs, which are easier to review.
The cost isn't very much. Maybe a few minutes at most reorganizing a branch.
It also forces me to review the branch before submitting a PR. Sometimes I catch issues or use that as an opportunity to split into multiple PRs.
This all makes sense and I don't doubt that it's worth it for you, I guess I just personally don't lean on commit history that much.
I use git blame to find who touched the code, but not necessarily why. And during reviews I spend most of the time looking at the totality of the code changes (along with tests and PR notes) rather than deducing what's going on by looking at one commit at a time.
Overall I see the history as a log, which is only an artifact and not a product on its own.
Specifically, more atomic commits make rollbacks easier when issues with a change are discovered after merge (e.g., in integration and user acceptance testing). In the very active code bases I've worked on, generating a rollback commit for a single commit is far more straightforward than having to deal with a merge commit that potentially includes other merges (e.g., with `git pull`'s default behavior). In fact, GitHub's pull request UI has it as a single button.
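For illustration (the hashes here are made up): reverting a single, self-contained commit is a one-liner, while reverting a merge commit means choosing which parent to revert against:

  git revert abc1234          # undo one atomic commit
  git revert -m 1 def5678     # undo a merge commit relative to its first parent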
Wouldn't these types of changes more commonly be caught while testing against a branch?
Also in many cases I would think that the fix ends up being even smaller (or bigger) than a single commit (the commit isn't actually atomic).
Maybe the codebases I've worked on weren't active enough for this to be a common problem. But also, having too many cooks in the kitchen may be a part of the problem here too.
Depending on the workflow and collaboration in branches in teams, amending may not be ideal. Amending changes to an existing commit will generate a new sha checksum, thus resulting in a diverging history if the branch is often pushed to the remote Git server. That's one of the rare cases where "git push --force" makes sense but also requires awareness for other collaborators on the branch - they need to run "git fetch && git reset --hard origin/branchname" every time, and best "git stash" to temporarily remove any local changes, with running "git stash pop" afterwards.
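Roughly, the dance for collaborators after each force-push looks like this (branch name is just a placeholder):

  git stash                            # park any local changes first
  git fetch
  git reset --hard origin/branchname   # adopt the rewritten history
  git stash pop                        # bring the local changes back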
One reason to regularly push commits can be to trigger CI/CD, security scanning and dev/staging deployments or review apps.
I've completely replaced my use of --force with --force-with-lease. It does the same thing, except it will abort if the upstream branch has been updated since your last pull. This pretty much prevents the concerns about clobbering others' work.
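For anyone who hasn't used it, it's literally just a different flag (branch name and sha are placeholders):

  git push --force-with-lease origin my-feature
  # optionally pin the exact sha you expect the remote branch to be at:
  git push --force-with-lease=my-feature:1a2b3c4 origin my-feature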
After reading this comment about "git push --force-with-lease" in https://stackoverflow.com/a/53011907/1821348 it seems my workflow of running "git fetch" first would undermine "git push --force-with-lease" - it would behave as if "git push --force" was run. Theoretically - I need to do some testing here.
Thanks for the pointer into learning something new :)
For sure, it doesn't work with every workflow. But I think it would for the OP's described workflow.
And I am currently waiting for my GitLab backpack for contributing something (very small, but apparently it counted!) so thank you to the team for that!
You have to force push, but you have to do that when pushing a rebased branch as well. In either case it’s a history rewrite. It’s not a big deal unless other people have pulled that branch—if they have they will need to reconcile things on their end as well.
> Rebase has become so integral to my workflow, it's hard to imagine living without it
I'm the same. I commit frequently. A lot of those end up being `--amend`, but a lot of time I will rebase and squash/reorder/etc. In fact I'd say the majority of time I push changes I've rewritten my local history at least once.
According to any git history you'd find from me on the server, you'd think I never accidentally commit syntax errors, break unit tests, and that when I do big structure changes or refactors I get it the way I want it on my first attempt.
> including configuring git pull to rebase rather than merge
It annoys me a bit this isn't the default -- at least for teams that work from a single centralized git repo (eg: almost all corporate dev, and the majority of open source). I've done it for years and have never had any issue with it.
It doesn't rewrite anything but your local (non-pushed) history, which is totally fine. (Unless you are working with multiple remotes, in which case it causes chaos, which I think is why it isn't the default).
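For reference, opting in is a single config setting:

  git config --global pull.rebase true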
If I've already reviewed a PR and the author makes further changes, I definitely prefer to review an add-on commit. If the history is rewritten/rebased, then IME the entire PR needs to be re-reviewed from scratch. If we're talking about a <10 line change, then, by all means, rebase to your heart's content. With anything more complicated than that, rebasing a branch that's already been looked at can be disruptive and I'd strongly recommend against it (though squash-and-merge after review is fantastic).
I didn't actually read the linked article but I see it is from GitLab. GitLab makes it easy to view the diff between versions of an MR even if it includes rebases.
> then IME the entire PR needs to be re-reviewed from scratch
Why? What's the difference? You can still diff the previous version of the PR with the current version and end up with the same thing that an add-on commit would give you, but ready to merge as-is.
Does your project have a policy of compiling and running (and e.g. tests passing) at EVERY commit?
I can't imagine being able to easily enforce that without asking people to edit the correct part of their commit. It's maybe more difficult with gitlab/github interfaces where changing the middle of a sequence of commits will not render very well, but in email based workflows it works fine.
On the other hand, being able to bisect a project without having to worry about whether an unrelated issue is causing you to traverse the wrong branch of the bisect is an enormous advantage compared to the minimal effort required of keeping track of a modified (rebased) commit in the middle of a set of commits under review.
I don't see how this solves the problem of patches which fix up previous patches. This workflow doesn't require using merges, but it will introduce situations (irrespective of whether you use --first-parent or not) where patches fix previous patches potentially leaving a gap where code doesn't compile, doesn't run, or doesn't pass tests.
Changes should be added with additional commits. When the review is complete, the code should be rebased and merged (git merge --ff-only my-reviewed-branch). This leads to a clean git commit history and an easy review process.
In your view if a PR is not rebased but trunk is merged into it, does that warrant a full re-review? The end result is functionally the same as a rebase.
A lot of people have hate for rebase, but I've always LOVED it. I've always had way less issues with it than merge. Whenever I do a feature I work in a feature branch then just rebase it with the target branch (main) before submitting the PR and it is a really painless workflow I've used for years.
I really try to like git rebase, especially git rebase -i and at one time I even used a special tool to help with that. What I find is that I see a lot more merge conflicts than in Gitlab MRs, but it could also be that the tooling on my machine is suboptimal: switching between VSCode as the editor and using vim in a separate terminal for git. In vim I don’t see the current commit where the conflicts appear.
No rebase tutorial is complete without the --onto option, which lets you essentially transplant a series of commits to a completely different branch.
Very useful when you've created a branch(A) based on another branch(B), which in turn was based on master, but in the meantime master had a few commits added, so while it's trivial to rebase B with master, rebasing A with an updated B won't work.
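A rough sketch with placeholder names, assuming you noted (or can dig out of the reflog) where B pointed before it was rebased:

  # replay only A's own commits (old-B..A) on top of the freshly rebased B
  git rebase --onto B old-B A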
The latest git release has an option to handle this multiple branch scenario for you: `git rebase --update-refs` I don't know why I can't find any announcement of this other than a GitHub blog, but here it is: https://github.blog/2022-10-03-highlights-from-git-2-38/
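With that option the stacked case becomes something like:

  git checkout A
  git rebase --update-refs master   # rebases A and force-updates B's ref along the way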
I sometimes run into rebases where solving merge conflicts 3 times is hard. Replaying 3 commits for instance, where merge conflicts are present in each one. In the first I need to fix the merge conflict but remind myself that there are more commits following this that change this behavior.
With a merge commit I am fixing the resulting work on both branches, which is easier than merging the in progress state in the current branch.
In these scenarios, I think it's better to use merge commits.
Anytime resolving conflicts becomes pretty intense, where you're feeling like there's a significant risk of introducing a bug in the resolution process, I think it's better to use merge commits because then it's preserved in the merge commit how the merge conflict was resolved. If things were done incorrectly, you can trace it back to the merge commit and analyze it and you don't potentially lose important intermediate state that you might need to recover.
On the other hand, If you're touching the same code in 3 separate commits, then perhaps they shouldn't be separate commits. In these instances, I often do 2 rebases: 1 to squash the commits without reparenting them, and then another to reparent the squashed commit onto the upstream branch.
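Something like this, with placeholder counts and branch names:

  git rebase -i HEAD~3      # step 1: squash the commits in place, parent untouched
  git rebase origin/main    # step 2: reparent the squashed result onto upstream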
Also, if you use merge commits for merging PRs, it's easy to cut through the noise of the commit graph by using git log's --first-parent option. Once you are using this, it's a lot less important to keep an ultra-tidy commit graph.
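In case it's useful, the two pieces of that setup look roughly like:

  git merge --no-ff feature-branch     # always record a merge commit for the PR
  git log --oneline --first-parent     # browse history at the PR level only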
Every PR we have in GitHub is merged with a squash, so I'm kinda missing the value proposition here. Is it really crucial for each commit to be a nice clean unit of work?
I think it really depends on who's doing the commits; I've mostly worked with people who won't commit things that don't work, so you see a lot fewer commits than you would expect. I prefer commits to be iterative and I prefer to see that if possible. I'd rather see 100 commits across dozens of files than 1 commit touching 500 files.
If I see a ton of commits like "fixing formatting" or "oops", I want those squashed away into proper commits.
If the changes you make are atomic in their own right - I can check out that commit, compile it, run the tests, and they pass - that's perfect. It works well for git-bisect, but the trick is to get everyone in the project on board to do that.
For public libraries I maintain, it's squash merges all the way. I like a clean history and the ability to check out that commit, compile, test, and run cleanly is perfect.
> If I see a ton of commits like "fixing formatting" or "oops", I want those squashed away into proper commits.
Every team I'm in I strongly advise against such commits honestly. They don't help anyone, I emphasize being able to find things you changed if you need to find them. You can even enforce a format for commits.
Funnily enough, I saw a wave of "fixes bug" "now really fixes bug" commits that got auto-rejected by some commitcop utility at a former job; the guy was freaking out because it wouldn't take all his changes. I guess they wanted to force him to stop writing such awful commit messages.
Commit messages should be useful and historically descriptive.
> Every team I'm in I strongly advise against such commits honestly. They don't help anyone, I emphasize being able to find things you changed if you need to find them. You can even enforce a format for commits.
This is a solid argument. Except in cases where you're upgrading everything in a small to mid sized codebase that needs a major dependency updated that affects everything from syntax to other things, and cannot be done discretely.
For a Python 2 to 3 migration a few years back we did it in a branch with many commits. We didn't obsess about telling a story we just got the work done and the tests passing.
Before this there was some work to bring the codebase as close to 3.x paradigms as possible.
As long as the two points before and after the work are copacetic we were satisfied. In that sense it was discrete.
Why can't we just have some kind of "virtual squash" where the commits are preserved, but they're grouped together so you could view it as a single squashed commit if that's preferable?
Never understood the appeal of squash commits at merge time, assuming the PR contains atomic, logical commits (all bets are off if your team's PR process accepts ad hoc commits..). You lose the utility of git bisect, conventional commits, etc, and also have larger, noisier commits forming your history/documentation. Is there a benefit to squash commits other than allowing developers to forget about that as they work? I may be biased against squash commits as I have spent enough time diving through garbage commit history to figure out bugs/Chesterton's fence that good commits as documentation appeals to me.
> assuming the PR contains atomic, logical commits
This is impossible to enforce or guarantee at scale. Squashing PRs, though, is practically fool-proof: PRs already represent a single, atomic unit of work that passes all CI checks and is safe to merge. No such thing is true (or should be!) of individual commits within that PR. Whether we like it or not, a branch commit really only represents a "save point" for a developer.
I'm not sure 100% of the commits compile & pass all tests - there may be some mistakes - but generally we're in a pretty good state, and the clean git log is being successfully used for bisecting.
If you want even larger scale - if I understand correctly, the Linux kernel practices a similar thing, which is where we got this practice from (ScyllaDB founders came from kernel development). And since Git was originally created to help developing Linux - that's where you want to look for good practices.
I also found the comment you replied to a little unconvincing. The remark concerning scale in particular did not hit home, as I would guess the vast majority of teams are <10 devs which I would hardly call 'scale'. I left my previous role for several reasons but one was the constant "Microsoft does x".. Microsoft has 100k devs, we had 5. Not the same.
The point is that you can try, but IMO it's wasted effort. Commits are immutable and really hard to manipulate retroactively, and humans are guaranteed to make mistakes. Why put so much effort into trying to make your commits atomic, when it's unlikely that they ever truly are?
Sure, you can git bisect to the exact commit that introduced a bug, but that commit was part of a larger PR, and you probably can't revert just that commit alone. So what was gained?
You can get the same guarantees without squashes with {log, bisect, blame} --first-parent and merge --no-ff (force all PRs to make merge commits). You can preserve more of the graph and use it to revisit "inside" of PRs when necessary.
So much this. Everyone needs to learn about --first-parent, it makes git log, git bisect, etc... so much more powerful and prevents people from just squashing PRs into giant mega-commits that make the history so much less useful.
I keep joking that all we need is one good Git GUI to go viral that defaults to --first-parent in every view and we might eventually convince more people they don't need to squash/rebase as much. One of these days I may even take the joke far enough along to prototype something.
What's the point? It's unlikely you'll be able to revert it in isolation. You can do the same thing with squashed PR's, except you also get a description of the overall work, all discussions related to it, and a higher likelihood that you can revert it in one piece.
First parent applies to revert, too (with the -m flag):
git revert MERGECOMMITHASH -m 1 # revert to first parent of merge (revert changes of second parent/other branch)
You can include good descriptions in your PR merge commits including discussion comments and everything. Some PR tools automate that, some do not. I don't know any technical reason why any PR tool automation would create better squash merge commit messages than normal merge commit messages.
(ETA: Also you may want to reconsider your workflow if you rely on reverts that often. I had to look up the command argument, even though I knew it existed, because I haven't needed to do it in a while and I'm very thankful for that.)
Easy reverting is certainly not a bad thing, but IMHO (and not to suggest your comment implied anything either way) it should be a minority case, and optimising for a minority case doesn't seem generally sensible. Maybe reverting has a use outside of 'uh oh' that I'm not conscious of (release management?), but if 'uh oh' reverting is not a minority case, it sounds like there may be bigger problems at play. Do you come across many situations to use reverts day to day or week to week? I've probably only made a few in the last few years.
That is 100% correct and I agree. Actually, I always try to make a PR as a series of small, self-contained commits. I just wanted to point out the, IME, strong correlation between people that leave a bunch of "small fix" commits and them not using squash/rebase but "normal" merge.
I've been taking a different tactic. I have been using git-patch-stack https://git-ps.sh/, and making all my PRs be 1 commit, and trying to keep my PRs smaller using feature flags and things of that nature that allow me to deliver smaller incremental units of work. This tool is amazing, i highly recommend it. It's wonderful for being able to hack away on a huge change and still deliver tiny incremental PRs from it as you go along.
GitLab team member here, putting my personal hat on - from my experience using different Git workflows since 2009, a smaller clean unit of work can help with debugging and troubleshooting. It also gives new team members and contributors a way to understand the thought process and ideation behind implementing a new architecture, applying performance fixes, adding documentation, working with tests, additional fixes, until the final release. Most of this can be tracked within an MR/PR and the history of code reviews, etc. - even after the merge and squash and Git branch delete; not trying to argue with this functionality. :)
From the Git CLI, without any reference to Git* platforms, it is not so obvious when searching for a commit that introduced a bug, e.g. using "git bisect" for binary search. Reading a 10,000-line git diff can be harder than reading a smaller commit that also explains the reasoning in the commit message. Speaking from my own experience and programming mistakes in a small team, focussing on clean commits and a good history tremendously helped in stressful debug situations. Until you hit a compiler regression bug, but that's a different story then ;)
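For reference, the bisect flow is roughly this (the known-good ref is a placeholder):

  git bisect start
  git bisect bad                # current HEAD exhibits the bug
  git bisect good v1.2.0        # a release known to be fine
  # build/test each checkout Git proposes, marking it good or bad,
  # until Git prints the first bad commit
  git bisect reset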
I'm personally still very fast on the Git CLI, but I also know that there are a variety of CLI and UI tools out there that can help with analysing large Git commits. Potentially, in the future, also AI-assisted tools that tell us which change in a diff caused a performance regression in a release 5 months later. Or we don't need that at all because Observability-driven development lets us see these problems before merging and code reviews, e.g. the memory leak that only happens when DNS fails. True story from ~2016, more in my KubeCon EU talk at https://www.youtube.com/watch?v=BkREMg8adaI and project at https://gitlab.com/everyonecancontribute/observability/cpp-d...
True, thanks. Some workflows can require larger merge requests; having platforms and tools that enable smaller iterations helps reduce (or eliminate) them, though.
As always, it depends. Especially for large PRs, I will go through the effort of rebasing to help the code reviewer so they can view key commits rather than a mile long scroll-fest on the GH "Files Changed" tab. It's about being a good co-worker and facilitating faster reviews.
In my opinion, yes, but like everything else it's not a hard rule. But especially for refactoring I really really want to read a comprehensive story of granular enough (but not too much) commits which logically follow each other, instead of a code dump with commit message 'bugfix'. That, plus just looking at the default graphical representation most git tools out there produce for merges vs simple rebased history: yes I'll take the latter.
I sometimes prefer a rebase to a merge when pulling in changes from some other branch. It can be easier to deal with several smaller merges than one big one which a rebase accomplishes. This is not always true of course, so I usually start with a regular merge and if it's sufficiently complex or hard to untangle then I give a rebase a try to see if things get easier.
I would recommend not doing anything complicated such as git rebase and just add more commits, patches (git diff / apply), or merges until your code works. If the number of commits is large, it doesn’t really matter. Optimizing for a pretty looking git history is probably the most foolish thing to focus on.
If you are doing anything that involves rewriting the history you are doing it wrong.
> Optimizing for a pretty looking git history is probably the most foolish thing to focus on
It's not about "pretty"; commits are a form of _communication_. Do we send emails without editing before hitting send? It's a means to optimize for easier reviews through better comprehension of the changes, which also leads to faster reviews. Our colleagues don't want to read a bunch of intermediate commits.
> If you are doing anything that involves rewriting the history you are doing it wrong.
> It's not about "pretty"; commits are a form of _communication_. Do we send emails without editing before hitting send?
Writing clear, atomic commits is a good idea regardless of whether you use rebase or not.
In your email analogy, rebasing would be altering previous messages in the chain. That doesn't make rebasing look good!
> > If you are doing anything that involves rewriting the history you are doing it wrong.
> Care to elaborate? What's your general strategy?
Not the parent, but to me the most important part of VCS history is accuracy. Rebasing commits (or cherry-picking them) changes their context; that context can be important for understanding why things were done in a certain way. For example, imagine we're digging through the following history, to understand how some feature 'bar' works:
* Add workaround for Error(foo) in feature bar
|
* Implement feature bar
|
* Bump dependency baz to eliminate Error(foo)
Why was a workaround for Error(foo) added, if that error had already been eliminated? Did that dependency change not work? Is there some more permanent way to eliminate Error(foo)? Is the workaround still needed?
Compare that to the following, more accurate history:
* Merge
|\
| * Add workaround for Error(foo) in feature bar
| |
| * Implement feature bar
* | Bump dependency baz to eliminate Error(foo)
| /
|/
Here it's much clearer what's going on: the dependency change was not in place when that workaround was added. Hence the workaround shouldn't be needed anymore. Rebasing the 'feature bar' changes on to the 'dependency baz' changes throws away that information.
git bisect --first-parent bisects only the straight line of first parents of merge commits. If you --no-ff PRs and always have merge commits, it is essentially a bisect at the PR level without any need to squash the graph.
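i.e., roughly:

  git bisect start --first-parent
  git bisect bad HEAD
  git bisect good some-older-merge    # placeholder for a known-good merge commit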
Just a nit on the HN comment, you can indent the lines 2 spaces so that it shows up as a code block, it should maintain the correct spacing/line wrapping and it uses a monospaced font as well
When I pull and it runs into a merge conflict, I want to see that error/log on the CLI. Reason: Sometimes the automated pull-rebase takes a very long time to resolve conflicts after each step. I prefer to first run
git fetch
git diff branchname origin/branchname
and then decide my strategy :)
Bit off-topic but since we share .gitconfig tips - I upgraded to a recent Git version and enabled the "git push" option to setup tracking automatically. No more "git push -u origin branchname" actions. https://gitlab.com/dnsmichi/dotfiles/-/blob/main/.gitconfig#...
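For anyone curious, the setting is (Git 2.37+, if I remember correctly):

  git config --global push.autoSetupRemote true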
Rebase is something that I see a lot of developers shy away from but it is usually the right way to get a feature branch re-aligned.
My practice is to rebase all my pending PRs each morning to make sure that the prior day's activity is coalesced.
If you wait weeks and weeks to do the first rebase for a big change set, you can wind up visiting the same files and conflicts way more times than is logical. In my experience, this results in a much greater chance of screwing something up along the way, further reinforcing for some developers that the rebase is bad.
Git is in general a terrible tool: I’ve always thought of it as a shining example of where “worse is better” ought to have been applied. Rebasing is one of its worst features. For non-trivial changes, anyway, it is often just a way to add complexity and messiness to what should be a simple workflow. And the benefits aren’t at all as clear as its advocates contend.
Yes, and these folks are never around when you try rebase and get a forty file conflict and have no idea how to get out. Happened to me several times when I fell for one of these articles.
You know what works almost every single time? With tiny conflicts if any? Merge + squash on gitlab.
If a rebase goes wrong and you want out you can run,
git rebase --abort
to return to where you were beforehand.
I find that git is actually very good at explaining what is going on and what you should do next by just running git status in the middle of some process like rebasing and reading what it says.
Even if it all goes really wrong the git reflog has always saved me.
2) Before you rebase take a note of your current branch's HEAD commit.
3) If things go horribly wrong, you can force your branch to point at the original commit and it will be as if nothing happened. Well as long as you do it immediately. At some point git gc will collect up unreachable commits and delete them.
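A minimal sketch of that safety net (the sha is a placeholder you note yourself):

  git rev-parse HEAD             # note this sha before starting the rebase
  # ... rebase goes horribly wrong, but completes anyway ...
  git reset --hard <noted-sha>   # back to exactly where you started
  git reflog                     # forgot to note it? the old tip is in here too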
On Mac, I prefer Gitup, a free and open source GUI which makes rebases (and a bunch of other git operations) much easier: https://github.com/git-up/GitUp
I don't know if it's possible to write an article like this and not just be preaching to the choir; ignored by the flock that's already decided it doesn't like it.
Well my suggestion really was just that it's pointless to write about, because you won't convert anyone. (Not '..and you should be doing xyz because that will convert them and that's what we want'.)
But to answer your actual question, there are various ways in which what others on a team do affects you:
- CI building 'merge branch master' all the time on master, because of merge-pulling upstream changes after committing
- spaghetti merge feature branches (that you might be reviewing, working jointly on, or as below)
If you have local, un-pushed commits 'on' master, and there are upstream changes to master that you lack locally, there is absolutely nothing wrong with pull.rebase, i.e. rebasing your local changes on the upstream ones before you push.
Beats the hell out of:
* deadbeef (master) Merge branch 'master' of upstream
|\
| * feedbeef A proper commit message from upstream
* | deafbeef A proper commit message of the actual change
|/
which not only hides the real commit message in CI, but also flips around the history so master's snaking around between upstream's path and the person not pull.rebase-ing's path.
Rules I would have everyone follow if I was a dictator:
1. 1 PR, 1 commit.
2. 1 PR cannot have more than 50 lines of product code added. Any number of lines can be removed. You can have up to 100 lines of test code.
3. Every PR should include a set of tests for the added/changed functionality. They must pass.
4. Git merge is forbidden. Everyone must rebase.
5. Every PR goes into master. Nobody can push to master. You can create as many feature branches as you like, but the definition of done is that your code is available on master.
6. Identify relevant existing test cases and make sure they are passing.
7. Master must always be in a state that it can be deployed instantly.
> 1 PR cannot have more than 50 lines of product code added. Any number of lines can be removed. You can have up to 100 lines of test code.
Hard disagree.
Landing a single class/module to implement new functionality may take more than 50 lines of code and 100 lines of test, even though conceptually the additional functionality is simple and it amounts to mostly boilerplate and looks like every other class that implements the pattern.
Some large plumbing refactorings or CI refactorings are difficult to do in small chunks. I'm perfectly fine with a PR that replaces one internal API with another one and then has hundreds of small copypasta fixes littered around the codebase to swap out the APIs. Anyone should be able to read that without cognitive overload.
Tests are also commonly over 100 lines of code since they tend to be repetitive by nature, and I'd like to not see those artificially broken up into smaller PRs, since the most important question I've usually got is whether all the cases are covered or not. Breaking 300 lines of tests up into 3 different PRs to satisfy some line-count metric means answering the most important question I have now requires looking at all 3 simultaneously, which is deeply counterproductive and totally useless. And small code changes with lots of comprehensive tests which fill out cross products of different API usages are great and should be encouraged.
And while I don't personally like the dozens-of-tiny-atomic-commits approach, I deeply don't care if other people do it or not. I've never found it useful to read their PRs, or even after the fact, but I won't stand in the way of them doing that if it is what gives them enjoyment (OTOH, I never do that).
Maybe a good rule for projects in maintenance mode. Large scale refactors? That's impossible under these rules, unless you go full LISP on what a line does.
Mostly what I've done is maintenance mode - that's where, if you try to touch something, it is often very complicated in secondary effects and fixes, and requires lots of tests. And additions of functionality which are bolt-on will come with 10 years of accumulated requirements around the shape of any new code and the tests.
You can write any feature with this constraint, if you break it down into smaller units. In fact, it produces better quality code, more readable, fewer bugs.
If you find yourself unable to subdivide the task into small units, maybe your system isn't well architected. In that case this won't work for you.
Rebasing is merging multiple commits to one commit right? Basically you commit your small changes to your local master branch and at some point you merge them to one commit and push them to master repo?
At the end, the HEAD has the same content, but the structure of the graph is different.
Note that rebase falls in the set of "rewriting history" operations in git and so pay attention to the caveat in the explanation:
> For this reason, you never want to rebase commits that have already been shared with the team you are working with.
Rebase and reset are two of the commands that need to be done with caution and full awareness if working with commits that have been shared with other people.
You can use `git rebase` to perform the so-called "squash" action, that squashes multiple commits at once.
For example, the latest commits in the branch look like this (newest first):
789 WIP3
456 WIP2
123 WIP1
012 FeatureA
WIP1, WIP2 and WIP3 should be squashed into a single Git commit that sits on top of the FeatureA commit - so to speak, the FeatureA commit is used as the new base.
Git's rebase command does exactly that - you'll perform an interactive rebase onto the FeatureA commit. The interactive rebase opens a todo list (oldest commit first; the new base FeatureA itself does not appear in it) where you can define specific actions.
"pick" keeps the commit. To squash the WIP1,2,3 commits, you'll keep the first WIP1 commit and tell Git to squash the following commits WIP2 and WIP3 into it. All actions are applied in a single run.
pick 123 WIP1
squash 456 WIP2
squash 789 WIP3
This results in a new history and a new commit - squashing changes the commit sha checksum, since the squashed commit has new content.
567 Squashed
012 FeatureA
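Putting it together on the command line (the commit ids are the placeholders from the example above):

  git rebase -i 012        # interactive rebase onto FeatureA
  # the editor opens with the pick/squash todo list shown above;
  # save and close it, then edit the combined commit message
  git log --oneline        # now shows: 567 Squashed, 012 FeatureA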
"squash" in an interactive rebase keeps the individual commit messages for each commit, and at the end, the editor for commits allows you to edit the final messages. This can be helpful to review the squash action, and for example amend or abort it.
If you plan to not keep the individual commit messages, "fixup" throws them away and can be used as alternative action.
"git rebase -i" has more options - you can also stop at a specific commit, and amend it, e.g. when a "git add" was missing a file earlier.
Note: Any change to a commit within the Git history forces all later commits to change too, as their linked base commit changes, thus regenerating the sha checksums. This rewrites the Git history from that point on and can be very invasive - if you intend to keep specific commit IDs (for release tags, for example), ensure that your workflows do not allow rebasing on certain branches. One possible workflow is to keep the main default branch protected, disallowing rebases and pushes of a changed history, add Git tags there, and only rebase in a Merge Request branch prior to review/approve/merge cycles.
Moving from Git commands to GitLab - GitLab also offers a Merge Request option to squash commits automatically when the MR is accepted, so that all commits in the MR branch are squashed, and you don't need to do it manually. https://docs.gitlab.com/ee/user/project/merge_requests/squas...
I think it depends on what problems you want git to solve for you.
Some people like having the context from the complete history of little commits without the risks of rebase breaking something.
Some people like using rebase to group all changes needed for a feature into isolated(ish) commits.
Some people just like the aesthetics of a straight master branch without the clutter of little "corrected typo" or "fixed bug for real this time" commits.
I've been on teams where everyone _hates_ what a stickler I am about good VCS hygiene until they realize something that looks like it's going to be a big pain in the ass at first glance is doable with a one liner.
In my personal experience rebase made ISO9000/AS9000 gatekeepers twitchy. Even to the extent that they'd tell the Overlords "Do everything in Perforce, or else".
And then the Overlords shut down all the VCSs because "this isn't a software company". Then everyone sneaks around using weird homebrew portable tracking widgets, or, more often, just gives up.
The --exec <command> flag allows you to run any shell command after each rebased commit, stopping if the shell command fails (which is signaled by a non zero exit code).
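e.g., to check that every rewritten commit still passes the test suite (the command and base here are just examples):

  git rebase -i --exec "make test" origin/main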
Love rebasing, love merging, squash or no squash, it all depends on what I'm trying to communicate with my pushes.
Just discovered the --rebase-merges option, worth exploring if you want to edit a commit under a merge commit, but don't want to mess up the merge commits.
I love rebase. It allows for a Draft PR workflow where you can have your WIP out in the open for a big project, and then clean it all up via rebasing right before asking for reviews. Just don't rewrite history on master. :^)
You can also just `git rebase -i` and set `e` (edit) on the commit before the one where you want to insert a new commit. Instead of doing `git add .` and `git rebase --continue` like you normally would to squash your changes into the commit you've stopped to edit, you can just `git add .`, `git commit -mwhatever` and then `git rebase --continue`. Now you have new commit "whatever" in between the one you stopped to edit on and the next one.
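Roughly, for anyone following along (the commit range is a placeholder):

  git rebase -i HEAD~5        # mark the commit before the insertion point with 'e' (edit)
  # rebase stops right after applying that commit
  git add .
  git commit -m "whatever"    # this new commit lands between it and the next one
  git rebase --continue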
It's not clear to me exactly what you mean. Based on adding the text file, which sounds like you're creating a placeholder commit, you might not know about 'git commit --allow-empty' which may serve you better.
Not exactly sure of your order of events wrt inserting a commit and editing it (I.e. edit commit message? Or changes?) and modifying it (I.e. see previous) but am happy to try and help if you want to flesh out your example a little more.
To speculate, maybe you haven't heard of fixup/squash.
E.g.
Commit 1
Commit 2
Commit 3
Using an empty fixup commit, e.g.
git commit --allow-empty --fixup=commit2hash
You would end up with
Commit 1
Commit 2
Commit 3
fixup! Commit 2
Which will adjust automatically (when starting interactive rebase with autosquash) to
Commit 1
Commit 2
fixup! Commit 2
Commit 3
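That automatic reordering happens when the rebase is started with autosquash, e.g.:

  git rebase -i --autosquash commit1hash^   # base must be older than Commit 2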
You could then simply change the rebase 'fixup' instruction to 'reword' to edit/remove the 'fixup!' part of the message, or 'edit' to continue the rebase up until that commit and then stop.
That said, again, not clear on your use case but maybe that might help you or another reader.