Git rebase in depth

gwright · on May 10, 2019

About 99.9% of the time when people talk about rebase they talk about ‘editing’ history or ‘rewriting’ history as in the first sentence of the article.

I find that terminology terribly misleading and when I was learning git and rebase it confused the heck out of me.

No commits are harmed in the operation of `git rebase`. All the commits you had in the repo before the rebase are still in the repo. Git rebase creates a new sequence of commits and, after doing its work, relocates the branch name to the tip of the new sequence. You can still easily access the previous commits if need be (`co` here is a common alias for `checkout`):

    $ git co feature-branch
    $ git rebase develop
    $ git co -b before-rebase-feature-branch feature-branch@{1}

Chazprime · on May 10, 2019

This really should be the top comment here. Learning about the non-destructive nature of Git really helped me overcome the unease around using some of Git's more advanced features (especially in a team environment).

Buttons840 · on May 10, 2019

Yep. Specifically, git will never delete a commit, unless it is old (like 30 days old or older I think?) and is not part of a branch. I suppose there may be some arcane commands to force a deletion, but it won't happen by accident or by normal usage.

Like you, I felt much more comfortable using git after learning this.

dahart · on May 10, 2019

> git will never delete a commit, unless it is old (like 30 days old or older I think?) and is not part of a branch.

Yeah, completely unreachable commits have 30 days, even when you run git gc. The default reflog for a branch is even longer: 90 days!

https://git-scm.com/docs/git-gc
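
For anyone who wants to see or change these windows, here is a minimal sketch in a throwaway repository. As far as I know, the documented defaults are 90 days for `gc.reflogExpire`, 30 days for `gc.reflogExpireUnreachable`, and 2 weeks for `gc.pruneExpire`; the values set below are arbitrary examples, not recommendations.

```shell
set -e
# Scratch repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# If these print nothing, the key is unset and git uses its built-in default.
git config --get gc.reflogExpire || true
git config --get gc.reflogExpireUnreachable || true
git config --get gc.pruneExpire || true

# Example values only: keep reflog entries and loose objects around longer.
git config gc.reflogExpire "180 days"
git config gc.reflogExpireUnreachable "90 days"
git config gc.pruneExpire "4 weeks ago"

reflog_expire=$(git config --get gc.reflogExpire)
unreachable_expire=$(git config --get gc.reflogExpireUnreachable)
prune_expire=$(git config --get gc.pruneExpire)
echo "$reflog_expire / $unreachable_expire / $prune_expire"
```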

shandor · on May 10, 2019

This is also why committing very often is a very good idea.

I often try to reassure people new to git that "if you'll just commit often, there's basically no way you can lose work so that I can't help you get it back. Apart from deleting the whole repo folder. Don't delete the repo folder.".

I also have the habit of preventing non-fast-forward pushes on origin/master, which also helps when I can tell my team that they can't trash the origin even if they try.
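
One way to get that behaviour on a plain bare repository is the `receive.denyNonFastForwards` setting; hosted services usually have their own branch protection options instead. A minimal sketch, assuming a scratch directory and made-up names (`origin.git`, `clone`, `main`):

```shell
set -e
work=$(mktemp -d)
origin="$work/origin.git"
git init -q --bare "$origin"
# Refuse any push that is not a fast-forward, even with --force.
git -C "$origin" config receive.denyNonFastForwards true

git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/main   # name the first branch regardless of local defaults
git config user.name "Demo"
git config user.email "demo@example.com"
git remote add origin "$origin"

echo one > file.txt
git add file.txt
git commit -q -m "First"
echo two > file.txt
git commit -q -a -m "Second"
git push -q origin main
published=$(git -C "$origin" rev-parse refs/heads/main)

# Replace the last commit locally, then try to overwrite the published branch.
git commit -q --amend -m "Second, rewritten"
if git push -q --force origin main 2>/dev/null; then
    force_push=accepted
else
    force_push=rejected
fi
echo "force push was $force_push"
still_published=$(git -C "$origin" rev-parse refs/heads/main)
```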

kohtatsu · on May 10, 2019

Which merge strategy do you recommend?

    # Merge commit
    git merge --no-ff -m <message> <hash>
    # Squash and merge
    git merge --no-commit --squash <hash>
    git commit -m <message>
    # Rebase and merge
    git rebase --force-rebase <hash>

From https://stackoverflow.com/a/52301456

shandor · on May 10, 2019

I...have to admit that my merge strategy, and what I teach my teams, is "don't" :) (only slightly tongue-in-cheek)

I believe in clean, linear history, and strongly prefer rebase-based workflows to merges. That's actually one of the reasons I chose Phabricator for my current place, as it is also very opinionated towards the same way of working.

Edit: oh, and to answer your actual question, the third one.

greyskull · on May 10, 2019

+1. Interactive rebase to squash your feature branch, then ff merge into master/mainline.

Years ago when I was just reading about git instead of using it, I saw sentiments along the lines of "always use feature branches and merge them so your thoughts and process can still be looked at later". In the last ~5 years or so I've worked professionally, I've not once wished I could reference intermediate commits in my own code or someone else's. I've found that ambiguities and clarifications can and are caught during the code review process.

I'd say another couple of benefits:

- It's relatively easy to teach git workflows when the log/graph is linear, and similarly it's _way_ easier to reason about your workspace when you have an actual production codebase.

- Merge commits can make certain operations like reverts and patches harder to reason about

u801e · on May 10, 2019

> In the last ~5 years or so I've worked professionally, I've not once wished I could reference intermediate commits in my own code or someone else's. I've found that ambiguities and clarifications can and are caught during the code review process.

One command I often use is git blame which allows me to find the commit that's associated with a particular line of code. Then I can look at the commit message and the diff against its parent. Perhaps what I'm changing may undo a bugfix and I wouldn't have realized it without reading the associated commit message.
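
A small sketch of that lookup in a scratch repository (file name and commit messages are invented for the example): blame a single line, then read the message of the commit that last touched it.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'one\ntwo\n' > f.txt
git add f.txt
git commit -q -m "Add file"

printf 'one\nTWO\n' > f.txt
git commit -q -a -m "Fix bug in line two"
fix=$(git rev-parse HEAD)

printf 'one\nTWO\nthree\n' > f.txt
git commit -q -a -m "Add line three"

# --porcelain puts the full commit ID first on the header line for the blamed line.
blamed=$(git blame --porcelain -L 2,2 f.txt | head -n 1 | cut -d ' ' -f 1)

# Show the commit message, then the diff against its parent.
subject=$(git log -1 --format=%s "$blamed")
echo "$subject"
git show --stat "$blamed"
```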

kqr · on May 10, 2019

That's separate from what the grandparent talked about. During the process of developing a new feature, I may incrementally refactor old code a few times, but only some of those changes make it into the final merge.

When I look back at the history, I'm only interested in seeing the changes that actually made it through, not every intermediate alley and dead end in between. If those dead ends are significant discoveries/results, I document them elsewhere.

Vinnl · on May 11, 2019

Yes, that's exactly my main motivation for rebasing too: properly thought-out commits can give more context to the lines of code I'm looking at. (I wrote these thoughts up in more detail here: https://vincenttunru.com/Spend-effort-on-your-Git-commits )

mikekchar · on May 11, 2019

The trick is you have to pick one of "merge always and avoid rebase" or "rebase always and avoid merging". If you take a branch, merge master into it, do some more development and then rebase it onto master, you are asking for trouble. If you have a revert in there (and especially a revert on a merge commit), it's a world of hurt.

But either way works fine. It just gives you a different history. My team likes merging because they don't understand exactly what happens when rebasing. In that environment `git log --topo-order` is practically a necessity, though.

yebyen · on May 12, 2019

I prefer to say, merges should only flow one way. (And always rebase before a merge.)

If you are merging master into a feature branch that has ongoing work and continues without merging back to master, that's the problem.

If your feature branch is short-lived, it can be easily rebased.

If your other branch is more like a release branch, with lots of work that can't be rebased easily, sometimes you can't really avoid a merge from master without communicating it first. If your team is large or distributed, it might not be practical to say "release has moved to (rebased ref), please catch up".

In that case you should treat merges to release the same as merges to master (they should be finished bits of work that are considered published), and any unmerged features for the release are kept on feature branches that are based on the release. They can be rebased after the point where master is merged back into release to avoid the nasty merge conflicts.

gbacon · on May 11, 2019

When cleaning up my own feature branch for review

    git merge --squash <tree-ish>

When merging into the baseline

    git merge --no-ff <tree-ish>

The reason for the latter is subtle. Yes, a perfectly linear history is nice in the aesthetic sense. However, the merge commits are artifacts of reviews that are useful in process audits, which QA and QC like to preserve.

MichaelMoser123 · on May 10, 2019

How come I see older commits with git log -p? I can even see what the commit changed.

tomcatfish · on May 10, 2019

Those commits are referenced. Parent comment is talking about commits that have no references to them (i.e., you make a commit, then remove all traces of it; it is not gone completely until a certain amount of time has passed).

kradroy · on May 10, 2019

I had to do a git history rewrite via "git filter-branch" and one other method. I can attest that you really have to go out of your way to actually permanently get rid of anything in a git repo. It's always funny to me when I see an engineer have a panic attack the first time they think they've trashed the repo and lost important code. They always have that same look as relief washes over them when I show them how to recover from the mistake.

rags2riches · on May 10, 2019

Git is non-destructive and it's great. Until you learn why the really dangerous command nobody talks about is git checkout.

lisper · on May 10, 2019

Or until you accidentally check in a file that contains a secret. Then the non-destructiveness becomes a serious problem.

palunon · on May 10, 2019

Two possibilities there: either your secret was published to others, and it's not a secret anymore, or it wasn't, and you can easily remove it. (Even from your local repo, if necessary remove the relevant blob hash)

kkarakk · on May 11, 2019

Easy to say, but a Chinese dev went to prison for pushing DJI's secret crypto key to GitHub.

thanatos_dem · on May 10, 2019

Once I learned about the reflog I got a lot more adventurous with git. Now I’m the “git expert” in my team.

dmitryminkovsky · on May 10, 2019

> No commits are harmed in the operation of `git rebase`. All the commits you had in the repo before the rebase are still in the repo.

They are still in the repo, but if no treeish item (eg. a branch) points to them, then they'll eventually get garbage collected.

Still, glad to see people are trying to elucidate git rebase. A small subset of its functionality is a fundamental part of my workflow and I wouldn't know how I'd use Git without rebasing.

ddevault · on May 10, 2019

git-gc will also preserve objects which are referred to in reflogs - it doesn't just have to be tree-ish. From the man page:

> git gc tries very hard not to delete objects that are referenced anywhere in your repository. In particular, it will keep not only objects referenced by your current set of branches and tags, but also objects referenced by the index, remote-tracking branches, refs saved by git filter-branch in refs/original/, or reflogs (which may reference commits in branches that were later amended or rewound). If you are expecting some objects to be deleted and they aren’t, check all of those locations and decide whether it makes sense in your case to remove those references.

dmitryminkovsky · on May 11, 2019

I didn’t know that! Thanks so much. I was always a bit nervous about things in my reflog being GC’d.

Dayshine · on May 10, 2019

I really don't understand why git doesn't create a tag when you rebase in case things go wrong, or more generally when doing potentially gc-able actions. Pretending the average user will know how to get things back to how they are is silly.

jrockway · on May 10, 2019

There is the reflog for that. https://www.atlassian.com/git/tutorials/rewriting-history/gi...

brandmeyer · on May 10, 2019

This is technically true, and a common riposte when talking about preservation of history edits.

Unfortunately, the reflog is confusing and hard to use correctly in the case of an interactive rebase with multiple steps. It is hard to figure out exactly how far back you need to go in the reflog to get to the moment before the rebase started if you want to start over. It also just so happens that it's when an interactive rebase goes awry that I really want to reach for the reflog to fix the damage.

lilyball · on May 10, 2019

HEAD and the branch have separate reflogs. Each step of an interactive rebase adds a separate entry to HEAD's reflog, but the branch's reflog only ever gets a single new entry when the rebase is complete. So you can run e.g. `git log -g master` and skip the rebase intermediate steps.

It is rather unfortunate that there is no convenient documented shorthand for "show me the reflog of the current branch" (`git log -g` gives you HEAD's reflog). That said, after a bunch of experimentation, it seems like `git log -g '@{0}'` will give you the reflog for the current branch. Apparently this works because e.g. `git log -g '@{2}'` gives you the reflog for the current branch skipping the first 2 elements.
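
A minimal sketch of the difference, assuming a scratch repository with made-up branch and file names: after a two-commit rebase, HEAD's reflog has an entry per step, while the branch's reflog only gains one, so `feature@{1}` is the pre-rebase tip.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch regardless of local defaults
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base"

git checkout -q -b feature
echo one > one.txt
git add one.txt
git commit -q -m "Feature 1"
echo two > two.txt
git add two.txt
git commit -q -m "Feature 2"
before=$(git rev-parse feature)

git checkout -q main
echo more > more.txt
git add more.txt
git commit -q -m "Main moves on"

git checkout -q feature
git rebase -q main

# HEAD's reflog records every checkout and every step of the rebase...
head_entries=$(git reflog show HEAD | wc -l)
# ...while the branch's reflog has: created, two commits, one entry for the rebase.
branch_entries=$(git reflog show feature | wc -l)
git log -g --oneline feature

# So the previous entry of the branch reflog is the tip from before the rebase.
previous=$(git rev-parse 'feature@{1}')
```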

dmitryminkovsky · on May 11, 2019

Thanks for pointing this out. I didn’t realize/notice this until recently. And it makes a lot more sense than having one reflog. I would have liked to have known sooner.

u801e · on May 10, 2019

> Unfortunately, the reflog is confusing and hard to use correctly in the case of an interactive rebase with multiple steps. It is hard to figure out exactly how far back you need to go in the reflog to get to moment before the rebase started if you want to start over.

When you run git reflog after rebasing, you will see lines like the following:

    29d82ac HEAD@{6}: rebase -i (finish): returning to refs/heads/your-branch
    29d82ac HEAD@{7}: rebase -i (fixup): Commit message 2
    4f8e996 HEAD@{8}: rebase -i (pick): Commit message 2
    f3a954e HEAD@{9}: rebase -i (pick): Commit message 1
    f74b8a5 HEAD@{10}: rebase -i (start): checkout origin/master

The line listed after the one that has rebase -i (start) is the commit you were on before you started the rebase. If I screw up a rebase, then I will stash any uncommitted changes and run a git reset --hard to the commit listed below the rebase -i (start) commit I see in the reflog and start the rebase again.

chrisweekly · on May 11, 2019

Yes, git stash is awesome! I use it all the time to "snapshot" my WIP and/or to quickly get a clean working tree when taking an interrupt to work on something in a different branch. (`gs` alias for `git stash save -u`, along with `grs` for `git reflog show stash` -- which shows the commitish for each stash...)

I see the stash as kind of like a private remote, in that I can freely put whatever messy or experimental or half-baked WIP I like, gaining the benefits of a commit without inflicting it on anyone else.

Stratoscope · on May 10, 2019

SmartGit does a brilliant job of integrating the reflog and stashes into the rest of Git. The history log window has a Branches panel, which is a tree view with subtrees for your local and remote branches. Below those is another subtree of any stashes you have saved, and then a Recyclable Commits checkbox.

When you turn on Recyclable Commits, every commit in the reflog shows up in the history tree just like any other commit. You can see exactly where they diverge from your other branches and can work with them as you normally work with any commit.

Same thing for stashes: check one and it just shows up as part of the commit tree as if it were a normal commit.

I've used SmartGit for years and highly recommend it over the Git command line for the way it gives you so much more insight into the state of your repo.

Dayshine · on May 13, 2019

This is literally what I was referring to with:

"Pretending the average user will know how to get things back to how they are is silly."

The reflog is faaar more complicated to use than any of the day-to-day git commands.

mixmastamyk · on May 10, 2019

Yep, it was designed by and for Linux kernel development; that average users are using it is an accident of history.

irrational · on May 10, 2019

So what are the chances that a version of Git, or something like Git, will be developed that is suitable for the average user? Someone above made a comment about how easy it is to do X in Git, and then proceeded to write lines of Git commands that are as arcane as anything an alchemist could come up with. If that is easy Git, I'd hate to see what hard Git looks like.

mixmastamyk · on May 10, 2019

Making it could be done; Mercurial already exists, for example. Perhaps a next-gen version could be introduced with all the lessons learned over the last decade.

Getting folks to use it would be very difficult due to network effects however.

muxator · on May 10, 2019

So true. I live a mercurial life, in a git world. Thanks to hg-git, it is easy to use it and just treat the remote git repository as a black box.

After having rewritten history locally (with Mercurial this is easy, powerful, safe and with a lot of tooling available), all you need to do is an hg push --force.

For one, hg histedit with its curses interface (default since 4.9) is sweet.

acqq · on May 10, 2019

Don't hope for git to become something else. Something else will be something else.

In reality, git is far from an ideal tool, with its own weak points and use cases that it supports badly or not at all. So it's never "just you"; don't feel bad that it's "hard." On top of this, even the scenarios that it supposedly supports well sometimes demand a totally "illogical" combination of names and parameters.

git is used in spite of its flaws for different reasons. Some actions are really very fast, faster than the competition. Sometimes that is reason enough. Another is that we have to use what our colleagues use. Yet another: once you become familiar with it, even if you were aware of the weirdness, it can stop annoying you. Still, to be able to realistically compare it with something else, you have to at least try it. And that something else too.

mikekchar · on May 11, 2019

Something like Darcs is probably going to be the next generation. The nice thing about Darcs is that it records patches in an order-independent fashion. This means that you can reorder your commits without penalty. This allows you to remove pretty much all problems requiring ninja-like skills to fix in Git. Darcs has performance problems on merges, though (as a result of its approach). However, I remember someone saying a while ago that they had found a solution to the problem. I think they meant to write a new system, but I can't for the life of me remember what it was called.

Anyway, once you get rid of the actual complexity in git, it's an easy step to work on the added complexity to the UI.

pdobsan · on May 11, 2019

Pijul (https://pijul.org/) is based on a categorical theory of patches (https://arxiv.org/abs/1311.3903). It is similar to Darcs but written in Rust. They claim that Pijul has solved the exponential merge problem. The docs, FAQ, and blogs, in particular the last one, are interesting readings.

theoh · on May 10, 2019

Git is really a toolkit for version control. Linus (or some other Git proponents?) distinguishes between "plumbing" and "porcelain": the infrastructure and the UI.

Linus wouldn't claim to be a brilliant designer of user interfaces. It's totally conceivable that somebody could come along and develop a new way of talking about Git's functionality, implemented by a new "porcelain". Changing the vocabulary, creating a more comprehensible map of the internal logic of the thing...

mcguire · on May 10, 2019

Unfortunately, none of the porcelains caught on, because all of the power users use plain git.

jdormit · on May 10, 2019

Except those of us using magit instead ;)

theoh · on May 10, 2019

Who knows what the future holds. Once upon a time all the power users used assembly language.

humblebee · on May 10, 2019

feature-branch@{1} is the branch prior to the rebase.

https://www.git-scm.com/docs/gitrevisions#Documentation/gitr...

mkesper · on May 11, 2019

Create a backup branch before:

    git branch local/foo

If you mess up too hard, check it out again.

hinkley · on May 10, 2019

> They are still in the repo, but if no treeish item (eg. a branch) points to them, then they'll eventually get garbage collected.

Why else are you going to go through the trouble of rebasing master if this isn't the goal you're shooting for? I'm a big proponent of commit history hygiene but even I can't defend rebasing master except for egregious things.

I think the only time I rebased master except for this was to fix a poorly executed mass file rename that broke git annotate.

rusk · on May 10, 2019

I'd suggest creating the branch before the rebase:

    $ git co feature-branch
    $ git branch before-rebase-feature-branch
    $ git rebase develop

gwright · on May 10, 2019

Worthwhile if you anticipate something going wrong, but usually things are just fine or a `git rebase --abort` will get you back to a safe place (if you are using an interactive rebase).

humblebee · on May 10, 2019

Or `git reset --hard ORIG_HEAD`

> HEAD names the commit on which you based the changes in the working tree. FETCH_HEAD records the branch which you fetched from a remote repository with your last git fetch invocation. ORIG_HEAD is created by commands that move your HEAD in a drastic way, to record the position of the HEAD before their operation, so that you can easily change the tip of the branch back to the state before you ran them.

https://www.git-scm.com/docs/gitrevisions#Documentation/gitr...
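
A minimal sketch of that undo, assuming a scratch repository with invented names. It relies on nothing else having moved ORIG_HEAD since the rebase (a later reset or merge would overwrite it), so it is only a reliable shortcut right after the rebase.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch regardless of local defaults
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base"

git checkout -q -b feature
echo work > work.txt
git add work.txt
git commit -q -m "Feature work"
before=$(git rev-parse HEAD)

git checkout -q main
echo more > more.txt
git add more.txt
git commit -q -m "Main moves on"

git checkout -q feature
git rebase -q main
rebased=$(git rev-parse HEAD)

# The rebase recorded the old tip in ORIG_HEAD; move the branch back to it.
git reset -q --hard ORIG_HEAD
after=$(git rev-parse HEAD)
```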

dan00 · on May 10, 2019

> No commits are harmed in the operation of `git rebase`. All the commits you had in the repo before the rebase are still in the repo.

The same changes are still in the repo (edit: I should have said branch here), but not the same commits, because the parents and children change and therefore the hash of the commits.

It is very important to be aware that the history is changed, because the previous history cannot be merged with the new one without issues, which is the main pain point and the source of most problems that arise from a rebase.

ddevault · on May 10, 2019

No, gwright is correct, and my guide fails to capture the nuance of this detail.

Each commit has a link to its parent, and represents the tip of a linked list. .git/objects is a heap of all commits (and other objects), and .git/refs contains a list of commit IDs that define each head (e.g. master). git rebase will often introduce new versions of a commit to the heap and update the heads to reference new histories, but the old commits stick around and can be accessed through the reflog - with their full original history intact.
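
The linked-list structure can be inspected directly. A small sketch in a scratch repository (names are illustrative): the `parent` line inside a commit object is the link, and a branch resolves to a single commit ID.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch regardless of local defaults
git config user.name "Demo"
git config user.email "demo@example.com"

echo a > a.txt
git add a.txt
git commit -q -m "First"
echo b > b.txt
git add b.txt
git commit -q -m "Second"

# A commit object is a small text record: tree, parent(s), author, committer, message.
git cat-file -p HEAD

# The "parent" line is the link to the previous commit in the list.
parent_in_object=$(git cat-file -p HEAD | sed -n 's/^parent //p')
parent_by_name=$(git rev-parse 'HEAD^')

# A branch is just a ref that resolves to one commit ID.
branch_tip=$(git rev-parse refs/heads/main)
head_commit=$(git rev-parse HEAD)
```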

dan00 · on May 10, 2019

It is right that the previous commits are still there in the repo, but from git's point of view they are garbage now and are going to be removed. What matters are the commits now reachable from the branch.

dahart · on May 10, 2019

What’s the matter with reachable and unreachable commits? Commits no longer needed should be unreachable and cleaned eventually, that’s a feature. Git is fantastic about keeping the unreachable commits for long enough that should I actually need them for any reason, they’re usually there. The default is 90 days. The number of times I need to dig into the reflog for any reason is very low, and always because I made a mistake. The number of times I’ve lost a commit irrevocably because it was cleaned before I needed it is 0.

ZeroGravitas · on May 10, 2019

> The number of times I need to dig into the reflog for any reason is very low, and always because I made a mistake.

A good UI should allow you to recover from mistakes. Like the trashcan vs rm example everyone is using.

It's good that git doesn't permanently delete stuff, it's bad that you need to be a relative expert to know that. If git branch showed rebased branches and told you they would disappear in x days then beginners might feel less fear and embrace the power of git faster.

dahart · on May 10, 2019

This is a little hyperbolic though, because git does have UI above the reflog designed for catching the most common mistakes. The reflog is a powertool, it is not the default UI, and most people never need to look at the reflog.

git rebase has an "abort" feature when you need to redo it. git has tag & branch & stash features if you want to save what you're doing before you rebase. The problem with keeping and showing rebased branches is that (1) you don't need them after the rebase is successful, only when the rebase is going badly, and (2) you'd have a lot of unnecessary noise pile up. I often rebase multiple times before every push. I don't want to see them all, you probably don't either.

That said, I fully agree that git's UI could be better and help beginners feel less fear!

ddevault · on May 10, 2019

git-gc will save stuff in your reflogs. It works pretty hard to avoid removing objects which are referenced by anything at all.

dan00 · on May 10, 2019

Ah ok, the point about reflog makes sense. Thanks!

AlexTWithBeard · on May 10, 2019

> the old commits stick around

They kinda do, but that's like saying that deleted files are not deleted, but stick around for a while.

While technically true, for most practical purposes _rm <file>_ deletes the file. The fact that each and every "git 101" manual has to explain how to recover deleted commits, means something is wrong.

It's like saying: "Here's the key, and in case it doesn't work there's a pry bar in the garage". This is usually a pretty good indicator that the lock is broken.

mikeash · on May 10, 2019

It’s more like saying that files in the trash or recycle bin aren’t really deleted. It’s true, they’re still there.

Someone who has never done it before won’t know how to retrieve them, but that’s hardly a surprise.

dahart · on May 10, 2019

Comparing lost commits to rm isn’t a good analogy.

Git has a very good safety net when you know how to use it. The problem is knowing how to use it, not that it’s not there.

It’s a legitimate point that git’s UI sucks, that’s what’s wrong, and everyone agrees. But learn how to use the reflog and you will see the light!

pmiller2 · on May 10, 2019

Git has very good safety even if you don’t know how to use it, provided you commit when you want something to be saved and you don’t rm the whole repo at the first sign of trouble.

AlexTWithBeard · on May 10, 2019

I think my issue is that most people should not even know how this safety net works. But every other question about git on stack exchange seems to be "how do I recover from a failed rebase".

dahart · on May 10, 2019

You have your wish: most people already don't even know how the safety net works. :P The existence of questions on a site designed to ask questions is not any indicator of how often rebase causes problems. Nobody posts to stack exchange every time it works and they're not confused. Aren't the questions on stack exchange a good thing, if what you want is for people to not have to learn the safety net? Just commit and rebase until there are problems, then if you get confused, go look up the answer on stack exchange or post a question if you don't see one already. Seems like the system is working?

pkamb · on May 10, 2019

Just have a backup branch at the same commit as the branch you're about to rebase. It'll keep all of the pre-rebase commits on that standard branch. No need for the reflog or "trashcan" or anything weird like that.
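
A minimal sketch of the backup-branch approach, assuming a scratch repository; `backup-feature` and the other names are made up for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch regardless of local defaults
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base"

git checkout -q -b feature
echo work > work.txt
git add work.txt
git commit -q -m "Feature work"

git checkout -q main
echo more > more.txt
git add more.txt
git commit -q -m "Main moves on"

# Pin the current tip under a second name before rebasing.
git checkout -q feature
git branch backup-feature
git rebase -q main
rebased=$(git rev-parse feature)

# The backup still points at the original, untouched commits.
backup=$(git rev-parse backup-feature)
git log --oneline backup-feature

# If the rebase turned out badly, put the branch back where the backup is.
git reset -q --hard backup-feature
restored=$(git rev-parse feature)
```

Once the rebase looks right, the backup can be dropped with `git branch -D backup-feature`.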

PhaseLockk · on May 10, 2019

My understanding is that the original commits will eventually be cleaned up during garbage collection if there is nothing else pointing to them. Is that correct?

Myrmornis · on May 12, 2019

Yes, came here to say the same thing.

> If you've made a mistake and in so doing lost commits which you needed, then git reflog is here to save the day.

Seeing that sentence in the article immediately makes me think the author has poor git practices or even understanding. reflog is useful at times, yes. But it is not what saves beginners from losing work while practicing their rebasing skills. What saves them is that they didn't rebase an important branch. They made a temp branch, pointing at the HEAD of their important branch perhaps, and they screwed up the temp branch.

If people would just make this one thing clearer to beginners, it would help a lot of people learn git more easily and with less fear.

luhn · on May 10, 2019

I'm familiar with the reflog and use `HEAD@{n}` on occasion, but it never occurred to me that you could use that same syntax with branches. It seems obvious in retrospect; I feel silly.

war1025 · on May 10, 2019

Thanks for pointing this out. I completely missed it when I read the parent comment. A neat trick.

sampo · on May 11, 2019

> All the commits you had in the repo before the rebase are still in the repo.

In your local repo yes, for next 30 days. Then git will garbage collect them. But when you force push new pointers to GitHub, the GitHub repo will lose access to the unreferenced commits right away.

But of course those 30 days will give you plenty of time to go back, if you changed your mind about the rebase.

nacho_weekend · on May 10, 2019

Yes, I agree. Rebase has the ability to obfuscate and rewrite commits if you so choose. Rebasing to just re-order commits is totally viable and perhaps encouraged. We did this at a previous company and it made the commit history very clean to read. However, with great power comes great responsibility in a rebase, and a junior dev can easily mess things up if you don’t educate them properly.

gwright · on May 10, 2019

Nope. It doesn't 'rewrite' commits. You are using the language I was calling out as confusing.

It creates new commits and moves the branch to the tip of the new sequence of commits. No existing commits are changed or deleted.

anamexis · on May 10, 2019

OP's point was that rebasing cannot rewrite commits. It can only make new commits and change branch pointers to them.

ddevault · on May 10, 2019

I just added a mention right after the big scary warning about how everything you do in git is non-destructive. Thanks for the suggestion!

pjc50 · on May 10, 2019

As the old Andre Previn sketch joke says: all the right commits, but not necessarily in the right order.

Nullabillity · on May 10, 2019

It's not 'editing' the raw history files, but you're still presenting a false history to your coworkers. To me there are essentially two kinds of rebases:

- Summarizing history: squashing "implemented subfeature A.A" and "implemented subfeature A.B" into "implemented feature A"

- Rewriting history: moving commits around, changing the base commit, and so on

In my opinion summarizing history is acceptable, you're making a creative decision that certain information will not be useful in the review/when trying to understand the code in the future.

Rewriting, on the other hand, is essentially lying. You're creating repository states that never existed, and which you have never tested. In the worst case, consider the following history:

    *     F: (master) Merge branch 'component2'
    |\  
    | *   E: (component2) Fixed component 2's integration with 1
    | *   D: (component2) Merge branch 'master' into component2
    | |\  
    | |/  
    |/|   
    * |   C: (master) Refactored component 1's API
    | *   B: (component2) Implemented component 2 that depends on 1
    |/  
    *     A: (master) Base

Yes, it could probably be completely linearized, but that would be a horrible idea. Commit B will leave the repository in a completely nonsensical state. Sure, you could squash in E to mitigate it (since, luckily, nothing else happened in component 2 in the meantime), but then you're still stuck explaining what will likely look like a bunch of really weird design decisions compared to if you had designed against component 1's new API immediately.

Historical context matters. If in doubt, don't rebase. Never `git pull --rebase` blindly.

gwright · on May 10, 2019

It isn't lying. Advocates of rebase are always talking about a work flow where you are curating a set of proposed changes before merging into your "public" branches (e.g., development or master).

No one is advocating that you use rebase on your public branches or basically any branch that has been "published". We are talking about feature branches or spikes or branches that exist just on one developer's machine.

Nullabillity · on May 10, 2019

You're lying about the path that you took to get to that point, and you're creating a lot of (public) nonsense commits on the way there (unless you're very careful, and/or overly squash-happy).

Whether you've published the true history earlier is irrelevant to that discussion.

gwright · on May 10, 2019

Let me try again. I'm advocating that you use rebase to improve the quality of your changes that will be reviewed before merging or even before being reviewed at all.

If I make three commits and then realize that I should have included something in the first commit, I use rebase to create a new sequence of three commits that has the corrected version of the first commit. I haven't shared those commits with anyone, this is just work that I've done locally.

Are you seriously advocating that creating a pull request with: (A, B, C, A-fixup) is better than using rebase and then creating a pull request with: (better-A, B, C)?

You think that second case is "lying" because I didn't show the intermediate step that included the mistake?
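For anyone who wants to see the (A, B, C, A-fixup) to (better-A, B, C) step concretely, here is a minimal sketch in a throwaway repo. The file names, branch names and identity are made up for illustration, and `GIT_SEQUENCE_EDITOR=true` is only there so the generated todo list is accepted without opening an editor.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

git commit -q --allow-empty -m "Base"
git branch -M master
git checkout -q -b feature

# Three local, unpublished commits: A, B, C.
for name in A B C; do
    echo "$name v1" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "$name"
done
old_tip=$(git rev-parse HEAD)

# Realize A was incomplete: record the correction as a fixup of commit A.
echo "A v2" > A.txt
git add A.txt
git commit -q --fixup=HEAD~2            # creates a commit titled "fixup! A"

# Replay the branch with the fixup folded into A.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash master

git log --format=%s --reverse master..feature   # A, B, C (no fixup commit)
git show HEAD~2:A.txt                           # A v2
git cat-file -t "$old_tip"                      # the pre-rebase tip still exists
```

The old four-commit sequence is not destroyed; it is just no longer what `feature` points at.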

Nullabillity · on May 10, 2019

Yes, it is lying.

You can mitigate most of the damage if it is convincing enough (for example, go through B' and C' and make sure everything still makes sense at each point), but realistically nobody is going to do that, because it's a pretty inefficient way to spend your time. And even then, you're still removing context (unless you're just fixing a typo).

> I'm advocating that you use rebase to improve the quality of your changes that will be reviewed before merging or even before being reviewed at all.

That was clear from the start. But the fact that X breaks Y doesn't imply that Y is a good idea when X doesn't apply.

dang · on May 11, 2019

Throwing the word "lying" into an argument like this counts as name-calling and flamebait in the sense that the site guidelines use these terms. It leads to distracting, shallow, and therefore more boring conversation. Would you mind reviewing the rules and please not do that? Let's stay focused on exchanging what we're curious about.

https://news.ycombinator.com/newsguidelines.html

gwright · on May 10, 2019

I really don't understand why you are choosing to use the word "lying".

Us mere humans make mistakes all the time. Typos, omissions, false starts, and so on. What is the value of throwing that raw set of events at a reviewer or complicating the understanding of the changes when viewed in retrospect from the future? What is the reason you call curating the work into a more polished form "lying"? Why do you think the time spent being intentional about changes isn't valuable when compared to the time spent by a reviewer (or your future self) to sort through the flotsam and jetsam of your intermediate work?

fulafel · on May 11, 2019

But reviewers in most Git workflows mainly look at PRs. Then if as a reviewer you want to see how the sausage was made, you can zoom in on the commits, including all the messy reality of how the work was done. In some circumstances you might of course want to hide this, but in an open and safe collegial environment this lets the reviewer understand your thought and work process.

raimue · on May 11, 2019

The reviewer also won't see all the things the author tried without ever committing these states. They don't need to see all the messy steps, or they would have needed to look over the author's shoulder all the time.

I'd rather review the final patch series with changes in logical order and not necessarily in the order the code was written or with intermediate work that was later reverted or changed again. I do look at commits, because every commit message counts too and is supposed to explain the individual change.

fulafel · on May 11, 2019

Sure, it's not a perfect record. But as in most things, perfect is the enemy of good.

The messy reality is valuable when talking with your teammates about how the work was done and what kind of bumps were along the way. It's not about looking over their shoulders, it's about using data to develop together as a team, eliminating hindrances, etc. - if you have the mutual trust to do that. And of course you yourself can go back and look for patterns of mistakes or problematic areas in code based on your history.

Like I said, to judge the change itself, the whole PR diff is usually the most useful unit of inspection when you just want to see what happens. And if it's a big PR, you can of course always merge child PRs or branches against the big PR/branch, and look at the merge diffs.

fulafel · on May 12, 2019

Another formulation of the "learning as a team" idea:

The science principle of publishing your experiments, including failed ones, has the same benefits in sw engineering: others can build on your failed attempts, or save time by not replicating them.

Nullabillity · on May 10, 2019

I'm using it to differentiate between summarizing (removing steps between A and B) and modifying (introducing new steps, reordering them, or editing them).

You can do it in a way that isn't harmful (as I mentioned earlier), but good luck getting a team to actually stick to that. It also doesn't help that pretty much no tooling encourages doing it properly.

avar · on May 10, 2019

Are you lying to your co-workers when you draft an E-Mail to them, read it over, and decide to delete a paragraph or write it again from scratch? If your E-Mail client automatically saves drafts that's basically the equivalent of "rebase".

I made a typo when writing this reply, and pressed backspace to correct it. Is use of the backspace key lying?

I think you're placing a value on "history" that doesn't map onto all users of "rebase", or E-Mail client drafts. A lot of advanced users use it as the equivalent of "save" in an editor; sharing all those intermediate states is more noise than value vs. crafting a sensible patch once you figure out what you want/what change to make.

Nullabillity · on May 10, 2019

No, if you only merge changes then you're just summarizing.

The true problems begin once you start creating commits that represent repository trees that you never tested or reviewed, for example by editing past commits (invalidating any testing you've done of commits after that point), deleting past commits (aside from squashing an unbroken sequence of commits, or deleting them if the squash would result in a no-op), reordering commits, or rebasing commits.

avar · on May 10, 2019

You're assuming that commits are tested before they're made, and that rebase invalidates this. I don't test most of my commits, just like I don't proofread an E-Mail after every word I've written. I do that later.

But yeah, the history you push to a canonical branch should generally be made up of commits that have all been tested in isolation. The rebase command doesn't make this worse, but better, e.g. with "rebase -i --exec='make test'".
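A rough sketch of what that `--exec` run looks like, in a scratch repo. `check.sh` is a hypothetical stand-in for `make test` (it only logs the commit it ran on and fails if a .txt file contains BROKEN); the branch and file names are also invented.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
CHECK_LOG=$(mktemp) && export CHECK_LOG
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical test script, committed with the base so every commit has it.
cat > check.sh <<'EOF'
#!/bin/sh
# Record which commit is being checked, then fail if any .txt file says BROKEN.
echo "checked: $(git log -1 --format=%s)" >> "$CHECK_LOG"
if grep -q BROKEN ./*.txt; then exit 1; fi
EOF
echo base > base.txt
git add check.sh base.txt
git commit -q -m "Base"
git branch -M master

# Two commits on a feature branch.
git checkout -q -b feature
for name in B C; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "$name"
done

# Meanwhile master moves on.
git checkout -q master
echo D > D.txt
git add D.txt
git commit -q -m "D"

# Rebase feature onto master and run the check after every replayed commit.
git checkout -q feature
GIT_SEQUENCE_EDITOR=true git rebase -q -i --exec "sh check.sh" master

cat "$CHECK_LOG"    # one "checked:" line per rebased commit
```

If the check fails on some commit, the rebase stops there so that commit can be fixed before continuing.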

I also prune out history of some false steps taken. Have you never written a program and done something like "I'll use a hash here <save><compile><test>, no actually a list makes more sense <save><compile><test> ...". Those intermediate steps are commits for a lot of advanced git users.

Sharing all your mistakes-as-you-go-along with the world doesn't help anyone, I'd typically be sending you a 100 patch merge request for some rather trivial change instead of 1-3 sensible commits.

int0x80 · on May 11, 2019

That kind of rebase is just to refresh your work against updated masters etc. Of course you have to test against the refreshed (rebased) work again!! That doesn't mean you can't rebase. It means you can't randomly push untested work. If you know what rebase means you will understand that there are very likely new interactions with your code and you have to test your updated changeset. Exactly the same as if you merge. You have to retest the resulting tree.

meribold · on May 11, 2019

A merge of a branch with N unique commits creates one new, yet-to-be-tested commit/tree. A rebase creates N. I doubt that it's common that people replay all the new history after a rebase and test each new commit/tree.

tshaddox · on May 10, 2019

Maybe "lying" isn't the best term to use here, because it seems like you're using it to mean "not providing all information in a way that is morally bad." Of course you're not providing literally all information. Heck, I conceal a lot of information about my development process by testing and changing code before I even make a commit. But I think that's preferable to, for instance, providing a video screen capture of my entire development process for review.

aldanor · on May 11, 2019

Yea. Moreover, I would often ask my coworkers to rebase their code if there’s commits like “oops, missed a comma” because it distracts from the main point when you read the commit history.

bonzini · on May 11, 2019

> realistically nobody is going to do that, because it's a pretty inefficient way to spend your time.

Of course you do! And it's not an inefficient use of your time, because it helps reviewers now, and yourself when you're bisecting later.

> And even then, you're still removing context (unless you're just fixing a typo).

You place that context in the commit message.

fulafel · on May 11, 2019

IME this is a very rare level of sophistication in use of rebase.

And how do you detect that you forgot / were too busy to do it, when you go back 6 months later? It's fragile, "fail-open".

bonzini · on May 11, 2019

You would be surprised. For example, this is a guide my colleague wrote to describe his git workflow:

https://github.com/tianocore/tianocore.github.io/wiki/Laszlo...

PButtNutter · on May 10, 2019

I disagree.

Say upstream is at A.

I clone it in my local work-space, and make a few commits over the course of a few days. So my local is A B C

During this time other changes have been merged into upstream, so upstream looks like A D E

I now have two options. I can try to merge from upstream or rebase off of upstream. Merging introduces a messy commit history that quickly becomes difficult to follow. Rebasing removes my local commits, applies the changes in upstream, and then re-applies my local commits.

So after rebasing my local is A D E B C. There are no messy merge commits. And ideally, I can squash my local changes into a single feature commit, so upstream ends up incredibly tidy.

At no place in this process is there any dishonesty or lying. I haven't changed the history upstream, which is the source of truth. What's the issue here?

Nullabillity · on May 11, 2019

> So after rebasing my local is A D E B C.

No, your local is now A D E B' C'. Commits aren't just a diff between two tree snapshots, they are tree snapshots.

Hopefully you test and sanity check C' before submitting for review, but it's very unlikely that you're going to give B' the same treatment, making it more difficult for people to understand the history in the future (as well as breaking `git bisect`).
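To make the B versus B' distinction visible, here is a small sketch (scratch repo, invented file names) that records the commit IDs before and after the rebase. It only illustrates that the rebased commits are new objects with different trees; it says nothing about whether they were tested.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo a > a.txt && git add a.txt && git commit -q -m "A"
git branch -M master

# Local work: B and C on top of A.
git checkout -q -b feature
echo b > b.txt && git add b.txt && git commit -q -m "B"
echo c > c.txt && git add c.txt && git commit -q -m "C"
old_B=$(git rev-parse HEAD~1)
old_C=$(git rev-parse HEAD)

# Upstream gains D and E in the meantime.
git checkout -q master
echo d > d.txt && git add d.txt && git commit -q -m "D"
echo e > e.txt && git add e.txt && git commit -q -m "E"

git checkout -q feature
git rebase -q master
new_B=$(git rev-parse HEAD~1)

git log --format=%s --reverse        # subjects read A D E B C
# Same subject, but a different commit with a different tree snapshot:
echo "B : $old_B tree $(git rev-parse "$old_B^{tree}")"
echo "B': $new_B tree $(git rev-parse "$new_B^{tree}")"
# The pre-rebase tip is still reachable through the branch reflog:
git rev-parse 'feature@{1}'          # same ID as old_C
```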

And even if you do, are your coworkers going to? Consistently? No CI tool that I'm aware of will enforce this for you.

> There are no messy merge commits.

No, but the underlying messy workflow is still there. You've just swept it under the rug for the sake of aesthetics, at the cost of future comprehension.

> At no place in this process is there any dishonesty or lying. I haven't changed the history upstream, which is the source of truth. What's the issue here?

Those are completely orthogonal concerns. You're presenting a false version of the repository state.

The common mantra of "don't rewrite public history" is about not creating a mess of duplicate commits, it doesn't imply that rewriting history is fine as long as it's not public.

Vinnl · on May 11, 2019

But you lie all the time, by that definition! If I write code, make a mistake and press Ctrl+Z before committing that code, I've just "rewritten" my history without my team mates being able to tell.

Your commit history is just a somewhat arbitrary recording of your code at certain points in time that you choose. Rebasing simply makes that less arbitrary, allowing you to document the way your code is built up in a structured way. Rather than having to decide on the spot whenever a certain combination of code is a good candidate for a single, atomic commit, you can make that judgment with the benefit of hindsight.

jolmg · on May 10, 2019

How do you feel about deleting commits to avoid reverting them on such personal branches? Also, what about doing it to fix commit messages (maybe because they were accidentally written in a language that was not agreed on for the project)? What about splitting a commit with a generic "lots of semi-related things" commit message into multiple, more focused commits?

eshch · on May 10, 2019

do you want to know all my wrong tries to make a thing work? why are you sure that all the commits i did are not nonsense? i look at the history as a way to 1) divide my work into reusable pieces of changes 2) document my changes to read for other developers.

Nullabillity · on May 10, 2019

> do you want to know all my wrong tries to make a thing work?

Yes. A failed attempt is still a useful signal that people shouldn't try to simplify back to that way in the future (and why not). It's also a useful starting point in case the reasons it failed no longer apply.

scrollaway · on May 11, 2019

It's very rare that "failed attempts" are a useful signal. When it is the case, it's better to document it (e.g. as part of the commit message, PR, or the dev documentation itself).

Commit histories littered with commits that get back-and-forth reverted are frickin unreadable though. Extremely annoying to bisect, painful to comb through when looking for changes, noisy in git blame, etc. There's a ton of downsides for what in practice is very rarely even an upside.

jojo14 · on May 12, 2019

I agree with you, yet I don't understand why you were downvoted. I think git-rebase proponents haven't really ever worked in a professional environment, particularly with several co-workers. As a project manager, I would not trust a "rebaser", and I would question the time he spent rewriting the git history. Granted, merge and rebase need the same amount of reading the code and merging the differences. However, with git-rebase there is the added cost of beautifying the history. That means at least two drawbacks: one is cost, the other is more about memory. About cost: what is the point of rewriting the history when you have a (great) tool to janitor it? Then the git history automatically reflects the project history. If the git history is rewritten, how would people remember the order of commits in case bugs occur? When did the bug happen? Who should correct it? IMHO those questions are more fundamental than a straight line of bullets in gitk.

gwright · on May 13, 2019

I can't help but feel like you don't understand a work flow that utilizes rebase in a responsible way. In particular you seem to think that the rebase is going to affect the history of a released version of your software (When did the bug happen?).

Nobody is suggesting that rebase be used to change the history of a released or published branch (master, develop etc.). If that is your concern and the reason for you not trusting a "rebaser" then you are simply mistaken, you are arguing against an imaginary workflow for which no one is advocating.

Rebase should be used only to curate the commits on a feature branch and to keep the feature branch synchronized with the upstream branch.

alkonaut · on May 11, 2019

I have a very simple solution to this: your feature is at most 1 or 2 commits (after squash) and they can be ff-merged when your branch is done. Or it’s too big.

The exception to this is when a feature becomes more involved and has several logical steps, or any kind of history worth providing. This should be rare and when it happens, use merge commits to preserve history.

Not polluting the history with N trivial branches for every 1 branch that needs historical context, is a benefit of this.

JshWright · on May 11, 2019

In that case I assume you have also removed the backspace key from your keyboard?

Toury2d · on May 10, 2019

[flagged]

dang · on May 11, 2019

Personal attacks aren't ok here, regardless of how wrong or annoying another comment is. Would you mind checking out the site guidelines and taking the spirit of this site to heart? We'd be grateful, since that's the only way for it to remain interesting.

https://news.ycombinator.com/newsguidelines.html

gwright · on May 10, 2019

Or you could just rebase component2 when you learn that component1 has been updated on master:

    *     F: (master) Merge branch 'component2'
    |\
    | *   E: (component2) Implemented component 2 (now with updated component1)
    |/
    *     C: (master) Refactored component 1's API
    |
    *     A: (master) Base

So we've lost B and D from your example, but who cares about those commits?

Nullabillity · on May 10, 2019

That's easy in the trivial example where nothing else happened in the meantime.

palunon · on May 10, 2019

> and which you have never tested.

That's on you if you don't test every commit. I don't care if you had failing tests (or even a failing build) when you were writing your feature. I care that every one of your patches (and thus commits) does one logical thing, and that tests pass (making bisect useful).

You want your PR commit history to tell a coherent story. No one cares if a writer had 15 bad drafts of their story before publishing, and the same applies here.
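As an illustration of why per-commit passing tests make bisect useful, here is a sketch with ten commits where the seventh introduces a made-up regression. The "test" is just a `grep` on a hypothetical `status.txt`; in a real project it would be the test suite.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Ten commits; from commit 7 onward status.txt reports a breakage.
for i in 1 2 3 4 5 6 7 8 9 10; do
    echo "$i" > counter.txt
    if [ "$i" -ge 7 ]; then
        echo "status=broken" > status.txt
    else
        echo "status=ok" > status.txt
    fi
    git add counter.txt status.txt
    git commit -q -m "commit $i"
done
first_bad=$(git rev-parse HEAD~3)

# HEAD is known bad, the first commit is known good.
git bisect start HEAD HEAD~9 >/dev/null
# Exit status 0 marks a commit good, 1 marks it bad.
git bisect run grep -q "status=ok" status.txt >/dev/null
found=$(git rev-parse refs/bisect/bad)
git log -1 --format=%s "$found"      # commit 7
git bisect reset >/dev/null 2>&1
```

This only works cleanly because every intermediate commit can be checked; a commit that fails for unrelated reasons would send the search down the wrong half.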

dahart · on May 11, 2019

> you’re still presenting a false history to your coworkers [...] essentially lying

This has been a common misunderstanding of git in the past, but thankfully is fading now. I was hoping it wouldn’t come back to haunt this thread. I don’t know where the extreme and hyperbolic idea of using git the way it was designed is “lying” and creating “false history” first came from, you aren’t the first person to suggest it, but it’s neither correct nor helpful to use that kind of language. This theoretical philosophical ideal that there’s only one true history is trading away things git was specifically created to do, as well as the practicalities of real world software development, in favor of a strange unrealistic and abstract notion that once git commit has been used the commit should never be touched again.

Everyone knows and agrees that rearranging already published commits is a bad idea. Not because it’s “lying”, but because it causes problems, costs other people time, and can even inflict irreconcilable merge conflicts on their work.

Cleaning up your own commits before you push using interactive rebase is not just a good idea, it’s the way git was designed, it’s what Linus does, and it’s kind to your team. This includes reordering commits and pulling with rebase.

> Historical context matters. If in doubt, don’t rebase. Never `git pull --rebase` blindly.

Maybe you could back up your assertion with some examples of why it always matters, and why that justifies using words like ‘never’?

Your rhetoric is ignoring the real-world fact that on a large team, the majority of commits at any given time are orthogonal to each other, and that the parent commit you end up with is completely arbitrary.

Not only do I use pull --rebase, I always git config --global pull.rebase true, and I frequently recommend others do the same.
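A minimal sketch of what that setting does, using a throwaway bare repo as "origin" and two clones (all names are invented). It sets `pull.rebase` per repository rather than with `--global` so the demo does not touch anyone's real configuration.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# A seed repo, a bare "origin", and two working clones.
git init -q seed
(cd seed && echo base > base.txt && git add base.txt && git commit -q -m "Base")
git clone -q --bare seed origin.git
git clone -q origin.git alice
git clone -q origin.git bob

# Alice publishes a commit.
(cd alice && echo a > alice.txt && git add alice.txt \
    && git commit -q -m "Alice's change" && git push -q origin HEAD)

# Bob has an unrelated local commit and pulls with rebase enabled.
cd bob
echo b > bob.txt && git add bob.txt && git commit -q -m "Bob's change"
git config pull.rebase true
git pull -q

git log --format=%s --reverse          # Base, Alice's change, Bob's change
git rev-list --merges --count HEAD     # 0
```

Without the setting, the same pull would have recorded a merge commit for two changes that never touched the same files.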

Having merge commits in master every single time someone checks in is incredibly noisy and it inflicts friction on the entire team to force everyone to read the noisy log. I’ve always worked on teams that decided to take the more practical approach of one-off commits should not have a merge, regardless of when they happen, to keep history cleaner, and feature branches with more than a couple of commits or by more than one person should have a merge commit, to keep the master branch from having broken commits or unfinished features and so it’s always bisectable.

fulafel · on May 11, 2019

Yes, good explanation of the problem.

And the cherry on top is that you can't easily tell later if you are looking at rewritten history. So if the above kind of rewriting might have happened in your project, you will essentially not be able to trust git history anymore as a record of engineering decisions.

malingo · on May 10, 2019

My eyes were opened on git-rebase when I read https://matthew-brett.github.io/pydagogue/rebase_without_tea...

The full version of the command as

    $ git rebase --onto new-base start end

takes the commit range (start,end] and re-commits them on top of the new-base commit. The commit range doesn't have to be a full branch and you don't even need to be on the branch to run the command this way. It's very intuitive and I nearly always use the full version now.
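Here is a small sketch of the three-argument form in a scratch repo, with invented commit and branch names. It moves only the last two commits of `topic` onto `master` and is run while `master` is checked out; it assumes T2 and T3 do not depend on T1.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo a > a.txt && git add a.txt && git commit -q -m "A"
git branch -M master

# topic has three commits on top of A.
git checkout -q -b topic
for name in T1 T2 T3; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "$name"
done

# master moves on independently.
git checkout -q master
echo m > m.txt && git add m.txt && git commit -q -m "M"

# Replay the range (topic~2, topic], i.e. T2 and T3, on top of master.
# T1 is the excluded start of the range, so it is left behind.
git rebase -q --onto master topic~2 topic

git log --format=%s --reverse topic    # A, M, T2, T3
```

Note that the command switches to `topic` as a side effect, and T1 is now reachable only through the reflog.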

I've also gotten into the habit of "pinning" my branch before I rebase so that I have it in its original form. If the branch name is my-branch, then the command

    $ git branch my-branch{-hold,}

which is a handy (bash-specific?) shortcut of

    $ git branch my-branch-hold my-branch

leaves you on my-branch and creates a new branch label called my-branch-hold that points to the same place.
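A short sketch of the pinning habit, written with the explicit two-argument form since brace expansion is a feature of bash, zsh and ksh rather than plain POSIX sh. The repo contents are invented; the final `reset --hard` shows one way to go back to the pinned state if the rebase result is not wanted.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "Base"
git branch -M master
git checkout -q -b my-branch
echo work > work.txt && git add work.txt && git commit -q -m "Work"
git checkout -q master
echo more > more.txt && git add more.txt && git commit -q -m "Upstream change"
git checkout -q my-branch

# Pin the current state, then rebase.
git branch my-branch-hold my-branch
git rebase -q master

git log --format=%s --reverse my-branch        # Base, Upstream change, Work
git log --format=%s --reverse my-branch-hold   # Base, Work

# Not happy with the result? Put the branch back where the pin is.
git reset -q --hard my-branch-hold
```

Once the rebase result is accepted instead, the pin can be dropped with `git branch -D my-branch-hold`.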

EDIT: clarification of pre-rebase branching

gbacon · on May 11, 2019

I’ll sometimes do the same for a rebase that looks like it will be hairy, and I like to call the bookmarks, for example,

    git branch my-branch-mulligan

zemo · on May 10, 2019

Git rebase is great. Honestly I think the argument that "if you have to push -f that means rebase is wrong" is making a huge assumption about how people use branches and why people are force pushing branches.

Force pushing branches is what you do when you have pushed a branch that you expect to modify. Why would you do that? Because that's how Github and Bitbucket have taught people to conduct PR's.

If your immediate reaction is that "rebase is bad UX", ask yourself whether or not pull requests are good UX. I honestly think rebase is great, but that pull requests are extremely bad UX, and the UX blame is misplaced on rebase when where it really belongs is on pull requests.

u801e · on May 10, 2019

> Force pushing branches is what you do when you have pushed a branch that you expect to modify. Why would you do that? Because that's how Github and Bitbucket have taught people to conduct PR's.

I thought that Github and Bitbucket encouraged people to push up additional commits to fix issues in their PR. So, a typical PR will end up with a commit history like:

    Implement a feature method
    Add calls to new feature method
    Update to version 1.2.3
    fixing missing semi-colon
    addressed comments
    one more thing
    now its working

People who force-push are the ones who are trying to keep a clean commit history (meaning you don't have those extra 4 commits). So, your point about a PR being bad UX versus a rebase is correct, but not for the reason you state.

cyphar · on May 10, 2019

Almost no projects I've worked on that use GitHub ask people to push fixup commits. In fact, maintainers (like me) often have to ask people to squash their commits into reasonable chunks.

Liskni_si · on May 10, 2019

It's not the project that asks people to push fixups, it's that GitHub, as opposed to e.g. Gerrit, makes it hard to see what changed (how your comments were addressed) after a force push, so it's best that people only add commits.

Ideally they'd use git commit --fixup=<sha> -p, and then git rebase --autosquash when the maintainer approves the merge, but few people care.

(And what I'm saying isn't really accurate any more since GitHub does show force-pushes in the pull request UI these days and one can run git range-diff on that. But this wasn't possible last year.)

mcguire · on May 10, 2019

Usually, in my experience, you submit a pull request, add commits to fix things in the request, then when everything is copacetic, you squash the commits and submit a final pull request. You have to do a push -f to make that last request.

eximius · on May 10, 2019

I see this in OSS a fair bit. But for any private repo, I see the fixup commits.

paulddraper · on May 10, 2019

I should add that `rebase` in no way requires `push -f`, if these are private, unpublished changes.

Sahhaese · on May 10, 2019

When you have a command so confusing that you need an entire website dedicated to a single command, and still need to warn against using it, then perhaps you've got the UX wrong.

ddevault · on May 10, 2019

git is a version control framework more so than a version control system. It starts from simple primitives, exposes them to the user, then builds complex and powerful tools on top of them. Because git rebase gives you primitives to accomplish high-level tasks (e.g. "reorder these commits"), the learning experience is different because you have to learn the low-level details to accomplish your high-level task. However, because those low-level details are accessible to you, you are afforded a greater flexibility in inventing new high-level tasks.

Sahhaese · on May 10, 2019

But those high level tasks could also be exposed directly. There's nothing to stop there being more commands which more directly accomplish the desired tasks.

The idea that git is good because it is difficult to use is just "git snobbery", as is the idea that it must be difficult because it's a DVCS.

There's nothing to stop git having two levels of the API, one exposed for tools to build off of with the full complexity and another for every day use.

dahart · on May 10, 2019

Git does have two levels of the API explicitly, one for tools (“plumbing”) and one for every day use (“porcelain”).
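For a feel of the difference, here is a sketch that makes one commit with porcelain and a second one by hand with plumbing commands. File names and messages are invented, and this is only one possible sequence of plumbing calls, not a description of what `git commit` does internally step by step.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Porcelain: the everyday way.
echo hello > hello.txt
git add hello.txt
git commit -q -m "Add hello (porcelain)"

# Plumbing: build a similar commit from lower-level pieces.
echo world > world.txt
blob=$(git hash-object -w world.txt)                          # store the file contents as a blob
git update-index --add --cacheinfo 100644,"$blob",world.txt   # register the blob in the index
tree=$(git write-tree)                                        # snapshot the index as a tree object
commit=$(git commit-tree "$tree" -p HEAD -m "Add world (plumbing)")
git update-ref HEAD "$commit"                                 # advance the current branch

git log --format=%s    # Add world (plumbing), Add hello (porcelain)
```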

I don’t think the argument is that the difficulty is a virtue, nobody is trying to be snobby, so try to avoid jumping to that conclusion.

Git just has some inherent complexity. Git does have a steep learning curve that is the root of a UX problem. But it’s not clear what better abstractions there are or how to simplify git. Lots of people have tried to make a higher level porcelain, and the issue isn’t going away. You are welcome to suggest & create a git wrapper that makes it simpler and less dangerous.

Perforce is easier to learn, so you might try using that instead. I use both and I’m becoming more and more frustrated with Perforce because git is so much more flexible and safer and easier to use once you learn how to use git.

AlexTWithBeard · on May 10, 2019

The main problem with git is that it tries to be two things at once: a change management system and a version control system.

For the former you want flexible history, distributed repos and freedom to do whatever you want.

For the latter the history should be sacrosanct and the repository had better be more or less centralized.

git tries to sit on both chairs and therefore has to adopt a quite awkward position.

dahart · on May 10, 2019

I totally don’t understand your implied semantic difference between “change management” and “version control”. Those sound like exactly the same thing to me. ;)

I also don’t understand your larger point about git and what the problem is. For almost everyone using git, the pushed history is sacrosanct, and the main repo is centralized. The main workflow for rebase is to clean up before making commits public.

AlexTWithBeard · on May 10, 2019

For me "change management" is akin to IntelliJ's shelf: you can have a bunch of changes, you can combine them in lists, you can shuffle the changes between these lists and selectively apply them. Editing is the king. I should totally be able to destroy five years of my work without a way to recover.

"Version control" is a log. I should be able to return to any point in history at any time. I should not be able to destroy this history no matter what I do. The history should be backed up in a remote location.

With git every once in a while these two come into conflict: I'm using my branch as a change management system and then someone else pulls that branch and makes a couple of changes on it. Then I force push and the mess begins.

Regarding centralized repository: say, I have two working copies. In goode olde subversion there was the "master" version in trunk and two local versions, one per working copy. Three versions altogether. Pretty clear which is which.

Now in git I have:

- master in origin repo

- origin/master in wc1

- master in wc1

- actual file in wc1

- all the same in wc2

Seven potentially different versions of the same exact file. That's even without mentioning a stash.

dahart · on May 10, 2019

Rebase is a way to provide what you're calling "change management", and so is git stash. The rest of git is what you're calling "version control". I don't see any conflict. I don't love trying to differentiate those terms either, FWIW. Managing changes and controlling versions are "literally" the same thing.

> Then I force push and the mess begins.

That is your problem right there. Force pushing over published history should always be avoided. Don't do that, it is very unfriendly to others you work with. Just always pull, resolve any conflicts, then push your changes without forcing them. If you need to force in order to fix a serious mistake, then notify everyone first and have people hold their changes, then pull everything, resolve the problem, force push, and notify everyone again to "force pull" by first fetching, then resetting their branch to what's in the origin version; using reset --hard will let them avoid having any conflicts after you force pushed. Consider carefully whether the mistake even warrants a force push, or if you can make do with new commits on top that fix the bad ones.
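The "force pull" step can be sketched like this, with a throwaway bare repo as origin and two clones named alice and bob (all invented). It assumes Bob has no local work of his own on the branch, because `reset --hard` discards uncommitted changes and any local commits not saved elsewhere.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

git init -q seed
(cd seed && echo base > base.txt && git add base.txt && git commit -q -m "Base")
git clone -q --bare seed origin.git
git clone -q origin.git alice
git clone -q origin.git bob

# Alice publishes a commit and Bob picks it up.
(cd alice && echo v1 > file.txt && git add file.txt \
    && git commit -q -m "Add file" && git push -q origin HEAD)
(cd bob && git pull -q --ff-only)

# Alice rewrites the published commit and force-pushes it.
(cd alice && echo v2 > file.txt \
    && git commit -q -a --amend -m "Add file (rewritten)" \
    && git push -q --force origin HEAD)

# Bob "force pulls": fetch, then move his branch to whatever origin now has.
cd bob
git fetch -q origin
git reset -q --hard '@{upstream}'

git log --format=%s --reverse    # Base, Add file (rewritten)
```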

> Regarding centralized repository [...] Seven potentially different versions of the same exact file.

You're conflating multiple different topics. The existence of multiple copies of a file isn't related to which repo is the central one, nor is it some kind of problem.

The origin repo is your central repo. Your downstream repo has to copy from the upstream/central repo if you even want to work on the code. What you called "copies" in origin/master and master are branches, not copies of the file. The only single copy on your machine from your point of view is your local workspace, which expanded from your "master". stash is something that happens behind the scenes to your git database, it's not making more working copies. wc2 is another repo or computer, it's not even relevant. None of these copies you're talking about are visible to a user except the one working copy.

AlexTWithBeard · on May 10, 2019

> That is your problem right there. Force pushing over published history should always be avoided.

That is my problem right here.

There is no way to tell the difference between "I push to make something public" and "I push to back up my data in some safe location".

dahart · on May 10, 2019

Well in either case, I'll just repeat: force push should always be avoided. Regular push without forcing can handle both of those scenarios, publishing commits to your team, and also backing up data to a safe location.

That said, making backup data in a safe location isn't what git was really made for. If you really don't want history to be there, you can use cp or rsync, or you can git clone --depth 1 from the backup machine, or just use backup software. There's "bup" which is a backup tool based on git...
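For reference, a history-free copy via a shallow clone looks roughly like this. `--depth` takes a positive number of commits, and for a repository on the same machine it only takes effect with a `file://` URL. The repo names here are invented.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# A source repo with three commits.
git init -q seed
(
    cd seed
    for i in 1 2 3; do
        echo "$i" > file.txt
        git add file.txt
        git commit -q -m "commit $i"
    done
)

# Shallow clone: only the most recent commit comes along.
git clone -q --depth 1 "file://$work/seed" backup

git -C seed rev-list --count HEAD      # 3
git -C backup rev-list --count HEAD    # 1
```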

ddevault · on May 10, 2019

This is the same argument that suggests Squarespace is better than HTML & CSS. Maybe true for some people - but not the typical HN audience, I imagine. You use git all day, every day. It's worth it to learn it inside and out, and the design assumes users who are willing to make that investment. If you assume everyone learns the primitives, then the rest of git's design makes sense as an organic evolution of that.

And for anyone who doesn't know the primitives, I posted a brief summary on Mastodon the other day:

https://cmpwn.com/@sir/102038690003388821

Also recommend Pro Git's chapter on git internals.

Espressosaurus · on May 10, 2019

Nah. It's a UI issue. Even when you know the primitives, the counterintuitive randomness of what is a command vs. what is a switch on a different command vs. something that can be done in three different ways using different combinations of commands and switches makes it a clusterfuck to interface with until you memorize everything. Contrast to something like mercurial where once you know the concept, you either know how to do it, or you know the command that will give you the help for how to do it. It's generally not going to be some switch on an unrelated command because the UI was actually designed and not simply cobbled together.

eterm · on May 10, 2019

This isn't HTML & CSS vs squarespace, and to butcher the analogy this is more like CSS vs Sass/SCSS. CSS is the internal API which does everything but Sass is an API which exposes that in a much more beautiful way.

I don't need to know how postgres does paging, indexing, tree diffs, etc to be able to write good SQL.

I don't need to know how TypeScript compiles to JavaScript to use TypeScript.

I don't need to know how my engine works to drive my car.

I use all of those every day.

But the suggestion here is that everyday users of git should learn git internals to be able to use the tool better.

That's a tooling failure. It's not a failure of the git design or git fundamentals, it's a failure of the git cli.

Espressosaurus · on May 10, 2019

Mercurial's interface is just fine, and these days it's just as powerful as Git.

Things could be better. There's an existence proof. It just lost the mindshare war and so now we're stuck with Git, which I still have to look up basic syntax for because its command set is contradictory and makes no sense. (Is it git <x>? git <y> --x? git <z> <a-b>? Something else entirely? Who knows!)

guitarbill · on May 10, 2019

I used to agree with this, now I've stopped worrying. Because if git is the worst part of your workflow, that's a great problem to have. But at many places, git is the best part.

(I've also had to work with various IBM CVS, and they are universally garbage. When I get frustrated at git, all I have to do is think back to those.)

So yes, Mercurial is better, but is it worth the effort? Not in my experience.

hnthrowaway919 · on May 11, 2019

I know it's popular to shit on anything that isn't git these days, but you mentioned IBM CVS. I've used a couple of them, but primarily RTC (Rational Team Concert). I know that was an IBM acquisition and not a home-grown solution (what wasn't?). I personally prefer some features of RTC over how to do the equivalent in git. Namely, being able to move change sets (think commits) around freely, not having to deal with rebasing/merging into whatever branch you want to put it in/on. I also think there's something to be said for a CVS system that is built for teams that work together daily, compared to a system that's built for a "remote contributor" model.

That being said, I use git daily and find that I'm able to do everything I want and more, so I'm not looking to make a switch. Unfortunately, most people don't care to learn how to use git beyond "checkout, commit, push, call for help".

guitarbill · on May 12, 2019

There was one before RTC, called CMVC, which was truly awful; using it after 2010 felt like an insult to developer productivity.

I forget all the reasons why RTC isn't great, but the main one: if the server goes down, you're screwed. This happened several times, and we basically went to the pub instead of working. Slow to check out. Streams sucked compared to branches (especially when the server admin restricted creation of streams, meaning you simply could not branch at all if I'm remembering), and the capability to stash changes/switch branches to work on different work items if one was blocked was also more cumbersome. Code review was terrible.

A centralised paradigm does simplify things a lot mentally, but the workflow suffers IMO.

hnthrowaway919 · on May 14, 2019

You certainly make some good points about the downsides of RTC. RTC's streams are often compared to git's branches because they're the closest construct, but they are definitely very different and have pretty minimal overlap, considering they're basically the parallel construct. IMO stashing changes was not bad (suspending change sets, I believe it was called), but perhaps I was mostly doing that within 1 stream and not between streams. I agree that code review was not great, though I'm not a _huge_ fan of GitHub's comment/PR review mechanism either. I'm not aware of code reviewing built in to git itself, though I could be totally missing it.

guitarbill · on May 15, 2019

> I'm not aware of code reviewing built in to git itself

I guess you can pull a branch or email a patch and diff it with the diff tool of your choice.

There are a few options for git review UIs, e.g. Gerrit or GitLab. Kind of unix-y, just have the VCS be a good VCS.

uryga · on May 10, 2019

nitpick, but I think "existence proof" is the opposite of what you meant (that Mercurial is a living proof that things can be better):

> a constructive proof is a method of proof that demonstrates the existence of a mathematical object by creating [...] the object.

> This is in contrast to [an existence proof], which proves the existence of a particular kind of object without providing an example.

https://en.m.wikipedia.org/wiki/Constructive_proof

tasuki · on May 10, 2019

> The idea that git is good because it is difficult to use

Git is good because it is extremely elegantly designed. There are blobs, trees, commits, and refs. When you understand them, you understand pretty much everything about git.

And yes, the interface is a mess.

Hello71 · on May 10, 2019

http://man7.org/linux/man-pages/man1/git.1.html

GIT COMMANDS

       We divide Git into high level ("porcelain") commands and low level
       ("plumbing") commands.

np_tedious · on May 10, 2019

Being that rebase is under "porcelain", would you agree the UI ought to be improved?

I'm not sure if I do, but that's the conclusion the man page + your comment would seem to support.

ajross · on May 10, 2019

It's not an interface problem at all. "git rebase", with no arguments, does almost exactly what a typical user wants almost all the time. Probably 80% of the remaining cases are handled by "git rebase $BRANCH".

But once outside that world, the user is faced with the problem that "rebase" is just a special case of "merge" and shares all the complexities and edge cases. And that's hard for fundamental reasons. Git has tools for this too, but their interface shares the complexity of the problem domain.

gpm · on May 10, 2019

"git rebase" does exactly what you want, except when it's totally unrelated to what you want because what you actually want is "git rebase -i HEAD~3" which does something basically completely different (from a user's point of view).

ergothus · on May 10, 2019

This is probably where I foul up.

My last few workplaces were either not git, or relied on git pull w/o rebase. My current workplace has rebase as part of their flow, and I find that, like, 60% of my PRs require force push, which bothers me greatly. Everyone else just shrugs and considers it part of business, but I know what I WANT to do should be nicely aligned and not encounter that problem.

Unfortunately, every explanation I'm like "yes, yes, branching trees, I get it..." and then I'm suddenly in the "...and it says things are different and I don't know why". And because this always happens when I'm trying to get some fix in, I never have the time to study it to figure out what is really happening. It's just "--force and promise myself the next time will be different".

house9-2 · on May 10, 2019

> My current workplace has rebase as part of their flow

Not really sure what that means as rebase can be used in multiple ways, but you might want to try using:

`--force-with-lease` instead of `--force`

> This option allows one to force push without the risk of unintentionally overwriting someone else’s work

https://thoughtbot.com/blog/git-push-force-with-lease

pjc50 · on May 10, 2019

That does sound like they're doing it wrong. Could you give an example?

ergothus · on May 10, 2019

> Could you give an example?

To the degree I understand it, sure:

We have master branch A

I create a feature branch B

Both get updates. Someone will do a rebase of B to the most-recent A and push that. (In my previous workplaces they would have just pulled the most recent A)

Here's where the confusion comes in: If I get the updated B but A has updated again, I cannot pull A nor rebase to A and successfully push the result without forcing. IIRC, on push it complains that my local branch is not up to date, but if I pull it will tell me I am up to date.

At least, that's what I think is the timing - since this involves multiple people I'm uncertain of what exactly occurs and the order, nor why problems are inconsistent. We don't have that many feature branches that have multiple people contributing AND requiring updates from the master branch, but it happens often enough.
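
One plausible reading of this scenario, sketched in a throwaway repository (the directory and branch names below are made up, and this may not be exactly what happened on that team): once a branch that has already been pushed gets rebased, the local branch and its remote counterpart each hold commits the other lacks, so a plain push is refused as a non-fast-forward.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the shared remote.
git init -q --bare remote.git
git init -q local
cd local
git symbolic-ref HEAD refs/heads/main
git remote add origin ../remote.git

# Branch A ("main") and a published feature branch B ("feature").
echo base > base.txt
git add base.txt
git commit -q -m "base"
git push -q origin main
git checkout -q -b feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature work"
git push -q origin feature

# A receives an update, then B is rebased onto the new A.
git checkout -q main
echo more > main.txt
git add main.txt
git commit -q -m "main moves on"
git checkout -q feature
git rebase -q main

# Left number: commits only on origin/feature. Right number: only on feature.
divergence=$(git rev-list --left-right --count origin/feature...feature)
echo "only on origin/feature / only on local feature: $divergence"

# A plain push cannot fast-forward the remote branch, so it is refused.
if git push -q origin feature 2>/dev/null; then
    push_result=accepted
else
    push_result=rejected
fi
echo "plain push: $push_result"
```

The old "feature work" commit still exists on the remote while the local branch holds a rewritten copy of it, which is why git reports the two as different even though the content looks the same.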

jolmg · on May 10, 2019

> which does something basically completely different (from a users point of view).

The only difference is that one's interactive and the other isn't. If you use `-i` and simply exit out of the editor, the effect is completely the same as not having used `-i`, isn't it? I think moving `rebase -i` to a completely new command could potentially make things more confusing. That new command would be an extension of `rebase` and so could be used interchangeably.
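
A quick way to check that claim, as a sketch in a scratch repository (branch and file names are invented for the demo): rebase two identical branches, one plainly and one with `-i`, using `true` as the sequence editor so the todo list is saved untouched.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/main
echo base > base.txt
git add base.txt
git commit -q -m "base"

# Three commits on a topic branch, plus an identical copy of that branch.
git checkout -q -b topic
for n in 1 2 3; do
    echo "$n" > "topic$n.txt"
    git add "topic$n.txt"
    git commit -q -m "topic $n"
done
git branch topic-interactive topic

# main moves on, so both branches have something to be rebased onto.
git checkout -q main
echo main > main.txt
git add main.txt
git commit -q -m "main moves on"

# Plain rebase of one copy.
git rebase -q main topic

# Interactive rebase of the other; `true` exits without editing the todo list.
GIT_SEQUENCE_EDITOR=true git rebase -q -i main topic-interactive

plain_tree=$(git rev-parse 'topic^{tree}')
interactive_tree=$(git rev-parse 'topic-interactive^{tree}')
plain_log=$(git log --format=%s main..topic)
interactive_log=$(git log --format=%s main..topic-interactive)
echo "same tree:    $plain_tree / $interactive_tree"
echo "same commits: $(echo $plain_log) / $(echo $interactive_log)"
```

The comparison uses tree IDs and commit subjects rather than commit IDs, since the two rebases may record different committer timestamps.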

gpm · on May 10, 2019

The other difference is that I'm "rebasing" onto an ancestor of the current head, as in I'm not really changing base at all.

A hypothetical new command would be a simpler version of "rebase" that comes with the restriction described above, that it's not actually changing base.

jolmg · on May 10, 2019

> The other difference is that I'm "rebasing" onto an ancestor of the current head, as in I'm not really changing base at all.

`rebase -i` doesn't restrict that, does it? There may be people whose workflow includes things like `git rebase -i --onto foo bar baz`. That you don't use it is another matter.

> A hypothetical new command would be a simpler version of "rebase" that comes with the restriction described above, that it's not actually changing base.

You want to remove features from git? Why?

If you didn't mean to say that instead of having `rebase -i` we should only have this hypothetical command, then you can do:

  git config --global alias.edit-history 'rebase -i'

Though you may want to add to that to make it impossible to use edit-history to rebase, too. I mean, you did say you wanted the restriction, right?
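
For what it's worth, a restricted variant could look roughly like this. It is only a sketch: the alias body and the `squash-all.sh` stand-in editor are hypothetical and not part of git. The alias accepts nothing but a commit count, so it can only ever rebase onto an ancestor of HEAD.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/main
for n in 1 2 3 4; do
    echo "$n" > "file$n.txt"
    git add "file$n.txt"
    git commit -q -m "commit $n"
done

# Hypothetical alias: `git edit-history <count>` edits the last <count> commits.
# It is set in the repository's local config here; use --global to keep it.
git config alias.edit-history '!f() { git rebase -i "HEAD~${1:?usage: git edit-history <count>}"; }; f'

# Stand-in for a human editing the todo list: keep the first "pick"
# and turn every later "pick" into "fixup".
cat > "$work/squash-all.sh" <<'EOF'
#!/bin/sh
sed '2,$s/^pick/fixup/' "$1" > "$1.new" && mv "$1.new" "$1"
EOF

GIT_SEQUENCE_EDITOR="sh $work/squash-all.sh" git edit-history 3 >/dev/null 2>&1

commit_count=$(git rev-list --count HEAD)
tip_subject=$(git log -1 --format=%s)
echo "commits now: $commit_count, tip: $tip_subject"
```

In everyday use you would drop the `GIT_SEQUENCE_EDITOR` override and edit the todo list by hand.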

gpm · on May 10, 2019

No, rebase -i doesn't, that's why I said `rebase -i HEAD~x` in my original comment and not just `rebase -i`. `rebase -i` should of course continue to exist.

You're right when you say I don't "just" want to introduce an alias because I want to restrict the arguments.

The other issue with that solution is I don't just want a solution for me. I know how this works now, I've already memorized the magic incantation to edit history and later spent the time to understand why the command does what it does (the same goes for basically all the other common git commands). What I want is a solution that works for everyone, out of the box, so we can stop wasting time teaching git internals.

ajross · on May 10, 2019

OK... And what is your suggestion for a proper and obvious interface choice for that?

FWIW: the line noise you typed means "Make a list of the changes since three commits ago, let the user edit the list, then apply them". Other than -i step and the length of the list, this is basically a noop -- you're rebasing on an ancestor of HEAD! I mean, yeah, git lets you do that, but I don't know why you expect the syntax for irrelevant nonsense to be simple.

But let's humor you and try to use that syntax for something real. If you typed "my_version_tag~3" it might make sense -- you want to back up to the commit before whatever automation might have added for a release and put your current work on top of the earlier tree as if they had been developed as part of the release. And you have some junk in your current tree you don't want to expose to the customer to whom you are going to hand this test tree, so you want to remove it interactively.

That... sounds like a useful trick. But it's complicated. And the syntax is complicated. So what's a good syntax for the previous paragraph's action?

gpm · on May 10, 2019

How about, `git edit-history 3`? Just stop calling it rebase.

The "line noise" I typed is an extremely common command that is used to clean up commit history, e.g. squash all your "WIP" commits into a few nice ones. Yes the idea of rebasing onto the same branch is just weird, that's kinda my point, since it's the only way to edit history that git supports (as far as I, or anyone I've ever seen answer a question about how to do this knows).

krupan · on May 10, 2019

You mean like mercurial histedit <revision>?

Also, emacs magit makes this even easier, if you don't mind a more GUI-like interface.

ajross · on May 10, 2019

So... your whole complaint is that people use "rebase" as a trick for "edit history". You'd be fine if they just put a wrapper into the project for that? Seems like not much of a complaint to me. Why not just submit it yourself?

Git doesn't have "reorder patches" feature. Maybe it should. But the fact that its rebase tool can be abused to do this doesn't say anything about the interface value of "git rebase".

Honestly, it seems like the root cause here is that you don't actually do branch rebasing very often, don't see the value of having a rebase tool in the tree, and are just complaining that the trick you do need isn't well supported by the rebase tool you don't use or understand.

gpm · on May 10, 2019

No, I use rebase for "branch rebasing" (i.e. actual rebasing) too. I have no clue how you got that out of my comments.

My complaint is that git's UI is terrible to teach people, to understand it you have to understand way too many internal details of git.

Yes, making a wrapper for `git rebase -i HEAD~x` (and `git rebase -i <other thing that refers to a commit above head>`) would satisfy this UI nit. It wouldn't satisfy all UI nits, this is just a relevant example.

As for why not submit it myself, I'm sure I'm not the first person to complain about this, drive-by UI changes to a project are the absolute best way to get a ridiculous inconsistent UI. I don't submit it myself because I'm not willing to commit the time to become a core maintainer of git, and without being a core maintainer I don't feel right trying to push UI changes in.

dahart · on May 10, 2019

> Git doesn't have "reorder patches" feature.

Huh? That is exactly what git rebase -i is. This isn't some kind of "abuse" or "trick", this is precisely what interactive rebase was designed for. What is making you think that rebase is only to be used when merging two different branches?

mcguire · on May 10, 2019

Git is a complex system, of necessity, and rebase is a flexible, powerful tool. Some of the things it can do are common parts of the normal work flow. Others are only for serious situations.

I generally prefer transparent systems to those that try to figure out what I really want to do.

Amorymeltzer · on May 10, 2019

The warning regarding "public, shared, or stable branches" is always warranted, but I think those warnings end up reverberating where they needn't. Before interacting with anything public or shared — when working solo or locally — `rebase` can be hugely helpful. When starting out with something complicated, I often make separate commits for different files or steps; using rebase to reorder commits or amend can make turning your first steps into a viable change much easier, and can help with merge conflicts down the line. I also find it helpful for large commit messages; rather than needing to write everything at once, making liberal use of `fixup` or `squash` can keep disparate thoughts or bug fixes manageable.

I'd never use it on a public or shared branch, but `rebase --interactive` and `--exec` are some of my most-used git aliases.

V-2 · on May 10, 2019

I'd use rebase --interactive quite aggressively on a public branch when it's a feature branch that's not yet been merged. As I'm the only owner of it, the way I see it I owe no guarantees to anyone. You're welcome to watch it, but it's work in progress in every aspect.

Amorymeltzer · on May 10, 2019

Github and refined-github[1] make it easier, by keeping track of changes from a `push -f` and having merge/squash/rebase options on merging, but I know of large projects with an explicit "no rebasing" rule as it can get confusing for reviewers. I'd say it depends on the project, maintainers, and general workflow. Which is good!

1: https://github.com/sindresorhus/refined-github

brandmeyer · on May 10, 2019

My team does this for branches during the code review process all the time, and it works great.

falsedan · on May 10, 2019

rebasing a shared branch is fine, if the audience who needs to know about it is within arms length (or the remote chat equivalent). Rebasing a public branch that's used by hundreds requires special notification tooling and email templates and elaborate public shaming rituals, which people are happy to come up with but seems unnecessary in my opinion if you think twice before force pushing

also: 'git push --force-with-lease' (worth repeating)

chris_mc · on May 10, 2019

I use `--force-with-lease` instead of `--force` or `-f` because it ensures that if someone happened to push before me it would fail and I could manage that manually. Even on branches that no one "should" be touching other than me, it seems safer to type the extra characters `-` and `orce-with-lease` around the `-f`.

ZeroGravitas · on May 10, 2019

This was new to me, and makes me feel better about force pushing. However, while googling to find out if I could set this as the default behaviour I found a Stack Overflow answer that, as well as saying "no", points out a very git-like gotcha:

https://stackoverflow.com/questions/30542491/push-force-with...

Apparently it will think you know about the changes if you've fetched them, even if you've not merged them. And some systems auto fetch in the background.
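
Both behaviours can be reproduced locally. The following is a sketch with a bare repository standing in for the shared remote (all directory names are invented): the lease protects a colleague's commit you have not fetched, and stops protecting it as soon as a fetch updates `origin/main`, even though nothing was merged.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/main

# "mine" publishes the first commit.
git init -q mine
cd mine
git symbolic-ref HEAD refs/heads/main
git remote add origin ../remote.git
echo one > file.txt
git add file.txt
git commit -q -m "first"
git push -q origin main

# A colleague pushes a commit that "mine" has not fetched yet.
cd "$work"
git clone -q remote.git theirs
cd theirs
echo two >> file.txt
git commit -q -am "colleague's work"
git push -q origin main
colleague_tip=$(git rev-parse HEAD)

# Rewrite local history in "mine", then try to publish it with a lease.
cd "$work/mine"
git commit -q --amend -m "first (reworded)"
if git push -q --force-with-lease origin main 2>/dev/null; then
    lease_result=pushed
else
    lease_result=rejected
fi
echo "before fetch: $lease_result"

# The gotcha: a fetch (manual or automatic) moves origin/main to the
# colleague's commit, so the lease now matches and the push goes through.
git fetch -q origin
git push -q --force-with-lease origin main
my_tip=$(git rev-parse HEAD)
remote_tip=$(git -C "$work/remote.git" rev-parse main)
echo "after fetch the remote points at my commit: $remote_tip"
```

After the second push the colleague's commit is no longer on the remote `main`, which is exactly the loss the lease was supposed to prevent.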

dbaupp · on May 10, 2019

The 'push -f' suggestion could instead recommend 'push --force-with-lease' to be less error prone in case one is accidentally pushing to a concurrently modified branch.

Also, unless the diagram is confusing with the alignment, the "rebase to rebase" example seems to be implicitly assuming --onto, because the last common ancestor of 'master' and 'feature-2' includes 2 commits on that branch: the first of 'feature-1' and then the one that is only on 'feature-2'.

ddevault · on May 10, 2019

I've heard this before, but I felt that `--force-with-lease` requires a lengthier explanation in an already intimidating article, is harder to type, and generally isn't useful for users of the Sourcehut workflow. It's definitely a useful tool, though. Maybe I should add a footnote.

munk-a · on May 10, 2019

I disagree, I'd suggest just including the plain `--force-with-lease` since it's generally a tool that stops breakage from happening. In particular, the assumption that your branch isn't publicly used is usually wrong if `--force-with-lease` fails.

guitarbill · on May 10, 2019

I do sympathise, it's hard to keep it at a high level, but many people just copy what they see.

One thing I do in our team is provide git novices with some useful aliases, one of them being `fpush = push --force-with-lease`. It's saved a few people losing work along the years

chris_mc · on May 10, 2019

I don't get the "extra letters" argument as a Computer Scientist. Most of my time is spent figuring out what I'm going to do before I type anything, whether that's research, staring at code for hours to see how it works, or something else. I could type 3x as many characters each day and would probably only work an extra 10 minutes per day. Maybe I'm the exception, but I don't like brevity for brevity sake.

recursive · on May 10, 2019

You could also consider it "harder to remember".

chris_mc · on May 10, 2019

I normally use "git push --fo<TAB>" or "git push --fo<UP ARROW>" in zsh, but after seeing the switch option "--force-with-lease" pop up each time I've memorized it now. I let my CLI do the work for me most of the time, but when I can't there's always "man COMMAND". After doing anything a bunch of times, you remember it.

umvi · on May 10, 2019

I just made a git alias `fpush` that does --force-with-lease

pizzapill · on May 10, 2019

If you use rebase to rewrite history a lot, I have found/developed a couple of helpful commands (tested & working on Linux bash).

Put those in your ~/.gitconfig:

  [alias]
   fu = "!f() { local msg=\"fixup! $(git log --oneline -n1 | cut -d ' ' -f2-)\"; git commit -am \"${msg}\" && git rebase -i --autosquash HEAD~2; }; f"
   fuc = "!f() { local msg=\"$(git log --oneline -n1 | cut -d ' ' -f2-)\"; if [[ \"${msg}\" != "fixup!"* ]]; then msg=\"fixup! ${msg}\"; fi; git commit -am \"${msg}\"; }; f"
   xx = "!f() { git reset --hard && git clean -f -d; }; f"

From now on:

git fu = (git fix up) combines last commit and all staged changes into one commit with the last commit message

git fuc = (git fix up comment) commits all staged changes as a new commit with the last commit message, but prefixed with "fixup!". Except if the last commit message is already prefixed with "fixup!". Now you can work on the same thing but commit incremental steps. In the end you just do: git rebase -i master (or similar) and they will be all in one commit.

git xx = get rid of all unstaged/uncommitted changes in current directory. Very destructive, very useful.

Last but not least. If you have a lot of "fixup!" or "squash!" commits and need to interactively rebase without autosquashing them do:

git rebase --no-autosquash -i master

sktrdie · on May 10, 2019

Weird they don't talk of the `git commit --fixup=` command. And then `--autosquash` when rebasing.

u801e · on May 11, 2019

You can achieve the same thing by just titling the commit:

  fixup! Exact title of commit

We actually use that workflow during PR reviews on Github. That is, someone comments on a PR and the person who opened the PR will make a fixup commit with the appropriate title and reply to the comment stating that it was addressed and link the fixup commit by its abbreviated sha1 value.

At the end of the PR review, the person who will merge the PR will run:

  git fetch origin
  git rebase -i --autosquash origin/master
  git diff @{u}..

This basically updates the remote tracking branches (including origin/master), rebases the feature branch on top of the remote tracking branch for master, and then checks if any changes were introduced in the rebase process by running a diff against the remote tracking branch.

If there is no diff or the diff only shows changes made on the upstream master branch since the feature branch was created, then they can run:

  git push -f origin feature-branch-name

and then merge the PR.

masklinn · on May 10, 2019

Also no mention of `--onto` to e.g. move a bunch of commits from one branch to another.
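
A minimal sketch of that use, in a scratch repository with invented branch names: `feature-2` was started on top of `feature-1`, and `--onto` replays only the commits in `feature-1..feature-2` directly onto `main`.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/main
echo base > base.txt
git add base.txt
git commit -q -m "base"

# feature-1 branches off main; feature-2 branches off feature-1.
git checkout -q -b feature-1
echo one > one.txt
git add one.txt
git commit -q -m "feature-1 work"
git checkout -q -b feature-2
echo two > two.txt
git add two.txt
git commit -q -m "feature-2 work"

# Replay the commits after feature-1, up to feature-2, on top of main.
git rebase -q --onto main feature-1 feature-2

moved=$(git log --format=%s main..feature-2)
if git cat-file -e feature-2:one.txt 2>/dev/null; then
    has_feature_1_file=yes
else
    has_feature_1_file=no
fi
echo "commits on feature-2 beyond main: $moved"
echo "feature-1's file still present on feature-2: $has_feature_1_file"
```

`feature-1` itself is left untouched; only the `feature-2` branch is moved to the rewritten commit.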

ddevault · on May 10, 2019

I don't use --fixup in my own workflow, but I would be happy to accept a patch adding a note about it:

https://git.sr.ht/~sircmpwn/git-rebase.io

umvi · on May 10, 2019

Is --fixup the same as --amend?

Liskni_si · on May 10, 2019

Conceptually, it's similar, and after git rebase --autosquash, the result is the same. But you can amend/fixup (call it whatever you want) any commit in your branch, not just the last one.
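
As a sketch of that workflow in a scratch repository (file names and the `base` tag are made up for the demo): fix up a commit that is not the last one, then let `--autosquash` fold the fix into it. `true` is used as the sequence editor so the generated todo list is accepted as-is.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/main
echo base > base.txt
git add base.txt
git commit -q -m "base"
git tag base

for name in a b c; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "add $name"
done

# The commit to repair is "add a", two commits behind HEAD.
target=$(git rev-parse HEAD~2)
echo "a, corrected" > a.txt
git add a.txt
git commit -q --fixup="$target"      # creates a commit titled "fixup! add a"

# --autosquash moves the fixup commit next to its target and squashes it in.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash base

subjects=$(git log --reverse --format=%s base..HEAD | tr '\n' ',')
fixed=$(git show HEAD~2:a.txt)
echo "history: $subjects"
echo "a.txt as of 'add a': $fixed"
```

With `--amend` the same repair would only have been possible while "add a" was still the most recent commit.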

jaequery · on May 10, 2019

FWIW i've never really needed rebase. i am pretty happy with seeing all the commits that ever happened.

nothrabannosir · on May 10, 2019

Fixing history enables powerful second-level tools, such as bisect and cherry-pick. Being able to pinpoint a problem to an exact commit is incredibly powerful for debugging, but it does require your commits to be as healthy as possible. If you fix a bug 10 commits after it was introduced, now the 10 commits between them are harder to work with; you always have to keep in mind that unrelated bug. And cherry-picking is lovely between branches, but if a commit introduced a bug and another fixes it, now you always have to cherry-pick them together. Easier to fix it and keep them atomic.

A clean git history is not a vanity project. It can be used as a tool in further code building.
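
To make the bisect point concrete, here is a small sketch in a scratch repository (the `status.txt` file and the "broken" marker are invented stand-ins for a real test suite): `git bisect run` finds the commit that introduced the breakage without any manual checking.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/main
echo ok > status.txt
git add status.txt
git commit -q -m "commit 1"
good=$(git rev-parse HEAD)

# Six commits in total; the fourth one sneaks in the breakage.
culprit=
for n in 2 3 4 5 6; do
    if [ "$n" -eq 4 ]; then echo broken > status.txt; fi
    echo "$n" > "file$n.txt"
    git add .
    git commit -q -m "commit $n"
    if [ "$n" -eq 4 ]; then culprit=$(git rev-parse HEAD); fi
done

# Bisect between the known-good first commit and the bad tip. The test
# command exits 0 for a good commit and non-zero for a bad one.
git bisect start HEAD "$good" >/dev/null
git bisect run sh -c '! grep -q broken status.txt' >/dev/null
first_bad=$(git rev-parse refs/bisect/bad)
first_bad_subject=$(git log -1 --format=%s "$first_bad")
git bisect reset >/dev/null 2>&1
echo "first bad commit: $first_bad_subject"
```

The result is only this clean because every commit either passes or fails for a single reason; a history where unrelated breakage lingers across many commits makes the good/bad verdicts much harder to trust.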

mixmastamyk · on May 10, 2019

Bisect still works anyway. For non-huge projects, all this obsession with tool minutiae is a waste of time.

As mentioned by another poster, the fact that a whole website is needed to explain the concept illustrates the design and UI failure.

This is an uphill battle that can't be won until a next-generation interface becomes usable by mortals. If that can't be done due to complexity, it's a lost cause for average developers paid for delivering business value.

zemo · on May 10, 2019

You're talking about a UI failure but you're not actually considering the entire experience. Reading the history of the project is a big part of the experience, especially for people in team lead roles. How much of your experience is writing code versus reading code? A merge-oriented workflow often results in a history that is deeply confusing to read. The only part of the experience that you're considering is the authorship of commits, not the utilization of the project's history. I've found a rebase-oriented workflow to result in a significantly more usable repository inasmuch as the history is significantly easier to understand.

mixmastamyk · on May 10, 2019

History has been useful to me at a high level, inspecting grains of sand at the beach, not so much. I can imagine it might be useful to some like linux kernel devs, but nowhere I've ever worked over a long career.

zemo · on May 10, 2019

it's not "inspecting grains of sand". If you have any number of developers worth talking about, merge-oriented workflows can create incredibly unwieldy git histories very quickly. I'm talking about, at a very high level, just understanding the history of the project. If you have five feature branches going on, and your developers are all committing on a regular basis, understanding the history and cadence of your project work based on the git history is incredibly difficult when the events of the separate feature branches are all intermingled.

Your comparison is not apt in many ways. For one thing, open-source governance is extremely different than managing private codebases maintained by a single company. Nobody in open-source governance is reading git histories to figure out whether employees are struggling, who is performing, who is not performing, who is overworking, where the project is, whether or not people are duplicating their efforts, how to report progress to clients, etc. And besides, the Linux kernel uses an email-based, merge-oriented workflow anyway. Kernel patches are submitted via email. That's not at all representative of any company that I have ever worked for or any company that I know of. The Linux kernel history also has a network graph that is literally unviewable on Github because it's so complex. Again, not representative of 95% of the work for 95% of users on 95% of projects.

mixmastamyk · on May 10, 2019

It's easy enough to select commits from one user, or squash whole branches. Our devs are judged on completed features they deliver at acceptable quality, not their commit history.

I'm not saying you shouldn't care, just that I don't believe this strategy will become mainstream unless it gets much easier to understand and use.

ddevault · on May 10, 2019

I've heard this before, and it seems reasonable on the surface. The argument I make against this viewpoint is: "git rebase gives us powerful tools that allow us to curate a good commit history in the same way we use refactoring to uphold good software design practices."

gshulegaard · on May 10, 2019

My biggest issue with that argument though is what constitutes "good" is subjective. For me a good commit history is one that faithfully chronicles what happened.

With this in mind, acceptable curation of the history for me is squashing or separating commits and neither of these require rebase. But I wouldn't object if a colleague chose to use rebase to accomplish this.

I would, however, object to reordering commits or rebasing since this alters the chronicle.

Which I guess leaves me with my opinion on rebase:

You can, but you don't need to. If you are going to, be sure you understand what you are doing and don't alter the chronicle.

fulafel · on May 10, 2019

PRs are a good unit of changes for examining meaningful units of changes. No reason to lose info about how the sausage was actually made, it is also a valuable record

mixmastamyk · on May 10, 2019

Most shops don't follow good software design practices, so how likely is it to get a practice one-step-removed from that, with a difficult UI to boot, adopted?

ddevault · on May 10, 2019

So you're saying we shouldn't argue for the adoption of good software design practices? There are a lot of software teams that do care, you know.

mixmastamyk · on May 10, 2019

But this isn't that. It's one step removed. Maybe if the UI gets better. Even then an uphill battle.