Ask HN: What is your Git commit/push flow?
199 points by nineplay on March 17, 2022 | 191 comments
I've long been in the practice of "commit early, commit often". If one use case works I commit, if the unit tests pass I commit. The code may be a mess, the variables may have names like 'foo' and 'bar' but I commit to have a last known good state. If I start mass refactoring and break the unit tests, I can revert everything and start over.

I also push often because I'm forever aware disks can fail. I'm not leaving a day's worth of work on my local drive and hoping it's there the next morning.

I've become increasingly aware that my coworkers have nice clean commit histories. When I look at their PRs, there are 2-4 commits and each is a clean, completely functioning feature. No "fix misspellings and whitespace" comments.

What flow do you follow?




Here's how I address this problem.

When I'm developing, but before I create a PR, I'll create a bunch of stream-of-consciousness commits. This is stuff like "Fix typo" or "Minor formatting changes" mixed in with actual functional changes.

Right before I create the PR, or push up a shared branch, I do an interactive rebase (git rebase -i).

This allows me to organize my commits. I can squash commits, amend commits, move commits around, rewrite the commit messages, etc.
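
For anyone unfamiliar, a minimal sketch of that step (the hashes and messages here are invented, and I'm assuming the PR targets origin/main):

  git rebase -i origin/main

  # git opens a todo list in your editor, roughly:
  #   pick a1b2c3d Add parser skeleton
  #   squash d4e5f6a Fix typo in parser
  #   reword 9c8b7a6 Wire parser into CLI
  # pick keeps a commit, squash folds it into the one above,
  # reword stops to edit the message; reordering lines reorders commits.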

Eventually I end up with the 2-4 clean commits that your coworkers have. Often I design my commits around "cherry-pick" suitability. The commit might not be able to stand on its own in a PR, but does it represent some reasonably contained portion of the work that could be cherry-picked onto another branch if needed?

Granted, all of the advice above requires you to adhere to a "prefer rebase over merge" workflow, and that has some potential pitfalls, e.g. you need to be aware of the Golden Rule of Rebasing:

https://www.atlassian.com/git/tutorials/merging-vs-rebasing#...

But I vastly prefer this workflow to both "merge only," where you can never get rid of those stream-of-consciousness commits, and "squash everything," where every PR ends up with a single commit, even if it would be more useful to have multiple commits that could be potentially cherry-picked.


I do this (kind of too), but instead of an interactive rebase I just do a `git reset --soft <target_branch>`, where target_branch is the local (and up-to-date) copy of my target branch. That gets me one clean commit that I can force push up to replace my branch @ remote.
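
A sketch of that flow, assuming the target branch is main and the feature branch is checked out (--force-with-lease here is a safer stand-in for a plain force push):

  git checkout main && git pull       # bring the local target branch up to date
  git checkout my-feature
  git reset --soft main               # branch now points at main; all changes staged
  git commit -m "Implement feature X"
  git push --force-with-lease         # replace the branch on the remote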

(This works for me because auditing commit history is not important where I work, if it were I would organize commits better.)


I first did rebase, and after a while I realized I was not getting much out of it, since I mostly wanted to merge everything down to a single commit.


This is the ideal workflow for me since I think merge commits clutter up the log, but it falls apart if people aren't consistent in following it.

*glances at $corp git repo and sees 'updates' 'fix' 'updates'.* Sigh.


I've always been fond of the "WIP" commits.


...what is the golden rule of rebasing?

edit: googled, "Never rebase while on a public branch" i.e. a shared branch


It usually works out fine when you do a `git pull --rebase`, but not everyone does this or has it set up, so pulling might have some nasty effects. Generally it helps to consider a feature branch as a private branch. Don't push to other people's features without asking, don't fuck up other people's work.


Everyone absolutely should configure that. (`git config pull.rebase true`.)

Such an annoying mess it leaves otherwise. And CI is building 'merge branch master', on the master branch, great.


Are there any downsides to doing this? Why isn't it on by default?


I suppose just the usual 'rebase vs. merge' - i.e. if there are conflicts to resolve they will be dealt with commit-by-commit in the usual way when rebasing, whereas without the option set they will be dealt with all at once 'merge-style'. I happen to think that's a feature, but I know some don't like it.

I think there's less argument in favour of merge than usual with pulls though, since a pull is much less like a semantic merge to begin with, so there's little meaningful history for git-merge to preserve.

I'm certainly not aware of any objective downside/gotcha/'oh but it doesn't work when...', no.

The docs only add that you might further want to set it to `interactive` or `merges` rather than `true`, for the effect of rebasing with those options: https://git-scm.com/docs/git-config#Documentation/git-config...
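
For reference, the three variants as commands:

  git config --global pull.rebase true         # plain rebase on every pull
  git config --global pull.rebase merges       # also preserve your local merge commits
  git config --global pull.rebase interactive  # pull drops into an interactive rebase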


> Why isn't it on by default?

Because git was written primarily for the Linux kernel and the defaults reflect that. The workflow of the Linux kernel is completely different from what most people outside of it do with git. "git pull" is used for opposite purposes in both worlds.


I think it's the usually public vs private branch issue.

If you're merging a public branch with another public branch, then if you do a rebase rather than a merge commit, you mess up the history for anyone who has pulled that branch.

For private branches that isn't an issue.


Combine with `git config --global rebase.autostash true` for your own protection.


FWIW I set pull.ff to only, since I don't want merge commits or rebases happening without explicitly calling pull --rebase or --merge.


> And CI is building 'merge branch master', on the master branch, great.

How does one set this up in GitHub actions?


By default it shows the commit message, doesn't it? At least I'm not aware I've done anything for e.g. https://github.com/OJFord/terraform-provider-wireguard/actio...

The annoyance I'm describing is that when the commit message is 'merge branch master' (and especially if, as the label next to it shows, it is the master branch) this is crap and useless, and it hides the 'real' commits behind it that the committer had locally while behind the remote. If they had `git pull --rebase`d (or `git pull` with the config option set), the commit message would be that of the latest 'real' one.


Yes, if you're going to rebase, rebase your feature branch. Do not do it on the shared staging/dev etc.


This is very close to my approach. I work in a private branch and make many commits along the way, but then organize it to tell a coherent story for the actual PR.


I do the same. A clean and logical history helps other people understand your code and might consequently increase the chances of getting it merged.


I do the same thing generally. In fact, reading over my own code before merging, I usually find multiple mistakes or changes to make, which means figuring out which commit to fold each fix into (git absorb helps in the easy cases) and even more rebasing, or giving up and adding an extra commit at the end.

I see some people whose projects (Furnace Tracker, PipeWire, previously FamiStudio) seem to make progress very quickly without getting noticeably slowed down by technical debt, despite sloppy programming and unorganized commit logs (push-to-head). Meanwhile I move slowly, dread reviewing hundreds of lines of my own code, and produce technical debt (regrets) anyway, not as many surface-level lintable errors but plenty of entrenched mistakes. I wish I could move faster, but instead struggle to make progress.


The only problem I have with this workflow in the command line is that I would like to be able to split changes to the same file across multiple commits. I think some GUI tools enable this; does anyone know about them?


I do this via git add -p, which breaks your changeset down into atomic patches that you can stage, skip, split, or edit before making a commit. You can turn one file change into many commits this way, if need be.
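
For example (the keys shown are git's standard patch-mode prompts):

  git add -p
  # for each hunk git asks: Stage this hunk [y,n,q,a,d,s,e,?]?
  #   y = stage it, n = skip it, s = split it (when possible), e = edit it by hand
  git commit -m "First logical change"
  git add -p                            # stage the next slice, even of the same files
  git commit -m "Second logical change"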


Have a look at `tig`. It's even included in Git for Windows now and does this reasonably well.

Only the keybindings are a bit weird if you're not accustomed to Vim bindings:

- Open tig

- Change into the staging view with `s`

- Select your file using the arrow or `j` and `k` keys

- Press Return to show the diff

- Navigate to the line(s) in question with `j` and `k` (arrow keys will switch files)

- Stage parts with `1` (single line), `2` (chunk parts), `u` (chunks) or split chunks with `\`

- "Leave" the diff with `q`

- You can find the keybindings with `h` in the help screen, which also uses Vim keys -- like manpages usually do


"s" is status view, which you can also get at by running "tig status" directly.

"c" is staging view.

> - Stage parts with `1` (single line), `2` (chunk parts), `u` (chunks) or split chunks with `\`

You can also revert a chunk or file by using "!". Sometimes this is very useful.


Ah, thanks for the correction :)


I used tig for a bit because it was the nicest way I could find to do line-level staging in a terminal. But I was really impressed with gitui https://github.com/extrawurst/gitui so I've switched to that.


If you're a vim guy, I use `git difftool` setup with `vimdiff` for this. Let's say you have your changes in a branch CHANGES on top of public branch PUBLIC.

1. I `git checkout PUBLIC -b CLEANUP` to a new branch.

2. Do a `git difftool CHANGES`, which opens each changed file in vimdiff one at a time.

3. For each file, I use :diffput/:diffget or just edit in changes I want.

4. Commit these changes on the CLEANUP branch.

5. Use `git difftool CHANGES` again to see the remaining diff.

6. Repeat until the diff comes back empty!

My unstructured changes tend to contain a handful of small typo fixes, whitespace changes, localized refactors, and 1 or 2 larger refactors and a behavioral change. Once they're all broken out, it's usually easy enough to use `git rebase -i` to reorder the smaller changes first, put out PRs for just those first, etc.


I use the interactive flag on git add for this. It lets you add parts of a file, commit, and then do it all again. I want to say it's git add -i, but I'm not 100% on that. My fingers just do the right thing when I want it to happen.


It's "p" for "partial"


I use Sourcetree [0] for this (on Mac, but a Windows version is available).

I've tried most Git clients on Mac over the years and kept gravitating back to Sourcetree.

I only tend to use it for this particular workflow (picking out very granular changes on a line-by-line basis). Otherwise, 90% of my git stuff is via IDE integrations or command line.

[0] https://www.sourcetreeapp.com


Gitx is great for that on a mac, but every time I set up a new MacBook I have to hunt down the latest repo / build. It keeps getting abandoned and then forked and continued, and then abandoned again...

Edit: looks like it's back from the dead again :) https://github.com/gitx/gitx


The way I deal with this is: 1. try to make commits small enough that you are less likely to split them after the fact; 2. when I need to split a commit, I use the VSCode UI and apply patch by patch. This is one of the rare cases where I use a GUI for git. For most other things the command line is fine.


When you're doing your interactive rebase, find the commit you'd like to split - choose "edit" for that one. Then when you reach that point, you can do `git reset HEAD^1` to bring those changes back onto your staging stack, and make as many commits as you need.
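
A sketch of the full sequence (depth and messages are placeholders):

  git rebase -i HEAD~3        # mark the commit to split as "edit"
  git reset HEAD^             # undo it; its changes return to the working tree
  git add -p                  # stage the first logical piece
  git commit -m "First part"
  git add -A                  # stage the rest
  git commit -m "Second part"
  git rebase --continue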


I can second the recommendation of tig in the sibling comment.

Additionally, git itself comes with a simple `git gui` command that allows you to do partial commits on a line by line basis. It also has a nice "amend last commit" mode.


Sublime Merge handles this really nicely. You can do this in VS Code too but the UI is a little more fiddly (right-click->"Stage Selected Range").


If you have a Mac, check out GitUp. It's a simple but fantastic tool for cleaning up git histories: merging commits, splitting commits, reordering...


Git has a built-in GUI for that. Just run 'git gui'. It's not pretty but it works.


  > git gui
  git: 'gui' is not a git command. See 'git --help'.
  The most similar commands are
   gc
   grep
   init
   pull
   push


It's a separate package in some distros since it's written in Perl and uses Tk. gitk is often in the same package.


Both git-gui and gitk are in Tcl, not Perl.


The most important git feature I discovered was `git add -p`; it lets you both select which patches to stage and review what you are about to stage. Combined with `git commit -v`, this gives you plenty of occasions to review your changes before creating your pull request.

Shameless plug, but here are other efficiency tips I wrote about, for working in a highly demanding environment: https://dimtion.fr/blog/average-engineer-tips/


For the tasks you mentioned, there's also the built-in `git gui` for those who prefer to click.


Where is that? `git: 'gui' is not a git command. See 'git --help'.`


Got it, it is not built-in on all distros. Head to https://git-scm.com/docs/git-gui/ and https://github.com/prati0100/git-gui.git/ for more.


It is part of git, some distros just split it out to a separate package ("git-gui" on Debian) to avoid pulling GUI dependencies in unnecessarily.


Sublime Merge offers a nice GUI for the workflow you're describing with `git add -p`


Here are my last five commit messages: "width*height is area", "fixing stuff i broke", "rm properties we dont need", "rm more useless attributes", "nicer figures". I do my best to keep the code base as clean as possible, but I couldn't care less about keeping the commit history pretty. Any time spent on prettifying git history is better spent on documenting the *existing* code imo.


thank you for saying this. I'm always afraid to. revision control is a necessary safety net and facilitates discussion around changes (PRs). but people act as if the history is somehow _really really important_.

I've seen someone post on HN, apparently seriously, that the history is more important than the source.

I know a (potentially) really good developer that spends his time pulling in the recent patches and reorganizing them to make an alternate history that is prettier somehow.

sure, every once in a while it becomes useful/necessary to bisect, and a 'clean' history might help with that.

but seriously - why do we fetishize this? this is a medium where the amount of writing vastly outweighs the amount of reading.

when people are looking for a bug do they seriously find value in seeing how the code evolved? or do they just figure out why it doesn't work? is there an implicit assumption that the code all worked at some point and the task is to find out when/how it was broken?

just really confused


Yes, going back and understanding why something broke/has changed is incredibly valuable. Often it's not because of one singular decision but a collection of decisions over time that resulted in some behavioural regression. Being able to easily hop through all the commits of the recent past is incredibly valuable for me to understand how we can prevent such errors in the future, not just patch over the current one and move on. Fixing things without considering how we got here I tend to find leads to messy code; extra checks and assertions that aren't necessary if one takes the time to update the underlying assumption or modules that end up too tightly coupled because an extra bit of logic is added to fix that one bug.

Obviously it's possible to go too far; not every commit needs an attached essay. Many of my commits are just "fixed typo" or "added unit test for X", but then sometimes I'll write a short paragraph or two explaining my rationale, referencing the commits that came before.


Yeah I love picking my way thru junk commits/comments. You may as well not use VCS.


what were you hoping to accomplish in the first place?


I look thru history when I am building on top of stuff that already exists. In a complex system that has been around for a while, how else can you figure things out?


It depends on the project. An open source library should probably have a sane history; for a closed-source application, in a lot of cases it doesn't matter so much, and it's often useful to see the whole workflow.


Why would it matter whether the repo was for an open source library or a closed-source application? That seems like an arbitrary distinction.


To me, it doesn't seem arbitrary at all. Depending on the team size, a closed-source application likely has a small number of other developers that have a vested interest (or access, for that matter). Whereas an open-source project could potentially have an orders-of-magnitude larger number of readers. A much larger audience would warrant giving higher priority to a readable/clean commit history.


i think this can be true; in some cases the git history may never be looked at, so I agree effort may be better spent elsewhere


When multiple people work in the repo, I like Squash Merge in GitHub best, because you can still do small commits in your feature branch, and when you merge, it generates a message from all the commit messages (so there is still a trace of the process, but you can get rid of noise like "fixed a typo" with the benefit of hindsight), and the history looks clean because it's merged as a single commit, with no rebase footgun to worry about.

https://docs.github.com/en/pull-requests/collaborating-with-...


IMO this feature destroys the history the developer of the PR should have crafted carefully in the first place. With a squash merge you basically say "I don't give a f..." and remove all traceability from the PR, giving future developers potentially a big headache if they have to figure out why something was done. That's why I think squash merge should _never_ be used and is one of the very big anti-features of github. Commit history has to be crafted like code, by the developer who wrote that code and in a way that other developers can see the steps that were taken to craft that code. Squashing PR commits just removes all of that, resulting in an SVN-style repo with "checkin 2020-01-01" like commits. Yes, there might be more in the commit message, but its value is lost because it is not for a small, possibly atomic change.


The squashed commit message should include comprehensive documentation of what happened in the commit. There's no reason "check-in 2020-01-01" should appear anywhere. In the extremely rare case of needing to see how the commit was written step by step, the PR is still there.


For our company squash merge simplifies cognitive load when a bad commit is deployed to production. There’s exactly one commit that caused the issue and one commit to be reverted. Also very easy in CI to deploy previous commit to revert back to steady state quickly before debugging whatever the issue is.


I used to do exactly what you described a few years ago (when I was learning git for the first time). Not anymore. A few reasons:

- commit early/commit often. I usually push one commit when I think the feature is done. While others do a review of my code, I commit to improve the code/fix the issues found by others. The advantage here is that future readers looking at the history of file X line N can know what other files were introduced alongside file X (as a reader of big codebases, this is a nice side effect). I don't like hiding defects from the git history either (one could in theory squash all the commits of a given PR in order to keep the "history clean"... In my experience, having a trace of the bugs fixed at PR time, and other subtle details, is also worth it and serves as documentation of what not to do).

In the cases where I need to work for many days on a single feature, and only if the feature is so complicated/critical that I cannot reproduce it from scratch by myself, then yes, I push the progress upstream. This is usually not the case though: I stash progress. I tend to open small PRs and usually I remember what I've done (so I could write the entire code again easily). Plus, hard drives fail, sure, but they are also quite reliable. In 20 years of work I have never lost "non critical" work to a disk failure (for critical work, I for sure have a different workflow).


The flow you use will typically depend on the company you are working for. Using regular git (no Github/Gitlab) will often have a different workflow than Github.

My team (and myself) prefer this workflow:

- One commit per PR. This allows for easy reverts and cherry-pick.

- One developer per branch. You can do a few devs per branch, but rebases need to be coordinated and carefully handled, because:

- No merge commits. Only rebase onto latest main. Which means force-pushing PR branches and, thus, rewriting history (other devs working on same branch need to be aware that history has changed).

If you're constantly rebasing onto main, then all of your working commits sit on top of the latest code in main. Which means you do not have to deal with tricky merge conflicts, where your commits may weave in and out of the main branch at various points in time because you were doing "git merge" at random points. In addition, if you squash your commits before doing a rebase this will also make merge conflicts rather trivial, because you're only dealing with one set of merge conflicts on one commit.
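
In command form, that loop is roughly (assuming origin/main is the target):

  git fetch origin
  git rebase origin/main           # replay your commits on top of the latest main
  # resolve any conflicts, then:
  git push --force-with-lease      # rewrite the PR branch on the remote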

That's the big picture, team workflow. For my personal workflow, I rely on "git add -p" and stashes. The only time I do a commit and push up code is: a) when I have a large change and want to make sure I don't lose it or b) others have already reviewed my PR and I want to keep the new changes separate to make their life easier when reviewing a 2nd time. I use "git reset --soft HEAD~<number-of-commits>" to squash instead of "git rebase -i" because I find it easier and quicker.

I must emphasize this point: learn "git add -p". It's extremely useful in the case where you have some changes like debugging code or random unrelated changes that you do not want to commit. It's a filtering mechanism.


> - One developer per branch.

I had never even considered that some teams might have multiple developers active on the same branch.


hrm.... I do this occasionally with some other folks on a project. often it's fe/be concerns - one person needs something added in a payload, for example. Working in their same branch to add that for their environment/branch was much faster than trying to coordinate a series of other changes/branches. It's not something I/we do a lot, but the few times we've done it in the last couple of months people have really liked it, and seemed to feel like we were making faster progress somehow.


Yeah, it's really cool when you're on calls supporting each other and the branch is moving forward twice as fast.


> - No merge commits. Only rebase onto latest main. Which means force-pushing PR branches and, thus, rewriting history (other devs working on same branch need to be aware that history has changed).

How does that even scale? I would imagine that in a team of 10, you would be rebasing 90% of your day and only 10% doing actual work?


I automated constant rebasing. it's a couple of cronjobs for the mirrors and projects I've been maintaining for several years, and the cost is marginal. I get about one failed-rebase email per month.

a big project of mine is about 2500 commits ahead. rebasing this beast is partially automated, but still I get about 2000 upstream changes through once a month. you need scripts to rebase and to roll back after a wrong choice.

it scales trivially.


My flow is:

1.a. a lot of dirty commits/wip commits

1.b. a few clean commits, when I spot changes that I know already form a commit by themselves

Before opening the PR:

2. `git log -p`: I inspect the commits I've done and I decide what should go together, what should be edited and what can stay as it is

3. `git rebase -i`: I apply the changes I've decided during 2

4. repeat 2 and 3 until I'm happy with the results

5. the last `git rebase -i`: reword almost every commit, as almost all the commits at this point have placeholder descriptions

I'm very happy with this strategy. It requires some time to get used to, but in the end my PRs are very clean and well thought out.


One thing that can make rebasing much easier is making use of --fixup and --squash. Often you know at the time that a commit is either fixing a previous commit or that you would want to squash it with a previous commit. This can save a lot of time later as you can simply issue a --autosquash to rebase. If you do it right it means others can rebase your branch too.
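
A sketch, with abc1234 standing in for the hash of the commit being fixed:

  git commit --fixup abc1234           # message becomes "fixup! <subject of abc1234>"
  git commit --squash abc1234          # like --fixup, but keeps its own message to merge in
  git rebase -i --autosquash abc1234^  # todo list arrives pre-ordered with fixups in place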


If you are afraid to lose what you've done, you can stash your minor changes to keep track of the small things you already got working, and then, when you have enough working to make a commit, you do it with the best code you've got. And for the many pushes, you could back up your project files every day. I think that is a much more appropriate way to use the tools that git gives us.

That way, you won't be afraid of losing your recent work by messing something up, because you have the stash, and you won't be afraid of losing your whole project/progress, because you have a recent backup of it.

For instance, I have a backup script that runs every time I shut down my work computer, so I won't have to worry if my hard drive suddenly gives up on everything.


Create a feature branch for everything I work on. Then squash all commits on that branch for merge. If that doesn't look clean because there are too many features munged together, then split that feature branch into multiple ones (ideally I would have already realized this ahead of time), then squash and merge each of those. So then the main branch commit history is only "clean, completely functioning features", plus a link back to the PR that has the full history of commits that got squashed, if you want more insight into the development process. The github UI simplifies a lot of this process and makes it a lot nicer to view.


Squash-merge PRs. You can configure this in GitLab, GitHub, and Azure DevOps. Your private commits can be whatever you want and they get rolled up to a single PR commit when your working branch is merged to trunk.


Yes. Much easier to roll back things, cherry pick things and blame people ;) Don't you dare merge a PR with 30 commits.


+1 to this. I like the mantra that every commit is deployable


"Every commit is deployable" isn't at odds with merging multi-commit MRs, provided that the submitter actually did their job well. I find squash-merging to be an ugly workaround, not a proper solution. Good non-trivial merge requests should usually consist of multiple commits that improve readability, revertability, cherry-pickability and blamability ;)


The git history on my personal/dev machine is an absolute mess that I try to optimize for `git bisect`, so preferably each commit should compile (and I usually commit whenever a new part of the requirements is introduced/a new test passes).
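
For anyone unfamiliar, the bisect session such a history is optimized for looks like this (v1.2 is a placeholder for a known-good ref):

  git bisect start
  git bisect bad HEAD       # current commit is broken
  git bisect good v1.2      # last known-good tag or commit
  # git checks out the midpoint; build/test it, then mark it:
  git bisect good           # or: git bisect bad
  # repeat until git names the first bad commit, then:
  git bisect reset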

The way our code review system works is essentially each upload is a commit, so the review's history is like a compressed version of the local git history. I usually upload when I think the branch is "complete"/will pass all of the CI, and ~once per round of comments.

The main git history is squash-and-rebase, so one commit per code-review, with the CR description serving as the commit message (with automatic links to bugs and code reviews and the like).

I personally like the squash-and-merge strategy for the main branch, it makes the most sense (especially when the code review's history is still around after)


I barely skimmed what's below so maybe someone else has covered this, maybe not.

If I need to hack on a task and I know I'm the only person that's going to work on the branch I amend changes to my commit basically constantly. The only reason I might not is if I want multiple commits on the branch to make review simpler for others.

```

git fetch && git checkout main

git branch feature/JIRA-###/words-describing-it

git checkout feature/JIRA-###/words-describing-it

<do some work>

git add .

git commit -m "JIRA-###: words / description"

git push -u origin feature/JIRA-###/words-describing-it

<review CI build, make more changes>

git add .

git amend

git push -f

```

`git amend` is an alias in my ~/.gitconfig

```

#cat ~/.gitconfig

<snip>

[alias]

        amend = commit --amend --no-edit
```

So most of my commits I just amend the first commit I made, force push, review CI build results, rinse repeat until it's ready for review.

Simplest thing that works when it's just my own work on a branch. I'll also do a rebase pre PR review if need be.

```

git fetch && git rebase -i origin/main

git status

<do what it says if anything>

git add <fixed files>

git rebase --continue

git push -f

```


I do the same, but before pushing, I squash the commits to remove the small intermediate commits. That may be what your coworkers are doing as well.


Yeah.

Consider I have a feature branch that no-one depends on. When my history is messed up I do an easy rebase:

  git reset --soft master (or whatever commit up to which I want to UNDO, but no further than the commit I branched off from)
Now at this point changes from multiple commits are in my STAGING.

  git commit -m "Feature X implemented"
  git push --force (you can do this if no-one depends on your branch; otherwise, don't)


Ohhh, you trickster! Going to start doing this.


I often commit as well to checkpoint, and push those onto a branch organized under my username as a backup (<username>/<branch> to make use of ref folders). When I’m done, I use interactive rebase to clean the branch up, writing meaningful commit messages for each unit of change (answering the question “why does this commit exist?” usually). I force push that cleaned up branch and then open a PR.

My view is that final commits are a form of communication, and deserve some intention. I’ve thanked myself when I’ve looked years back at work I’ve done and been able to figure out not only the change, but also my own state of mind.


At the end you can just do an interactive rebase and squish away the commits that don't clearly explain what functionality you've added.

I believe you can also configure your PR software to attempt to squish for you? I'm not sure about that, but I think I've seen an option in the settings somewhere.


I see no value in shorter "clean" commit histories in a feature branch. In fact, if you need that, it might imply that the PR is too large to look at holistically, which is a red flag.

Half my PRs probably have a commit that just says "mashed potatoes", or 5 commits in a row that say "maybe it works now?". It doesn't matter, it's going to get rebased/squashed to a single commit in the merge process anyway (which IMO everyone should be doing; true merges from feature branches to mainline are bad).


I usually squash my branches to single commits in the end and write a nice message then. There are times when I don’t do this. For example, yesterday I was moving existing code to a separate project and I first committed the original version and then made necessary changes in my next commit. This made it clear where things differed from the original.

Overall I think commits that land in main should be worthy of landing in main. Random crap you tried isn’t important. Larger self-contained changes are.


I often have several "topics" going on in the same branch, for which there are existing (unpublished) commits.

When I make some small change, it usually belongs to one of these topics. If it's the topmost commit, of course it can be combined into it with "git commit --amend --patch".

If the small change is for one of the commits behind that one, then I make fixup commits with "git commit --fixup <hash>" or "git commit --fixup HEAD^^^" (as many carets as needed to refer to the commit I want).

If the little fix requires an addition to the commit message, then --squash instead of fixup.

These little fixup/squash commits can then be squashed into their target commits using "git rebase -i --autosquash".

That may be how some of your coworkers have clean histories.

In some environments, a change made up of numerous little commits like "fix typo" wouldn't pass peer review; you're supposed to know how to squash things together (but not so much that topics get inappropriately combined).

Some shops have a policy that every commit has to build and possibly also pass the test cases. So you can't have an "oops, add missing semicolon" commit; its parent wouldn't build.

Commits should be like Stack Overflow answers: "is this useful to future visitors of this git history?"


Pretty similar to you: I commit as often as possible, but I also push whenever I commit so that it's backed up. Then once it's ready for review, I do an interactive rebase to try and group the commits into useful chunks for the people reviewing, and to improve the commit messages.

Git is crazy complicated under the hood which is awesome when you mess up and need to recover something, but in general my goal is to use as few git commands as possible and keep everything as clean as possible.


>I also push often because I'm forever aware disks can fail. I'm not leaving a day's worth of work on my local drive and hoping it's there the next morning.

I separate file backup and version control. I keep every git repository I'm working on in Dropbox, and don't ever worry about how often I'm committing and pushing. I don't think git should be considered a substitute for a proper continuous backup system.


> I separate file backup and version control. I keep every git repository I'm working on in Dropbox, and don't ever worry about how often I'm committing and pushing.

Motivated by a slightly different use case (seamless syncing across multiple machines), I built a custom tool that solves this concern within `git`: https://github.com/rraval/git-nomad

You can toss `git nomad sync` into a systemd timer (or cronjob if you prefer) and forget about it.


Same. I used to make a commit every time I got up so as not to lose work, but that just littered the history. Then I switched to keeping it in Dropbox or using Syncthing for work stuff, so now I only have to commit when I get to a stopping point or to share with others.


Yep. Dropbox is also a great safety net for when I mess something up in between commits. I've had a few times where I've deleted or changed some code that I couldn't restore just via undos. Normally I'd pretty much be screwed, but Dropbox retains a diff every time a file is saved, so it's basically impossible for me to lose anything.


Agreed but pushing at least at the end of the day should be required just in case somebody needs to pick up your work for some reason.


The #1 rule I follow (and push everyone to follow) is: "Don't write shitty commit messages." It helps in general with everything related to Git.

Otherwise I decide ahead of time and focus on one area at a time that I know is a small chunk of functionality (like adding boilerplate for a small script, unit testing a piece of functionality, implementing an endpoint), and as long as I'm working on it I just use `git add -A; git commit --amend --no-edit`. Once I move on to a new self-contained batch of functionality I make a new commit and repeat the amending. It requires a bit of discipline to keep commits small enough, but I like it since it's an easy process to follow.

Before I make a review request I usually do an interactive rebase and reword commit messages, sometimes reorder them and squash small stuff if it makes sense. After a review is in process I generally no longer rebase (to keep review transparent) and commit each fix separately.


I commit as soon as a single piece of functionality makes sense on its own. Simple example: if function X uses functions A, B and C, I'll commit after finishing A, B, or C. And once X is done, I create a separate commit. I also ensure the code still runs most of the time. How I group them depends on how complicated it is. The more complex, the more commits.

This helps me write code that is more standalone (easier to replace & test). It also helps to clear the cognitive burden when developing larger features. Anything that's committed is pretty much "done". So if I leave any work mid-way, I can simply look at the untracked files to know where I was at.

Then at the end of the work/PR, I go through all the changes to make sure everything makes sense.

In practice, this means that the first code I write is often the last I commit (see my simple example above).


I have three branches: macbook, imac, and develop.

develop is the mainline branch that other people could pull from.

macbook and imac are each backups of their respective machines. I use git merge --ff-only to fast-forward commits from one to the other when I change machines. If I need to change machines midway through a commit, I'll push a WIP commit to imac, fast-forward macbook to match, and then git reset HEAD^ to keep the WIP files (but undo the commit) on macbook. I finish on macbook, make the commit, and push it to origin's macbook and develop.
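
A sketch of that mid-commit machine switch, using the branch names above:

  # on imac: park the work in progress
  git commit -m "WIP"
  git push origin imac
  # on macbook: catch up and unpack
  git fetch origin
  git merge --ff-only origin/imac
  git reset HEAD^           # drop the WIP commit but keep its files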

Like the others here, I make liberal use of git rebase -i to reorder commits, etc. If imac and macbook ever diverge (e.g. I forgot to push a commit before changing machines), I also use interactive rebase to resolve it.


git commit -m "Initial commit"

git commit -m "Add some stuff"

git commit -m "progress"

git commit -m "Add library"

git commit -m "Drop library"

git commit -m "Let's try this library"

git commit -m "Slider is working"

git commit -m "Fixing transparency issue"

git commit -m "Fixed"

git commit -m "I hate you"


Not entirely unlike yours, but I also very often use `git commit --fixup` and let the computer clean up large portions of my messes with `git rebase -i --autosquash` later.

That is, nowadays I usually do the same, but through Git Fork [1] (via its custom command system), and it works even better than the terminal interface for me -- which doesn't happen often. But for fixups the "find the target commit in the history and use a context menu entry to flag your commit flow" really works rather nicely.

Also, Fork's interactive rebasing dialog does `--autosquash` by default.

[1]: https://git-fork.com


All I use in git is clone, pull, commit, push. I do not use feature branches. The only branches I use are for releases that need to be maintained separately from master.

I absolutely do not care if there are "fix typo" or WIP commits in the history. The amount of effort it takes to get a "clean" commit history has negative payback in my experience.

It should be said I am a lone developer for the most part, sometimes working with one or two other people. So Git in fact is overkill for that scenario. Subversion would do just as well and in fact I prefer it or mercurial.

In a large project such as Git is meant for, my usage probably doesn't apply.


I do exactly as you say: commit early and commit often. However, something I find quite successful when publishing a Pull Request (aka CR/MR), especially large ones, is that I'll use git rebase to group chunk diffs together in a logical flow to create an easy-to-review set of commits.

If I have 5 commits that are all whitespace changes, I'll use git rebase to group them into one commit before I push.

If it's a large, complex PR, then I'll reorder my commits and pick-and-choose chunks from the many commits so that they make sense in a progression. For example, I might just add one new component in the first commit. Add a second component in the second commit. Third commit might be integrating those two components into the system.

If my PR also includes "cleanup", like running black on Python, then I'll make that its own commit instead of littering actual functional changes within cosmetic ones.

I try to make it as easy as possible for my reviewers to review my code and actually catch mistakes. I also tell the reviewers to look at the commits separately and not to look at the entire diff, which can sometimes be hundreds or thousands of lines of code. Even a large PR with 500+ lines of changes can be easy to review if you structure the commits right. (It's really like many PRs within one PR.)

You may ask, why not have separate PRs then for each commit? Sometimes, that makes sense when it's so large that it's worth having an integration branch that lasts weeks or months. Sometimes, it's not possible to just merge that one commit due to problems it'll cause with compilation or linting or other issues. Also, with separate PRs, it can be more costly (in both CICD resources and the dev's time) to do automated testing on each commit separately.

With some practice, the git rebasing and reordering or restructuring commits is not as difficult as it sounds, especially with the aid of a good git GUI (I like Fork.app) that lets me choose exactly the chunk to commit from a particular diff.

Once I got really comfortable with this workflow, what I actually started doing is... when I'm done with something and ready to put up a PR, I'll flatten (fixup) all my commits into one, and then I'll revert it and commit that. Then I revert the revert, but instead of committing that revert-of-the-revert, I'll stage it. This gives me all of my changes in a staged state. Then I'll pick-and-choose chunks from that entire diff to craft a "narrative" like I was describing before. Once I have everything in the right order, I'll drop the first two commits to get rid of the initial and revert commits.


I commit as cleanly as possible; each commit should be a functioning feature or a part of one. I try to do small ones, so generally each commit is just one item from the todo list I keep for the entire feature. That actually helps me to mentally move on as well. If I need to rename some stuff I might squash commits, but no wip commits in the actual history.

End of day I usually commit a "temp" commit with a few comments to myself and push that to the remote branch, revert that the next day and force push over it for the next commit.


I say you have it right, and your coworkers less so. Encourage your coworkers to be secure in their work. Show your struggles, false starts, mistakes, typo fixes. Embrace the messy history!

My last few teams' workflows and projects were such that commit history wasn't really a big deal. I'm skeptical that it matters if the dev-to-production process is smooth.

These tips are habits we have:

Keep commit message first line to under 50 characters, use imperative mood, capitalized. If you need more than that to describe what's going on then the commit might be too big.

PRs must pass all continuous integration tests + 1 or 2 devs approval. Pretty much always do what a dev suggests in review, even if you disagree.

Optionally, squash merge PRs to master, to make reverting easier. I prefer that.

If a feature is very complicated, break it down into smaller tasks, merging each to master when completed.

Maybe use a feature branch for super complicated, inextricable tasks, but doing that too often is a code smell.

In that situation, code history isn't really that important.

That said, if you're still keen, look up these commands:

  git commit --fixup {commit id}
  git reset --soft {something}
  git rebase -i --autosquash {commit id}
Lots of force pushing will be needed, but never, ever force push to a shared branch unless everyone agrees and is aware.

Those can straighten your history right out.


> Pretty much always do what a dev suggests in review, even if you disagree.

I disagree.


Sure, if there's some kind of egregious error or objectively bad suggestion, push back, but generally give the other devs the benefit of the doubt. I'm assuming you're working with professionals.


I try to make clean commits that do one thing only. If I have trouble writing a meaningful commit message, it means I have failed and should do a better job next time.

That said, I often make a messy "wip" commit that I push to my branch, just so that the work doesn't get lost. But I always undo such a commit and clean it up.

Also, I always use git add -p, so that I can break changes into multiple meaningful commits and review them one more time before pushing.


You can have the best of both worlds -- "commit early, commit often", and "nice clean commit histories" -- with git, and it is easier than most people think.

- So you start work on a new branch, and reach a checkpoint. Create a new commit with "git commit".

- Continue working, and when you reach the next checkpoint, create another commit. But this time, use "git commit --amend". Contrary to what the flag says, this doesn't modify the previous commit, instead it modifies your commit history and replaces the last commit with a new one[1]. So instead of having two new commits in your branch's history, you only have one.

- Repeat the process until you have something worth pushing to the remote.

- Once you've pushed to remote, remember not to make any further modifications to your git history[2], i.e. going forward, create a new commit and only then run "git commit --amend".

Now, there's an obvious question here: if "git commit --amend" keeps replacing the last commit with a new one, what happens when you mess things up and want to revert to NOT the last checkpoint, but some checkpoint before that?

The trick here that git has up its sleeve is called "git reflog". You see, while "git commit --amend" replaces the last commit in your branch's history with a new one, the older commit is not actually gone; it's still there. And you can see all of them with "git reflog". Basically "git reflog" returns every commit at which the local HEAD (i.e. the commit checked out in your working directory) has been. So none of the commits that you replaced with --amend are actually lost, and you can find them all in the reflog.

Restoring to an older checkpoint becomes as easy as running "git reset --hard <previous-commit-id>", or if you want to play safe, "git checkout -b <new-branch-name> <previous-commit-id>".
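
For example (the sample output and hashes are invented; HEAD@{2} means "where HEAD was two moves ago"):

  git reflog
  # e7f8a9b HEAD@{0}: commit (amend): Add retry logic
  # c4d5e6f HEAD@{1}: commit (amend): Add retry logic
  # a1b2c3d HEAD@{2}: commit: Add retry logic
  git checkout -b rescue HEAD@{2}   # branch off the older checkpoint without losing anything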

Hope this helps!

Notes:

1. By design, a commit, or in fact most objects in git cannot be modified. They are immutable.

2. Basically you can play with git history as long as it is private to you and not shared with others. But the moment you push it to a remote you share with others it's no longer private and you must not modify that history anymore. Read Linus's note on keeping a clean git history: https://www.mail-archive.com/dri-devel@lists.sourceforge.net...


I like --amend a lot too. Usually a new feature requires a few steps. I commit every step and keep a list in the git commit body. It's basically almost the same as rebasing at the end, but amending seems more logical to me.


I commit often (as long as it compiles, passes the linter, and passes the fast unit tests if they exist), then I rewrite the history. If I work alone on the feature (which is usually the case) I also push quite often to a private branch (mostly for backup, and for running the CI). If needed, I also rewrite the history of my private branch and `push -f` (even though this isn't ideal...).

My PRs are clean; each commit passes the CI and is its own thing.


I use Sublime Merge (https://www.sublimemerge.com) for all git commands. For automating commits on things like curated lists / wikis / .. I use gitupdate (https://github.com/nikitavoloboev/gitupdate)


I use stgit[1] which makes it easy to polish patches (commits), re-order patches, format and email patches, etc. It also makes it very easy to keep a lot of balls in the air at once, working on lots of commits all at the same time while keeping them reasonably separated from one another. I've had maybe 50 to 100 patches going at once from time to time (but usually more like 5 to 10). Then there are guys like Andrew Morton who's known to juggle 1000s of patches at a time using ancient home grown scripts that are the primordial ancestor of quilt[2] and stgit.

[1] https://stacked-git.github.io/ [2] https://en.wikipedia.org/wiki/Quilt_(software)

Edit: forgot to mention, once I'm happy with a given patch/commit, or series of patches/commits, I then cherry-pick or merge them from my patches branch over to the "real" branch, and then push them.


It works, I stage (git add), I refactor, it breaks, I checkout (preserves staged work), refactor again, commit push.

Also a fan of numerous small commits - makes good documentation and eliminates wasted time untangling hours of thought.

And when I'm reviewing the last thing I want is for someone to waste time doing me the favor of burying their journey into some "perfect" mega-commit


I've read a fair number of the top level comments and I don't see anyone who develops like me, so I'll throw my hat into the ring. I'll start by saying that I've always worked at companies where the norm is trunk-based development, where the norm is to develop small changes (< 500loc, typically < 100loc), get them reviewed, and land them as a single commit with a nice message on the tip of the trunk branch.

The way I build these commits is to check out the main branch and start coding. As I write code/tests/documentation I start staging (git add) and committing (git commit) into a single commit. Subsequent commits are "git commit --amend". When I commit I usually update the commit message with any details about the stuff I'm adding.

Changes in the commit I'm pretty sure are going to be in the CR, staged changes I'm not sure about, and unstaged changes are "what I'm working on now". This way, "git diff" shows me what I'm working now (in case I'm interrupted) and "git show" shows me what I'm ready to push. If I want to sync with the trunk, I can do "git stash && git rebase && git stash pop", which mostly works well though sometimes there are merge conflicts that need resolving.

When I'm ready for review, I rebase my single commit, rerun any tests, do a git show and review the change top to bottom, then create the code review. Then I review the change again in the code review tool, where I quite frequently find silly things I missed, like typos in documentation or print statements I forgot to remove. Then I add reviewers.

The only downside to this I have is when I want to back out changes I've already committed. Git doesn't give you a great way to do this, but I've written an alias that can remove all changes to a file from the commit at the tip of the branch. I use this a couple of times a year.


I squash all commits in a PR into a clean commit. I don't see any benefit to knowing when or what someone did over the past few weeks to get to the point where we're updating the main branch, and I don't want them to waste time either. If your PR is so large you need to review it as individual commits, maybe your PRs are too big.


My opinion is that the time for clean commits is the merge commits created by PRs. Git uses a two dimensional commit data structure and most git commands today take a --first-parent argument that gives you a smart, clean view of just the merge commits. It's only a bit of config work to make those your default aliases, and I just wish more UI tools defaulted to something more like it.

I don't necessarily care how "messy" the commits inside a PR are at that point if my default view is at the merge commit/PR level.
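
For example, to get the PR-level view of main:

  git log --first-parent --oneline main    # one line per merge/PR landed on main
  # or bake it into an alias:
  git config --global alias.lg "log --first-parent --oneline"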

Teaching your coworkers how to use --first-parent may be beneficial to everyone and less stress than worrying about the peer pressure to conform to smaller "cleaner" commits.

That said, making cleaner commits is sometimes a useful skill to learn for yourself. Sure, it can make code reviews easier, but these are advanced skills that take practice, and most PR tools aren't particularly good at reviewing a commit at a time anyway, so most PR reviews are entire-PR-at-a-time. I use tools like git add -p and git rebase --interactive as I feel necessary to clean up a commit narrative, but I also have a strong understanding of when not to force push and have years of "training" in these commands. I don't expect and don't want to expect junior devs to use them. I'm happy with the --first-parent approach to the DAG. I'm also always happy to teach junior devs how to use git add -p and git rebase --interactive, but only when it is they who want to improve their skills.

A semi-related thing here is that the other way to remove a lot of "fix misspellings and whitespace" commits is to automate them away. Standardized formatters like prettier remove most of the manual effort of whitespace management. They can be setup to format on save in editors like VS Code. They can be added to pre-commit hooks. Similarly, many misspelling problems (certainly not all) get automated away with type systems and linters.


always use `git add -p`. review each individual hunk before staging it into a commit. don't blindly add everything.

once you've got it working, do a second pass on the commit history and compress it into one or more coherent logical commits, reordering or squashing changes as necessary. learn the interactive rebase interface. lots of operations of the form `git rebase -i HEAD^^^^^^`

where necessary, if you need to erase your old messy history from a feature branch and overwrite it with your tidied up commits, `git push --force-with-lease some-remote some-feature-branch`. in general, avoid using `--force`, and if you ever believe you need `--force`, prefer `--force-with-lease` . the latter avoids some failure modes related to race conditions where a coworker pushes some commits and your concurrent `--force` push that lands a few seconds later blows them all away.


Rebase is the key.

There are two types of commit: checkpoints and versions.

Checkpoints are just you randomly deciding to "save" your progress. They don't necessarily correspond to completely working versions. The commit messages can be completely inappropriate for sharing but only make sense to you.

Versions are fully working copies of the program that could be checked out, run, released etc. These are what you present to your team and what gets merged in the end.

Sometimes you can write a new version straight away, but other times you'll generate checkpoints first. Then you use rebase to get to versions. Use rebase as a way to make it look like you are a superhuman writing perfect commits first time.

Getting good at rebasing takes some practice and pretty good understanding of git's data model. Learn this. And learn how to use --fixup, --squash and --autosquash to make your life easier.


I tend to keep my PRs small enough that several can be submitted within a day. This being said, I tend to keep my changes unstaged.

When I'm ready to commit:

1. `git diff` to get an overall picture of what changes were made. Which parts of this diff can be packaged into an isolated commit?

2. `git add -p` This is where I selectively stage bits.

3. `git diff --cached` to verify that the staged items are all in place

4. `git commit` with a detailed message.

5. Repeat steps 1-4 until all changes have been committed.

6. `git fetch origin main && git rebase origin/main`

7. Finally, `git push`

When PR feedback is left by peers, some teammates prefer you to not rewrite commits and force push. This makes re-review easier for them (especially if you use the Github features around PR review).

I opt for rewriting commits if it's okay with team members. This way you don't have "fix typo" commits getting merged into the main branch.

Edit: formatting


`[commit] verbose = true` in .gitconfig makes the `git diff --cached` step unnecessary, as you then get the diff displayed in the commit message editor (I'm baffled why this isn't the default)
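
The command-line equivalent, for anyone who prefers it:

  git config --global commit.verbose true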


I think part of it depends on the tooling that you're using and what your review system supports. Assuming that these are all supported, I would generally:

1. Push very often, essentially every time I'm going to context switch or take a break

2. Iterate often, get things working, then clean them up, pushing at each step along the way

3. Interactive rebase to squash into meaningful stacked commits for review

4. Mark the change as ready for review by my team

I have found stacked commits to be the best way to both perform and author reviews, by a wide margin. It's so much easier to review four 200-line semantic changes than a single 800-line change. You'll get better feedback from others, and thinking about development in this way can also lead to better results on its own.


Commit often, then interactive rebase, then push to GitHub. Then code review happens, and any comments get addressed with some extra commits.

The annoying thing is that often someone will change CI for some unrelated subsystem (we're in a monorepo) and the way to fix that is "merge master into your branch". Of course I could rebase on top of master before pushing, and I do, but that doesn't work well if the CI change happens while my PR is in review. Then I have to merge master into the PR branch, since people do check branches out to review them sometimes.

Thankfully we squash PRs before merge, so it only ends up as a single commit in master, but it does mean there's sometimes an intermediate mess.


I follow your flow, and I don't bother squashing my commits. So the repo history has lots of dumb little commits. We use a branch model for features, though, so I can just filter for those merge commits if I want to see granularity at the feature level.


Easier for you. Harder for whoever reads your history.


Probably. Like I said, though, it's easy to just look at the merge commits and ignore the rest. I don't personally read per-commit history almost ever. If I have to follow someone's train of thought that closely to understand what happened, then 1) the docs suck, 2) there should have been comments and they are missing, or 3) the story was too complicated.

I will stipulate in advance that this is very much going to vary by project, language, and experience. It is entirely legitimate to have a different point of view ;-).


In addition, it makes rebasing harder, which might push you toward merge commits; then reading your history gets even worse, because changes might hide inside those merge commits.


How I was taught:

A branch for each feature.

Each commit should be a logical segment of work, such that it tells a story of how the feature came to be from the different aspects of the system that needed to be modified.

The branch will NOT be a history of how it was actually programmed.

If master or another dependent branch changes, rebase the feature branch onto the dependent commits to keep merge conflicts under control as the branch is being developed.

The branch will be rebased with commits squashed and fixed as many times as needed (often hundreds of times in a long-running feature), with the end effect being that the branch and commit history will look like divine providence.

Took some hazing and months to learn but absolutely worth it IMO, and I can't go back now :)


> The branch will NOT be a history of how it was actually programmed.

I'm not sure how to reconcile it with the idea of having a branch for each feature. In my head, developing a feature on a branch creates a history of how the feature was programmed. Could you elaborate on that?


I do the same as you - commit early and often, push on nearly every commit or every few commits - but I try to write my commit messages nicely even (and especially) if the code behind them is not so clean.

I care less about the code quality during development, so long as the commits make sense and the end PR is clean. Commits that may contain placeholder names or disorganized code could include:

"fix: failing test case" "feat: support new user flow" "cleanup: variable names and helper functions" "cleanup: organize files" etc

If the commit messages are clear as to why the change was committed, reviewing is a lot easier even with less than ideal code along the way. So long as the end state is cleaned up.


I got conditioned to commit early and commit often for much the same reasons: hardware failures while dealing with code that touched hardware, GPUs mainly. So I would commit and push to have CI do checks while I pushed more code changes, so that I could shorten the cycle.

A few years ago, Acme Security Corp decided to use my commit frequency as an excuse to say “I didn’t know how to code…” and let me go. So really it’s all over the spectrum and the correct answer is, whatever your team is doing.

If they are holding back commits for uber PRs, well, so should you. Are they in a commit frenzy and everyone PRs all day long? So should you.


My main git case is idiosyncratic -- I use blogdown [0] to host my personal blog, which means I am first pushing the content to GitHub and then it gets built/deployed on Netlify.

I typically commit once at first as a pretty complete draft, and then every subsequent edit/revision is a commit and rebuild. This comes to 20-30 commits per post, and some are indeed like 'fixed a spelling error'.

For this case, the benefits of being a frequent committer clearly outweigh the costs.

[0] https://bookdown.org/yihui/blogdown/


I don't use commits as "atoms" of the work. Instead, I work at a PR level. I make PRs small and often, and try to make each one a small, self-contained piece of work, usually not more than a day or two.

Practically this means that I commit and push whenever I feel like it, but then always squash the PR. At that point I create a clean and informative commit message.

However, I leave a lot of context and information in GitHub rather than putting it in large commit messages. I rely heavily on GitHub's automatic linking of issues and pull requests for this.


It really struck me to read how many people have an "I don't give a f..."-style workflow with git, especially in the Hacker News audience, which I always considered to be more on the technical side.

With "I don't give a f..."-style I mean basically the idea that just committing away with some "wip"-style messages and then squash them all together before merging, or squash-merging them. This _kills_ all traceability and all future options to find out what was happening and _why_ (the whole reason git exists)!

In contrast, my workflow:

- Branch off of master

- Develop things, commit cleanly (which sometimes means tens of lines of commit message for less than 10 lines of change!)

- Before everything goes into review, I revisit each commit. If formatting needs to be done, I create fixups for the individual commits and `--autosquash` them into the PR commits. Sometimes I even do something like `git rebase master -x "cargo fmt && git commit --amend --no-edit -a"` to automatically format each patch in the PR before submitting (or if I simply forgot to do it).

- Submit PR

- For each review comment, I create a fixup commit (see the sketch after this list). As soon as I've addressed all of the comments, I push the commits to the PR

- Repeat the step above until everyone is satisfied

- `git rebase master -i --autosquash` or, if master changed, sometimes even `git rebase $(git merge-base master HEAD) -i --autosquash`

- Wait for the PR to be merged
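
A sketch of the per-comment fixup loop above (the hash and branch name are illustrative):

  git commit --fixup=a1b2c3d   # one fixup per review comment, targeting the patch it amends
  git push origin my-feature   # reviewers see the fixups as ordinary commits
  # once everyone is satisfied:
  git rebase master -i --autosquash
  git push --force-with-lease origin my-feature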

When a PR is ready, it could be one patch only, but also easily be 50 patches. Depends on the scope and size of the project and the PR of course.

This is my workflow for contributions, both private and professional ones, but also for my private repositories (where "review" is my own, or simply CI).

When working on projects on github in my free time, I even stopped submitting patches (after discussion, of course) if projects use squash-merge, because if I put a lot of thought and careful crafting into my commits and they just squash them anyway, I feel that they don't actually care, and so there's no point in contributing for me.

(Edit: Formatting)


Make as many commits as needed to get whatever it is I'm working on "working". Then by default all the commits get squashed during the Merge Request, and we get one commit to master.


> The code may be a mess, the variables may have names like 'foo' and 'bar' but I commit to have a last known good state. If I start mass refactoring and break the unit tests, I can revert everything and start over.

For that use case, if you are sure you don't want the intervening history in the main source repository, the answer is to keep your “scratch” commits in a local store. This could be a branch, or a separate repo.

For my work areas I have a timed rsync taking snapshots, and I can kick it off manually if I want to make a snapshot at a specific time (just finished debugging a major function, for instance). If there have been no changes since it last ran, the snapshot is not kept (just a note added to a file to say one was considered at that time). Identical files are all hard-links to the same data, so it is pretty space-efficient unless you have some large assets that change regularly. This has the advantage that I don't even have to remember to commit anything: the snapshot is automatically taken regularly. Rolling back is manual, but rarely needs to be done, and can be done easily for small parts of the update if the rest needs to stay.
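
Roughly, each snapshot amounts to something like this (paths are made up; my actual scripts do a bit more bookkeeping):

  now=$(date +%Y%m%d-%H%M%S)
  # --link-dest hard-links files unchanged since the previous snapshot,
  # so identical files share the same data on disk
  rsync -a --link-dest=/snapshots/latest /work/ "/snapshots/$now/"
  ln -sfn "/snapshots/$now" /snapshots/latest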

Every now and then, as this is just backing up temporary work status, I clear down snapshots older than a given point in time. Using a source control repo and auto-committing to that would work as well as creating filesystem-based snapshots like I do, I should think, but all my other backups were rsync+snapshot based when I set this up, so I just repurposed existing scripts for the job.

> I'm forever aware disks can fail. I'm not leaving a day's worth of work on my local drive and hoping it's there the next morning.

Keep the snapshots on a different drive, or even a different machine (mine mostly are, though “just because it happened to get laid out that way” rather than by thoughtful design). Mine are even covered by daily off-site backups, so if the place burns down overnight I still don't lose them (or at least absolutely no more than a day's worth). Just be careful with rsync if you use this method: with the many hard-links, the --hard-links option can be quite CPU- and memory-intensive, and if your backups are usually append-only you'll need to occasionally wipe old snapshots that are long since irrelevant.


As long as I'm working on my branch, that I work on alone, I tend to rebase and interactively rebase those relatively often. The mantra is you shouldn't rebase public history, but since it's my working branch, I feel I can do whatever I want with it.

Nice thing about small, atomic commits that still pass tests, is it works well with `git bisect` to find issues.
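
For reference, a typical bisect session over such commits (refs are illustrative):

  git bisect start
  git bisect bad HEAD     # current commit is broken
  git bisect good v1.2    # last known good ref
  # git checks out a midpoint; run the tests and mark each step:
  git bisect good         # or: git bisect bad
  # ...repeat until git names the first bad commit, then:
  git bisect reset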

I also split commits defining a function etc from using it in other code. So if I `git reset --hard` because I screwed up some caller, I still have the function.


My "atomic" level is that it can build. I try to minimise time between commits/pushes because often I work on two different computers. So I don't try to keep it clean, and often cleaning is its own commit.

Android is a little weird in that a build may take 6 min on the production app though. So a lot of code is written and tested on an external repo where it might take 20 sec instead and then copied in when complete. Looking clean/complete isn't the intent but that's how it ends up.


I keep my personal dev folder in Dropbox so I don't have to worry about lost work since it always syncs to the cloud. My work dev folder uses syncthing to a remote server for the same purpose.

I make my own branch and then do checkins when I get to good stopping points. Then I do git squash merges to staging or the group branch when I have nice small updates. Keeps the public history clean and also lets me revert back to the previous stopping point, while not being worried about lost work.
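
For reference, the squash-merge step looks something like this (branch names made up):

  git checkout staging
  git merge --squash feature/foo   # stages the combined diff without committing
  git commit -m "Add foo"          # one clean commit lands on staging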


I commit when I hit logical milestones. I amend typos and stuff like that if I catch them early.

Committing all the time is a waste of time for me - my IDE (IntelliJ) has excellent local history.


Create a branch. I commit whenever my stuff isn't actively broken and I can somehow describe what the change did.

I push often just to be sure. (This is the modern equivalent of constantly hitting Ctrl-S to save).

Squash and clean up before merging to main branch after PR is reviewed and accepted. This makes rolling back the change a lot easier and keeps the main branch log clean from "let's see if this crap works" -style messages.


I commit often, push to branches until I'm satisfied, and for smaller branches often do squash-merges so all of the mess doesn't show up in the history.

Also, I too used "foo" and "bar" as variable names, but stopped doing that. If I find myself using nonsense names it means that my mental model of the problem (or the solution) is too vague. Then I stop and think about it instead of going ahead at full speed.


Commit early. But do not commit garbage. Each commit should be a small but meaningful advance. All tests should pass. If working with compiled code, of course it should compile. No todos. No clunky var or method names. A branch per merge request.

If you do it like this it is great for everyone. Do not squash commits or rebase. Those are antipatterns. We are lying to ourselves when we do it. We cannot learn from history we rewrite.


1. Create feature branch

2. Make changes

3. Commit changes at whatever point and write a good commit message about the feature.

4. Commit more.

5. Create feature/branch_rebase

6. Squash all the commits.

7. Push to fork

8. Make pull request

9. Make fixes and push those commits per fix.

10. Squash when merging back to that first commit.

I like having a separate rebase branch because then I have all my work in one branch along with the history of what I did. Then I squash it and get rid of that history, then maybe rebase main back into it if there were changes since I started the feature.
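
In other words, something like this (branch names made up):

  git checkout -b feature/foo_rebase feature/foo  # keep the messy branch intact
  git reset --soft main    # drop the history, keep every change staged
  git commit -m "Add foo"  # one clean commit on top of main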


Messy WIP commits that are reorganized to cleaner commits later, like most others have said.

However, instead of rebasing, I often git reset to the beginning and recreate the commits from scratch, using partial file commits (or staged hunks). IDEs/editors like VS Code make it really easy to stage an individual part of a file for the commit. The CLI way (git add -p) has always been pretty confusing to me.


Work on it, committing whenever I have something working, or it's time to go home for the day.

Merge it as is.

Because nobody is ever, ever going to go through my commits "reading a story". What they care about is the final state of my code. I could squash it all, but those extra commits don't hurt anyone, and the space they take up is negligible, so why not leave history as, well, the actual history?


This made me realize I haven't heard about git-flow[0] in ages. There was an era when it seemed like everyone wanted to use that for everything, even when it didn't make any sense.

[0] https://nvie.com/posts/a-successful-git-branching-model/


It's really simple - commit early, commit often, push to your own branches/forks/repos where nobody looks but you. When you're done with your changes, squash them into something actually presentable and logically split into self-contained commits (interactive rebase is a wonderful tool), and then push out as a proper merge request.


For smaller PRs/bugfixes, I'll often just have 1 commit, and that's all it took me to write.

For bigger things, I do stream-of-consciousness, rebasing somewhat often when I get to a place where I like things.

Unfortunately we're all about merging, so that mucks up my commits a bit.

I use Magit to handle all that for me, which is by far the nicest git interface I've ever used.


I generally know what I want commits for before I start writing.

While writing, I make my first commit once I've got one file changed, then amend it as I go.

If I write a whole ton of code/don't plan well in advance, I can end up with a few changes on the go at the same time. I'll tend to reset all the commits at the end and interactively add lines to commits


I think clean commit history is overrated as long as you squash-and-merge PRs into main. I've tried clean commits using rebase, and I just don't think the overhead is personally worth it for me. What's more important is small PRs. And if you can't do that, I usually find that spending some time reviewing the code with the reviewee is also helpful.


I commit chunks of work, so I can view that functionality/diffs in one segment. Can squash it later. I do try to save "often" in case something happens. I'll just push something up like "progress save".

I've seen some people commit like 40 changes in a PR lol. It happens; just squash it.

Can also use another branch


git pull default branch. Create local branch. Make change. Commit. Test. Remove quotes. Commit. Test. Add quotes back. Commit. Test. Google quotes. Remove quotes. Commit. Test passes. Rebase to default branch and squash. Push branch and MR. Ignore peer review policy and approve MR. Repeat!


I have everything aliased and use few commands:

  gb feature/Foo     # make a branch called feature/Foo from the current one (usually develop) and switch to it
  # ...make edits...
  gac I did a thing  # add all files and commit with this message
  mduc               # merge the current remote develop into this branch
  git push
  mkpr               # create a draft PR from this branch on GitHub and open my browser to it


When forming commits, this is the rule of thumb I have: can the entire diff of a commit be understood and explained, both in content and intent, by its commit message, ideally by someone months or years from now? If not, the commit is not properly “atomic” or the message deficient in some way.


I commit/push to a remote branch often mostly as a backup/easy undo. My commits are squashed for each PR; if I have two commits that I want to stay separate, I send them as separate/chained PRs (I like the idea that each commit in master represents a reviewed point in history).


We use feature branches. We commit frequently to those, but squash it in the end when merging to main. We rebase to the latest main to keep the feature branches up to date. This way history is organized (not that anyone ever looked at it), but also integration is frequent.


You need to include a step where you squash commits. For me that was the point where I had to learn vi, since `git rebase -i HEAD~4` will throw you into vi. So then I ran vimtutor daily for a week and learned enough to continue using git.


Before I ever commit anything, I always run the diff2html command to view my changes in HTML, side-by-side (I even have an alias `diff` that makes it fast and easy).

https://diff2html.xyz/
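
(For the curious, an alias along these lines, assuming the diff2html-cli npm package; exact flags may differ by version:)

  # render the working-tree diff side-by-side in the browser
  alias diff='git diff | diff2html -i stdin -s side'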


I like to

  git commit --amend --no-edit
often

once a piece of functionality is working, edit the commit so it makes sense

start new commit and work on next piece of functionality, repeat

A clean commit history is not just for me but for my coworkers, who will benefit from seeing a cohesive commit diff.


I imagine many like me use it so often they have this as the alias “git oops”.
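
e.g. in ~/.gitconfig (the alias name is the joke above, not a built-in):

  [alias]
      oops = commit --amend --no-edit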


Depends on the function for me:

- new features tend to be one clean PR, since there's so much experimentation on what will work and lots of commits tend to slow things down -- a nominal disk backup is helpful, or even just a commit when taking a break on a branch of your own; no PRs until clean

- refactoring tends to be lots of test / experimental / ugly branches with incremental PRs, since it tends to be easier to say, ok, change this one thing

- been experimenting with some third in-between mode, say add X with a bit more of a planning / sprint mindset vs. oh wow, let's see if this works

- feels like this approach works more for startups; probably harder to do in enterprise / large teams?


Branch out

Break down work into logically separated commits (one logical change at a time)

Fixups will eventually come when I change something that should be two commits down.

Make sure commit history makes sense to the reviewer

Explanatory commit messages: the why (this library did not fit the bill because ABC)

Open the PR


I do the same as you: get the work done, keep it stable. Being clean and pretty is the _last_ thing I do.

(based on years of experience with errors, failures, data loss, etc...)

edit: But I also now like to be messy in a dev branch, so it matters less in the grand scheme.


I try to commit whenever I have enough to write a coherent message. I branch out to provide me with an objective and I merge back into main when I accomplish it.

I think I'm going to start using rebase to get those clean commit histories though.


Lots of commits. Squash merge the PR (github feature) to clean up the commit history.


I commit anytime the staged code can have a logical commit message. So I go for the smallest possible change per commit. Push when an issue can be closed/new release can be built or I leave my computer for any reason.


> I also push often because I'm forever aware disks can fail.

You could consider using a tool other than git push for protecting your work against disk failures. I rely on Apple Time Machine for that for example.


My dev box is deep inside a corporate lab. I'd be much happier if I could manage my own backups.


GitHub/GitLab allow you to rebase your commits when merging. We do that, but it has the disadvantage of mucking things up for people who are using that commit. I personally think it's worth it.


I commit and push often, with no sensible commit messages. The git messages aren't used, except for issue IDs and random garbage.

All details regarding code and reasoning is on the issues and pull requests.


I use branches gratuitously, and push to my forks regularly as you do, but I also do interactive rebases and edits when preparing to push to upstream to clean up my mess.


Have you looked into squash-commit merging?

Basically you keep the same workflow you have now, and just squash everything into a single commit when merging, so that only the state of your last commit lands.

If needed, you can go back to your branch for the history.

https://stackoverflow.com/questions/5308816/how-can-i-merge-...


"Commit early, commit often" is really great for beginners. I don't do that. I commit when it makes sense to. Since my commits will get rebase-squashed on merge into master, I don't put a ton of effort into doing it perfectly.

Reasons I commit: 1) it feels right, this was a good logical step; 2) I'm about to do something that might break everything, like a refactor; 3) everything works, so it makes sense to commit before I do something else.


No branching + messy commits. Exceptions would be forking public projects to contribute, and public project issue resolutions ("fix #XXX").


ohmyzsh adds two git aliases which I use very often: `gwip` and `gunwip`. I push a `gwip` commit for work when I'm not yet sure how to turn it into a clean commit. Once I have settled on a unit of work (like code + tests + docs), I use `git add -p` to selectively create clean commits. For "fix misspellings and whitespace" kind of stuff, use `git commit --amend --no-edit`.
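
(Roughly what those two aliases do, paraphrased rather than the exact ohmyzsh definitions:)

  # gwip: stage everything and make a throwaway commit, skipping hooks
  git add -A && git commit --no-verify -m "--wip--"
  # gunwip: if the last commit is a wip commit, undo it but keep the changes
  git reset HEAD~1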


I commit chronologically, then depending on the scope of the changes refactor into logical subsets, or just squash into a single commit.


It sounds like a main problem is that you don't have your local drive backed up. Can you address that first?


We're using Gerrit, and so when I have enough that I consider it worth keeping, I commit it and push it as a new change; I don't put anyone on review. I continue working, doing commit --amend and uploading until I'm done. Then I put people on review and fix whatever they come up with, and do the last commit --amend and push the final change. When QA accepts it I either submit, or rebase (if that is trivial, I submit; otherwise I repeat the review, then submit when it has +2).
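
For anyone unfamiliar with Gerrit, the push side looks roughly like this (branch name assumed):

  git commit -m "Implement the thing"
  git push origin HEAD:refs/for/master   # opens a new change for review
  # address feedback by amending and re-pushing; the Change-Id footer keeps
  # new patch sets attached to the same change:
  git commit --amend
  git push origin HEAD:refs/for/master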


Amend and force push are your friends.


I've been introducing https://ejoffe.github.io/spr/ at work and am pretty happy with it. It makes the process of managing stacked PRs really smooth. With that tool, I usually just create a local branch and run `git spr up` when commits are ready.


Depends on the review and PR policies. Most are very strict, so I tune my branch to the expected commits.

With Magit, of course. GNU even expects proper ChangeLog entries per commit.

And I always keep on track with rebase, `pull --rebase`, and rerere. Everybody hates nonlinear merge trees.
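
(rerere has to be enabled once; pull --rebase can be made the default too:)

  git config --global rerere.enabled true   # remember and reuse conflict resolutions
  git config --global pull.rebase true      # plain `git pull` now rebases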


We just squash the commits in Gitlab at a Merge, so the sequential commits won't matter.


git rebase -i HEAD~5

Where 5 is some number of commits (or use a ref, like a tag or origin/main, instead of HEAD~5)

Is your best friend! Write great messages, squash things, or do fixups. Force-push over your branch after you clean up… and send out a great PR!


Create a branch. feature_todo

Do a bunch of commits.

Push branch to remote as needed.

Refactor.

Create a new branch. feature_1

Reset soft back to where I started.

Commit a nice, clean change and good message.

Rebase main.

Push.

PR.


I commit when I finish a task.

I push at least once a day


rebase -i <branch-to-merge-to> then squash the changes you want to hide


Master only.


Maybe they commit the "misspellings and whitespace" fixes to their private branches and then commit a clean batch to the shared repository?


>I also push often because I'm forever aware disks can fail.

In the 20+ years that I've been using computers, and ≈15 or so that I've been writing software, I've never experienced a drive failure.


About 30 years using computers, a little over 20 being paid to do it. I've had two fail outright (one internal, one external) and a couple others develop errors that I noticed (one because I finally started using ZFS on my bulk-storage machine and it's very good at telling when a drive is misbehaving, the other, SMART caught, years ago) so they had to be replaced. All were spinning rust. Maybe 50 drives total, over my computer-using life, counting those in game consoles and ones in work machines that were assigned to me. Including SSDs. Give or take 10.

With the experience of the last 3 or so years of using ZFS, I'd bet most of my "still working" drives over the years were erroring and losing data by the time I replaced them, and I just didn't know it.


This is a good example of how scale can bias your view! :-D

Personally I share the same experience: In the 30 years I've been using personal computers at home and work, only 2 disks failed on me, one was an error on my part during an OS upgrade and the second was an external disk that physically fell off my bed while I had it plugged into my notebook. So both of these weren't really those random failures.

On the other hand, while working in an IT infra team of a 60 developer company with 4 racks full of servers for 3 years, we'd get to see about one failed SSD per 2 months and 2-3 failed disks or SSDs per year in servers.

During the 1 year I worked for the IT infra team at a smaller hospital of a medium-sized city, with a large VDI environment and two HP EVAs in a two-datacenter configuration, we'd get 3-4 disk failures per week. Those had over 140 disks each and were approaching the end of their support life, being around 5 years old, so the failure rate started to get higher.


I've had a Macbook SSD fail (suddenly and completely) within the warranty period. Surprisingly I've only had one hard drive fail before I retired it, maybe when it was about 6 years old.

But those are just the catastrophic failures, I'm pretty sure I've also had hardware-related data corruption here and there as well.


Some non physical failure scenarios I have experienced include catastrophic filesystem failures (NTFS: never again), loss of machine and no handy backup machine capable of reading the disk or filesystem in question (extX, ZFS, etc.), accidental damage to disk, partition or filesystem tree due to bad code (operator error; PEBSAC).


Sure, but your anecdote does not refute in any way that they can fail. It’s all about risk vs. overhead of avoiding that risk. The overhead of pushing often is very low, so even with the low probability of a drive failure, there’s not much reason to not do so (or any other form of backup, really).


I have. You're not coding hard enough!



