How to teach Git

ahartmetz · on Jan 16, 2019

And here is something to take the garbage quality of Git manpages with some humor

https://git-man-page-generator.lokaltog.net/

"git-eliminate-head eliminates all downstream heads for a few forward-ported non-counted downstream indices, and you must log a few histories and run git-pioneer-object --pose-file instead. [...]"

romwell · on Jan 16, 2019

This is a kind of very specific, targeted, niche comedy, excellently executed. I only use Git casually, and I can't tell how much of that text couldn't make sense in an actual doc; it all looked real enough at a glance.

Thank you for sharing, this made my week.

Groxx · on Jan 16, 2019

Impressively small generator data too, given how realistic it reads: https://github.com/Lokaltog/baba-grammar-git-man-page-genera...

but maybe that's due to the low bar that it's being compared to.

virtualwhys · on Jan 17, 2019

Please note[1] that:

> To rev-parse an automatic FLOUNDER_LOG or diff the working subtrees, use the command git-link-submodule --retrieve-wrestle-change.

I often overlook this detail when trying to hulk smash some broken change I've made upstream (what the manual entry correctly refers to as RIP_OTHER_TIP).

[1] https://git-man-page-generator.lokaltog.net/#81394c8bf3806f9...

MisterOctober · on Jan 17, 2019

Brilliant, absolutely had me rolling. Most cleverly-executed satire I've seen in a long while.

"To parse a staged SKIRT_SUBTREE and blame the working histories, use the command git-purchase-pack --snuggle-muster-branch, as after reapplying subtrees..."

pferde · on Jan 16, 2019

Thank you for this, it made my day. :)

euske · on Jan 17, 2019

Am I only one who's getting an error on Firefox?

    Content Security Policy: The page’s settings blocked the loading of a resource at self (“script-src”).

hyperpallium · on Jan 17, 2019

Good to see tax accountants contributing to open source.

zwieback · on Jan 16, 2019

Love it. I'm sold on git but it often feels like I'm in one of those man pages.

LyndsySimon · on Jan 17, 2019

In my experience, it gets better. After a while you know everything you need to not only do your job, but to understand and resolve common issues. After a couple of years you'll be at the point where the only time you have to dig through manpages is when you're trying to do something esoteric or clever.

arduinomancer · on Jan 16, 2019

That's hilarious

mrzool · on Jan 16, 2019

Nothing short of sublime. Thanks for sharing.

paultopia · on Jan 17, 2019

OMG THIS IS THE REALEST

philbarr · on Jan 16, 2019

That's fantastic!

ppezaris · on Jan 16, 2019

oh my sides.

sonnyblarney · on Jan 17, 2019

This is funny, but in reality it's not.

Git is one of the most amazing, powerful tools ever conceived, with one of the must byzantine and ridiculously designed 'interfaces' ever conceived.

People confuse the raw power of a tech, with how well it can be feasibly used. Sadly, due to the later issues, git will only ever be a shadow of what it could have been.

With all due respect to Linus, who'd be the first to admit he's not very good at UI stuff (I mean command line as well)... it's truly a sad thing.

This is a major 'problem that needs to be solved' I'm interested to see how it could evolve into something 'better'.

karmakaze · on Jan 17, 2019

I don't think the complexity of git's command options is a UI problem. It results from the basis of its operation. We could change some names, add or remove some concepts to how some of the operations are performed, but there are simply a large number of actions to handle many edge cases.

A better solution for prose was to always be merging with live multiple collaborator updates. Conflicts are visible in real-time. I can't see something like this would work with code. Hmm interesting... unless we only allow additions and refactorings to working checkpoints.

P_I_Staker · on Jan 17, 2019

Yes, I don't really understand all the people trying to "fix git". Git's fine, though the complexity makes it challenging, especially for new users. However the complexity is a direct result of useful features. "Keeping it simple" is great, except when the complexity is needed. I'm hard pressed to name any features I could do without.

sonnyblarney · on Jan 17, 2019

There are two problems: one of inherent complexity, one of UI.

First - the UI is a mess, and that should have been fixed. It would make a big difference.

Second - is the inherent complexity. That's a good point, but I feel many things could be hidden or obfuscated.

Most poignantly, Git does something that most of us do not need: it was designed to work as a 'completely distributed system', i.e. for open source.

Almost none of us do that. 95% of uses cases related to you and I working collaboratively, on a project together.

The need to have repos which are essentially totally distinct from one another is a huge source of complexity and it simply doesn't need to exist in most cases.

So Git is basically an 'admin level tool' that is commonly used in scenarios for which it wasn't meant to be used, with a confusing interface.

It's costing a lot of time and money and headaches, I do believe someone may come along eventually and fix it.

This thread is essentially evidence of this - see how many people have difficult teaching what should essentially be a simple thing in most cases.

Way too many very smart people still spend too much time clustering around in git.

fiblit · on Jan 17, 2019

Ever used a centralized VCS, such as SVN? That seems to be what you want? A distributed VCS is extremely useful, for mine and many others' uses, however.

romwell · on Jan 17, 2019

>I'm hard pressed to name any features I could do without.

Nobody is complaining about there being too many features. People are complaining about the arcane incantations that one needs to conjure to call them.

P_I_Staker · on Jan 17, 2019

Good point. I wouldn't say nobody though. There are people out there that think there are too many commands. I've even seen academic papers that claim the staging area is problematic. I love the staging area, and don't think it's terribly confusing; there's always commit -a if you don't want to use it. It does lead to some confusion, but it's worth it for the added features.

That's just one example, but there definitely are people that think git's too featureful. As for more valid criticisms, I'd agree. I've heard the CLI compared to being in an abusive relationship. All that said, I can't really think of a better way to handle things without losing useful functions. In which case, I don't have any better ideas, and don't really know what I'm criticizing.

cyborgx7 · on Jan 16, 2019

Here is my personal recommendation for getting more comfortable with git. Use "git status" a lot. Everytime you do something in git, and before you do something, do a "git status" and see what you change with your commands. And what you didn't change.

Also "git log".

Ajedi32 · on Jan 16, 2019

`git log` on its own (with no flags) isn't that useful, as it's missing a lot of important information. I prefer `git config --global alias.lg "log --color --graph --oneline --decorate"`. Then you can just type `git lg` and get a much more useful overview of the state of your current branch.

oblio · on Jan 16, 2019

Git in many occasions (including when using git log) feels like the designer just threw their hands up and said:

"F*ck this shit, make your own UI on top if you want to use this tool!"

I don't think Torvalds has a proper excuse for the pain and suffering he's inflicted on millions of developers worldwide :(

(To the people going: "Oh, it's Open Source!". Sure, so are Mercurial, Fossil, etc.)

rkangel · on Jan 16, 2019

This makes a lot more sense when you realise that Linus thought he was building a low level tool that people would build a UI on top of. The 'original' git command line was more a proof of concept and engineering tool than something aimed at actual use.

WorldMaker · on Jan 16, 2019

Right, doing a search through the git mailing list for the use of the word "porcelain" is fascinating. It's unfortunate some of those "porcelain" projects never finished.

mcguire · on Jan 16, 2019

... because none of the power users used them.

yjftsjthsd-h · on Jan 16, 2019

Heh. I think Linux on desktop it is like that, at least a bit; frequently, by the time someone could write a friendly GUI (or friendly CLI for that matter), they have little use for it.

zbentley · on Jan 17, 2019

That's a disturbingly insightful parable for everything from startups to linux/FOSS contributions.

pjc50 · on Jan 16, 2019

I don't think it was ever really intended for worldwide use by Linus, it was intended to exactly replicate his workflow and his alone. The real responsible party is github; I'm not sure how they came to dominate?

delian66 · on Jan 17, 2019

By being much, much, much better than the competition at the time (sourceforge) - github was faster, without ads, used superior vcs. Github also had Octopuss/Octocat as a logo :-) .

pjc50 · on Jan 17, 2019

Sure, but why github rather than mercurialhub etc? The website has great usability but the git CLI does not. Maybe the github gui was also critical.

delian66 · on Jan 17, 2019

Yes, the gui was good (and it still is, or may be I am too familiar with it at this point, I do not know).

Another factor is that git was always FAST, and the most common operations (init, log, status, commit, checkout, push, pull) are not that complicated as people make them to be, so you can very easily start to use it for new local projects, and continue without installing additional plugins, unlike mercurial ... And at some point, when you wanted a colaboration hub or just a public remote repo, you just pushed your code to github.

Liskni_si · on Jan 18, 2019

> Git [...] feels like the designer [...] said "F*ck this shit, make your own UI on top if you want to use this tool!"

Actually this is exactly what happened back then, which is why many people used https://en.wikipedia.org/wiki/Cogito_(software) as a frontend to git in the early days. That said, I personally find current git UX absolutely fine and I can't imagine being effective without having all those commands like git reset --whatever -p and git rebase --interactive.

delian66 · on Jan 17, 2019

What an absurd statement: "Torvalds has a proper excuse for the pain and suffering he's inflicted on millions of developers worldwide :(" !

Who are you exactly to judge people like that, oh failed incarnation of a tibetan Lama ???

He does not need an excuse to make a tool that has proven very useful to many (including me). In fact, I am grateful, that he chose to make it and publish it.

da_chicken · on Jan 16, 2019

It definitely feels like it was written by someone used to writing an ABI or library rather than a set of command line tools.

Certain tools are organized by what code goes together rather than by what user functions make sense to go together. It's part of what makes it confusing.

hoorayimhelping · on Jan 16, 2019

    git config --global alias.lg "log --color --graph --oneline --decorate"

for easy copy-pasting. HN doesn't support backticks or triple backticks like a lot of markdown parsers. To get fixed-width fonts, you have to insert 4 paces at the beginning of the 'code' block.

Stratoscope · on Jan 17, 2019

Two spaces (unlike Markdown).

Of course four will work, it just adds more indentation.

u801e · on Jan 17, 2019

In some mobile browsers, doing this will result in the fixed-width font line getting cut off when viewing the resulting post.

WorldMaker · on Jan 16, 2019

I got into the habit of teaching people `gitk`. It's not the prettiest tool, but it's included with most distributions of git, defaults to a decorated color graph log, and thus avoids the "copy paste this weird config line" step on other people's machines.

gglitch · on Jan 17, 2019

I have a soft spot for Tk interfaces. I hope they become cool again (or, I mean, for the first time).

creeble · on Jan 16, 2019

+1 these, I wish they were default for git. Much better both for beginners and experienced users.

99052882514569 · on Jan 16, 2019

I just use a GUI for that. I don't like using applications like Sourcetree for any actions, but they really are superior for visualisation of the commit tree, diffs between non-adjacent commits or between branches, etc.

Fellshard · on Jan 16, 2019

I'd recommend Git Extensions, still, as a good compromise between porcelain and plumbing. The most useful commands are all at your fingertips, with deeper ones available if necessary. Visualization is best I've seen - clean graph display, easy to read, and _doesn't lie in complex cases like SourceTree does._

SourceTree tends to treat commits with multiple ancestry in a very weird way that has led to difficulty on multiple occasions, where people think changes 'went missing' because SourceTree decided the other ancestor wasn't important enough to show.

99052882514569 · on Jan 21, 2019

I don't use Windows. Seems a little clunky to get it to work on Linux, even more so on the Mac. Haven't had any multiple-ancestry issues with SourceTree - perhaps resolved in an update since you encountered it?

miccah · on Jan 17, 2019

Zsh users can use glgga to get a nice full detailed graph log as well. It's not as concise as the custom config, but I use it quite often.

IshKebab · on Jan 17, 2019

Classic git CLI. The option everyone wants is behind 5 optional flags.

war1025 · on Jan 16, 2019

`git log -p` is my favorite obscure git command. Shows you commit by commit changes. If you specify a path, it limits to only those files. If you do a single file you can do `git log -p --follow <file_path>` and it will track the file across moves and renames.

Also `git whatchanged` is a super helpful command to see just the list of files that changed in each commit

LyndsySimon · on Jan 17, 2019

`git add -p` is also very, very useful. I use it all the time, to the point that I've aliased it to `a`.

If you're interested, here's my .gitconfig, including all of my aliases: https://gitlab.com/lyndsysimon/dotfiles/blob/master/git/gitc...

war1025 · on Jan 17, 2019

Completely agree. I think a lot of people would enjoy git more if they made better use of the `-p` options.

tome · on Jan 18, 2019

Do you know about `git commit -p`? You might like it even more than `git add -p`.

rachelmcarmena · on Jan 16, 2019

I didn't know "whatchanged", thanks! However, look at the description: https://git-scm.com/docs/git-whatchanged

war1025 · on Jan 16, 2019

Yea. It's technically deprecated, but it is also super useful so I don't see any reason not to use it.

tome · on Jan 18, 2019

Yikes, do people consider `git log -p` obscure? I can't live without it!

tkjef · on Jan 16, 2019

here's my git log alias: alias gl='git log --date=local --pretty=format:"%C(124)%ad %C(24)%h %C(34)%an %C(252)%s%C(178)%d" --stat -p'

has nice color coding around the date, person committing, commit hash, and message.

wyclif · on Jan 17, 2019

Better yet, alias "git status" and "git log" so you'll be more inclined to use them frequently.

Derbasti · on Jan 17, 2019

I have taught git for university classes for some years. To be honest, git is a mess. It is conceptually not that hard, but the nomenclature is inconsistent and dangerously ambiguous (quick, what is the difference between reset, rebase, revert, and checkout?).

The most effective work flow I have found so far, is teaching only status/clone/pull/add/commit/push. Show them explicitly what happens normally, what happens when two changes conflict, and how to resolve merge conflicts. Using git on the command line only.

Then, have the students use git for a big-ish multi-student project. They will figure out the workflow themselves. After that project, once they understand the basics, you can talk about branches, debasing, the log and reflog, pull requests, and all the rest. Don't introduce graphical front ends before this point.

This method works well. It takes about one hour of teaching, and five weeks of active use afterwards. Git is a total pain to learn, and can only be understood by actively using it. I often get very positive feedback for having taught git.

I have gone through a few iterations with this topic, and have found that stripping down the initial instructions to an absolute minimum works best. All those fancy box diagrams are actively harmful to beginners.

probinso · on Jan 17, 2019

I found that running workshops was the most effective way to teach git. I wrote a set of tools against the github api to spin-up and manage a large number of repositories and students in a way that would result in merge conflicts (and doesn't require knowledge of any programming language).

https://github.com/probinso/ABC

_a3nw · on Jan 17, 2019

I don't know if this is true. For me personally I have had to use git for a lot of different projects, but I still don't understand anything besides commit, push, pull, add, and force-push (lol). In my experience you can get away with just learning those commands, but I still don't feel comfortable with git.

marssaxman · on Jan 17, 2019

Does anyone really feel comfortable with git? I've been using it for over a decade, in a handful of different organizations, on projects large and small, both at work and for personal projects, and it still feels like a convoluted mess. Its terminology is wildly inconsistent and my mental model of its behavior remains stubbornly full of fog. I have a library of recipes committed to memory, but I still don't really understand it, and sorting out what's gone wrong when things inevitably go wrong remains challenging.

diegoperini · on Jan 16, 2019

An honest criticism:

If I had trouble with understanding the reason behind add->commit->push workflow, I would definitely have no idea what this article talks about when it says things like "merge, rebase, diamond shape". The flow chart looks almost exactly the same for "pull" and "pull --rebase". The only difference between the charts is the wording which has no meaning at all for a newbie.

jamesb93 · on Jan 16, 2019

It sounds stupid but something that really simplified git to me as an absolute beginner about a year ago was the fact that it doesn't exist on GitHub. Knowing that git was just something that sat in the folder on your filesystem and monitored changes took away the notion that there is some kind of sync between your remote and local. Working just on my own laptop made me realise how intuitive all the commands and strategies for solving problems were. Then came pushing, pulling etc between branches and it all just fell into place.

benwaffle · on Jan 16, 2019

I wouldn't say it monitors the directory, because there is no git daemon running. Git just looks at the changes whenever you run a git command

cyborgx7 · on Jan 16, 2019

This is an article, not for people who don't understand git, but for people who do understand git and want to explain it to others.

rachelmcarmena · on Jan 16, 2019

Right, thanks!!

leni536 · on Jan 16, 2019

> The flow chart looks almost exactly the same for "pull" and "pull --rebase".

"pull" and "pull --rebase" can cause two kind of conflicts:

1. Merge or rebase conflict inside the repository.

2. The would-be merged or rebased HEAD conflicting with the dirty working directory.

The article demonstrates the latter and it's the less important one as it's avoidable by pulling from a clean working directory or pulling a non-HEAD branch.

mcguire · on Jan 16, 2019

I would suggest making sure to explain why there staging step exists: to create coherent commits. Then go into "why distributed version control."

How to write a Good Commit Message (https://chris.beams.io/posts/git-commit/).

rachelmcarmena · on Jan 16, 2019

Thanks for the feedback! Right, this is only a help to have a mental map. Only some initial drawings for a mentor or trainer, not for the newbie.

diegoperini · on Jan 17, 2019

If so, I'd love to see a more elaborate, guide-like version of this model.

I've tried a similar approach years ago for teaching and failed spectacularly. Eventually, my peers became comfortable when they got used to the Github Desktop Client. They compared the buttons they click with my terminal commands. We also compared our graph views on Github website to visualize the logic.

It's been years and still none of them used rebase even a single time. A sad story in my teaching non-career. :(

rachelmcarmena · on Jan 17, 2019

I drew those diagrams after reading Pro Git Book. I missed what appeared at my head, but the entire book is very useful.

For me, command line is freedom. GUIs are very limited.

Don't give up! For my humble experience:

- Try to know your peers, how they work, what difficulties they face when using command line, ...

- "I'm lost" > "git status"

- "How did we solve... ? > "git log"

- "This command is difficult to remember." or "This command makes no sense, I prefer this another name for that action" > "git alias"

iheartpotatoes · on Jan 16, 2019

This is nice, but I'd like a 201-level handholding on git. I've been using it for 5 years and I'm still just a clone/commit/merge/(bang head)/push user yet I know there is tons more it can do that would probably make me more effective. (I'd also like to switch my team of SVN. Someday....)

svat · on Jan 16, 2019

My recommendations:

* Git From the Bottom Up: https://jwiegley.github.io/git-from-the-bottom-up/ (PDF: http://ftp.newartisans.com/pub/git.from.bottom.up.pdf) (See also: "Linus Torvalds' greatest invention": http://perl.plover.com/yak/git/) Once your mental model matches the program, you'll be able to understand everything, hack your own solutions if necessary, etc.

* Learn Git Branching: https://learngitbranching.js.org -- in fact I think this should be the UI for Git; wish someone would make that happen (i.e., so that you can point it to any Git repo and get that sort of visualization for it).

rachelmcarmena · on Jan 16, 2019

Thanks! Awesome resources! I'll add it to my post as soon as I finish reading all the answers. I'm learning a lot from you!

malingo · on Jan 17, 2019

I'd also recommend http://matthew-brett.github.io/pydagogue/rebase_without_tear...

sjburt · on Jan 16, 2019

I think once you understand the graph, then looking at what the graph looks like and what you want it to look like usually leads you to being able to figure out what operation you want.

And then git fetch/push/pull are mostly about copying parts of that graph from one place to another.

After that, it's mostly a question of what workflow you want and that becomes much more than a git question because git works for a variety of workflows, and many of the features of git are really only relevant for certain flows.

joachimma · on Jan 16, 2019

100% agree. Once you know what you want the tree to look like it's usually just a matter of making a branch/tag (in case you mess up, and need to start over) and not being afraid of reset, even with the --hard flag.

wickchuck · on Jan 16, 2019

SublimeMerge is also a wonderful tool. Been using that since it came out and has helped tremendously with difficult merges. It's lightning quick as well.

Similar to magit - Sublime Text also has SublimeGit, not as fast as SublimeMerge.

Visual Studio Code built in git functionality is also nice.

jen729w · on Jan 17, 2019

A mention for Tower. I only do the basics but it totally took away my Git fear.

https://www.git-tower.com/mac

yyx · on Jan 17, 2019

gitKraken is nice too

https://www.gitkraken.com/

IshKebab · on Jan 17, 2019

Use a GUI. Git repos are graphs and trying to understand a graph from the command line is like trying to paint over the phone.

I recommend GitX (FOSS, Mac only, a little buggy but has the most logical UI and let's you easily amend commits), SourceTree (free, quite slow, Windows/Mac), Tower (paid, cross platform) or SublimeMerge (paid, easily the fastest, cross platform).

sokoloff · on Jan 16, 2019

If you're an emacs user (and maybe even if you're not), try the magit package.

That taught me in a few weeks of usage more git 102 and 201 type functionality than a year of using only the basics did.

Fellshard · on Jan 16, 2019

As a new emacs / spacemacs user, I bounced off magit hard. Too many weird things going on, hard to tell what I can do.

A couple months later, and a bit more experience with how emacs structures things, and I've returned to Magit and _love_ it. It just feels natural to hop around in for standard tasks. Certainly a good replacement for other porcelain UIs I'd normally use.

aldanor · on Jan 17, 2019

I use Spacemacs ONLY to have access to Magit. And I've successfully converted many non-emacs non-vim folks to use it, all of them love it.

akulbe · on Jan 16, 2019

This is so so fitting, and describe me to a "T" as well.

I'd add oh-crap-I-screwed-up-so-let's-clone-the-repo-and-start-over to the list.

buovjaga · on Jan 16, 2019

Alternative to nuking from orbit is taking a look at "Flight rules for git": https://github.com/k88hudson/git-flight-rules

rachelmcarmena · on Jan 16, 2019

Awesome! I'll add it to my post! Thanks!

smichel17 · on Jan 16, 2019

What can you screw up that requires re-cloning the repo rather than just a reset --hard?

akulbe · on Jan 16, 2019

Not knowing about the "reset --hard" feature? :)

hinkley · on Jan 16, 2019

For one, accidentally deleting the .git directory.

P_I_Staker · on Jan 17, 2019

Honestly, you just need to try some things. Your reflog is your friend. You'll get the hang of reset/rebase/ect with a little experience. I also think it's worth it to learn a bit more about diff and log.

I have no idea why anyone would want to go back to SVN. If you don't want to use all the features, I understand; by all means, continue to only ever branch -> commit -> push -> create pull request. There's no need to subject the whole team to version control that doesn't work. That's what sold me on git; it's the only version control system that never failed at it's job of letting me track my changes. It doesn't break it's promise to always be there for me.

SVN, centralized systems, and even mercurial to some extent, prevent users from tracking their changes. This leads to questionable workflows (lots of copying directories to "save things"), or even worse, developers just don't track their changes at times. It sounds weird that I have to say this, but I feel that version control systems should be available to track changes 100% of the time. It seems like many of the people who dislike git, don't see the value in this, which I find absolutely baffling. Git means never being scared to create a commit.

krupan · on Jan 17, 2019

"and even mercurial to some extent, prevent users from tracking their changes"

False. Stop spreading misinformation about Mercurial.

"Git means never being scared to create a commit."

This is indeed one of the biggest benefits of DVCS tools, including git and mercurial.

P_I_Staker · on Jan 17, 2019

Thanks for the feedback. Quite frankly, I never found a comparable workflow in Mercurial (but that doesn't mean one doesn't exist). Commits still felt like a permanent thing that made me pause and think about whether something was "worthy" of a commit, thus I was unable to track my changes. This problem just doesn't exist in Git, though some care is needed when publishing (as w/any system).

I'm unaware of anything within base Mercurial that lets me do what I can do with git. Maybe the answer is to use the extensions, however that's a bit unsettling when some of them are being depreciated (eg. hg queues). Often times I found myself cloning and maintaining multiple directories for something that was better off as one thing. This led to reluctance to track changes, which is clearly evil. I guess I must have been doing something wrong, but I've heard similar feedback from others regarding Mercurial. I loved using hg and would have no problem using it (it was my first); I just prefer git now.

LukeShu · on Jan 16, 2019

I agree with sjburt when he says "I think once you understand the graph, then looking at what the graph looks like and what you want it to look like usually leads you to being able to figure out what operation you want." When I was first getting in to Git ~10 years ago, I remember that I found the PeepCode "Git Internals" book[1] very helpful for getting that understanding of the graph.

Once you have the better understanding of the graph, it's hard to find resources on how to improve from there; most resources focus on beginner stuff, or function more as a technical reference without really talking about use-cases. I've found following Mark Dominus' blog[2] for his posts about Git to be the single best thing to "level up" my Git usage once already being at a high-level.

[1]: At the time, I had to pay money for the eBook, now they have the whole thing on GitHub: https://github.com/pluralsight/git-internals-pdf

[2]: https://blog.plover.com/

vharuck · on Jan 17, 2019

I took it upon myself to introduce Git to my colleagues: statisticians who've never used version control. I tried to preempt hard questions by going into a repo, wrecking it like a mad bull, and trying to undo the damage with Git. Turns out, it was a great way to learn for myself.

Exercises:

- Commit a "secret" and involve it in other branches, commits, merges, etc. Then remove the secret so that nobody can ever learn it from the repo.

- Clone a repo from another local repo and then rewrite history with rebase and revert. Then commit different work in each repo. What's the least painful way to get them "compatible" without losing any work?

Fellshard · on Jan 16, 2019

Humbly, I'd disagree that's the best, though this is a superb _part_ of the picture to teach Git well. These are nearly the last steps, I would say.

When new colleagues joined our firm and hadn't yet learned Git, the problems were always the same: uncertainty. They didn't know which Git operations were safe, and they didn't understand how to perform seemingly risky maneuvers with zero risk. They're used to even more dangerous tools that can wipe your work in a second - and, to be fair, Git can as well. The difference is that once you understand Git, you never have to worry about losing work.

So the way I would teach Git is to honestly start with the graph. Show it in action with pictures. Show how to always keep references to commits around to ensure work sticks around. Show how branching and stashing work, let them be confident that the tool will keep everything right where you left it.

_Then_, once they're confident in the basics, weave in the remote repositories.

epage · on Jan 16, 2019

> So the way I would teach Git is to honestly start with the graph. Show it in action with pictures. Show how to always keep references to commits around to ensure work sticks around. Show how branching and stashing work, let them be confident that the tool will keep everything right where you left it.

Personally, I think this should be coupled with teaching `git reflog` as the universal undo (as long as they don't `gc`).

hinkley · on Jan 16, 2019

Don't teach anybody git gc. The people who really need to find it will come upon it all on their own.

bagrow · on Jan 16, 2019

Having taught git several times within a data science course I find two concepts especially worth extra time: WHY there is a staging area, and what is the difference between “git” and “github”.

radarsat1 · on Jan 16, 2019

> WHY there is a staging area

I understand your second point, but I have a hard time understanding the difficulty with this part. Why is it hard for people to understand the idea of staging?

You put things in a box one at a time before closing the box. Does it require more explanation than that? What do people find difficult about it?

pjc50 · on Jan 16, 2019

People are very used to the web "save always" style: There is one document, and you're editing it. Most people will be familiar with the traditional desktop "save" model where you have to do something to make your changes permanent.

People often then learn that there is a local file and some remote file: they can cope with a save -> upload workflow. Lots of traditional VCS turn this into a save -> commit workflow.

Git adds two stages to this that people can't see the need for without understanding the internals: an extra step between save and commit, and an extra step after commit.

(The discussion reminds me of all those people who think that if they just start by talking about monads then people will find Haskell easy and natural...)

doubleunplussed · on Jan 16, 2019

There are a whole bunch of layers now, though they're all useful.

1. Is my document saved?

2. Are the changes staged?

3. Are the changed committed?

4. Are the changes pushed to my fork on e.g. github?

5. Are the changes merged into the upstream repository on e.g. github?

rakoo · on Jan 16, 2019

The don't need to understand the internals for this: just knowing that every save you do will be stored forever as-is makes you double-think about what you put inside

JediWing · on Jan 16, 2019

So I have a solid mental of git, and I understand the theoretical need for the staging area.

However, I find the occasions for using the staging area in practice are few and far between, for the simple reason that I can't test and execute the code that's in the staging area without also having the code from the working directory also be there. It feels like after having partially staged some of my working directory, it would be a blind commit with no guarantee that things are working.

Very rare is the situation that I can break out a list of files over here that are for feature A and some over there for feature B, and never the two shall interact.

I think this is probably what most struggle with regarding the staging area, without being able to articulate it.

aurumpotest · on Jan 16, 2019

I use it quite a lot, especially with `git add -p` to stage only parts of a file for an atomic commit.

anyonecancode · on Jan 17, 2019

I second this. It wasn't until I adopted this practice that the staging area really made sense to me. I find it helpful not just for making atomic commits, but as a way of remembering what I was actually doing, so that I can write a good commit statement.

url00 · on Jan 16, 2019

This has never made sense to me. I've seen others say that they commit only parts of a file. How does this scenario start? Are you working on solving one problem, but then notice some other unrelated issue and fix that too, before committing the first change?

acemarke · on Jan 16, 2019

Partly, yes. Or, I'll be working on a task overall, and have to touch multiple files in the process. Then when I'm ready to commit, I review all the modified files on disk, and look for ways to break those down into smaller discrete logical changes. I prefer to avoid "big bang" commits as much as possible, because smaller individual commits are easier to inspect, easier to back out if necessary, and provide a better "story" when inspecting a file's history sometime down the road.

Someone · on Jan 17, 2019

But then, you either never run/tested those smaller individual commits, or you have to do extra work (stash changes, test, restore stash) to do that.

I do not see why a source control system should make it easier to make a commit that hasn’t ever existed on disk and thus cannot have been tested.

I think the better model would be to stash your changes and have an diff editor between the on-disk working copy and the stashed version that allows you to commit a set of changes as several smaller, more coherent commits.

That wouldn’t guarantee that each of those intermediate commits gets tested or even built, but it would guarantee that each smaller commit is in the on-disk copy at some time.

u801e · on Jan 17, 2019

> But then, you either never run/tested those smaller individual commits

Not necessarily. One nice option that the git rebase command has is --exec (which can be specified multiple times). So you can run a rebase and have git execute a command (like running a test suite) for each commit in the branch. If any commit files, the rebase process will stop and let you amend the commit to fix the issue.

> or you have to do extra work (stash changes, test, restore stash) to do that.

I've found that it's easier to write and locally test a given feature and them incrementally stage parts of it and create commits before pushing the code up for review. To me, that's easier than just making a large commit and then trying to split it out into a better set of commits after the fact.

For example, I may write a new method and then call it several places in the code. So my first commit would be to add the new method along with its unit tests and my second commit would be to add calls to it in the code base and update the associated integration tests (if necessary).

JediWing · on Jan 23, 2019

Did not know about rebase with exec! I'll have to try that! Thank you for the insight.

nemetroid · on Jan 16, 2019

One common scenario is that I'm working on one problem, and in the process of solving that issue do some refactoring of related code. In this case, I want to commit the refactoring (which does not change the program's behaviour) before committing the changes that do change the program's behaviour.

jononor · on Jan 16, 2019

I typically then send that first refactoring commit to Github (on its own branch) so that it gets full CI test coverage. And then continue working on the fix/feature while it runs.

u801e · on Jan 17, 2019

One use case is to exclude extra lines of the file you don't want to commit. For example, I might have some debug print statements in my file that I want to keep in my local copy of the file while testing, but I don't want to include in the commit I push up for review.

smichel17 · on Jan 16, 2019

> Are you working on solving one problem, but then notice some other unrelated issue and fix that too, before committing the first change?

Almost. Most often it's:

- Working on solving problem A - Notice problem B - Start to solve problem B - Notice I'm getting distracted from A, and return to finish it. - Want to commit my fix for A, but don't want to lose or forget the partial work on B.

Two different approaches I might take in this situation, depending on whether B is related to A.

1. If they are related (eg, B depends on A), use `add --patch` to commit A, then finish and commit B. 2. If unrelated, use `git stash --patch` to stash B, then commit A, then switch to a different branch to finish B.

Honestly, I see the point of both stash and staging, but not both together. Too many tools for the same job. On my long list of projects to do is a git porcelain that combines some of these concepts (eg, stash and working directory which would be tied to a branch):

- Each branch would have a single stash. - When you check out a new branch, all uncommitted changes are automatically stashed. - If the branch you're switching to has anything stashed, that stash gets popped. - Any current workflow that involves stashing can be replicated by using a branch instead of a stash.

This way, branches can be thought of as "state of the working directory", which is more intuitive with the branching tree model, imo; commits are a snapshot of the repo at that point in time; and the staging area is just a way to choose what should be included in those commits.

pseudalopex · on Jan 16, 2019

Amending the last commit does basically the same thing and records each state in the reflog.

dahart · on Jan 16, 2019

You never amend commits or rebase locally before pushing? I rebase before pushing almost every time.

Git’s workflow wouldn’t even be sane without the staging area. This is what allows you to fix mistakes and make your work presentable for remotes.

pjc50 · on Jan 16, 2019

> Git’s workflow wouldn’t even be sane without the staging area. This is what allows you to fix mistakes and make your work presentable for remotes.

I did exactly the same diff/tidy/diff workflow when I used p4 and svn, neither of which make a distinction between "working directory" and "staging area".

dahart · on Jan 16, 2019

Right, but p4 & svn have “checkout” which is similar to staging. Staging is part of what we get because we can edit files without having to checkout / open for edit.

P4 and svn don’t have a strict commit parentage, which is why you can push commits in those systems in any order. Git’s strict concept of parentage is what makes the staging area so important for keeping your workflow similar to p4 & svn Workflows. Without a staging area, you’d either have to always fix mistakes with new commits, which is bad, or rewrite already pushed history, which is worse.

pjc50 · on Jan 17, 2019

> without having to checkout / open for edit.

The terminology is a bit different - unless configured with mandatory locking (essential for some workflows) you don't have to open for edit. You just edit stuff and it goes in the "default changelist", roughly equivalent to automatic staging.

> Without a staging area, you’d either have to always fix mistakes with new commits

Mistakes at what point? In the normal svn workflow you can review with svn diff, then when you're happy do svn commit; it's just that there's no local place you're committing to. In both cases there's a critical point, either "svn commit" or "git push".

dahart · on Jan 17, 2019

> unless configured with mandatory locking ... you don’t have to open for edit.

I’d guess you’re learning toward talking about svn, which I don’t remember very well, and I am leaning towward talking about p4, which always does mandatory locking.

You’re right the terminology is different between these different systems, I’m just pointing out that the git staging area has what you can think of as some equivalences in the other systems. Or, you can think of it as tradeoffs. Either way, the git staging area is something that helps you pretend like you’re using svn or p4 in the sense that it helps support editing multiple changes at the same time before pushing them to a server.

> Mistakes at what point?

With git I’m referring to mistakes between commit and push. But there’s a philosophical difference here that I glossed over. With git it’s easier to commit early and often than it is with svn or p4. With svn & p4 it’s easier to lose your work because version control doesn’t know anything about it before you push. If I make micro-commits, which I want and I like, then I put more “mistakes” along the way into my local history, and I can use the staging area to clean everything up before I push. With svn & p4, you make those mistakes and do the cleanup without ever telling the version control, and you run a greater risk of losing that work while you do the cleanup.

mixmastamyk · on Jan 16, 2019

Never, and can never remember what rebase actually means.

At work I’ll hit the squash option on gitlabs merge request which moots all local machinations.

whorleater · on Jan 16, 2019

judging by the atrocious management of remote history I've seen at workplaces, "making work presentable" is pretty far down the line of priorities

mcguire · on Jan 16, 2019

Amending commits and rebasing involve the staging area?

dahart · on Jan 16, 2019

Usually. You can also amend and rebase remote commits, but that’s usually a big no-no.

NoodleIncident · on Jan 16, 2019

Committing isn't a commitment. After making the first commit, you can use the `git stash` command to put the rest of your changes aside, and go through the normal test->amend loop until you're happy with that first commit. Then you just retrieve your other changes from the stash to make your second commit.

It's also possible to do this without the stash command, by making both commits right away, and testing them later. However, that would involve rebasing(?) your second commit on top of any changes you end up making to your first commit, so using the stash makes more sense to me personally.

dahart · on Jan 17, 2019

Fwiw, stash can get you into trouble more easily than commit. It’s no more typing to commit or branch, so I recommend preferring those to stash when it makes sense, or when you’re playing with changes you don’t want to lose. Stash is handy for a bunch of things, so use it by all means, just remember that there’s often an equivalent way that is just as easy and much safer.

The git stash man page talks about this: https://git-scm.com/docs/git-stash

“If you mistakenly drop or clear stash entries, they cannot be recovered through the normal safety mechanisms.”

One of the best things about git is how big the safety net is, as long as you tell git about your changes. Almost any mistake can be fixed, so why use features that aren’t sitting over the safety net?

mcguire · on Jan 16, 2019

A scenario:

You're adding a feature to your proggie. That involves modifying the main bits to add the feature and, say, adding a couple of interfaces to internal library modules.

Split out the changes to the library modules into separate commits---it's safe because nothing uses them, they're logically separate from the feature changes (although they don't appear to have a justification without the feature), the log will be marginally cleaner, and git bisect will have more granularity.

noselasd · on Jan 16, 2019

Why is the staging area needed in such a case ? In more traditional systems, you'd just do, say, "svn commit library/" and then commit the rest. (and you could do just the same in git too without seeing the staging area)

scrollaway · on Jan 16, 2019

Understanding the staging area first requires understanding the need for it: The need for atomic commits. The need to create commits that have specific changes in them and are not always a snapshot of the entire world below the git root exactly as is right now.

recursive · on Jan 16, 2019

Yes, it requires more explanation than that. I've used git for years, and never really understood why staging is even a thing.

Your example is an implementation of the box-putting algorithm, but it doesn't need to be mirrored in the put-box CLI.

    put-close-box file1 file2

This command could encompass all the putting and closing. Since you only close boxes when you are done putting things in it, I don't see a need or purpose to split it up.

    put-box file1 file2
    close-box

A closed box (commit) is always going to contain stuff that was put in it, so why separate commands?

ezrast · on Jan 16, 2019

That's not convenient when you're putting things into the box piecemeal, especially with `git add -p`. A thing I do frequently is to run `git diff`, scan through it, and add files (or parts of files) one by one in a second terminal. Then I do a final review of the staging area (with `git diff --cached`) to make sure it only has the changes I want and commit. I'm the sole devops engineer at my company and my workflow is a bit more scattered than a typical developer's.

Anyway, `git commit file1 file2` by itself is most of the way to being the put-close-box function you want; it just doesn't work for adding/deleting files from the repo. Seems like they could make a lot of people happy by closing that gap and letting `git add` be an intermediate-level feature.

recursive · on Jan 17, 2019

To me, that ought to be a concern of the "porcelain", although no one uses that word anymore. CLI is particularly bad at certain types of interaction. So to compensate, a mitigation is moved into the underlying model of git. That mitigation is staging. The inconvenience of "piecemeal adding" could have easily been addressed in the UI layer using a more suitable presentation, rather than forcing all clients to follow the stage/commit dichotomy.

Gys · on Jan 16, 2019

For simple projects (like ppl experimenting with git) you will always want to save all changes. So why stage first ?

tux1968 · on Jan 16, 2019

Not everyone stays a beginner forever, and it's nice to have a tool that doesn't play to the lowest common denominator. It's really not that hard to just do a "git commit -a" if you want to avoid staging.

oblio · on Jan 17, 2019

> Not everyone stays a beginner forever

But the vast majority do, or at best become perpetual intermediates (https://blog.codinghorror.com/defending-perpetual-intermedia...).

99% of developers out there didn't need a power tool for source control (source control is already quite a power tool many devs can barely handle, even in SVN form...), yet here we are: Git is imposed everywhere, with its horrible UX.

tux1968 · on Jan 18, 2019

Git's UX isn't that bad if you're only cloning projects to build them locally and keep them updated. The UX only gets really crufty as you use more and more of the features.

allenu · on Jan 16, 2019

I think people find it difficult because for most beginners at git, they just want to put everything in the box. Having the option to put just some things in the box seems more complicated than needed. Obviously, as you get better with the tool, you realize the power of literally "staging" your changes into multiple commits, but as beginner, it's not even in your purview.

tokyodude · on Jan 16, 2019

My hurdle was 15-20yrs of no staging area from previous VCSes so the extra step took some time to understand why it was needed.

altairiumblue · on Jan 16, 2019

Isn't the staging area closer to an intermediary box? That's where it can get confusing.

detaro · on Jan 16, 2019

Staging puts things in the box, commit closes the box, puts it on the pile with the other boxes, and gives you a new empty staging box.

Double_a_92 · on Jan 16, 2019

But why is it an extra step? It's basically just a "longterm" selection of what you want to commit.

detaro · on Jan 16, 2019

Because you not always want to put everything in the box (and if you do, there's a shortcut to do it), and "git commit file1 folder/folder/ * .cpp folder/folder/ * .h ..." for a complex set would be annoying and require you to mentally keep track of it from the beginning.

Many beginners will start by always doing "git commit -a" and that's fine, as long as they know there's an alternative once they need it.

url00 · on Jan 16, 2019

But why is the exceptional case the default?

Surely, most of the time when you go to commit, it's all the files you've changed?

IggleSniggle · on Jan 16, 2019

Not for me! I often find myself refactoring tangential features while producing a new one. Sometimes that will even intersect in a single file. But that refactoring doesn't come with any changes relevant to the feature I am working on in my branch. So I save them for their own isolated commit(s). While this doesn't happen on every commit, it probably happens for me about every other push. The alternative is bundling in a bunch of changes that have very little to do with the feature that my branch is ostensibly about.

EDIT: Now that I think about it, I also have several repos where I have changes that I never intend to ever commit them, because they are development conveniences for me personally.

Accacin · on Jan 16, 2019

Same for me! Webpack config changes to cache settings, config changes to hit a different API for testing, using a different database for testing. Most of these live in my staging area and get stashed and popped when I switch branches/rebase.

Accacin · on Jan 16, 2019

Not really. I think of my git use case at work pretty simple. I usually stash, pull down, fast-foward and then pop my stash on top. Occasionally I'll need to rebase too. Just to show I'm not a super advanced user or anything.

I'm a JS dev mainly working in React on a web app with a backend team using PHP. Often I'll be working on a branch with maybe 2 or 3 people and I often end up working on a few things at a time. Say I'm working on a feature, and I notice some bug I'll fix that and then get on with my feature. Once I go to commit I pretty much always do a 'git add . -p' and I very rarely want to add all the files I've worked on!

Even things like switching a config file to use a service like apiary where I don't want to commit my change to the config to use apiary.. Or change to my webpack config for testing, etc.

I've used Perforce, SVN and Git and the whole 'staging' area thing always felt very natural to me. Here are the files you've edited, which ones want to be commited? It gives me a second chance to go through and check everything before I've commited, and often that stops me leaving in any odd comments or debug code.

elcomet · on Jan 16, 2019

Almost never actually. I never commit all the changes in my repo (for big projects I often have some small changes in other places, I don't want to commit them)

Double_a_92 · on Jan 21, 2019

My point was more why staging is a special feature that even has a name. You're basically just selecting what changes you want to commit.

What is the usecase where one needs to remember that selection for more than just a few minutes?

fdsak · on Jan 16, 2019

probably related changes grouped together

jordigh · on Jan 16, 2019

The staging area is really an extraneous concept that isn't required. It's like a commit that isn't a commit.

In Mercurial, I much prefer to just make it an actual commit in the draft phase (the default phase) and just keep rewriting that commit. Mercurial provides tools for both selectively adding and removing hunks from a commit (both `hg amend` and `hg uncommit` accept --interactive for hunk selection). If you're extra paranoid, you can make it a commit in the secret phase so it's not shared prematurely by accident.

It's pretty much functionally equivalent and doesn't require an extra location in which your code can be. It's either in your working directory or in a commit.

A bonus of this approach is that now you have a meta-history, hidden by default, of what you've "staged" and "unstaged". It's kind of like a reflog but with, in my opinion, a better UI. And of course, the index/cache/staging area in git doesn't use refs, so there's no reflog there.

nlawalker · on Jan 16, 2019

I've helped move a couple teams (kicking and screaming) from TFS to git, and I start back even further than that - why is it so much more complicated than clicking a button to save and share my work, and what is the benefit of that complication?

tome · on Jan 18, 2019

I'm very experienced with git, approaching expert level, and I don't use the staging area. I use

    git commit --verbose --patch

and bypass the staging area entirely. I don't find it helpful.

Jach · on Jan 16, 2019

Git itself is so simple, it's all the stuff around it that can be overwhelming. Social mores about all the possible workflows are maybe the biggest (rebasing vs merging, granularity of branches and their longevity, acceptance of partial commits reflecting a state never realized in isolation on disk, commit hooks, requirement of every commit to satisfy properties x,y,z, direct access to common parent repo copy vs requiring some sort of pull request flow (and dependence on github and all their stuff)) but there's also work tracking, code review, build automation, test automation, deploy automation...

Animats · on Jan 16, 2019

From the article: "If you take care the commit history, consider the use of git pull --rebase. Instead of fetch + merge, it consists of fetch + rebase. Your local commits will be replayed and you won’t see the known diamond shape in commit history."

No, not how to teach Git.

Open source just can't do good user interfaces. The result is almost always a zillion features in search of an architecture.

Blender managed to almost dig itself out of that hole, but it took over a decade. Gimp is still down in the pit.

Already__Taken · on Jan 16, 2019

Perhaps managing git is finally the business case for VR, Hold my coffee I'm going rebase jumping.

sephoric · on Jan 16, 2019

I was just running into this dichotomy this morning. To test my app Autumn on High Sierra, I had to install it in a VM, so I tried VirtualBox (open source) and Parallels Desktop Lite (in-app purchases). Not only did Parallels have a smoother, cleaner, more modern and easier GUI, VirtualBox just out-right doesn't work when trying to install High Sierra, and I had to find third-party instructions online just to bypass this bug. Plus it likes to crash right after shutting down the VM. I'm not really sure if there's some deeper philosophical reason behind this dichotomy, but I've seen it hold true for a lot of apps and their open source alternatives, and many people have said the same thing holds true about my app Autumn and open source alternatives like Hammerspoon and Mjolnir. As a rule, we really do seem to get what we pay for.

y4mi · on Jan 16, 2019

You're pinning THAT on opensource?

a) VirtualBox is an oracle product. That by itself should be telling.

b) High sierra is unsupported as a Guest OS in VirtualBox. You do know what that means, right?

c) You seriously complain about the darth of open source virtualization for an OS, which disallows virtualization on anything but apple hardware? ...really?

If you want good open source virtualization you'll want to use Qemu/KVM. Which obviously doesn't support any apple OS either, because they're not allowed to virtualize it. Take that up with Apple, not open source

sephoric · on Jan 16, 2019

It was the most recent thing I did (literally this morning) so it was fresh in my mind. I'm also not an expert in virtualization solutions given how I haven't needed to use it until now. But this has held true for many, many apps and their open source alternatives. Paid products generally tend to be higher quality than open source, for whatever reason.

saagarjha · on Jan 16, 2019

> Paid products generally tend to be higher quality than open source, for whatever reason.

Putting aside the fact that this isn't true, and that there are quite a few quality open source apps on macOS, it's pretty clear why paid products have higher quality: the bar for people to buy them is much higher, so they generally need to be at least somewhat decent for people to consider paying for them.

saagarjha · on Jan 16, 2019

Virtualizing macOS on a macOS host is allowed, and I think QEMU will let you do this.

y4mi · on Jan 17, 2019

There wording was iirc "you're allowed to run one instance of Mac OS per Apple chipset."

So, virtualization is technically only allowed if you're running your apple hardware with anything besides Mac OS.

But apple isn't enforcing that limitation, as products like VMware fusion on Mac OS are (at least as far as I can tell) officially sanctioned.

saagarjha · on Jan 17, 2019

Here's the phrasing for macOS Mojave:

If you obtained a license for the Apple Software from the Mac App Store or through an automatic download, then subject to the terms and conditions of this License and as permitted by the Services and Content Usage Rules set forth in the Apple Media Services Terms and Conditions (https://www.apple.com/legal/internet-services/itunes/) (“Usage Rules”), you are granted a limited, non-transferable, non-exclusive license:

(i) to download, install, use and run for personal, non-commercial use, one (1) copy of the Apple Software directly on each Apple-branded computer running macOS High Sierra, macOS Sierra, OS X El Capitan, OS X Yosemite, OS X Mavericks, OS X Mountain Lion or OS X Lion (“Mac Computer”) that you own or control;

(ii) If you are a commercial enterprise or educational institution, to download, install, use and run one (1) copy of the Apple Software for use either: (a) by a single individual on each of the Mac Computer(s) that you own or control, or (b) by multiple individuals on a single shared Mac Computer that you own or control. For example, a single employee may use the Apple Software on both the employee’s desktop Mac Computer and laptop Mac Computer, or multiple students may serially use the Apple Software on a single Mac Computer located at a resource center or library; and

(iii) to install, use and run up to two (2) additional copies or instances of the Apple Software within virtual operating system environments on each Mac Computer you own or control that is already running the Apple Software, for purposes of: (a) software development; (b) testing during software development; (c) using macOS Server; or (d) personal, non-commercial use.

navinsylvester · on Jan 16, 2019

No one is appreciating the effort and a different take here. Let me congratulate the author on job well done.

There are people who love illustrated explanation and for those these are perfect. This is just meant as a template which others can use to build the illustrated material and in no way a comprehensive git tutorial.

e40 · on Jan 16, 2019

I completely agree. I've seen a lot of these, and this one is simple, to the point and easy to understand.

max23_ · on Jan 16, 2019

Agree, I think this is sort of like a very simple introductory on what git is about and to me it does the job.

allenu · on Jan 16, 2019

I gave a git talk at work recently and what I found works was teach the graph from the beginning. This includes a lot of diagrams of what the graph looks like as you commit and branch:

- show what happens as you add commits to a branch - show that branches are just pointers to commits - show what creating new branches looks like, i.e. creating a new pointer - show what merging looks like (a new merge commit is added, or else fast-forward merge and that the branch pointer just moves) - show what happens if you don't rebase (i.e. "ugly" non-linear graph) ... then teach what rebasing does (i.e. creates new nodes and moves the pointer)

I found that building up from the ground up and illustrating graphs allowed people to conceptualize things much better. There was still some confusion once merging was introduced (what happens to those earlier commits? do we need them?) and mostly because people hadn't thought of the graph before.

Git is one of those tools where you can totally do your job just by knowing the basic commands but not really know what is happening under the surface, which I think is a testament to the tool. But, that leads people to conceptualize their own idea of what is going on... and getting it wrong and being confused when they want to do something outside of their basic toolset.

da_chicken · on Jan 16, 2019

Yeah, it was definitely a set of images [0] that made git click for me, but even those are not necessarily the best pictures to use.

Git's mental model is basically:

> Ok, what if we took our SVN repository and the first thing we did was check in an SVN repository to that. So now you check out the repository, and you get a complete local repository that you can do anything you want with!

> So Pull/Push are the terms we use for the check out and commit of your local repository to the remote repository, and then Checkout/Commit works from your local repository just like you're used to with SVN.

And the real magic is what they did by building a system on top of this model to let you merge changes all the way up while looking at the code like you'd expect you'd need to.

The problem is that git's toolchain still feels arcane to use, and it requires that you have good working knowledge of the underlying models. It's confusing enough that you can't function unless you have that because you don't know where you are or where you're going. It's a fantastic tool, but it's like driving a car with two steering wheels, two gear shifters, and six pedals. Then you say to yourself, "How do I get to the market, buy some milk, and come right home?" and your brain starts to melt a little bit.

You shouldn't need to know the nitty gritty of how git works internally just to get it to work right any more than you should need to know how a disk works in order use a file system, but over and over we keep seeing that knowing that is really the only way to use the tool correctly and that it takes quite awhile for people to get.

[0]: https://blog.osteele.com/2008/05/my-git-workflow/

ramses0 · on Jan 16, 2019

I did this quite a while ago following your same teaching methodology. Feedback welcome! :-)

http://www.robertames.com/blog.cgi/entries/git-in-two-ten-mi...

Wehrdo · on Jan 16, 2019

When I was first learning git, I found an online visualizer like this [0] that really helped make make concrete the ideas of git history being a graph, and what different operations did on that graph.

There was still obviously the issue of memorizing the commands, but at least I knew what the commands were doing on a deeper level.

[0] https://learngitbranching.js.org/

collyw · on Jan 16, 2019

The other problem I find in git is that there are many GUI interfaces and none of them are consistent. In Eclipse I had a different interface depending on what project I opened, despite both the projects being in Python.

ddevault · on Jan 16, 2019

I disagree with this idea. The best way to learn git is to read the git book, in this order: chapters 1, 10, 2, 3, and the rest at your discretion. This way teaches you about the internals first, and if you understand the internals the rest of git is pretty intuitive.

https://git-scm.com/book/en/v2

Mirioron · on Jan 16, 2019

Great idea. This kind of culture is why there are a lot of people that don't and probably never will use git.

u801e · on Jan 16, 2019

There's something to be said about reading documentation rather than relying on stackoverflow answers or possibly inaccurate tutorials.

Substitute C, Java, Python, etc for git. You can probably do something with those languages, but you aren't going to get very far without reading some sort of documentation.

someguy101010 · on Jan 16, 2019

The article is about how to get a more fundamental understanding idea on how git works, and this book demonstrates fundamental ideas on how git works. I don't see a problem in this reccomendation. If you want just a cursory knowledge of how to use git to get by, this probably not the right choice, but that's not really what this discussion is about is it?

mcguire · on Jan 16, 2019

Well, that is what the article is about.

rkangel · on Jan 16, 2019

Unfortunately, it isn't possible to effectively use git without knowing something about the internals. You can do the basics taught more 'by rote', but sooner or later you're going to run into something unexpected, or something complex you need to do and you need to understand the data model in order to have a chance of sorting it out.

collyw · on Jan 16, 2019

That's a lot of reading for a tool that should be making life easier.

ddevault · on Jan 16, 2019

It's an engineering tool. You'll be using it all day every day for the rest of your career, the investment is worth it.

0xffff2 · on Jan 16, 2019

That's simply not true for many Git users. I'm a developer, so I do use Git every day, but I work with a bunch of researchers that absolutely do not need to use Git more than once a week at most. Convincing those people to care enough to learn the internals has been a constant uphill battle for me.

pjmlp · on Jan 16, 2019

I will save this answer for when anyone complaints about C++ or Rust being complex languages.

ddevault · on Jan 16, 2019

How about now :) Rust and C++ are complicated languages, and that's bad.

But git isn't complicated. Git is a handful of simple ideas composed in interesting ways. It looks complicated because there's a lot of porcelain commands with a lot of options, but all of them are just manipulating the same simple internals which, once understood, are clear and intuitive.

mcguire · on Jan 16, 2019

Many people would disagree---the permutation of simple ideas gets...less simple quickly.

On the other hand, I've used ClearCase.

(Do not use ClearCase.)

pjmlp · on Jan 16, 2019

That is the thing, dealing with git issues brings back flashbacks of using Clearcase views on UNIX.

pjmlp · on Jan 16, 2019

I see you got my devil's advocate gist. :)

Thing is, dealing with git feels like being in the early 2000's using Clearcase views.

collyw · on Jan 17, 2019

The difference being that you will likely spend hours of your day thinking in rust, while git should be taking minutes of your day, but often ends up taking hours when you screw up a command and need to restore things to how they were.

collyw · on Jan 17, 2019

You could say the same bout Stack Overflow, but its a lot more intuitive.

jecxjo · on Jan 16, 2019

I like the diagrams, as many explanations often skip the staging portion of the whole git process. I use that so much that it baffles me that most people skip it. Then again, I typically don't like having tons of "WIP" commits so I stage a lot and if I need to switch to another branch I'll commit a WIP that I quickly `rebase -i` to get back to a clean status.

As for the teaching part, I have found the best results by having you and the learner actually "working" on the project at the same time on your own machines both pushing to a centralized server. It becomes too easy to go over all the commands and feel like you understand it. Most of the scary parts of git are used when its multiple developers on the same project. And oddly enough, having the learner do both Developer A and Developer B's tasks don't seem to work as well as having the learner just do Developer B while I do Developer A. Trying to explain to someone when to use merge, when to rebase, and when to use cherry-picking to get the code I just pushed into their current working branch can be done so much easier when its hands on AND knowing exactly who is doing what steps.

jstimpfle · on Jan 16, 2019

I found this very helpful: http://eagain.net/articles/git-for-computer-scientists/. Git's data structures are well designed. Once you have internalized them, you will be better equipped to navigate through the jungle of command-line options.

To understand the data structures interactively, use "git cat-file -p HEAD" and continue drilling down to an individual file in a subdirectory with "git cat-file -p OBJECTHASH"

jejones3141 · on Jan 16, 2019

I've never had to understand the internals of a web browser or text editor in order to use it; drivers ed courses don't start with a discussion of thermodynamics. Why should it be necessary for git?

JoeAltmaier · on Jan 16, 2019

Because every source management tool has a model, and to use it at all you need to know the model. Else you're jabbing buttons and turning dials on a complex machine and the outcome is going to be tragic.

oblio · on Jan 17, 2019

I've used SVN reasonably well without knowing its internal model. I kind of knew a bit about it, I'd never call myself an SVN expert and I still managed to do my job efficiently.

Git fails majorly in this regard.

JoeAltmaier · on Jan 17, 2019

Maybe because git has a different model. Learn that, the objections go away. As the OP attempts.

oblio · on Jan 17, 2019

This logic is kind of circular.

"Git sucks, the UX is atrocious, I don't want to spend half my life learning a tool that shouldn't even need that much hand holding."

"Learn Git!!!"

delian66 · on Jan 17, 2019

No the logic is not circular, and the advice to learn git is a good one.

It may seem paradoxical at first to you, but is true (as are many things in this profession). Another paradoxical advice like that, is to learn vim, or emacs, but I digress.

Git does not suck - as any other tool it, it just has strenghts and weaknesses (for example working with very large binary assets is its main weakness).

The UX of most common git CLI operations is clean actually, as they are fast, and you do not need many arcane options (although they are there, and are documented well, for people who read...).

If you screw up something, you just use the reflog to fix the state of your repo in most cases. Even if you can not (or do not want to), the troubleshooting is still easy - you can always do a fresh clone from your remote repository in a new folder and copy what you want there.

oblio · on Jan 18, 2019

You're assuming that I haven't read about Git. I've read a ton about it and its internal data structures. And regarding your digression, I'm a vim user.

Regarding Git, Git does suck. It does the job Linus designed it to do, but that job is not most software engineers across the world need it to do.

In smaller or in corporate shops, Subversion was almost adequate and several bad implementation details, mostly related to branching, led to its demise. So that world needed Subversion++, not Git.

In the FAANG world, there's basically no company that uses Git as-is. It's strength/weaknesses aka tradeoffs aren't good enough for them.

Git won because tech is a popularity contest and people in our domain like to do a lot of virtue signalling ("this tool is hard to use, I use it, so I'm special/cool").

delian66 · on Jan 19, 2019

My response was to your sarcastic mini dialog above my reply. Do not try to read other peoples minds - it is impossible, and if you really want to, you can simply ask.

>> It does the job Linus designed it to do, but that job is not most software engineers across the world need it to do.

Speak for yourself, you are not most software engineeers.

>>In smaller or in corporate shops, Subversion was almost adequate ...

I have administered SVN profesionally for several years (2005-2007), and was paid to unfuck screwups made by other developers using it (which many times involved restoring from incremental hourly backups done on the SVN server side). Dealing and helping others with their git problems is many times easier.

The FAANG world (which I had to google just now) I imagine has unique requirements (many teams that must coordinate, super gigantic legacy source code base), and the resources to do whatever they want (money, humans to develop and maintain tools and do research). For them, the integration pain from managing multiple smaller repos may be significant. Outside this world however, teams are more independant and the source code size is much much smaller (even for legacy projects).

>> Git won because tech is a popularity contest

You have a point here, but this factor (and network effects in general) is just inertia, and does not explain why git won, given that for example SVN or Perforce had such a head start (in tooling, and in mindshare), and there were other distributed contenders like mercurial and darcs and BitKeeper developed at aproximately the same time, and even earlier. It won in my opinion because it was simply superior tech - faster, good enough and very very easy to get started.

(edited to clean up formatting)

jstimpfle · on Jan 18, 2019

Git won because it is technically superior (IMHO, but I've used most other VCS only sparingly). It's well designed, flexible, and fast. Apart from a command-line interface with some unfortunately named options and a "big file issue" (that has never been a problem for me), I don't think there is anything wrong with it.

What do you think is wrong about it? Or have you only "read a ton" but never used it for a while? In the latter case, I suggest you start with the things I mentioned above, and make good use of "git reflog" as suggested by somebody else. If you know git reflog, "delete tree and clone a fresh copy" is not a thing anymore.

> In the FAANG world, there's basically no company that uses Git as-is. It's strength/weaknesses aka tradeoffs aren't good enough for them.

Do they use svn then?

jstimpfle · on Jan 17, 2019

git is much more powerful. It gives you control to manipulate the repository like say, a relational database. If svn is enough for you, that's fine.

oblio · on Jan 17, 2019

> It gives you control to manipulate the repository like say, a relational database

And how many people do that? Not that many. This angle is a bit like the guy who said that he doesn't want Unix file names to be UTF-8 text-only, because he crafted a sort of relational database on top of a Unix FS and by having file names be just text he couldn't do some super niche trickery. I think it was in response to this article: https://dwheeler.com/essays/fixing-unix-linux-filenames.html

> If svn is enough for you, that's fine.

It is, but every job these days forces you to use git. And most places I've worked at, git is used as a glorified SVN where people just have a 2-step commit to a remote server.