Git in one image (githubusercontent.com)
173 points by guptarohit on Feb 12, 2022 | 65 comments



Lots of confused beginners in this thread. IMO diagrams, and tutorials, that take this approach to teaching git are the reason people have such a hard time. After learning the very basic commands, the next step is to learn the internal data structure of git at a conceptual level. This may seem like a bad design to some, and that is a reasonable thing to debate. It doesn’t change the reality that knowing basic git internals makes git much easier to use.


I disagree that you need to know the internals. You need to understand an abstract model, which may have been influenced by the internals of early versions, and parts of those internals may have been preserved in current versions. But unless you want to modify git, you should not have to know anything about the internals. The abstract model is enough. I consider myself to know virtually nothing of git internals, yet I consider myself proficient in understanding its abstract model and using it.


Or so you may think. Working with diffs / merge conflicts already exposes you to internals. Knowing that committing big binary blobs is a bad idea also could be categorized as "knowing internals". Knowing why LF/CRLF leads to conflicts (without setting .gitattributes) also is knowing git internals.
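To make the LF/CRLF point concrete, here is a minimal sketch in Python, not git's own code. It relies on one documented fact about git's SHA-1 object format: a blob's ID is the SHA-1 of the header `blob <size>\0` followed by the raw bytes. The helper name `git_blob_id` and the sample file contents are made up for illustration.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """ID git gives a blob in the SHA-1 object format: sha1(b"blob <size>\\0" + content)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical file saved with Unix and with Windows line endings.
lf_version = b"first line\nsecond line\n"
crlf_version = b"first line\r\nsecond line\r\n"

# Same text to a human, different bytes, so git sees two different blobs.
print(git_blob_id(lf_version) == git_blob_id(crlf_version))  # False

# Compared line by line, every line differs, which is why a line-ending
# change can conflict with any other edit to the same file.
changed = sum(
    a != b
    for a, b in zip(lf_version.splitlines(keepends=True),
                    crlf_version.splitlines(keepends=True))
)
print(changed)  # 2
```

With no line-ending normalization configured, git stores the bytes as they are, so a file rewritten with different line endings looks changed on every line.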


It really doesn't, because you can know how HEADs, the working index, and the Merkle tree work, and even know how you want the internal state to be transformed, but still run into "ok, what is the command for doing what I want? Is it git-stamp-log or git-execute-detached-head?"


The approach I take when I'm not sure of which subcommand to use is to read through the git man page and skim through the commands listed. If one looks like a candidate, I'll open up the man page for that command and see if it's what I need.


Anyone who doesn't really understand what a graph is, and what a branch means in terms of graphs, has nothing to map those commands to. Sure, they form some internal model. Maybe "commit = save", "checkout = load", but load what? Oh, load that commit id, okay. But now they are at the "final v2 ZIP.ZIP" stage, not terrible, but they'll bleed out on the first git pull that results in a merge conflict :/

Speaking of merge: I've been a git user since ~2010. Submodules, interactive rebase, filter-branch, sure, they are simple from a graph perspective, but I still don't fully understand how git knows which diffs to pull up during a merge conflict... :D ... and now, reading the answers here, it seems that when it does the merge it uses the graph only to find the merge-base, and then it's just a 3-way merge, which doesn't really map to individual commits. (And because the 3-way diff looks at the raw text, this means that during the merge each branch is effectively "squashed" into a single commit on top of the common base.)
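The "graph only for the merge-base, then 3-way merge" idea can be sketched in Python. This is a toy model under loud assumptions, not git's implementation: the commit IDs, `PARENTS`, `SNAPSHOTS`, and both function names are hypothetical, and the merge works on whole files, whereas real git also merges line by line inside a file.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy history: commit id -> parent ids.
# ours:   A <- B <- C
# theirs: A <- B <- D <- E
PARENTS: Dict[str, List[str]] = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["D"],
}

# Hypothetical snapshots (path -> content) for the three commits the merge reads.
SNAPSHOTS: Dict[str, Dict[str, str]] = {
    "B": {"a.txt": "1", "b.txt": "1"},
    "C": {"a.txt": "2", "b.txt": "1"},
    "E": {"a.txt": "1", "b.txt": "3", "c.txt": "new"},
}


def ancestors(commit: str) -> Set[str]:
    """All commits reachable from `commit`, including itself."""
    seen: Set[str] = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def merge_base(ours: str, theirs: str) -> Optional[str]:
    """A common ancestor that is not an ancestor of another common ancestor."""
    common = ancestors(ours) & ancestors(theirs)
    for candidate in sorted(common):
        if not any(candidate in ancestors(other) for other in common - {candidate}):
            return candidate
    return None


def merge_snapshots(base: Dict[str, str], ours: Dict[str, str],
                    theirs: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """File-level 3-way merge: take whichever side changed relative to base."""
    merged: Dict[str, str] = {}
    conflicts: List[str] = []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o          # both sides agree
        elif o == b:
            result = t          # only theirs changed
        elif t == b:
            result = o          # only ours changed
        else:
            conflicts.append(path)  # both changed, differently
            continue
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged, conflicts


base = merge_base("C", "E")
merged, conflicts = merge_snapshots(SNAPSHOTS[base], SNAPSHOTS["C"], SNAPSHOTS["E"])
print(base, merged, conflicts)
```

Note that commit D never appears in `SNAPSHOTS`: only the base and the two branch tips are read, which is the "squashed" effect described above.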


Agree. Definitely think trying to learn the model first just results in confusion. First learn a few basic commands so you can get by, then delve deeper.


Looks great!

My 12-year-old questioned why a 'pull request' is named the way it is. Obviously it's after the command 'git pull' which fetches the latest changes, then merges them, but it's confusing according to him, and should be called a 'merge request' instead. I don't disagree.


My understanding is that GitHub called these "pull requests" because they do something similar to the `git-request-pull` command (see https://git-scm.com/docs/git-request-pull).

I quote

> Generate a request asking your upstream project to pull changes into their tree. The request, printed to the standard output, begins with the branch description, summarizes the changes and indicates from where they can be pulled.

GitLab, on the other hand, _does_ call them "merge requests" because, to your 12-year-old's point, that is what they are.


"git pull" doesn't merge the "latest changes"; A pull is fetch+merge, and you can fetch from any repository, so a pull request is saying "please fetch commits from my repository and merge them".

I think git made a mistake in making pull the default (fetch+explicit merge is better) but "pull request" is correct terminology.
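A rough Python sketch of "pull is fetch+merge, and you can fetch from any repository". Everything here is a hypothetical toy (`ToyRepo` and its methods are not git APIs), and it assumes a linear history where only fast-forward merges are possible.

```python
from typing import Dict, Optional


class ToyRepo:
    """Toy repository: linear history only, fast-forward merges only."""

    def __init__(self) -> None:
        self.commits: Dict[str, Optional[str]] = {}  # commit id -> parent id
        self.refs: Dict[str, str] = {}               # ref name -> commit id

    def commit(self, branch: str, commit_id: str) -> None:
        self.commits[commit_id] = self.refs.get(branch)
        self.refs[branch] = commit_id

    def fetch(self, remote_name: str, remote: "ToyRepo", branch: str) -> None:
        """Copy the remote's commits and record a remote-tracking ref.
        Local branches are left untouched."""
        self.commits.update(remote.commits)
        self.refs[f"{remote_name}/{branch}"] = remote.refs[branch]

    def is_ancestor(self, old: str, new: Optional[str]) -> bool:
        while new is not None:
            if new == old:
                return True
            new = self.commits[new]
        return False

    def merge_ff(self, branch: str, other_ref: str) -> bool:
        """Fast-forward `branch` to `other_ref` if no history would be lost."""
        target = self.refs[other_ref]
        current = self.refs.get(branch)
        if current is None or self.is_ancestor(current, target):
            self.refs[branch] = target
            return True
        return False  # a real merge commit would be needed; not modeled here

    def pull(self, remote_name: str, remote: "ToyRepo", branch: str) -> bool:
        self.fetch(remote_name, remote, branch)
        return self.merge_ff(branch, f"{remote_name}/{branch}")


# The maintainer's repository.
upstream = ToyRepo()
upstream.commit("main", "A")
upstream.commit("main", "B")

# A contributor starts from upstream and adds a commit of their own.
contributor = ToyRepo()
contributor.pull("origin", upstream, "main")
contributor.commit("main", "C")

# Acting on a "pull request": the maintainer pulls from the contributor.
print(upstream.pull("contributor", contributor, "main"))  # True
print(upstream.refs["main"])                              # C
```

The two-step split is visible in `pull`: after `fetch` alone, `upstream.refs["main"]` would still be B and only `contributor/main` would point at C.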


Is pull the default? What does that even mean? I certainly tend to fetch, checkout origin/master, create a feature branch, push, then merge in github’s web interface.

I «never» pull and I don’t have to change any settings.


pull is "default" in that everyone gets taught to use it and you have to unlearn that once you realize it's not what you want most of the time.


this statement surprises me, because `git pull --rebase` is practically my most frequently used operation, besides commit, push and changing branches.

intellij even comes with the concept of update, which is either a `pull` or a `pull --rebase`

it asks u once which of those should be the default behaviour, then all i do is cmd-t to updaTe (aka git pull --rebase) my current branch, then push my commits, after a potential 3-way merge, which intellij has the easiest-to-learn interface for, imho.


"git pull --rebase" is not "git pull", it's a different operation. I agree it would be a better default to rebase instead of merge though.


You can make --rebase the default:

git config --global pull.rebase true

I have this set because it's never intentional when I do a pull to end up with an ugly merge commit


In the early days it was literally a “request to pull from me” - one would literally email Linus (for example) “hey, I made X changes, consider pulling them from Y (an attached file, or a server)”

Linus talks about this in a great panel he did at Google 14 years ago: https://youtu.be/4XpnKHJAok8


That’s an opening to discuss the differences between git and github, since pull requests are not a part of git.

(I agree that merge requests is a much better name.)



git as designed is truly decentralized and doesn’t consider your copy of Linus’s kernel tree to be much different than his - and so you ask him to pull from you.

Of course we immediately centralized around GitHub and turned it into a glorified SVN but that’s the fate of all decentralized entities.


Of course, but 'pull' doesn't seem to be a very good description - Linus will 'fetch' your branch, and then 'merge' it into main, or whatever.


Fetch+merge is literally the definition of pull, so it cannot be more or less appropriate.


Gitlab calls them merge requests


Gitlab tends to work with multiple developers working in a single repo where branches are pushed and the command to integrate the changes from one branch to another is git merge.

Github tends to work with forks - where each developer is working in their own fork. When the work in the fork is complete, since it is a different repository than the original one, in order to integrate the changes the command to invoke is git pull.

The difference reflects how the typical developer works with integrating changes on that hosting solution.


Wisely.


This used to be one of my there's-no-wrong-answer interview questions: what's the (philosophical) difference between a pull request and a merge request.

Engineers start to think out loud about it, mention github and gitlab, etc etc so I got a meta-level scan of their experience (or the lack of it) with these tools.

Of course this was just a warmup (or cool down) question before (or after) the real git/engineering related questions.


Yep. At Google it is "change list", which I think is good because it isn't a "request" until you send it to someone to merge it, which you don't always do. Maybe "diff list" would be even better. Or maybe we could have just stuck with "patch" or "patch set" from back when people would just send these around in emails.


Ah, miss creating CLs and using fig. I now use git but always liked fig/mercurial better and wish it had won the version control space. When people ask me what I liked about mercurial, I tell them that I could rebase, cherry pick, branch and do whatever else I could think of without needing to Google it or going to stackoverflow, it was just intuitive. I can’t seem to grok git that way. Any time I need to do anything more than branch, stage, commit, push, I have to google and read man pages.


Yep, mercurial is great, much better design at the API level, though I don't know the internals of either of them. Fig is pretty good but it feels a lot slower than git; I think it's doing more stuff remotely than I expect it to.


If you like Fig and want a similar (but better, IMO) UX that can be used with Git repos, you may want to try my project: https://github.com/martinvonz/jj. It's also pretty fast (can rebase >1k commits in <1s in the Git source repo, for example).

I'm on the Fig team at Google, but the project is my own.


Cool! Thanks for the link!


+1 for Mercurial.


In git, "merge" is a local operation: it operates on two branches already available in the local repository. In order for you to incorporate somebody else's changes, they would have to first push their changes to you, or ask you to pull theirs.


> Obviously it's after the command 'git pull'

I'm not entirely certain about the history of the term "pull request" as it pertains to Github, but git itself has a subcommand called request-pull that will provide information about changes between two commits, where it can be fetched from, and a short log (a list of commit titles grouped by author), as well as the overall diff.


This was something that stuck in my brain for the longest time, until I framed it as "requesting to have a patch pulled [away] into the repo to be merged." Then it all made painfully simple sense. I don't know why I got stuck on it for as long as I did. Maybe it's an English language context thing: change pulls are You pulling, but pull request is Them pulling.


The most recent Computerphile YT video [1] is about the inner workings of git (Inside the hidden git folder), which I found quite interesting. Anybody reading this who came across some previous article about how git works and couldn't be bothered (just like me) may find this interesting.

[1] https://youtu.be/bSA91XTzeuA


Coincidentally there was a pretty good Computerphile video released today about how git is working under the hood: https://www.youtube.com/watch?v=bSA91XTzeuA


Ah, no wonder I’m so confused by git.


I am fully convinced that there’s a “better” way out there waiting to be discovered. I think it could even be built on top of existing git internals. But the user story of actually interacting with git is very bad.


Then there’s NDPSoftware’s Git Cheatsheet which is interactive.

Click or tap the backgrounds or the arrows.

https://ndpsoftware.com/git-cheatsheet.html#loc=local_repo;


I use Github but every time I do something it feels like I’m going to break the whole project. Looking at this map, I realize I basically know nothing.

Where did all you guys learn how to use GitHub?


This is a diagram illustrating Git, not GitHub. Git is a version control system. GitHub is a hosting platform for Git repositories that adds a web interface and social features.

I learnt how to use Git by picking up the basics, then whenever I needed to do anything new or whenever something didn’t make sense, I read about how that part of Git worked. Also, I read the Pro Git book, which is free and linked to from the Git homepage. If all you ever do is look for a quick fix or use a cheat sheet when something doesn’t make sense, you’re the “One year of experience ten times” developer, so try not to do that.

Learning how GitHub works after you understand Git is easy.


I'm a big fan of git's tutorial manpages: gittutorial, gittutorial-2, and especially gitcore-tutorial which digs into the actual on-disk representation to let you understand what the upper layers are actually doing. Understanding the underlying model helps you reason about how to accomplish what you want to do.

Manpages available on your local system, or they're online here:

https://git-scm.com/docs/gittutorial

https://git-scm.com/docs/gittutorial-2

https://git-scm.com/docs/gitcore-tutorial


Step one is to learn the difference between git and GitHub. ;)

Jokes aside, there is a wealth of literature on how to understand git from the ground up. I strongly recommend reading through one of those. I can't recommend anything in particular, but I'm sure someone else will chime in with their favourite.


[shameless plug]

Since you asked, I just published Head First Git. I did a "Show HN" that didn't get much traction, but rather than repeating what I said there, I'll just post the link—https://news.ycombinator.com/item?id=30072348

[end shameless plug]

And to the point many others made, there are a _lot_ of resources out there, but if you are looking to start back at the basics, I feel my book (which is designed for beginners) might be a good start.

Of course there is the canonical Pro Git (https://git-scm.com/book/en/v2) book.

And once you understand the _Directed Acyclic Graph_ I feel https://github.com/git-school/visualizing-git is a great project to see what happens when you run certain operations in Git.

Perhaps you've concluded that I love Git, and I love teaching it—I am @looselytyped on Twitter (DMs are open) if you want to reach out and we can carry out a conversation elsewhere—always willing to talk Git.

Edit—I replied to the wrong comment. My apologies kqr. This was meant for VeninVidiaVicii.


As others have mentioned, make sure you understand the difference between git and github.

The way I built an understanding of git was to create a local git repo, and start working with some plain text files. I tried the various commands - add, commit, branch, merge, and observed what happened to each file. Take this a step further and start working with a remote - you can use github.

This is how I built a mental model of what git and its commands do.


If you like video content, Jessica Kerr has a very nice talk on git.

https://youtu.be/yCh6TSLIQBQ


OK but it shouldn't be necessary to know what's going on under the covers of your source code control system.


It isn’t necessary to know what’s going on under the covers of git. This map isn’t showing under the covers either; it’s just diagramming the shape of a repo.

It is, however, necessary to have a mental model of what your source control system actually does above the covers, and how to use it. It is important to know that you have a staging area and a local repo and a remote/upstream repo. It is important to know that each commit has a parent commit (or occasionally two or more parents). It’s important to know where your commits go, how to put them there, and how to get them back later.


This is barely "under the covers". It really just pictures the core ideas behind decentralized version control systems such as git. You can't reason about decentralized version control without many of these core ideas.


A lot of things aren’t necessary.


Probably the best way I have seen so far to explain git.

Whoever made this: Good work!


I'd say the creator of the graph is probably the owner of the repository, based on the commit records here: https://github.com/JannikArndt/git-in-one-image/commits/mast...


Interesting idea, I’d love to have more context; to hear who this is aimed at, and what it’s for or how it’s meant to be used. Is this helpful for either git newbies, or git experts? I feel like if you don’t already know git, this map doesn’t actually explain what rebase or stash or clone or any commands really do without a very long side-explanation attached. The image doesn’t clarify what the wide arrows (e.g. reset) do differently than the solid lines or the dashed lines. It’s cool to see the local & remote repos sort-of mapped out if you know git, but it seems like the explanatory power of this might be lower than talking about it, or maybe mixing many images, even for experts? Sorry I don’t mean to be overly critical, just curious what the goal is.


Does anyone know which tool can be used to make such visualizations?


Inkscape (free, Linux mainly), Gravit Designer (cloud app, happy user), LucidChart (cloud app, great IMO but disclosure: ex employee), Visio (Windows), OmniGraffle (Mac), Illustrator ($$ but excellent).


Graphviz (dot diagram notation) or Mermaid diagram editor. Emacs has good support for rendering dot notation.


Same energy as: https://ptrthomas.files.wordpress.com/2006/06/jtrac-callstac...

In that the image works equally well as elegance for some, and criticism for others.



Hey, can we get a rendered png somewhere? plz? (with a reasonable resolution for full HD screens)


I personally find this to be the best video on git by far:

https://www.youtube.com/watch?v=2sjqTHE0zok (MIT OpenCourseWare, Missing Semester).

1h25mn well spent


This only explains one particular git flow. A simpler explanation is possible especially when you don't treat the named branches in a special manner


Well done. Do +X/-X represent exactly what is present in "changes"? If so, consistency in terminology would help.


I wonder if torvalds had something like this on a whiteboard at some point. Or just in that big juicy brain of his maybe.


Good job!


Wow, good job!



