Git is the C of Version Control Tools (ericsink.com)
36 points by sarvesh on Feb 17, 2009 | 20 comments



Many git users say that the problem with learning git is to unlearn all the stuff you know about CVS or SVN, because git is just not the same. Now I understand what they're talking about.

Because Eric Sink is a smart guy, but he doesn't understand git yet. He's still following the Subversion workflow. The secret is that "git commit" is not a sacred act. Don't be fooled by the words. When you do an "svn commit", the result is written in stone in some centralized repository where it can break everyone else's builds, now and forever. When you do a "git commit" the result sits on your machine. It goes nowhere. This is distributed version control -- a commit just changes your local repo, and if it doesn't work you just undo it. (Or, alternatively, make seven more commits in your efforts to fix it, then use the mighty "git rebase --interactive" to smash all of them together into a single, perfect commit. Then you pretend that it was correct all along. Nobody else will be the wiser. You'll look like a genius!)
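
Roughly like this, say, with a made-up number of cleanup commits:

    git commit -m "fix the tests (attempt 4)"
    git rebase -i HEAD~7     # mark the extra commits as "squash" in the editor
    # result: one clean commit, as if you'd gotten it right the first time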

So, if you want to pull half the hunks out of a previous commit, make a branch (you can make branches more easily than blinking -- this is git), pull out half the hunks, commit the result. Then compile and test. (If you like, you can probably use a post-commit hook to compile and test automatically.) If the tests pass, merge your experimental branch back into your development branch, then do the "git push" to the central repository. It is that push -- not any of your commits -- that is the point where you have to be sure that you're not breaking anything. And you can't use git's index to push files that have never been visible to your compiler and your tests.
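
Sketched out, with a placeholder branch name and test command:

    git checkout -b experiment         # branching costs nothing
    git add -p                         # stage only the hunks you want to keep
    git commit -m "keep only the hunks I want"
    make test                          # or whatever your build and test step is
    git checkout master
    git merge experiment
    git push origin master             # the only step anyone else ever sees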

UPDATE: I have yet to actually use them, but another one of Sink's commenters reminds me that git has pre-commit hooks that seem to solve his specific problem. Here's something I googled up on that subject:

http://www.nabble.com/git-pre-commit-hook--best-place-to-mak...

Again, I don't use such a thing, because I think it's kind of silly to insist that every local commit pass all tests. That's a recipe for failing to make enough local commits. I've been finding that it's better to make a commit every time you make any change that can be described in an English sentence. Then you have a little log of everything that you've done that you can rearrange with "git rebase -i" and turn into your final, pushable commits.


> Because Eric Sink is a smart guy, but he doesn't understand git yet. He's still following the Subversion workflow.

His article makes it seem like he doesn't have his head all the way around git's internals, but I think his problem with git is a valid one: it makes it _much easier_ to do the wrong thing to the build than the relatively simplistic svn does, and it can be a drag to use tools that make breaking the build quite so easy. For big team projects it is, in that sense, very much like C programming, and even more like C++ programming: it's a lot of power, but it's way more than enough rope to hang yourself with, or for the guy in the next cube to accidentally hang you with:

> ...you need to act differently than you do when you're in an environment that attempts to protect you from your own mistakes.


I just don't understand this mentality. Maybe it's because I haven't had a lot of experience using SVN (without the help of git-svn) in a multi-developer environment. But to me SVN users look like people who have been forced to walk a tightrope -- every time they check code in to their VCS, they have to make sure that the code is free of bugs, or their colleagues' machines blow up. Which is a dangerous situation, so they sensibly erect guardrails, which they then become completely used to, to the point where they don't even feel them - they can follow the path of those guardrails with their eyes closed. They come to assume that the only safe way for a human to walk is in a perfectly straight line.

But then a DVCS user comes along, just sort of wandering around in two dimensions. Because DVCS isn't like a tightrope; it's more like a flat parking lot with a line painted on it. You can stroll wherever you like, so long as you're standing on the line when you type git push. But the sight of a git user makes the SVN user freak out: "Watch out! You're going to fall off that level parking lot!"

And when someone compares the use of git to C, they are not being profound. They are just freaking out. Because that metaphor is poor. The problem with C is that you can easily write code that contains memory leaks or buffer overflows that are invisible to you, and invisible to your peers, and invisible to your automated tests, but that will cause a severe emergency weeks, months, or years from now. This is a problem that is orders of magnitude more serious than any of these alleged problems with git. git is not secretly introducing long-term time bombs into your code, even if you do manage to find a way to make it break the build.


> ...this mentality...

I'm not saying that git shouldn't be used because it offers power. (Quite the contrary: I use git-svn at work exactly because it's more powerful.) I'm pointing out that, "Oh shit, I better be careful with this different and more powerful tool," is a reasonable attitude for Sink to have.


Perhaps that's the price one has to pay for making it easier to do the right thing, which is to say, not squashing a bunch of unrelated stuff into the same commit just because you happened to be working on them at the same time.


He's still following the Subversion workflow.

That wouldn't be my first guess for the source control product of choice for the CEO of a company which makes a source control product.


Hee hee! Point taken.


>And you can't use git's index to push files that have never been visible to your compiler and your tests.

That's true w/in your scenario but a little misleading because it's not always true.

If you don't use the experimental branch, you can add and commit pieces of files, yet the tests will run not on what's committed but on what's in your directory. So if you had

    print password_to_the_admin_interface
for some reason, and then you comment that out or delete it but forget to add that part of the file, you test it and see that it's not printing the password, but then you push and deploy and bad shit goes down.
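
In command form, with made-up file and script names:

    git add -p settings.py       # stage your other edits, but skip the hunk that deletes the print
    ./run_tests.sh               # tests read the working tree, where the print is already gone
    git commit -m "cleanup"      # the commit still contains the print line
    git push                     # ...and the print line ships anyway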

Just clarifying so no one's like "I can't push code I didn't test! Git is so safe!" and then allows disaster to occur.


That is true -- thanks for pointing that out. I generally don't do pushes without first handling all the local changes, one way or another (either committing them or deleting them) so this possibility had not occurred to me. But it is there.

I'm guessing that you could fiddle with git's hooks to discourage or even prevent this problem, if you wanted to. For example, you could refuse to allow "git push" to be run from within a repository that contains uncommitted changes, and also include a requirement that your tests pass before the push commences.
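
For instance, as a little wrapper script rather than a hook proper (run_tests.sh is a stand-in for whatever your test command is):

    #!/bin/sh
    # safe-push: refuse to publish from a dirty working tree
    if ! git diff-index --quiet HEAD --; then
        echo "uncommitted changes present; commit, stash, or delete them first" >&2
        exit 1
    fi
    ./run_tests.sh && git push "$@"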


I do sometimes.

Usually it's when I do one thing, then start another, then realize that I really should push the old thing first.

So I add the old thing I did, commit, test (when I test; sometimes I'm bad and don't have real tests) and push, leaving this half-finished method uncommitted.

I guess I really should just git stash.
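
Which would look something like this (file name made up):

    git add old_thing.c          # the finished change
    git commit -m "the old thing"
    git stash                    # shelve the half-finished method
    ./run_tests.sh               # test exactly what's about to be pushed
    git push
    git stash pop                # back to the work in progress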

C does not have nearly as many useful things built in as git.


> When you do a "git commit" the result sits on your machine. It goes nowhere.

Those of us who don't know git are now confused.

We're used to having "the result" on our local machine without interacting with our source control system. So, why do git users use its "commit" command to accomplish that?

We're used to "commit" meaning something like "publish". We're pretty sure that git has something similar, so we're wondering what it's called and why it wasn't called commit.

To refer to another discussion, this example makes it look like git requires more user actions to accomplish what we want to do and it uses idiosyncratic terminology. It's unclear why we'd view either of those as an improvement....


"commit" usually (always?) means to take the changes you made and enter them into the version control system.

Git is a distributed system, where every user gets their own repository, so "commit" alters the repository on the local machine. This means you can use version control with impunity and not worry about giving bad code to colleagues. You can commit a broken work-in-progress without harming anything. You can also create branches with impunity and not have to worry about naming conflicts, polluting the space of branch names, etc. If something turns out to be a bad idea, you just zap it and nobody ever has to know.

When you reach a version that needs to go out to the wider world, you use "git push" to send it to a shared server. By default git remembers where you used "git pull" to obtain the source from in the first place, and can automatically push back to that server. (Note that I say shared server, not central server. If you want a single official server, you are free to dictate that for your project. You can also give each work group their own repository.)
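
In concrete terms (repository URL made up):

    git clone git://example.org/project.git    # your own full copy of the history
    cd project
    # ...hack on things...
    git commit -a -m "work in progress"        # local only; nobody else sees it
    # ...hack some more...
    git commit -a -m "now it actually works"   # still local
    git push origin master                     # this is the step that publishes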


> Git is a distributed system, where every user gets their own repository, so "commit" alters the repository on the local machine. This means you can use version control with impunity and not worry about giving bad code to colleagues. You can commit a broken work-in-progress without harming anything. You can also create branches with impunity and not have to worry about naming conflicts, polluting the space of branch names, etc. If something turns out to be a bad idea, you just zap it and nobody ever has to know.

I can do all those things without involving the source control system, so what do I get by involving the source control system? For example, why would I want to commit broken code? What does a local commit do for me?

In most cases, "distributed" is a necessary evil that folks work to hide, with varying degrees of success.


A local commit gives you a rich undo system.
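
For instance:

    git commit -a -m "try approach A"
    # ...which turns out to be a dead end:
    git reset --hard HEAD~1      # history and working tree back to where you were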


This is why I prefer to create a branch for each 'feature' I'm working on, commit to that branch a bunch of times as I feel like it, then merge the branch back into the master branch once I know everything works.
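
For example, with a made-up branch name:

    git checkout -b frobnicator           # one branch per feature
    git commit -am "rough first pass"
    git commit -am "handle the edge cases"
    git commit -am "all tests pass"
    git checkout master
    git merge frobnicator                 # merged back only once it all works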

Running a full test suite before every commit just means you'll commit less often, which leads to bigger commits that are more likely to contain a smattering of unrelated changes.


For the Arora project, when I do a commit it doesn't run all of the autotests, but it does detect that I am changing files X, Y, and Z, look up the matching autotests, and run them. This results in very quick testing. I've found local hooks to be very powerful; you can do many things that you could never do on the server.

http://benjamin-meyer.blogspot.com/2008/10/git-hooks.html
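
A hook in that spirit might look roughly like this (the test layout and naming convention are invented here, not Arora's actual setup):

    #!/bin/sh
    # .git/hooks/pre-commit -- run only the autotests that match the files being committed
    for f in $(git diff --cached --name-only); do
        t="autotests/$(basename "$f" .cpp)"
        if [ -x "$t" ]; then
            "$t" || exit 1               # a failing test aborts the commit
        fi
    done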


> So when you type "git commit", your repository will contain something that has never been compiled or tested.

rolls eyes

If you don't like a feature, don't use it. I can easily commit crap that doesn't work to CVS or SVN repositories too.

But seriously, some people have the ability to reason about their code without having to run their unit test suite first. Not committing this patch:

     (defun foo-bar (baz)
    + "Here is some unrelated documentation that I don't want to commit with my other changes."
         (let ((quux 42))
is not going to break anything. But it will keep my doc change isolated, in the history, from other stuff I may be working on.

If it does break something, you can always rewrite the history at a later date. (In Git, history is a tool to make development easier -- not an indelible audit trail for the suits.)


"I know this isn't going to break anything" is the rationale to every build-breaking commit in history.


Remember, "git commit" doesn't share your changes with anyone else. Presumably you run the test suite before you "git push" to share your changes with others.


"It'll be faster to await the broken build email and roll back than to test this on my dev box," is another good one.



