1. Problems with storing big files in Mercurial. You can use the bigfiles extension if you really want to manage large files with Mercurial.
2. Inefficient renames. Making renames efficient is a currently recommended GSoC project. I'm sure this will be fixed eventually even if no one picks it up this summer.
3. Destructive commands actually, well, destroying stuff. This point makes me feel like the author knew git first, tried mercurial out, and switched back, making the blog post a little misleading. I don't know how you could expect a delete not to be a delete unless you were familiar with something like git already. If I were to run a delete command in Mercurial, I would read the documentation on the command first to make sure it created a backup for me. Only if it then failed to create this backup would I complain. Also, I think I prefer the bundle approach of Mercurial. You can always rename the bundles to keep track of them. You could even write your own extension in 5 lines that ran a destructive command and immediately unbundled the created bundle to restore your changesets, exactly duplicating git's functionality.
The first two issues he mentions I would consider to be actually valid complaints against Mercurial, the third just a personal preference of the author. I would consider neither of them to be major game-changing issues.
> I don't know how you could expect a delete not to be a delete unless you were familiar with something like git already.
The ability to control and reverse any change you make to the source code tree is an essential feature of a version control system, precisely because you don't know in advance what will work and what won't. The fact that people are not used to that simple idea just shows how broken most of the other tools out there are.
There is no reason not to want an unlimited undo for almost everything, especially today when disk space is so cheap.
Maybe this is not that relevant for DVCSs, but in 'classic' VCSs such as SVN an 'obliterate' command would be quite useful. Think about the case where someone accidentally commits something highly sensitive (such as private keys) to version control. People make mistakes all the time, either way.
"git rebase" on its own does not permanently alter history. It creates a new branch that looks like a new history, but you can get back to the original history.
"git gc" will prune the unused old histories for that permanent effect.
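To make that concrete, here is a minimal sketch in a throwaway repository. The branch and file names are made up for the demo, and it assumes reflogs are enabled, which is the default for non-bare repositories. It rebases a branch, shows that the pre-rebase commit still exists, and then moves the branch back to it through the branch's reflog.

```shell
set -eu
# Scratch repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > base.txt
git add base.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

git checkout -q -b topic
echo topic > topic.txt
git add topic.txt
git commit -q -m "topic work"
before=$(git rev-parse HEAD)            # tip of topic before the rebase

git checkout -q "$main"
echo more > more.txt
git add more.txt
git commit -q -m "more work on the main branch"

git checkout -q topic
git rebase -q "$main"
after=$(git rev-parse HEAD)             # a new commit with a new id

git cat-file -e "$before"               # the old commit object is still there
git reset -q --hard "topic@{1}"         # previous value of the topic branch
restored=$(git rev-parse HEAD)
echo "before=$before after=$after restored=$restored"
```

Until the reflog entry expires and gc prunes the object, the "old history" is just one reset away.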
Most destructive commands in mercurial do create a backup first. However, some command, which for all we know could have been written by a third party, didn't. That's why I suggest that you read the docs for a command and test it out on non-important data before applying it to your only copy of a repository. Someone could just as easily create an extension in git that forgets to create a backup as well. What happened to the author is entirely the author's fault.
I don't know about Mercurial, but in your IDE / Editor, don't you like having multi-step undo no matter what command you execute?
Or do you prefer to manually backup the files of your project before doing any operation in your editor that is potentially destroying your files? I'd much rather hit ctrl-z as many times as needed once I noticed my last refactoring has shredded all the files it touched.
Being able to easily undo whatever I do in whatever application I can think of is a nice feature, even more so if the application is the one I entrust all my work to.
When I run sed on a file, I don't expect it to make a backup copy of my file. Similarly, when I run rm, it doesn't create a backup of what I'm deleting. Destructive operations are, well, destructive! I'm amazed at the response I got to my post. Read the docs on a mercurial command. Chances are if it is destructive it creates a backup for you. On the off chance that someone writes an extension that fails to do so, you can create the backup yourself. There's nothing magic about git's operation here. Someone simply took the time to create a backup in git for a command for which the equivalent mercurial command didn't. Big deal.
Git's repository structure, plus the built-in reflog, prevents committed data from being lost by any command, unless it explicitly erases the repository or reflog.
I explicitly said "committed data" would not be lost, because data only staged in the index certainly can be lost, and I think stashed data is semi-susceptible to loss as well.
git stash drop does print the sha1 of the stash, and as long as you know that sha1, and it's not been gc'd, you can get a dropped stash back. (git fsck will also show the sha1s of dangling commits left from dropped stashes.) But while a separate reflog records existing stashes, that information is not retained when they're dropped. It's perhaps best to think of git stash as a more convenient form of git diff > patch, and you wouldn't expect that to keep a log of the patch file either.
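A rough sketch of that recovery path, in a scratch repository with a made-up file name. It records the stash commit's id before dropping it (the same id that git stash drop prints), lists dangling commits with git fsck, and then hands the raw id back to git stash apply. This only works as long as the object has not been gc'd.

```shell
set -eu
# Scratch repository; notes.txt is a hypothetical file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo one > notes.txt
git add notes.txt
git commit -q -m "initial"

echo two >> notes.txt                    # uncommitted work
git stash push -q                        # working tree is back to "one"
stash_sha=$(git rev-parse "stash@{0}")   # same id that "git stash drop" prints
git stash drop -q

# The dropped stash should now show up as a dangling commit.
git fsck --no-progress 2>/dev/null | grep "dangling commit" || true

# git stash apply accepts the raw commit id of a stash.
git stash apply -q "$stash_sha"
cat notes.txt
```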
Yes, but the original author has said in this very thread that the data loss was through an MQ (Mercurial patch queue) delete. Mercurial queues are an optionally version-controlled patch queue, so he had some code stashed in an uncontrolled queue, deleted it, and was surprised that it was really gone. This is rather analogous to what I posted for git. It was a bit of bad luck on his part, but switching to git doesn't close that hole; his code is still vulnerable to the same set of actions.
So his whole reason for switching is "I shot myself in the foot." And instead of learning more about his weapon of choice to avoid doing that in the future, he traded in his gun for one that is slightly more complicated.
Kind of like git branch -d, which, as http://www.kernel.org/pub/software/scm/git/docs/git-branch.h... states, deletes the reflog as well? If it doesn't delete the reflog, then in git world, at least, it probably isn't considered to be fully destructive!
git branch -[dD] does not delete reflog entries for commits made to the branch.
A branch may have its own, separate reflog which would be deleted, but that is only a convenience feature as the man page you linked to documents WRT the -l option; the primary reflog still records all operations made on the branch.
You might find it useful to actually test stuff before posting links to man pages that you don't fully understand. I know I do. :P
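A quick test along those lines, as a sketch in a scratch repository (branch and file names are invented for the demo). It force-deletes a branch holding an unmerged commit and then recreates it from the HEAD reflog, which still remembers the commit made while that branch was checked out.

```shell
set -eu
# Scratch repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > base.txt
git add base.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)

git checkout -q -b feature
echo work > feature.txt
git add feature.txt
git commit -q -m "feature work"
tip=$(git rev-parse HEAD)

git checkout -q "$main"
git branch -q -D feature            # the branch and its own reflog are gone

# HEAD's reflog: @{0} is the checkout back to main, @{1} is the feature commit.
git branch restored "HEAD@{1}"
git log --oneline -1 restored
```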
Part of the point of using a versioning system is to avoid ever destroying stuff. Therefore, in a versioning system, you'd expect to have to work pretty hard to make a change you couldn't roll back from, wouldn't you?
Yes, and hg and git both make you work hard. I'm not sure what his point is. He used a quite advanced command without understanding its consequences, he paid the price. A non-advanced user would really have no need to use hg strip.
I'm not aware of any VCS which meets your standard of not allowing destructive commands (I'm sure you realize you can destroy history/data in git as well, right? It may take a --force, and it'll yell at you, but it's definitely doable)
Subversion, the most used VCS in the world, has no totally destructive commands per se.
Once you commit something, it stays in the repository in the committed version forever.
In SVN if you really want to delete a file, you have to stop the service, dump the repository to a text file, edit the dumped file to delete the data (or alternatively, do not export the latest revision), restore the repository from the file, and restart the service. SVN has no native way to do that; you have to use some external utility to totally delete something.
Now, it can be argued that this is good or bad design, or that centralized VCSs are bad/distributed are good. But you are making a broad statement about VCSs and ignoring the existence of SVN, and this is misleading to anyone who reads this thread.
It's a lot harder than that, actually. You have to do --force, clear the reflog, and then run the garbage collector. Your data isn't actually removed from disk until the last step.
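Roughly what that sequence looks like, as a sketch in a scratch repository. The forced step here is git branch -D on an unmerged branch, which is my choice of example and not necessarily the command the parent had in mind; all names are made up. The commit survives the forced delete and only disappears after the reflog is expired and gc prunes.

```shell
set -eu
# Scratch repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > base.txt
git add base.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)

git checkout -q -b doomed
echo secret > secret.txt
git add secret.txt
git commit -q -m "commit we want gone"
lost=$(git rev-parse HEAD)

git checkout -q "$main"
git branch -q -D doomed                      # step 1: force the delete

# Still recoverable: the object exists and HEAD's reflog points at it.
if git cat-file -e "$lost" 2>/dev/null; then after_delete=present; else after_delete=gone; fi

git reflog expire --expire=now --all         # step 2: clear the reflog
git gc -q --prune=now                        # step 3: collect the garbage

if git cat-file -e "$lost" 2>/dev/null; then after_gc=present; else after_gc=gone; fi
echo "after delete: $after_delete, after gc: $after_gc"
```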
Yep, and beauty of the system is that anyone who is a sophisticated enough user to do that, would never do it accidentally.
Although, I imagine quite a few users --force a command and, unaware of the aforementioned facilities, write up a nasty blog post about switching to hg from git because git lost their data :)
Sophisticated users make stupid mistakes all the time. They misread output, get confused about the context in which they're operating, etc. Human beings are not machines. You cannot assume that just because a user is sophisticated, they will not make mistakes that lead to data loss. Even the best drivers still have accidents.
If I git rm something, and then git reset, my file is back. Are we talking about something different, or am I remembering git wrong? I don't actually git rm very often.
And it makes sense that there's a permanent delete feature, but I'd expect it to be outside of the normal workflow.
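For anyone who wants to check, a small sketch in a scratch repository with a made-up file name. As far as I can tell, a plain git reset after git rm only puts the file back in the index; the working copy comes back with git reset --hard (or by checking the file out again).

```shell
set -eu
# Scratch repository; doomed.txt is a hypothetical file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo keep > doomed.txt
git add doomed.txt
git commit -q -m "add doomed.txt"

git rm -q doomed.txt          # removes it from the index and the working tree
git reset -q                  # mixed reset: restores the index entry only
if [ -f doomed.txt ]; then after_plain_reset=present; else after_plain_reset=missing; fi

git reset -q --hard           # restores the working tree copy as well
if [ -f doomed.txt ]; then after_hard_reset=present; else after_hard_reset=missing; fi

echo "plain reset: $after_plain_reset, hard reset: $after_hard_reset"
```

Either way nothing committed was ever at risk; the file was sitting in the last commit the whole time.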
Because he lost data? hg rm wouldn't ever lose data. He probably used an hg strip (which is an advanced command and must specifically be enabled in .hgrc through the mq extension) but didn't realize what he was doing...
The command that lost data was hg qdel. Strip won't lose data; it dumps bundles (although they're a pain in the ass to restore from). Like I said, I used Mercurial for three years. And not "I play with it sometimes at night" kind of used, but rather "it was my main version control system all day, every day" kind of used. I was using MQ for most of that time.
The fact that so many people consider "hg strip" an advanced command is part of the problem. Modifying history should not be considered advanced. Being able to recover from ANY command, including destructive ones, should not be considered optional.
Yes, so that deletes a patch. Anything in patches is basically in flux, and I wouldn't call losing a patch "data loss". If you call hg qdel "data loss", you'd call any sort of modification to a patch "data loss", since patches aren't versioned. If you want versioning with patches, use pbranches.
> Modifying history should not be considered advanced.
Maybe, but the only way I modify history in practice is through rebasing. I've never ever felt the need to modify history any other way. What use case do you have for modifying history in potentially destructive ways?
Git gives me everything I needed MQ for, but with the complete safety of the reflog. There is no such thing as a change I can't undo. The fact that patch management is "special" in Mercurial is the problem!
With respect to history modification:
First, I rebase a couple dozen times per day. I'm on a team that doesn't use merges unless we have a reason to (this makes it easier to bisect and think about history). I also create, destroy, and rebase many of my own branches every day.
Second, I amend commits a lot. I'll often spike some little piece of code I don't understand, then start amending the commit as I rewrite it with TDD, until the commit no longer contains any traces of the original spiked version. For more complex spikes and TDD rewrites, I'll do it over many commits, rebasing the spike over the rewritten version until the spike commit is empty and gets skipped by the rebase. Doing that in Mercurial would be... arduous. I can easily do multiple history rewrites per minute while doing this.
Third, I amend commit messages a lot, usually with "git rebase -i". Maybe I forgot the ticket number, or maybe the meaning of the commit changed (see the next point).
Fourth, I sometimes do drastic commit rearranging. This is harder to explain, but it usually involves splitting commits (in the simple case) or moving sets of related changes from one commit to another (in the complex case). These are sometimes at the file level, sometimes at the hunk level, and sometimes within a hunk. This is rarer than the others; I probably do it once or twice per week.
Fifth, I "git reset <ref>" a lot. It took me longer to start doing this, but it's useful in a lot of situations. For example, "oops, I accidentally created a merge bubble."
Yep – and I did it all, using those tools, for a couple of years. :) Now, when I go back to Mercurial (which I know better than Git, mind you), I get frustrated. Those tools are much more blunt than their Git equivalents.
so i tried it for a couple of days, and the speed was the thing that impressed me the most actually. the rest is not that dissimilar, but i didn't dislike it as much as i thought i would :) - our workflow assumes mq already so a git add is a qnew, or qref, a git diff --cached is a qdiff, a git diff is an hg diff ... so on. most commits i make are usually a qfinish, which is comparable to a git commit (no -a).
how do you manage patch queues in git? we need them because we are constantly backporting to different versions of our app. branches will mean n merges for n versions. stacked git? or is there a git native way to manage the same?
what about something like tortoisehg? gitk is pretty crude in comparison.
i assume one can glue a diff/merge tool like meld. is the experience similar when resolving conflicts?