I've only just made the switch. Like most people I was stuck with SVN due to my employer using it, and refusing to update.
Now I'm freelance I've had the time to put into learning Git and I'm glad I did, but the point is there are still a lot of people using SVN, some of whom will switch to Git in the future, and posts like this can be very useful to them.
How true this is. I spoke with one sysadmin who told me that even if 100% of the developers were using git-svn he would never allow git as the official location to store code on the servers and it must be pushed back to svn. I am not sure how to respond to people like that.
Edit: When pressed for more details it was clear that he was happy with his svn server setup and didn't want to change and have to learn something new. This was not a logical discussion, but an emotional one and as he ran the servers he had the final say.
"didn't want to change and have to learn something new"
Resistance to change might be an authority thing - you could just walk around this troll's bridge and see what happens. Try talking to his boss or higher about source control, casually. If his manager asked him to change to GIT he would do it. He might have dismissed you because he believes you're not in a position of power over him or the work required to switch is mundane.
It only shows that there are many development shops who have not migrated to a more powerful DVCS (either git or mercurial) and we ought to spread word about how inadequate centralized version control systems are.
I regularly run into dev shops (with code as the product!) that barely use "old style" VCS properly (e.g. CVS, SVN or TFS), and just refuse to look at git for a litany of reasons, largely boiling down to "our tools are magic, we are afraid of this new magic".
I just started a new job, and it's the first time in a fairly long career I have had to use git instead of svn. The company I joined only recently switched to git too.
Thing is, I understand svn. I get it, and I have used it for years and know, almost instinctively now, what will happen when I use the various commands.
Git for me just feels overly confusing. I suppose it's just because I'm used to "the old way" but just take a look online at how many articles that are out there trying to explain how git works. Why are so many needed? It feels like there are many more than there are explaining how svn works.
By the way, I can see the benefits of git and I feel I'm getting the hang of it quite well now, but I still like svn :)
There are so many articles because there are a lot of people out there that really like git. It's like cutting meat with a spoon and then being given a knife. Everyone complains about how you can cut yourself, or how confusing it is that the knife has two sides but only one is sharp, but once you know how to use it, it is a game changer with respect to actually cutting meat. There are a lot of people passionate about cutting meat.
It's not really that hard to learn btw. There are some esoteric things that can be tricky but the Svn workflow can be learned in a day or two tops.
Anything in particular you would like to know? I have done it twice. First at Trolltech where Qt was moved over (I want to say there are a handful of public blogs on this) and I have helped with a bunch of perforce/git migration/integration at RIM and consulted with various other companies.
I can't give explicit numbers, but yes, I experienced some of the issues Facebook wrote about. This particular problem stems from SVN/Perforce giving the user the ability to check out only part of the repo.
Setup 1 exploits this. This would be movie companies where every version of every rendering is in Perforce. On the server you easily have many times the size of a desktop hard drive, and users only grab what they need/want. These companies should stick with Perforce.
Setup 2 came about from laziness. Everything was dumped into the repo with no organization. Converting it to git will require a XXXGB repo which, while manageable, Windows can't handle well. You encounter, say, dozens of copies of the binary sprinkled around inside of the source directory (not even revisioned, just copied with the version included in the file name). Need a copy of every Windows NT CD stored somewhere? Why not in the src directory!?! Once you clean up/split this, the src repo goes down to a usable size.
After the above basic cleanup (which honestly can/should have been done in svn/perforce anyway), the src repo can still be large because the src isn't the src for one project, but maybe dozens or hundreds of projects that compile to one binary (or some similar setup, such as lots of little binaries that are one product). It is then up to you to decide how you want to proceed. There are a few different approaches, each with different pros and cons, and the right one is situation specific. (And discussing them is really a full blog entry, not a random hacker news comment.)
I'll leave you with my law about repo size:
When every developer in a company is committing to the same branch, the odds that a commit will break the build increase as more developers are hired.
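One rough way to put numbers on this law, as a sketch only: assume each developer lands one commit per day on the shared branch, and each commit independently breaks the build with the same small probability. The function name and the 2% figure below are made up for illustration; real break rates are neither independent nor constant.

```python
def chance_of_broken_build(developers: int, p_break: float) -> float:
    """Probability that at least one of the day's commits breaks the build.

    Assumes one commit per developer per day, each breaking the build
    independently with probability `p_break` (an illustrative assumption).
    """
    return 1.0 - (1.0 - p_break) ** developers


if __name__ == "__main__":
    # Hypothetical team sizes and a hypothetical 2% per-commit break rate.
    for team_size in (5, 50, 500):
        print(team_size, round(chance_of_broken_build(team_size, 0.02), 3))
```

Under these assumptions the chance of a red build on any given day only goes up with headcount, which is the point of the law.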
We are doing that at Amazon. I made the switch a couple months ago but it's up to each team when they'd like to switch. My team is lucky in that we have a couple people who really know Git well.
Sure, and I'm one of them. It's not exactly startling news in 2012 that a lot of people prefer the git/hg model. Hell, I prefer it myself! But I've got a complete svn ecosystem set up here, with 44 projects in a single svn repository, automatic offsite backups, and hundreds of checked-out directories spread across five (or more?) computers with five distinct operating systems. As far as I can see, the one-time costs of moving over completely dwarf the relatively small benefits I would gain from using a better branching model.
I'm one of them as well. I'd also love to use git. The issue I have with it is that I feel like the "port" to Windows was more of a crowbar sort of operation rather than a well thought out plan to make a program cross-platform compatible.
Install TortoiseSVN. Done.
Install TortoiseGit. Oh it needs something else. What's it called? Ok. I need msysGit. I'll get that. Ok, they're all in beta? Whatever I'll just get one. What? It runs on top of Cygwin? And on, and on...
Am I missing something here or is it really that convoluted? I work with non-programmers (engineering types) who program and I need it to be easy for them. Any recommendations?
The guide at github[0] is pretty easy to follow. I've done it on some of my fellow students' PCs a couple of times during a project, but most of them switched to Ubuntu after a couple of weeks.
At the risk of sounding like an elitist hipster hacker, the core audience of this site comes here looking to read articles about innovation, not staid, conservative companies slowly transitioning to established and proven technologies.
A common issue, especially with fast paced startups, is technical debt. This debt can show itself in various ways, and the underlying technology that one uses could be one of those debts. SVN was the popular choice in version control for a long time, and it wouldn't surprise me if some startups chose it out of comfort with the technology, and it has now become a technical debt that will need to be taken care of.
Just because you are an elitist hipster hacker, doesn't mean that this issue will only pertain to staid, conservative companies.
As I mentioned in an earlier comment, that's the beauty of giving the power to the community to decide what is interesting to them. I wouldn't describe SecondMarket as a staid, conservative company, we use amazing open source technologies such as Solr, Scala, Akka, mongoDB, etc. in a very agile environment. We also like to think we are changing the way in which the financial markets are working. However svn was one technology we were stuck in the past with and we wanted to share that making such a basic change has made our life better.
Even hipster hackers sometimes work for staid, conservative companies. Articles like this one give the hiphacks some ammo when they want to convince management.
That's the beauty of giving the power to the community to decide what is interesting to them. There seem to be a lot of people using older source control solutions who, for one reason or another, haven't made the move to a better one. That's what inspired me to write this post in the first place. We made the move and it improved our lives and I simply wanted people to know about it.
Under a rock? Is git _that_ much better? I used to read about how it didn't need a central repository, yet this article, and github, seem to indicate everyone ultimately wants/needs a central repository.
So git is worth switching just for branching/merging superiority? Maybe I don't do it enough, but svn has never let me down. Or maybe I'm not on a large enough dev team?
Guess I'm really just trying to figure out if git is so amazingly better that I really am under a rock (I'm not seeing that), or if it's more of the "if you're not using {latest hip tech} you're not to be taken seriously" kind of thing that we see so much on HN.
(sorry about new account -- can't find original credentials)
I wish I had time to get in all the details about why git is better but I will leave you with two questions to ask your existing version control system:
1. How fast is your version control system?
Unless you have tried git, you will not realize how painfully slow SVN (or any other VCS which has to talk to server) is. 90% of my git operations take less than a second.
Now, you may say that 'speed' is not an issue. Trust me, it is. Once you have a super fast VCS, you will embrace it as your friend in everyday coding instead of using it at the end of the day. It's similar to Google's view that making a website/webapp faster brings in more users.
2. Can your version control handle renames?
There are very few version control systems out there which can handle renames. If you dread renaming files because of SVN, you ought to look at git. BTW, not only can git handle renames without a hitch, it can show you code history across file renames.
Again, you may think that 'renaming' is not an issue. But if you believe that 'naming' is very important for your code and you do re-factoring all the time, you don't want your version control system to dictate how you work.
I can imagine why one may think that git is a fad. But after using it for 2 years now, I cannot use any other version control system. [And for background, I have used the following version control systems for real projects: CVS, VSS, Perforce, Surround SCM, SVN, TFS.]
1) Fast enough? I have tried git. It's pretty cool, but the difference between 5 seconds and 1 second is minimal at best to me. I don't think I've found myself dying for a faster SCM, though I usually make maybe 1 - 5 commits per day. Git on Win32 is meh compared to git on Linux, so that doesn't really convince me any further either. Git is also a PITA to compile on !Linux using !gcc, because heaven forbid we write using ANSI/ISO C.
2) Yes? I can do renames, but strictly speaking it is not an atomic "rename" operation. I don't particularly care about rename history, and I'm not really sure why others do, but I guess if I tried I could make up a use-case for it. Still, it wouldn't be a selling feature for me, that's for sure.
It seems strange to me that you take the time to tell us that "git is better", like we're comparing O(n) to O(log(n)) algorithms. Different workflows, projects, and organization structures have different needs. It seems hopelessly naive to simply assume everyone's needs are best fit by git, and git alone.
How about this: I want to tell, given two revision numbers in the mainline branch, which came first? In SVN, it's pretty simple. In git? Well, first open a terminal...then ...argh. As release manager, it's kind of annoying. Simple use case: is 2aa8aca05357004d4418807c06f53d81517bc629 before or after e8d0555ce401b6acb0f225aa0263e9b72136347d? Did e8d0555ce401b6acb0f225aa0263e9b72136347d make it into the build given that 5b5c73ceaa442bafff08b4470aad3ff276be94e3 was the final commit before the cut-off?
I might waste 5 seconds opening a terminal window or tab, 30 seconds finding, copying, and pasting the revision numbers, then a second or two determining the answer. This, to me, is not a productive use of my time when revision numbers simply tell me what I need to know. All of that magical time git just saved me might have just been absorbed or gone into time debt.
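For what it's worth, the "did X make the cut-off" question is an ancestry question on the commit graph: X is in the build if X is reachable from the cut-off commit by following parent links. Git exposes this as `git merge-base --is-ancestor <commit> <commit>`, which exits with status 0 when the first commit is an ancestor of the second. Below is a minimal sketch of the same check over a toy graph; the commit ids and the `parents` mapping are hypothetical, not the hashes quoted above.

```python
def is_ancestor(parents: dict, older: str, newer: str) -> bool:
    """True if `older` is reachable from `newer` by following parent links.

    As with git's own check, a commit counts as an ancestor of itself.
    """
    stack, seen = [newer], set()
    while stack:
        commit = stack.pop()
        if commit == older:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


# Hypothetical history: a -> b -> cutoff on mainline, and `late` committed
# on top of b after the cut-off was taken.
parents = {
    "a": [],
    "b": ["a"],
    "cutoff": ["b"],
    "late": ["b"],
}

made_the_build = is_ancestor(parents, "b", "cutoff")
missed_the_build = not is_ancestor(parents, "late", "cutoff")
```

It still needs a terminal, so it doesn't answer the "I can't tell by eyeballing two ids" complaint, but it does mean the answer can be scripted rather than worked out by hand.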
Have you ever thought about why you only have 1-5 commits a day? I do 70 to 100 commits a day. I commit each and every semantic change. I do not commit changes so that I have a backup of my code, I create a commit so that I can track my 'semantic code change'.
Let me give you an example:
Let's say I have a working website. The website has header/footer/content style. I spend one hour to customize my website layout by modifying html/css.
When I am done with header, I commit my HTML/css.
When I am done with footer, I commit my HTML/css.
When I am done with content, I commit my HTML/css.
During my work, I found a CSS bug that affects floats in IE, and I created another commit for this particular bug fix.
The most important point being that each of my commits is 'independent' and represents a meaningful unit of work.
You may ask, why create 4 commits instead of one? The reason is that git will let me use these commits any way I choose.
It's 2:30pm and my manager stops at my desk and tells me that I have to fix the float bug in IE 'RIGHT NOW' before the 3pm meeting. I only need to fix the float bug and nothing else, because the website has to match the printed material for the meeting. How do you do it? Easy. git will let me 'cherry-pick' my last float bug commit and apply it to the current production branch. After applying that one commit, I test everything to make sure nothing is broken and I am all set before the 3pm meeting.
Obviously, this was a simple case. There will be cases where you can not just pick one commit as it depends on other commits. You will have to do more work to get the fix in. Git is the helping hand which can let you play with the commits to get the answer you want. SVN can not do that.
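To make the idea concrete, here is a toy model of what picking one commit means. It is a deliberately simplified sketch, not how git is implemented: real git applies line-level diffs with a three-way merge, while this version treats a commit as a whole-file change. The `Commit` class, `cherry_pick` function, and file names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """A toy commit: one file changed from `before` to `after`."""
    message: str
    path: str
    before: str
    after: str


class CherryPickConflict(Exception):
    """Raised when the target lacks the content the commit was based on."""


def cherry_pick(tree: dict, commit: Commit) -> dict:
    """Return a copy of `tree` with only `commit` applied."""
    if tree.get(commit.path) != commit.before:
        # The commit depends on earlier work that the target does not have.
        raise CherryPickConflict(f"{commit.message}: {commit.path} differs on target")
    picked = dict(tree)
    picked[commit.path] = commit.after
    return picked


production = {"header.html": "old header", "ie.css": "float: left"}

work = [
    Commit("restyle header", "header.html", "old header", "new header"),
    Commit("fix IE float bug", "ie.css", "float: left", "float: left; display: inline"),
]

# Ship only the float fix; the header restyle stays out of production.
hotfix = cherry_pick(production, work[1])
```

The conflict branch is the "depends on other commits" case: if the fix had been written on top of a file that production never received, the pick fails and you have more work to do.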
Going back to my point, I think you may not realize that you do far fewer commits than you should because your subconscious brain knows that commits are slow. Once your brain realizes that a commit can take less than a second, you will automatically commit more often.
To give an unrelated example: Back in the day when we didn't have access to high speed Internet, we would consult local documentation only because searching online was slower than finding it locally. Now, with Google and high speed Internet, I have stopped downloading documentation locally and rely on Google to find it for me.
'caring about rename history'.
Again, this is dependent on the way you code. I like to re-factor my code a lot. I rename my classes whenever I find a better name. I move my code around in different folders as I find a better way to organize it. And as I re-factor a LOT, I DEFINITELY care about my code history. Git can track renames/folder movement and still show you the file history. You may not care about rename history now but you may care about it in future. With SVN, you have no choice. Git offers you that choice.
Now to your workflow comment. The beauty of git is that it is a 'content tracking' tool. You can use it to FIT any work flow you like. git does not force you to use any particular workflow. You can use it in the exact same way you currently use SVN. Or you can come up with your own workflow which no other team in the world uses. It's all fine.
Finally, the revision number comment. Generally, I haven't had a need to figure out which revision came first based on 'revision numbers' alone. But once again, git has an answer. You can use the git describe command. It will give you something like:
"git describe gives you a version description like the following: v2.0-64-g835c907. The v2.0 part is the name of the latest annotated tag preceding the commit, 64 is the number of commits after that, and 835c907 is the abbreviated commit id. It is basically there to identify any revision in an exact and convenient (although technical) way."
And that's the beauty of git. Git already has a way to address your common concerns and if it doesn't, you have enough meta data to make it work the way you want to.
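If you want to work with those descriptions in a script, the format quoted above is easy to take apart. This is a small sketch that assumes the `<tag>-<count>-g<id>` shape from the quote; the `Description` type and `parse_describe` function are names I made up for the example.

```python
import re
from typing import NamedTuple


class Description(NamedTuple):
    tag: str            # nearest annotated tag, e.g. "v2.0"
    commits_after: int  # number of commits since that tag
    abbrev_id: str      # abbreviated commit id, without the "g" prefix


# The tag itself may contain hyphens, so the suffix is anchored at the end.
_PATTERN = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<id>[0-9a-f]+)$")


def parse_describe(text: str) -> Description:
    """Split a `git describe` style string into its three parts."""
    text = text.strip()
    match = _PATTERN.match(text)
    if match is None:
        # A commit sitting exactly on a tag is described by the tag alone.
        return Description(text, 0, "")
    return Description(match["tag"], int(match["count"]), match["id"])


parsed = parse_describe("v2.0-64-g835c907")
```

On a linear mainline, two descriptions that share a tag can be ordered by the commit count, which gets you most of the way back to comparable revision numbers. With merges in the picture the count alone is not a reliable ordering, so treat that as a convenience rather than a guarantee.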
Once again, I really understand where you are coming from because I was on that side once. But I sincerely believe that git is a much better tool and it will change your life as a developer. It has definitely changed mine.
Please, please, do yourself a favor and at least experiment with git before dismissing it.
Yes, SVN can handle renames. And maintain history across renames.
Btw, SVN supports partial checkouts (git doesn't). Which I use regularly on a gigabyte repo.
I use git regularly for all my hobby projects (mainly because 'git init' seems way simpler than the equivalent svn command), and I might even start a company off on git, but I'll never claim that SVN is obviously a bad way to do things.
Actually, I've had many more problems with git renames than with SVN ones. IIRC, it has to do with git internally tracking content, not really filenames. So, I've had cases where changes then a rename result in git thinking a new file was created and an old one destroyed.
But, at the end of the day, I rename files so infrequently that it's hardly a selling point or problem on either system.
Do you care to elaborate on point #2? In svn I can do "svn mv" to rename files. I don't see how this is more difficult than just "mv". Maybe the history of file renames is somehow more easy to follow in git? If I run "svn log renamedfile", I can see the history across file changes.
With git, I don't have to tell it that I renamed a file. It can even figure out if I moved the file from one folder to another. And with git, I can see the complete history without worrying about the fact that the name might have changed in the past.
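Both experiences in this subthread follow from the same design: git stores content and infers renames afterwards by comparing files that disappeared with files that appeared. Here is a rough sketch of that idea. It uses Python's `difflib` ratio as a stand-in for git's own similarity index, so the scores will not match git's; the 50% threshold mirrors git's default for rename detection, and `detect_renames` is a hypothetical name.

```python
from difflib import SequenceMatcher


def detect_renames(deleted: dict, added: dict, threshold: float = 0.5) -> dict:
    """Pair each deleted path with the most similar added path.

    `deleted` and `added` map path -> file content. A pair only counts as
    a rename if its similarity ratio reaches `threshold`.
    """
    renames = {}
    for old_path, old_text in deleted.items():
        best_path, best_score = None, 0.0
        for new_path, new_text in added.items():
            score = SequenceMatcher(None, old_text, new_text).ratio()
            if score > best_score:
                best_path, best_score = new_path, score
        if best_path is not None and best_score >= threshold:
            renames[old_path] = best_path
    return renames


source = "def add(a, b):\n    return a + b\n"

# Same content under a new name: inferred as a rename.
pure_move = detect_renames({"util.py": source}, {"math_utils.py": source})

# Renamed and rewritten in the same commit: looks like a delete plus an add.
rewritten = detect_renames({"util.py": "abcdefghij"}, {"math_utils.py": "0123456789"})
```

The second case is the one described a few comments up: change a file heavily and rename it in one commit, and nothing ties the old path to the new one. Committing the rename separately from the edits avoids it.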
Yes, git is that much better. Branching is only part of that. Another part is speed—and especially network speed.
Operations that in SVN gave me time to go make some coffee and come back to find them still running are instantaneous in Git.
Then there is rebasing. Then there is the differentiation between author and committer. Etc., etc.
> github, seem to indicate everyone ultimately wants/needs a central repository
Of course you still need some centralization in order to share work with other people. But the more compelling thing about Github is forking, not centralization. With Git you can easily synchronize commits between multiple remote repositories.
Even at that, branching isn't hard at all on Subversion. I know that before 1.6 it was more cumbersome, but making a branch for my work and committing a lot of small changes and then merging back when done isn't hard.
There's also the benefit of being able to check out just one folder of the project wherever you want without having to keep the same folder structure as the source project. I understand git is able to do a "sparse" checkout now, but the parent source folder structure is checked out as well.
Simple difference: in svn, until you're ready for other people to see your changes, you have to leave them uncommitted. In git, you say "well, I just refactored function X, let's throw that in a commit before I do anything else, and if I have to revisit it, I will"
I've never had to use svn myself, but it's this aspect that's always sounded positively crippling to my workflow
There's a fair few extra features you wouldn't think you would need but you'll discover mainly through using it. We made the switch from svn to git a while back and I can't imagine how we used to get by without stash, cherry-pick, bisect etc. There's more to git than just a distributed repository and better branching for sure.
I agree. The entire Internet would have you believe that OSS just didn't exist before the advent of git. I like git and prefer it, but SVN gets the job done. My only real gripe with SVN is it makes it harder for those without commit privileges to commit. Within a company setting, that point is moot. A lot of the other anti-SVN sentiment is based around old versions of SVN. It still doesn't branch as fast as git, but it does have merge tracking.
At the end of the day, use what works best for you. If you want to test the waters, use git-svn. You can then work locally and then dcommit your changes when they're ready to go upstream. It'll give you a pretty good approximation of what git can do for you workflow-wise.
I just recently switched to Git after joining a new company. The team I'm on was just in the process of switching from SVN to Git, so we were all spending a lot of time online in Pro Git as well as researching various workflows.
I think that we're still not fully taking advantage of Git because the other members of the team tend to commit and push all the time (so the code is backed up) but that makes rebasing and squashing quite difficult.
Git has also allowed us to work better with maintaining feature branches and bug fix branches which helps us to give individual branches to QA for test and only release those features/fixes that have been QA approved. If a feature isn't ready on time for deployment, it just doesn't get added to the integration branch and it'll go out the next release. That's pretty cool.
Personally, I've settled on using SourceTree as a GUI with a fallback to the command-line for certain operations.
Do you have an anecdote about why having more commits has made rebasing / squashing difficult? Is it just that team members are pushing their numerous changes to the central repo? I'm wondering if I'm missing something because I never squash commits...
Chances are they're pushing to the same branch or to the same small number of branches instead of to private-ish branches. This is an easy trap to fall into: lots of people have a hard time getting used to using branches for everything.
As for squashing or not, I rarely do, and in general only squash commits that are simple typo fixes. This makes bisecting much easier.
Also, and this may or may not be relevant, maybe some people are reluctant to spam the central repository with their own temporary branches. To avoid this, I have everyone in my team set up a personal, backed up repository. Anyone can pull from these personal repos, but only their owners can push to them. When the temporary branches are done, they can be merged.
It can make reverting a feature easier if the branch being merged had intermingled with master several times. You could avoid that by rebasing your topic branch before merging, but that's lying too. I personally never rebase unless I need to fix a commit message and I only squash merge if I had a series of ping-pong commits while trying to fix something.
We think that a repository with too much noise isn't so useful. When you squash you should use common sense and aggregate the commit comments as you see fit.
Really? How? I've always found Subversion branching to be painless, reliable, fast enough, and merges often go flawlessly, and it's really not been that hard to resolve merge conflicts.