
Could you please elaborate on how the use of hashcodes rather than counting numbers for identification on an atomic repository botches history tracking, diffing, or merging?



Well, the hashcode effectively signs the full repository state, so if two repos have the same hashcode you know they have the same contents. That's not the main problem with svn, though.
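To make the "same hash means same contents" point concrete, here is a minimal sketch, assuming a toy repository modelled as nested dicts. The `tree_hash` helper and its encoding are hypothetical and are not git's real object format; the only point is that the root id is derived from every file, so identical contents give an identical id and any change anywhere gives a different one.

```python
import hashlib


def tree_hash(node):
    """Hash a toy tree: dicts are directories, bytes are file contents.

    Illustrative only; this is not git's real object encoding.
    """
    if isinstance(node, bytes):
        return hashlib.sha1(b"blob\0" + node).hexdigest()
    h = hashlib.sha1(b"tree\0")
    # Sort names so the id depends on contents, not on insertion order.
    for name in sorted(node):
        child = tree_hash(node[name])
        h.update(name.encode() + b"\0" + child.encode() + b"\n")
    return h.hexdigest()


repo_a = {"src": {"main.c": b"int main(void) { return 0; }\n"}, "README": b"hello\n"}
repo_b = {"README": b"hello\n", "src": {"main.c": b"int main(void) { return 0; }\n"}}
repo_c = {"src": {"main.c": b"int main(void) { return 1; }\n"}, "README": b"hello\n"}

same = tree_hash(repo_a) == tree_hash(repo_b)      # identical contents
changed = tree_hash(repo_a) != tree_hash(repo_c)   # one byte differs deep in the tree
```

A counting revision number carries no such guarantee on its own; it only means something relative to one particular repository.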

The main problem is that svn has no notion of a project to attach its merges or comparisons to. A directory in a repo may be a subdir of the project or it may be a branch. This leads to an explosion of edge cases and legal commands that make no logical sense. As long as everyone follows certain practices (such as not committing changes to two branches at the same time) then a heuristic approach sort of works, but that's hardly the way to design a robust (and ostensibly simple!) system.

If you can merge subdirectories in a project, and if you can merge branches which are just an ancestor directory of that, and if a given merge only affects certain subdirectories in a repository, how can anyone expect subversion merge tracking to be viable? Even if they somehow munge it to work in 99.9% of real world cases, think of the complexity and mental overhead of maintaining this solution compared to what a DVCS with clear-thinking primitives can achieve. It's time for the successor to svn in centralized version control systems to be built from scratch. svn itself is hopelessly hamstrung.


I've discussed this at some length with a few of the original Subversion implementors. The short version is that they didn't go back and do a slight rethink when they came up with changeset objects, and that causes part of the problem. The rest of the problem is that branches aren't a first-class thing in the system, leaving no good way to mark mergeinfo. Hindsight being what it is, you could probably build a Subversion-like system with almost the same filesystem by making branches and tags first class and somewhat hiding their just-a-copy nature from the user.

Sadly (on some level, I'm a huge DVCS advocate, but see the need for CVCS in corporate environments), that's largely water under the bridge - many of the original team have moved on to using DVCS tools. Subversion 2.0 as a completely clean break with the past to fix some flawed design decisions feels very unlikely due to the political forces at work.


When there's only one repo, there's no ambiguity about the content represented by a revision number. Revnums are perfectly adequate for the many situations in which frequent coordination with a central repository is completely feasible.

Git also does not require every file and subdirectory to be modified by every merge, so based on that criterion it's no more astonishing that SVN merge tracking can work than it is that Git merge tracking can work.

SVN's merge tracking was an afterthought, and it was implemented using the general, user-visible metadata facility (properties). So historically svn was prone to problems such as repeated merging, and today it's still possible to make things complicated, then manually edit and botch the complex merge history, and suffer.

Fundamentally, svn:mergeinfo summarizes history rather than pointing only to merge-parents. I suppose the latter is what you mean when you speak of clear-thinking primitives. I wasn't involved in the mergeinfo design, but it seems less likely the result of muddled thinking than of a design decision recognizing that the entire revision graph isn't locally available to svn users and that the summary is sufficient under reasonable restrictions on usage.
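A rough sketch of what "summarizes history" means in practice, assuming a simplified form of the `path:revision-ranges` lines stored in svn:mergeinfo. The helpers `parse_mergeinfo` and `eligible` are hypothetical names for illustration, and real mergeinfo has more cases than this handles.

```python
def parse_mergeinfo(text):
    """Parse a simplified 'path:ranges' summary into {path: set of revisions}.

    Assumption: one source path per line, comma-separated revisions or
    inclusive 'start-end' ranges. Real svn:mergeinfo has more cases.
    """
    merged = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        path, _, ranges = line.strip().rpartition(":")
        revs = merged.setdefault(path, set())
        for part in ranges.split(","):
            start, _, end = part.partition("-")
            revs.update(range(int(start), int(end or start) + 1))
    return merged


def eligible(merged, path, branch_revs):
    """Revisions on the source path that the summary does not yet record."""
    return sorted(set(branch_revs) - merged.get(path, set()))


info = parse_mergeinfo("/branches/feature:5-7,9\n")
todo = eligible(info, "/branches/feature", [5, 6, 7, 8, 9, 10])  # [8, 10]
```

The summary answers "which revisions from that path have already been merged here" without needing the whole revision graph locally, which is the trade-off described above.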

Your condemnation of svn as a "hopelessly hamstrung" "dead-end" seems derived more from your dismissal of centralized version control generally than from the particular details of svn's design or implementation. In that dismissal I think you're taking too narrow a view of the ways in which people work together.


Wow, what I said just sailed right over your head.

> Git also does not require every file and subdirectory to be modified by every merge...

Uh, nooooooo... git always merges the whole repository. This makes merge tracking easy to implement in a complete way, and easy to reason about as a user.

> Fundamentally, svn:mergeinfo summarizes history rather than pointing only to merge-parents. I suppose the latter is what you mean when you speak of clear-thinking primitives.

No, I'm referring to the fact that git has a strict definition of both branches and the project tree. The notion of a merge isn't a primitive; it just sort of falls out of the primitive definitions naturally. Each revision in a git repo contains one and exactly one copy of the whole working tree. In Subversion the repository just has directories, some of which are project tree directories, some of which are branches, and some of which are tags. That lack of clarity leads to all sorts of problems with basic functionality that VCSes should have.
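Here is a minimal sketch of how merge tracking can fall out of the primitives, assuming a toy model in which each commit holds one whole-tree snapshot plus pointers to its parents. The `Commit` class and `ancestors` helper are hypothetical illustrations, not git's actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    name: str
    tree: str                 # stands in for one snapshot of the whole project tree
    parents: tuple = ()       # zero, one, or several parent commits


def ancestors(commit):
    """Return the names of the commit and everything reachable through parents."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c.name not in seen:
            seen.add(c.name)
            stack.extend(c.parents)
    return seen


root = Commit("root", "tree-0")
a = Commit("a", "tree-a", (root,))      # work on one branch
b = Commit("b", "tree-b", (root,))      # work on another branch
merge = Commit("m", "tree-m", (a, b))   # a merge is just a commit with two parents

already_merged = "b" in ancestors(merge)   # has b been merged? Walk the parents.
common = ancestors(a) & ancestors(b)       # shared history of the two branches
```

No separate merge metadata is stored here: "what has been merged" is answered by walking parent pointers, which only works because every commit refers to exactly one whole tree.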

> it seems less likely the result of muddled thinking than of a design decision recognizing that the entire revision graph isn't locally available to svn users and that the summary is sufficient under reasonable restrictions on usage.

What it seems like is that the designers had only ever used CVS, so what they were working on seemed so advanced to them at the time that they thought "cheap copies" and partial checkouts were just icing on the cake, and they had no idea what they were trading away.

However, in hindsight it's clearly not worth munging everything together. You are far better off forcing users to actually define branches and subprojects. The amount of work that svn saves you is insignificant next to the degradation of the information stored in the repository data structure.

> Your condemnation of svn as a "hopelessly hamstrung" "dead-end" seems derived more from your dismissal of centralized version control generally than from the particular details of svn's design or implementation.

It might seem that way because you don't appear to have understood a word I said. Nothing I've said had anything to do with distributed development. Everything I've said is specifically about svn's primitive concepts, and dead-end is a perfect way to describe it.

This ignorant defense of subversion needs to be stopped. I understand there are reasons people need to use subversion. I understand there are use cases that DVCSes don't fit. I certainly don't think git is the end-all-be-all of VCS. But defending subversion without an adequate understanding of its failings just makes you look bad. It's no different than a Java developer talking about how they don't see what's so great about Lisp macros. If you don't grok the concepts then any arguments you make are just noise.



