
> Fossil, in contrast, puts more emphasis on recording exactly what happened, including all of the messy errors, dead-ends, experimental branches, and so forth.

I think this is an interesting distinction. The question is: what is the function of the history? Is it to document what happened - and if so, shouldn't every keystroke be committed? Or is it to document which changes relate together, and e.g. should be reverted together? Or perhaps something else entirely?

One thing that isn't mentioned is whether Fossil supports working offline as well as Git does. My impression from this page is that it emphasises creating every branch on a server as well, which implies you need to be connected? If so, I'd consider that a "missing feature" as well, as it's something I use regularly enough and that's important enough that it'd be something I'd miss if it weren't there.




> One thing that isn't mentioned is whether Fossil supports working offline as well as Git does. My impression from this page is that it emphasises creating every branch on a server as well, which implies you need to be connected? If so, I'd consider that a "missing feature" as well, as it's something I use regularly enough and that's important enough that it'd be something I'd miss if it weren't there.

The way Fossil works is that autosync is turned on by default, which means that if you are connected to the internet, all of your commits, wiki changes, and issue tracker changes are synced to the server.

In the case that you're not connected, everything still works as expected, and will sync back up next time you're online and try to commit or just run `fossil update`.


Thanks, that makes sense then.


One should not have to choose - the history of complete units of work should be an abstract view of the detailed history.


So the abstract view should be useful and free from noise, but you want to be able to get at the noise anyways. But to what end? What's it to you if I typo and make other silly errors during development? What if I never commit until I'm done?

If I never commit until I'm done you will not see my mistakes, but if there's nothing like `git add -e`, then I won't be able to split up my commits logically and so the upstream history when I push will be devoid of useful detail. Even if I do commit early and often, I will almost never do all my work in the order and logical splits that make the most sense for others to see after I'm done... unless you give me a powerful rebase facility.

No rebase -> no clean history upstream, only lots and lots of merges with pointers to messy history (where authors commit early and often) or shallow history (where they commit only when done) in branches you'll almost never want to see. This is the worst possible outcome! The only logical history organization here is "the large merge commit". You get far too little abstraction on the actual history and far too much noise if you want more detail.

There's also the risk that, because you cannot separate bits of work as "done" and push them separately, your branch will accumulate ever more change and reach a point where you cannot continue without doing much more work, and then you won't be able to easily salvage any of the work done to that point in that branch. This one is a big deal to me too.


What I ended up doing before git was keeping multiple workspaces and copying completed changes from my messy workspace to the clean workspace and committing from there. Git's partial commit feature just made it a lot easier to do what I was already doing.


I haven't looked into Fossil in enough detail to know if your objections are valid complaints about the way it does things, but they are not objections to the principle that I set out in my original post.

If you have to rewrite the actual history to get what you call a 'clean' history, rather than just overlay the detailed history with a sparse abstract lattice joining the key points, then it is not a history at all. This could be a problem if you need to do a post-mortem analysis, or if you make a mistake in the rewriting of the history.

I tend to regard frequent extensive rebasing as a process smell - not necessarily a problem, but a warning that there might be one.


When I worked with VCS that couldn't rebase, my strategy was simply to not commit until everything was perfect. I had a local branch. It was just not version controlled.

When I later started using git, my workflows simply became safer and easier.


Could you give an example of where it is necessary (in principle, not just because that's the way the tool works) to rewrite the detailed history in order to get the history you want?

I realize that we generally have to work with the available tools, but it is also useful to think about how things would optimally work. When I wrote that frequent rebasing looks like a process smell, that could be because the tool is not optimal.


I rewrite history (not in the upstream) every day, multiple times a day. I do it to split commits. I do it to squash commits. I do it to reorder commits. I do it to make my commits easier to review by whoever is doing code reviews for me. I do it to make my commits logical: bug fixes get their own commits, features get their own commits, tests get their own commits if that's what the upstream wants, ...

It's the only way to do things that yields useful history in the upstream. What is useful history in the upstream? It's history that others can read (linearly!) that is informative and makes sense and makes it easy to bisect, git blame, and so on, to find bugs, to understand changes in the smallest logical units.

This is impossible to do without rebasing.

If you find yourself copying changes to another clone to commit them one at a time, then you are rebasing, and you just didn't know it.

History in the upstream is sacred. Unpushed history is absolutely not.
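
As a rough illustration of the bisect point above, here is a minimal Python sketch. It is a toy model over a list of commit ids, not how `git bisect` is implemented; `first_bad_commit`, `is_bad`, and the `c0`..`c7` history are made up for the example, and it assumes a strictly linear history in which every commit after the first bad one is also bad.

```python
from typing import Callable, List, Sequence


def first_bad_commit(history: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search a linear history (oldest first) for the first bad commit.

    Assumes the oldest commit is good, the newest is bad, and that once a
    commit is bad every later commit is bad too.
    """
    lo, hi = 0, len(history) - 1  # history[lo] is known good, history[hi] is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(history[mid]):
            hi = mid
        else:
            lo = mid
    return history[hi]


# Hypothetical history: eight small commits, with the bug arriving in "c5".
history = [f"c{i}" for i in range(8)]
tested: List[str] = []


def is_bad(commit: str) -> bool:
    tested.append(commit)          # remember which commits had to be checked
    return int(commit[1:]) >= 5


culprit = first_bad_commit(history, is_bad)
print(culprit, tested)  # c5 ['c3', 'c5', 'c4']
```

With eight commits the search needs three tests. The smaller and more self-contained each commit is, the more precisely the answer pins down the offending change, which is the argument for publishing small logical commits.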


Do you do code review using a formal tool, like Gerrit or Phabricator? If so you already have a "code review history", separate from the repo history. The code review history is at times interesting to review, because it contains discussions, tradeoffs, etc.

Given that we have this secondary history, why require a completely different tool to track and access it? That's just pointless duplication. We should track all the history using the same tool.

Now to your point, it's still useful to call out "final" commits for bisecting, blame, etc. So it would be good to group commits, and hide the detailed history by default.
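
To make the grouping idea concrete, here is a minimal Python sketch of one way it could work. Everything in it is hypothetical (the `group` tag, `summary_log`, `detailed_log`, and the sample commits); it is not a feature of Git, Fossil, or Mercurial, just an illustration of keeping every commit while showing one line per unit of work by default.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Commit:
    id: str
    message: str
    group: Optional[str] = None  # hypothetical tag: the unit of work this commit belongs to


def detailed_log(commits: List[Commit]) -> List[str]:
    """Every commit, exactly as recorded."""
    return [f"{c.id} {c.message}" for c in commits]


def summary_log(commits: List[Commit]) -> List[str]:
    """One line per run of same-group commits, labelled with the run's last commit id."""
    lines: List[str] = []
    last_group: Optional[str] = None
    for c in commits:
        if c.group is None:
            lines.append(f"{c.id} {c.message}")   # ungrouped commits are shown as-is
            last_group = None
        elif c.group == last_group:
            lines[-1] = f"{c.id} [{c.group}]"     # extend the current group to this commit
        else:
            lines.append(f"{c.id} [{c.group}]")   # start a new group
            last_group = c.group
    return lines


commits = [
    Commit("a1", "wip parser", "add parser"),
    Commit("a2", "fix typo", "add parser"),
    Commit("a3", "parser tests pass", "add parser"),
    Commit("b1", "bump version"),
    Commit("c1", "try cache", "add cache"),
    Commit("c2", "cache works", "add cache"),
]

print(summary_log(commits))   # ['a3 [add parser]', 'b1 bump version', 'c2 [add cache]']
print(len(detailed_log(commits)))  # 6
```

The default view points at the final commit of each group, so bisect and blame could operate on those, while the six detailed entries remain available on request.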


I very much like features like built-in wiki (which is trivial to do in Git anyways, using either a separate branch or a separate repo with a name derived from the base repo), built-in issue tracker (this is harder to do in Git, though there exist projects that do it), built-in code review, ...

Still, I've worked with codebases sized in the hundreds of millions of lines of code. To deal with that level of complexity one needs things like OpenGrok, cscope, and so on, to find one's way around. And when it comes to history, I could not care less about past code reviews or history internal to a feature branch. When I need to do `git blame` or look through the commit history of a large codebase, I want to see clean history with a high signal-to-noise ratio. The more noise, the slower I'll make progress on understanding whatever code/history I'm trying to understand, therefore the slower I'll make progress on bug fixing or feature development -- I might even give up on history, and lose a lot of important information, if the noise level is too high.

For me the ability to rebase, and to require clean history, trumps all the great things in Fossil -- each and every one -- that Git lacks. And this even though I love Fossil's design.


Is it really true you never want to refer to a code review history? It can provide important context missing from even a well-commented commit.

Regardless it's possible to have both. An example is hg's changeset evolution. With changeset evolution, each commit has two histories: the repo history and the changeset history. Commands like `blame`, `log`, etc. show only the repo history; a separate set of commands accesses the changeset history.

An example where this is useful: sometimes rebasing can inadvertently produce bugs, such as collapsing two identical lines which ought to have been duplicated. `git blame` cannot check if that happened. But the changeset history, by tracking the rebase, can tell you that.
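
A toy Python model of the two-histories idea, for illustration only. It is not Mercurial's data model or API; the `Repo` class, its method names, and the `retry()` lines are all invented here. The point is just that a rewrite records which commit it replaced instead of discarding it.

```python
from typing import Dict, List, Set


class Repo:
    """Toy repo that records rewrites instead of discarding the old commits."""

    def __init__(self) -> None:
        self.content: Dict[str, List[str]] = {}   # commit id -> file lines at that commit
        self.predecessor: Dict[str, str] = {}     # rewritten commit -> commit it replaced
        self.obsolete: Set[str] = set()           # commits hidden from the normal log

    def commit(self, cid: str, lines: List[str]) -> None:
        self.content[cid] = list(lines)

    def rewrite(self, old: str, new: str, lines: List[str]) -> None:
        """Record `new` as the successor of `old` (e.g. the result of a rebase)."""
        self.commit(new, lines)
        self.predecessor[new] = old
        self.obsolete.add(old)

    def log(self) -> List[str]:
        """Repo history: only commits that have not been superseded."""
        return [c for c in self.content if c not in self.obsolete]

    def evolution(self, cid: str) -> List[str]:
        """Changeset history: `cid` followed by every version it replaced."""
        chain = [cid]
        while chain[-1] in self.predecessor:
            chain.append(self.predecessor[chain[-1]])
        return chain


repo = Repo()
repo.commit("v1", ["retry()", "retry()"])   # two identical lines, both intended
repo.rewrite("v1", "v2", ["retry()"])       # a rebase that collapsed the duplicate

print(repo.log())             # ['v2']
print(repo.evolution("v2"))   # ['v2', 'v1']
print(len(repo.content["v1"]) - len(repo.content["v2"]))  # 1 line lost in the rewrite
```

The normal log shows only `v2`, but walking the recorded predecessors still reaches `v1`, so the dropped line can be found after the fact.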


Yes, it is. In my many years in this business I have never gone back to any code review -- of my code, my reviews of others' code, or anyone's review of anyone's code.

EDIT: I suppose I might look at past code reviews when evaluating a candidate for employment. Still, there is no need to store those along with code. And if a code review comment needs to be recorded for posterity, it gets recorded in the code or in the commit comment.


No. I worked for years with software that had no way of changing history. It was never required.


It was required at Sun, in Solaris engineering. Merge commits were absolutely verboten (which prohibition was enforced by tooling) -- therefore pushes had to be linear. Commit commentary had a very specific required format.
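
For a sense of what a "no merge commits" gate amounts to, here is a minimal Python sketch. It is not Sun's actual tooling, which is not described here; `merge_commits`, `check_push`, and the sample graphs are hypothetical, and the history is modelled simply as a mapping from commit id to its list of parent ids.

```python
from typing import Dict, List


def merge_commits(parents: Dict[str, List[str]]) -> List[str]:
    """Return the ids of commits with more than one parent, in insertion order."""
    return [cid for cid, ps in parents.items() if len(ps) > 1]


def check_push(parents: Dict[str, List[str]]) -> None:
    """Reject a push that contains any merge commit."""
    merges = merge_commits(parents)
    if merges:
        raise ValueError(f"push rejected, merge commits found: {', '.join(merges)}")


# Hypothetical pushes: one linear, one containing a merge ("d" has two parents).
linear = {"a": [], "b": ["a"], "c": ["b"]}
merged = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}

check_push(linear)            # passes silently
print(merge_commits(merged))  # ['d']
```

Calling `check_push(merged)` raises `ValueError`, which is the toy equivalent of the gate refusing the push until the work is rebased into a straight line.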

Clean, linear history has never been required anywhere else I've worked, but I've done it ever since Sun taught me to. Just because it's not required doesn't mean it's not a good idea, and it can't be forbidden ("what happens on my dev instances, stays in my dev instances", and the only thing seen in the end is what I choose to publish, and it's going to be clean and linear).

Large projects at Sun used a rebase-heavy / rebase-only workflow like so:

                +-----------------+
                | Upstream "gate" |<------+
                +-----------------+        \
                  /     \                   \
                 /       \                   \
                v         v                   \
    +--------------+    +--------------+     +--------------+
    | Project gate |    | Project gate |     | Project gate |
    | (re)based on |    | (re)based on | ... | (re)based on |
    | build N of   |    | build N+1 of |     | build N+M of |
    | upstream     |    | upstream     |     | upstream     |
    +--------------+    +--------------+     +--------------+
            ^             ^
            |            /
            |           /       ...
            v          v
      +-----------------+
      |   dev clones    |
      |                 |--+
      | periodically    |  |
      | rebased --onto  |  |--+
      | next rebasing   |  |  |
      | of project gate |  |  |
      +-----------------+  |  |
        | ...              |  |
        +------------------+  |
           | ...              |
           +------------------+

In large projects individuals tracked a project fork of the upstream, which forks were periodically rebased onto the latest "build" of the real upstream, and the individuals' forks of the project "gates" were rebased onto the latest project fork as needed. At the end, when all the i's were dotted and t's crossed, the tech lead would push the project gate's linear history additions to the upstream.
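
The rebasing step in that workflow can be sketched in a few lines of Python. This is a deliberately simplified model: it ignores conflicts entirely, and `Commit`, `rebase_onto`, and the `build-N` names are made up for the example rather than taken from Teamware, Mercurial, or Git.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Commit:
    id: str
    parent: str
    patch: str


def rebase_onto(branch: List[Commit], new_base: str) -> List[Commit]:
    """Replay `branch` (oldest first) on top of `new_base`, giving each commit a new id."""
    rebased: List[Commit] = []
    parent = new_base
    for c in branch:
        new = Commit(id=f"{c.id}'", parent=parent, patch=c.patch)  # same change, new parent
        rebased.append(new)
        parent = new.id
    return rebased


# A dev clone with two commits based on build N of the project gate.
dev = [Commit("d1", "build-N", "fix A"), Commit("d2", "d1", "feature B")]

# The project gate has moved to build N+1, so the dev commits are replayed onto it.
rebased = rebase_onto(dev, "build-N+1")
print([(c.id, c.parent) for c in rebased])  # [("d1'", 'build-N+1'), ("d2'", "d1'")]
```

The patches are unchanged but every commit gets a new identity and a new parent, so the result is still a straight line on top of the latest build, with no merge commit recording the catch-up.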

We did this with early 90s tech known as Teamware, with lots of scripting on top. We later did it with Mercurial (again, with lots of scripting on top). Mercurial was a mistake. Git is much, much easier to use this way than any other VCS I've ever worked with, which for me includes: CVS, Clearcase, PRCS, Subversion, Mercurial, Git.


> Mercurial was a mistake.

Anecdotally, as a fellow Sun alumnus, I strongly disagree (and I know many Sun alumni would as well). I greatly miss using Mercurial instead of git now that I work elsewhere. The cadmium extension we had in-house at Sun more than made up for any plausible deficiency, and Mercurial phases let me mostly drop use of cadmium.


Yup, and then you had no real history. And one big commit. Super useful, not.


You're just assuming that. And, despite your snark, you are completely incorrect about what my commits looked like.


Ay, I meant the general you, and the snark was directed to VCSes that lack an index and rebase.

You wrote this:

> > > When I worked with VCS that couldn't rebase, my strategy was simply to not commit until everything was perfect. I had a local branch. It was just not version controlled.

If I were to do that (and I have) with anything other than Git, I'd have a hard time splitting up the commits in the end. Mercurial has `hg record`, which is akin to an atomic `git add -p && git commit`. I don't think Fossil has anything even like Mercurial's `hg record`, and it famously lacks an index/staging area.

(So Mercurial has an index! but as always with Git features belatedly adopted by Mercurial, it's a pain to use in Mercurial. If you want to stop in the middle your choices are: say 'N' and commit what hunks you've accepted so far, or quit and abandon the hunk selection work you've done so far. And you don't get to edit hunks.)

> > > When I later started using git, my workflows simply became safer and easier.

Mine too.


Ah. I see what you were saying now, but for me it was just more work to split things manually. The end result didn't change. At least, not much.

This highlights one of the benefits to framing comments in a positive manner: they tend to be inoffensive even when misunderstood.


`hg record` has been supplanted by `hg commit -i` (for `--interactive`), which has an improved UI that is certainly more flexible than the old `hg record`.



