
I think the reason for why the commit hash has to change is that a commit represents the entire state of a repository, not just the change made in the commit. Being able to take a sequence of commits and insert them into a repository just is not a thing that makes sense in git's model.

If you just hashed diffs, you would not get whole-repo integrity guarantees.

It is possible to go the other way with patch theory (see Darcs) but it's far from trivial to implement performantly.
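To make the hashing point concrete, here is a minimal Python sketch of how a Git commit ID is derived. It assumes SHA-1 object IDs and the usual `tree` / `parent` / `author` / `committer` layout of a commit object. The helper names and the fixed author line are made up for illustration.

```python
import hashlib

def git_object_id(kind, body):
    """SHA-1 object ID in Git's style: hash of '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def commit_id(tree, parents, message):
    # Hypothetical fixed identity and timestamp, so only tree/parents/message vary.
    who = "A U Thor <author@example.com> 0 +0000"
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}", "", message]
    return git_object_id("commit", ("\n".join(lines) + "\n").encode())

EMPTY_TREE = git_object_id("tree", b"")  # ID of a tree with no entries

root_a = commit_id(EMPTY_TREE, [], "root A")
root_b = commit_id(EMPTY_TREE, [], "root B")

# Same tree, same message, same author: only the parent differs.
on_a = commit_id(EMPTY_TREE, [root_a], "add feature")
on_b = commit_id(EMPTY_TREE, [root_b], "add feature")

print(on_a == on_b)  # False
```

Because the parent IDs are part of the hashed bytes, placing the same snapshot on top of a different history necessarily yields a different commit ID.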




> I think the reason for why the commit hash has to change is that a commit represents the entire state of a repository, not just the change made in the commit.

Yes, of course I realize that's the reason. My entire point was that a commit shouldn't represent the entire state of a repository.

> Being able to take a sequence of commits and insert them into a repository just is not a thing that makes sense in git's model.

Yes, and this is exactly why I declared this to be a fundamental flaw in git's model.

> If you just hashed diffs

Diffs are an implementation concern, which I don't care about. I'm only talking about the logical semantics.

> you would not get whole-repo integrity guarantees.

As I explained, I wasn't suggesting you must get rid of that hash entirely: "Of course it seems fine to have a hash that depends on the history, and it's very likely useful for many purposes, but that shouldn't be the primary mechanism for identifying commits."

> It is possible to go the other way with patch theory (see Darcs) but it's far from trivial to implement performantly.

Again, I didn't say you have to get rid of the current hashes. I was just saying we need something else to use for identifying commits.

------

If an example helps: consider what happens when you (say) sign off on a commit. Are you genuinely signing off on the history? Can you claim with a straight face that you know everything in the history behind every commit you sign off on? The reality is, you don't, and you don't need to, because you're only concerned about the commit itself. There's no reason a change in history should invalidate your signature. (Of course, the point here is not just signatures. They're just one example to illustrate what I'm saying. You can think of other scenarios.)
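The signature scenario can be sketched in a few lines of Python. This is a toy model, not Git's signing mechanism: HMAC stands in for a real GPG or SSH signature, `digest` is a made-up content hash, and the key and parent values are hypothetical.

```python
import hashlib
import hmac

KEY = b"demo-signing-key"  # hypothetical secret; real Git signing uses GPG/SSH keys

def digest(*parts):
    """Toy content hash over the given fields (not Git's real object format)."""
    return hashlib.sha256("\0".join(parts).encode()).hexdigest()

def sign(payload):
    return hmac.new(KEY, payload.encode(), hashlib.sha256).hexdigest()

def verify(payload, signature):
    return hmac.compare_digest(sign(payload), signature)

state = digest("README=hello")             # stands in for the checked-out state
commit_old = digest(state, "parent=aaaa")  # state plus the original history
commit_new = digest(state, "parent=bbbb")  # same state, rewritten history

sig_over_commit = sign(commit_old)
sig_over_state = sign(state)

print(verify(commit_new, sig_over_commit))  # False: the history-bound signature breaks
print(verify(state, sig_over_state))        # True: the state-only signature still verifies
```

A state-only signature stays valid under any history that ends in the same state. That is the property being asked for here, and it is also why such a signature could be carried over to a different history.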


> Are you genuinely signing off on the history?

No, you are signing off on the current state of the repository. Otherwise it would be possible (not trivial, but possible) to take a signed commit and apply it on top of a different history, which could create a security loophole.

Your view of a commit is a logical set of changes. Git's view is the state of the repository. The set of changes between revisions, which is often more useful for a developer to see than the whole state, is computed on the fly.

> I was just saying we need something else to use for identifying commits.

The commit message?


> Your view of a commit is a logical set of changes. Git's view is the state of the repository.

No, my view of a commit is not a logical set of changes. It's everything that would be in my worktree if I checked out the commit. Which is neither merely the changes from the previous commit(s), nor the entire history leading to the current commit.


But git already has this object: it’s called a tree, and each commit points to exactly one tree. Commits are the objects that carry history and metadata on top of the trees. Is your objection that the commit metadata is attached to the commit and not the tree?
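As a rough illustration of what a tree ID does and does not capture, the Python sketch below hashes a flat directory the way Git hashes blobs and trees. It assumes SHA-1 IDs, regular files with mode `100644`, and no subdirectories. The function names and file contents are made up for the example.

```python
import hashlib

def object_id(kind, body):
    """SHA-1 object ID in Git's style: hash of '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def tree_id(files):
    """ID of a flat tree of regular files given as {name: content bytes}."""
    body = b""
    for name in sorted(files, key=str.encode):  # entries are ordered by name
        blob = object_id("blob", files[name])
        body += b"100644 " + name.encode() + b"\0" + bytes.fromhex(blob)
    return object_id("tree", body)

v1 = {"README": b"hello\n"}
v2 = {"README": b"hello\n", "notes.txt": b"todo\n"}

# Three snapshots in a row; the third reverts the second.
snapshots = [v1, v2, v1]
ids = [tree_id(s) for s in snapshots]

print(ids[0] == ids[2])  # True: identical contents give the identical tree ID
print(ids[0] == ids[1])  # False
```

The tree ID depends only on the contents, never on how the repository got there. The flip side is that a revert reproduces an earlier tree ID, so a tree on its own does not single out one commit.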


To a first approximation, yes, I think that captures the general idea.

But I don't think it's literally as simple as just taking the commit metadata and slapping it onto the tree objects.



