
Can it really be that simple though? If you are using a newer version of Git on your repo, which commits only with the newer hash, and I try to clone your repo with an older version, I will be unable to do so. I guess maybe that's acceptable though?



Yes, that is sort of Linus's plan: http://marc.info/?l=git&m=148798319024294&w=2

> You want to have a model that basically reads old data, but that very aggressively approaches "new data only" in order to avoid the situation where you have basically the exact same tree state, just _represented_ differently.

> That way everything "converges" towards the new format: the only way you can stay on the old format is if you only have old-format objects, and once you have a new-format object all your objects are going to be new format - except for the history.

As soon as there is one new-hash commit in a repo, all users of it will have to upgrade their git client - and that git client will (probably?) default to writing new-hash commits.


Probably not. I imagine the deprecation period for this change will be measured in years.


I think backwards incompatibility would be acceptable. Add read support for the new format to git, but then don't have widespread repositories using the new format until some period of time later. By the time they do become commonplace, everyone should already be running a version of git supporting them. It's not exactly hard to upgrade git in most situations anyway, just a simple invocation of:

    $ sudo $PKG_MGR upgrade git


Well, it's not hard to update the command-line git client on Unix-y systems with package management. The trouble will be with the hundreds/thousands of other programs that use Git in various ways and are essential to development workflows in various places: GitHub themselves, Microsoft and JetBrains IDEs, etc.


There are two potentially mitigating factors at play here:

First, I suspect a lot of the tools you mentioned already treat hashes as strings, not as 160-bit numeric types. The entire front-end JS for GitHub, for example, just uses strings. That's what I'd do if I were writing IDE integrations and such too.

Secondly, the new format will likely still be a 160-bit numeric type, just calculated using a different hash algorithm (e.g. it might be the first 160 bits of the SHA256 result). The tools you mentioned likely don't have to calculate said hashes, they just display them. The entire GitHub front-end, for instance, just displays whatever is given to it; commit hashes are input data to it, not output data.
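To illustrate the point (a minimal sketch with a made-up helper name and example hash values, not any tool's actual code): a string-based integration is length-agnostic, so a longer hash needs no code changes.

```python
# Hypothetical IDE-style integration: commit IDs are opaque strings,
# so a longer hash from a future Git needs no code changes.
def short_id(commit_hash: str, n: int = 7) -> str:
    """Abbreviate a commit hash for display, whatever its length."""
    return commit_hash[:n]

sha1_id = "a94a8fe5ccb19ba61c4c0873d391e987982fbbd3"  # 40 hex chars
sha256_id = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"  # 64 hex chars

assert short_id(sha1_id) == "a94a8fe"
assert short_id(sha256_id) == "2cf24db"
```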


Just using the first 160 bits of a new hash function was proposed at one point, but it's not part of the current plan. The new plan is to introduce full SHA3-256 hashes (which are 256 bits in size). More information here: https://docs.google.com/document/d/18hYAQCTsDgaFUo-VJGhT0Uqy...

(Of course, CLI and frontend tools could still truncate display output to 40 hex characters, but internally full size hashes will be used.)
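Such truncation is a one-liner. A sketch (using Git's real "blob <size>\0<content>" object serialization, though how Git itself will abbreviate new-format hashes is not settled):

```python
import hashlib

# Full internal object ID under the proposed SHA3-256 scheme (sketch).
full_id = hashlib.sha3_256(b"blob 5\x00hello").hexdigest()  # 64 hex chars
display_id = full_id[:40]  # truncated to the familiar width for UIs

assert len(full_id) == 64
assert len(display_id) == 40
```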


Woah, that's disorienting to be linked to a bluedoc unexpectedly like that, considering I'm not on my work account. I even recognize one of the authors.


What is a bluedoc?


It's just a Google Docs template used for internal engineering design docs at Google. The linked doc is a typical example.


Don't most of these go through the git codebase? Or just parse the output of standard commands? Curious what implications the hash change has for those.


I would assume that most other stuff uses libgit2 (except maybe the Microsoft stuff). Is that not a safe assumption?


One straightforward way is to use the new SHA only for new Git repos. Old repos could be migrated, but that would require "re-committing" everything to the new repo.
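A sketch of why migration means re-committing: a Git object ID is a hash of the object's serialized bytes, so identical content gets a different name under a new algorithm (the "blob <size>\0<content>" serialization is Git's real format; the SHA3-256 choice follows the linked plan and is an assumption here):

```python
import hashlib

# Git stores a file as "blob <size>\0<content>" and hashes those bytes.
content = b"hello"
obj = b"blob %d\x00%b" % (len(content), content)

old_id = hashlib.sha1(obj).hexdigest()      # object ID in current Git
new_id = hashlib.sha3_256(obj).hexdigest()  # ID under the proposed scheme

# Identical content, different name: every object (and every commit
# that references it) gets rewritten during migration.
assert old_id != new_id
```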



