The biggest and weirdest commits in Linux kernel Git history (2017)

yjftsjthsd-h · on Dec 25, 2020

> Rails is probably more representative of the average project; I expect that most git users don't know that octopus merges are even possible.

Joys of having Linux and git both written+managed by Torvalds :), although I do wonder if that points to a shortfall in docs/UI.

masklinn · on Dec 25, 2020

The docs of `git merge` do indicate acceptably clearly that multiple commit-ish can be given.

I expect it's really more that "octopus" merges are usually of limited use: most projects are not structured as a distributed "tree of trees" the way the linux kernel is, where merging multiple downstreams regularly is a normal workflow and thus octopus merges make a lot of sense.

ufo · on Dec 26, 2020

The weirdest one for me was learning that their git repository has more than one root commit.
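To make "more than one root commit" concrete, here is a minimal sketch over a toy commit graph. The commit names and the `root_commits` helper are made up for illustration; on a real repository, `git rev-list --max-parents=0 HEAD` should list the parentless commits reachable from `HEAD`.

```python
# Toy commit graph: each commit maps to the list of its parents.
# All commit names are hypothetical, not real kernel commits.
history = {
    "kernel-init": [],
    "kernel-2": ["kernel-init"],
    "btrfs-init": [],            # second root: developed out of tree
    "btrfs-2": ["btrfs-init"],
    "merge-btrfs": ["kernel-2", "btrfs-2"],
}


def root_commits(graph, tip):
    """Return the sorted parentless commits reachable from `tip`."""
    seen, stack, roots = set(), [tip], set()
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        parents = graph[commit]
        if not parents:
            roots.add(commit)
        stack.extend(parents)
    return sorted(roots)


print(root_commits(history, "merge-btrfs"))
```

Before the merge, each line of development sees only its own root; after it, the merge commit can reach both.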

dang · on Dec 25, 2020

If curious see also

2018 https://news.ycombinator.com/item?id=17265151

Discussed at the time: https://news.ycombinator.com/item?id=13647927

kalium-xyz · on Dec 26, 2020

SoC is mostly used for: within one packaging, not for: on the same silicon. For example: if you open the esp32 or 8266 you will see multiple chips even with their own packaging inside.

brundolf · on Dec 25, 2020

Is there some practice around making sure the octopus-merged branches concern separate files (have no conflicts)? Imagine resolving a 12-way merge conflict. Yeesh.

paxswill · on Dec 25, 2020

Octopus merge will fail if there are any conflicts that require manual resolution [0], but there are ways around this, like manually creating the commit. Raymond Chen has some interesting posts using this to split files while preserving blame history [1] (the manual method is linked at the beginning).

0: https://git-scm.com/docs/merge-strategies

1: https://devblogs.microsoft.com/oldnewthing/20190918-00/?p=10...
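One rough way to pre-screen branches before an octopus merge is to check whether any two of them touch the same files. The sketch below is only a heuristic with made-up branch and file names: overlapping files do not always conflict, and this is not how git's octopus strategy decides to bail out. The per-branch file sets are assumed to have been collected beforehand.

```python
from itertools import combinations


def overlapping_files(changes):
    """Map each pair of branches to the sorted files both of them touched."""
    overlaps = {}
    # Sort by branch name so the pair keys are deterministic.
    for (a, files_a), (b, files_b) in combinations(sorted(changes.items()), 2):
        shared = files_a & files_b
        if shared:
            overlaps[(a, b)] = sorted(shared)
    return overlaps


# Hypothetical branches and the files each one changed.
changes = {
    "net": {"net/core.c", "MAINTAINERS"},
    "fs": {"fs/inode.c"},
    "docs": {"Documentation/index.rst", "MAINTAINERS"},
}

print(overlapping_files(changes))
```

An empty result suggests the branches are independent enough to try merging in one go; a non-empty one points at the pairs worth merging separately first.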

lmm · on Dec 25, 2020

So why disallow merging unrelated histories? Does it cause any actual problems? In many ways it seems like a neater model than what git now does.

masklinn · on Dec 25, 2020

> So why disallow merging unrelated histories? Does it cause any actual problems?

Because it's usually a mistake. Think cutting one of your fingers or running off of a cliff, sometimes it's intentional but in most cases it's likely a mistake.

It's not actually disallowed as there can be the odd use for it[0], just not allowed by default anymore (since Git 2.9: https://github.com/git/git/blob/master/Documentation/RelNote...).

[0] as the article shows, the linux kernel does seem to have 2, though in both cases it could have been done otherwise, e.g. by rewriting the history of the external development to be inscribed in the normal kernel history

lmm · on Dec 26, 2020

> Because it's usually a mistake. Think cutting one of your fingers or running off of a cliff, sometimes it's intentional but in most cases it's likely a mistake.

It just seems like such a useless case to treat specially; merging a branch that branched off from the first commit in the repository is equally likely to be a mistake, but the check doesn't cover it. It feels like a lot of effort for a completely minor case that doesn't even cause a significant problem when it happens.

masklinn · on Dec 26, 2020

> It just seems like such a useless case to treat specially; merging a branch that branched off from the first commit in the repository is equally likely to be a mistake, but the check doesn't cover it.

It's both way less likely to happen (getting a handle on the first commit of a repository really is not something you can do by mistake) and way less inconvenient when it does happen (because merging forked branches is a perfectly normal operation and it doesn't really matter how much the branches have diverged).

> It feels like a lot of effort

It is not a lot of effort: it's just adding a check when merge-base returns nothing.

> for a completely minor case that doesn't even cause a significant problem when it happens.

It completely fucks up diff views of the merge commit.
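The "check when merge-base returns nothing" idea can be sketched on a toy graph. This is a simplified model, not git's implementation: `common_ancestors` returns every shared ancestor rather than the best ones, and `check_merge`, `UnrelatedHistories`, and the commit names are hypothetical. The `allow_unrelated` parameter stands in for git's `--allow-unrelated-histories` option.

```python
# Toy commit graph: commit -> list of parents. Names are hypothetical.
graph = {
    "readme": [],
    "feature-1": ["readme"],
    "main-1": ["readme"],
    "imported-root": [],
    "imported-1": ["imported-root"],
}


class UnrelatedHistories(Exception):
    """Raised when two commits share no ancestor at all."""


def ancestors(graph, tip):
    """Return `tip` plus every commit reachable through parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(graph[commit])
    return seen


def common_ancestors(graph, a, b):
    """Simplification: all shared ancestors, not only the best ones."""
    return ancestors(graph, a) & ancestors(graph, b)


def check_merge(graph, ours, theirs, allow_unrelated=False):
    """Refuse the merge when there is no shared history, unless opted in."""
    shared = common_ancestors(graph, ours, theirs)
    if not shared and not allow_unrelated:
        raise UnrelatedHistories(f"{ours} and {theirs} share no history")
    return shared


print(check_merge(graph, "main-1", "feature-1"))
```

In this model, two branches that forked from the first commit still share it, so the check passes; only histories with no commit in common trip it.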

lmm · on Dec 28, 2020

> way less inconvenient when it does happen (because merging forked branches is a perfectly normal operation and it doesn't really matter how much the branches have diverged).

Why are two branches that diverge after the initial readme commit so much easier to merge than two branches that diverge from the start? Why doesn't it just treat it the same way (as if the empty repo state was a commit that's the merge base)?

> It completely fucks up diff views of the merge commit.

Same question as above - why is the diff so different in that case?

t0astbread · on Dec 25, 2020

Linus' words from the linked mailing list thread:

> I'm very annoyed, because while the multi-root situation can be useful, it can also be confusing as hell. It can cause bisection problems, and it can just cause people to go "WTF?"

http://lkml.iu.edu/hypermail/linux/kernel/1603.2/01890.html

I guess history visualization tools would have a hard time with it as well. But in general, you normally start writing your changes from an existing commit anyways so why not branch from that?

masklinn · on Dec 25, 2020

> I guess history visualization tools would have a hard time with it as well. But in general, you normally start writing your changes from an existing commit anyways so why not branch from that?

The case where that feature can legitimately be used is systems which started as completely separate (possibly with no idea that they even could get merged ever) and then got integrated e.g. TFA provides the example of btrfs which started development out of tree.

In that case, just copying the project you're importing is obviously unacceptable (as it loses all history), and rewriting the entire history in order to "rebase" it has its own issues e.g. conflicts with existing files, loss of traceability since the commits get rewritten entirely and their hashes change, …

So while it has its own issues (which you note), there are situations where merging a separate root does have advantages which outweigh the issues. There are other alternatives (e.g. submodules), but all of them have their own drawbacks.

dmurray · on Dec 25, 2020

It doesn't seem fundamentally harder than pretending all of the changes happened in a different directory, and then that directory got moved from being a sibling to a subdirectory of the nominal project root.

Hello71 · on Dec 25, 2020

it's a problem in the git model, since all the commit hashes would change. that means that all existing references become invalid, including emails, commit messages, and pull requests.
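A simplified model shows why every hash changes. Here a commit id is just a SHA-1 over the parent id and the message; real git also hashes the tree, author, committer, and timestamps, so `commit_id` and `build_chain` are illustrative helpers, not git's object format. The commit messages and the `kernel-tip` parent are made up.

```python
import hashlib


def commit_id(parent_id, message):
    """Toy commit id: depends on the parent id, like a real commit does."""
    payload = f"parent {parent_id}\n\n{message}".encode()
    return hashlib.sha1(payload).hexdigest()


def build_chain(first_parent, messages):
    """Create a linear history and return the id of each commit in order."""
    ids, parent = [], first_parent
    for message in messages:
        parent = commit_id(parent, message)
        ids.append(parent)
    return ids


messages = ["add btrfs core", "add btrfs ioctl", "fix btrfs leak"]

standalone = build_chain("", messages)         # history with its own root
rebased = build_chain("kernel-tip", messages)  # same changes on a new parent

# Every commit differs, not only the first: each id feeds into the next.
changed = [old != new for old, new in zip(standalone, rebased)]
print(changed)
```

Because each id is an input to the next, changing the first parent ripples through the whole chain, which is why existing references to the old hashes stop resolving.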

kmeisthax · on Dec 25, 2020

I've actually had to do this just last week.

I had a web design client that decided to move to a platform with integrated Git deployment, for a WordPress site that was already developed on a different platform with its own Git deployment and all of the development history on there. The new platform had a different way of getting configuration, and a different set of platform integration plugins. So I couldn't just force-push the old platform's Git history; I actually did need to do a divergent-history merge in this case and resolve the result in the new platform's favor.

I assumed the warning meant that you couldn't do it, or that it would be super error-prone, or something; but it turns out that it worked perfectly fine with no problems (aside from standard Git merge conflict resolution).

laverya · on Dec 25, 2020

I've done it a few times, and it worked perfectly well each time. Here [0] for instance, a couple repos had evolved to be codependent, and so we combined them into one.

But it's something that you only have to do rarely. Projects merging is far from the normal PR after all, and it makes sense to give a warning in such a situation.

0: https://github.com/replicatedhq/kots/pull/511

ferbass · on Dec 26, 2020

`git cthulhu merge --force`