Pijul is a free and open source (GPL2) distributed version control system. Its distinctive feature is that it is based on a theory of patches, while still being fast and scalable. This makes it easy to learn and use, without any compromise on power or features.
I downloaded this and tried to follow the documentation [0]. `pijul add` doesn't accept wildcards, instead returning a pretty unhelpful error message:
> λ pijul add *.*
> Error: The filename, directory name, or volume label syntax is incorrect. (o error 123)
The documentation for add [1] gives no information on the format, however I put 2 and 2 together and managed to do `pijul add . -r`. The next step is `pijul record` [2] which returns another error:
> Error: No identity configured yet. Please use `pijul key` to create one
Ok great. I ran `pijul key` [3] (which isn't documented) which returns 0, with no output. `pijul record` still says the above, so I tried `pijul key --help` which tells me I need to run a _different_ command (`pijul key generate <LOGIN>`) but doesn't tell me what `login` is supposed to be. I ran `pijul key generate <my_email>` which appears to have worked, and then `pijul record`.
I actually killed it after an hour and 15 minutes, rather than waiting to see how long it would take. For reference, git add * takes about 3 minutes on the same project. Between the documentation, error messages and performance I can't see myself taking another go at this for a while.
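For anyone else kicking the tires, the sequence that eventually got me to the record step was roughly this (from memory, so double-check the exact flags with `pijul help`):

    pijul add -r .
    pijul key generate <login>
    pijul record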
On POSIX shells, yes. But from the error message, maccard is on Windows, and Command Prompt doesn’t expand wildcards. (PowerShell, I have no idea.) Windows programs where wildcards are reasonable are expected to resolve them themselves.
Honestly as a user it doesn't matter, ~it's a bad look for a tool when its biggest competition _does_ handle it.~ - I'm going to consider it a feature of the competition!
The goal of a beta version is to collect feedback, so thanks for that. However, that particular command worked in all the repositories I tried, so I'm really curious about what you are doing.
To differentiate from Git, Pijul should focus on usability. That's by far the main complaint people have about Git. Not many people care whether it is patch based or snapshot based; it's super hard to explain what the advantages of that are.
If Pijul has an easy to use interface like Mercurial did then that will massively help adoption.
> To differentiate from Git Pijul should focus on usability... If Pijul has an easy to use interface like Mercurial did then that will massively help adoption.
Mercurial is effectively dead (outside of Facebook?) now. If the goal is simply to grow a project, repeating the same mistake doesn't seem like a good idea.
I don't think the goal or differentiation of pijul is to be popular via good UI, though. If the theory of patches is good, it doesn't matter if pijul "wins" or not, as long as whatever does can integrate it. If the theory of patches is bad, I don't want pijul "winning" just because it has a good UI.
> > To differentiate from Git Pijul should focus on usability... If Pijul has an easy to use interface like Mercurial did then that will massively help adoption.
> I don't think the goal or differentiation of pijul is to be popular via good UI, though. If the theory of patches is good, it doesn't matter if pijul "wins" or not, as long as whatever does can integrate it. If the theory of patches is bad, I don't want pijul "winning" just because it has a good UI.
Note that usability and UI aren't synonymous. A big part of usability is the mental model that users build up in their heads. With Git, the naive mental model doesn't match what Git is doing behind the scenes at all, which is a big reason Git's command line syntax is considered arcane.
It is possible to stick a Git 'porcelain' on top that does match the naive user's mental model (eg. https://gitless.com/), but very few use them.
If Pijul's Theory of Patches has superior usability, part of that superiority is likely to be how easily users can internalize a mental model of its operation (learnability). The UI can help or hinder that process with various affordances or lack thereof.
So, from your perspective, a 'good' UI would be one that reveals the Theory of Patches in a useful, learnable, usable way; and ought not be just some superficial gloss hiding away the internals from users. In that sense, surely 'winning' due to 'good UI' is a worthy goal to strive for?
I'm not writing a usability thesis here, I'm just using "good" to abbreviate the GP's "an easy to use interface like Mercurial".
I don't think a good UI would be contingent on it surfacing the theory of patches. It needs to surface productive workflows supported by the theory of patches. One way to do that is to teach it; another way to is build something else on top of it.
What I don't want is for pijul to become something I have to integrate into my toolbox just because the next semi-technical founder I sling for found it "easy" three years ago. I already had to deal with git taking over a) because it was fast, b) because github was easy.
> What I don't want is for pijul to become something I have to integrate into my toolbox just because the next semi-technical founder I sling for found it "easy" three years ago.
If Pijul becomes popular, somebody will inevitably make it easy to use, so on some level it doesn't matter exactly why Pijul becomes popular in the first place.
> If Pijul becomes popular, somebody will inevitably make it easy to use
That's exactly why it shouldn't become popular because it's easy-to-use.
If something becomes popular because it's efficient (git), or has some beautiful core logic (pijul), or because it supports well-integrated workflows (fossil), then it will eventually be that and easy-to-use.
If something becomes popular because it's easy-to-use but sucks in every other way, then we are stuck dealing with it for another 30 years.
My point was that regardless of why something becomes popular, some people will add it as a dependency solely because of the popularity, even when the original reason for that initial popularity ceases to be valid.
Are you saying that it was a mistake for Mercurial to make their software easy to use? That's a strange thought!
Mercurial didn't lose because it was too easy to use. It lost because it was too slow and because GitHub became too popular. Those are orthogonal issues.
He's on Windows; https://news.ycombinator.com/item?id=29993842 mentioned that Windows cmd doesn't expand * (like if it were quoted, on Linux) and thus it's expected that Windows CLI apps expand the * themselves.
MUST is a strong word. I would rather not have pijul handling wildcards in Command Prompt. After all, who uses Command Prompt nowadays, when WSL2 and PowerShell are available at the + button in Windows Terminal.
PowerShell doesn't expand wildcards by default either, because it also assumes that commands handle their own wildcards by default.
You can cobble together the expansion somewhat easily yourself with something like Resolve-Path in PowerShell, but PowerShell doesn't use a posix-like approach here with "native" commands either because Windows (and DOS) has always left wildcard handling to EXEs by default.
It’s a must if you consider the windows platform to only have the native windows shell(s).
I have no data, but I’d wager cmd is still much more widely used than either ps or wsl is.
Of course the dominance of windows on the desktop among developers is smaller than it is elsewhere, but it wouldn’t even surprise me if as many git users use cmd as any posix shell.
Usually, yes. With git specifically, I find that it's better when git expands the wildcards though (ie I quote them so the shell doesn't). This allows things like "git add *.py" which will work over the whole repo.
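Concretely, the difference is just who expands the pattern - the shell or git. A quick sketch (bash-style quoting):

    git add *.py      # unquoted: the shell expands it, so only .py files in the current directory
    git add '*.py'    # quoted: git treats it as a pathspec and matches .py files across the whole repo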
I think you intended an asterisk to display, but it didn't, because it's being interpreted as the start of an italics markup. So you need to escape the asterisk on this line with a backslash (i.e. type \*).
The source code isn't public on github, you have to create an epic games account and link it. Steps are here [0]. There's 277,925 files and 22,000 directories. Installed from the community maintained builds linked from the documentation for windows.
It is fast for some operations indeed. But its merge is not always correct (conflicts might reappear, lines might get shuffled around randomly), and rebase and blame aren't super fast (Pijul isn't as optimised for speed as Git, yet Pijul credit is faster than Git blame, and apply is faster than the diff-and-replay strategy of Git rebase).
FWIW, git on windows ships with a (originally msys based) posix environment so that it can have only a single frontend as well (unlike unix shells, the windows command shell does not do wildcard expansion)
Thanks, but our problem is different: the data structure on which all of this is based is memory-mapped, and does need some filesystem and OS cooperation to work well.
I haven't tested Pijul on Windows much, and I know that at least at some point it didn't work on WSL.
I think you're going to open a world of hurt by doing that. The windows command tools don't really like *.cpp, or a*.cpp for files. If you _do_ have a file called *.cpp and another called a.cpp in the same directory, explorer will display the asterisk as a missing character. Calling `del *.cpp` will delete both files too.
No problem. This is on plain windows, not on wsl. I have a Mac that I can try it on again later (although it will have to be a much smaller repository heh)
Sanakirja even reached a point where bugs were found only in the last few functions without comments. Writing the comments to explain exactly what each line does in all cases (there are lots of cases everywhere) fixed everything.
Congrats on the beta release! I've been following along for a while with great interest, looking forward to picking out a couple repos and shadowing git with pijul now that it's beta.
Since you mentioned sanakirja, I'd like to ask: the obvious (to me) reason to build it, rather than use SQLite, is easy ABI compatibility with the rest of the project through Rust. Are there other architectural reasons it's particularly suited to pijul?
I ask because I already store as much as I can in SQLite databases, and am moving to patchsets and changesets for certain kinds of document-oriented data management. Storing pijul's artifacts in that same system would be a natural fit for my application, so I wonder how closely tied together the two codebases are on a practical level.
> Are there other architectural reasons it's particularly suited to pijul?
Yes, I originally wrote it because I wanted to fork tables without copying a single byte. This is how channels (similar to Git branches) are implemented in Pijul.
> I wonder how closely tied together the two codebases are on a practical level.
Pijul can use a generic backend, although in practice only variants of Sanakirja have been implemented. Sanakirja is generic enough to handle different workflows, although large values aren't implemented yet (they aren't hard to implement, I just never needed them and nobody else did it).
As it happens, SQLite added some of the machinery from Fossil for no-copy tables and changesets as an optional extension. I mention this because it gives me hope that I might cobble something efficient together in SQLite. That said, I could offer a few reasons I'm glad Sanakirja exists aside from Pijul, and the advantage to your project of being able to tweak both sides of the system is compelling.
Congratulations again on the beta. Doing hard and basic work for years is to be commended.
Hi, I have some questions about Sanakirja, if you don't mind..
It stores 4kb blobs, right? Does Pijul first parse the data (copying it to other allocations), or does it use the data as is? I mean, there are some libraries like cap'n'proto[0] and rkyv[1] that can directly use the file contents as an in-memory data structure, without a deserialization step (or rather, deserialization is simply a quick validation to check the data isn't corrupted), I was wondering if Pijul did anything like that.
And like, is this btree header [2] stored exactly like this on disk, and does Sanakirja exploit this to avoid further copying data?
(I guess there's trouble with compression there: to decompress you really need to write into another buffer.)
Also, is the I/O done with something that prevents userspace copies, like mmap or io_uring, or does it eventually call read() to copy the data into its own buffer?
I want to build something like Sanakirja, but with those features to avoid copying data, so I'm wondering if there's any overlap.
I just saw it uses mmap. My other concern was that mmap handles I/O errors asynchronously through unix signals, and it isn't possible to pinpoint exactly which operation caused an error. Does this mean that a write error could cause data loss (other than what was written)? Or does Sanakirja guard against that?
The worst case scenario is that if the program doesn't know about a write error right away, ensuing read operations on the same location could read corrupted data, which could lead to further erroneous operations, until the error is dispatched to the signal handler and (hopefully) the program exits.
Sanakirja has atomic commits; the only assumption is that just a few bytes can be written to disk atomically (I don't remember exactly, but certainly at most 8 bytes).
You're not the only one, I too like it. But there comes a time when you want your merges and rebases to work predictably (absolutely never reshuffling your lines randomly), your conflicts to be solved once and for all (without hacks like rerere!).
Maybe you may also want your tool to serve your workflows and not the opposite.
> Maybe you may also want your tool to serve your workflows and not the opposite.
This sounds like you claim that pijul serves all workflows well, which would contradict my own experience. I'm so used to thinking in branches / PRs (to display a history and show lineages) that I find pijul's current way of handling channels/patches alien. Right now I would need to break my workflow in my daily operation to use pijul.
I like Git, but it does give me tons of conflicts when a smarter system shouldn't.
I recently merged two functions from two different branches. It interleaved the lines because the functions were similar. Tooling can help with this problem of course, and choosing different merge strategies is an option, but that, I think, is what Pijul is attempting to solve: reducing conflicts when slicing and dicing past histories.
For example, Git a) has some really upsetting merge semantics, and b) can only remember how to resolve merge conflicts you've solved before via the hack that is `git rerere` - which I call a hack because it is entirely not part of the core Git model of the world.
Git views history as a series of (notionally) atomic snapshots; programmers tend to view history as a series of diffs; Git tries to supply a compatibility layer to better support how we naturally think of history, but it's such a leaky abstraction.
I'm not sure I agree that history is "naturally" a series of diffs. Recording snapshots is natural enough for me; you just store whatever the state is at commit time. Diffs are generated by comparing two different states. No version control stores all the changes you make between two commits, only an approximation of them to get from one state to another.
Git's "problem" is that you can get identical content via different routes, but due to the way git fundamentally works, history is part of the snapshot and two snapshots with identical contents but different ancestry are not the same. Actually implementing a system that handles this properly is far from trivial, and git makes the tradeoff in favour of implementation simplicity.
Storage is not really the issue here. If you merely think about snapshots, all this is fine.
The problems come when you merge and rebase your snapshots, possibly solving conflicts in the process. Then none of this snapshot thinking makes sense.
And actually, Git knows that well, since its default merge algorithm diffs the tips of branches with the youngest common ancestor. And rebase "replays" diffs (how would you "replay" snapshots?).
Suppose I type a line into a file, and then I ask you what I just did. If you answered "You updated the contents of the file from <complete contents of the file> to <complete contents of the file>", I'd look at you like you were insane. An actual human would answer "You added the line <blah> at position <blah>", or "You added the line <blah> after the line <blah>", or similar.
So I claim that very small changes are obviously thought of as diffs. And what is a large change but a composition of small changes? Pijul's model naturally represents the process of creating a change - you could slice up the diff as small as you liked without anything about the mental model changing, down to individual keystrokes if need be - whereas Git's model gets more and more unnatural when you do that.
> I'm not sure I agree that history is "naturally" a series of diffs.
GP cites rerere as evidence it is, and I'm inclined to agree - at least when I'm merging. During a few operations like a bisect, I probably view history as snapshots. But during merges and rebases, I definitely view it as diffs; I question whether someone can even explain these operations abstractly without resorting to a diff-based explanation. And I merge/rebase 100x more than I bisect.
Patch based version control like Pijul can deal with merges better than snapshot based systems like Git.
E.g. if you merge two branches with the same patches but in different order, but the file contents are the same at the end, git will need to you to manually address the merges for each patch. In a patch based system, there is no conflict if two patches can be reordered with the same final result.
I have not used Pijul a lot, but I did use Darcs before Git. Some of the merges felt like magic.
Git is fine but it can get messy with complex branching and merging strategies. Patch based systems are intended to improve that.
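To make "same patches in a different order" concrete: if two patches touch independent parts of a file, applying them in either order gives identical content - that's the property a patch-based system records directly instead of re-deriving it from snapshots. A toy sketch with plain diff/patch (a.diff and b.diff are hypothetical patches to different sections of file.txt):

    # starting from the same original file each time:
    patch file.txt < a.diff && patch file.txt < b.diff   # A then B
    # reset file.txt to the original, then:
    patch file.txt < b.diff && patch file.txt < a.diff   # B then A - same final content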
> E.g. if you merge two branches with the same patches but in different order, but the file contents are the same at the end, git will need to you to manually address the merges for each patch.
That's not really true - only in the case where the changes are ambiguous.
I've worked on very hairy Git repos and haven't really felt blocked, even when some crazy merge conflicts happened.
I was surprised that the post numbers appear to be sequential, and that there would be ~30M. Over 15 years of HN, that's ~5K posts/day, and about 200 new posts/hour. Things accrue. I'd estimate that HN is well north of 300K readers/day ... https://news.ycombinator.com/item?id=9219581
edit - it looks like each comment has its own item id. So posts are both new items and comments. That makes more sense to me in terms of scale.
In any case, the answer to your question is that Pijul can solve certain kinds of merges automatically when Git gives you a conflict that needs manual intervention.
What made it really click for me is understanding the internals and how git works. Then all the terms like head, branch, parent, commit, tree, etc. made sense for me and the way git commands work is intuitive.
I still have to look up the order of arguments for commands like git rebase, but overall, it makes complete sense.
I can really recommend "Pro Git" by Chacon and Straub. I got a printed version, but it is also available under the CC-NC-SA here: https://git-scm.com/book/en/v2
None of this helps you understand how merging works, of course. Pro Git is great at telling you mechanically how to merge, but doesn't explain at all what can go wrong, and I don't think it actually describes the merge algorithm anywhere (certainly not in the main body of the book, or the section on Internals).
Yeah, I like Git; it's a very good implementation of a simple + powerful idea. It does have its pain points though.
Specific things I've experienced that could be better in Git:
- Merge commits that aren't actually merges. I work on a game team w/ non-technical people who commit to the repo; periodically it'll happen that they get confused with a merge conflict, try to back everything out, and end up committing a "merge" which actually just completely drops one of the parents. Then some time later someone notices stuff is missing, and it's really annoying to go back and fix. This shouldn't even be possible! (And in Pijul it is not.)
- Inability to use VC while resolving conflicts. In git, being in a conflicted state is a "special" circumstance. Let's say you do some huge merge, and you have a gazillion conflicts. You can't gradually fix these conflicts, committing as you go, nor can you collaborate with someone else to fix them. This is annoying and unnecessary. In Pijul having conflicts is a first-class state of the system, and you simply add more commits to resolve the conflicts.
- Bad "cherry-picking" supporting in git (e.g., inability to cleanly share bug fixes dev <-> stable). Let's say you make a bug fix on a dev branch which should also be applied to a stable release branch. If it's a single commit, you can cherry-pick; if it's a serious of commits, you can try to cherry-pick them individually, though this gets hairy. But in either case git doesn't actually "track" what happened: the cherry-picked commits are recorded as completely new commits that just happen to have the same content. Later on you're likely to get spurious conflicts. In Pijul (as I understand it, from reading) you just have the same commit included in two channels.
- Lackluster support for binary files. If you check in binaries, the repo gets huge and all operations slow down (and GitHub sends you angry emails threatening to delete your repo). So you use centralized things like git-lfs to get around that, but those are a little janky and not as nice as having everything in one VC. Not sure if Pijul does this better or not, but there's definitely room for improvement here.
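Here's the cherry-picking bullet as a rough sketch (`<fix-sha>` stands in for the bug-fix commit on dev; the commands are standard git, and whether you actually hit conflicts later depends on the surrounding changes):

    # copy the fix to the release branch; this creates a brand-new commit
    # whose identity is unrelated (in the DAG) to the original commit on dev
    git checkout stable
    git cherry-pick <fix-sha>

    # later, when dev is merged back, git sees two independent commits with
    # overlapping content, which is where the spurious conflicts can come from
    git merge dev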
I also suspect there are other advantages that would become clear working w/ a patch-based system, but since I haven't yet had the pleasure of doing so, I can't be sure. :-)
I'm old enough to remember when everyone used to use svn, and git was the crazy new upstart -- back then, very similar arguments were made against git as are made against patch-based VC now. Stuff like how distributed version control was too complex, and what did it really buy you if everyone was using a central repo anyway?
It's also interesting to note that distributed VCs had existed for a while, but they didn't break through until there was a really good implementation with a famous initial user (Linus and Linux). And git really took off once the excellent GitHub site was made.
For software to be good, besides strong theoretical foundations, you need to nail all kinds of nuts-and-bolts engineering and design issues. And for software to be successful, you need to get the social and publicity factors right as well.
Will Pijul manage to fulfil all those? Who knows, I hope so!
And I think inevitably something will come along and replace Git, and probably it will be patch-based.
Git has hit a strong "local maximum" of "good enough". If you like git, there's probably no reason to move to something better, because it is a comfortable "local maximum" (especially with the network effects of everyone now learning/using git and git just being the comfortable de facto winner).
Everyone here points out that the big advantage to something like pijul is in how it does merges. Git has a set of extremely complicated merge algorithms (the most common up to today has been "recursive merge" which you'll see mentioned a lot in git console output, it's on the way to being replaced with ORT [Ostensibly Recursive's Twin] which is a from-scratch rewrite of the "recursive merge" algorithm with a better understanding of the problem space; there are a couple other algorithms [strategies in git parlance] that have more niche uses). When they work well, they do a brilliant job, but they are extremely complex algorithms and have to do a lot of work to figure out things like "what things got renamed/moved and where between these two branches?". Git mostly doesn't store any of that information at all and generally computes it all from scratch every time it is needed. (You'll notice, again, if you watch console output a lot how often "rename detection" alone often runs, sometimes on the same commits over and over.) But a patch language like pijul encodes things like renames/moves much more directly and doesn't need complex algorithms to guess when they occurred because it wants the user to encode that more directly (as possibly a better "higher level" representation of the user's intent: the user wanted a file renamed/moved and that was an action they took).
In theory then, the merge algorithm of a patch language like pijul (or darcs, its older relative) is simpler because it has a lot more higher level recorded constructs and a lot less "guesswork heuristics" to try to oracle user intent (sometimes long after the fact). (In practice we find such merge algorithms have nearly as much complexity in their own way. The patch algebras of pijul and darcs are both academically related to work done on OTs and CRDTs, all three approaches informing each other, which is in part why darcs was written in Haskell. [pijul is not, just as we have CRDT libraries in most languages today, we've come a long way in our understanding of the complexities of these tools since the early days of darcs.])
Though, I don't think merges on their own are the killer thing that makes an approach like pijul's truly better than git's. Again, git has achieved an incredible "local maximum" of "good enough". As complex as git's recursive and ORT merge strategies are, they are by far "good enough" (and hands down better than many predecessor VCSes) with a lot of smart engineering work behind them at this point. In many "day to day" uses cases you aren't going to notice a huge difference from git's well optimized "dumb" "guess what the user was thinking at the time" merge algorithms and pijul's smart "capture what the user was thinking at the time" approach. Most people don't see the horrors of things like git's rerere cache in their day-to-day git lives (and that's a great thing; I would not recommend learning git rerere if you can avoid it).
The killer feature is cherry picking. Git has git cherry-pick and it "exists" technically, but anyone who is smart has warned you away from ever using it, and especially not in day-to-day workflows. Cherry Picking is the concept of taking just one or two commits from the middle of a branch and applying them to a different branch without taking the other changes. In git this creates entirely new commits with their own very different identities that are entirely unrelated on the DAG (git object graph) in any way to the original commits. It's like a rebase, but much much worse because typically you rebase an entire branch and throwaway the originals, but if you are desperate enough to cherry pick you likely need to keep the originals as well safe in that other branch. In my experience in git this is one of the worst sources of bad merges when those two branches eventually (and often inevitably) merge. A lot of the "dumb" heuristics in git's merge algorithms get really dumb when the same changes were made in both branches.
On the flipside, cherry-picking is a "native" behavior of a patch algebra and how its merges work. Just like in CRDTs that are eventually consistent, a patch is often the "same" no matter what branch it is in (context it has of other patches/time it merges in on other devices in a CRDT perspective). You can generally pull an individual patch between branches with ease whether "beginning, middle, or end" of that branch, and it will just work. (And when you eventually merge branches back, though it is far less "inevitable" than in the git case, it knows the same patch was already applied once and doesn't have additional work to do.)
When I was using darcs heavily, I used a lot fewer branches than in my workflows in git. I knew I could pretty easily cherry pick changes between branches without worrying about where they were in "commit order", so often individual patches felt like entire branches (or tags) and you could often pick and choose specifically what you need at any time. Cherry picking was so common it was just assumed you could do it. (There are cases where you couldn't quite cherry pick what you want, where a patch has a more direct dependency on an earlier patch that you didn't expect, but the darcs UI was really good about making it clear that pulling one patch would pull some others it relied on.)
Pijul should be the same, "natively and easily cherry picking", and that can be a killer feature that git doesn't have a lot of ways (today) to do better.
In many ways that "native cherry-picking" felt like a much better way to flow changes between branches. Cherry picking always seems like a key feature that developers need (often because management needs it: "can you get just Feature X into Production without Feature Y because Feature Y isn't ready yet?" but they were developed/integrated in the same branch), if not how they naturally think of changes flowing between branches, and I've seen too many teams already fall to "well git cherry-pick exists so it might help us here" and the ugly merge hell that approach leads to (including horrors like hand-holding the git rerere cache).
I ran `cargo install pijul --version '1.0.0-beta'`, but can't find a `pijul git` command to test importing Git repositories. It prints "No such subcommand: "git"".
(pmeunier, if you're reading this, you should probably make that the default on the webpage -- i think trying to import a git repo will be a very common "kick the tires" first use case)
Seems like an interesting project. It's probably impossible to fight against git nowadays though. The first mover advantage of git is just too hard to overcome despite all the flaws of git.
Git didn't really have first mover advantage. Pijul is essentially a successor to Darcs, and Darcs was around before Git.
The DVCS space was quite diverse for a number of years. I remember the main players being Git, Mercurial, and Bazaar; whose advantages were speed, usability and precise tracking, respectively. Bazaar became pretty much obsolete once git became (slightly) better at tracking history across renames/moves.
Huge projects, like the Linux kernel, Xorg, etc. relied on Git's speed. Since those were also high-profile, and widely used, their adoption of git led to it becoming the de facto standard. Code hosting sites like Gitorious further cemented git's position, and eventually GitHub came along and made it truly explode.
Git's evolution was interesting to observe at the time.
Back in around 2005, the company I worked for was using Subversion, but then I demoed Darcs to my colleagues (which I had been using at my previous company), and everyone immediately loved it, so we quickly migrated over.
Around 2006, we hired a new developer who was what you might call an early adopter. He wanted to use Nginx back when all the documentation was in Russian — and he also wanted us to use Git. But the benefits of Git were not obvious at the time to someone using Darcs. In fact, Git was positively stone-age at the time — terrible CLI UI, inscrutable commands, steep learning curve for no obvious benefit. And so we resisted the change for several years.
But it was apparent even back then that the writing was on the wall. After all, it came from Linus Torvalds. People were adopting Git en masse, and not much was happening with Darcs. And then, in 2008, GitHub came along, and that was that.
Darcs was amazing to use [1], but it quickly joined the graveyard of technically-superior-at-the-time projects that were too good to succeed: The Amiga, Plan 9, Borland Delphi.
Darcs was ahead of its time in its approach to change management, and I think that we'll eventually get something like it again. It's probably not going to be sufficient to merely make a better mousetrap, so Pijul isn't it (with all due respect to the technology, which looks nice).
Whatever replaces Git will have some kind of synergy that just means everyone will want to move, similar to what Slack did with chat. Then again, we still use email nearly 50 years later, so it might take a while.
[1] But not perfect. Ultimately its dreaded "exponential-time conflict" bug contributed a lot to its lack of success, I think. It was trivial to get your repository into a state which messed it up permanently. And the bug had no real fix until years later.
Displacing git is certainly impossible in the short term, but that is not the only conceivable goal. Building a sustainable community does not require git levels of success. A few strategic adoptions can go a long way, as shown by Ocaml, Nix, and other similar projects who have small but passionate and vibrant communities. The key there is a core group of users who prioritize a certain set of values over the advantages (ecosystem size, polish, job opportunities perhaps) of technologies with a larger userbase.
There are also ways to displace git without immediately replacing it. Many projects adopted git only after using git-svn for an extended period, giving teams more time to develop git skills without completely disrupting existing workflows. It looks like a pijul-git bridge is planned, which could facilitate gradual adoption. Beyond patch ergonomics, there are many other areas where a concerted effort to provide a better solution than git could yield a highly attractive tool. One that comes to mind immediately is management of multi-repository codebases and dependencies; Pijul (or one of the other git competitors) could provide a better and more cohesive way to manage multi-repository code changes than is possible with git sub[tree,module,...], which would be an adoption magnet. Zig appears to be pursuing such a mixed adoption strategy: they are providing both a new language and a set of tools which support C and C++ and solve several huge pain points -- such as cross-compilation -- without requiring direct Zig adoption. (The tools have the benefit that they make Zig projects with C or C++ dependencies much easier to bootstrap, which should also allow faster ecosystem growth.)
Out of curiosity, why the hate for git submodules? They've worked really well for me over the years. They solve the problem of "pin a specific version of some external 3rd-party repo" really nicely.
The only complaint I have is that the git-submodule CLI has unintuitive syntax. But this is Git after all, so it's par for the course. :v
Because they merely make really hard things hard and simple things painful. They solve a problem (albeit one that can be better solved by a decent package manager like Cargo/NPM).
And in return, basic operations like merge/rebase start breaking in unintuitive ways.
E.g., I work on a repo (ss14) that has RobustToolbox as a submodule, but I never modify it in my PRs.
I rebase my changes. I have changes in RobustToolbox that are only visible in Git Bash. Other Git clients show no changes... Ofc `git submodule update --recursive` might help or something, but it made a simple operation more complicated.
And that's if I didn't change the submodule. If I did, merging/rebasing becomes its own, separate nightmare.
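(For the record, the incantation I'm alluding to - run in the parent repo after the rebase or branch switch - is:

    git submodule update --init --recursive

which resets each submodule's checkout to the commit the parent now records. Having to remember to run this at all is exactly the extra complication I mean.)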
> They solve the problem of "pin a specific version of some external 3rd-party repo" really nicely
With Cargo/NPM you can pin your dependencies to some version you control, like a fork or a local directory, making submodules unnecessary.
And unlike submodules, they don't require you to modify your workflow to sprinkle `git submodule update` everywhere.
Agreed that Git submodules aren't a good tool for the problem of "I want to work in multiple repos at the same time".
It's designed for things like "my project needs to check out pinned versions of these 4 external repos", ideally where the pinned version doesn't move all that often.
Git-submodule is best used as a replacement for the "shell script that clones a bunch of dependencies" pattern, not as a multi-repo workflow coordination tool. And in that context, it works very well.
Imagine you had a CLI tool that consumes some text-file list of repos, commits, and paths. Every repo in the list gets cloned to the path you asked for, and each repo's requested hash is checked out. The text file is version-controlled in your main Git repo.
That's essentially all that Git-submodule is, except that it's built into Git instead of having to roll your own or introduce another tool dependency. There's also some light integration to handle recursive clones and reporting if the submodule state is dirty.
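A rough sketch of rolling that yourself, just to show how little is actually there (the deps.txt format and names are made up for illustration):

    # deps.txt: one "<url> <commit> <path>" per line
    while read -r url commit path; do
      [ -d "$path" ] || git clone "$url" "$path"
      git -C "$path" fetch --quiet
      git -C "$path" checkout --quiet "$commit"
    done < deps.txt

Git-submodule is roughly this, plus the recursive-clone plumbing and dirty-state reporting.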
Cargo is a great tool if you're writing Rust. Go-get is a great tool if you're writing Go. Npm is a great tool if you're writing Javascript. C/C++ options are all nonstandard and maybe bad.
Git submodules is a "none of the above" tool that handles generic situations where you have one top-level repo that depends on pinned versions of a few other repos.
So in other words, it's a source versioning tool trying to solve an issue that's better solved by a package manager, and the abstraction mismatch shows.
Like trying to perform eye surgery using rockets.
I'd be ok if Git submodules actually worked (not sorta worked, not "run git submodule update and pray to Linus the issue goes away" worked). It's just that it doesn't. It solves one hard problem, at the expense of making everything else more convoluted.
That's like saying "just use Typescript" when anyone complains about Javascript. The world is drowning in Javascript; it is work to migrate any given instance of Javascript; we are not drowning in labour; so it won't get done.
> And in return, basic operations like merge/rebase start breaking in unintuitive ways.
I just haven't had the same experience at all. Could we work through a specific example?
> I rebase my changes. I have changes in RobustToolbox that are only visible in Git Bash. Other Git clients show no changes.
That's because git-submodule doesn't really interconnect across repos (not much anyways). A submodule is a completely separate repo, with its own stage, commit history, etc. It doesn't even know that it's a being used as a submodule.
Committing/merging/rebasing in your submodule doesn't change the parent. And committing/merging/rebasing in your parent doesn't change the submodule (aside from maybe what commit is checked out).
> They solve the problem of "pin a specific version of some external 3rd-party repo" really nicely.
My understanding is that they sound to many users like you could do more with them, such as making your repository modular, when in fact if you do that, you run into all sorts of problems.
Git had git-svn. I used it for a while to interface with a project that was still on subversion. So, I was able to gradually get my team off subversion. Subversion in turn had an easy migration from CVS and of course CVS was a bit painful to use so lots of people ended up migrating. I did such a migration once. It took a while to run but we got it done and kept our version history. We did not look back after that.
There is probably already some work on this, but the path to success for pijul would be removing as many obstacles as possible between pijul and git (and alternative systems such as mercurial). Make migrations easy. Make interfacing with git remotes easy for pijul users, and interfacing with pijul remotes easy for git users.
Git is well entrenched of course with web UIs, IDE integrations, CI/CD support, etc. So there's more to replacing it than just writing a bunch of cli tools.
Sourceforge was pretty big at the time. And they lacked git support for quite some time. Github basically buried them by the time they figured out they had a problem. Of course it still exists, but it's not a very obvious place for people to park their code at this point.
What pijul needs to truly succeed is a halo project. Git got where it is now only because it was developed for the Linux kernel. Not only is it a major PR boost, it also gets all sorts of kinks ironed out pretty quickly and ensures that the tool fits into different people's workflows; for Linux that meant git format-patch and am were pretty much day-1 features, so that it worked with the LKML flow seamlessly.
rustc could be one strong candidate as such halo project, but it is probably pretty difficult to convince them to switch, especially as it means not only switching away from git, but also github which they use heavily.
Which conveniently brings me to the last point: to get any major project to switch to pijul, the transition story needs to be really solid and not disruptive to day-to-day development. For git it meant not only that it needed to work well with lkml, but git also had really solid svn and cvs (and others!) clients pretty early on. It allowed people to use both git and the legacy vcs in parallel during the transition. Beyond that, individuals could use git to work with svn (and other vcs) repos that had no plans to transition anywhere, and still reap some of git's benefits.
For pijul I think it means that it is necessary to be able to work with git repos seamlessly similarly that git-svn allowed git users to work with svn repos.
I am not saying git is flawless, but most complaints seem to be about the "porcelain" and not its inherent design. I think Git tremendously outgrew its original audience, so a larger proportion of people complain.
Complaining about the porcelain is like complaining about vim's complexity: It takes some time to master but ultimately it is rewarding.
Context: I'm the main author of Pijul, and also a big fan of Git.
This sounds like a potential strawman argument to me, so let's be clear about the complaints you've heard, and my complaints against Git (which, again, I love and admire as the most elegant design I've ever seen for a tool like that).
Mine are not at all in the porcelain, but in the plumbing. My specific complaints are:
- Merges in Git don't have enough information to do the job, and the optimisation problem it tries to solve is underspecified, regardless of which merge algorithm it uses (3-way merge or other): sometimes, there are multiple solutions, and Git ends up choosing a random one. This breaks a fundamental expectation on merges called associativity.
- Solving conflicts doesn't really solve them, it just records a version without the conflict. This can waste engineering time in two ways: either because you need to think deeply about the tool when solving the conflict (and use `git rerere`), or you need to follow a strict workflow, making your work methodology serve the tool rather than the opposite.
- Similarly, rebasing is operational transform. It works most of the time, but not always, and is quite clunky and hackish. Pijul fixes that by using a datastructure that happens to be a CRDT for that part.
In my opinion, all the porcelain issues and the proliferation of unintelligible commands comes from these basic shortcomings in the plumbing.
I don't think GP was trying to make a strawman; for me DVCS is a solved problem thanks to git, and all the arguments I ever heard were basically people not wanting to put any effort into it and then pretending this is because 'git is too hard'.
These are pretty advanced issues which are worth solving, so I find these to be beautiful motivation. But I suspect most people are not aware some projects are trying to fix these issues.
It's a bit as if you were trying to solve some of Haskell's flaws while most people are still stuck doing javascript...
The homepage gives this motivation, but it was not clear from the linked page, which is where the discussion in this thread started from.
Congrats on the milestone and good luck!
Potential doesn't mean intentional! Sometimes just a lack of precision causes one to argue against a point that was never made (such as "porcelain is the problem").
IME/IMO, one of the big piles of crap in Git's porcelain is that gitrevisions(7) is absolutely terrible (and, of course, git diff then proceeds to do something which looks the same but behaves completely differently).
Does pijul have something better for navigating and working with sets of changes? For reference, though I didn’t get to use them that much mercurial’s revsets were a pleasure, mixing readability, expressivity, and flexibility (modularity).
> Complaining about the porcelain is like complaining about vim's complexity: It takes some time to master but ultimately it is rewarding.
It really really is not. All it is is a hurdle you get used to in your way towards getting work done.
Git's porcelain is at best a beating whose reward is more beatings. I could count the number of times git's porcelain has delighted me on two hands even if both had been cut off and burned to ashes.
Learning to master something that is complicated for no reason is not rewarding, it is a massive waste of time and effort, and only feels like an ongoing drain.
> I am not saying git is flawless but most complaints seem to be about the "porcelain" and not its inherent design.
This is my line of thinking as well.
Git has a great low-level API that's very powerful and flexible, but the porcelain and the way that we use the remotes (the Hubs and Labs) could be a lot better. The current way is neither intuitive nor shaped after how most developers want to work (it's hard for beginners to get started, collaborate, and share ideas, etc).
Depends. It all comes down to the ratio between switching benefit and switching cost, with the added requirement that Pijul must fix a significant pain point people have with git that is visible to decision makers.
Possible pain point candidates: Monorepo speed, submodules, large files. Don't know if Pijul solves any of that.
I saw a project a while ago that was a non-Git version control system that used the Git disk format. I thought that was a pretty great idea. Totally compatible with all of the existing Git stuff, and the Git disk format is pretty good, but you can avoid all the UX mistakes Git made.
git does not have first mover advantage. The reason that the current VCS darling is difficult to topple is the high cost of migration as made apparent by how many companies have yet to migrate to git from e.g., Perforce and SVN.
> as made apparent by how many companies have yet to migrate to git from e.g., Perforce and SVN.
So like, hardly any? Almost all companies use Git these days. As I understand it an exception is game companies because Perforce is much better at handling large binary assets (Git LFS is a pretty ugly hack).
Spend some time in hardware manufacturing shops that don't focus on software. SVN is "easy" and has been in use for decades at this point. The SVN mindset is entrenched.
Migrating requires someone know how to handle the command line, buy in from management, and so on...
It takes a lot to get a behemoth (even a small one) moving in a new direction.
I actually do work in a hardware manufacturing company at the moment. We use git. But anyway, hardware manufacturers are a tiny proportion of all companies that write software.
Obviously it's hard to get good numbers but if you look at job adverts on https://www.itjobswatch.co.uk/ they say 778 ads for Subversion and 9579 for git.
I have visited various embedded development shops and have only personally witnessed one using git exclusively. P4, clearcase, and svn are all quite common still.
If you think “hardly any” companies use other VCS’s then I’d say you don’t get out much.
Many enterprises are either still stuck or only just now planning migration to git. They come from Perforce, TFVC, SVN, and god knows what. Big game studios are using perforce (places like Ubisoft), although my info is dated. Google does not use Git for its products either - it uses Piper.
Heck, many high-profile open-source projects held out for a long time with things like FreeBSD only recently migrating from SVN, and other BSDs even sticking to CVS.
Remember that the software development industry is extremely large, much bigger than what you read about on HN.
Is there a FAQ that explains this? I sort of have an intuition for what that ("prevent commutation of patches") means, but I don't know how complete (or useful) it is :)
(I guess it has something to do with commit hashes depending on the whole chain of commits, so even if the diff between two commits is empty, if they don't have the same hash they are not the same ... thing. But I don't really understand the practical implications of this. To me it seems like there are none. If two branches have the same content I'll treat them as equal, and with some simple heuristics - like rebase is better, reverts are ugly - I'll pick the nicer one to keep. I don't really care how they ended up equal.)
Practical implications: merges are 100% predictable and deterministic. This isn't always the case in Git, where lines are sometimes shuffled around (look for "associativity" in the manual).
Mostly solo: if that means you're in small teams, yes, since you don't need to follow the rigid workflows that prevent large companies from running into Git's problems.
I've been trying pijul fully solo. In my opinion, yes!, there are benefits.
I'll try to give an example. It's been a while so I'm not 100% clear on how exactly I did it, but I had a central nix configuration Pijul repo, and the "personalized" config files for a particular machine under pijul as well – in another repo iirc? I cloned the nixos configuration files from the main repo into /etc/nixos. Then I added "personal" things like MAC addresses and an SSH key to the config file. Personal to that machine. Things that shouldn't go into the main repo. Recorded those changes into the "live machine" repo. What I ended up with was a very nice mechanism that knew about the customization entries and didn't get confused about them. I could write general changes to the machine-specific repo and easily pull them into the main repo without the version control system always getting confused about them. And vice versa.
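From memory, the shape of it was something like this (paths and commands are approximate - I'm paraphrasing, so treat it as a sketch rather than a recipe):

    # put the shared config in place
    pijul clone ~/src/nixos-config /etc/nixos
    cd /etc/nixos
    # add the machine-only bits (MAC address, SSH key) and record them locally
    pijul record -m "machine-specific settings"
    # general fixes recorded here can later be pulled back into the shared repo,
    # leaving the machine-specific patches behind
    cd ~/src/nixos-config
    pijul pull /etc/nixos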
This might be possible with git. Personally I wouldn't bother, and I will claim that it's not due to inexperience with git; quite the opposite. I can certainly attest that it was very, very easy and nice to do this with pijul.
Well, even Gitflow, while not being unreasonable, wastes engineering time. Most devs know it, so at least they don't have to relearn when onboarding, but then their VCS shapes the way they organise their work, when it should be the opposite.
The below post conversation illustrates what is wrong with Git, and what is right with Git.

I don't know much about cars. I drive a car that is boring. Automatic, ABS, whatever else. It does not spend much time at the shop. It starts, I drive it to where I need to go, turn it off, repeat. My car cannot pull a boat, it can't go off road, it can't go 0-60 in 3s, I can't mount a snow plow on it, I can't transport most furniture in it, it is not bulletproof, I could never race it, it does not impress the ladies, it is not very hackable, and so on and so on.

A lot of people want a boring versioning system. One that does not need much care, much effort, nor a careful study of its internals - and certainly not the ability to remember sequences of obscure commands and switches to get things done. Git is not a good fit as a boring versioning system.

I have a couple of friends who love tinkering with their trucks or cars. They have cool cars that can do the stuff above in some combination (not bulletproof). They are fun. It is cool to get a ride. It is great to feel their passion. They want to optimize them for their needs, and usually that involves making them go faster or pull more stuff, or do even bigger sand dunes, river crossings, snow storms, etc.

Git is a good fit as an amazing versioning system that will be there for you when your tasks get complicated. It will allow you to perform far more use cases. The need to put in the time to learn it well is fully worth it.

The problem is that Git is sold as the best versioning system for all things. The vast majority have no need for its complexity. The vast majority don't care that it is distributed, or even wish it was not. They don't want 5 gigs of repository history since the beginning of time. Most people treat GitHub as the server and their computer as the only client, and would do just as well with SVN. In a good deal of the projects I have worked on, I think locked files would have been an easier and better solution than merging hell. (I know that it is a wildly unpopular view.)
junon:

> Am I the only one that actually likes git? Everyone here says it's a hodgepodge of broken things but I really struggle to see how.

TobTobXX:

> What made it really click for me is understanding the internals and how git works. Then all the terms like head, branch, parent, commit, tree, etc. made sense for me and the way git commands work is intuitive.

> I still have to look up the order of arguments for commands like git rebase, but overall, it makes complete sense.

> I can really recommend "Pro Git" by Chacon and Straub. I got a printed version, but it is also available under the CC-NC-SA here: https://git-scm.com/book/en/v2
No; you do. You know quite a lot about operating cars.
You (or at least people following a traditional path to driving) took a class, operated under direct supervision for some minimum number of hours, passed both a written exam and a practical exam. All of that to be trusted in control of a car.
The only reason operating a car can feel like such a lightweight operation is because extensive training and practice were employed to make it second nature.
I can work on my laptop all day long and get a lot of stuff done. I know how to use a laptop as long as it has an operating system and the applications I need. That requires a lot of knowledge at several levels: how to use the OS, how to use a specific application. It also requires physical dexterity - keyboard, mouse, buttons, eyes, etc. It does not require much knowledge about what the computer is doing while I am working. If it stops working, work stops.

Same with a car: you can learn to operate it, and you have to learn to operate it in a safe and responsible manner (in theory). If the car stops, I have no idea what to do. From the movies I have learned to "pop the hood" and look at the engine. Usually looking at the engine is not sufficient for it to start working again.
Of course any user can add any alias they want, but I hope they consider naming their command line tool something that is easier to type. It may sound silly, but it's UX for the command line - names that encourage rolling motions or at least alternating sides of the keyboard have a significant ergonomic advantage.
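The alias itself is a one-liner, e.g. for bash/zsh:

    alias pj='pijul'

but defaults matter, and most people will never set one up.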