Launch HN: Diversion (YC S22) – Cloud-Native Git Alternative
280 points by sasham 11 months ago | 423 comments
Hi Everyone! We’re Sasha and Egal, co-founders of Diversion (https://diversion.dev). We’re building a modern, cloud-native version control system. Our first users are game developers, who like its simplicity and scalability. See a quick demo here: https://youtu.be/DD0XkL8kDYc

Why a new VCS? There is no doubt that Git vastly improved our lives, and played a significant role in the advancement of software development over the past 18 years. But - it was built for a very different world in 2005 (slow networks, much smaller projects, no cloud), and is not the perfect tool for everyone today.

The biggest drawback of Git is its limited scalability - both in repository and file sizes, and the number of concurrent users. This is the reason Google and Meta built their own version control systems. It’s also the reason why other large companies, most notably in games development, semiconductors and financial services are still using legacy tools like SVN and Perforce.

Another issue we’re trying to fix is Git’s famous complexity. In our previous startup, a data scientist accidentally destroyed a month’s work of his team by using the wrong Git command (EDIT: we were eventually able to restore from a non-updated repo clone, after a few hours). As a developer who used CVS and SVN before Git was created, I often wondered why Git is so difficult to learn, compared to other tools.

On the other hand, Git’s branching and merging abilities are exceptional - this has enabled the modern software development methodologies that we all take for granted today (e.g. feature branches, CI/CD), greatly improving developers’ velocity.

We were wondering - is it possible to create an easy-to-use, fast, scalable version control system, with Git’s branching capabilities? And what else can be improved, while we’re at it?

One thing available in modern cloud tools is real-time collaboration (e.g. Google Docs, Figma). While developers don’t necessarily want their work in progress to be visible to everyone, it may be very useful to easily share it when you want to get feedback before a commit, to detect and prevent merge conflicts, and to have visibility into which parts of the codebase are being changed by others.

Diversion is built on top of distributed storage and databases, accessible via REST API, and runs on serverless cloud infrastructure. Every repository operation is an API call (commit, branch, merge etc.). The desktop client synchronizes all work in progress to the cloud in real time (even before a commit). Users can work with Diversion using an interactive CLI, Web UI, or IDE plugins (currently JetBrains, more coming soon). The Web UI lets you perform most basic operations without needing to install a desktop client.

Diversion is compatible with Git, and can synchronize with existing Git repositories (each new commit in Diversion goes into Git, and vice versa). We’re planning to release it as open source once the code base matures, and when we implement an open source repositories directory on our website (naturally, Diversion’s code is managed on Diversion!)

We’re in open beta, you can try it here (https://diversion.dev) (click Get Started). It’s completely self-service and there’s no need to talk to anyone, and it’s free for small teams (https://diversion.dev/pricing).

Building a version control system is hard (as we have learned), and Diversion still has a long way to go. We are currently working on improving speed, CI integrations, plugins for IDEs and game engines, and other usability improvements. We would love to hear your thoughts and feedback on what we’ve got so far!




> Cloud-Native Git Alternative

Not sure if this is a good summary of the product. For one, cloud-native is an implementation detail, unless the company plans to sell the new VCS as packaged software instead of as a service. For two, I'm not sure how being cloud-native addresses any issue with my daily interaction with Git.

> The biggest drawback of Git is its limited scalability

I wonder how many people really have this problem. Millions of people have been using GitHub and GitLab. I'm curious about the percentage of users who feel that there is a scalability issue with their own repositories. Personally, I don't have any beef with git's scalability at all, even though the companies I worked for had anywhere between hundreds and tens of thousands of engineers. Maybe having a monorepo will lead to scalability problems? But monorepo is a debatable topic, to say the least.

> Diversion is built on top of distributed storage and databases, accessible via REST API, and runs on serverless cloud infrastructure. Every repository operation is an API call (commit, branch, merge etc.). The desktop client synchronizes all work in progress to the cloud in real time

Again, what does this have to do with me, a user? Why would I care about the underlying protocols when I simply use a CLI or a UI?


Monorepos are indeed causing problems with git, and this is one of the main arguments against them (see [1]). Some companies are building their own solutions (Google, Meta), and some are splitting their monorepos because of these problems. IMO if a company wants to run a monorepo for their reasons, they shouldn't be limited by their VCS.

The technical details are for the readers who want to know; I agree they're not really important for the users (most of them, at least).

[1] https://medium.com/@mattklein123/monorepos-please-dont-e9a27...


Having an effective monorepo at scale will also require an entire infrastructure to solve all the problems that a poly-repo must solve and more. In particular,

- Partial download, as a monorepo will quickly grow too large for a single person to download. This is trivial for a poly-repo but requires a dedicated system for a monorepo.

- Dependency management. With a decently sized monorepo, one can't compile everything and test everything. So, someone needs to build a dependency manager to track all the DAGs, and build only the DAGs that are impacted by a commit. One also has to build a tracking mechanism for deploying different build artifacts, because a team may deploy its build artifacts on different dates and times. We will need more sophisticated build tools too.

- Build infrastructure. Even with a perfect dependency-tracking system, we may still end up building large-enough source code that we need to build the code in parallel.

- Directory-level access control. This is also trivial for poly-repo since the granularity is at repo-level, but it requires dedicated implementation for a mono-repo.

I'm not sure if the marginal benefit of having a monorepo can justify the investment for most companies. Google created its monorepo initially to manage the dependencies of C++ code, and Perforce already supported partial downloads. But with more modern languages that have their own way of managing dependencies? I'm not so sure about the benefits. Making refactoring easier? How many repos are really shared at the source level across multiple teams in a company? Encouraging sharing source and therefore knowledge? Isn't that a solved problem? Any decent company allows searching source code at a semantic level across multiple repos. If I want to see the source code of a particular package in my IDE, it's just a click away.

Note I'm emphasizing the marginal return of a monorepo. Case in point: Google maintains the very useful Guava library, which is probably used by millions of engineers. Does it lead to pains of incompatibility errors at runtime across different releases? Absolutely. Is it worth changing my poly-repo to a monorepo to solve the problem? I highly doubt it. The compatibility issue happens rarely given a good testing setup. When I do need to migrate my code, the cost is bi-modal: either the refactoring is trivial, or it requires serious testing and design changes, which a monorepo will not help with anyway.

Note I'm not saying that monorepo is not useful. Instead, I question how many companies will benefit from switching to monorepo, which may lead to the discussion on the potential market share of Diversion.


Polyrepo is such a pain in the ass though. At a smaller scale it's much, much nicer, and then when we get big enough to hit all those scaling problems, we'll be able to afford it by hiring a team of 3 to go implement Bazel/Buck2/..., and perhaps switching from Git to Diversion.


It’s not fully open source yet (i.e. you can’t use it yourself, I think) but Facebook’s EdenFS project solves the partial checkout problem.

The queries in Bazel/Buck to figure out the changed set of dependencies probably aren't complicated, and that's why there's no turnkey solution? You do need to adopt a build system with precise dependency tracking (afaik only Buck and Bazel support that) or the monorepo path isn't going to be very successful.
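For what it's worth, a rough sketch of that kind of query in Bazel (the target labels here are made up for illustration):

    # every target in the workspace that depends on the changed library
    bazel query 'rdeps(//..., //libs/netutil:netutil)' --output=label

    # run only the affected tests instead of testing the whole monorepo
    bazel query 'kind(test, rdeps(//..., //libs/netutil:netutil))' | xargs bazel test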


> Monorepos are indeed causing problems with git, and this is one of the main arguments against them (see [1]).

The article is a disappointing read. It spends a lot of time talking about monorepos and how they spell all sorts of trouble. Yet, the article makes zero mentions of submodules as a way to get the best of both worlds.


Submodules are great, but they're hardly an alternative to monorepos.


ive never seen positive feedback for submodules before


Why not? Just want to understand.


Submodule workflows have a lot of overhead at review time. During development it's fine, you work with the fully materialized tree just like it's a monorepo. But once you need to submit your changes for review, how does that workflow look?

1. Commit in submodule A, then get it reviewed and merged as SHA 123

2. Update submodule A to 123, get it reviewed

3. Reviewer has feedback on usage of new API in submodule A

4. Make another PR on A, at commit 457. This time don't merge it since reviewer on main repo might have more feedback.

Monorepo:

1. Make PR to monorepo

2. Get review feedback

3. Push changes to PR branch

4. Merge

5. Update submodule to 456, push to existing PR

...??


> But once you need to submit your changes for review, how does that workflow look?

1. Post PR to submodule A. Get it merged.

2. Post PR to the main repo updating it to point to subproject A.

Done.

The only difference between a monorepo and splitting the repo into submodules is that the main repo's history is coarser and basically tracks the output of integration tests. There is no need to overcomplicate things, and if you need to overthink them anyway then you have far more degrees of freedom to worry about in monorepos.


That’s a really slow review process. It also prevents reviewers from seeing the bigger picture of how step 1 manifests in step 2. In practice what I’ve seen you end up with both reviews simultaneously referencing each other in the description and once approved you merge 1 and update the pointer in 2 to point to the new merged commit if it changed.

That’s a lot of annoying and sometimes error prone manual bookkeeping that has nothing to do with the engineering work itself


Anything that cuts across submodule boundaries needs as many MRs as boundaries it crosses; conflicting submodule pointer updates in the main repo require additional MRs (in the submodules) to resolve, and coordination between those MRs.

They're basically fine for slowly-moving dependencies, vendoring, etc. but they emphatically do not solve the large-org many-team coordination problems that monorepos are meant to solve.

FWIW, git is a great monorepo platform for 1-10m lines of code (Linux, $MY_JOB, ...). It's only the very largest scales (Windows, Google3, ...) or asset heavy cases (ML, game dev) that need special treatment.


Monorepos are a problem born of CI which can't cope with cross project dependencies properly. People have solved the problem by pumping everything together, but it's the wrong answer.

Fix CI and the problem goes away.


Sorry no. CI and monorepos are at best tangentially related. Dependency tracking across repos is a PITA which inhibits code reuse - git submodules suck, as does whatever that alternative git submodule concept is called (subtree?).

Code repos like Cargo and NPM can help but even still it’s an annoying dance to update dependencies in multiple downstream projects. And if there’s a code change you need to make, it’s a 3-way orchestration of new api, update downstream dependencies, remove old api.


That's exactly my point. Cross dependencies is exactly the problem I want CI to solve.


Like the CI system automatically pushing code commits updating downstream dependencies that reference the upstream repo? Or something else?


Re: scalability, in the very first sentence they mention game development, which deals with large quantities of big (and growing) assets that are nowadays versioned, like 3D models, textures, animations, etc.


As someone who worked on the backend (workflow, infra) side of a game dev studio, there are a lot of massive benefits I see with this sort of "what if Dropbox but Git" product workflow.

We couldn't actually use git for our asset management, because when you're dealing with 1GB+ Photoshop files, versioning them with any reasonable granularity breaks your Git repository, makes clone times and local file storage requirements astronomical, and doesn't really make sense anyway. We ended up using SVN, since it only transfers what you check out and you can check out subtrees trivially, but then that required getting a GUI SVN client, providing it to our art team, teaching them how to use it, and then having them come to me whenever something in SVN got confused or broken (e.g. they opened and then closed a document and Photoshop updated the thumbnail, now there's a merge conflict and they can't commit).

We also ended up using Google Drive for a lot of stuff, and eventually migrating to Team Drives once that was a thing, but that doesn't integrate with... basically anything, honestly, or at least not with any reasonable degree of straightforwardness.

I don't work for that company anymore, but the thing that would make me most interested in this product would be:

1. Self-hosting it (would pay 'enterprise' rates for this); or

2. Being able to locally proxy/cache assets for users in the office, so that committing a 1 GB PSD didn't require 20 artists to all pull down 1 GB each from the server

A lot of people seem to be comparing this to actual Git, but this doesn't replace Git unless you're using Git wrong; what it replaces is the absolute disaster of a workflow that a lot of companies have to try to build/use/teach internally.


Ah, I guess this is the curse of ignorance: I saw the sentence but didn't register its significance as I'm not familiar with what's required in game development.


[flagged]


Yeah, possibly. I only have visibility into the teams I worked with. That's partly why product-market fit is hard to find, as it relies heavily on intuition. I'd be happy if I'm wrong.


>> most notably in games development, semiconductors and financial services are still using legacy tools like SVN and Perforce

I think this should be your elevator pitch. Don't focus too much on "git complexity" as most people already know git so it just creates an argument. Scalability, in terms of numbers of users is somewhat hard to argue as well (Linux kernel has 1000s of contributors). However, it is completely true that git does not natively handle large binary assets well. You can even quote Linus:

"I really don't know what to do about huge files. We suck at them, I know."


> Don't focus too much on "git complexity" as most people already know git so it just creates an argument.

I'd say this phrase is both right and wrong.

It's right in the sense that it creates an argument.

It's wrong in that it creates an argument with the peanut gallery of git experts. But guess what, most people using git are not experts. They're software developers who don't want to learn the intricacies of git (probably most software developers out there), they're software development adjacent folks (think data scientists, etc) who for sure don't want to learn the intricacies of git, etc.

The "common person" using git will most likely resonate on the "git complexity" argument.


> peanut gallery of git experts

There are people who use git for its original purpose (kernel devs and very few others) and then there is the remaining 99% of people who essentially use “github flavored git”, using only three or four git subcommands and for the most part never needing to understand its intricacies.

Unfortunately, although they are using git-the-chainsaw-shotgun with all the safeties on, it’s nonetheless a chainsaw shotgun and sometimes they’ll run into issues where they or somebody in their company needs to be an expert and figure out how to un-scramble an egg, so to speak.

If a new VCS can solve the 99% case and never need users to fall back to understanding nitty gritty details, it could very well have strong takeup especially among people who don’t give a crap about what VCS they’re using as long as it gets out of the way and doesn’t make them think (game devs, data folks, etc).


I see 2 huge barriers for a new commercial VCS:

1. Devs don't like to pay for core tools. And VCSes need network effects.

2. VCSes seem to be really hard.


Totally agree, it won't be easy. Companies do pay for GitHub/GitLab/Perforce though, and for indie devs there's the free tier. I think what made git really take off is actually GitHub's free tier/OSS hosting, and not git itself being free (at least for parts of the market, and I might be wrong).

100% correct about VCS development, it's much harder than one can expect.


> Companies do pay for GitHub/GitLab/Perforce though

Those products also provide a huge amount of other value and functionality (though at a high price).

As someone who worked at a (mobile) game dev studio, this "what if Dropbox but Git?" product design really hits for me. Teaching our artists to use Google Drive or Team Drives was easy, but the functionality isn't there. Teaching them to use SVN was a nightmare (because SVN workflows are a nightmare) but the functionality is... also not really there?

Give me either a local installation that I can set up in my office or a local proxy to reduce download times (or peer-to-peer on the local network, the way Dropbox does it) and I could see this being beneficial for a lot of especially smaller studios.


As someone who is making one (not Diversion), I agree with both of your points.

Do you think that a VCS that solves the large file problem and binary file problem could succeed?


> If a new VCS can solve the 99% case and never need users to fall back to understanding nitty gritty details

Mercurial was a bit like that? Yet it didn’t stand a chance against git regardless


Learn your tools or one day you’ll lose a finger, or worse, your life.

— high school shop class


Yet software is not a chainsaw and a huge amount of people never learn to use their software tools. Especially since they're 100x more complex than hardware tools and nobody has time to master everything.


If you are using a chainsaw in a wood shop, you are probably doing something wrong. The saying “learn your tools” means to spend some time learning your options and what is available to you, learning the “gotchas” and why. Woodworking tools are rather complex with “gotchas” that will kill you in less than a hundred ms.

Using Git isn’t much more complex than using a lathe (simpler even, as you can get by with no skill and rote memorization). Taking a weekend to learn the data structures, and how everything fits together is not a hard ask. Especially since you literally only have to do it once in your entire career.


> Taking a weekend to learn the data structures, and how everything fits together is not a hard ask. Especially since you literally only have to do it once in your entire career.

Like all simplifications, this is false. If all you do for years after is commit, push, merge, you'll forget.

Especially since you'll need those brain cells to learn the new CPU/GPU architecture, the new JavaScript framework, the new corporate security policy, the docs from your internal architecture team, etc.

Nobody's life revolves around intricate VCS details.


You don’t need to memorize it for life. Jeez man, don’t be so hard on yourself.

The point is, you know what is possible. You know what is impossible. 12 years later, something happens and you go … hmm, I used to know what is going on. I think I need to search for something about git-tree or something?

The point is, you know what to search for, a starting point. You don’t just reach for git cherry-pick, but realize you can use git rebase --onto to copy/paste an entire branch. You don’t worry about merge conflicts because you only have to do it once with rerere. You learn git reflog will remind you what branch you were working on this morning before you got pulled into some shenanigans in prod. You can set up automation with global hooks. There’s so much you can know to do less work and you only need to remember the parts that are valuable to you.

After really learning git about 5 years ago, stuff like the above is all second nature to me; I've been using it for nearly 10 years since switching from SVN. My first 5 years were just like you said. Commit, pull, commit, pull. I didn't even know it could do anything else and I was worse for it.
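(For reference, a quick sketch of the commands mentioned above - the branch names are placeholders:)

    # replay an entire topic branch onto a new base
    git rebase --onto new-base old-base topic

    # record and automatically replay conflict resolutions
    git config rerere.enabled true

    # see where HEAD pointed earlier today
    git reflog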


It is just a tough argument to make: the thing you have been using for your entire career and used almost everywhere is suddenly too complex.


Ummm... A lot of people just endure using git. Go to the average enterprise software shop, the ones where people don't code for fun in their spare time, and ask around.

There are a lot more of those devs than unicorn and FAANG devs.


Why aren't these teams choosing something else then? If everyone on the team dislikes git, switch to mercurial or something else.


Because often it’s not the team that chooses, but tooling is instead standardized across the enterprise. And enterprises like to make the “safe” choice of choosing what’s most popular. And then there’s the whole aspect that you have to know some basic Git anyway to debug your way through the open source code you use (maybe not for JavaScript/NPM, I don’t know). Git also happens to currently be the most interoperable with other kinds of tooling, from CI to IDEs, so not using Git makes your life harder in ways unrelated to its inherent qualities. It’s a network effect in multiple dimensions.


How did it get so popular if it's disliked by the majority of devs?


Because generally the people picking the tools are not the majority of devs. They are architects or Senior Developers who do enjoy learning new things.

Also, git generally does just work, and most IDEs/source control systems take care of the basic operations of pull/branch/commit/push/open PR.


I suppose. It might seem a bit perverse after all if the non-engaged, uninterested in coding, clock puncher devs got to make all the decisions.


Popular as in “everyone is using it”, not necessarily “everyone is fond of it”. How did Jira become so popular? The dynamics that lead to such outcomes are interesting, but hardly unusual.


Git might be the best thing there is today (outside of very large companies or environments with large binaries). It doesn't mean that'll always be the case... There were other VCSs before git, and there will be after.


I use git because other people use git, and ultimately I try to accommodate the tools that my peers are going to be used to. But I do think it's too complex (by a lot), and if I was running some kind of dictatorship I would never touch git again. Frankly, I think that git is a significantly worse tool than svn was for most use cases.


I like git better than svn personally. But, svn is better at managing large binary files.


Out of curiosity, are there any VCSs that operate on AST instead of plaintext lines? (Or is something like this being developed or proven impossible?)

I guess it should be possible to cooperate on a shared codebase without the need for every contributor to check text files in and out following exactly the same formatting. Or even naming convention. Or even the same language, provided all collaborators can transpile to and from some agreed-upon shared AST target.

I know it might seem unhinged at first, but think about it: your (parseable) code is a representation of the tree anyway (with some unrelated "whitespace fluff" around). If you follow strict formatting rules that you can express programmatically, you can reconstruct that "fluff" from the bare AST. If you can store all your violations against your style near the code, you can even sin and break it. If you store data about what you need to see differently from the shared AST - local renames of variables, for example - then you should be able to use your own naming convention, formatting and even source language, without bothering collaborators with tabs/spaces, hungarian notation or the fact that you prefer some different dialect or metalanguage.


> I know it might seem unhinged at first

Not "unhinged". Most kids these days get their first introduction to computer programming using of of the many "block coding" environments, almost all of which are straightforward recapitulations of Javascript under the hood. And it works, and it avoids the problem of having to teach them how to deal with syntax errors before you teach them imperative logic.

The reason people don't do this is that it's just a bad idea. The fact that all source code is stored in a universally understood data format with pervasive support across decades of tools is a feature and not a bug. How do you grep your AST to see if it's using some old API that needs to be refactored? Surely you'll answer that you use your fancy AST grep tool, which is not grep, and thus works differently for every environment. Basically every environment now has to have its own special editor, grep, diff, merge, etc... Even things like documentation generation and source control rely on files being text. And you're throwing all that out just to be different.

Also, FWIW: it's optimizing the wrong part of the problem anyway. The total cost to an organization that develops and deploys software of any form is overwhelmingly dominated by tasks like debugging and documentation and integration. The time spent actually typing correctly-formatted text into your editor is a vanishingly small fraction of software development, and really that's all this helps.


PlasticSCM has semantic merge that does something like that: https://docs.plasticscm.com/semanticmerge/intro-guide/semant...


Not (just) a VCS, but this is the idea behind the Unison language: https://www.unison-lang.org/docs/the-big-idea/


Considering many languages' very own out-of-the-box tooling (e.g. gofmt, syn) often have glaring gaps[1][2] in the understanding/roundtripping of the language's AST constructs, I would never be able to trust something like this to store and restore my code.

[1] https://github.com/golang/go/issues/20744

[2] https://github.com/dtolnay/syn/issues/782


I believe the Smalltalk VCS Monticello works on a semantic level?

https://eng.libretexts.org/Bookshelves/Computer_Science/Prog...


You can do most of this in git via custom diff-driver and smudge/clean filters.

For example, git can already convert line-endings on the fly for Windows. This is special-cased, but can just as well be implemented via smudge/clean.

Oh and git-lfs is done via smudge/clean too.
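A rough sketch of how such a driver is wired up (the driver name and the three commands are placeholders, not real tools):

    # .gitattributes
    *.src  filter=astnorm  diff=astnorm

    # repository config
    git config filter.astnorm.clean  normalize-to-canonical-form   # run when staging
    git config filter.astnorm.smudge render-to-local-style         # run on checkout
    git config diff.astnorm.textconv dump-normalized-form          # used when diffing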


One problem with that though is that smudge and clean are not used in rename detection. Git purposely skips running these filters to detect renames for performance. There are quite a lot of other issues with smudge/clean too though.


I was thinking about trying this out, but there are some reasons why I don't think it's feasible.

Where are your comments stored?

What happens when you need to run out in the middle of a fire and you don't have time to make your code compile-able? How do you commit "un-compile-able" changes?

I think there are some really compelling reasons to try AST-checkin - all your loops can now be changed to functional, dialect changes like you mention, etc. - but there are some pretty significant downsides as well.


Nodes in the AST for comments, block comments and "raw text I don't understand" seems like a way to go?


Honestly yeah. Might have to give this another go.


these are both already solved issues that IDEs deal with using red-green trees


This would enable some advanced merge conflict resolution strategies, I suppose. However, it can also be done by building the ASTs on demand and still storing plain text.


It would be cool to integrate Tree Sitter into a VCS. It'd be more flexible if that were an option for a project/folder/file, but also offer a text diff option for readmes/docs or for if someone is using the VCS to write a book or something.


It would also allow the file structure to become irrelevant to source control; users could customize how the methods in a class are organised.


there's some machine from the 70s that does this. iirc it stores all source code in an ast like representation alongside binaries and has some kind of built in version control.

wish i could remember the name...


ahh yes, the rational r1000. an ada machine from the 70s that stored programs in a mixed ast/object data format called diana: https://insights.sei.cmu.edu/documents/948/1988_005_001_1565...


Plastic SCM developed Semantic Merge and diffing about a decade ago


I have not analyzed the full potential and benefits of Diversion, but I would not agree with the statements you made about Git. I think you should not focus on Git in your pitch.

>>it was built for a very different world in 2005 (slow networks, much smaller projects, no cloud)

Slow network: why is this a negative thing? If something is designed for a slow network then it should perform well in a fast network.

Much smaller projects: I do not agree. I can say that it was not designed for very, very large projects initially. But many improvements were made later. When Microsoft adopted Git for Windows, they faced this problem and solved it. Please look at this https://devblogs.microsoft.com/bharry/the-largest-git-repo-o...

No cloud: Again I would not agree. Git is distributed so should work perfectly for the cloud. I am not able to understand what is the issue of Git in the cloud environment.

>>In our previous startup, a data scientist accidentally destroyed a month’s work of his team by using the wrong Git command

This is mostly a configuration issue. I guess this was done by a force push command. AFAIK, you can disable force pushes by configuration.
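For a self-hosted bare repository that's roughly the following (hosted services like GitHub/Bitbucket expose the same idea as branch protection settings):

    # run inside the server-side (bare) repository
    git config receive.denyNonFastForwards true   # reject force pushes
    git config receive.denyDeletes true           # reject branch deletions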


> Slow network: why is this a negative thing? If something is designed for a slow network then it should perform well in a fast network.

Designing for resource-constrained systems usually means you're making tradeoffs. If the resource constraint is removed, you're no longer getting the benefit of that tradeoff but are paying the costs.

For example, TCP was designed for slow and unreliable networks. When networks got faster, the design decisions that made sense for slow networks (e.g. 32 bit sequence numbers, 16 bit window sizes) became untenable, and they had to spend effort on retrofitting the protocol to work around these restrictions (TCP timestamps, window scaling).


That makes sense but then the pitch should include something about how back in 2005 the design for git had to make a trade off because of X limitation, but now that restriction isn’t applicable which enables features A and B. I don’t really see what trade offs a faster network enables other than making it a requirement that you have a network connection to do work (commits are a REST call). I’m not sure that’s a trade off I’d want in my VCS, but maybe I’m just not the target audience for this.


Even a force push doesn't destroy the reflog or run the GC server-side. I wonder how you can accidentally lose data with Git. I've seen lots of people not being able to find it, but really destroying it is hard.


He force pushed a diverged branch or something like that, and we only found out after a while. We were eventually able to recover because someone didn't pull. But it was not a fun experience :D


So multiple people did a git reset --hard origin/master and nobody complained or checked what and why this was done? That's not "one data scientist with the wrong command" but the whole team that fucked up hard IMHO.


I think you just sold their pitch with this comment... I, like many many people here, have done quite a bit of product design. What do you call it when a bunch of people use your product, and it breaks for several of them? That generally indicates your product is weak, or has a very rough UI.


The pitch simply wasn’t true. Data was not destroyed and was restored hours later.


For many of us, the story rings true. We have ourselves had horror stories that we did manage to recover from after a few hours of fearfully googling, and we know of other, less capable friends and colleagues who were unable to recover the data and who just accepted the loss.


It's a kinda crazy argument; I think data loss is way more likely with a centralised system than with a decentralised one.


You think Microsoft losing GitHub repos is more likely than poor bastards trying to make sense of the git command line? You think these guys are going to do a worse job with their centralized service?


People have lost data on GitHub from repositories being copyright striked for example.

At least with git, every developer has a copy of the full history so full data loss is impossible really. What happens if this company folds? You're left with some proprietary repo that you suddenly have to work out how to self-host.

It just doesn't make sense when compared to just learning git which is definitely the most fruitful thing a developer could learn at the start of their career.


It's a pitch. The story has obviously been embellished and polished and condensed, ready for public consumption. Being pedantic against it is not productive.


Politely disagree. It’s productive because hopefully future teams who launch on HN ask each other, “Is what we’re saying true?” during all those polishing and condensing sessions. If they don’t, the risk is crossing a line that damages the reputation of the team and undermines months if not years of hard work.


That's a creative way to defend a dishonest pitch


If the pitch is dishonest, why would I ever trust them with something as vital as my VCS? (And yes, "embellished" means dishonest)


This is not a pedantic criticism.


But that seems like pretty much the equivalent to "rm -R *"? And also just a permission/configuration issue.


To put into perspective, that was in 2014 :D There were no branch protections, and git was even harder to use. Plus everyone was new at git, obviously (we started in 2013 with mercurial, which was still a legit thing to do, and switched to git).


Yeah, these days stopping force pushes is a checkbox (default?) in GitHub.


Or drop table|database or delete from. To _nearly_ lose data it took multiple clueless engineers and the issue going undetected for months.

I wonder how Diversion handles operations that possibly delete data. What's their solution?


> but the whole team that fucked up hard IMHO.

Multiple individuals with similar problems would tend to imply systematic inadequate training. Or the enterprise concerned adopting an inappropriately complex system for its intended userbase.


Or, git is both very complex and very useful, and a large portion of its users have a poor understanding of git but enough for it to be a useful tool. If you want to do source control (which you do), then you’re investing time into learning git and/or fixing git, or maybe using a project like this.


You literally just said what GP said in different words, but prefaced it with "Or" as if it's a disagreement. What you said boils down to "inadequate training".


We both agree that they didn’t know the tool, but GP seems to blame them for deciding to use the tool without training. I was more or less defending their choice to use git, while also acknowledging the potential of a tool like Diversion. My interpretation of GP was that it doubled down on git, while claiming that anyone using git without understanding it is “doing it wrong”, which I agree with in principle but not in practice, as I argued in my initial comment.


And the tool made a screwup that hard not only possible, but very difficult for the victims to recover from.

Doesn't say a lot for git's usability.


A couple of thoughts about this:

One is that the possibility of overwriting history / etc is a really powerful and useful feature, but one that should only be used with some consideration, hence being gated behind the scary '--force'. The fact that git provides one the ability to discard and overwrite commits for a ref shouldn't be an endorsement of doing so freely. I'm glad git has this capability though and any "git alternative" would be all the worse if it didn't provide it, IMO.

Two is that if the concern is git's usability - i.e. the "problem" here is that it's too "easy" for users to do destructive actions accidentally - well, there are ways to solve that other than to reinvent all of git. There are plenty of alternative git UIs already, and an alternative UI is a great way to be "wire compatible" for existing users but still help protect those novice users from footguns.


That all makes sense and mirrors many of my own thoughts.

Though I'll say that "--force" isn't necessarily a "scary-sounding" option name unless you're used to Unix CLI naming conventions.

Further, the warnings git gives you about this are virtually inscrutable if you don't already understand what's happening.

A good interface to "blowing away history" would give you a brief summary of what will actually be gone, e.g.:

"If you go ahead with this overwrite, the following changes will be completely removed from the repo:

a3bf45: Fix bug in arg parsing

22ec04: Add data from 2024-01-17 scraper run

...

Are you SURE you want to completely destroy those commits? (Y/n)"

and if user says "Y", output should log all removed commits and also say:

"These commits can still be recovered until <date>. If you realize you want these back before then, run the following:

<command to restore commits>"

Generally, I think it's a mistake to put UI improvements in a secondary tool.

If there are issues that need fixing, get those changes in the canonical project, because layered patches on top will always be short of maintainers and behind the main project.


> Are you SURE you want to completely destroy those commits? (Y/n)

While there are a lot of user interfaces that could be improved, I believe the above has empirically been shown to be inferior to the alternative "re-run this command but add a scary option to proceed".

Users habitually answer "Y" to questions like the above all the time. And certainly after a few times it becomes routine for anyone. But having to re-enter the command and type out a whole word like "overwrite", "force" or "i-know-what-im-doing" is a whole other roadblock. The example is especially ill-chosen to have Y as the default option.

Any operation in git that destroys so many commits will include a list of the commits that are destroyed, similar to what is suggested here, and trying to push the resulting repository will say exactly how many commits will be removed and require a rerun with the force option (together with the necessary privileges). So reality is already not far from what you suggest, but with more fail-safes.


You make a good point about "Y/n" being more dangerous than refusing and requiring an explicit option be passed.

The clear warning about what commits will be lost is not at all how I remember force-push working.

That said, I usually use magit in emacs for git and understand the force options well, so I haven't actually looked at the standard push failure warning in years. Maybe I'm remembering wrong, or perhaps it's been improved in recent versions.


It doesn’t overwrite the commits though. It inserts new ones and resets the branch pointer … doesn’t seem like you’d need a whole new tool to mitigate this - just an automatically generated tag or something when you --force-push - would be easy to do if there was demand for it …


They used --force, which is usually the flag to say: here there be dragons. Be careful.


Yeah, I can’t see how use of a --force flag by people who didn't know what they were doing is enough of a reason to switch to a different VCS (let alone write one). The issue was people using a tool in a way that they shouldn't have. Which isn't a technical problem, but a training problem. You can't fix people problems with technology, so I'm sure there will be other footguns in this new system that someone else will figure out how to use to almost lose data.

Git is great in that it is flexible and powerful. But that power leaves some tools open to people who don’t know what they are doing… that’s the trade off.

(Now something that better handles non-code assets and large data files, I’d be much more willing to listen to that pitch.)


So the work wasn’t actually destroyed, and you were able to recover it. So all the people pointing out how implausible that part of your pitch was were right, and you were in fact just lying.


That's really not the main point of the post, but you're right I should have been more precise.

Edit: updated in the top text now!


I think the point being made is that you spent a lot of your opening post talking about Git, and led with that bit, rather than with Diversion. What makes Diversion different is added at the end, after you've spent time trying to convince Git fans that their current tooling isn't good. Worse, the examples you listed of why Git is bad are more reflective of configuration and processes than of Git itself.

This is ultimately a very weak pitching strategy. The first thing you convey to your potential users is insecurity--an insecurity that people won't choose your product over Git. And it's hard to want to buy something from someone that isn't secure enough about their product to pitch the product first, and answer questions/make comparisons after, as a form of clarification.

Alternatively, instead of doing a comparison to Git, you could start with a list of "have you experienced these Git issues? <list of problems>. Here's how Diversion improves on Git in this regard." In this case you're actually solving people's problems, rather than looking like you're grasping at straws to complain about Git and justify an alternative.

FWIW, I personally have 0 interest in a cloud-first version control. I like the cloud as a form of backup and syncing with team members, but I ultimately want a version control that works as well offline as it does online, and prioritizes the local experience.


The main point of your post is how much better you are than git. You support this main point by making up lies about git. This does not make me personally interested in trying your product.


From my point of view, it's not so much about lying; for me the OP demonstrates a degree of incompetence about Git on the part of the post's writers.

The fact that they don't seem to fully understand working of Git (not on the level of Git developers, just the level of Git administrators/users) does not inspire trust in their competence to create a Git alternative.


Just somewhat surprised because if anyone did a `git pull` they'd get divergent history and therefore a merge on default configuration. It would take a lot of manual work to ruin more than one copy of the repo.


For your information you can use the reflog command to find the previous head commit and restore your branch. It takes 10 minutes and then you learn to disable force pushing on the main branch.
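Roughly (the hash and branch name here are placeholders):

    git reflog                          # find the commit the branch pointed to before the bad push
    git branch main-recovered a1b2c3d   # re-create a branch at that commit
    git push origin main-recovered      # put it back on the server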


I find it funny how many comments in this angry rebuttal section actually endorse a Git replacement.


It's an interesting new application of that joke, "when I have a question on Linux I use a sock puppet account to leave an obviously wrong answer which prompts dozens of corrections."

I'm trying to imagine how to generalize this to other products. I think if I state the competing product has negative feature X, but also intentionally get some details confidently incorrect or deliberately feign incompetence, you get a group of people confirming X.


I find it funny how many comments you've made in this thread missing the point. People are reacting against the dishonest pitch, not the product.


So you were able to recover and did not lose a month's work of data? Your story just doesn't make sense. Come on.


Indeed you're right, the work that was erased from Bitbucket was restored from one of the employees who hadn't pulled yet; the post was edited accordingly.


Wouldn’t you still have been able to recover it even if everyone did pull, assuming GC had not run on everyone’s machine?


> the work that was erased from BitBucket was restored from one of the employees that didn't yet pull

Actually those commits that you considered lost, were still stored on everyone's personal computer in your team. You just didn't know how to use `git reflog` to find them.


> doesn't destroy the reflog or run the GC server-side.

Git doesn't give you access to the server-side reflog either. So it's not of much use if you don't control the server.

As for losing data with Git, the easiest way to accomplish that is with data that hasn't been committed yet; a simple `git checkout` or `git reset --hard` can wipe out all your changes, and even the reflog won't keep a record of that.


That data not committed to git cannot be recovered by git should hopefully not surprise anyone.

Neither is it the fault of your version control system, or any other system really, if you cannot access your server and are without backups.


> As for losing data with Git, the easiest way to accomplish that is with data that hasn't been committed yet

Also Git has pretty awful behavior losing changes when one doesn't press "Save" in their IDE. Bad, bad Git.


Your applications also shouldn't lose work when you don't press save; this is the entire impetus for the "recover unsaved work" feature in most document editors. A version of Git that shunted uncommitted changes to a special named stash whenever you did anything destructive would be a positive thing.

It's what I end up doing manually anyway, but why make a system where the default behavior is destructive and I have to remember every time?
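Something along these lines can be approximated today with an alias (the alias name is made up):

    # snapshot everything, including untracked files, before a hard reset;
    # any extra arguments (e.g. origin/main) are appended to the reset
    git config --global alias.safe-reset '!git stash push --include-untracked -m "backup before reset" && git reset --hard'

    git safe-reset origin/main
    git stash list    # the backup is still there if you change your mind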


It may be prudent to note that git by default is rather kind, in the way that it will not change your data unless you explicitly force it to with --force or --hard. I think git, as hard to learn as it can be, sometimes has a bit of an unfair reputation here. It's not all bad.

Not only is it quite careful about not losing data, someone actually took the time to make it spit out messages that not only describe what just happened, but also give suggestions of what to do next depending on how the user wants to proceed. That adds a level of discoverability that is usually associated with dialog-based GUIs. The quality of these messages can sometimes be surprisingly good, far from the Clippy-level helpfulness you sometimes see.

There are a few exceptions to the principle of not losing local changes, where you explicitly restore an old version of a file for example. But saying the default behaviour is destructive really gives a false impression.

But yes, you are absolutely right that a system to recover unsaved work is a good thing, but I would argue that it belongs at the editor level, not in a version control system. A user could have a number of files open that have local changes. The editor has a much better idea in which order changes were made, and which changes haven't even been committed to disk yet.


I can't say I'm widely traveled, I have no idea how desktop Office works, but Apple does this so well.

Using their desktop apps, Pages, Keynote, Numbers, TextEdit, Preview, I never hit "Save". I just close the apps. When I come back, the windows reopen right where I left off.

I wish emacs did this. I honestly don't know what it would be like for a code editor to be "constantly saving". I guess I would adapt, but there are times when I do all sorts of changes and go "Ah, this isn't right" and just kill the buffer. The ultimate undo.

But there's a great feeling, to me, when I go to close the app (or shutdown the computer) and it just closes. No prompts, no warnings, just saves its state, shuts down, and comes back later. And with the ever popular "naming things" issue of computers, I have a bunch of just "Untitled" windows. They're there when I open the app, and that's all I need to know.

The nag factor and cognitive load reduction of that is just unmatched. "Just deal with it, I'll come back later, maybe, and clean it up". One less thing.


A month of work for a whole team was never even committed or stashed let alone pushed? That is not a git problem.


I agree. It's quite hard to actually destroy data in git. Even with the so called "destructive" commands, walking through the reflogs can usually restore work that was accidentally deleted or whatever.


I configured my github to only allow commits with an anonymised email address. Time passed and I used another machine on which I had already opened that repo before. I pulled my recent work successfully, wrote stuff and then committed and pushed.

Github rejected my commit as I had the wrong email address. I then had to try and work out how I delete a commit but keep all my changes so I could commit it all again but with the correct email address.

I'm not sure exactly what I did but in my ham-fisted experimentation I deleted the commit and restored my local copy back to the way it was before my commit, losing all my work that day.


If you had already committed, `git reflog` should have still found your changes (even after you deleted the commit and restored the local working tree) unless you deleted and re-cloned the repository.


Honestly I don’t understand why more people don’t use a GUI for git.

What you describe would be 1 minute of work and maybe 10 clicks, with a very low probability of shooting yourself in the foot, in Tower.


Destroying it, and nobody knowing how to recover it (or that it can be recovered at all), are identical.


Thanks! We're definitely not trying to bash Git, it's done a lot of good for software development and for sure is going to continue evolving.

Git had much more of an edge when it was competing vs SVN and other centralized VCSs. With 10Mb networks (if you were in the office) you could feel physical pain when committing stuff ><

Regarding how Git is not perfect in the cloud world - check out GitHub's blog post here about their cloud dev environment, Codespaces: https://github.blog/2021-08-11-githubs-engineering-team-move...

"The GitHub.com repository is almost 13 GB on disk; simply cloning the repository takes 20 minutes."

Moving 13GB inside your own cloud should take seconds at most. The problem is the way Git works, it clones your entire repository into the container with your cloud environment, using a slow network protocol. With Diversion it takes a few seconds.


> Thanks! We're definitely not trying to bash Git, it's done a lot of good for software development and for sure is going to continue evolving.

It is not about bashing git; it is about anchoring your argument for why Diversion is a better alternative around git. You're basically taking your game/arguments to their playing field, and thus will have an uphill battle for mindshare.

Instead, consider reframing the playing field and mention git less (if at all). Something like "the future of version control is blah". Surprise us, talk to us about your vision for source control, or better yet, code and multi-discipline collaboration (e.g. between eng and design), etc.


I personally would not bother reading any "the future of X" if it did not address problems of existing tools. I know you're trying to give advice from a marketing POV, and it is good, but it's also inherently bullshitty - because its purpose is to net more sales rather than actually make a good argument.


I'm not sure I understand this at all.

> The problem is the way Git works, it clones your entire repository into the container with your cloud environment, using a slow network protocol.

What about git's network protocol is 'slow'?

I think I can also come up with a pretty simple experiment to prove or disprove this:

1. Fill a file with 13 GB of data and commit it.

2. Upload that to GitHub or wherever you want.

3. Time how long it takes to clone, and compare that to the real GitHub.com repo.

You will find the one we made takes 'seconds' (or minutes, depending on your network connection), while the GitHub.com one will take some time.

So, same data, two different results? The difference in this experiment rules out the 'slow' network protocol as the difference maker. The real reason is that the GitHub.com repo will have hundreds or thousands of commits.

Basically, the difference is the commit history, because that's how git needs to work. Git stores the diffs for the entire commit history, not just the literal files at the HEAD. I don't know what the network protocol has to do with that.


It is perhaps worth pointing out that if you don't need the history you can just `git clone --depth 1` and save the network transfer and disk space.


It reminds of when someone told me git submodules are slow.

They just forgot about shallow clones..


If you use the dumb http protocol, both cases should be equally fast.


    git clone https://github.com/github/docs.git
    123.57s user 37.02s system 74% cpu 3:35.73 total

    git clone --depth 1 https://github.com/github/docs.git
    3.37s user 1.83s system 35% cpu 14.521 total

Not a scientific test at all, but the second one was literally 15x faster, wall clock time.


> We're definitely not trying to bash Git

Using git with bash is the best way to use git (:


Came here to make a similar joke


That article also states that using a standard Git feature, shallow clones, you go from 20min to 90s. Most of the problems touched upon in the article are about state management for local environments, yes that can be tricky. And it can take time, but it has nothing to do with Git.


>> a data scientist accidentally destroyed a month’s work of his team

> This is mostly a configuration issue

git apologism :)

(FWIW I do agree with the rest of your comment, and I hope you forgive the slight joke. Product users, for any product are fallible humans. That might be fallible in accidentally deleting, or it might be fallible in forgetting to turn on the safety settings.)

Very seriously, something like this should not be possible in a source control system. Data integrity needs to be built in by design.


> Data integrity needs to be built in by design

It is built into Git by design. Git keeps commits around for 90 days even after they’re “deleted.” This is why people who understand Git were so skeptical of OP’s claim. The point that Git is confusing still stands, however.


The issue with a lot of freedom and unopinionated tools is always going to be the multitude of ways to fuck up. On the flip-side, you may not like what choices are made if you’re forced to use it in a certain way.

We enforce a strict pull-request squash commit with four-eyes approval only. You can’t force push, you can’t rebase, you can’t not squash or whatever else you’d want to do. But we don’t pretend that is the "correct" way to use Git, we think it is, but who are we to tell you how to do you?

We take a similar approach to how we use Typescript. We have our own library of coding “grammar?” that you have to follow if you want to commit TS into our pipelines. Again, we have a certain way to do things and you have to follow them, but these ways might not work for anyone else, and we do sometimes alter them a little if there is a good reason to do so.

I don’t personally mind strict and opinionated software. I too think Git has far too many ways to fuck up, and that it is far too easy to create a terrible work environment with JavaScript. It also takes a lot of initial effort to set rules up to make sure everyone works the same way. But again, what if the greater community decided that rebase was better than squash commits? Then we wouldn’t like Git, and I'm sure the rebase crowd feels the same way. The result would likely leave us with two Gits.

Though I guess with initiatives like the launch here, is two Gits. So… well.


> But again, what if the greater community decided that rebase was better than squash commit? Then we wouldn’t like Git, and I’m sure the rebase crowd feels the same way. The result would likely leave us with two Gits.

Meh, this is overrated. We'd end up with 2 Gits, and over time just one fork would probably take over, based on marketing, PR, dev team activity, etc. The second one would probably still be around but used by only a minor part of the community.

Just because a thing has on paper many forks, does not mean those forks are equal. In fact, a situation with many major forks rarely survives the long term. See Jenkins vs Hudson, Firefox vs Iceweasel, etc. Most people will congregate towards one of the forks and that's it.


What if someone pushes something inappropriate? Shouldn't there be a way to delete it?

As an example, what if someone pushes:

- A private key or password
- Copyrighted content
- Illegal content

In cases like this, it needs to be possible to remove the bad commit from the repository entirely.


Yes, but this should be only possible by way of commands that make it abundantly clear what you are doing, e.g. `git delete <whatever>` with extra confirmation “Do you really want to permanently and irrevocably delete <whatever> in the master repository?”, or a more obvious “recycle bin” that presents deleted branches/commits in familiar ways and with explicit expiration dates. But the Git architecture doesn’t lend itself to that level of user-friendliness.


> When Microsoft adopted Git for Windows, they faced this problem and solved it.

On Windows. On Linux Git still doesn't scale well to very large repos. Before you say "but Linux uses git!", we're talking repos that are much bigger than Linux.

Also the de facto large file "solution" is LFS, which is another half-baked idea that doesn't really do the job.

You sound like you're offended that Git isn't perfect because you like it so much. But OP is 100% right here; these are things that Git doesn't do well. It's ok to really like something that isn't perfect. You don't have to defend flaws that it clearly has.


>> When Microsoft adopted Git for Windows, they faced this problem and solved it.

> On Windows. On Linux Git still doesn't scale well to very large repos.

All of Microsoft's solutions for git scaling have been cross-platform. Even VFS had a FUSE driver if you wanted it, but VFS is no longer Microsoft's recommended solution either, having moved on to things like sparse "cone" checkouts and commit-graphs, almost all of which is in mainline git today.

I also find it funny to hear the complaint that git scales worse on Linux than on Windows, given how many Windows developers I know who have file-operation speed complaints on Windows that Linux doesn't have (which is a big reason to move to Windows Dev Drive given the chance, because of its somewhat Linux-like file performance).


`fsmonitor` is still only available for Mac and Windows.

https://git-scm.com/docs/git-config#Documentation/git-config...


Fair enough, though there is a hook to provide your own on Linux: https://git-scm.com/docs/githooks#_fsmonitor_watchman
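
As a rough sketch, wiring Watchman in looks something like this (the hook name is only an example; see the linked docs for the exact steps):

    cp .git/hooks/fsmonitor-watchman.sample .git/hooks/query-watchman
    git config core.fsmonitor .git/hooks/query-watchman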


How common are repos bigger than Linux?

Git also has the huge advantage of an ecosystem, tools and integrations. It is overkill for small projects and there are friendlier alternatives for those - but git wins because it is what everyone knows. Something aimed at the small number of large projects will suffer the same problem.


> How common are repos bigger than Linux?

In terms of number of commits, Linux is probably bigger than most. In terms of storage size, almost any video game project will be significantly bigger.

It's no secret that git is very bad at handling large binary files.


So this is very specifically for things like games with large binary assets?


No, large companies using monorepos will have repos much bigger than Linux even without large binary assets. Apparently Linux has ~10 commits per hour. I probably do ~10 commits per week. So a team of ~150 of me produces commits at a faster rate than Linux. Very rough estimate, but it takes less than you'd think.

Also if you vendor a few dependencies that quickly increases the size.


You don't even need game assets, your company's icon library is likely enough to tip the scales into territory git doesn't handle well.


> really like something that isn't perfect. You don't have to defend flaws that it clearly has.

Certainly true. But it's not at all clear how the product solves these specific problems (they say "Painless Scalability", which sounds nice, but did they try developing any 100+ GB projects with massive numbers of commits/branches on it?)


> This is mostly a configuration issue. I guess this was done by a force push command. AFAIK, you can disable force push by configuration.

If a feature can lead to actual unintended data loss, it should come disabled by default. Are there any other "unsafe by default" features in Git? What would be a sane general default that prevents unwanted data loss, and why is it the case?
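
For reference, on a self-hosted remote the configuration presumably being referred to is something like this; hosted services expose the equivalent as branch protection rules:

    # run in the server-side (bare) repository
    git config receive.denyNonFastForwards true   # reject history rewrites (force pushes)
    git config receive.denyDeletes true           # reject branch deletions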


--force always implies data loss. You're overriding the remote state.

Do people use it in an unsafe manner because they don't understand git, and is there a problem there that could be tackled? Yes.

With that, I don't think git has any feature that is unsafe by default.


In that specific case there was some error the user didn't understand; he googled it and found a StackOverflow answer with --force, and naturally tried it. BitBucket didn't have branch protection back then; today it's a bit better (you can still destroy your own work, but usually not others').


I agree that git is very complex (just try reading its documentation and counting how many options or commands you have never heard of before). But I think push --force is probably one of the easiest git concepts to get. The fact that someone on your team copy-pasted something from SO without understanding it doesn't seem to be related to git. Otherwise we could say that the fact some people lose their data through "sudo rm -rf /" proves the complexity of Unix. I don't think so.


This was PEBKAC, my dude. Git wasn’t at fault here; the script kiddie who pastes before understanding is at fault. Amateurs


My biggest problem with git is branch deletion — if you never do it you end up with far too many, but deleting a branch can’t be version controlled.


It is somewhat version-controlled, but not completely. If you use the reflog you can find it again and see how it moved around. But the reflog gets rewritten and gc'd, so it's not true version control.
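
As a sketch, recovering a locally deleted branch usually looks something like this (branch name and SHA are made up):

    git reflog                              # find the commit the deleted branch last pointed to
    git branch my-feature-restored abc1234  # recreate a branch at that commit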


Just curious, why do you want that to be version controlled?


Because I might realize later I made a mistake, or I might want to view history.

If I never cared about historical state and mistakes, I wouldn’t need version control at all :)


You could delete the branches locally while archiving them to another clone of the repo.


> With that, I don't think git has any feature that is unsafe by default.

Well, you just mentioned `--force`. It is unsafe by default. Git has a couple of flags to make it safer (`--force-with-lease`, `--force-if-includes`) but those aren't the default.
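
For example (a safer habit rather than a guarantee):

    git push --force-with-lease                       # refuses if the remote ref moved since you last fetched
    git push --force-with-lease --force-if-includes   # additionally requires that you've integrated those commits locally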


If you’ve ever had to remove private information from history before making the repos public (think domains, names, configuration, etc) you will appreciate the ability to rewrite history (and all the other things --force gives you)


I don't get your point. Nobody is saying don't use `--force`. Just that the default `--force` flag is the most dangerous variant.


I am not aware of any default use of force. Where does that happen?


The feature is 'git push'. --force is the opt-in to the unsafe behavior. It should not be used lightly.


You're missing the point. `--force` is the default of the force variants. The other `--force-but-something` arguments clearly modify that default. It's the wrong way round.

Obviously they've done it for backwards compatibility, but the fact that they haven't even added an option to make it the default is pretty lame.


Should a chain saw come with the ability to start the engine disabled by default?


Yes. That is a great idea. You could do something like a tab that you have to remove that tells you about chainsaw safety.


The problem here is not the tool. The problem is the author's colleague's willingness to paste a stackoverflow answer into their terminal without taking a moment to understand what it does.

If stackoverflow told them to break off the chainsaw safety tab there is no chance it would have been read first.


But it doesn't lead to data loss.

The commits that were overwritten by "force" are still there on the server. Any admin could recover them pretty easily. They're probably still present in the local repo of the person who ran "git push --force" too, as well as on the machine of anyone else who has cloned the repo.

The only way you'd actually lose data is if every single person who had a clone of the repo ran gc.

Or apparently if nobody knew about "git reflog" and nobody bothered to do a Google search for "oops I accidentally force pushed in git" to learn how to fix it.
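
As a sketch of that recovery, from any clone that fetched before the bad push (the branch name and SHA here are hypothetical, and the remote-tracking reflog needs to exist, which it does by default):

    git reflog show origin/main                       # find where origin/main pointed before the force push
    git push --force origin abc1234:refs/heads/main   # put the old tip back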


The Windows Git repository is only 300GB, which is basically child's play when people are talking about "large repo scalability". Average game developer projects will be multiple terabytes per branch, with a very high number of extremely large files, and very large histories on top of it. Git actually still does handle large files very poorly, not only extremely large repos in aggregate. The problem with large Git repositories is nowhere near solved, I assure you.


This includes assets, right, or some kind of prebuilt data in custom formats? Otherwise it would be hard to have this much data in source files.


Yes, game development studios include their raw art and environment assets directly in source control, just like source code. That's because the source code and the assets for the game must go together and be synchronized. That also includes things like "blueprints" or scripting logic. Doing anything else (keeping assets desynchronized or using a secondary synchronization tool) is often an exercise in madness. You want everyone using one tool; most of the artists won't be nearly as technical and training them in an entirely different set of tools is going to be hard and time consuming (especially if they fuck it up.)

But honestly, you can ignore that, because Git doesn't even handle small amounts of binary files very well. Ignore multi-gigabyte textures and meshes; the data model just doesn't handle binary files well because e.g. packfile deltas are often useless for binaries, meaning you are practically storing an individual copy of every version of a binary file you ever commit. That 10MB PDF is 10MB you can never get rid of. You can throw a directory of PDFs and PSDs at Git and it will begin slowing down as clones take longer, working copies get bigger, et cetera.

The 300GB size of the Windows repository is mostly a red herring, is my point. Compared to most code-only FOSS repos that are small, it's crazy large. That kind of thing is vastly over-represented here, though. Binary files deserve good version control too, at the end of the day.


Git is bad for games, and they should definitely make that comparison in their pitch if they want to capture that market.


No, it's not. LFS has improved over the years. Git is supported as a first-class citizen in Unreal Engine 5, alongside P4.


Just because it has integrations doesn't make it great. LFS is still not great; it doesn't have a lot of backends, for instance. And a real locking system is table stakes for a gamedev VCS.


Good for developers using Unreal Engine 5 I guess. Fact remains that most game developers struggle with Git.


The complexity people think they face with Git can often be overcome with a good UI and/or tutorials.


In part yes, e.g. lots of people like SourceTree. Some of the complexity is inherent though, e.g. local vs remote branches and the various conflicts & errors that result. Git has existed for 18 years, and the complexity problem still hasn't been solved. Other tools like SVN were never considered so hard to use / easy to screw up.


Have you ever tried running Git in the cloud? :)

Cloud-native and running things on “EC2” are very different things.


Yep :) Lots of products run Git on EC2/containers, e.g. GitPod or GitHub Codespaces. Ironically, Diversion works much faster on these than git does.

https://github.blog/2021-08-11-githubs-engineering-team-move...


Focusing on Git seems like completely the wrong pitch. Git is a distributed VCS - in all your examples you were clearly trying to use Git in a centralized manner with no backups. I suggest focusing more on your own product than on Git.


You're totally right, we were using BitBucket and pushing and pulling from there. It's really more of a centralized manner, but this is the usual workflow for most teams and companies and what they actually need (a single source of truth). Totally agree about backups, lesson learned :)


I want to add counter-feedback. I think focusing on git's weaknesses is really appealing. My background is more ML research and data and I viscerally connect with your pitch around both git's limited scalability and the complex/dangerous nature of git operations (ML researchers + git is not great).


Tbh they should really just learn it. It's super simple and powerful


Until the day the company is targeted by a ransomware group and everything is switched off in a panic. Or the day the network connection of a building goes down. Requiring a REST API call in the cloud for every VCS command bites hard then. We had this before with SVN and it wasn't nice.


So it sounds to me as if you are trying to create a replacement for BitBucket\GitHub and their ilk, not Git. This may be a worthwhile task. Maybe it makes sense to concentrate on this in your pitch.

For BitBucket\GitHub\GitLab, and for the workflows enabled by them, Git is just an underlying technology. Some of the functionality of these services is implemented using Git commands very clumsily. Some Git commands don't make sense or are dangerous in such an environment\workflow. Yet the Git interface is fully exposed to the users of these systems.

(Despite your statements and example, Git commands are not dangerous in the sense that they can destroy information already pushed to the repo. However, as your example demonstrated, they are dangerous in the sense that to recover from them requires expert knowledge and capabilities.)

Git was designed for truly distributed development, and it is great for that. A lot of projects use it, though, for centralized development. Git the software is fully capable of supporting such development with proper configuration, but arguably has bad defaults for it, and the existing solutions seem to be half-assed (to tell the truth, I hate GitHub, but won't go into this right now).

For me, in my work and personal use, the fully distributed character of Git is not important, but being able to work offline is, crucially. I know it is an important issue for many developers. With working from home being more and more widespread, I'd think this issue becomes more important, not less. Not being tethered to a good internet connection, or being able to work during an outage, is really cool :-)

(Before Git I would have 2 VCS applications installed on my work laptop, one working against a central database that required an internet connection, one fully local, with a separate local database. Synchronizing them was a constant chore, a significant part of which I was not able to automate and had to do manually. Still, it was a worthwhile price for being able to work offline.)


As a game dev I find the pitch unexciting.

> git is bad we're better

Honestly, a modern git lfs workflow is really smooth. I think it handles binaries fine. Show me a cumbersome git feature and why this works better. You can't just tell me tools I use every day are unusable.

I think the main pain of git is if you want to put everything in a single repo. Big isn't a problem; getting just what I need (checking out a single large model) is the problem.

From the website I have no idea if this can do partial checkouts. I assume yes, but it's not stated at all.

> cloud native

A lot of studios want on-prem and self-hosted private cloud support. Cloud native is touted as a feature but the details are left out. That has me wondering if some things won't work when I try to host on-prem, or whether it's an afterthought.

Can I easily host this on my own k8s cluster? It's not stated. Cloud native doesn't mean it's on the internet.

Another feature that artists like is file locking. P4 has it and git lfs actually has it too. The heavier usage of P4 streams and branches makes it hard to use locks effectively these days. Merging something that was locked but now isn't is sticky business... maybe you guys solved that.
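
For reference, the git-lfs side of locking looks roughly like this (the pattern and path are just examples; the remote has to support the LFS locking API):

    git lfs track --lockable "*.uasset"     # mark a pattern as lockable in .gitattributes
    git lfs lock Content/Maps/Arena.umap    # take the lock before editing
    git lfs locks                           # see who holds what
    git lfs unlock Content/Maps/Arena.umap  # release it when done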

> File locking across branches - coming soon!

"coming soon"... so close.

Good luck to you guys but I think the pitch needs work.


Git LFS is a giant hack on top of Git. Most game devs I know moved away from it over time (back to Perforce or SVN). It might seem okay at first - but deep into a project you'll want to rearrange/rename folders and keep history, and you'll discover that Git LFS doesn't actually work like normal Git and your file history wasn't kept. Only once you start dealing with issues will you find all the weird hacks Git LFS does on top.

I'd say Git not working well for game dev isn't a pitch that Diversion needs to make, because it's already clear to most game devs.


It's gotten a lot better over time as far as adding tooling to make big changes. It's the same ol' "it's fine if you know what you're doing and not so fine if you don't." Personally I'd rather deal with git's issues, but P4 has a lot of built-in support in engines.

Day to day I think it's the industry tooling and the partial checkouts that have people pick P4, not esoteric problems you face years down the line.


Thanks for the feedback, glad you asked! Partial checkouts are supported. K8s not yet, but we do run Diversion in a container for testing. Private cloud works as well, it's more a question of support manpower though - will be available for large clients. Obviously we still don't have every possible VCS feature, we're just getting started :) But we are adding features pretty quickly (e.g. conflict notifications took a few days to implement). Thanks!


Game dev here, totally agree.

File-locking across branches excited me, support for visual diffing of uassets would be insane. The platform screenshots as a git/github alternative didn't excite much interest as Plastic and Perforce are major players.


Other game devs mentioned visual diffing as well, we'll try to make this happen!


Also a game dev — if you want to attract devs, tell me why this beats P4 or Plastic. How's it similar or different? Plastic has a lot of the features you're bringing up as "new" and has great DX, which tells me you either didn't do your research, or aren't saying enough about how what you're doing is meaningfully different from them.


> In our previous startup, a data scientist accidentally destroyed a month’s work of his team by using the wrong Git command.

While git is indeed a usability clusterbomb and there is massive space to improve, the problem above sounds like a devops failure. Git gives you all the tools to prevent such a disaster; all you have to do is not give the root password of your CI server to any data scientist.

On the topic of your startup, I would very much like a gentler learning slope, where I can introduce regular non-coders to the benefits of source control, and still have the advanced branching/merging/rebasing etc. for the wizards.


I also always wonder in cases like this if the person just didn’t know about reflog.

You _really_ have to try to fuck up hard enough that everything is gone, it just might require even more arcane commands than what got you into a mess.


I've used reflog many times, but I'm not sure if this:

> You _really_ have to try to fuck up hard enough that everything is gone

...is still true when using git-lfs, which seems common when using large data sets.


I believe it works the same. The large files are not immediately pruned.


True! They didn't actually have a root pwd to anything, the repo was hosted on BitBucket which didn't have a branch protection feature back then.

Non-coder users are actually an important use case for Diversion (like in game development where many/most users are artists).


And nobody on the team had a reasonably fresh checkout of the repo on their local machine?

To put it bluntly, this story does not sound credible. It's also one of the first things you say in the pitch, which taints everything you write later. I would suggest focusing on what you do well, not in making up stories about data loss with what you perceive as the competition.

(It's especially odd when Git isn't the obvious competition; Perforce is.)


Swear to god, the story is true! We were able to restore most of the work because someone didn't pull the updates. (I should have added it in the story, didn't think it was important) But it was nerve wracking :')


But if someone DID pull the update, the old commits would have still been in their local repo and they could have run "git reflog" to retrieve them.


The person, who apparently googled some command sequence and happily entered them only to find out it overwrote the team's data, could just as easily have used those google skills to search for how to undo last git operation.

Obviously this person had all the privileges required to force push the previous commit in order to save the day. The old adage of never fact-checking a good story holds true, however. Not sure it's a good selling point though; by nerd-sniping everyone into explaining in detail what the actual problem was, no one will read to the end.


Git CLI UX may not be great, but the git data structure of representing commits, branches, trees and blobs as immutable pointers and Merkle trees is a phenomenal invention.

I don't agree that every command needs to hit some REST API. That seems like throwing the baby out with the bathwater.

The most powerful thing about git is that I can work fully offline with a partial clone. And sync the commits when I get online.

Git got popular because of its distributed nature.

I remember when our svn server used to hang and the entire team was blocked from even making a commit.


These SVN memories are why "cloud native" is an anti-feature to me. Most software, especially productivity software, should be offline-first.


Yep, exactly! I remember these "happy" SVN days as well. It sucked; git was much better. Today you're almost never offline though, you even have wifi on airplanes. Git's data structures are a masterpiece, for sure.


> Today you're almost never offline though, you even have wifi on airplanes.

The airplane wifi is good enough to look up an API, but it's not good enough to sync a code repo, especially one with large assets.


For assets no, definitely. At least until they get Starlinks :D I did commit into our code repo from an airplane though.


The world needs a git alternative. Anyone who has used mercurial at Google or Facebook knows the tooling could be much better.

As soon as you have 2-3 people committing to the same repo daily, git falls apart fast. The biggest difficulties with git are merging and branch rebasing. If git could do rebases better, I suspect software development teams would universally move about 20% faster.


I've worked on projects with thousands of developers committing to the same repo daily.

Git is far from perfect, but in my experience it's been far superior to any of the alternatives I used before it (cvs, svn, p4).


With 2-3 people committing daily I would be completely surprised if there were any issues.

It suggests they're working on the same files and on the same lines - why?

My current project has 40+ developers and I haven't seen a merge conflict in a long time.

Why do you think there are issues with merging and rebasing? Pretty much all cloud solutions have a button to do both for you.


rerere has worked pretty well for me when handling conflicts and it’s built into git.
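
For anyone who hasn't tried it, enabling it is a couple of config lines:

    git config rerere.enabled true     # record resolved conflicts and reuse the resolutions
    git config rerere.autoUpdate true  # optionally auto-stage files that rerere resolved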

I’ve worked with hundreds of engineers in a single codebase. Never had any issues like you’re describing.


I don’t understand how it isn’t easy?

“git merge origin/master”

Done. Unless you have a conflict, then fix it and commit. It couldn’t be simpler.

If there’s value in this product, it won’t be because it’s somehow simpler than git.


Rebasing is extremely easy with git. Sounds like more of a skill problem than a git problem.


I think the issue comes about more when you have refactoring going on at the same time, and suddenly multiple feature branch maintainers need to figure out if they're going to merge in the main branch, rebase onto the main branch, if it makes sense to squash their changes first before attempting either path, etc. If they get only partway through either one, it's difficult work to pause and resume, so that integration work is fundamentally hard to collaborate on or even really review.

And never mind maintaining an ongoing integration of multiple unmerged feature branches — perhaps that's more of a "front end" issue for GitLab and Github to solve, but the whole business of patchsets and the like is very much unsolved, and that's painfully obvious when you see Debian storing their quilt patches inside of a git repo, rather than being able to leverage git's native capabilities to achieve those effects: https://salsa.debian.org/debian/netplan.io/-/tree/debian/0.1...


You can always merge feature branches together to check for conflicts


As others have mentioned I believe you underestimate the security requirements (often physical) imposed on game studios. Running P4 (and using the often publisher mandated tape backups) is the least of your problems in such an environment.

If diversion were to succeed it would need a huge security team because it would represent far too tempting a target. Games stuff inspires a level of attack which normal businesses simply do not encounter.


Thanks for the feedback! Totally true, I was surprised by how much emphasis on security there is in game studios. They are moving to cloud however, and we're hoping that a private cloud solution (Diversion running on their cloud account) would satisfy the requirements, at least a few years from now as cloud usage grows. Security is definitely going to be super important for us, in any case.


You need to stop worrying about the cloud and focus on being agnostic to things like VPNs, on premises and so on. (Also SSO mechanisms). Any cloud needs to be optional. If you can make it easy to deploy on a private cloud where everyone accesses it via corporate Google accounts but also deploy on prem (say containerized) with Active Directory integration you will cut a lot of noise.

Game artists will increasingly not trust the cloud as it is where their data goes to train AIs, but more pressingly a large proportion of asset development is done where the network (and electricity) is surprisingly flaky.

To be specific I have been involved in several situations where we were not allowed to mention things in email due to it being cloud hosted, and those restrictions were imposed by companies that are themselves cloud providers. That is the kind of level being discussed.


Thanks for the feedback! We do hear that remote artists with flaky network connections is common, and plan to address it in several ways like CDC (a method for efficient chunking and diffing of binary files to minimize network transfer and storage duplication), local network caches or peer-to-peer transfers. Private cloud deployment or on-prem with SSO will also be available.


I get that enterprises will buy anything with "Cloud" or "AI" in the name, but VCS doesn't have anything to do with the cloud. Lots of VCS's have had server-oriented architectures, well before Git.

I see a lot about the architecture and design here. This is a product smell: focusing more on technology than solving problems. Some nerdy people may be interested in it, but it all means nothing if the experience of those users isn't good. You want me to buy your product? Sell me on why the experience is better. How it's going to speed up dev time, reduce errors, make collaboration better. None of that will be improved by a REST API or distributed storage in your backend app.

I'm also not looking to change my development practice. If most of your features are only available in a browser, I'm not going to want to use it, even if it were better than what I do now. Meet the users where they are.


Totally agree, and the launch post could only be that long :) We're trying to build a better dev experience, and the tech is only a means to that end.

> If most of your features are only available in a browser, I'm not going to want to use it, even if it were better than what I do now.

The CLI is the most complete interface. Web UI still can't do everything (getting there though).

Thanks for the feedback!


It's about time someone revisited & reimagined version control. The previous generations each lasted about 15 years: SCCS/RCS -> CVS -> Bitbucket/git/mercurial -> ??? So I am glad to see this.

I would start by talking about what is great about Diversion -- what it lets you do that you couldn't before.

Since you mention gaming and perforce I looked in vain to see if it supports binaries (a major limitation of git -- just simply not in its design space). "Binaries" can actually mean for some people compiled code -- not for me but I understand why people do it -- as well as images, data files, Word files etc.

It sounds like the second is scaling, but you don't say what you mean by that. Git scales pretty well until either the repo & its history get enormous or there are a lot of people making simultaneous changes.

The realtime collab integrated with a version control mentality could be interesting -- a major problem with google docs is the lack of useful version control (even Word is better).

And why cloud native?

Once you've done that you need only briefly mention "why not just use git instead?"


> And why cloud native?

And what even is cloud native? When "cloud native" isn't just marketing, it seems to have all sorts of different meanings.


We're using S3 storage, Lambda and ECS compute, and serverless DBs. It allows us to build a scalable product much faster, and leverages cloud features like multizone backups and distribution without having to develop these ourselves. It also allows fast transfer of data between Diversion and other cloud systems.


Those are things that matter to you, not your users.

You did talk a little (in a comment or your post I don't remember) that you can use various cloud APIs to integrate into other systems.

But at the moment, from what you're telling me "cloud native" is as interesting to me as how you format your source code.


So basically not self-hostable.


??? might be https://pijul.org/ with its commutative awesomeness


It's worth mentioning that Pijul is largely based on the same theoretical model as Darcs, which actually predates Git. But Pijul definitely makes it a lot more workable, e.g. by solving the exponential-complexity (iirc) merge that Darcs has.


This is not true, actually. Pijul answers the question "how to model states so that conflicts are allowed (and hence modeled)", whereas in Darcs it is "how do operational transforms on patches alone inform us about conflicts"?


> SCCS/RCS -> CVS -> Bitbucket/git/mercurial

You skipped Subversion/SourceForge. People forget but for about 5 if not 10 years, SVN was the biggest SCM in town, next to P4, P4 having more of a hold in the game dev world.


I not only did not forget about subversion, I funded it in part. But it was of the CVS generation.

I did leave out the proprietary things like Perforce and Aide de Camp, Solidworks PDM and the various in house things. Life's too short!


You also forgot Microsoft Visual SourceSafe :-p


Thanks for the feedback! Totally agree. Diversion does support binaries, should have mentioned it directly (will update the website).

The thought behind cloud native is that workloads and devtools and data are moving there, and we want Diversion to be the best choice for when everything is in the cloud. Besides that cloud storage and DBs allow us to build and iterate much faster, and worry less about scalability, data distribution & storage etc.

But we can also run Diversion locally / in a container, it just won't be as scalable.


why is it about time??

VCS is sort of a solved problem like SQL.

it's like saying it's about time someone revisited those Javascript frameworks.


The comment you're replying to mentions some limitations of the current state of the art.

I agree that there's no reason to change just because something is older than some threshold. But I think the roughly 15-year cadence has reflected the time it takes for a new VCS idea to develop and then become mainstream enough for its drawbacks to become annoying enough that it loops again.


I don't know about that. The UX of git leaves a lot to be desired. monorepos and large files are not definitely solved, either. If not git, what solution did you have in mind?


Millions of repos on GitHub don't have those issues.

You're talking about an extreme use case. Like saying we need to replace the shovel because it can't do what a bulldozer does.

It's a minuscule fraction of GitHub repos that have those issues - an incredibly small number, mainly devoted to games or media.

UX is fine if you know it. SQL UX isn't ideal either.


Being deceptive about Git's shortcomings is going to raise eyebrows with anyone seriously evaluating your solution, which is already going to raise eyebrows because it's not free software.

Most studios will try to avoid locking themselves into another expensive, annoying VCS that they have no control over. There's good attempts at FOSS p4 replacements now, you need to do better if you want to stand out.


> There's good attempts at FOSS p4 replacements now, you need to do better if you want to stand out.

There are? Like what?


I've been attempting this but have paused the efforts due to what seems like lack of interest -- giving me an indication that the problem is not that big of a deal.[0][1]

If anyone is interested in my resuming them, let me know at contact at weedonandscott dot com

[0] https://www.reddit.com/r/vfx/comments/11s08ne/your_opinion_o...

[1] https://www.reddit.com/r/gamedev/comments/11s5haf/what_do_yo...


I don't read the replies here as a lack of interest, more as a lack of interest in the solution you've come up with.


I don't disagree, but usually when a solution doesn't fix a painful problem, people are almost disappointed after getting their hopes high. In those posts, the commenters give me the impression that they are mildly annoyed at best (or worst).


I can't speak for the VFX post, but I think the gamedev post shows the weaknesses of git in one line:

> Git does have problems, we’ve had processes fall over on projects, many times actually, but it’s always solvable.

Every team that I've worked on would replace that tool with something more stable if it existed. It does for version control, and it's Perforce, which comes with a hefty license fee, and a _different_ set of problems.

The selling point of Pipetrack is the commutativity, but to echo the comments in those threads, I've never found myself wanting commutativity. In games I want a mainline branch, support for large assets, granular access controls, a performant shared global system, and a method to cleanly differentiate between WIP changes and ready changes in a way that works with non-technical users, and won't be the highest individual line item per-user subscription we pay for. Unless the selling point of your tool fixes one or many of those problems, it's not going to help in games.

Honestly, I think we'd be better off on SVN than git most of the time, but the _tooling_ around git is far superior.


Yes Please. I would switch my studio today if this existed.


Hi, why the switch? Purely cost? Or admin overhead?


Yes, to both. P4 itself is solid, but it's a very chatty protocol and is very latency sensitive. Running a master in the us, with clients in europe is painful for everyone involved. Replicas and edge servers come with other tradeoffs too.

As a developer, doing things like "I only want this subtree of the stream" is hard. Virtual streams exist, but they have a (non-negligible) overhead on the server. It has some quirks due to it being 30 years old which make it... interesting, to work with sometimes.


Agreed. It's also expensive, and support has been lacking. We'd much prefer something we could maintain and fix ourselves.


Of all the complaints I have about p4, support is actually one I'm ok with. They've pretty consistently helped me fix issues over the years (plenty of which are bugs on their side that they have fixed after I raised it). Their sla is good, and their engineers are usually good at troubleshooting.

They are wildly, wildly expensive though.


That totally depends on where you put value. We're a fraction of other companies' license costs, for comparison. Also, we have new pricing for our new Helix Core Cloud SaaS product!


Yeah. New pricing, but not cheaper. Don't get me wrong. It's a fine price to pay for a studio going strong. But there is no way I can use it for starting my side-project indie game.


> there is no way I can use it for starting my side-project indie game.

I am not a defender of P4, but for some reason I'm defending them in this thread. It's free for < 5 people [0] if you want a side-project indie game.

[0] https://www.perforce.com/products/helix-core/free-version-co...


Yeah, but afaict, not their new cloud service.


Lightweight branching is coming "soon" and will help this exact issue. Contact me for more info.


> Being deceptive about Git's shortcomings

Are they being deceptive? I'm not sure I see it.


Not deceptive but I would say they're being hand wavy.


The story about irrevocably losing data at least was not true (already admitted by the authors).


I think I would give them that one, they were only able to recover the data because they got lucky, not because the system was designed to not destroy data. I would go even further coming from my infra background, even if you don't truly lose production data, if you have to reach into your DR backups that's a failing of the systems in place that come before it.


I'd say git is deceptive: "checkout" doesn't mean "delete my stuff" any more than checkout a book from the library means throw it in a wood chipper.


> a data scientist accidentally destroyed a month’s work of his team by using the wrong Git command (EDIT: we were eventually able to restore from a non-updated repo clone, after a few hours)

This actually reads like a benefit of using Git. It's really, really hard to lose something completely in Git, because of reflogs, because there are multiple people on your team who each have a copy of the repo, etc.


For sure, but that requires a lot of expertise and leaves the non-experts free to shoot themselves in the foot...


You are not competing with Git, you are competing with Perforce.

(Personally, I would never use a proprietary cloud-only offering for version control.)


> (Personally, I would never use a proprietary cloud-only offering for version control.)

yeah, I have no idea who would want to replace a free, open, well-debugged, featureful, ubiquitous VCS with a closed, immature SaaS offering... if your source code is valuable, you don't let a YC startup gate-keep it (sorry)

git has terrible ergonomics for newcomers, but UI is exactly what the various git forges solve

for experienced devs, git issues get internalized like the issues with every other tool

a more realistic approach is the jujutsu stuff google is working on...works WITH git to create a better workflow


Disclaimer: I am designing a Git alternative too.

Maybe "cloud native" will have a pull for game companies, but I am not so sure. I think a lot of studios would want to self-host.

"Git compatible" is an interesting phrase; does Diversion use the same type of backing store? If so, I am not so sure it will handle large files as well as hoped.

I had to solve this problem myself, and I did, but it required a different storage design.

Can it handle binary files? Is there a plan for doing so beyond "commit the entire file every time" or "use xdelta"?

I think this is how fourth gen version control systems will be defined. Game studios have a lot of binary assets, so this will be important.

All in all, I could see a product like this succeeding, but I think taking VC money was a mistake because there may not be enough space in the market for a company that has to keep growing to satisfy investors. I am taking zero VC, and that will allow me to make money on as few as three clients.

Anyway, I wish you both the best of luck!


> I am taking zero VC, and that will allow me to make money on as few as three clients.

A VC would give you legitimacy to help you get the customers you need. What large customers are going to back you with their business unless they know there's some deep pockets behind you? There are few things more valuable than a business's source code, it encapsulates all of their business processes and is how all of the business's data is accessed.

Why would a company want to entrust you with this? How are you protecting against data loss, and the legal liability that comes with this responsibility (insurance, legal)?


You make good points.

My VCS is designed for self-hosting, not cloud.

I will support customer installations, not host customer source code.

I am hoping that customers see less of a need for deep pockets in that case.


This is a tried and true method of bootstrapping a SMB. Sell the bits in a box with support until you are big enough to sell the service (or don't).


I mean, insurance is cheap. I have professional insurance allowing me to personally do several million in damages due to a mistake. Or recover your losses if I give bad advice. All for the low cost of 50 bucks per month, purchased through a local dev-co-op.

Trust isn’t built by who backs you though. Does it make it an easier sell? Maybe, but as a buyer, if that’s what you’re leaning on to sell to me, I’m going to be put off.

If it comes down to two, I’d trial both and care about how well it does before caring about who is backing who. In the end, I might choose the smaller company and negotiate access to source code to hedge against them going under. And that sounds like an even better deal than some VCs trying to “monetize the fuck out of me” in three years.


Oh, how might I find that professional insurance? Even though I am not going to host, I'd like something like that.


> Maybe "cloud native" will have a pull for game companies, but I am not so sure. I think a lot of studios would want to self-host.

Cloud native and self-hosting are not mutually exclusive.


They are in this case.

…and, frankly, they are in most cases. Most “cloud native” apps are designed to run on cloud services like VendorHere cloud storage, cloud functions, cloud containers, etc.

Vanishingly few of them are actually self-hostable.


Can confirm, a lot of studios want to self-host.


Thanks, good luck to you too! Would love to check it out! Diversion has totally different storage that handles binary files with no issues, but the same concepts as git - branches, commits and tags. It has a Git sync feature that allows syncing commits between Git and Diversion repos. Kudos on bootstrapping the product, it's definitely not easy! Version control is much harder than it seems, as you've probably found out already :)


Do you have more documentation somewhere? If so, I’d suggest making it more easily accessible. I couldn’t find any on the site and the support link just takes me to discord. I’d be more inclined to sign up if there was some documentation I could read through to get a sense for requirements, setup, branching, CLI commands, etc.



Thanks for flagging that! Just updated the site and forgot the link. It should be there now! (You also have an intro video and in-app docs, but there's more work we need to do there)


> We’re planning to release it as open source once the code base matures

FWIW, historically this has meant "we'll release it if we fail". If the product has actual market value and is attracting investment, you'll never win that fight with your backers unless you make the decision preemptively, before they write their checks.

To be clear again: I don't doubt your sincerity here. I'm just saying that by the time the "code base matures" it won't be solely your decision to make, and that I don't trust the other stakeholders.


> In our previous startup, a data scientist accidentally destroyed a month’s work of his team by using the wrong Git command.

I'd like to hear about how this happened. No one in the team heard of reflog?


> I often wondered why Git is so difficult to learn, compared to other tools.

Yes! This is something many people wonder -- other than those who love git :) I used to love Mercurial and I still mourn its mostly-loss. So I welcome a new DVCS system that is friendlier than git.

> Diversion’s code is managed on Diversion!

This is a good sign.

> can synchronize with existing Git repositories (each new commit in Diversion goes into Git, and vice versa)

Can you expand on what this means please? Does it simply mirror, or is it feature-for-feature compatible? Is it a backup capability?

> still using legacy tools like SVN and Perforce

I won't argue re SVN, but Perforce? It's used in the game industry primarily for its excellent handling of binaries / large binaries. How well does Diversion handle that kind of thing -- multigigabyte data sets, frequently changing?

The site says large files are fine, but that's too vague for games, IMO. Large, frequently changing, binary-not-text hard-to-diff files?

Edit: one final question: why cloud only? Why not software that can be locally hosted, or hosted by (other) service providers? What if I love Diversion and want to run a Diversion setup on my own Linux box in a cupboard?


The difficulty of git is that it can do so much in so many different ways. What's actually needed is something like a linter for git workflow, where an org can enforce an opinionated subset of git's capabilities in a prescribed order of operations.
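
A very rough sketch of what that enforcement can look like today is a server-side pre-receive hook; the branch name and policy here are purely illustrative:

    #!/bin/sh
    # sketch: reject history rewrites and deletion of main on a self-hosted remote
    zero=0000000000000000000000000000000000000000
    while read old new ref; do
      [ "$ref" = "refs/heads/main" ] || continue
      if [ "$new" = "$zero" ]; then
        echo "deleting main is not allowed" >&2; exit 1
      fi
      if [ "$old" != "$zero" ] && ! git merge-base --is-ancestor "$old" "$new"; then
        echo "non-fast-forward update of main rejected" >&2; exit 1
      fi
    done

Hosted forges expose the same idea as branch protection rules, but the point stands: the "opinionated subset" has to be bolted on rather than being part of the tool.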


Right now we're trying to get it to users who prefer something that just works, and don't want to think about hosting. But you can actually run Diversion in a container (that loses the distributed storage and DBs, which means it won't be as scalable). Other providers would definitely be great, if we succeed in standardizing it like git.


Thank you for this (and the other!) replies, I much appreciate your involvement in the thread!

Answers like this also make me feel more confident / interested in the tech.


The Git sync feature allows one to import an existing git repo (currently GitHub is supported) into a new Diversion repo, and keep both in sync: every commit into Git is imported into Diversion and vice versa.

This allows a member of a team that works with Git to try Diversion, to keep backups, and to use GitHub Actions or other CI tools that work with git.


Perforce is amazingly scalable, but that comes with a price (literally and figuratively :)) We still can't handle petabyte repos like P4 does, but we'll get there. But we can handle very large and frequently changing binaries pretty well.


As this is a startup, the competitors are GitHub, Bitbucket, GitLab etc. and not the underlying technology. That is a means to a goal, making money or most probably a lucrative exit.

> [git] was built for [...] much smaller projects,

It was built for the Linux kernel. There are larger projects than that but how many? Of course if you manage to make them switch to Diversion and get paid for that you could be well off even with a handful of customers.

About technical matters: as a company I'd be very uneasy having all my code only in the cloud on somebody else's servers. At least with git there are dozens, hundreds, thousands of copies of any repository around the company. If GitHub or AWS close our accounts we can keep going and move somewhere else.


Thanks for the feedback! We're offering a private cloud option for large customers, from many conversations this is OK for most of them. And adding a daily backup to any location is actually very easy.

> At least with git there are dozens, hundreds, thousands of copies of any repository around the company.

This is actually a huge issue for large companies (data breach), that doesn't have a good solution with Git.


Well, that's risk mitigation.

However if a file touches a developer's machine it can be breached from there unless they do like banks with sensitive information: no internet, no USB, no anything. Kind of impossible to develop like that.

So... IDEs in remote desktops and local browsers to lookup for documentation on the internet? But then you still have to protect against screenshots of code and OCR. Or photos. You can't defend against everything because developers must at least read the code. It's like DRM for music and movies. There is always a loophole.

So, mitigation. If your customers pay for that, good for you. About me, I prefer a local/remote repository for my projects. My customers decide what they prefer, currently git but they don't have many alternatives.


I agree that the "project size" argument doesn't make much sense, however size as in file size does. There are multiple methods to add support for large binary files in git, but none are great.

With their focus on game developers, I imagine that's one of the primary use cases.


Your core feature is live updates of changes. That is actually a nightmare. I don't want live updates. It will be a mess.

If you really want to do cloud. Use git.


Thanks for the feedback! If you're on a separate branch it doesn't update from main automatically. We're also thinking to make live update of main optional as well, this is actually one of the most debated features internally. Apparently some users like it and some really don't.


> cloud-native version control

Just how "cloud-native" are we talking here? One chief selling point of git is that you are not reliant on a centralized server. You have remotes, yes, but if one of them goes down, you still get the entire project history.

I am of the opinion that the software world needs to find something better than git. One of my major pain points as a lead is the inevitable, yet unpredictable, git questions I will be getting from new juniors. But "cloud native" sounds like a poor starting point for "better git". Not needing external servers for VCS is one of the things git gets right.


> Just how "cloud-native" are we talking here?

So we're entirely serverless, using distributed cloud storage and DBs. Basically Diversion is up as long as you don't have a major AWS outage (happens, but rarely).

> Not needing external servers for VCS is one of the things git gets right.

I agree in general. But the way most devs use git today isn't really decentralized; everything is going into and out of GitHub/Lab, and the more often the better, because of CI and merge conflicts. So I wonder if having a VCS that is decentralized in theory is really that important - taking into account the upsides of building in the cloud (scalability, distribution speed, collaboration etc).


Perforce seems to already fill this niche pretty well, at least in the gaming sector which this startup seems to target.


Totally right, perforce is used by almost all large game studios. But it's painful to use for smaller studios and indie devs - it requires managing your own server with configuration, backups, networking etc. It's also not great for cloud based workflows and remote work, and super expensive with rigid licensing.


Most of the time we need to host our own servers for licensing reasons; you might need to consider an "on premises" option if you want to be a viable option for a lot of studios.


This sounds like a classic innovator's dilemma style division in the market.

The incumbents need to cater to existing customers with a need to host their own servers for licensing reasons; but this forbids them from using cloud native features in the core of their products.

There may well be space for a newcomer to make a cloud only product targeted at the subset of studios that have the legal ability to use the cloud. Instinctively, there may be some useful features around large files which are possible in the cloud but impractical in an on prem environment.


My understanding from reading many of OP's responses is that "the cloud" may be vendor-locked to AWS. There are serious issues with hosting your code on servers most likely owned by a competitor: e-commerce, video streaming, games, AI, video production, electronics, retail, pharmaceuticals, logistics, publishing, etc. They literally picked the worst vendor to build this on.


Why the worst? Microsoft is a more serious competitor to game studios, with Xbox Game Studios and now Activision Blizzard. (And yet they do use Azure.)

You definitely have a point regarding vendor lock-in; we've planned for this and Diversion will be able to run in any cloud and on-prem in the future (we are running it in containers now).


Worst due to spread. Almost any industry Amazon has a hand in means you might run into issues selling in that industry.


That's interesting!! Does the license prohibit any use of cloud to store the assets? Or private cloud is OK?


Our recently released Helix Core Cloud actually doesn't require managing your own server, we'll do that for you, as it's a SaaS using flexible licensing: https://azuremarketplace.microsoft.com/en-us/marketplace/app...


It does, but perforce is a horrible slow beast, and not as reliable as you might expect.

That said, for large game projects, it really is the only viable option.


One only uses Perforce, well, perforce.


Perforce is painful, doesn’t have good cloud offering and does not evolve


We have quite a few cloud options actually, with our Helix Cloud option just releasing last week! https://www.perforce.com/perforce-and-cloud


The cloud is just a crappier version of a datacenter built around the notion that you can make users captive to your services and make them pay hardware for 3 times its price.

It does not lead to better source versioning systems.


I just left a train, committed several changes to several repos, and didn't have connectivity. Nope, I don't want this.


Quick note to say that discussions like this are asymmetric: the git aficionados are very sure about their opinions that it's fine. The people who believe git isn't great are unsure if perhaps they're holding it wrong or don't understand how to use it. So what you see in a thread like this is 90% posts strongly asserting that the OP is wrong while the potentially large number of people on the other side are scratching their heads, not posting.

fwiw, I think they're pretty much right.


I think this is how most git stuff goes as well.

> fwiw, I think they're pretty much right.

Is 'they' the strong posters or the head scratchers?


I'm having a hard time imagining the positioning here. Can you explain further how Diversion differs from DVC and Git, and when using Diversion makes sense over other options? The GTM is slightly confusing to me (also yes - Git is hard - you cannot teach data scientists this. It'll take months).

Also agreed git is terrible right now for version-controlling workflows in AI (I have a fairly large .gitignore file with S3-hosted things even for my NextJS + FastAPI apps - pain in the butt).


The vast majority of version control system uses are not distributed, even if the system itself is (GitHub and BitBucket were born to essentially make Git centralized). An example use case is game studios having repos with very large histories (hundreds of GiBs and more) where the tip is significantly smaller. Having the entire repo history on your local machine might be infeasible, and is usually unnecessary. Being able to get just the tip and fetch the rest via API calls solves this. Having things continuously synced has other benefits, like preventing conflicts at the time they happen on files that are hard to merge, such as game scene files, graphics etc.

AI workflows are definitely a use case we are looking at in the near future. What types of files are you hosting on S3?


> An example use case is game studios having repos with very large histories (hundreds of GiBs and more) where the tip is significantly smaller. Having the entire repo history on your local machine might be infeasible, and usually unnecessary. Being able to get just the tip and get the rest via API calls solves this.

"Being able to get just the tip" is `git clone --depth 1`, isn't it?


And then you lose functionality, for example `git blame` depends on the history being available locally. If you want a working repository with all the source control features you need a regular clone. That's where "being able to get the rest via API calls..." kicks in :)


OK so `git clone --filter=blob:none`, then. That downloads the tip and commit history, but no historic blobs. `git blame` then works by downloading missing blobs on demand, which doesn't sound too different to making an API call.
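
For anyone following along, the two variants look roughly like this (the remote URL and file path are placeholders; partial clone also needs server-side support):

    # shallow clone: only the tip commit, no history
    git clone --depth 1 https://example.com/big-repo.git

    # partial clone: full commit history, but blobs are fetched lazily on demand
    git clone --filter=blob:none https://example.com/big-repo.git
    cd big-repo
    git blame some/file.c   # pulls down only the historic blobs it needs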


Which, yes, but now we're at the "instead of using Dropbox I would just rsync to my Linux server" stage of it being a product.


1. git is not hard if you learn it

2. I regularly see people storing multiple gigabyte files in git.. I don't understand your issue with large files.

3. cloud native? why? Part of the point of git is to have the repo decentralized away from centralized clouds and onto devs' machines. If your data scientist broke git, that means your main branch configuration was off, no one else had it cloned onto their machines, and the rest of you didn't know how to recover the lost data, which is stored in the reflog... even ignoring the lack of simple CI/CD.

4. > Diversion is built on top of distributed storage and databases, accessible via REST API, and runs on serverless cloud infrastructure

What!?!?!? That is NOT a selling point....all that for a VCS?

You're literally taking everything that makes git great and saying that's what's wrong with git.

The only case you could make to sell this is maybe some sort of a high-end niche specific media heavy version control for storage intense users.

I would think that this is a college project but apparently you guys have a team size of 9 according to the website!!!

This is wild!

It kind of feels like an insult to Linus Torvalds.


> 1. git is not hard if you learn it

I basically agree. People want to just pick up a tool like git and have it just work without investing much time in it. I can understand this desire in a complex world with many tools, but something as powerful as git is really worth learning. I swear if people spent like 1 hour a day for a week in actual git training, and then did refreshers 1 hour a month as they were using it on a team, this would be less of a problem.

You don't need to know every feature, but knowing how to rebase, use the reflog, use the pickaxe, understand the concept/value of bisect and how to look up how to use it, etc. is so valuable.
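
For the unfamiliar, those look roughly like this (the function name, tag, and commit count are purely illustrative):

    git reflog                      # every position HEAD has had, even after a bad reset
    git log -S"someFunction" -p     # the "pickaxe": find commits that add or remove a string
    git rebase -i HEAD~5            # interactively reorder/squash the last five commits
    git bisect start
    git bisect bad HEAD             # the current commit is broken
    git bisect good v1.0            # last known-good tag; git binary-searches in between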

The other day a professional developer told me he didn't know you could write more than 1 line in a git commit message. WTF.


I don't even use that stuff 99.9999% of the time.

You can literally get by with knowing like 5% of git and be extremely productive with it.

That other stuff does come in extremely handy for troubleshooting or fixing mess-ups, but people struggle with just the very basics of git.

What you need to be productive with git can be learned in a morning.


This comment is harsh, but it's the reality.


I don't think you understand the scale of large creative projects like games. The current (only medium-sized) Unreal Engine project I'm working on has about 300k files in the head alone, and a fresh sync of just the stuff required to build/cook the game is about 350GB. If you add the raw content from things like Maya, Substance Painter, ZBrush, etc., it is well in excess of 1TB.

Vanilla git just does not scale to this. Microsoft's git fork and Scalar get much closer, but require a bunch more setup and are a giant footgun if someone tries to use the repo with vanilla git. Add to that the lack of permission controls and remote management, and it is just not a good fit for an industry where, say, 75% of the contributors are non-technical.

Git is a fantastic tool and works for the vast majority of software projects, but there is a non-trivial number of projects out there that it just doesn't suit.

Perforce, as horrible as it is, is the game and VFX industry default for a reason - think of it this way, given how notoriously cost driven and penny pinching games companies are, don't you think they'd be using a free/cheaper alternative in hosted Git if they could? But they don't. This is why.

Good luck to the Diversion team, more competition in this part of the market can only be a good thing.


How niche is your use case?

Despite all the questions I have about storing unversioned large media files in Git instead of EC2, your lack of good practices around modularizing the code base, and the lack of build tooling... that's the use case I specifically stated in my post.

The monstrosity that is Diversion could have great benefits for niche use cases like yours, i.e. large-media-file version control for teams who aren't knowledgeable about how to manage large code bases.

If for some reason you want to pay an additional nine person team with a huge infrastructure instead of just using EC2 and a build script..feel free.


EC2 and a build script? Are you implying that the parent's 1TB of assets are "un-versioned large media"? If so, I'm sorry, but you are being very dismissive about something you don't fully understand.


well then modularize your code better!

git can handle large repositories; there's no reason you should have a one-terabyte repository unless you're Google or something, with the money to write your own VCS.

but if you want to pay for a nine-developer team with a massive cloud infrastructure for something like version control... go for it!!!

I'm going to check back a year from now and see if the product still exists


sigh So, in game dev, art assets are as important to the final game as code, and are just as important to version control.

And sure, you can split it up between many different repositories and call it modular. But you're just adding complexity, and not really solving the core problem, which is that a single version of an art asset can easily be > 1GB, and be in a format that delta-encodes poorly, so each revision can be nearly a full duplicate. (Yes, we can talk about the file formats being a poor fit for VCS, which is true, but gamedevs are just trying to make their product.) And again, it's really important for game assets to be version controlled. And maybe Git isn't the best place for the assets to live under version control, but we are in a thread about Diversion and not Git.

Now, don't get me wrong, I'm uncertain that Diversion will exist in a year when you check back, but I can guarantee you that Perforce will, and if Unity doesn't continue its trajectory of destroying everything it touches, Plastic should continue to exist also.

I don't even completely disagree: Diversion, P4, and Plastic appear to be a bad fit for code in comparison to git, and even in our studio we've taken a modular approach (gasp) and store our code in git, while using a better-suited VCS for assets. But it doesn't change the fact that you are being very dismissive about something that you very clearly know nothing about, and clearly have far too much ego to even be interested in a legitimate dialog about it.


It's not rocket science to know that if you have a 1TB git repo filled with massive binary files that all need to be version controlled, you're doing something VERY wrong or you are using the wrong tool.

Literally the very first result on a Google search says almost every major game studio on earth uses Perforce for this specific problem, because Perforce allows you to check out binaries on a file-by-file basis and mark them with a changelist. This appears to be a solved problem with numerous solutions.

There's also git-annex, mentioned in the thread, and a number of other options.

I love that you're saying I have a big ego, but you're doing exactly one of the things I'm suggesting in your studio. lol

This Diversion seems doomed if they're competing with git, but if they're competing with perforce, that's a different story. They're probably still doomed, but at least their product fit makes sense.

https://www.reddit.com/r/gamedev/comments/2xc5fx/whats_a_com...

https://git-annex.branchable.com/


> Perforce, as horrible as it is, is the game and VFX industry default for a reason

...

> or you are using the wrong tool.

You were literally responding to a comment talking about how git isn't suitable for this problem, and seemed to be suggesting that ec2 and a build script should be sufficient...

Edit: on a second full read, it's really unclear what your point was in the first place, and perhaps you were in agreement with the original poster and me all along, and just being confrontational and hostile for no reason?


Git works fine with petabyte-scale projects and handles 100+ gigabyte files with some modest scripting.

For one of the projects I oversee, a typical file size is about 8GB and a handful are about 140GB. What we version with Git is a list of the files in Ion (a superset of JSON), so the 8GB files aren't "in" Git but just referred to via hashes by what is committed in Git. The files themselves are accessed via CephFS and rsync, and we have "server side" hashing of them via SSH. All of the relevant scripts are in Git hooks.
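
As a rough sketch only (the paths, manifest format, and hook below are hypothetical, not our actual scripts), the idea is a hook that records hashes in a small committed manifest while the payloads stay outside Git:

    #!/bin/sh
    # hypothetical .git/hooks/pre-commit
    manifest=assets.manifest
    : > "$manifest"
    for f in assets/*; do
        hash=$(sha256sum "$f" | awk '{print $1}')
        printf '%s  %s\n' "$hash" "$f" >> "$manifest"
    done
    git add "$manifest"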

Someone (external to our team) once commented that what I just described is "too complicated" and "not friendly for developer speed". Yeah, well, we don't hire stupid people so.. "thank you for your interest in this opportunity, we have decided to pursue other candidates at this time".


Interesting! So you basically implemented something similar to git LFS, correct?

Can I ask what your use case is, and why not use LFS instead?


I just looked at git-lfs and yeah the approach is pretty similar. There may be a good reason either way, as far as I know it hasn't been explored.


Having worked with SVN a decade ago and Perforce more recently: that part of the market is waiting to be disrupted. I'm a little unsure whether it was actual technical reasons (vs cultural) that kept git out of those use cases. Many devs were working with git locally and using git-svn or git-p4 to interact with the local repo. Best of luck!


Thanks!! Having talked to lots of game studios (and other companies with large repos/files) - many of them tried to switch to git because devs wanted to, and failed because of technical limitations.


Indeed, the choice for me as recently as 1-2 years ago was still SVN vs Perforce, despite only having ever worked with Git.


There doesn't seem to be any justification offered on why it's not yet open source. Not that one is required, but it is suspicious that such a commitment isn't a priority, and that maybe there is a desire to keep closed-source as an option.

Personally, I'd not want to check any assets into such a tool before it becomes open-source.


Open sourcing code can require a lot of resources to do it right (manage the community, handle licensing etc). If you don't have customers asking for it and you don't need it for customer acquisition it might not make sense.


Licensing: just choose a good, safe default: Apache 2.0 or MPL 2.0.

Open sourcing does not mean you want to, or are going to, accept external contributions; just be explicit that you don't accept them.

There are several open source projects that are developed with an explicit no-external-contributions policy. Nothing new to invent here.


> Licensing: just choose a good, safe default: Apache 2.0 or MPL 2.0.

No, this is how you get seriously annoying divergences in products like this. GPL 2/3 would be 10/10. It would also protect the company.


Can I offer a suggestion - focus on non-code developer use cases. It is still wildly difficult to integrate a VCS into a document-based application, yet doing so could improve many, many engineering and science applications that rely on text files (simulation configurations, pre- and post-processing setups, semi-manual-entry data-backed "dashboards", etc.). This is a problem that really needs solving, and I don't see anything out there that makes it approachable.

Having built this myself (hacked on top of Git) for an engineering analysis platform in the energy space, I'd say a good UI (even simpler than GitKraken) and a "just works" integration would be a massive win.

Just my 2 bits.


Thanks for the feedback! We actually encountered a similar use case: SaaS tools that need to version user-generated data and configurations. We're offering an API so this can be done easily. What you suggest could also be done. I'm not sure it's a huge market though; meanwhile, lots of developers who can't work with Git (e.g. games) are looking for solutions, and we're seeing a real need there. But we might branch out to other things, for sure.


I think this is a great approach. Just looking at the comments here - I'm not sure how many devs would jump for a git replacement.

However, anything that solves version control for all of the non-code cases is VERY compelling.


Ok, I'll bite.

Looks interesting. My first thought, as a gamedev who has been in charge of selecting a VCS at multiple studios:

1TB is not a lot, and I would really appreciate clear public pricing terms on data before committing to a VCS, especially at the smaller indie scale.


Continuing:

> Yes! It can integrate with any CI tool via Git compatibility, or using the API. Talk to us for more details.

You really need to provide a comprehensive CLI targeted at CI/builds. Git really is going to be a bottleneck for games, and most teams aren't going to want to deal with the hassle of manually implementing differential updates and multi-part downloads via REST.

Secondly, I watched your video, and dug through the website, and saw no mention of what your Windows story is. As a gamedev I'm going to need some assurance that your Windows support is top-notch.


Windows is supported, as are Mac and Linux - we should mention that, you're right. Most of our users use Windows, so it's definitely first priority.

Thanks for the feedback regarding CI - I'd love to hear your specific needs, as we're working on this right now.


I have a lot of thoughts on VCS and CI, feel free to reach out: indy@telltale.com


Hi, sure thing! You can get extra storage at the same price as the Free tier (we'll make it clearer on the website).


Ahh, I see it... Being in the free tier, it also just looks like a feature. I might pull the price/100GB/mo into its own cell to make it clearer. It is a pricing page, after all.


Done!


A little feedback on the "how is it different from Perforce" section of the web page.

There have to be more advantages than "it's in the cloud" ...right? Also, for many this is potentially a huge hurdle / disadvantage. Who are you, and why should we trust you with our precious IP / code? (many game studios are on-prem very intentionally)

Personally, I'm currently (being forced to) use Perforce and I've learned to tolerate it, but wish we could use git. There have to be more things that make your offering better than Perforce and you should really highlight them.


For me, the absurd size of data in the game I worked on - terabytes of video, mocap data, and assets - made the cloud hard and expensive. Cloud makes me think of all the time I'd be blocked on network uploads, and of my employees' poor data caps.


Thanks for the feedback! Transfer is a challenge for sure. Can I ask how you do remote work with on-prem?


Something of note, is that Perforce Helix does offer a Git interface, with Perforce Helix on the backend. It's something I'm looking to explore at my studio to better understand the tradeoffs there.


It's called Helix4Git and you can mirror (and other options) your git repo into Helix Core using a "graph" depot type.


The advantages are that it's much easier to set up, manage and use, and less expensive. You're right, the FAQ should reflect this better.


This looks really cool! Well done! Definitely looking into this for my startup. Currently we use a mix of Git and Perforce, but Perforce is a pain to maintain and very difficult for the non-coder on our team to work with for art assets.

When we started I looked at Plastic SCM (owned by Unity)—looks like it targets a similar use case to Diversion. I honestly can’t remember why I moved away from it—I think it was the lack of polish and capabilities vs git/GitHub. I’m curious how you compare yourselves to them.

I really like the bidirectional git sync you mention so I can try it gradually. Pricing seems good.


Thanks!! What do you work on?


[Skyglass](https://www.skyglass.com), real-time Hollywood VFX on mobile.


> The biggest drawback of Git is its limited scalability - both in repository and file sizes, and the number of concurrent users.

How did you come to the conclusion Git isn't scalable in the number of users? There is no limit on the number of users with Git. There may be limits on the number of interactions between those users, i.e. pull/merge requests, clones, fetches, etc. But they are mostly limited by humans.

A central system, even a cloud-native system, has far more contention points in operations that are not even interactions between users, e.g. status, commit, etc.


The number of concurrent merges is definitely a problem. With a few hundred developers committing to the same repo all day, it becomes a chore to make sure that their commits can still be put on top of each other, and that one out-of-sync commit is not holding back a bunch of others which depend on it.

This all is solvable, both through discipline and tools. But if a VCS has a built-in capability to alleviate this, it's a good thing.


True, that is one of those interactions between users that I mentioned. But the number of people having the code checked out and poking around its history is only limited by the capacity to clone the repository. All other operations are local.

Having hundreds of committers merging into one branch is problematic. Perhaps splitting the repository into smaller parts does have its advantages. Monorepos often have the architectural split of the software while throwing away the advantages of that split at the VCS level. But there is also other large software, developed by hundreds of people, that cannot be meaningfully split into separate git repositories without messing up versioning and releasing. The Linux kernel is such a system. They use a loose network of repositories -- mostly a hierarchy -- to pre-aggregate merges. Git is well equipped to handle different approaches -- separation of mechanism and policy.

I am looking forward to learning how a central but cloud-native VCS will improve on that.


The solution to that problem is a merge queue and GitHub supports that now. I do agree that it would be nice if the VCS solved the problem natively, but for many companies GitHub — and not git — is their VCS.


Exactly.

The number of companies that implemented their own merge queue is too damn high.


Nice. I make music and use git for my files, but often hit the usual space limitations and I hate dealing with git-lfs. I think there's a big use case for Diversion and music/video production.


I assume the space limit you are talking about is really a GitHub/GitLab/Bitbucket limit.

May I ask how you use Git for your music? Since binary diffs are not really a thing, the main benefit I could imagine is simply having revisions of your files without you having to create folders or files named "track01_test_final2.flac".


I literally just `git add .` and let her rip. It's not pretty, I just git pretty much everything in my life.


Nice! We didn't even know about that. Thanks!


Yeah, collaboration in music and video production is pretty unsolved. It often involves sharing huge files in GDrive/Dropbox, lots of project_latest_january_version_3_final_modified_updated.zip etc


Final_version_4_really_final, we've all been there :D


Did you try git-annex instead of git-lfs?


Congrats on the launch! It's always exciting to see more competition in the version control space.

One question I have is whether you guys are better than:

https://desktop.github.com/

This seems to do the exact same thing, be free forever, and have a more mature GUI that is also easier to use than regular terminal git. In my firm, even people who don't know how to code can use GitHub Desktop (since it babies you through the process of committing code).


Thanks! I like GH Desktop as well, as a matter of fact our Web UI is a bit influenced by it :) The difference is that GHD is a GUI for git. It's quite good in hiding some of the complexity, but if you get a git error (like a diverged branch) you still need to troubleshoot it. Diversion is completely different. 1st of all it's far less complex, without local branches, staging area, etc. It also syncs your work in progress to the cloud in real time, alerts users about potential conflicts, handles large files without extra configuration, etc. Feel free to try it! (It's also free forever for small teams).


Congrats on the HN launch. How does this improve on, expand on, or blow git-lfs[1] out of the water? Because if I needed large-blob support, that's what I would use instead. It pushes pointers to the big files to the hosted git instead of pushing around the binaries themselves -- though I am speculating, since I've not used it myself, just read about it online.

I mean the nice UI and collab features are indeed improvements but I'm thinking more core git specific improvements.

[1] https://git-lfs.com/


Git LFS solves the problem if you need to have occasional large files in your repo. It doesn't work great though when you have a lot of them or when they're an integral part of your product, because it's slow and introduces devops problems (e.g. you can bomb your repo by committing from a clone without Git LFS installed). Companies that have many large files to manage rarely use it because of this.
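
For context, the standard Git LFS setup being referred to looks roughly like this (the file patterns are illustrative); the failure mode above happens when a clone skips the install step and commits tracked files anyway:

    git lfs install              # once per machine; sets up the smudge/clean filters
    git lfs track "*.psd"        # writes the pattern into .gitattributes
    git lfs track "*.fbx"
    git add .gitattributes
    git commit -m "Track large binaries with LFS"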


Okay. That’s fair and info I didn’t have.


I was interested enough to go look at your page. I didn't find any way to download a server package to run locally and immediately lost interest.


Sorry, currently we're only offering Diversion as SaaS. I understand that some may prefer to self-host, this might be an option in the future.


Your criticisms of git seem off-base, but I like the idea. I worked on some indie game dev teams in college where I was the only person with any programming experience. Git was difficult for my teammates to wrap their heads around, although they did figure out Github's GUI. A more non-tech-person-friendly tool would be nice to have in the space.


This sounds like a solution in search of a problem. I’ve never encountered any of the problems suggested with git in this post.


> We’re planning to release it as open source once the code base matures,

If it's not open source, I'm not touching this. The one thing Git has over all these alternatives is that if GitHub dies, Git is still around and is GPL.

If you really want to be taken seriously, please make sure it's open source with a copyleft license.


Point taken! Why copyleft though, and not MIT/Apache?


If a competitor to you comes up, and starts building a server around this client, they can start diverging from this client without upstreaming these changes.

It protects you, future users, and makes this protocol live beyond the age of any single company!

It's arguably why git hasn't gone away despite its shortcomings. And while it has been commercialized, it hasn't significantly diverged.


IMHO the only really interesting alternative to Git currently is Pijul (https://pijul.org) as it is not a more-or-less Git clone but a different approach to the problem itself.

Pijul allows for very interesting development and ci/cd workflows.


Rather than "Better GIT", it would make sense to position it as "better GIT backend" in my opinion. I get the desire to replace GIT bash but it might be harder to convince developers, who is your target audience.


Yes you might be 100% right. We were debating this a lot in the beginning. In the end we decided to build a separate VCS with Git sync, and simpler UI (for now at least - might change, depends on user feedback)


Cool! For my own edification: how did you validate this idea before building? It sounds like the kind of thing you would have to convince people they need. Or am I wrong and people were telling you they needed this?


Like any validation - LOTS and LOTS of user interviews. We found out that general software devs are mostly ok with git (many of them hate it, but they manage). They are interested in the real-time collaboration features we can offer, but those are not enough to become first users before there's a good ecosystem. Meanwhile, in game development some are desperately looking for a better VCS. This is why we're starting there.


Is my company's private information e.g. source code kept end-to-end encrypted?


All data is encrypted in transit and at rest, as per the industry standard. End-to-end encryption is an interesting feature: it has obvious extended privacy benefits on one hand, but on the other hand it can prevent the system from providing other features. Depending on the requirement for end-to-end encryption, in some instances a custom NDA can be signed to mitigate a specific concern. It's a feature we haven't seen much demand for, but if that changes we may prioritize it higher on the roadmap. Would you mind sharing your use case or concern?


I think the question is "can anyone outside my company/org access my code?". Apparently the answer is yes, because the tool offers cloud collaboration using a web UI.


If you share your repository with a collaborator, they can access it, of course. The same is true if you share your repository with end-to-end encryption: your collaborator would be able to decrypt and use it.


So, the answer is No.


Good question! End-to-end encryption would mean that you can't run cloud CI, resolve merge conflicts in the cloud, or use other such features. To my knowledge, none of the existing VCS solutions are end-to-end encrypted (I might be wrong though!)

We might add end-to-end encryption in the future (disabling some capabilities), if there's demand for it.


> In our previous startup, a data scientist accidentally destroyed a month’s work of his team by using the wrong Git command.

Can someone explain how this is possible? More importantly, was there any git branching strategy, and were there permissions?


It could be done with force-pushes, if nobody has a commit number for the old tree.


Not sure; even then, doesn't the reflog keep it for quite a while locally on the machine the force push was sent from? Maybe he did not commit his changes for a month and ran a `git reset --hard`.
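
For what it's worth, if the lost work was ever committed, the committing machine's reflog usually still has it -- a minimal sketch (the reflog index and branch names are illustrative):

    git reflog                            # lists where HEAD has pointed, including "lost" commits
    git branch rescue 'HEAD@{5}'          # recreate a branch at the entry you want back
    git push --force-with-lease origin rescue:main   # restore the remote branch to the recovered commit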


Constructive criticism.

I would like to see much, much more in a demo. Probably around 20-30 minutes.

What was shown in the demo video was seemingly the functionality of Dropbox, shown working with one file manager and one OS.

I know it is more than that. I’ve checked your docs. You have branching models. You have cross OS support. You have a lot going on.

But that demo video really put me off. It doesn’t demonstrate anything that seems particularly useful to me, especially on the main selling point: version control. It spends most of the time showing automatic file sync, which (although not easy to do) is a basic feature of so many cloud storage platforms that it doesn’t have the wow factor these days. (It’s also a feature I find annoying and would want to disable, but it seems to be core to your approach so meh)

I know not everyone will jump immediately to the demo video to see it, but it’s what I did, and honestly… I’d scratch it and do a full intro video and put resources into doing it right.


Thanks for the feedback! Point taken.

The intro video is really just scratching the surface, you're right. We'll post more in-depth videos soon.


Taking a decentralized application like Git back to centralization is a step in the wrong direction, it's exactly why Git was created.


Decentralization is a cool concept, but didn't GitHub and BitBucket emerge because a centralized server was in demand? Git is a good tool for many uses, but when was the last time you pushed directly to a peer's repo on their machine? How many firewall and reverse proxy configurations did you have to setup to be able to do it?


It's kinda the standard procedure of securing your machine's SSH. I'd recommend trying to set up a bare repo and put it online so you see how easy it is to make it work. That's not GitHub, but it's more than enough for your org; in fact, before GitHub was created, that's how we did it. It doesn't really take more than an afternoon to set it up, multi-user and all the other jazz.
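
For the curious, a minimal sketch (the host and paths are just placeholders):

    # on any box you can SSH into
    ssh user@myserver 'git init --bare /srv/git/myproject.git'

    # on each developer's machine
    git remote add origin ssh://user@myserver/srv/git/myproject.git
    git push -u origin main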

But GitHub and Bitbucket aren't examples to compare Diversion to, though; GitHub is just a node in the decentralized network of Git repos (your users, or whoever cloned). They even use an example of this themselves: how they were able to recover after a mess-up, whereas with SVN that couldn't have been done and they'd really have lost more than a month of work.

That's one of the reasons Git was created: to make it easier to resolve conflicts. If you just want a main/child-branch structure on one server, then use SVN/Diversion, but then don't complain when you have a branch "locked" and you cannot get your job done.

Diversion makes 100% sense if you follow the silly "one repo for the whole org" way big companies do, which still doesn't make any sense to me.


I see this more as a replacement for GitHub: two proprietary repos in the sky. For a majority of GitHub users, GitHub is Git.


how is that different than Dropbox version history?


Plus, for a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.


Branches are one example! But really, Dropbox is not a VCS. Although some game studios do use Dropbox/Google Drive for versioning graphical assets - not because it's a good tool for the purpose, they just don't have a better one.


Wow, tons of classic hacker news feedback just like the "dropbox won't work" one [0]. And we know how that ended up.

With the trend towards more and more non-devs writing code with the help of AI [1], I think you're absolutely correct with your assertion that something safer and easier than git is needed. The rest of the business case makes perfect sense too (git struggling with binaries and large files, massive code bases at Meta/Google have required alternate tools).

Love the open source strategy as well. Github showed pretty clearly that selling services over an open core can be a winner.

[0] https://news.ycombinator.com/item?id=9224 [1] https://twitter.com/amasad/status/1659423752881586176?lang=e...


Thanks! I agree - there are lots of examples how simplifying something (Shopify, Uber, Airbnb) can dramatically increase the user pool to people who otherwise wouldn't even use a comparable service.


I was curious to follow your development, but the lack of RSS feed for your blog means I basically can't.


Ooops! We built a new site and now it's broken. Thanks for flagging! Fixing, it will be available here: http://www.diversion.dev/blog/rss.xml


Fixed


How does it compare with bitkeeper? Does it handle big binaries (digital negatives, game resources) well?


Honestly, I never tried BitKeeper. Diversion handles big binaries with no problems at all.


I don't feel this problem myself, but I'll support anything git-related. Good luck!


They should call it a new concurrent version system, not a git alternative.


With Git we can work autonomously, fast, and distributed. The nice thing about Git is that it works on a B747*, in a cabin in the woods, and on a file server at MIT.

I will not make myself dependent on a proprietary cloud-service which requires an internet connection and reoccurring payments.

* Internet isn't fast there. If it is available. If it is affordable. If you want to ruin the precious time separated from the internet.


i'm thankful there's work being done on improving version control. git is great but if you don't respect it, it will punish you.


Why not just use Perforce?


There are a lot of remarks in this post that git isn't that bad, or that "losing data with git" is at best an exaggeration and at worst a lie. Other remarks include "People know how to use git", "--force tells you here be dragons", and plenty of git internals by way of explanation.

The posts remarks about git ring true to me. I, personally, would have lost work on many occasions if not for IntelliJ having its own "commit log" of every change I made since opening the IDE. I, personally, have had terrifying moments where I thought I'd lost large amounts of work, and spent the next hour fearfully searching google for how to unf*k myself. And I know several friends and colleagues who have had the same experience, and some who were unable to unf*k themselves and gave up: resigning themselves to just recreating the lost work. The fact that their lost work is probably still somewhere on their main repo to this day is cold comfort I'm sure. "It works for me" where "me" is an advanced level of nerd (myself included) is in the same territory as "It works on my machine".

This also seems to have really hit a nerve. There wasn't just one post saying "Meh, git is easy to use, you're an idiot" and that was the end of it. There were immediately a great many responses along the lines of "well, it's not that easy" or "well, I've lost work", and a whole back-and-forth has ensued. The fact that there has been a borderline flame war here tells you that git absolutely has room for improvement. Most of the remarks in defense of git feel like "Git Gud" [0], a term used to heckle newbies at video games, yet rather appropriate here. I don't want my version control system to be like Elden Ring: arbitrarily difficult, with deliberately punishing pitfalls.

So, yeah, I'd love a better version control than git, and so would many others. I've had work saved from git by IntelliJ's internal commit log (where every edit is saved), so having that as a feature would be great. Being able to learn that someone decided to rename some class that I'm currently using heavily would have been helpful on many occasions (rare, but painful). Being able to pull someone else's work before they commit it, for review, would be very helpful. So would support for large binary files without killing my local machine.

Good luck.

[0] https://knowyourmeme.com/memes/git-gud


Thanks for the support! Reminded me of https://xkcd.com/1597/

I wonder why git became such a divisive issue, nobody seemed to have such strong feelings about SVN :D


any work done in this area is good work.


Looks awesome!


Git is a tool for collaboration, but it is sometimes a poor fit for the reality of centralised, corporate software development. I'd like to see a product that fills these gaps, and I sense that this is part of what Diversion is about. For example, large-file support in git depends on clunky hacks (that I respect) like LFS, but Diversion aims to make it actually work seamlessly. Companies awkwardly split their projects across many repos just to allow for access control, but Diversion aims to make it easy by letting you set access per directory. Those are great, "no-brainer" advances. There are a lot of other places where clunky hacks appear in popular git usage. A lot of those are also areas where big tech has introduced their own internal solutions. Maybe Diversion can cover some of these? Some examples:

- git has a very useful hooks mechanism, but each user has to manage their installed hooks in their own clone. This should be made more team-compatible. Make hooks a part of the history that can be patched by anyone, and then others can update their hooks simply by pulling. A less-clunky, built-in version of https://pre-commit.com/ , basically (a rough sketch of today's closest workaround appears at the end of this comment).

- Keeping with use cases for hooks, it should be possible to apply code formatting transparently. Nobody should have to think about code formatting, ever. It's a common practice to ensure code conforms to the company's standard format. However, why can't a developer also work in their preferred format? It should be possible for the system to handle this. I work in my preferred format, but all of the committed code is actually in the company's standard format. A bidirectional transformation.

- Break down the barrier between repo and other channels of company communication. Often companies have a wiki, a document drive, a chat app, a ticket system. With the ability to include real-time collaboration in the repo, it makes it possible to unify all of these things, ideally while keeping a full history. I basically want to version control my entire company, including what all of the non-devs are producing. Fossil has some of this capability, but it hasn't captured the mainstream: https://fossil-scm.org/home/doc/trunk/www/whyallinone.md

- Make repos composable. It should be possible to include a repo in another repo in a way that isn't clunky. Submodules, subtrees and subrepos are clunky. A use case for this would be including a third party library as source. It should also be possible to take one piece of a repo, and distribute that "view" while allowing bidirectional contribution. An example use case for this would be for maintaining a public open source project simultaneously in the company monorepo and on a public forge. There are (very nice) hacks for this like https://github.com/google/copybara and https://josh-project.github.io/josh/ , but these have considerable clunk-factor. When repos can be sliced and glued like this, a lot of the monorepo vs polyrepo debate becomes unnecessary; we can have both.

- Make it easy for everyone to use a patch stacking / "branchless" workflow with Diversion. There are numerous projects to enable stacking in git, and it's ubiquitous in big tech, yet it hasn't gone mainstream. It's only a matter of time until some company brings this to the masses.

- Allow history to be viewed and maintained at varying granularity. As I'm sure you know, a common debate you will see unfolding online is between keeping a clean, manicured history, and keeping a full history of what you actually did. You see git forges try to split the difference by offering squash merges. You see people recommending --first-parent for viewing the logs without noise. To me, this all points to a pointless limitation of the tooling. Why can't I take the raw, messy, actual commit log of what I did, and then bundle those up into a non-destructive "summary" commit that appears in the log. That way, people can see the clean, summarised version of my change, but can also dive deeper and see all of the twists and turns I took while producing that change.

There is so much room to improve version control. Much like build systems, I feel like the industry standard falls short of what we know is possible. I am happy to see new developments in this space.
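
Regarding the hooks bullet above: the closest thing vanilla git offers today is core.hooksPath, which still needs a one-time opt-in per clone -- a rough sketch (the check script is hypothetical):

    mkdir -p .githooks
    cat > .githooks/pre-commit <<'EOF'
    #!/bin/sh
    exec ./scripts/format-check.sh   # hypothetical formatting check
    EOF
    chmod +x .githooks/pre-commit
    git config core.hooksPath .githooks   # each clone still has to run this once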


Thank you very much for the thoughtful comment! You're exactly right - Diversion's goal is to improve things such as what you mentioned. Moreover, we're aiming to build a flexible system that can be extended and improved further.

We'll definitely implement at least some of the things you mentioned!


going after git takes COURAGE, so I'll give you that lol


Haha evidently :D


If competing with git doesn't work out for you... let me tell you what the world needs... We need to bring freedom and liberation to the .docx nation. Microsoft has held that document format hostage for way too long. They keep changing the file format in order to keep competitors at bay. It is the most grotesque moat in existence in modern times. We need a document creation tool with versioning baked in, but one that abstracts those aspects away and presents an intuitive way to switch from creation to publication/print. Most lawyers/docx creators deal with at most 20 or so styles/formats. For anything else, there's Publisher or Adobe. So as you develop this project, if the uphill battle becomes too unbearable, consider creating a better document platform with versioning baked in.


I think Google Docs is doing a pretty good job there! I personally can't stand either Word or Excel, so I totally agree.



