We haven't found a solution: our requirements include storing revisions of large binary files (PSDs), but we don't have the ability to manage 8GB repos effectively (and I don't know whether anyone does).
It seems like, ultimately, versioning software will need to grow to understand file types beyond text in order to track changes to many large files. I imagine Git plugins available by filetype, perhaps automatically sourced by GitLab when needed, but I know that would be an enormous task.
Unrelated: can you possibly work with Atlassian to get GitLab integration for SourceTree? It would be very convenient for us. :)
Versioning large files is hard. The way to do it is not to put them in the git repository itself, but to store in git a symlink keyed to the SHA-1 of the large file's content. This way your git repo stays under 1GB, but you can still request the file from a server. GitLab EE uses the open source git-annex for this, and adds code to check your project permissions. All you need to do to get the files is a git clone followed by 'git annex sync --content'. Read more about this at https://about.gitlab.com/2015/02/17/gitlab-annex-solves-the-... Please let me know if you have any other questions.
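For reference, a minimal sketch of the client-side workflow (the host and file names here are hypothetical):

    git clone git@gitlab.example.com:design/assets.git
    cd assets
    git annex sync --content        # fetch the annexed file contents
    git annex get mockups/home.psd  # or fetch just one file

Until you sync the content, the repo holds only the symlinks, so the clone itself stays small.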
We would love to see better support for GitLab in SourceTree, and so would many others: https://answers.atlassian.com/questions/47020/comments/26160... The response from Atlassian in that thread indicates they are not inclined to do it. Since GitLab is an open source alternative to their Stash+JIRA+Confluence+Bamboo products, I understand their stance. Of course, it never hurts to ask in that thread.
... actually, versioning large files is easy, but not directly in git.
https://github.com/bup/bup strategically breaks files into chunks of ~8KB, using a rolling checksum so that chunk boundaries are determined by the content itself. That way a small change to a big file becomes a small change to a small number of chunks (ideally, changing one byte in an 80GB file would change exactly one such 8KB chunk -- and that is often the case, though it's common that 3 or 4 would change).
bup then puts[0] the chunks into git, together with a "reconstruction map". As a result, you can efficiently put huge files in git and only pay for the real deltas in storage.
[0] It doesn't actually use git or libgit - it writes git packs directly.
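In practice the workflow looks something like this (the directory names are hypothetical):

    bup init                             # create the bup repository (defaults to ~/.bup)
    bup index ~/design/assets            # scan for new and changed files
    bup save -n assets ~/design/assets   # chunk the files and store them as git packs

    # ... edit one large PSD, then:
    bup index ~/design/assets
    bup save -n assets ~/design/assets   # only the handful of changed chunks get written

The second save grows the repository by roughly the size of the changed chunks, not the size of the changed files.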
I agree git-annex is the best solution for now - I just wish some effort would go into versioning binary files by creating projects that mimic, in some way, the behavior of the programs that generate them.
Probably the closest existing analogue to what I'm optimistically hoping to see someday is how developers version DB migration files instead of copies of the database tables themselves, then run the migrations as part of the merge process to bring the databases into sync.
Thinking about it more, a general project to create 'migration' specifications for generating binary files seems more appropriate than a project to modify version control.
I agree that it would be nice if more files were created algorithmically instead of as opaque binaries. You can diff an SVG image but you can't diff a JPEG, although the SVG diff could be presented better than we currently do.
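To make that concrete, here is a contrived diff against a hypothetical logo.svg -- text markup yields a meaningful line diff, while a JPEG's compressed bytes would not:

    $ git diff logo.svg
    -  <circle cx="50" cy="50" r="20" fill="red"/>
    +  <circle cx="50" cy="50" r="35" fill="red"/>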
And you're welcome; more than 700 contributors made GitLab what it is today.
Binary files (large or not) generally can't be merged, so working with them in a version control system effectively requires support for exclusive locking. Is this supported (I don't think it is) and/or on the radar?