We are using this for addon synchronization in our community through Arma3Sync. On the server side we need to "build" the repository - that just generates the .zsync files, and clients then download only the diff. Update size came down from 10-15 GB to under 3 GB.
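For context, the "build" step boils down to running zsyncmake over every file in the repository. A rough Python sketch of the equivalent, with a made-up repo path and file pattern (Arma3Sync has its own build step that does this for you):

    import pathlib
    import subprocess

    # Hypothetical repo layout; Arma3Sync's "build" does the equivalent of this.
    repo = pathlib.Path("/srv/arma3/repo")
    for pbo in repo.rglob("*.pbo"):
        subprocess.run(
            # -u: URL (here relative) that the .zsync points clients at
            # -o: where to write the .zsync metadata file
            ["zsyncmake", "-u", pbo.name, "-o", str(pbo) + ".zsync", str(pbo)],
            check=True,
        )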
Fossil (https://www.fossil-scm.org) is an SCM tool that uses the rsync algorithm for syncing repositories. It has a built-in web server, and can also be accessed via CGI from any CGI-capable web server. It also has an SSH option.
Its features make it very handy for a number of file transfer/sync tasks, over and above its chief SCM role.
That said, I'm not sure I know of any other major users of it -- most people just use a .torrent (which similarly has checksums of each piece so you know which pieces need to be downloaded).
Not a major user, but we're using zsync for system updates of our Raspberry Pi-based digital signage operating system (https://info-beamer.com/hosted). It's pretty great and offers a few things we couldn't do with bittorrent: every time we have a new release we put together an install.zip file of everything required (kernel, firmware files, initrd, squashfs). Users can download this file directly and unzip it onto their SD card and it will boot our OS. For updates we use a previous (see below) version of our install.zip already available on the device and only download the changes. We then unzip that into a new partition for A/B booting.
Zsync is awesome as we can specify any number of existing files already available on the device (with the -i command line option), and zsync will try to make use of them to minimize downloads. We really use this feature to our advantage: zsync by default keeps the previous version of a file it's about to overwrite, so we have two versions of install.zip on a device. When switching between OS releases (stable / testing...) we can switch back and forth with zero additional downloads, as both versions are available and zsync makes use of that. Similarly, right after a user installs our OS, we just have the unpacked artefacts (kernel, etc.) on the SD card. We can quickly recreate an initial version of the install.zip file on the device by seeding the download with those files. It usually takes just ~500k of downloads to construct an initial install.zip, which we then use to minimize all future updates.
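In case it helps anyone, the client-side call is roughly the following - a sketch with made-up paths and URL; on the device our updater does the equivalent of this:

    import subprocess

    # Made-up paths/URL. Each -i adds another local file zsync may reuse blocks from.
    seeds = [
        "/sdcard/install.zip",      # previous install.zip kept around by zsync
        "/sdcard/kernel.img",       # unpacked artefacts from the initial install
        "/sdcard/root.squashfs",
    ]
    cmd = ["zsync", "-o", "/sdcard/install-new.zip"]
    for seed in seeds:
        cmd += ["-i", seed]
    cmd.append("https://updates.example.com/install.zip.zsync")
    subprocess.run(cmd, check=True)  # downloads only the blocks not found locally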
OS development for info-beamer started in 2013. I'm fairly sure nothing even close was available back then. I'm not sure about today. So far I don't regret the NIH approach we took.
I'll note that bittorrent uses HTTP for a few different things:
- HTTP(s) based trackers (although UDP is more common these days)
- HTTP webseeds/mirrors (BEP-17)
- (if you count it) webtorrent uses WebSocket trackers and can support HTTP webseeds (although the actual P2P transport is of course WebRTC)
A long time ago, on a crappy restricted internet connection, I used this and jigdo at different times to download the Ubuntu ISO. Being able to do so for Ubuntu and not for other distributions was another factor in my using Ubuntu back then.
If you are looking for a maintained system for delivering software updates to online systems, I would look into https://github.com/itchio/wharf-spec.
Wharf is used by Itchio to sync folder structures differentially/incrementally. It uses the latest compression algorithms, and it has a reference server.
It uses a special file that has to be installed on the server. Perhaps you thought (as did I) that this was supposed to work with any file over HTTP? That does not appear to be the case.
It's just a pair of files: the big thing you're trying to transfer, and the .zsync file that describes the content of that first file to guide the downloader.
"Rsync over HTTP — zsync provides transfers that are nearly as efficient as rsync -z or cvsup, without the need to run a special server application. All that is needed is an HTTP/1.1-compliant web server. So it works through firewalls and on shared hosting accounts, and gives less security worries."
The other child comment also touches on something that I seem to have skipped over: the .zsync metadata file, which contains pre-calculated rsync hashes. Using this means no special server software is needed, but at upload time the file needs to be processed to produce this metadata.
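That processing step is cheap, though - in practice you just run zsyncmake over the file at upload time, and what it stores is essentially a weak (rolling) plus a strong checksum per block. A simplified toy of the idea (the real format uses roughly rsync's rolling checksum and a truncated MD4, not Adler-32/MD5):

    import hashlib
    import zlib

    BLOCK = 4096  # zsyncmake picks a block size of this order

    def block_index(path):
        """Toy .zsync-style index: (weak, strong) checksum per block."""
        index = []
        with open(path, "rb") as f:
            while True:
                block = f.read(BLOCK)
                if not block:
                    break
                weak = zlib.adler32(block)               # cheap, rollable match
                strong = hashlib.md5(block).hexdigest()  # confirms a candidate match
                index.append((weak, strong))
        return index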
Interesting. How would that work for clients? If the destination file already exists on the client, look for a .zsync on the hosting server, and if it's not there, look for one at https://thirdpartyzsyncs.com?url=someurl ? What happens if the .zsync is out of sync with the resource?
The client, from my very limited testing, expects the URL of the .zsync file as input. The .zsync file itself can point anywhere else for the canonical version of the data.
This reminds me of the demo utility that comes with librsync, called rdiff, except they approached this problem in a less practical way, although it shows off librsync better.
It works as follows:
- Let's say you already have a file that is an older version, or perhaps corrupted; you use rdiff to generate its signature
- You then go to the place that has the proper file and use the signature to generate a patch file
- Then you use the patch file to fix your local file (rough commands sketched below)
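Concretely, the three steps map onto rdiff subcommands like this (a sketch with made-up filenames, assuming librsync's rdiff binary is installed):

    import subprocess

    def run(*args):
        subprocess.run(args, check=True)

    # 1. On the machine with the old/corrupted copy: describe what we already have.
    run("rdiff", "signature", "old.iso", "old.iso.sig")

    # 2. On the machine with the good copy: compute only what's missing/changed.
    run("rdiff", "delta", "old.iso.sig", "new.iso", "new.delta")

    # 3. Back on the first machine: apply the delta to produce the proper file.
    run("rdiff", "patch", "old.iso", "new.delta", "new.iso")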
Not that I would know any better, but I always saw a user-controlled approach built around rdiff as a better alternative to surrendering files to a non-transparent third party such as Dropbox (which, go figure, used librsync originally).
I am aware of another project called "rdiff-backup", also at nongnu.org:
rdiff-backup backs up one directory to another, possibly over a network. The target directory ends up a copy of the source directory, but extra reverse diffs are stored in a special subdirectory of that target directory, so you can still recover files lost some time ago. The idea is to combine the best features of a mirror and an incremental backup. rdiff-backup also preserves subdirectories, hard links, dev files, permissions, uid/gid ownership (if it is running as root), and modification times. Finally, rdiff-backup can operate in a bandwidth efficient manner over a pipe, like rsync. Thus you can use rdiff-backup and ssh to securely back a hard drive up to a remote location, and only the differences will be transmitted.
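For anyone wondering what day-to-day usage looks like, it's roughly this (made-up paths, classic rdiff-backup command line):

    import subprocess

    def run(*args):
        subprocess.run(args, check=True)

    # Back up a local directory to a remote host over ssh (note the double colon).
    run("rdiff-backup", "/home/me", "backuphost::/backups/me")

    # Restore a file as it looked 10 days ago, out of the reverse-diff history.
    run("rdiff-backup", "--restore-as-of", "10D",
        "backuphost::/backups/me/notes.txt", "/tmp/notes.txt")

    # Prune history older than a year (increments can only be dropped from the old end).
    run("rdiff-backup", "--remove-older-than", "1Y", "backuphost::/backups/me")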
Yes! I use this and like that it's so simple, and that the latest version of the backup is easily available as plain files. (Any metadata that the filesystem doesn't support is stored separately in files, so it works across different types of filesystems and operating systems.) There is even a FUSE filesystem, "rdiff-backup-fs", for mounting the whole backup history, with each backup point in a subdirectory of its own, like it should be!
Unfortunately, it seems not to be developed any longer, and it has a few things that would need ironing out:
* You can't pause a backup and continue later.
* Some operations (notably recovery after an aborted backup run) are excruciatingly slow. It takes tens of hours for me with a backup of 40 GB or so (on a low-powered computer as the server, though). I think rdiff-backup-fs is resource-hungry as well, which is perhaps partly understandable, since it has to go through a series of reverse diffs to present old versions of a file.
* I tried it on Windows once, and it could apparently not handle paths longer than a few hundred characters (due to using the legacy Windows path API with its MAX_PATH limit).
* You can't delete intermediate backups, only the oldest one.
There's also xdelta, which is just an algorithm/program for calculating and applying binary diffs. I suppose the advantage of zsync is that you can always point "new" users to "/current.file" and use zsync to patch "up" to the latest version - with xdelta, people would need to explicitly fetch "my..current.xdelta".
The big difference is that xdelta (much like any diff program) needs both the old and the new versions to create a patch. With zsync, the server only needs to have the new version (which is the only one of interest), and clients then fetch only the parts they are missing, because they already have the old version.
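To make that difference concrete, an xdelta-style flow needs both versions on the producing side, and each consumer needs exactly the right patch for the version they have (sketch with made-up file names, using the xdelta3 tool):

    import subprocess

    # Producer: needs BOTH old and new to emit a patch for exactly that pair.
    subprocess.run(["xdelta3", "-e", "-s", "app-1.0.bin", "app-1.1.bin",
                    "1.0-to-1.1.vcdiff"], check=True)

    # Consumer: applies that specific patch on top of their old copy.
    subprocess.run(["xdelta3", "-d", "-s", "app-1.0.bin", "1.0-to-1.1.vcdiff",
                    "app-1.1.bin"], check=True)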
On a LAN or UDP-based "Layer 2 overlay", there is mrsync for this purpose. One could efficiently distribute regular updates to some data that everyone on the network needs, e.g., "domain names" and IP addresses.
For what use case? We do use `rsync` instead of `cp` in some capacity, even for local-to-local file copy - as there is slightly more verification of a successful copy, and the destination is a quirky flash medium. Not sure how SCP would help here.
I mean that my train of thought was first "scp, it seems, can do everything cp can, and more, so let's drop cp and use scp instead", and then "but hey, rsync, it seems, can do everything that scp can, and more, so let's drop scp too and use rsync instead".
Possibly. `cp` is ancient and rather basic; OTOH, it is everywhere (unlike `rsync`, as I found out the hard way) and it is tiny (fewer toggles to push - less stuff to break).