I've been working on a spiritual successor to Unison for a few years called Mutagen [1]. It aims to provide more flexible synchronization and tighter integration with filesystem watching, enabling ~real-time remote editing with your local editor of choice. It also adds SSH-style network forwarding, so you can forward network traffic to/from remote systems to access remote applications without exposing ports. It currently supports SSH and Docker containers, but additional transports are coming.
I'd be really happy to receive feedback if you have a chance to try it out!
The name seems to conflict with the Python module (https://mutagen.readthedocs.io/en/latest/), which has been around for a longer time. While you seem to have put in a lot of effort seeing your commit history and search ranking, your first commit is in Nov 2016 - nearly a decade after the other one. Perhaps you should consider rebranding the project, with the greater interest of the FOSS community in mind.
Haven't dug into it yet, but quick Q: does this allow the use case of using a local IDE to interactively develop/run code against a docker environment running in a remote server? (so two "jumps")
That's actually one of the primary use cases at the moment. The way Mutagen works with Docker is by using `docker cp` to inject one of its "agent" binaries into the container and then communicating with that binary over `docker exec`.
You can also orchestrate your setup a bit (e.g. to work with Docker Compose) using Mutagen's new orchestration infrastructure. You can find an example of this at the bottom of the Mutagen homepage: https://mutagen.io
Is this project still going? I used to use it a lot for duplicating my project directories between my lap PC and my home PC. It was so powerful and so useful at the time. Honestly I thought it had died.
I recently switched from Syncthing to Unison, after getting tired of Syncthing either taking forever to sync anything or getting stuck altogether, with very little visibility or control over what it was doing. So far Unison (with fsmonitor) has ‘just worked’, with files updating immediately on save as I’d expect.
(That said, one of the problems I encountered with Syncthing seemed to be that it wouldn’t properly reconnect when I switched networks, which is necessary since it runs as a persistent daemon. With Unison, so far I’ve been starting it up in the foreground in a terminal as needed, so it hasn’t had to provide the same functionality. But I expect I’ll eventually set up some type of auto-reconnect, which will hopefully work better than Syncthing did.)
I used to use it daily to sync projects, music, and documents between computers too. It was really nice once I got it configured to my liking. I would sync my work computer to an external hard drive and the external drive to my home computer, thus creating a very powerful sneakernet and an implicit backup on the external.
I reluctantly installed Dropbox one day after forgetting my external drive at home. Honestly, that was the beginning of a miserable time. Dropbox never really delivered, it took forever to sync, and constantly caused the internal disk to thrash. I had tons of other problems with Dropbox over the years and was very happy the day I quit using it.
I still have a unison alias in my .profile
alias unison="unison -logfile /Users/eddie/log/unison.log"
Sorry, I'm probably missing something obvious (and am asking out of genuine curiosity here), but why didn't you go back to using Unison? And what are you using now that you quit Dropbox?
Dropbox ended up being an institutional requirement where I was working. Now I just let iCloud do its thing. It isn’t great, but it is less buggy than Dropbox.
Unison suffers from the Lisp Curse - it's written in OCaml, a similarly powerful but small-audience language. It "just works", so there doesn't need to be much activity.
Absolutely, I have an entry in my journal about wanting to rewrite Unison in C. Never got around to it, but it would still be fun. The same goes for Wings 3D, cool app, but Erlang? Really?
Maybe, I really like Lisp and especially Scheme. Don't get me wrong. I just think that a lot of projects suffer from being written in a language with a small community or even, at the time of conception, a thriving yet fleeting community (Perl, Ruby, etc).
Unison seems to be maintained, there just isn't much to maintain. Perhaps being in C would attract a huge community to make up for the increased work required, or perhaps that wouldn't be sustainable and we're better off with projects that can better "hibernate".
Not C but not super powerful seems like the worst of both worlds.
Unison and Wings 3D are both projects I wanted to make changes to when I was using them. Downloaded the source code for both, spent a day or so reading and fiddling and got bored. I actually like learning new languages, but I don’t have the capacity to learn a new language, build system, tooling, style & formatting, install the right Emacs mode, and so on just to scratch a little itch.
I used Unison to sync between two laptops. I'm using only one laptop now and I switched to Syncthing to sync some folders of my phone, tablets and laptop. Unfortunately Unison doesn't run on Android. I liked the way it could perform a merge of the folders from different computers.
I haven't tried unison yet, but you can mount android on your laptop wirelessly over ssh.
Enable your android phone's hotspot, and connect your laptop to that hotspot.
Install termux on android (playstore). Install openssh in termux. Setup passwordless pub/private keys to log in from laptop to android termux sshd.
cat ~/.ssh/config # Laptop.
Host someName # Your android.
User u0_a168 # Whatever username termux gave you.
HostName 192.168.43.1 # 'ifconfig' in termux for this.
IdentityFile /home/you/.ssh/someName_id_rsa
Port 8022 # Default termux sshd port.
Now this should work from your laptop:
ssh someName
Log out of your laptop's termux session.
Install sshfs on your laptop, through your package manager or the manly way from github. I'm not manly.
Here's part of a script I use to get photos from my phone to my laptop.
# Mount the phone's DCIM directory locally over ssh.
# I had trouble with termux symlinks, so I went directly to /storage/...
sshfs someName:/storage/emulated/0/DCIM ~/mnt/someName
cd ~/mnt/someName/Camera
cp -vn * ~/Pictures/. # Copy all pictures that aren't yet on laptop.
mv * ../Saved/. # So ops on dcim/Camera don't take longer with more pics.
cd ~/Pictures # Can't unmount until leave mount.
fusermount -u ~/mnt/someName # Dismount. Gymnastic salute.
# Do whatever postprocessing you like on your photos ...
I can't praise Unison enough.
I've been using it every day for over 10 years and it is easily one of the most useful pieces of software I've ever come across.
I've donated multiple times over the years.
"I can't praise Unison enough. I've been using it every day for over 10 years and is easily one of the most useful pieces of software I've ever come across."
Unison was the first backup binary that we built into the rsync.net platform - breaking our original design goal of only offering client agnostic SSH and the tools that would run over that.
Shortly afterward we also added rdiff-backup. Both of these tools were quite popular and we saw a lot of interest back in 2005 - 2010 but we see very little use or interest in them now.
All of the interest in backup clients is now in rclone[1], restic[2] and borg[3].
restic was easy - you can point it at any SFTP capable host.
borg and rclone, on the other hand, we had to (like unison and rdiff-backup) build and maintain on the rsync.net server side.
All of these (save rclone, which is a binary executable) are python scripts. But we don't have a python interpreter (or any interpreter) in our very locked down platform. Can anyone guess how we do that ?
Like you did with attic and borg? Quoting you on January 2016:
> We solved the problem by (cx)freezing the attic and borg python tools into binary executables. So, still no python in our environment (reducing attack surface) but the ability to run attic and borg just like they are meant to be run.
I only just found out about borg and restic, but the sites I looked at both mention "backup" prominently. And that makes sense. If I ever sign up for rsync.net, it will be so I can store long term backups of my stuff, which I might add to incrementally, but never really change.
But Unison is for syncing, which means maintaining eventually consistent replicas of a changing directory tree (up to some pragmatic exceptions and manual tweaks). The whole point of Unison is to turn multiple devices into a single failure domain, which requires a separate system for storing safe backups.
We still use duplicity because it allows us to use our own rsync backup servers, has encryption, compression and par2 optional (defaults to 10%). Every one of these is optional. Too bad it doesn't offer the ability to use compression algorithms other than gzip (lzma or zstd would be nice).
I'm a maintainer of the Unison package in Fedora and I'd love to know which version(s) of Unison you use day to day.
In Fedora (and I think this applies in Debian too) we have to maintain 3 versions because Unison isn't interoperable across minor releases. For this reason we package 2.13, 2.27 and 2.40, and I think there is discussion about packaging the latest release too. Keeping these ancient (esp 2.13) versions going is a pain to say the least.
This incompatibility is one of the reasons why I reduced my dependency on it, though I still use it for minor things.
The last straw came when it turned out that unison was incompatible with the same version, if that version had been built on a different system. I can't remember the details but there was some library version difference that resulted in a unison that had the same version number but wouldn't talk to one from another system without crashing.
That can happen because unison uses OCaml's built in serialization, which isn't guaranteed to be stable across OCaml versions. It doesn't often change, but the representation of arrays was changed in OCaml 4.02, which meant unison compiled with 4.01 couldn't sync with unison compiled with 4.02. Debian bug discussing the issue: https://bugs.debian.org/802919
> which isn't guaranteed to be stable across OCaml versions
Oh, so OCaml is a toy language from academics. I've read so many good things about it over the years, but no one mentioned that aspect of it. There should be a law about putting a warning in Big Red Letters on the box so someone doesn't make the mistake of using it in a large, long lived project.
> There should be a law about putting a warning in Big Red Letters on the box so someone doesn't make the mistake of using it in a large, long lived project.
It's documented, and it's not intended to be used for serialization; you use it in very particular cases, for example for faster object sharing with IPC for multicore. We use JSON or S-expressions or other stuff for that purpose.
I really don't know why unison people don't switch to a proper serialization.
This was very upsetting when it happened. The same version of unison was incompatible with itself if they had been built with different but consecutively released versions of the ocaml compiler. I feel like using compiler versions like 4.01 and 4.02 is irresponsible when it's documented that they produce binary incompatible products, but that is how it was. https://marc.info/?l=unison-users&m=142286809310149
An ugly way around it is to compile your own version of unison. The good news, if I remember correctly, is that the binary can be built in one place and copied around to the other hosts (and say put in $HOME/bin.)
Thank you for your work! I use Unison on Fedora almost every day. The version is 2.40, which IIRC is the default you get from "dnf install unison".
My use-case is that I run Unison on two computers running Fedora, to implement a form of "sneakernet" file sync. Work computer <-> usb thumb drive <-> home computer.
I wasn't aware that the exact version of Unison was so important. Will start paying attention to that.
Thanks for your work!
At the moment I'm running unison on Gentoo, Fedora and even FreeBSD hosts.
There is even a ppc64le host running Fedora.
On Fedora I'm using unison251-text from the croadfeldt/Unison COPR.
On Gentoo it's net-misc/unison-2.51.2, on FreeBSD it's unison-nox11-2.51.2 from pkg.
I use it in Debian, which used to (until recently) package 2 versions in every release. This was convenient for me, a conservative user, who might run old-stable on some computers for months before upgrading to stable. I have used workarounds in the past (mount home directory in a container, and sync from within) so I'm fine with a single version in Debian.
But you package for Fedora, which is "upstream" for RHEL (if I'm not mistaken) with their 10 year life cycles, so I'm sure RHEL customers would love you to keep as many versions available as possible :)
Yep, this is PITA as a user as well. 2.48.4 on Ubuntu 18.04 but finding about the same working version for Mac required additional hunting. But nevertheless Unison is awesome. Been using for the last 15 years or so <3
that was one of my biggest problems ~8 years ago. keeping the macports and the debian versions in sync was too tiresome, someday i just switched to rsync which worked well enough for my usecase.
A few years ago, I was working in a company that used a shared drive on a local Windows network. Several analysts and people from other departments would use the drive to store shared price lists, tables, and other documentation. Part of my job was to keep these data files updated with info on a daily basis.
The problem with trying to edit these files was that everyone logged in on the network had access to them and could change them arbitrarily. Or they could even just inadvertently lock the files if they left them open on their PCs. There was no control or change management.
I decided to keep the 'canonical' versions of the files on my PC at work and use Unison to sync them over to the shared network versions of the files once a day. Unison would instantly tell me if anyone other than me had changed the files, and I could investigate further. It was a huge relief knowing that I had proper control over those files.
Unison's cool, but I hit some major roadblocks with it, namely: it can't run on a heterogeneous network [0,1]. As many others have already suggested, syncthing[2] is the new best tool out there for multi-directional sync.
Unison has been an indispensable tool for me for many years. I keep my desktop and laptop in sync by using an always-on Linode as a sync server.
Something I love about Unison is the fact that I can sync files from anywhere on my filesystem without having to move them or replace them with symlinks.
I can also sync subsets of my files by defining multiple profiles. I have Unison set up to default to a "common" profile that includes nearly everything, but I can explicitly request a sub-profile if I want. I also have a separate profile that syncs the profiles themselves, so I can make profile changes on one machine and run `unison profiles` to upload them.
Unison efficiently and reliably syncs my website (static blog mostly) between OS X and FreeBSD. It's one of those extremely underrated pieces of software that just quietly does an excellent job, yet most people never heard of it and continue to struggle with rsync or similar (syncthing fulfills a somewhat different role). Perhaps because it's written in an oddball language, or because the website isn't flashy enough.
I often evaluate a wide range of software before choosing what I consider best for a job, and many years ago, Unison came out way ahead in such an evaluation. It never failed me.
The Development Status section of the manual mentions a follow-on project called Boomerang¹. It looks like the GitHub repo² only had activity for about 2 years, then stopped. Is that a dead project?
I use Unison every day to synchronize my work computer and my home computer, using a USB thumb drive as the intermediary. The thing I like the most about this arrangement is that I don't need a direct SSH connection between home and work, that there is no cloud service involved, and that the GUI lets me double check what files are going to be written to/from the USB drive before it synchronizes.
There is one issue that I have run into though. If I format the USB as fat32 it can't properly store permissions. And if I format it as ext4 then I need to make sure that my numeric user ID is exactly the same in both my home and work computers (otherwise the permissions get messed up).
Is there a way to make my arrangement more robust so that I can sync files through a USB drive, without needing to use the same numeric user id in all my computers?
> Is there a way to make my arrangement more robust so that I can sync files through a USB drive, without needing to use the same numeric user id in all my computers?
Yes, you very likely need to be root. You could also perhaps add in fstab that you mount the USB drive as a specific uid. See man mount and man fstab and search for " uid" (with the space, else you search for uuid as well).
I used to use unison to periodically sync files between a host filesystem and a VM filesystem as it was a lot faster - in terms of filesystem performance when not syncing - than using the shared folder functionality provided by the VM.
A hint regarding running the latest Mac binary on Mojave: if the GUI crashes on you, it's likely to be an issue regarding syncing file _permissions_ and not just files.
This has been biting me for a couple of months now whenever I sync my development tree between Dropbox and OneDrive, since Windows' WSL tends to set wonky 0777 permissions on entire file trees, and it is usually those that cause Unison to crash.
Otherwise, I've been using it to sync 400K+ files without incident.
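If you want to see which paths carry those wide-open permissions before syncing, something along these lines might help. It is only a rough sketch: the function name `find_mode_777` is made up for illustration, it assumes a POSIX-style filesystem, and it only reports paths, it does not change anything.

```python
import os
import stat


def find_mode_777(root):
    """Return the paths under root whose permission bits are exactly 0o777."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: inspect the entry itself, not a link target
            if stat.S_ISLNK(st.st_mode):
                continue  # symlink mode bits are not meaningful, skip them
            if stat.S_IMODE(st.st_mode) == 0o777:
                hits.append(path)
    return sorted(hits)
```

Calling `find_mode_777("/path/to/tree")` gives a list you can review and then `chmod` by hand to whatever is appropriate for your setup.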
Unison is great for keeping multiple machines in sync. I eventually switched to Dropbox for reasons I don't recall, but it might be a good time to reconsider. Also, I'd forgotten, but apparently I even made a (poor) attempt at writing a Unison UI for OS X, some 11 years ago... https://github.com/logandk/autoson
I used and loved Unison for years, but I found myself wanting something that just worked invisibly with no admin effort so I switched to Resilio (Bittorrent Sync). Both work for my purposes. Anyone else try both of these and have feedback?
- Unison is written in OCaml, which is (probably) a perfectly fine language but not commonly used
- Unison synchronizes files to files, but for a Dropbox-like system you really want deduplication for space savings (i.e. server-side storage is a bunch of pointers to content-addressed blocks.)
- in general, it's not clear to me that the client is really the hard part of Dropbox. Note e.g. the part where Dropbox now runs its own data centers for cost reasons.
- there's a million fiddly things to get right, and Unison hasn't had that much usage
> there's a million fiddly things to get right, and Unison hasn't had that much usage
Unison is the only bidirectional sync tool that I trust to get the details right. It is backed by a formal model with various proofs of correctness. Such models are also easily represented in OCaml, which I can assure you is more than a fine language, especially if one cares about correctness. Dropbox has struggled to get these details correct in the past (see the paper "Mysteries of Dropbox", by Unison's author Professor Benjamin Pierce). TLDR, Pierce teaches these Python hackers how to fix their broken code.
>Unison is the only bidirectional sync tool that I trust to get the details right. It is backed by a formal model with various proofs of correctness.
That's just about the sync process/stages, the easy part that can actually be formalized.
The "million fiddly things" are about OS and filesystem issues, incompatibilities, and so on, and Dropbox has a hugely larger test base for those things...
I disagree. POSIX, although somewhat dated, has provided a good enough abstraction layer for filesystems and OSs. My proof of this is the number of different and successful filesystems for Unix/Linux. If the abstraction didn't work, everyone would be forced to use the same filesystem.
The issues and subtleties are with bidirectional sync. It is not "the easy part". Dropbox didn't get it right in the past, we have no proof they have it right now.
>POSIX, although somewhat dated, has provided a good enough abstraction layer for filesystems and OSs.
First of all, POSIX semantics are not what Windows support. Second, even where available, POSIX is a tiny part of the possible issues. Adequate for naive apps that need to open or write some files, not for a reliable sync tool.
The issues discussed in that link, such as concurrency, lack of atomicity and failure are all factors that make bi-directional sync a hard problem. Yes, POSIX provides few guarantees and so that burden rests on the application code above. The formal model must consider the race conditions and failure modes; it is not "the easy part". I do acknowledge there is a modelling gap to overcome and some problems with POSIX, but I disagree that it's the hardest part of the problem.
That's really the job of the filesystem to protect against corruption using checksums. ZFS will do this and hopefully more will follow. I suppose a file sync tool could detect a change of contents with no change of mtime, but that would be expensive.
> - Unison synchronizes files to files, but for a Dropbox-like system you really want deduplication for space savings (i.e. server-side storage is a bunch of pointers to content-addressed blocks.)
> - in general, it's not clear to me that the client is really the hard part of Dropbox. Note e.g. the part where Dropbox now runs its own data centers for cost reasons.
The whole point of Unison is that you do not need a server. And certainly not a server run by a for-profit corporation in the United States of Surveillance.
Erm, with pairwise syncing you still want a "server". I've used unison as my primary file syncing tool for a decade, and trying to maintain a spanning tree among partially-available nodes/disks is a pain in the ass. And creating loops means you can't rely on file deletions.
I agree it doesn't have to be a corporate-owned, or even corporate-snoopable "cloud" server, and if this is your goal then unison will work well. Also you can trivially solve the dedup using zfs, but I don't think dedup is a killer feature for personal storage.
(Better would be a "manual dedup" utility to catch those instances where you copied off the same thing multiple times over the years. I'd appreciate any recommendations here, although writing one seems trivial when I get around to it - calculate recursive sha256sums for every node in the filesystem tree, and look for the biggest matching ones)
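For what it's worth, the "recursive sha256sums for every node" idea could be sketched roughly like this in Python. Everything here is an assumption for illustration (the names `hash_tree` and `biggest_duplicates`, hashing directory entries by name plus child digest, handling only regular files, directories and symlinks); it is not an existing tool.

```python
import hashlib
import os
from collections import defaultdict


def hash_tree(root):
    """Map each file, symlink and directory under root to (sha256 hex, total bytes)."""
    result = {}

    def visit(path):
        total = 0
        if os.path.islink(path):
            # Hash the link target string instead of following the link.
            h = hashlib.sha256(b"link" + os.fsencode(os.readlink(path)))
        elif os.path.isdir(path):
            # A directory's digest covers its sorted entry names and their digests.
            h = hashlib.sha256(b"dir")
            for name in sorted(os.listdir(path)):
                digest, size = visit(os.path.join(path, name))
                h.update(os.fsencode(name) + b"\0" + digest.encode("ascii"))
                total += size
        else:
            # Assumed to be a regular file: hash its contents in 1 MiB chunks.
            h = hashlib.sha256(b"file")
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
                    total += len(chunk)
        result[path] = (h.hexdigest(), total)
        return result[path]

    visit(root)
    return result


def biggest_duplicates(root):
    """Return (size, [paths]) groups sharing a digest, largest first."""
    groups = defaultdict(list)
    for path, key in hash_tree(root).items():
        groups[key].append(path)
    dupes = [(size, sorted(paths))
             for (_digest, size), paths in groups.items()
             if len(paths) > 1 and size > 0]
    return sorted(dupes, reverse=True)
```

A duplicated directory also reports its duplicated children as separate, smaller groups, so in practice you would look at the top of the list first.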
> Erm, with pairwise syncing you still want a "server". I've used unison as my primary file syncing tool for a decade, and trying to maintain a spanning tree among partially-available nodes/disks is a pain in the ass. And creating loops means you can't rely on file deletions.
You do not need a spanning tree or loops, you just need to do a topological sort on the nodes/disks and then consistently sync them in that order. There is no magic algorithm that will handle conflict resolution for arbitrary syncing.
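One possible reading of "topological sort, then sync in that order", sketched in Python. The replica names and the `receives_from` layout are hypothetical, and this only computes an order of pairwise syncs; it does not run Unison.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical layout: each replica maps to the replicas it receives changes from.
receives_from = {
    "laptop": {"server"},
    "backup_disk": {"laptop"},
}

# Predecessors come first, so upstream replicas are synced before downstream ones.
order = list(TopologicalSorter(receives_from).static_order())

# Sync neighbouring replicas in that fixed order every time.
sync_pairs = list(zip(order, order[1:]))
print(sync_pairs)
```

With this chain the order is fully determined: server, then laptop, then backup_disk.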
The problem I'm referring to isn't conflict resolution, but rather spurious re-creation of files that have been deleted. Unison's pairwise syncing cannot distinguish a file that has been deleted from one that has yet to be created (whereas say vector clocks do).
Create file F on A. Sync A-B, A-C. Delete F on A. Sync A-B. ?????. Sync B-C. Sync A-B. File F now re-exists on A (and B and C).
??? is some event where you cannot sync A-C to directly save A's changes on C, yet you still want to save any changes from B on C. Say a remote machine crashes and becomes unavailable, you didn't have the time over 3G, or some other type of ad-hoc craziness which is why you're using distributed syncing in the first place.
Which implies you need to choose one node from {A,B,C} that is the most likely to be available to sync the other two to. That node can also provide connectivity to a larger network, which generalizes to a spanning tree.
IMO a topological sort would be an even stricter requirement than spanning tree, in that if one node becomes unavailable, you can't sync anything "below" it! Also what does syncing disks "in order" mean when changes can happen at any time? (eg I use unison for maildir).
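The A/B/C sequence above can be reproduced with a toy model. This is a deliberately simplified sketch, assuming each pair of replicas keeps its own record of the last state they agreed on and nothing else; the names `Replica`, `archives` and `sync` are made up for illustration and this is not Unison's actual code.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.files = {}  # path -> contents


# frozenset({name1, name2}) -> state the two replicas agreed on at their last sync
archives = {}


def sync(r1, r2):
    """Toy two-way sync: each side is compared only against this pair's archive."""
    key = frozenset((r1.name, r2.name))
    archive = archives.get(key, {})
    for path in set(r1.files) | set(r2.files) | set(archive):
        old = archive.get(path)
        v1, v2 = r1.files.get(path), r2.files.get(path)
        if v1 == v2:
            continue          # already in agreement
        if v1 == old:
            new = v2          # only r2 changed since the last sync of this pair
        elif v2 == old:
            new = v1          # only r1 changed since the last sync of this pair
        else:
            continue          # both changed: a real conflict, leave it alone
        for r in (r1, r2):
            if new is None:
                r.files.pop(path, None)   # propagate a deletion
            else:
                r.files[path] = new       # propagate a creation or update
    # Record what the two replicas now agree on.
    archives[key] = {p: v for p, v in r1.files.items() if r2.files.get(p) == v}


a, b, c = Replica("A"), Replica("B"), Replica("C")
a.files["F"] = "data"         # create file F on A
sync(a, b)
sync(a, c)
del a.files["F"]              # delete F on A
sync(a, b)                    # the deletion reaches B
sync(b, c)                    # B and C never synced before: F looks new on C
sync(a, b)                    # F now looks new on B, so it is copied back to A
print("F" in a.files)         # True
```

In this model the B-C pair has no memory that F ever existed on B, so C's copy is indistinguishable from a newly created file, which is the spurious re-creation described above.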
> The problem I'm referring to isn't conflict resolution, but rather spurious re-creation of files that have been deleted.
That is conflict resolution, over the set of files. Which is why Unison provides the "reconciling changes" UI.
> Which implies you need to choose one node from {A,B,C} that is the most likely to be available to sync the other two to.
Yes, that is a way to avoid update conflicts.
> IMO a topological sort would be an even stricter requirement than spanning tree, in that if one node becomes unavailable, you can't sync anything "below" it!
Yes, it is also a way to avoid update conflicts.
> (eg I use unison for maildir).
Why? IMAP already takes care of synchronization, without any potential problems.
Eh, we can disagree on a precise definition of abstract "conflict resolution", but I'm referring to unison's behavior with -batch. There's no logical conflict that needs to be resolved by user choice or algorithm. When looking at the larger system of all the replicas, unison's behavior is to simply undelete a file that was explicitly deleted.
> Why? IMAP already takes care of synchronization, without any potential problems.
Less configuration, less attack surface. Sharing mail over NFS is a common thing, and I view unison as a type of distributed filesystem.
Thanks! I guess formulating that question, I was halfway to apt-cache search. I had envisioned a tool that would find an entire equivalent dir tree (two unpacked source tarballs -> one match), and at first glance I don't know if fdupes will do that. But it might make sense to use an off the shelf tool with a little more manual labor.
Unison has support for that on some platforms. On Linux, the option "-repeat watch" uses inotify to watch for changes and rerun unison each time there's a change. You can combine that with -auto and some of the options specifying policies for resolving conflicts (e.g. -prefer newer) to get a setup that works without user interaction.
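As a rough illustration, here is how those options might be assembled into a command line from Python. The helper name `watch_command` and the two roots are hypothetical; only the flags named above are used, and the command is printed rather than executed. Check the manual for your Unison version for anything these preferences additionally depend on.

```python
import shlex


def watch_command(root1, root2, prefer="newer"):
    """Build an argv list for a non-interactive, change-triggered Unison run."""
    return [
        "unison", root1, root2,
        "-repeat", "watch",   # rerun whenever the filesystem watcher reports a change
        "-auto",              # accept non-conflicting changes without asking
        "-prefer", prefer,    # policy for resolving conflicts
    ]


# Hypothetical roots, for illustration only.
cmd = watch_command("/home/me/project", "ssh://example-host//home/me/project")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` if you want to launch it from a script instead of a terminal.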
I remember being surprised when I found references to Unison in the Dropbox Linux client. This was in the early days, I'm quite sure they rewrote it since. It would be cool to get the full story from a Dropbox employee.
Rather amusingly, both Panic's Unison and this Unison used the same ~/Library/Application Support/Unison directory on macOS. Thankfully they used different filenames and so didn't end up conflicting.
> Further improvements to the OS X GUI (thanks to Alan Schmitt and Craig Federighi).
Here are Craig's commits:
- https://github.com/bcpierce00/unison/commit/48f8e1b27edbe2df...
- https://github.com/bcpierce00/unison/commit/6645d1793ce843f6...
Around the same time, Dave Abrahams, now of Apple's Swift team, makes an appearance on the unison-hackers mailing list.