Mutagen – Cloud-based development using your local tools (mutagen.io)
174 points by vmoore on April 8, 2022 | 56 comments



We use Mutagen for Garden's hot reloading mechanism. (Garden is a dev tool for K8s, and hot reloading lets users sync changes directly to a prod-like dev environment instead of doing a rebuild and redeploy.)

It really is a fantastic piece of technology and completely transformed the whole experience (we were using good ol' rsync before). In particular it works seamlessly across platforms.

If anyone's interested in how we use it, it's here: https://github.com/garden-io/garden/blob/master/core/src/plu...


This is a MASSIVE endorsement as far as I am concerned

I used Garden in the rsync days and it was already pretty good. The code behind Garden is some of the best TypeScript + Node I've ever seen, so if this was better and more performant than what you already had I'm sold.

Tell Jon I said hi =)


Thank you so much for the kind words Gavin :)


I’ve been using Mutagen with Docker for about 3-4 months to work around what was (at the time) a performance problem with bind-mounted volumes. I understand the new experimental FS option in Docker might solve this, but I haven’t tested or migrated my setup yet.

Basically, I’m working on a bare metal kernel in C/C++ on ARM, which only has tooling on Windows or Linux. As I work on a Mac system, I set up a Docker container that makes this trivial for me, including GDB debugging of QEMU on the host machine.

Mutagen was easy to get going, well documented, and has worked perfectly for this. I’m able to make my edits in my Mac IDE, then literally by the time I’ve switched to compilation, the edits are synced to the container, and vice-versa for the build product.

Prior to Mutagen, I was getting 8-10x longer build times since Docker file synchronization was slowing things down. Mutagen syncs slower than Docker, but faster than I can work, so all is well.


Author of Mutagen here, happy to answer any questions you might have, whether it's about its use with Docker, SSH, its historical integration into Docker Desktop, or its related Mutagen Compose[0][1] project.

[0]: https://github.com/mutagen-io/mutagen-compose [1]: https://mutagen.io/documentation/orchestration/compose


Thank you for this great project! I’ve been using it for a while now and it has been rock solid and easy to use. Cheers!


Hi Jacob. I am one of the founders of Okteto (https://okteto.com/), a remote development platform for Compose and Kubernetes applications. We use Syncthing to sync code between the developer laptop and pods running in Kubernetes. I would love to know your thoughts on the strengths and weak points of Mutagen vs Syncthing for this use case. Thanks!


Sure, that's a great question. I'll preface my response by saying that I'm a huge fan of Syncthing (and that Mutagen's use cases form a Venn diagram with those of Syncthing (as well as tools like rsync)). Technologically, all three of these tools are very similar in terms of using the rsync differential transfer algorithm, but their architectures and primary use cases differ. I think the core differentiators with Mutagen are:

Development-oriented: Mutagen's sync configuration is primarily focused on development, so it adds more granular controls for things like uni-/bi-directionality, conflict resolution, ignores, symbolic link handling, etc. It also has a permission propagation model that's focused on things like cross-platform executability propagation and preservation (i.e. between Windows and POSIX), as well as operating in multi-service environments where many different process UIDs/GIDs might be in play. It also handles weird filesystem quirks (like macOS Unicode decomposition).

Low-latency: Mutagen's goal is to reduce the latency of sync cycles (i.e. time from local edit to change reflection on the remote) to an imperceptible level. It uses a lot of tricks to try to do this, but the goal is really to use the absolute best filesystem watching mechanism for each platform and to integrate that tightly with the sync loop. On Linux, for example, Mutagen is now starting to experiment with the recently revamped fanotify[0] API to get highly scalable but low-latency watching (as opposed to the janky emulation of recursive watching with inotify that most tools use). It also uses tricks like rsync-diffing of the metadata snapshots that it transfers to get latency as low as possible. The eventual goal is to reach sub-100ms sync cycles for multi-GB codebases, and I think that's pretty close.

Git-like sync: Conceptually, Mutagen's sync algorithm is like a filesystem watcher + repetitive three-way Git merge (with the difference being that file transfers are deltified and the merging (potentially) affects both endpoints). This means it tracks content in a manner very similar to Git's CAS and branches, which is a little different than the way that Syncthing does it. This affords (in my opinion) more precise identification of conflicts.

Distrust: Mutagen takes a more aggressive approach to mutual distrust between endpoints, working hard to ensure that a malicious endpoint can't read outside the synchronization root on the other endpoint via symbolic links or maliciously crafted paths. It does this by using POSIX *at functions to traverse the filesystem and perform operations. This avoids issues like CVE-2017-1000420. You can harden this further by using unidirectional sync and other configuration options. This makes it well-suited to cases where multiple users might be syncing to file storage on a shared system (say on a SaaS platform) (though, at least in that case, you can protect yourself and users with the filesystem namespacing afforded by containers).

One-sided installation and flexible topology: Mutagen's primary M.O. is injecting small "agent" binaries to remote systems via a copy mechanism (such as `scp` or `docker cp`), so you don't have to manually install it on both endpoints. This is less important to "full stack" cases like Okteto, where your tooling can handle the setup of Syncthing on the remote, but it makes working directly over SSH or in ephemeral containers significantly more convenient. And Mutagen's architecture is also really flexible, allowing it to sync files and forward traffic between any combination of local and remote endpoints (including remote-to-remote, proxied via the local Mutagen daemon).

Command-based transports: Mutagen uses the standard I/O streams of commands like `ssh` and `docker exec` as its transport (similar to tools like Git or rsync), making it easier to target remote environments with your existing tooling and configuration. Again, this is less of an issue for a case like Okteto's, but is useful in the standalone case.

Network forwarding: This is outside the scope of sync, but Mutagen offers OpenSSH-style TCP/UDS forwarding (with the difference from OpenSSH being that Mutagen's forwarding is persistent and managed by a background daemon). This offers support for doing things like forwarding a local socket to a remote Docker daemon over SSH, and then forwarding web application traffic over that underlying forwarding by using Mutagen over `docker exec` (or reverse forwarding, or forwarding between two remotes and bridging them via your laptop, or loads of other crazy shenanigans).

I hope that clarifies things a bit. Ping me via email if you want an expanded comparison.

[0]: https://man7.org/linux/man-pages/man7/fanotify.7.html
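To make the "repetitive three-way Git merge" idea from the Git-like sync point more concrete, here is a minimal Python sketch. It is an illustration only, not Mutagen's code (Mutagen is written in Go): the function name `reconcile`, the representation of each endpoint as a dict of path to content digest, and the choice to leave conflicting paths at their last agreed state are all assumptions made for the example.

```python
def reconcile(ancestor, alpha, beta):
    """Three-way merge of path -> content-digest snapshots (hypothetical model).

    ancestor is the last state both endpoints agreed on. A missing key means
    the path does not exist. Returns (merged, conflicts).
    """
    merged, conflicts = {}, []
    for path in sorted(set(ancestor) | set(alpha) | set(beta)):
        old, a, b = ancestor.get(path), alpha.get(path), beta.get(path)
        if a == b:        # both sides agree (this includes "both deleted")
            result = a
        elif a == old:    # only beta changed, so take beta's version
            result = b
        elif b == old:    # only alpha changed, so take alpha's version
            result = a
        else:             # both changed in different ways: report a conflict
            conflicts.append(path)
            result = old  # assumption: keep the last agreed state, propagate nothing
        if result is not None:
            merged[path] = result
    return merged, conflicts


ancestor = {"main.c": "v1", "README": "v1", "old.h": "v1"}
alpha = {"main.c": "v2", "README": "v2", "old.h": "v1"}        # local edits
beta = {"main.c": "v1", "README": "v3", "build.log": "v1"}     # remote edits, old.h deleted
merged, conflicts = reconcile(ancestor, alpha, beta)
```

Here `main.c` takes the local edit, `build.log` is created from the remote side, `old.h` is deleted everywhere, and `README` is flagged as a conflict because both sides changed it differently since the ancestor.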


Also, how hard would it be to run mutagen on web assembly?


It should be possible, especially since Mutagen is already built for almost all of Go's supported architectures. The biggest issue would just be implementing the race-free filesystem traversal that's used on Windows and POSIX (or potentially living without it given that WASM would probably provide sufficient sandboxing). In any case, most of the work would take place in Mutagen's `filesystem` package, where those syscall equivalents would have to be figured out. The rest of Mutagen should compile without modification. You'd also need to define a transport for Mutagen to use, which would depend on the exact mode of operation you're looking at, but that's the easier problem to solve. Mutagen v0.15 is going to be focused on extensibility (including custom transports), so something like this may become a reality soon.
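As a rough idea of what "race-free filesystem traversal" with the POSIX *at functions can look like, here is a minimal Python sketch. It is not Mutagen's `filesystem` package: the helper names `open_beneath` and `demo` are hypothetical, it is POSIX-only (it relies on `dir_fd`, `O_DIRECTORY`, and `O_NOFOLLOW`), and it only covers opening a file for reading.

```python
import os
import tempfile


def open_beneath(root_fd, relative_path):
    """Open a file under root_fd without following any symbolic link."""
    parts = [p for p in relative_path.split("/") if p not in ("", ".")]
    if not parts or ".." in parts:
        raise ValueError(f"refusing path: {relative_path!r}")
    current = os.dup(root_fd)  # private copy so the caller's descriptor stays open
    try:
        for name in parts[:-1]:
            # Resolve each directory relative to its already-open parent
            # (openat semantics), so a swapped-in symlink cannot redirect us.
            child = os.open(name, os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW,
                            dir_fd=current)
            os.close(current)
            current = child
        return os.open(parts[-1], os.O_RDONLY | os.O_NOFOLLOW, dir_fd=current)
    finally:
        os.close(current)


def demo():
    """Build a throwaway tree with a symlink that points outside the root."""
    with tempfile.TemporaryDirectory() as base:
        root = os.path.join(base, "root")
        os.makedirs(os.path.join(root, "src"))
        with open(os.path.join(root, "src", "main.c"), "w") as f:
            f.write("int main(void) { return 0; }\n")
        with open(os.path.join(base, "secret"), "w") as f:
            f.write("outside the root\n")
        os.symlink(os.path.join(base, "secret"), os.path.join(root, "leak"))
        root_fd = os.open(root, os.O_RDONLY | os.O_DIRECTORY)
        try:
            with os.fdopen(open_beneath(root_fd, "src/main.c")) as f:
                content = f.read()
            try:
                os.close(open_beneath(root_fd, "leak"))
                blocked = False
            except OSError:  # the symlink is refused instead of followed
                blocked = True
        finally:
            os.close(root_fd)
        return content, blocked
```

Calling `demo()` reads the regular file inside the root and reports that the symlink escaping the root was refused. A WASM port would need equivalents of these per-directory open calls, or could lean on the sandbox instead, as the comment above suggests.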


Thank you for building this - it appears to solve a real problem that has been a pain in my butt for a couple of years.

What are the plans for funding the project? It looks like everything is OSS now, will there be any closed source elements in the future?


To be totally transparent: funding is a bit of an open question, although there's no risk of Mutagen disappearing in the short term.

Mutagen was actually part of YC's S19 batch, and it still has a significant amount of runway from that, but also some revenue from contracting work.

I really want to keep as much of Mutagen as FOSS as possible, ideally MIT licensed. I did just add a small portion of code under the SSPL, which I might experiment with dual licensing to SaaS embedders of Mutagen (since it's really only useful in cases where Mutagen is being embedded in other tooling), but even that I wanted to keep open source for other FOSS projects embedding Mutagen.

In the near term, I have some ideas about plugins and tooling that I want to build on top of Mutagen that will probably be closed source, but those will be separate entities from Mutagen itself and the aim will be to avoid compromising on any functionality that belongs in the core of Mutagen.


How robust is the handling? Can it tolerate unreliable network? Lost connections and hiccups?


It can tolerate disconnection very reliably, and it will automatically reconnect to synchronization and forwarding endpoints as soon as possible.

Its synchronization algorithm in particular (which is essentially a (repeated) three-way merge with rsync-style differential file transfers) is designed for safety and robust tracking of the changes that it has propagated. It's also capable of resuming file transfers once it reconnects.
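For readers unfamiliar with rsync-style differential transfers, the following Python sketch shows the basic idea: the receiver describes the blocks it already has, and the sender transmits only block references plus the bytes that are actually new. It is a simplified illustration, not Mutagen's or rsync's implementation. The names `signatures`, `delta`, and `patch` and the tiny block size are assumptions, and the weak checksum is recomputed for every window rather than rolled incrementally as real rsync does.

```python
import hashlib
import zlib

BLOCK = 4  # tiny block size so the example is easy to follow


def _key(chunk):
    """Weak plus strong checksum pair used to identify a block."""
    return (zlib.adler32(chunk), hashlib.sha256(chunk).digest())


def signatures(base):
    """Map the checksum pair of each full block in base to its block index."""
    table = {}
    for start in range(0, len(base) - BLOCK + 1, BLOCK):
        table.setdefault(_key(base[start:start + BLOCK]), start // BLOCK)
    return table


def delta(target, table):
    """Encode target as ('copy', block_index) and ('data', bytes) operations."""
    ops, literal, pos = [], bytearray(), 0
    while pos < len(target):
        window = target[pos:pos + BLOCK]
        if len(window) == BLOCK and _key(window) in table:
            if literal:  # flush pending new bytes before the block reference
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", table[_key(window)]))
            pos += BLOCK
        else:
            literal.append(target[pos])  # no match here: slide forward one byte
            pos += 1
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def patch(base, ops):
    """Rebuild the target from the old content plus the delta operations."""
    out = bytearray()
    for kind, value in ops:
        out += base[value * BLOCK:(value + 1) * BLOCK] if kind == "copy" else value
    return bytes(out)


base = b"abcdefghijkl"       # what the remote endpoint already has
target = b"abcdXYefghijkl"   # the edited local file
ops = delta(target, signatures(base))
```

In this example only the two inserted bytes travel as literal data; the rest of the file is expressed as references to blocks the other side already holds, and `patch(base, ops)` reproduces `target`.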


I'd love to hear more about anyone who has managed to use mutagen as a docker-compose replacement. I have so far been a bit let down by mutagen in that it doesn't really seem to be living up to the promise of being easier than using docker directly. Its file synchronization concepts also don't always seem to keep up with file watchers and hot reload.

I've been looking at something like Lando[0] as a result, and while I realize they're not the same thing, it seems like a better fit for the development-environment role, at least.

[0]: https://docs.lando.dev/getting-started/


Hey, sorry to hear that Mutagen didn't pan out for you. I'd be curious to hear if you've tried the Mutagen Compose[0] project or some other mechanism for Compose integration. I'm still iterating on the best approach for integrating Mutagen with various Docker workflows, so I'm always looking for feedback and experience reports.

[0]: https://mutagen.io/documentation/orchestration/compose


I'll take a second look at this to see if maybe we could optimize some bits in how we are using mutagen currently.

Is there any thought - given this is a development tool - to exposing container aliases as commands? Kinda like with lando, where you can `lando composer install` or `lando yarn turbo` or something like that. One of the other issues I have with these container setups in general is that for anything that has a heavy CLI component or interactivity there's no good abstraction around interacting with the CLI.


Feel free to reach out on Slack or GitHub issues or by email if there's anything I can help with.

I think framework-specific stuff like that is better handled by higher-level tooling, typically something that's just embedding Mutagen. DDEV, for example, adds that type of functionality for PHP-based projects, while embedding Mutagen underneath.

The goal with Mutagen itself is (ideally) to be invisible - only providing synchronization and forwarding. Even Mutagen Compose (being just a tweaked version of Docker Compose) probably falls too low in the stack to know anything about the specific frameworks being used atop it.

That being said, I think this is something that Compose wants to address, and that Mutagen Compose could inherit once it's added to the Compose Spec and implemented into Docker Compose. Things like profile specifications in Compose files are designed to encapsulate these types of operations (among other things).


OT, but I switched away from Lando over to plain docker-compose files. It was much easier to configure things. For example, trying to get HMR working in Lando was just a pain.

Lando is great for getting up and running quickly, but outside of the recipes, it's easier to just create your own docker-compose file.


I've been testing Mutagen's file sync through DDEV on a Drupal project and it's pretty nice!

The project is way more snappy than using NFS mounts on the Mac (which was already WAY faster than bind mounts), and DDEV has file ignores set up correctly so the project's media files aren't synced and disk usage isn't doubled.


Yeah, Randy's done a great job with the integration - it seems to work seamlessly for most users as far as I can tell. I'm really excited to get Mutagen v0.14 and beyond into DDEV, because support for fanotify[0] is being added and it's going to make Mutagen's Linux filesystem watching unbelievably faster for container-based development.

[0]: https://man7.org/linux/man-pages/man7/fanotify.7.html


I'm switching from using docker-compose + docker-sync[0] to using mutagen-compose and am recommending my coworkers across Windows, Intel Mac, and Apple Silicon Mac all do the same. For me mutagen-compose is solving the same file mount performance problems I was using docker-sync to solve, but mutagen is doing it much more reliably. I would have to restart docker-sync (and sometimes the docker daemon) multiple times a day. So far mutagen "just works".

[0]: https://github.com/EugenMayer/docker-sync


Glad to hear that it works for you. Don't hesitate to reach out on the Mutagen issue tracker or Slack channel if you run into any issues or questions - I'm always happy to help.


Same! Mutagen + DDEV have been a godsend for me


Same here - works wonders when developing Magento projects


Looks like a really handy tool and essentially the logical endpoint of all the various hacky manual file watcher / scripted copy type sync things we end up doing to work across environments if they were developed into a full solution.

We work quite hard to maintain unified production/dev config for our docker-compose setups. Mutagen looks like it could simplify some of these since a whole layer of complexity comes from trying to optionally map ports and volumes in such a way that it works for development, while not being inappropriate in test/production. If instead, mutagen can virtually connect / synchronize these it'll be a lot simpler.

I do wonder a bit how debugging works .... if something is running in a container, it's great I can hot sync my code there, but what happens when I put a breakpoint in the code in my IDE?


Mutagen and mutagen-compose + an EC2 instance work great as a replacement for Docker for Mac. This setup allows you to develop and run Intel images without a performance penalty, both on M1 and on Intel. I’m never going back to Docker for Mac again if I can help it.


An important thing to note is that the _worst_ part about Docker for Mac is the lousy disk IO performance for shared drives. This can be almost fully mitigated by using NFS mounts instead[0].

In general Docker for Mac is far from ideal - particularly considering you must allocate a set block of memory to the VM. If only it was the year of the Linux Desktop :grin:

[0] https://gist.github.com/vschoener/b0b5ae08625744a1f4ad414027...


Looks like it handles two-way sync very well, but could it keep more than two participants in sync? I'm sort of "abusing" Syncthing to sync Docker volumes (mostly just configuration and helper scripts) across several hosts. It works well, and the only things I'm missing are file ownership preservation and low-latency "instant"-feeling sync. csync2 gets it right, but it's fragile and needs a lot of babysitting. I'm wondering if Mutagen might be a suitable alternative.


One way you can do this with Mutagen is to use a hub-and-spoke topology, where one copy of the files (say the one on the laptop where you're coding) is the "hub," and then you have multiple two-way (or one-way) synchronization sessions going out to the various "spoke" endpoints. In the two-way case, changes on a spoke endpoint will be propagated out to the other endpoints by first propagating to the hub. Obviously, you can have contention in this case if you have ultra-high-frequency updates to the same sets of files, though the conflicting operations would have to occur on a smaller timescale than that of a single Mutagen synchronization cycle (which is typically somewhere around 100ms for an average codebase). Also, in the case of contention, Mutagen will simply display the conflicting files to you and you can delete the "losing" copy. You can also set auto-resolution behavior with Mutagen's `two-way-resolved` mode to let the hub win in conflicts, or use `one-way-replica` mode to make all the spoke endpoints replicas of the hub, or some combination of all these things, etc.
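The hub-and-spoke flow described above can be simulated in a few lines of Python. This is a toy model under stated assumptions, not Mutagen's behavior in detail: each endpoint is a dict of path to content, `sync_cycle` is a hypothetical helper standing in for one two-way session cycle, and conflicts are resolved in favor of the hub, loosely mirroring what the comment says about `two-way-resolved`.

```python
def sync_cycle(hub, spoke, ancestor):
    """One two-way cycle between the hub and a spoke; the hub wins conflicts.

    All three arguments are path -> content dicts and are updated in place.
    ancestor is the state this session last agreed on.
    """
    for path in set(hub) | set(spoke) | set(ancestor):
        old, h, s = ancestor.get(path), hub.get(path), spoke.get(path)
        # A change made only on the spoke propagates to the hub. In every
        # other case (hub changed, both changed, nothing changed) the hub's
        # state is kept.
        winner = s if (h == old and s != old) else h
        for side in (hub, spoke, ancestor):
            if winner is None:
                side.pop(path, None)
            else:
                side[path] = winner


hub = {"app.py": "v1"}
spokes = {"staging": {"app.py": "v1"}, "ci": {"app.py": "v1"}}
ancestors = {name: {"app.py": "v1"} for name in spokes}

spokes["staging"]["notes.txt"] = "from staging"  # edit made on one spoke
spokes["ci"]["app.py"] = "ci edit"               # contention: ci and the hub
hub["app.py"] = "v2"                             # both change app.py

for _ in range(2):  # two rounds so the result does not depend on session order
    for name, spoke in spokes.items():
        sync_cycle(hub, spoke, ancestors[name])
```

After the loop, the file created on `staging` has reached `ci` by way of the hub, and the contended `app.py` holds the hub's version on every endpoint.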


Huh! This sounds a lot like my hobby project to real-time sync files between my laptop & desktop:

https://github.com/stephenh/mirror

But dressed up to work with docker + "just" :-) being a professional/polished product, which is impressive. Neat!


Mutagen is great (I really wish Skaffold implemented it for its sync) and I want to point out that Jacob, the author of the project, is a really awesome and intelligent person - few people in our YC batch deserve the front-page more!


I have been using mutagen for a few years. It has been a life saver. I use it mostly to work on remote projects, mostly for development, rarely working on prod. So there I have my projects on some cloud linux instances and I work locally on windows, mostly phpstorm.

Having the webserver in the cloud and doing the heavy lifting is quite nice, especially if you work with phpstorm, which can be a resource monster for bigger projects, and working on remote projects with phpstorm built in sftp / file sync is a huge pain in the ass, really slow.

With mutagen it feels almost like I work locally. Thanks to the developer for this cool tool!


I've been using Mutagen for years to offload running docker containers during local development from my Mac to a Linux machine and it works great. I'm glad to see more people talking about it.


using mutagen as a workaround for the Windows WSL I/O slowness. great and rock solid tool.


its not that bad if you use the native filesystem, although then you have to actually get your files into the native filesystem


Using IDE in Windows and execution in wsl2. So IDE has files on Windows file system and I use mutagen to keep them in sync with wsl2 file system via ssh. It's actually instant, only downside is duplication but that's a small price to pay.


Another workaround is to use WSL2.


I am using WSL2 - issue exists on both versions.


You can still get improvements in some scenarios with mutagen on WSL2


Great name! waves

(not associated)


Just curious, when you develop for the Cloud where do you debug your code? In the cloud or on your work-station?


For integration tests and other code meant to run like external facing code I use my local machine for debugging. If my local machine is too slow (like running the whole integ test suite) I use a cloud VM.

For internal code or slow to build code I use a remote VM hooked up to the internal network. That allows me to build quickly and test in an integration environment.

I use a file synchronization tool like mutagen at work (for ninjas) and I use mutagen on my personal laptop for the same workflow.

I’m thinking of trying remote development from jetbrains soon though: https://www.jetbrains.com/remote-development/

I generally prefer using a remote vm for development and only using my laptop for git.

Also you can use spot VMs for development so it can be cheap to get a high CPU vm for building and debugging or a high mem VM for running data science scripts and debugging them.


VSCode makes it very easy to selectively install plugins on a remote that doesn’t have internet access and allows fairly seamless editing, debugging and terminal access to a remote with very little effort. But I’ve been looking for alternatives to VSCode’s remote editing experience because I’ve generally got about 6-8 projects that I keep open all the time and it can eat up a lot of RAM and CPU especially when I have several of them running and debugging stuff in tandem. The trick for me is that beyond file editing synchronization I need to be able to debug the code. There’s some promising ideas here, but I haven’t been able to make the jump yet. The jet brains link you shared is the most similar offering I’ve seen, but I’m hopeful distant.nvim or mutagen (plus whatever else I’d need) can be used to get more tools on a similar level of seamlessness.


So specifically for debugging I have used the remote interpreter feature in most jetbrains IDEs: https://www.jetbrains.com/help/pycharm/remote-debugging-with...

For web dev: https://www.jetbrains.com/help/webstorm/debugging-javascript...

I use jetbrains IDEs cause you get this stuff out the box without much fiddling with configs and it’s all integrated together.

For lower ram when working on multiple projects they did build a new ide for lightweight remote dev but I haven't tried it: https://www.jetbrains.com/fleet/

Could be worth a shot.


I'm asking because I'm contemplating doing some work on AWS. If code runs on AWS, doesn't that mean it makes lots of calls to the different AWS APIs? And in which environment do you then debug the code that makes those API calls? Is it even possible to debug it locally?


This looks interesting, but I'm not sure I correctly understand the use case or how exactly this might fit into my development workflow. Can someone enlighten me and elaborate on what this solves and what its use cases are? Thanks.


Does the compiler run on my workstation or in the cloud? I read through the docs but didn't find anything about this.


There's no compiler here - it's a tool for bidirectionally synchronizing files (specifically code, assets, build products, etc.) and forwarding network traffic to and from remote systems.

The idea behind Mutagen is to use your local tools (e.g. editor, IDE, browser, etc.) to work on remote hardware (where "remote" can mean anything you like, e.g. your local system, a Docker container (local or remote), a Raspberry Pi, a cloud server, etc. - or some combination of all of those).

The architecture is designed to be maximally flexible, so you can work between your local system and a remote system, between two remote systems, or in any other topology you want.


It sounds like sshfs on steroids?


With mutagen you have a 1:1 copy on your local machine and remote machine. If the connection drops, you still have all your files on, e.g., your laptop, whereas with sshfs your files are "gone" locally. Once your connection is back, mutagen will start syncing again and upload/download only the changed files, so there is again a 1:1 copy.


In the kind of setup you'd use with mutagen your dev tools and compilers are typically inside the container, while your code and editor are outside the container and being synced inside the container on every change. If the container is running on another machine/cloud service then yeah it could be that your compiler is running outside your machine. But much more likely it's still running as a container on your machine. The typical use case for this is making your dev environment accessible to anyone, just send them a dockerfile and they can almost perfectly reproduce it and get going with building your software (vs. having to laboriously setup the exact version of each dev tool and compiler you use).


Would it be possible to integrate this with a browser editor (like codemirror) to create something like replit?


I’m sure this feels like an accomplishment but in practice it provides nothing over git/source control as usual, and any number of reliable open or free hosted file sharing options.

What specifically is missing in the current ecosystem of electron state moving technologies that this solves? Why this instead of git and a Wireguard based mesh of devices (to offer my current setup as an example; there are others).


The goal with Mutagen is real-time remote development, not source control or complex networking. It's designed to facilitate making changes to code locally and then having those immediately propagated to a remote system for compilation, execution, etc. Basically, it's designed for the work you do between Git commits, saving you the need to do a full commit, push, pull, evaluate cycle to test your code on a remote system.

Basically, it's like VS Code's remote development extensions, but editor-agnostic and more flexible.


Speed. This works so fast that it improves the efficiency of local development with containers.



