Containers and persistent data (lwn.net)
49 points by vezzy-fnord on June 7, 2015 | 11 comments



It sounds like if you smash together Project Governor and Flocker you get Joyent's Manatee[1]. It uses ZooKeeper to automatically manage a PostgreSQL cluster. It also uses a separate ZFS dataset to hold the database, so backup and restore can be accomplished just using "zfs send" and "zfs recv". It's open-source[2] JavaScript if you want to check it out.

[1]: https://docs.joyent.com/sdc7/troubleshooting-sdc7/manatee
[2]: https://github.com/joyent/manatee
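
For reference, the send/receive flow is roughly the following (an untested Python sketch; the dataset and file names are invented, not what Manatee actually uses):

    import subprocess

    DATASET = "zones/db/postgres"              # hypothetical dataset holding the database
    SNAPSHOT = DATASET + "@backup-2015-06-07"  # snapshot name is arbitrary

    def backup(path):
        # Take a snapshot, then stream it to a file with "zfs send".
        subprocess.check_call(["zfs", "snapshot", SNAPSHOT])
        with open(path, "wb") as out:
            subprocess.check_call(["zfs", "send", SNAPSHOT], stdout=out)

    def restore(path, target):
        # Replay the stream into a new dataset with "zfs recv".
        with open(path, "rb") as src:
            subprocess.check_call(["zfs", "recv", target], stdin=src)

    backup("/var/tmp/postgres.zfs")
    restore("/var/tmp/postgres.zfs", "zones/db/postgres-restored")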


A bit off-topic: I always wondered why Docker recommends volumes. Given that they are removed when no container uses them anymore, they seem a poor choice for persistent data to me. Shared directories seem a much better choice imho. Would love to hear why this is not so though...

EDIT: looks like volumes are (now?) persistent: https://docs.docker.com/userguide/dockervolumes/ Still don't see any advantage in using them though... Am I missing something?


Docker volumes are persistent. Docker never removes anything unless you ask it to, specifically to allow you to decide what is persistent and what isn't. It certainly has never removed volumes when containers stop using them.

However, Docker lacks a good volume-management UI, so that fact is not always clear.

To answer your question, the reason volumes are useful is that they allow you to be explicit about which part of the container's filesystem should have a lifecycle of its own, across container upgrades.
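
Concretely, the lifecycle split looks something like this (an untested Python sketch driving the CLI; named volumes need a newer Docker than this thread's vintage, and all the names here are invented):

    import subprocess

    def sh(*args):
        subprocess.check_call(args)

    # A named volume that will outlive any single container.
    sh("docker", "volume", "create", "--name", "app-data")

    # First deployment: the database files live in the volume, not the container.
    sh("docker", "run", "-d", "--name", "db-v1",
       "-v", "app-data:/var/lib/postgresql/data", "postgres:9.3")

    # Upgrade: throw the container away, keep the volume.
    sh("docker", "rm", "-f", "db-v1")
    sh("docker", "run", "-d", "--name", "db-v2",
       "-v", "app-data:/var/lib/postgresql/data", "postgres:9.4")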


Seems to me that the only sane way for production is persistence-as-a-service.

Whether that means NFS, shared volumes with some kind of cluster fs -- or something else -- I don't think some hackish UX on top of folders located at a magic path on the Docker host is a particularly robust or elegant solution.

That said, if you're going to use folders mapped on the host anyway, integrating them with Docker (like volumes do) seems like a good idea.

Would much prefer some sane networked system (perhaps with some optimizations for loopback/local use), possibly blessing NFS/CIFS/9pfs for filesystems, and some kind of block storage ("S3"/DRBD/iSCSI) for, well, block storage.

Adding to that, are there any really working 9p client/server implementations for Linux? It looks like it should be beautiful and smell of flowers, but the reality seems a lot more half-hearted...
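
For what it's worth, the simplest version of that today is just mounting the network filesystem on the host and bind-mounting it into the container (untested Python sketch; the server and paths are made up):

    import subprocess

    def sh(*args):
        subprocess.check_call(args)

    # Mount the NFS export on the host...
    sh("mount", "-t", "nfs", "filer.example.com:/exports/appdata", "/mnt/appdata")

    # ...then bind-mount it into the container, so the data lives on the
    # network rather than at a magic path on the Docker host.
    sh("docker", "run", "-d", "--name", "app",
       "-v", "/mnt/appdata:/data", "example/app")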


I only use shared directories, because leaving all your data in a container seems scary to me. It also makes keeping backups easier. I just sync usernames and groups between the host and the containers and everything works fine.
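
For example, running the container process with the same uid/gid as the host user that owns the shared directory keeps ownership sane (untested Python sketch; the paths and image name are invented):

    import os
    import subprocess

    uid, gid = os.getuid(), os.getgid()
    subprocess.check_call([
        "docker", "run", "-d", "--name", "app",
        "-u", "%d:%d" % (uid, gid),    # match the host user's uid/gid
        "-v", "/srv/app-data:/data",   # shared directory on the host
        "example/app",
    ])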


I've yet to figure out which is better to use, since we're only experimenting with Docker for dev, but it seems like having a volume-based container makes versioning and getting people up to speed easier (we can commit a base data container image for people to work with).

I do use shared directories as well, though, for cross-compile environments and for dev work, to redeploy apps without rebuilding the container.
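
The data-container version of that is roughly the following (untested Python sketch; the image and container names are invented):

    import subprocess

    def sh(*args):
        subprocess.check_call(args)

    # A container whose only job is to declare and hold the /data volume.
    sh("docker", "create", "-v", "/data", "--name", "devdata", "example/dev-data")

    # Dev containers borrow the volume with --volumes-from.
    sh("docker", "run", "--rm", "--volumes-from", "devdata",
       "example/dev-env", "make", "test")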


I believe (although this might be worth checking) that they recommend using data containers in order to prevent uid/gid problems between the host and the containers.

https://medium.com/@ramangupta/why-docker-data-containers-ar...


Check out https://containership.io. It supports moving persistent data between servers in a cluster via https://github.com/containership/codexd, and also allows you to back up, restore, and migrate entire clustered databases between hosting providers. Currently there is support for Crate.io and Apache Cassandra, with more on the way.


All these problems went away once I switched to LXC. I lost so much time with Docker, which is definitely the wrong way to do it.

However, it is funny to see a whole industry emerging around artificially created problems.


At Pachyderm (github.com/pachyderm/pfs), we're also working on data-aware scheduling of Docker containers as it applies to analytics.


Watch out for some announcements around this topic at DockerCon in a couple of weeks -- we are close to Flocker 1.0 :)



