How I encrypt my data in the cloud (robertclarke.com)
151 points by robertjfclarke on July 6, 2019 | hide | past | favorite | 99 comments



I wouldn't trust a closed-source tool like Boxcryptor for encryption of sensitive data. Cryptomator looks interesting, though it's still a relatively new tool, and I'd be hesitant to rely on it.

For my personal backups I use a combination of tar, pixz, and GnuPG. There's no fancy deduplication, and it's definitely not efficient, but it's relatively simple and I can restore individual files with ease.

I run a variation of the following command occasionally:

  tar -C / \
    --exclude='dev/*' \
    --exclude='home/*/.cache' \
    --exclude='lost+found' \
    --exclude='mnt/*' \
    --exclude='proc/*' \
    --exclude='run/*' \
    --exclude='sys/*' \
    --exclude='tmp/*' \
    --exclude='var/cache/*' \
    --exclude='var/lib/docker*' \
    -cvf - . | pixz | gpg2 -e -r $PGPID \
    | ssh host 'cat > /backup/root.tpxz.gpg'
Then I generate an encrypted index file for quick lookups, create checksum and PAR2 repair files, and upload all of it to Wasabi, while keeping a local copy.
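The checksum/PAR2 step looks roughly like this (file names and the redundancy level are just examples):

  sha256sum root.tpxz.gpg > root.tpxz.gpg.sha256
  par2 create -r10 root.tpxz.gpg.par2 root.tpxz.gpg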

Wasabi may not be the cheapest storage solution, but they have no egress charges, which makes recovery a non-issue. Good speeds and S3 compatibility are also great. I don't want to run an ad for them; I'm just a happy customer.


I'd never heard of pixz... cool! For anyone else like me, this comparison is worth a read: https://www.rootusers.com/gzip-vs-bzip2-vs-xz-performance-co...


Wasabi does have some not-entirely-true advertising on their web site, though. If you don't download (which you typically don't if you're doing backups), all cloud providers are cheaper than Wasabi on a per-TB-month basis. Google Coldline or Amazon Glacier, for example, are $4/TB-month (and Google is about to roll out a $1.23/TB-month "archive" option). Azure seems to have an "archive" option for $1/TB-month (the LRS Archive tier, which advertises the same "11 nines"; the price is so low there's got to be a catch).

The Wasabi offering seems to be equivalent to the "hot" storage options, which, I agree, is crazy expensive in the cloud if you do a lot of egress.


Indeed, Wasabi is not cheap for long-term archiving of large amounts of data. It's also potentially more expensive for short-term storage because of their 90 day minimum retention period, which they explain well in their FAQ.

But they're a great fit for my personal use case of well below 1TB of rarely accessed data, while also providing peace of mind that recovery is not an issue, so I'm OK with paying more for that. Plus, it feels good betting on an underdog. :)


Isn't this basically what Duplicity does? It uses GPG to encrypt files before sending them to the remote server.

https://www.nongnu.org/duplicity/


It's been a few years since I used Duplicity, and while I liked it, I prefer the Unixy one-thing-well approach of composing several smaller tools to achieve what I need.

The big thing I'm missing compared to Duplicity is incremental backups, but that's not a strong requirement for my use case, as bandwidth is cheap and I can delete the oldest N backups to free up space.

But I gain a lot from using a combination of tools: I can easily replace each component, and easily improve my workflow by adding more components, such as deduplication or incremental backups if needed.


Pretty much, and it stores additional indexes so it can do incremental backups. It supports many different storage targets, including S3, which I use.


I used to do it this way too, but recently switched to `encfs` and simply sync the encrypted directories, so only modified files (with encrypted filenames and content) get synced.


Interesting, though that wouldn't work for off-site / cloud backups, unless you could upload the encrypted EncFS volume somehow, or don't mind leaking some file information to your storage provider if you're uploading the underlying encrypted filesystem as-is.

I use EncFS for other purposes, but be aware of its security issues[1]. This report was influenced by the founder and CTO of Boxcryptor, so I'd take it with a grain of salt, but I'd still avoid using EncFS for any important data.

[1]: https://defuse.ca/audits/encfs.htm


My cursory audit of encfs (not written up) revealed that, if you use it in the natural way for backup (reverse mount an unencrypted directory, and rsync the virtual, encrypted file system that exposes), then it does not use per-file salt, so each file with the same contents is encrypted to the same ciphertext.

This was years ago. It might be fixed.
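For reference, the reverse-mount backup workflow I mean is roughly this (paths are illustrative):

  encfs --reverse ~/plain ~/encrypted-view   # expose an encrypted view of the plaintext tree
  rsync -a ~/encrypted-view/ backup-host:/backups/encfs/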


I use Cryptomator but it does "phone home" for version checks so I keep an old version of the installer just in case.


I guess a main selling point for Boxcryptor is easy access through mobile apps. Does your solution provide that?


Accessing backups via Termux and SSH on Android is easy enough for me, but certainly not user friendly in the popular sense.

Though I'm OK with trading some usability features for security and peace of mind.


Arq [1] works very well for me; it is compatible with various cloud providers as well as personal servers.

1. https://www.arqbackup.com/


I love Arq. I use it to back up to a local server via SFTP and remotely to B2 (which has very affordable storage). I have used Arq for many years, and regularly restore files through it.

On Linux, I use restic, which can also back up to B2 (and via SFTP, obviously). restic has a nice feature where you can mount the backups at some destination as a FUSE filesystem, which makes it very easy to go through backups and recover the bits you need.
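Mounting is something like this (the repository URL is illustrative):

  restic -r sftp:backup-host:/srv/restic-repo mount /mnt/restic
  # browse snapshots under /mnt/restic/snapshots/ and copy back whatever you need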


I have been using Arq for years and it has saved me numerous times. I back up to Amazon Drive, which is about $60/year, but Arq supports most of the major object store providers.


Arq backup is fantastic and will work with any SFTP endpoint - not just name-brand (proprietary) cloud services.


I’ve been using Arq for years and also find it to be excellent.


I use Borg and rsync.net [1]. I recently switched to Restic, which is pretty much the same as Borg but doesn't need a corresponding server; it can back up to dumb storage. It's been going well, and I think I prefer it to Borg.

[1] https://www.stavros.io/posts/holy-grail-backups/


You can also use sshfs in combination with borg to eliminate the need for the server to support borg.
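Roughly (the host and paths are just examples):

  sshfs backup-host:/srv/backups /mnt/remote-backups
  borg init --encryption=repokey /mnt/remote-backups/borg-repo
  borg create /mnt/remote-backups/borg-repo::{now} ~/documents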


That doesn't work as well because borg needs fast local access to the files in order to do deduplication etc. If you use SSHFS, it's going to be much slower, IIRC.


You're right. It does have an impact on performance. I haven't done a comparison, but the performance is still acceptable in my experience. It depends on the use case, but the local cache that borg maintains helps a lot in simple cases, because any unchanged files simply get skipped.


Ah, that's great news. It means you can use things like B2 as well.


Restic is a great choice also.

You can store files/backups/whatever encrypted with support for many popular endpoints (local filesystem, S3, Backblaze B2).
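With B2, for example, it's roughly this (the bucket name is a placeholder; credentials go in the B2_ACCOUNT_ID/B2_ACCOUNT_KEY environment variables):

  export B2_ACCOUNT_ID=... B2_ACCOUNT_KEY=...
  restic -r b2:my-backup-bucket:/restic init
  restic -r b2:my-backup-bucket:/restic backup ~/documents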


I use Nextcloud on Synology with WebDAV. It is encrypted on the filesystem level.

Then I use a bunch of free cloud providers (including TransIP STACK, which gave 1 TB for free at some point) together with Cryptomator [1], a cross-platform Java program (Windows, Linux, macOS, and Android; Cyberduck/Mountain Duck also support it). Its advantage is that it abstracts the filesystem and WebDAV: you see the decrypted data on a separate filesystem layer, so all your normal applications keep working. It is also FOSS and gratis.

Is it the best option? I don't know. I like the mentioned advantages. I've never used Arq, for example, but it not working on Linux and Android is a dealbreaker for me.

As for cold wallets, quoting the article:

> Offline wallets are the best way to go for storing a larger amount of cryptocurrency. I use ColdTi wallets to store multi-sig private keys. ColdTi is essentially just a slab of titanium that comes up with a punch set that can be used as a fire-proof seed backup. Very handy :)

These are useless in case of fire.

[1] Already mentioned multiple times in other posts at the time I wrote this. https://cryptomator.org


I just tried Boxcryptor on my Ubuntu workstation. It was burning through 40%+ CPU on all 4 cores while it was not being actively used! I don't know if it's a simple bug or whether it was just designed without an eye for resource usage.

On the other hand, I ended up learning about scrypt (written by Colin Percival, who works on FreeBSD a lot and runs Tarsnap), and restic, which to a layman appears to be a better Borg.


This is really weird, what might it be doing?


Considered rclone instead of boxcryptor? If you are worried about data security, I'd be wary of using a closed source encryption service.


I do pretty much the same but moved away from Boxcryptor to Cryptomator as it's open source :)


I'd never heard of Boxcryptor. Does anyone else use this? I'm not sure I understand why I need to sign up for an account to use it if its entire purpose is to do client-side encryption.

Also, it's not quite the same functionality, but this also reminds me: For a long time I've used Knox (by AgileBits, the same company that makes 1Password) for encrypted disk images, but they no longer sell or maintain it. It works just fine, but I should probably find a replacement that's still maintained, at least for security updates. Anyone know a good alternative? VeraCrypt (mentioned in the article) seems like one possibility.


Veracrypt is a great piece of software, but it isn't as easy to integrate across various platforms. Boxcryptor is great because they have iOS/Android/etc. apps.


Try Cryptomator instead! It's free and open source and does essentially the same thing (no account required).


You must sign up for an account to use Boxcryptor because it is paid software. That is the only reason as far as I can tell. As far as I know their servers do nothing for you once you have installed the software on your devices.


Boxcryptor[1] started out as an EncFS[2] implementation. At the time, EncFS was the only real good solution for file-based encryption. Solutions like TrueCrypt are disk-based, which means for cloud syncing solutions like Dropbox, one file -- the entire disk volume -- gets synced, and every time a file changes, the entire disk gets synced again. EncFS encrypts individual files, which works great for file-based syncing services.

Boxcryptor offered a client for macOS, Windows, Android, and iOS that worked really well, and if you needed Linux support, you could install EncFS and use it transparently on that platform. Boxcryptor charged for creating volumes with more advanced EncFS settings, but if you created the EncFS volume with those advanced settings using EncFS itself (e.g. on a Linux machine), the free version of Boxcryptor could read and write those volumes.

In 2013, the people who ran Boxcryptor wrote a second version that implemented a proprietary, unpublished encryption and/or file management scheme. They relegated the previous version to an unmaintained Boxcryptor Classic product and eventually removed it.[3] The proprietary version is what is offered today.

If you want Boxcryptor-like functionality today, the EncFS4win project[4] is a good solution for Windows. EncFS can be installed via Homebrew[5] on macOS and its volumes mounted via a shell script or a FUSE GUI manager. You can install EncFS on Linux and use gencfsm[6] as a GUI manager. The Windows, macOS, and Linux implementations all use FUSE to expose the encrypted files via a native filesystem interface. For Android, Encdroid provides an application browser for volumes; I am unaware of an iOS solution. I use the FUSE systems to keep certain sensitive cloud documents synced between my Windows, macOS, and Linux machines while still being able to edit and use them like normal files on those systems.
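On macOS or Linux, mounting from a shell script is essentially this (paths are examples):

  encfs ~/Dropbox/.encrypted ~/Private    # prompts for the volume password, mounts via FUSE
  # ... edit files under ~/Private as normal ...
  fusermount -u ~/Private                 # unmount on Linux; use `umount ~/Private` on macOS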

EncFS does have a few attack vectors that are slowly being addressed. It also suffers from the same problem as all cloud-synced file-based encryption systems: someone could restore your cloud files to a previous known version without your knowledge. The file-based encryption does not prevent what is in effect a replay attack. A research paper proposed CryFS[7] to address this problem, but the implementation is immature.

(edited for formatting)

1. https://www.boxcryptor.com/en/

2. https://vgough.github.io/encfs/

3. https://www.boxcryptor.com/en/blog/post/6-years-of-boxcrypto...

4. https://encfs.win/

5. https://formulae.brew.sh/formula/encfs

6. https://moritzmolch.com/apps/gencfsm/

7. https://www.cryfs.org/


I found it extraordinarily difficult to build my own encrypted cloud.

Options:

1. TrueCrypt container. CON: Upload takes too long.

2. eCryptfs. CON: I always had problems getting it to work. AFAIK it is not under active development anymore.

3. Run a FS in a mounted container (a filesystem in a file). Slow and not very stable. Under no circumstances use ext4 or something like it; if you really want to try this, use ZFS to avoid data corruption.

4. CryFS. Great idea, but slow as f... https://www.cryfs.org/comparison/

In the end I did not use the cloud as a second backup for a large system (10 TB), since I found no safe, fast, and reliable way.


"I found it extraordinary difficult to build your own encrypted cloud."

I am happy to report that this has been (recently) solved:

https://www.stavros.io/posts/holy-grail-backups/

"(the) holy grail of backups"


That is why I like ZFS: Its send/recv function can do block level syncing, so while the first upload will take a while, subsequent syncs will be much smaller.

Of course, unless you stand up your own VM with a ZFS partition, there are few cloud options for ZFS.
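An incremental sync is roughly this (pool, dataset, and host names are illustrative):

  zfs snapshot tank/data@2019-07-06
  zfs send -i tank/data@2019-07-05 tank/data@2019-07-06 | ssh backup-host zfs receive backup/data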


"... there are few cloud options for ZFS."

There is exactly one. You can ZFS send/recv to and from an rsync.net account that is enabled to do that:

https://arstechnica.com/information-technology/2015/12/rsync...

https://www.rsync.net/products/zfsintro.html

OR you can get a plain old rsync.net account and do a "dumb" sync to it and just configure ZFS snapshots on any schedule you like.

Ask about the "HN Discount".


> 4. CryFS. Great idea, but slow as

"CryFS solves all of these issues, but because of the increased security it is a bit slower. It is also a very new project and currently only available for Linux and Mac, but has experimental Windows support in the newest version. So if you don't need Windows support today, you can give it a try." https://www.cryfs.org/comparison/


I tried. I have a 100 Mbit line. With no encryption I get very good speeds; once I got CryFS working, throughput came down to about 1% of that and everything was very sluggish.

I also remember that my internet provider blocks many ports and I had to use my VPN to get the required ports working, which scaled things down a little more. I found CryFS not usable on Linux.


Did you consider rclone [1]?

[1] https://rclone.org/crypt/
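Its crypt backend wraps any existing remote and encrypts file names and contents before upload. A minimal config sketch (the remote and bucket names are placeholders; the password fields are normally filled in by `rclone config` and omitted here):

  [secret]
  type = crypt
  remote = s3remote:my-bucket/encrypted
  filename_encryption = standard

Then a sync through the crypt remote encrypts everything client-side:

  rclone sync ~/documents secret:documents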


I bought a synology, and used their solution.

Haven’t tried a full restore, but I occasionally browse old backups and download something as a spot check.

I would much prefer a turnkey open source solution that's commercially developed/supported by one firm (but not tied to that company's hosting).

Maybe someone will release something under the BSL, and build a healthy company out of it.


Interesting, but I wonder if this type of encryption ruins Dropbox's business model, since it keeps them from de-duping anything. I couldn't care less about Dropbox's business model... just curious.


Of course any kind of encryption makes a dent in Dropbox's margins, since Dropbox's model is to dedupe data across all its customers while charging everyone as if the space used were strictly for their data alone. But the follow-up question is how much personal (non-public and non-shared) data people store vs. how much publicly available or shared (though not necessarily free) data they store in their Dropbox accounts, for this to make enough of a dent.


I doubt Dropbox gains much from deduping between customers, but I'd love to see some data to the contrary. Last I knew they weren't sharing that, but most of my data is unique to me and anything I'd want to encrypt is unique to me.

I think they do gain a lot from selling 2TB to people using 30GB and selling additional users of the same <3TB of data to enterprises. (That's gotta be pretty sweet profit if they have takers - $12.50 more a month for zero additional storage and a little more data transfer.)


"$4 per TB/month" so 16 X 12 = $192+tax not a insignificant amount even in a first world country and probably a deal breaker for people living in poorer countries.


4TB is a lot of personal data - $192/y for that isn't cheap, but I wouldn't call it expensive either for a first world country. And I suppose syncing masses of personal data to an archive across the world is kind of a 1st world problem.


For comparison, Google One offers 2 TB for $10 a month and the next tier is 10 TB for $100 a month (the only downside is having to use Google Drive).


Note that AWS offers archival storage for $1/TB-month and Google has promised $1.23/TB-month later this year. These prices are competitive with raw storage, so the alternative is to go without backup.


Do you have a link for the AWS $1/TB-month and the Google option? I'm interested.


"S3 Glacier Deep Archive This new storage class for Amazon Simple Storage Service (S3) is designed for long-term data archival and is the lowest cost storage from any cloud provider. Priced from just $0.00099/GB-mo (less than one-tenth of one cent, or $1.01 per TB-mo), the cost is comparable to tape archival services. Data can be retrieved in 12 hours or less, and there will also be a bulk retrieval option that will allow you to inexpensively retrieve even petabytes of data within 48 hours." https://aws.amazon.com/about-aws/whats-new/2019/03/S3-glacie...


That's a fairly normal price. I pay $60/year for Amazon Drive, which has a 1TB limit (no charge for data transfer), that I don't even come close to approaching.


I think the best approach is to never save unencrypted data in the cloud: always encrypt on the client first. But that way we lose dedup capability, so we have to do everything (encryption, dedup, and compression) on the client side. I made an in-app file system dedicated to that purpose. https://github.com/zboxfs/zbox


> But by that way we lost dedup capability

This depends on how secret you want your data to be. You could use block-based encryption/compression and then back up the blocks; that way you can still dedup the encrypted result.

If anyone can inject data into your system and monitor the backup, they could learn when they hit collisions, but for most personal backup cases that's irrelevant.
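A crude sketch of the idea with openssl: the key is derived from the block contents (plus a secret), so identical blocks encrypt to identical ciphertext and can be deduped. That determinism is exactly the equality leak I mentioned, and this is not production-grade crypto:

  KEY=$(cat secret.key block.bin | sha256sum | cut -d' ' -f1)
  openssl enc -aes-256-cbc -K "$KEY" -iv 00000000000000000000000000000000 \
    -in block.bin -out block.enc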


I don't think encrypt-then-dedup is a safe way to protect data privacy. In this case, identical blocks need to produce the same ciphertext, which leaks your data patterns even though the data is encrypted. A better way, I think, is to use randomly seeded derived keys to encrypt each block, so that identical blocks' ciphertexts always differ.


Yes, if that's more important to you than dedup savings, then you should definitely do that.


If you encrypt in a way that enables the service to do dedupe, you are reusing IVs and encryption keys across items (bad) and leaking the information that two items are the same.


You must not reuse an IV between different blocks, but that does not stop you from using the same IV for the same block. Yes, you leak information about matching blocks; it's up to your use case whether you care about that.


Yes, sorry I was imprecise. I meant reusing IVs for equal blocks only, to be able to see duplicate data.


Dedup and encryption are 90° orthogonal. Encrypted data should look like uniform noise from every conceivable direction. Just the fact that blocks persist between encryption runs is leaking sigint.

I think a better approach, if you want to have versionable files, but encrypted outside of the client, would be to do something with diffs, similar to Git, or perhaps staged dockerfile builds, depending on whether it is binary or text data.


Turning two blocks of cleartext with the same content into equal blocks of cyphertext is not great for encryption: https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation...


I think this is a really compelling approach!

What sorts of applications have started to adopt this?

I also thought a different layer to start at would be SQLite databases, since I understand many mobile applications use those.

How do applications handle conflicts? It looks like files are versioned, but when is a new version created? On close?

Do you see GDPR or any other compelling event that will cause applications to consider this sort of cloud storage?


I think any application that needs to store confidential files, locally or remotely, can adopt it. Web and mobile apps might be the best fit at this moment.

As there is transaction control, conflict handling should be straightforward. That is, the thread that got the write lock can write the file exclusively; each write is a transaction, and a commit forms a new permanent version.

GDPR might be a good reason, but I think it can be more general: any app that needs to store confidential data can use this, whether the data is local or in the cloud.


Or use Tarsnap.com - Online backups for the truly paranoid


It uses content-dependent splitting of blocks (for deduplication), and I'm too paranoid to accept that block-sizes will not reveal anything about my stored data.

https://www.tarsnap.com/download/EuroBSDCon13.pdf



Time to advertise my open source encryption filesystem: https://github.com/netheril96/securefs


None of the tools described in the article are open source. Call me paranoid, but that doesn't pass my bar, both from a security POV and from a long-term recoverability POV.


Veracrypt [0] is open source and passed a security audit in the past (when it was forked from TrueCrypt). I use it for encrypted volumes and it has extra features for the truly "paranoid". Combine this with Backblaze and you've got yourself a good, secure backup for sensitive/personal data.

[0] https://www.veracrypt.fr/code/


I like encrypted ZFS snapshots. There are tools to automate the process of creating and uploading them and they handle incremental backup and restore painlessly.


I tried Boxcryptor but didn't like how it worked, so I built something else myself and have been using it in one of my (distributed) companies for over a year now. A side benefit is that by encrypting at the folder level, I can now give different permissions to different teams within the one Dropbox account. If anyone is interested in a beta when I release it, drop me a message; contact info is in my profile.


I've been working on my own encrypted and de-duplicated backup solution using libsodium. It's early days and progress is slow with limited spare time, but it works well enough for my own use already. I wanted to avoid any closed source or even lesser-used open-source encryption.

https://github.com/willtim/Atavachron


I just do Backblaze with a client-side key. Cheap and effective. I do wish they had a Linux daemon client. I'd pay more for that.


FYI, to restore your files with Backblaze, you'll have to give them your key. They then decrypt your files and leave them in an unencrypted zip file on their servers for you to download.


And also: their app is closed source, so I'm kind of already trusting them with my encryption key.


I use Cryptomator tied to a WebDAV instance on my server for most of the same use cases. The one thing I feel it’s missing is a gallery type feature for photos so you can see thumbnails and swipe left to right through the files. I’m not sure how this would work in practice with the encryption but it would really make it a killer encryption app for me.


It really is the only missing feature. It gets frustrating trying to show a specific photo from an event to a friend when I have to guess which number in a series a photo was.


I used to put my data on dropbox but I'm increasingly not comfortable putting my personal data on devices out of my control. Now I just do a daily local rsync backup. I also use syncthing to keep the documents, music and photos in sync between my laptop and phone (with all that, my phone's 128GB storage is less than 60% utilized).


Boxcryptor solves that problem for the most part... That being said, rsync works really well. I use it to copy files to and from the Synology.


I sync to a homeserver and back that up with Borg using a time4vps storage server.


I like Boxcryptor too. But I've never liked Truecrypt/Veracrypt. I only use Linux, so my machines and external drives all use LUKS.

Also, I use VMs a lot, so a LUKS-encrypted VDI is my Truecrypt/Veracrypt equivalent. One advantage is the ability to use dynamically allocated VDIs, so they can start small and grow as you add more data.


I'm considering a Cloudberry Labs (freeware/personal) + Backblaze B2 combination for encrypted cloud backups and would like to hear opinions specifically on the Cloudberry part. I love the command-line suggestions in this comment thread, but I need something that "just works" as a Windows service for my wife's PC.


I have been running Resilio Sync [1] (formerly BitTorrent Sync) for a while now, it also supports an 'encrypted peer' and has support for all the major platforms.

[1] https://www.resilio.com


I also use Resilio Sync for distributing backups with encrypted peers, but I use Borg to backup into the folder for compression, de-duplication, and really easy PIT recovery. I've also added rdiff-backup on one of the encrypted peers in case the write host gets hit with ransomware and the backup overwritten.



I recently did a small setup on my Synology consisting of a simple script that:

- Tars folders I want to backup

- Encrypts using GnuPG

- Uploads encrypted files to S3 (Glacier).

Simple and cheap cloud backup for me + nothing had to be installed on my NAS except for docker to run GnuPG and AWS CLI in containers.
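The script boils down to something like this (folder names, key ID, and bucket are placeholders; the Docker wrapping around gpg2 and the AWS CLI is omitted):

  tar -czf - /volume1/photos /volume1/documents \
    | gpg2 -e -r BACKUP_KEY_ID \
    > /tmp/backup-$(date +%F).tar.gz.gpg
  aws s3 cp /tmp/backup-$(date +%F).tar.gz.gpg s3://my-backup-bucket/ --storage-class GLACIER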


I'm not sure if it runs on Synology, but duplicity handles most of this OOTB, and handles incrementals as well.


I’ve heard getting your data recovered from a Synology NAS is difficult if not impossible.


They have an app "Cloud Sync" which replicates the entire NAS onto a cloud provider. I do a backup to Azure, as mentioned in the post, which makes it easy to recover in the event of something going haywire.


That's true of any hardware-level RAID. Do RAID in software or, better yet, use btrfs.


Have you? Can you share some stories about this?



Last time I checked, it's better to avoid fixed encrypted partitions. I didn't look into it any further, but uploading a 100 MB partition file every time for a 5 MB doc file is not practical.


I used to use boxcryptor + dropbox at one time but now I use sync.com which is like boxcryptor + dropbox rolled into one.


Arq? I know it’s Mac specific. What about an Arq -> google-cold-storage solution?


7-Zip with AES-256 and a random password in a password manager should be enough, I think.
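Something along these lines (the password would normally come straight from the password manager):

  7z a -t7z -mhe=on -p'correct horse battery staple' backup.7z ~/documents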


>I’m curious to hear your thoughts on my personal data security strategy

security tools = 10 opsec = 0


It would be more informative if you actually explained your position.


I hear you, but I don't rely on security through obscurity for /all/ of my data. ;)



