
SpiderOak cofounder here.

It's not really about the encryption directly; it's about how the database design has to change to support zero knowledge (the server can no longer do the database work). Things like garbage collection have to happen client side. This is described in detail here: https://spideroak.com/blog/20091026143000-why-and-how-spider...

FYI, when SpiderOak first starts running, like any backup software, it has to scan your filesystem to see what may have changed while it was shut down, and that can take a lot of IO and CPU if you have many folders and/or many small files (hundreds of thousands). (SpiderOak works with any folders you like, including external/network drives, not just one folder like Dropbox.)

In a profile, basically none of this CPU time is spent in the crypto module; most of it goes to unserializing the per-folder journal to compare state. SpiderOak does some optimizations with directory hashing to avoid having to open the journal to scan for specific changes in unchanged folders, but there are some situations where that optimization is defeated (for example, if you remount a file system with changed gids, inode IDs, timestamps, etc.).
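
A rough sketch of the directory-hashing idea, in Python. This is an illustration only, not SpiderOak's actual implementation: the function names, the choice of stat fields, and the `saved_fingerprints` dict are all assumptions. It hashes the cheap stat metadata of each folder's entries and only flags a folder for the expensive journal comparison when that hash differs from the one saved on the previous run.

```python
import hashlib
import os


def directory_fingerprint(path):
    """Hash cheap stat metadata for the direct entries of one folder.

    Hypothetical helper: the fields used here (inode, gid, size, mtime)
    are an assumption, chosen because the comment above names them as
    things that defeat the optimization when they change.
    """
    digest = hashlib.sha256()
    with os.scandir(path) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            st = entry.stat(follow_symlinks=False)
            record = (entry.name, st.st_ino, st.st_gid, st.st_size, st.st_mtime_ns)
            digest.update(repr(record).encode("utf-8"))
    return digest.hexdigest()


def folders_needing_journal(root, saved_fingerprints):
    """Yield folders whose fingerprint differs from the saved one.

    `saved_fingerprints` is a hypothetical dict of folder path -> hex digest
    from the previous run. Only the folders yielded here would need their
    journal opened and unserialized.
    """
    for dirpath, _dirnames, _filenames in os.walk(root):
        if saved_fingerprints.get(dirpath) != directory_fingerprint(dirpath):
            yield dirpath
```

Under these assumptions it is also easy to see why a remount hurts: if inode IDs, gids, or timestamps all shift, every fingerprint changes at once and every folder falls back to the slow path.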

Anyway, once it's running, it uses the operating system facilities on Linux/Mac/Windows to notice changes automatically, so rescanning the whole backup selection after startup is not usually necessary.

In any case, thanks for your interest in SpiderOak. :)




Hello Alan,

I have a quick question for you which might be of interest to others as well:

Why isn't SpiderOak open source yet?

I've read your FAQ answer[1] on this, but it doesn't really give a concrete explanation beyond "soon" and "licencing concerns", which is definitely disappointing.

I currently use my own EncFS-based sync solution, but I'd love to be able to use SpiderOak, or a third-party open source application that supports syncing with SpiderOak (if you'd ever consider exposing an API/protocol that allowed this).

This is the only reason I'm not using your service currently. I love the concept, though, and hope you'll take it into consideration. It's likely I'm not the only one with similar concerns.

[1]: https://spideroak.com/faq/questions/35/why_isnt_spideroak_op...


My hard drives are reasonably fast, so even with a quarter million files it doesn't take long to scan the filesystem. I have other backup software that scans the exact same folders and finishes in less than 30 seconds. So I suspect that it's your journaling/unserializing system that takes up the bulk of the startup time.

Fortunately, I only need to start up SpiderOak once in a while, because nowadays I put my computers into hibernation instead of shutting them down and starting them up again. I also noticed that the more often I use SpiderOak, the less time it takes to start up, presumably because there's less delta to process. Also, as you said, once SpiderOak is up and running, it's relatively fast. Rest assured that the performance issues, although annoying, have not dissuaded me from renewing my SpiderOak Plus (100GB) account once again last month.

I also like the fact that SpiderOak's "Queue" and "Log" screens tell me exactly what it's doing at any given moment. Waiting becomes a lot more tolerable when I know what I'm waiting for. I hate backup tools that assume the user is too dumb to understand what's going on behind the scenes. Even Wuala's Upload/Download queue never seems to work properly.



