We received several reports from users who used a Dropbox feature called Selective Sync and couldn’t locate certain files they’d saved in Dropbox.
When we took a closer look, we discovered that older versions of the Dropbox client had introduced an issue affecting a small number of users whose Dropbox application shut down or restarted while users were applying Selective Sync settings.
In light of all of this, we've taken the following steps to ensure the Selective Sync bug won’t affect anyone else going forward:
1) we've patched our desktop client so this issue doesn't exist in Dropbox anymore;
2) we've made sure all our users are running an updated version of the Dropbox client; and
3) we've retired all affected versions of the Dropbox client so no one can use them.
We've also put additional testing in place to prevent this from happening in the future.
We’re very sorry about this issue and the trouble it might have caused. We’ll keep doing our best to ensure our users' data is always safe and available to them.
I was affected by this, but I realized it at the time.
I have an older laptop that I turned on. It was a work laptop a few years ago, linked to my dropbox account, etc. Since then I had added a bunch of things like a bunch of git repos to a folder included in dropbox.
I turned on that laptop and Dropbox started using 100% cpu after a few minutes. Then the fan kicked on and it was annoyingly loud so I looked at dropbox and saw it was chugging along in the repos directory. I went ahead and clicked on selective sync, unchecked repos, and left it alone for about 5 minutes.
It was still 100% cpu, so I killed the dropbox task and restarted it.
Minutes later, on another machine, I went to fetch from one of the repos and it had a gnarly error. So I went about investigating.
I found my way to the dropbox events tab (on the website - the desktop client doesn't have this feature) and saw an event where dropbox decided to delete 7,800 files.
I submitted a support request, but before they responded I had figured out it was (mostly) in the repos directory, which I fixed by simply deleting the repos and pulling from one of my servers.
Anyways. There's my real world run in with this bug.
This is exactly why sync is not a commodity. Dropbox is the very best at what they do, and even they have bugs. So when someone offers to sync your files for less, ask why.
The sync heavy lifting in Dropbox is handled by librsync, or at least was at one point. librsync is very mature open source software, and this bug pertains to a particular interaction between the Dropbox GUI and a feature (selective sync) which they have somewhat tacked on to the core library. Long story short, you don't necessarily have to be Dropbox or employ a couple hundred software engineers to get inotify + rsync working well.
> The sync heavy lifting in Dropbox is handled by librsync, or at least was at one point.
Nope, highly custom process that involves librsync very little. The sophistication of what has to be done to solve this problem well would probably surprise you.
He wasn't talking about rsync the binary but rather about librsync [0], the library that allows you to calculate a diff between two byte arrays without needing the cooperation of the other side -- ie you can calculate an efficient delta to send to remote without needing remote to be online at the moment you calculate, and without needing the complete old version, only a signature of it.
A good example use of it is the rdiff tool that allows you to do exactly what I said before. A better real-world example would be duplicity [1] or rdiff-backup [2] that use librsync to de-duplicate backups, without needing access to the whole previous value, only a small signature of it.
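The signature-then-delta idea described above can be sketched in a few lines of Python. This is only an illustration, not librsync's API: the function names `signature`, `delta` and `patch` and the tiny block size are assumptions made for the example. Real rsync-style tools pair a cheap rolling checksum with a strong hash so they don't have to re-hash every window, which this sketch skips for brevity.

```python
import hashlib

BLOCK = 4  # unrealistically small block size, chosen so the example is easy to follow


def signature(old: bytes, block: int = BLOCK) -> dict:
    """Map the hash of each fixed-size block of the old data to its block index."""
    sig = {}
    for i in range(0, len(old), block):
        digest = hashlib.sha256(old[i:i + block]).hexdigest()
        sig.setdefault(digest, i // block)
    return sig


def delta(new: bytes, sig: dict, block: int = BLOCK) -> list:
    """Describe `new` using only the signature: copy known blocks, send the rest literally."""
    ops = []
    literal = bytearray()
    pos = 0
    while pos < len(new):
        window = new[pos:pos + block]
        idx = sig.get(hashlib.sha256(window).hexdigest())
        if idx is not None:
            # Flush pending literal bytes, then reference the block the other side already has.
            if literal:
                ops.append(("literal", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", idx))
            pos += len(window)
        else:
            literal.append(new[pos])
            pos += 1
    if literal:
        ops.append(("literal", bytes(literal)))
    return ops


def patch(old: bytes, ops: list, block: int = BLOCK) -> bytes:
    """Rebuild the new data on the side that holds the old data."""
    out = bytearray()
    for kind, value in ops:
        if kind == "copy":
            out += old[value * block:(value + 1) * block]
        else:
            out += value
    return bytes(out)
```

The point of the split is that `delta` only ever sees the signature, never the old data itself, so the sender can compute what to transmit while the other side is offline.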
> This is exactly why sync is not a commodity. Dropbox is the very best at what they do,
The very best? I use OneDrive across all of my Windows machines and I don't even notice it exists; never had any problems. I just access all my files everywhere. If you buy a windows phone you even get a decent amount of space for free (15GB). (Though I subscribe to Office 365 so I have virtually unlimited space.)
It is a cliche, but the plural of anecdote is not data. I never had any problems with Dropbox, but had Office on OneDrive corrupt files. OneNote has also rendered some notes unreadable.
The bottom line is that errors happen. You should prepare for that and make backups.
Also, Dropbox are still among the very best when it comes to syncing. Many useful synchronization features are implemented by Dropbox, but not the competition. E.g., features that most competitors do not have:
- Modifying a large file on Dropbox will only resync modified chunks.
- Dropbox avoids re-uploads, both when uploading identical files and when moving files around.
- Dropbox does LAN sync. If a machine has to download a large file and another machine on the network has the same file, chunks are provided peer to peer. This makes using large files on multiple machines or in a team much faster.
- Dropbox does streaming sync. A machine can already download chunks while another machine is still uploading.
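To make the chunk-level behaviour in the list above concrete, here is a rough Python sketch of content-addressed chunk upload. It is not Dropbox's actual protocol, which isn't described here: `ChunkStore`, `sync_file` and the 4 MiB chunk size are all assumptions for illustration. The idea is just that a chunk is identified by its hash, so a chunk the server already holds never needs to be sent again.

```python
import hashlib

CHUNK = 4 * 1024 * 1024  # assumed chunk size for the sketch


class ChunkStore:
    """Hypothetical stand-in for the server side: chunks keyed by their SHA-256."""

    def __init__(self):
        self.chunks = {}
        self.uploaded_bytes = 0

    def has(self, digest: str) -> bool:
        return digest in self.chunks

    def put(self, digest: str, data: bytes) -> None:
        self.chunks[digest] = data
        self.uploaded_bytes += len(data)


def sync_file(data: bytes, store: ChunkStore, chunk: int = CHUNK) -> list:
    """Upload only the chunks the store lacks; return the file's ordered chunk list."""
    manifest = []
    for i in range(0, len(data), chunk):
        piece = data[i:i + chunk]
        digest = hashlib.sha256(piece).hexdigest()
        if not store.has(digest):
            store.put(digest, piece)
        manifest.append(digest)
    return manifest
```

With this layout, editing one part of a large file uploads only the chunks that changed, and an identical or moved file uploads nothing, because every chunk in its manifest is already in the store.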
Sure, OneDrive and Google Drive do have many useful functions that Dropbox does not have, such as including complete office suites. But for the original task, file syncing, Dropbox is still pretty much unbeaten.
I think you're attributing too much of Dropbox's success to simple technical reliability. It really isn't that difficult a problem, and many services and projects do it right. I have an rsync script that has been syncing my files reliably to an offsite location for 6 years.
It's certain that Dropbox has a high quality syncing service, but there are other factors. Think, for example, how this case was handled: a fault in their core product, a breach of user trust in their service, and they understood that it needed more than a technical solution. None of this was part of their core sync reliability: it was part of a broader quality, which is closer to their true reason for success.
> I think you're attributing too much of Dropbox's success to simple technical reliability.
I did not say anything about their reasons for success. Only what the technical advantages are compared to some of the other file sync services.
> It really isn't that difficult a problem,
Difficult enough that some of its useful features are not matched by other services yet.
> I have an rsync script that has been syncing my files reliably to an offsite location for 6 years.
That's great. But that is one-way sync and not something my parents could use. Dropbox is successful because they made sync technology that is relatively flawless to the average user. Also, there is a network effect.
In the longer term, it will be interesting to see if they survive, since Microsoft and Google have been undercutting prices heavily, and as far as I know Dropbox has no online office suite on the horizon (only Microsoft Office integration for business users).
> Dropbox does LAN sync. If a machine has to download a large file and another machine on the network has the same file, chunks are provided peer to peer. This makes using large files on multiple machines or in a team much faster.
Doesn't the file have to exist on Dropbox's servers before it can be synced to another computer on the same LAN? The last time I looked into it, this was the case.
We use OneDrive for business at work (I'm not sure it's exactly the same as OneDrive under the hood though) and I can't say I had the same experience.
- Unlike Dropbox, there is no Linux/OS X client (the OS X client only works with OneDrive, not OneDrive for Business), so it's unusable on servers or in environments with a lot of OS X machines (so most enterprise usage).
- There is a list of approved file types, and if your file is not on the list, it just refuses to sync it. This is really annoying because I need to create zip files all the time to work around this restriction.
- OneDrive modifies certain file types (like Word documents, but also others) to add metadata to them, so you never know if the file you are getting is exactly the same as the one you synchronized.
- We experienced bugs in the permission system which destroyed a couple of files (thankfully we had backups).
The only positive thing with OneDrive is that it's integrated with Office 365 (it's the equivalent to Google Drive for Gmail) so you can preview files directly within your web browser on Office 365 (when it works because sometimes you just can't).
If I had a choice, I would return to Dropbox without any hesitation.
It sounds like you're using OneDrive for Business (aka SkyDrive Pro), which is something like a hosted SharePoint solution. Regular OneDrive is comparable to Dropbox, will sync any files you want, doesn't add metadata to Office documents, etc. The naming is incredibly confusing since OneDrive for Business is just about nothing like OneDrive.
I thought there were many robust open source solutions out there and what made dropbox the winner was not that it is able to reliably sync, but that they made it easy to install and sync.
Dropbox is generally known for reliability compared with its competitors; however, issues like this one and the highly embarrassing "anyone can log in to anyone else's account" bug of 2011 definitely cast some doubt on using it as a sole storage option, or as a place to store any sensitive data.
Most medium-sized and large companies refuse to touch Dropbox due to these reasons, especially in the financial and medical space.
That said, I still employ it for personal use and like the product in general.
I use selective sync aggressively, and have for as long as I can remember, yet I haven't received an email like this, so it may only affect specific users.
It appears the circumstance is more specific than simply using selective sync.
> This problem occurred when the Dropbox desktop application shut down or restarted while users were applying Selective Sync settings.
So, you must be in the midst of applying selective sync settings while the app shuts down or restarts. Although I'm not sure what they mean when they say, "while users were applying selective sync settings." I'm not sure if this means:
A) Changes made in the selection dialog box, but not committed (by clicking OK).
or
B) Changes committed, but still syncing.
The former is an edge case; the latter, not so much.
Dropbox should have understood that people are using it as a backup service. I mean, Carousel and other use cases sort of encourage and imply this. With that in mind, it's baffling they didn't have any proper backups for user data.
It is a shame that they don't offer the unlimited packrat option anymore. It's still not backup, but at the very least people would be able to recover files in such cases.
Also, if I understand correctly, Google Drive has a better policy here: removed files are just placed in the trash until you remove them from the trash. Of course, trash takes space up as well, but it protects better against such cases.
I guess Dropbox is trying to maximize its profits with its 'remove after 30 days' policy.
I have been using the "packrat" feature for more than a year, and Dropbox sent me a similar notification today to tell me they lost several thousand files, 816 of which could not be restored. They were "lost" around 8 months ago, so "packrat" didn't save me at all.
As it turns out, I have other backups of most of the files, and the rest of them weren't important. So I was lucky. Still, my confidence in the product is unlikely to recover.
I want to note that I had been aware of the "dropbox is not backup" chorus, but that argument usually is just "sync is not backup", which is sort of obvious. The packrat feature pretty much addressed this issue, so dropbox with packrat WAS a backup solution. So the lesson here is never to rely on any ONE backup provider.
I use packrat - so the files they suggest that I might have had deleted were recoverable (though none of them were an error).
If dropbox kept backups like you're suggesting, people would be complaining about how you can't delete damning files from them and that law enforcement was abusing this.
I disagree. A local tape NAS is also not a good backup: (god forbid) your house can burn down. A remote service like Backblaze is also not a good backup: they could go bankrupt, or a software error could corrupt all your backups.
A good backup policy uses a mixture of onsite and offsite, and Dropbox can be a (convenient) part of that. E.g., I store (non-sensitive) files in Dropbox, which gives me a certain period of undelete possibilities. My Dropbox folder is backed up on a local time machine backup. Critical parts of my Dropbox are also backed up using tarsnap, etc.
A good backup policy diversifies, and Dropbox can be part of that.
Sorry, but this is just wrong. Any system that offers multi user sync is inherently more complex than it needs to be as a backup solution. A backup should generally be:
- convenient enough that you do it without thinking about it.
- technically as simple as possible, so it's easy to understand and review.
- secure.
Dropbox fulfills the first point, but not the second, and the third is debatable. SpiderOak, as a counterexample, is just as convenient, has a pure incremental backup mode and is client-side encrypted, the gold standard of security.
You really don't need any of those to be a solid backup system. What matters is that you make backups, the backups last long enough, and there's testing of backups.
Also from what I've seen spideroak is significantly more complex than dropbox.
It can be a backup client, as long as it's not the only one. I use OneDrive + a local NAS as my backup. Both have advantages and disadvantages, but having at least one copy locally and one copy in the cloud is quite safe. Obviously these two must not be synced in realtime (because if the sync software screws something up, then you might lose everything) -- so I just keep everything on OneDrive, and every couple of weeks manually copy the files (this is still not 100% safe because of possible file corruption issues, but it's OK for me).
We provide a self hosted sync offering for businesses. It is currently used by close to 1000 businesses. It took us almost 18 months from our launch to get the sync right. There are simply too many edge cases, and the development team needs to work closely with the customers to identify and fix them. Even then our complexity is much less than Dropbox's. Our largest customer has 10,000 users.
Short story: if you plan to develop a sync product from scratch, be prepared to spend at least 2 years or hire core developers from the Dropbox sync team. Even now Dropbox has issues handling large numbers of small files. Try to stuff 200,000 to 300,000 files in and see how it works.
I was notified of the potential data loss and checked my data on the 'personalized web page.' Of the 12,000 files that may have been affected, I found only a subfolder of a few dozen photos that may've been removed.
The problem is that when I clicked 'restore all' from within the subfolder, Dropbox restored all 12,000 files rather than just the files within the folder.
Note to DB's UX team: when you place a Restore All checkbox above the lefthand file selection column, it means 'select and restore all files on the page', not 'lift the roof off my house and dump in all the shit I spent months decluttering.'
Ha, dropbox deleted my files the other day, presumably due to this bug. I ranted on Twitter and they came back with "the dropbox client can't delete files". Hmmmm. Seems I was correct :-/
I have all my digital life on dropbox, a few hundred gig.
One of my greatest fears is that thousands of files might disappear without me noticing for years.
I use selective sync, and twice I was looking for something that had disappeared and had to restore it. I assumed maybe my wife accidentally deleted some files, but maybe it was dropbox?
Perhaps you can have a cron job run against your local dropbox folder(s) and do an "md5deep" reporting only differences (or sha1deep or whatever, perhaps test which uses least resources, maybe nice it heavily too). Then you could have the output report saved to a folder (not a dropbox one!). Perhaps add another job to email/alert you if the "count" of lines in the report is greater than a certain number? Crude, for sure.
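A rough Python take on the same idea, for anyone who would rather not depend on md5deep: hash every file under the folder, compare against the previous run, and alert if too many files went missing or changed. This is a sketch using only the standard library, standing in for md5deep rather than reproducing its output. The function names and the threshold of 100 are made up for illustration, and saving the manifest between runs (somewhere outside Dropbox) is left out.

```python
import hashlib
import os


def build_manifest(root: str) -> dict:
    """Return {relative path: SHA-1 hex digest} for every file under root."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            digest = hashlib.sha1()
            with open(path, "rb") as fh:
                # Read in 1 MiB blocks so large files don't have to fit in memory.
                for block in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(block)
            manifest[os.path.relpath(path, root)] = digest.hexdigest()
    return manifest


def diff_manifests(old: dict, new: dict) -> dict:
    """Report files that disappeared, appeared, or changed content between two runs."""
    return {
        "missing": sorted(set(old) - set(new)),
        "added": sorted(set(new) - set(old)),
        "changed": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }


def should_alert(report: dict, threshold: int = 100) -> bool:
    """Crude alarm: too many files vanished or changed since the last run."""
    return len(report["missing"]) + len(report["changed"]) > threshold
```

Run from cron under `nice`, this would flag a mass deletion like the ones described in this thread, though it can't tell a deliberate cleanup from a sync bug.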
Don't put all your eggs in one basket. Keep all the same data stored on another service, as well as on a physical backup that you keep yourself. If it's really important data you might keep a copy in a safe deposit box.
This manifested for me as a large number of "conflicted copies" everywhere inside my main visual studio solution. Thankfully, source control saved the day... But I was really annoyed at Dropbox for a little while.
As founder of cloudHQ, I have to jump into this.
Software products will have bugs. And people will make mistakes. We are all human.
So even if you store data in Dropbox, it is smart to have one extra copy in some other cloud storage. Like Google Drive. Or Box. Or Egnyte. So if data is deleted in Dropbox (accidentally, maliciously, or due to a bug) you can restore it from the other cloud.
I stopped using Dropbox because of this. I booted my system up one day and a ton of my files were deleted (locally). Luckily, this didn't affect the sync on my other systems.
There may be occasional bugs, but I don't know of any cases when Gmail actually lost emails. According to this talk [0], the bug referred to by that article caused some users' emails to be temporarily inaccessible, but all the emails were eventually recovered.