Key points:

1. Dropbox moved from AWS to its own datacenters after 8 months of rigorous testing. They didn't exactly build an S3 clone, but something tailored to their needs, which they named Magic Pocket.

2. Dropbox still uses AWS for its European customers.

3. Dropbox hired a bunch of engineers from Facebook to build its own hardware heavily customised for data-storage and IOPS (naturally), viz. Diskotech. Some 8 Diskotech servers can store everything that humanity has ever written down.

4. Dropbox rewrote Magic Pocket in Golang, and then rewrote it again in Rust, to fit on their custom built machines.

5. No word on perf improvements, cost savings, stability, total number of servers, amount of data stored, or how the data was moved. (Edit: Dropbox has a blog post up: https://blogs.dropbox.com/tech/2016/03/magic-pocket-infrastr... )

6. Reminds people of Zynga... They did the same, and when the business plummeted, they went back to AWS.

7. Not a political move (in response to AWS' WorkDocs or CloudDrive), but purely an engineering one: Google and then Facebook succeeded by building their own data centers.




> Dropbox rewrote Magic Pocket in Golang, and then rewrote it again in Rust, to fit on their custom built machines.

Actually, full disclosure, we really just rewrote a couple of components in Rust. Most of Magic Pocket (the distributed storage system) is still written in golang.

> No word on perf improvements, cost savings, stability, total number of servers, amount of data stored, or how the data was moved.

Performance is 3-5x better at tail latencies. Cost savings are... dramatic. I can't be more specific there. Stability? S3 is very reliable, Magic Pocket is very reliable. I don't know if we can claim to have exceeded anything there yet, just because the project is so young, and S3's track record is long. But so far so good. Size? Exabytes of raw storage. Migration? Moving the data online was very tricky! Maybe we'll write a tech blog post at some point in the future about the migration.


Yup, Dropbox Infra is mostly a Go shop. It's our primary development language and we don't plan to switch off any time soon.

We're always about the right tool for the job tho, and there are definitely use cases where Rust makes a lot of sense. We've been really happy with it so far.


Without getting into any sort of religious war, what are typical use cases that make Rust a better choice?


I think @jamwt did a pretty good job of explaining this in https://news.ycombinator.com/item?id=11283688.

At a high level it's really a memory thing. There are a lot of great things about Rust but we're also mostly happy with Go's performance. Rust gave us some really big reductions in memory consumption (-> cheaper hardware) and was a good fit for our storage nodes where we're right down in the stack talking to the hardware.

Most of the storage system is written in Go and the only two components currently implemented in Rust are the code that runs on the storage boxes (we call this the OSD - Object Storage Device) and the "volume manager" processes which are the daemons that handle erasure coding for us and bulk data transfers. These are big components tho.
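For anyone unfamiliar with erasure coding, here is a minimal sketch of the simplest form: a single XOR parity block that can rebuild any one lost block. This is only an illustration of the idea, not the scheme Magic Pocket uses (production systems typically tolerate multiple failures), and the function names are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks followed by one parity block (their XOR)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stored_blocks, lost_index):
    """Rebuild the block at lost_index by XOR-ing all surviving blocks."""
    survivors = [b for i, b in enumerate(stored_blocks) if i != lost_index]
    return xor_blocks(survivors)


# Three equal-length data blocks plus one parity block.
data = [b"magi", b"cpoc", b"ket!"]
stored = encode(data)

# Pretend block 1 was lost and rebuild it from the other three.
rebuilt = recover(stored, 1)
```

The storage overhead here is one extra block for every three data blocks, versus a full extra copy for plain replication, which is the general reason erasure coding is attractive for bulk storage.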


The really big deal is memory management. There's no GC, and there's pretty precise memory control, and it's not the Shub-Niggurath that dealing with C++ is.

In fact, just think of Rust as C++ for mortals. :)


Curious about the specifics with how you interfaced Go with Rust.


Right now, they talk over RPC. In the future, we might embed Rust libraries as C libraries into Go and other languages.


Wasn't it Python earlier? I believe Dropbox even hired Guido van Rossum because of this.


What happened to Python?


We still have a ton of Python. James is referring to the infrastructure software layer--storage services, monitoring, etc. That particular part of Dropbox is mostly Go.

But our web controller code, for example, is millions of lines of Python. And we use Python in lots of other places as well.


I had to count carefully, but it appears you're claiming 12 9's of durability for Magic Pocket, which seems like a subtle jab at S3's 11 9's.

I think both durability numbers would be fine for customers, but I've also wondered about the math behind AWS S3 durability for a while, and how you would prove statistically that 11 9's was even possible.
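For a rough sense of scale, here is a minimal sketch that treats "N 9's" as a per-object annual survival probability of 1 - 10^-N and assumes object losses are independent. Both are simplifying assumptions, the function names are made up for illustration, and it says nothing about how either company actually derives its figure.

```python
from fractions import Fraction


def annual_loss_probability(nines):
    """Chance of losing one given object in a year at N nines of durability."""
    return Fraction(1, 10 ** nines)


def expected_annual_losses(nines, num_objects):
    """Expected number of objects lost per year, assuming independent losses."""
    return num_objects * annual_loss_probability(nines)


def years_per_lost_object(nines, num_objects):
    """Average number of years between single-object losses."""
    return 1 / expected_annual_losses(nines, num_objects)


# Ten million stored objects at 11 nines versus 12 nines.
years_at_11 = years_per_lost_object(11, 10 ** 7)
years_at_12 = years_per_lost_object(12, 10 ** 7)
```

Under this reading, ten million objects at 11 9's works out to one expected loss every 10,000 years, and one more 9 stretches that to 100,000 years. Numbers like that can't be observed directly, so they would have to come from a model of component failure and repair rates rather than from measurement.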


I'd imagine that if Backblaze is realizing great cost savings building their own storage pods from commodity off-the-shelf components, Dropbox's costs are even lower per GB. Well done.


Yev here -> One of the reasons we started opening up about our Backblaze Pods and Hard Drive stats was so others would do so. We approve this message :D


This is mostly accurate. A couple of comments:

> Dropbox still uses AWS for its European customers.

We haven't publicly launched EU storage yet but will be doing so later in the year.

> Dropbox hired a bunch of engineers from Facebook to build its own hardware heavily customised for data-storage and IOPS (naturally)

Facebook and Google and startup folks and people from random other places.

Our IOPS demands are reasonably modest on a per-disk basis which is why we skew heavily towards storage density. This is a different workload than you'd find at some of the other big players, hence a fairly custom solution.

> Not a political move

Definitely. AWS are great, we love working with them and will continue to do so.


How much of your cost savings was due to the lower IOPS requirement?

Also, was the S3 "infrequent access" tier a response to customers like you or was your special bulk pricing already taking into account your low IOPS demands?

Thanks!


Thanks a lot for summarizing it for those who could not read the article due to the Wired adblock-preventing nag screen!


At the end of the blog post, they mention that they're working on creating the ability for European companies to store data in Germany if requested. This is interesting as the implication is that European businesses don't see the US as a safe place to store their data anymore.


It's not about safety, it's about EU privacy laws and Safe Harbor. It is desirable to store the data in-country to avoid the hassle of requiring/finding a provider with an EU-approved Data Protection Agreement.


It would seem to me that if you're primarily in the business of storage, you should go down the stack as far as you can. Outsourcing to AWS leaves you at parity with anyone else who can do the same.

If Box is in the "Business Services" business, then perhaps it's different.



