Ma.gnolia Data is Gone For Good (datacenterknowledge.com)
43 points by Anon84 on Feb 19, 2009 | 34 comments



"It turns out that Ma.gnolia was pretty much a one-man operation, running on two Mac OS X servers and four Mac minis."

This should be a lesson: entrust your systems administration to people who have experience in this stuff. I'm not saying you have to find an expert and pay him or her expert prices; instead, you probably have a developer or admin friend who knows what it takes to implement a basic backup strategy that will at least keep you from losing all of your data.

I watched a little bit of the podcast and he said he was just syncing the db files... well, if it's InnoDB, that's not going to work. It says so in the MySQL docs. You know, in the section about backups.

The lesson here could be this: if you have a great idea, get it developed and out the door -- awesome. Now talk to someone with experience in systems planning. Don't just throw a bunch of Mac minis in a cabinet, bust out an rsync script and call it a day.
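For the record, a working nightly dump for an InnoDB database isn't much code. A rough sketch in Python (database name, paths, and credential handling are all made up for the example; the point is --single-transaction rather than copying live files):

    # Nightly MySQL dump sketch -- names and paths are invented.
    # --single-transaction gives a consistent snapshot of InnoDB tables without
    # locking the live site; rsyncing the raw files of a running server does not.
    import datetime
    import gzip
    import subprocess

    def nightly_dump(db="magnolia_production", out_dir="/var/backups/mysql"):
        stamp = datetime.date.today().isoformat()
        out_path = "%s/%s-%s.sql.gz" % (out_dir, db, stamp)
        dump = subprocess.Popen(
            ["mysqldump", "--single-transaction", "--quick", db],
            stdout=subprocess.PIPE)
        with gzip.open(out_path, "wb") as out:
            for chunk in iter(lambda: dump.stdout.read(64 * 1024), b""):
                out.write(chunk)
        if dump.wait() != 0:
            raise RuntimeError("mysqldump failed for %s" % db)
        return out_path

Ship the resulting file somewhere off the box and you at least have something you can restore from.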


I don't want to throw rocks, because I used to work for Larry (in fact, I wrote the original codebase for Ma.gnolia.com) and I like him.

But I distinctly remember talking about backups and how a raw sync of the innodb files wasn't really what we needed (it could sort of work in the original deploy of ma.gnolia, but would not scale as the site scaled). I'm kind of surprised that the community has been so tolerant of this failure. Personally, I'd be furious if I found out that a site I trusted lost all my data because they hadn't even tested their backup system.


> I'd be furious if I found out that a site I trusted lost all my data because they hadn't even tested their backup system

Exactly - first rule of backups - test them. For a database, it can be as simple as using your nightly backup to create your dev or test database or whatever. I guess it isn't feasible with a half-terabyte database every day, but to not have at least a several-week-old backup set is somewhat unforgivable.

On the other hand, I bet he will never ever make the same mistake again, and it will make a lot of people on here think long and hard about their own backup strategies ...


I've seen that kinda setup done nicely...

Daily automated DB backup. Daily automated DB restore to a separate server. (Which can also kinda double as a hot spare as required.)
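As a rough sketch, the restore half can be as small as this (host, database, and table names here are hypothetical):

    # Load last night's dump into a spare MySQL instance and run a sanity query.
    import subprocess

    def restore_and_check(dump_path, host="standby", db="restore_test"):
        # Drop and recreate the scratch database on the spare box
        # (the drop is allowed to fail if the database doesn't exist yet).
        subprocess.call(["mysqladmin", "-h", host, "-f", "drop", db])
        subprocess.check_call(["mysqladmin", "-h", host, "create", db])
        # Stream the gzipped dump straight into mysql.
        gunzip = subprocess.Popen(["gunzip", "-c", dump_path], stdout=subprocess.PIPE)
        subprocess.check_call(["mysql", "-h", host, db], stdin=gunzip.stdout)
        gunzip.wait()
        # A restore only "counts" if real rows come back out of it.
        rows = subprocess.check_output(
            ["mysql", "-h", host, "-N", "-e", "SELECT COUNT(*) FROM bookmarks", db])
        assert int(rows.strip()) > 0, "restore produced an empty bookmarks table"

If that count ever comes back zero, you find out the next morning, not the week the primary dies.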


You may not want to get into this, but why was the database half a terabyte anyway? That seems like a shitload of data for a site like ma.gnolia.

My best guess is the link screenshots were being stored in the DB.


Or could it be that half a gig was just a figure he came up with to demonstrate the (ostensible) difficulty in doing a proper backup?


Well, half a gig is trivial to back up. Half a terabyte is not. Assuming you meant to say 500 gigs, then yes, there is difficulty, but it's not uncharted territory. He had two Xserves -- perfect for replication.


It's not even close to uncharted territory. You can walk into any Apple store and buy a 1-terabyte backup device off the shelf.

OK so it won't handle your databases properly but that's not the point; 500G is not a troublesome amount of data these days.


No, the point is that backing up a 500G database is difficult, but not unpossible. Sorry if that wasn't clear.


Is this a good case for cloud computing services?


I think it doesn't matter what your infrastructure looks like; the principles are the same. You should not rely on a single service to store your data and/or do backups properly for you.

For example, I am building an application on top of EC2 (IaaS) and utilizing the nice EBS system with snapshots to S3.

This is still do-it-yourself in many ways: you still need to understand the database system, read those docs, and test restores, etc. It's not all that different from the Ma.gnolia situation. The main difference is that you get some nice properties from EBS and S3 redundancy.
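For what it's worth, the snapshot step itself is tiny. A minimal sketch with the current boto3 client (the 2009-era boto API looked different; the region and volume ID are placeholders):

    # Take a point-in-time EBS snapshot of the database volume.
    import datetime
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    def snapshot_db_volume(volume_id="vol-0123456789abcdef0"):
        desc = "nightly db snapshot %s" % datetime.date.today().isoformat()
        snap = ec2.create_snapshot(VolumeId=volume_id, Description=desc)
        # The snapshot data lands in S3 behind the scenes, but it's still a
        # single-provider copy of your data.
        return snap["SnapshotId"]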

But even with those nice properties, like hell am I going to hang the crux of my business on S3 snapshots only. Besides technical failures (there have been S3 data inconsistencies reported), what if they cancel our account? Etc. etc.

Another thing people label "cloud computing" is something like Google's AppEngine (PaaS). Here, you trust Google entirely to store the data safely in the "datastore", but it's only integrity that they try to supply, not backups. I have never heard of a way to get Google to take snapshots (which are important when you or an attacker delete something that shouldn't have been deleted, etc.).

You have to manually back up off of AppEngine, as in: http://code.google.com/appengine/articles/gae_backup_and_res...

But my real point is that you should do something like that anyhow.

Cloud computing is still a single point of failure because it's a "single company of failure". It may address the SPOF problems that come with buying and running hardware, but I'm seeing people treat that as a license to throw caution to the wind. Instead of "hard disks will fail, so know that you will need contingency plans", the mindset if you choose to go the cloud computing route should perhaps be "trust the redundancy they try to supply, but be sure to have a contingency plan, because there is always some possibility you will be hosed."

If you're building with something higher level than Google AppEngine, or Engine Yard, etc., the principle still applies. Even up at the consumer level: look at the people "building their bookmarks" at that higher level; they lost a lot of their hard work because they trusted a single company/system.

As a business, I don't ever want to put someone in the position where they realize they should not have trusted one system. That erodes any trust they had in our brand and is just shitty all around. And the only way to begin on that path is to not put that kind of trust in a single system yourself.


Only in the sense that "Someone robbed an unlocked house" is a good case for sniper teams. I mean, sure, it will work -- but as an interim measure, locking the door is a bit easier, almost as effective, and far less messy.


Here is a tip - don't hire consultants to talk to you about blog marketing and microformats etc. until you have the basics like backup and support in order.

Magnolia hired such consultants and was engaged with them for a while. I know because I am an advisor to a company that hired the same consultants - and I made the same argument to them, i.e. there are more basic things that need to be completed before good money is spent on people who love to talk about blogs, FOAF, OAuth, OpenID, syndication, online marketing, etc.

With the money Magnolia would have saved from these consultants, they could have really had their site in order. I also don't know how you can have so many employees, advisors and 2-3 consultants on tap and nobody mentions backup.

So other startup founders, forget about all that fancy 2.0 stuff - get the essentials in order first.


I have a wonderful data backup story.

I was backing up one nearly-full 1TB Seagate Freeagent to another one, and after 180GB of data had transferred, I knocked one of them over by mistake; it fell 15 cm to the floor and died!

I sent the broken Freeagent to Seagate data recovery, but unsure if it would be recoverable, I had this amazing brainwave.

Before I backed up one Freeagent to the other, I had quick-formatted it (note: quick, not full). So when one of the drives broke, the brainwave was: why not undelete the remaining data?

So I found a utility on the web to unformat and undelete, and the only data left from the copying was off a spare 3.5" drive, and I got back all but 2 weeks of data.

I was lucky. Now I have 3 Freeagents to serve my backup needs.

However, all my core data is now on some 2.5" drives, which are more reliable than a 3.5" Freeagent.


The lesson as always: Keep your Freeagent drive on the floor.

I've knocked mine over more than a few times, but since I've got it sitting on the floor, there have been no issues.


datacenterknowledge.com seems to be having some issues right now (the HN effect? ;) Here is the gist of the post:

     The social bookmarking service Ma.gnolia reports that 
     all of its user data was irretrievably lost in the 
     Jan. 30 database crash that knocked the service 
     offline. That means that users who were unable to 
     recover their bookmarks through publicly available 
     tools (including other social media sites and the 
     Google cache) have lost all their data.



     Ma.gnolia founder Larry Halff said last week that the 
     service’s MySQL database included nearly half a 
     terabyte of data. Yesterday Halff informed users that 
     a specialist had been unable to recover any data from 
     the corrupted hard drive. “Unfortunately, database 
     file recovery has been unsuccessful and I won’t be 
     able to recover members’ bookmarks from the Ma.gnolia 
     database,” he wrote.


Data Center Knowledge had another story (Exploding Servers) on Slashdot at the same time it was getting the HN traffic, so a double-whammy.


Meaning no disrespect to HN, but given that we seem to send about 4k visitors a day in my recent experience, getting linked by Slashdot and HN on the same day is a single-whammy.


The website is down (at least for me). But there's also some information on the official ma.gnolia site: http://ma.gnolia.com/


I seriously didn't expect this from magnolia.

How hard/expensive is it to get an automated daily backup of the DB and sync it up to something like Amazon S3? I run a similar pet project (pardon the shameless plug :) http://tagz.in ), albeit with only a dozen people actively using it. I maintain a 30-day rolling backup of daily db dumps, paying next to nothing for the backups. I've had a number of people ask how they can trust me if I decide to stop the whole thing, and part of the plan was to allow people to download their bookmarks in Netscape bookmarks format for at least 30 days from the day I announce it, just in case I decide to take it down or run out of money. Not that this is ever gonna happen (to be honest, I could probably run it for the next 6 months even if I lost my job this very day), but I feel contingencies like these need to be thought about well in advance when we're dealing with users' data (bookmarks in this case).

PS: I use PostgreSQL and have a script on my laptop which syncs my local db with the latest db snapshot every week. That isn't really a good idea, but it works for me, given the tiny scale I'm working at.
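For anyone curious, the whole daily-dump-plus-S3 arrangement is only a few lines of Python. A rough sketch of the shape of it (bucket, database, and path names are invented, and it uses the current boto3 client rather than what I actually run):

    # Daily pg_dump uploaded to S3, keeping a 30-day rolling window.
    import datetime
    import subprocess
    import boto3

    BUCKET = "tagz-db-backups"   # hypothetical bucket name
    KEEP_DAYS = 30
    s3 = boto3.client("s3")

    def daily_backup(db="tagz"):
        today = datetime.date.today()
        local = "/tmp/%s-%s.dump" % (db, today.isoformat())
        # -Fc writes a compressed, custom-format archive that pg_restore understands.
        subprocess.check_call(["pg_dump", "-Fc", "-f", local, db])
        s3.upload_file(local, BUCKET, "daily/" + local.split("/")[-1])
        # Prune anything that has aged out of the rolling window.
        cutoff = today - datetime.timedelta(days=KEEP_DAYS)
        listing = s3.list_objects_v2(Bucket=BUCKET, Prefix="daily/")
        for obj in listing.get("Contents", []):
            if obj["LastModified"].date() < cutoff:
                s3.delete_object(Bucket=BUCKET, Key=obj["Key"])

The weekly laptop sync in the PS above doubles as a crude restore test, which is better than nothing.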


Hardish. Expensive when you're talking about terabytes.

Regardless of that, though, ma.gnolia had backups; it's just that they were untested and were dutifully backing up the corruption that was being introduced at the software level.

It's that bit I'd like to know more about. Were they faithfully following the Rails Way and expecting the model to handle validations, instead of setting up the database to do double validation? How did these errors creep in, and how did they grow to be so catastrophic?


This corruption doesn't have anything to do with the "Rails Way". The corruption was at a much lower level than things like nil foreign keys, long fields, etc. that Rails validates. It was the filesystem that was corrupted, and whether he did validation in ActiveRecord, in the database, or both would not have made a difference, as far as I can tell.

Think about it for a second - if the corruption that caused this was something that could be validated using Ruby or in SQL declarations, it would be more than feasible to import that data into some format from which it could be recovered.


Didn't Reddit lose user account data at least once after start-up (as I seem to recall from personal experience)?

I know that MSN lost a lot of user account data soon after its start-up, way back in the 1990s. Preserving data for an online community is not easy, even for richly funded market entrants.


They had a laptop stolen that had a backup of their database. After it was stolen they let users know about the theft and that passwords were stored as plain text. I don't believe any data was ever lost.


But in every other major case I've heard of, it has been properly backed up.


I was a charter subscriber to MSN, and I never got my email address back. So I stopped using MSN.


Oh! I hadn't known.


I just feel so sorry for the Magnolia guy; what a waste of hard work!

He must be feeling like shit right now; I know I would.


Was Magnolia profitable? Did it run ads?


No, and only occasionally. For the approximately 2 years that I used ma.gnolia it ran ads for less than half of the time.


This makes me wonder about the backup systems of sites like TinyURL.

Imagine all those shortened URLs everywhere becoming useless!


A terrible tragedy of epic proportions, to be sure.


As if millions of tweets were suddenly silenced.


Was this a fried hard drive problem? If so, maybe SpinRite would help?



