MongoDB 1.7.5 Released with single server durability

xal · on Jan 27, 2011

I think making this an opt in feature is poor strategy. This should be opt out for the people who know what they are doing.

ehwizard · on Jan 27, 2011

Durability is a new feature, and should be treated as such. While it is currently not the default, that may change in a later release.

jacques_chester · on Jan 27, 2011

Durability is the new web scale.

rst · on Jan 27, 2011

I agree that it's poor technical strategy. It may not be poor marketing strategy --- there are a lot of folks who are attracted by gaudy performance numbers, and may not care (at least not until they get burned) that they're dependent on unsafe defaults.

kchodorow · on Jan 27, 2011

No one is ruling it out as being the default in the future. However 1) it was a pretty major code change and 2) this is the first release to make it publicly available.

Do you really want that enabled by default right off the bat?

snissn · on Jan 27, 2011

ya i kinda want 'durability' to be enabled by default, actually

true_religion · on Jan 27, 2011

I thought the whole idea behind MongoDB is to deploy at scale. In other words, deploy over 100s of boxes and don't worry about RAID, or single sever durability because at that scale any single server is liable to fail so you'll need to ensure platform-wide durability via replication.

MongoDBs killer feature was/is sharding. If your deployment isn't going to require sharding from the get go, then I'm not sure why you'd be attracted to it instead of any of the other more mature alternatives.

Adding single server durability just gives MongoDB more possible use cases (e.g. small site, single server, low overhead enviroment, etc.).

stoneg · on Jan 28, 2011

Currently (mongodb 1.6.x) sharding is a killer feature that might kill yourself. It's just a marketing propaganda.

I once tried switch to sharding to avoid high write lock ratio(MongoDB use DB level lock currently, only one write operation allowed at the same time). After sharding, MongoDB even don't know how to count my colleciton. db.mycollection.count() return values at random. mongorestore also failed in sharding setup, there's no error message when I was restoring millions of documents, but after it reported successfully restored, I checked the DB, no documents there.

We switched back to more reliable master/slave setup finally. So before trying sharding, you should watch this video http://nosql.mypopescu.com/post/1016320617/mongodb-is-web-sc... . It's so true to some extend.

true_religion · on Jan 28, 2011

I've seen the video but I didn't give it much credence.

But if what you say is generally true, then MongoDB is giving a terrible showing for software that's past beta.

stoneg · on Jan 28, 2011

It's true, we also experienced data loss (several millions of log data caused by dirty shutdown, it's data file is very fragile) issue like this thread http://groups.google.com/group/mongodb-user/browse_thread/th...

I found the following two posts very insightful：

1, http://www.mikealrogers.com/2010/07/mongodb-performance-dura...

"This (mongodb's write plolicy) is kind of like using UDP for data that you care about getting somewhere, it’s theoretically faster but the importance of making sure your data gets somewhere and is accessible is almost always more important."

2, http://www.paperplanes.de/2011/1/10/mongodb_and_data_durabil...

"It's okay to accept trade-offs with whatever database you choose to your own liking. However, in my opinion, the potential of losing all your data when you use kill -9 to stop it should not be one of them, nor should accepting that you always need a slave to achieve any level of durability. The problem is less with the fact that it's MongoDB's current way of doing persistence, it's with people implying that it's a seemingly good choice. I don't accept it as such."

dm_mongodb · on Jan 27, 2011

development agility is a goal of the project too, in addition to scale-out. i've seen a lot of happy users with a single server (or two) and that's all they need.

sibsibsib · on Jan 28, 2011

It's great for this. I'd love not having to worry about replication for small side projects.

thibaut_barrere · on Jan 27, 2011

Good point - I asked on the list to see what their opinion is on that point.

cheald · on Jan 27, 2011

It's worth noting that this is not a "stable" release. MongoDB releases with an odd minor version number (1.3, 1.5, 1.7) are unstable development releases, and should be treated as such. 1.8 will be the first official "stable" release with single-server durability.

e1ven · on Jan 27, 2011

This is great news, thank you!

Until 1.7.5, the advice seems to have been that ANY single server is vulnerable, always use replication sets to prevent losing data.

While I appreciate that point, and we do use ReplSets for every DB, in the real world problems happen.

    A circuit might explode in a DC, causing all the machines to go down. (Happened to me at ThePlanet)
    Our Secondary machine might go down, and while fixing it, the primary might fail. (Happened two weeks ago on dev machines)
    Our devs might run a test database on their Macbooks; While this isn't mission critical to stay up, potentially losing records means they need to restart all tests after an event, rather than resuming.

There's a million other places that this will be helpful. Yes, we should always spread things out as much as possible.. But I still use redundant power, RAID arrays, a journaled filesystem and in ideal times a ACID DB.

Here's hoping 1.8.0 will be out soon! ;)

lazyjeff · on Jan 27, 2011

I've always thought the MongoDB database was really good, but has poor drivers for node.js. I've been using node-mongodb-native, and it has basically no documentation. If you want to use a relatively new feature, you have to figure out how to convert from the MongoDB command line syntax in the MongoDB docs, to some syntax that the driver understands (which may sometimes be impossible). If you have poor drivers, even an awesome database won't make up for it.

mnutt · on Jan 27, 2011

It also requires way more nesting than it should, in my opinion. It would be nice to have a synchronous connect method so that I don't have to wrap my entire app in a connect callback.

And I don't see why 'collection' should take a callback either, unless you're in strict mode and querying mongo to see if it exists.

Some people have written higher level APIs on top of node-mongodb-native, two that I have seen are mongoose and mongous. I'm still evaluating them to see what their performance characteristics look like.

johngalt · on Jan 27, 2011

Forgive my ignorance, but I had no idea what single server durability was. So I had to look it up. I couldn't find any direct definitions, but from what I read it sounds like it does this:

Single server durability is a disk buffered list of pending writes so if the server reboots while in operation it can just resume where it left off. In larger deployments this risk is handled by having multiple servers running concurrently.

Does this sound right?

rit · on Jan 27, 2011

Take a look at the documentation for Journaling: http://www.mongodb.org/display/DOCS/Journaling

This covers what MongoDB is doing for durability.

johngalt · on Jan 27, 2011

Thanks that is exactly what I was looking for.

mike_esspe · on Jan 27, 2011

Anyone did write benchmarks with durability enabled?

rst · on Jan 27, 2011

This is a situation where the physical disk configuration may make a real difference. If the journal and the data files are on the same disk, a write-heavy load will be continually seeking between them. If they're on different disks, there's a lot less seeking, and that may reduce the performance penalty quite a bit (at the cost of extra hardware).

And of course, if you're on SSDs, this is all a non-issue, but that also still comes at a premium.

thibaut_barrere · on Jan 27, 2011

I didn't but I would be interested.

FWIW, here's the FAQ on performance with durability enabled:

How's performance?

Read performance should be the same. Write performance should be very good but there is some overhead over the non-durable version as the journal files must be written. If you find a case where there is a large difference in performance between running with and without --dur, please let us know so we can tune it. Additionally, some performancing tuning enhancements in this area are already queued for v1.8.1 and beyond.