I really wish they had been more specific about the problems they faced. It sounds like part of the headache was that they had to write their own MongoDB to ElasticSearch connector. However, 10gen has now written one which they could use to avoid writing and maintaining their own:
http://blog.mongodb.org/post/29127828146/introducing-mongo-c...
Hi Ben, the ES connector was part of it. At a certain point we had our custom oplog follower working well. After so many years working with MongoDB, we just lost trust in it, to be honest. We ran into so many bugs, replication issues (two secondaries and no primary) and the only way to get what we needed out of it was to use all SSD storage (Even flashcache did not work).
I have to admit that this was for our use case, and I know that MongoDB works extremely well for some people. We still use it in Beanstalk for our deployment logs and have no complaints. It's a much smaller dataset though.
Really nice to see that 10gen made an ES connector. ES is still young, but it's amazing. One thing that I always respected about 10gen is how they maintain all of the drivers and code.
Interesting to find out what was going on behind the scenes at Postmark. As an (albeit small) customer, we noticed something was going on, but weren't sure exactly what. It's great to see the Postmark guys sharing the info and being transparent. Overall I really like the service despite the recent issues.
As for Cloudant, I had never heard of it either. Going by the website, is this effectively a managed CouchDB in the cloud? Also, just wondering what the migration path out of it is (or who provides an alternative service) in case something doesn't work out.
Cloudant exposes the Apache CouchDB API, so the easiest path for migration in/out of Cloudant is to use replication to/from another CouchDB-speaking service like Cloudant, Apache CouchDB, TouchDB (mobile) or PouchDB (browser). For most databases, you can also get a JSON dump of the document bodies by doing: curl 'https://username.cloudant.com/dbname/_changes?include_docs=true' | gzip > dbname.json.gz
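For anyone wanting to load such a dump somewhere else, here is a rough Python sketch of one way to do it. It assumes the feed was requested with include_docs=true (so each row carries a "doc" field) and that the target accepts the standard CouchDB _bulk_docs body; the function names and the sample data are made up for illustration, and this is not Cloudant's own tooling.

```python
import gzip
import json


def load_dump(path):
    """Read a gzip-compressed _changes dump such as dbname.json.gz."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return json.load(fh)


def docs_from_changes(feed):
    """Return the live document bodies from a parsed _changes response.

    Assumes the feed was fetched with include_docs=true; rows without a
    "doc" field and rows flagged as deleted are skipped.
    """
    docs = []
    for row in feed.get("results", []):
        if row.get("deleted"):
            continue  # tombstone for a deleted document
        doc = row.get("doc")
        if doc is not None:
            docs.append(doc)
    return docs


def bulk_docs_payload(docs):
    """Build a JSON body for POST /dbname/_bulk_docs on the target.

    new_edits=False asks the target to store the documents with their
    existing _rev values instead of generating new revisions.
    """
    return {"docs": docs, "new_edits": False}


if __name__ == "__main__":
    # Hypothetical sample in the shape of a _changes response.
    sample = {
        "results": [
            {"seq": 1, "id": "a", "changes": [{"rev": "1-x"}],
             "doc": {"_id": "a", "_rev": "1-x", "subject": "hello"}},
            {"seq": 2, "id": "b", "changes": [{"rev": "2-y"}],
             "deleted": True,
             "doc": {"_id": "b", "_rev": "2-y", "_deleted": True}},
        ],
        "last_seq": 2,
    }
    print(json.dumps(bulk_docs_payload(docs_from_changes(sample)), indent=2))
```

For a real database you would probably want to send the documents in batches rather than one giant request, and note that a dump like this does not carry attachments unless you ask for them separately. Replication is still the simpler route if both ends speak CouchDB.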
The trickier part of course is understanding how to best model data as you move between different database systems. There are some nice tools emerging in the no/new-sql ecosystem to help with that. Here's one that some of our mysql/postgres converted customers seem to like: http://www.foxweave.com/moving-on-premise-data-to-the-cloud-.... My understanding is that it works in both directions.
Chris from Postmark here. We definitely had some trouble over the last few weeks. I am hoping we are beyond that in terms of architecture. So far everything looks great.
Regarding lock-in, this was pretty important to us when we decided to migrate. We prefer to control everything in our data center, so if it came down to it, we wanted that option. Since the Cloudant API is compatible with CouchDB, the migration path is not that hard. Although, considering our impression so far and their expertise, I don't expect it will come to that point.
Thanks for the info Chris. I certainly hope you won't have to make another big move, but vendor lock-in is always a consideration. Also, I'm curious about network/other latency and bandwidth limits when it comes to accessing cloud-based database services. Did you do any measurements on that before or after switching?
...and regards to Natalie, Dana and the rest of the postmark team!
EDIT: noticed info about latency from cloudant on another comment.
We were running our own Bigcouch cluster and having all sorts of issues scaling to large numbers of databases (50k+), but after moving to Cloudant it handled it just fine.
It does cost more, but we don't worry about our database now and can focus on what we're actually good at.
Their search feature seems to be pretty good too. It is based on Lucene and is very flexible!
Cloudant seems interesting; I had never heard of them before. The pricing is intriguing (https://cloudant.com/pricing/). From what I understand, it's free for most small projects but could get very pricey very fast. Then again, handling millions of customers is far from trivial, so I'm sure it's worth the price. Thoughts on Cloudant pricing versus an in-house solution (or VPS)?
I might also note that it's worth contacting Cloudant directly if you are working on a larger-scale production app. We have the metered pricing listed on the website, and then dedicated, which is an "all you can eat" style service where you pay a flat monthly rate based on how many nodes are in your cluster. That is the system that Postmark is using, along with most of our other bigger customers.
NoSQL databases are becoming very commonplace now, and there are a lot more companies and hosted providers than there used to be. So naturally you are going to see more stories of people switching away from traditional SQL databases.
I work for a large enterprise that was previously an Oracle shop, and even they are starting to use MongoDB and CouchDB internally.
No. But all new development isn't being done on it.
My point is that stories like this are becoming the new norm especially with companies like 10gen and Datastax wining and dining CIO/GMs pretty hard. MongoDB in particular is getting a (misguided IMHO) reputation as a drop-in replacement for MySQL. And the last MongoDB training session had developers from banks, insurance companies, finance houses etc.
I am pretty sure that many of the next 10+ years of in-house built enterprise apps will be using NoSQL databases. Which will in turn then result in university courses changing thus affecting the next generation of programmers.
We looked at MongoDB for our back end at http://moj.io and preferred CouchDB. It's currently a toss-up between Iriscouch and Cloudant. Both seem like great offerings. We are particularly interested in the ability to sync to CouchDB on iOS/Android/Windows Phone in the future.
It isn't a problem if you're speaking specifically about Cloudant. They co-locate clusters next to your app as best they can (as Mike has mentioned elsewhere). That makes the network you're running on critical, however.
When we built out our message queue offering at SoftLayer (which uses Cloudant/BigCouch as a data store), latency out to the DB cluster simply never ended up being an issue. Whether DB features are properly leveraged, how I/O priority is tuned, whether the caching tier is behaving properly: far more often, these end up being the culprits in performance issues.
What you need to better understand when considering a cloud-based data store is who's running your cluster. When it comes to guys like Adam, Mike and Robert, you're going to be very hard-pressed to find better (and available to you) talent in the CouchDB world. We certainly appreciate having them around.
(Mike Miller, Cloudant). Exactly, we run cloudant clusters in nearly 20 data centers globally on many different providers (ec2, softlayer, rackspace, joyent, azure, hostway, ...) so we make sure that the data and application tiers are co-located.
Dealing with direct connections from mobile is more fun.
Seriously do. I spent a few minutes looking for the list of data centres. And when I couldn't find it I left with the assumption that it was going to be a slow, pricey mess.
Nothing is more important than transparency when you are providing core infrastructure.
Sorry, I need to correct that. I meant that shards are basically still primary / secondary with an arbiter for failover. We prefer horizontal nodes instead, like Cloudant, Elastic Search or Cassandra offer.
Space/resources was certainly a concern in this case. A disk-based solution with predictable latencies enables storage of far more data than something that requires holding the working set in RAM.
Grr.. I can't figure out if you can deploy Cloudant in the ec2 avail zone that you have your app in. Can you even deploy it on ec2?
I don't see anything stating answers to these questions clearly on their site. And they want my email to let me see tech docs! Are they run by former IBM staff?
Finally, if they are 'hosted and managed all day everyday' some place far from me, I dunno if I want to use it.