DynamoDB One Year Later: Bigger, Better, and 85% Cheaper… (allthingsdistributed.com)
71 points by werner on March 8, 2013 | 19 comments



A) I don't get it. Is it a proprietary database maintained and provided only by Amazon? If that's the case, isn't it a terrible trade-off for a business to get locked into a proprietary database on a proprietary platform?

I agree it could save you a LOT of cash, but suppose you ran Mongo or PostgreSQL: if you aren't satisfied with provider A, you can dump them and migrate your data to provider B, not to mention all the modifications you can make on top of these open-source databases. Sure, you may face downtime, but that's not as bad as being locked into a proprietary database that you have no control over.

Someone please correct me if I'm wrong.

B) How does this compare to LinkedIn's Voldemort (which seems to be inspired by Amazon's Dynamo [1])? Voldemort has a lot of positive feedback from real-world applications running at scale, I guess.

[1] http://blog.linkedin.com/2009/03/20/project-voldemort-scalin...


It's a lot more similar to Cassandra than Voldemort, actually. Voldemort is more of a straight-up clone of the original Dynamo, while Cassandra and DynamoDB are second-generation, what-can-we-learn-from-Dynamo's-weak-points evolutions from there.

http://www.datastax.com/dev/blog/amazon-dynamodb


Thank you :)


It's a key-value store, so you don't really get locked in. Also, Cassandra is supposed to be pretty close if you ever want to run your own database cloud.


Further, Amazon provides a ton of sample code showing how their various services can be used to easily export data from DynamoDB, so there's very little friction there apart from paying for an EMR cluster of whatever size you prefer and waiting for the job to finish.
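(For small tables you don't even need the EMR route Amazon's samples show; a plain paginated Scan from a script does the job. A minimal sketch, assuming boto3 and hypothetical table/file names:)

    # Dump every item in a DynamoDB table as JSON lines.
    # "my-table" and "export.jsonl" are hypothetical names.
    import json
    import boto3

    table = boto3.resource("dynamodb").Table("my-table")

    with open("export.jsonl", "w") as out:
        resp = table.scan()
        while True:
            for item in resp["Items"]:
                out.write(json.dumps(item, default=str) + "\n")  # default=str handles Decimal
            if "LastEvaluatedKey" not in resp:
                break
            # Scan is paginated; continue from where the last page stopped.
            resp = table.scan(ExclusiveStartKey=resp["LastEvaluatedKey"])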


The reality is that, if you have enough data to warrant using DynamoDB, you're going to be locked in more by the "weight" of that data. Getting your data out of DynamoDB is going to be so expensive that it is the driving factor, not refactoring your code to use another DB.


In general, I agree with you. However, AWS services are different in the sense that Amazon has become like a large utility company, and relying on the long-term availability of these services seems like a safe bet.

Just renting EC2 instances like expensive VPSs does not make much sense, but using DynamoDB, SimpleDB, Elastic MapReduce, etc. as appropriate, in effect composing systems from your own and Amazon's services, seems like the modern way to do things.


DynamoDB is very constrained when it comes to features, which means a 100%-compatible alternative could easily be developed on top of another database. So the lock-in is not really a problem.


The 75% reduction in indexed storage cost is a big win, from my perspective.

At $0.25 / GB / month it's now roughly 2.5x the cost of EBS disk-based storage, and you're only charged for data as you use it, with no need to reserve it ahead of time.

The alternative of running Postgres or MongoDB on an EBS-backed EC2 instance just got a lot less attractive to me.

Way to go, Amazon!


Well, you get the 85% price reduction for throughput only if you reserve read/write capacity for three years.


It's really a trivial cost compared to the old DynamoDB price. It used to cost about $3600/month to get 5000 writes/sec. With the 1-year reservation, you'll break even in a bit more than 2 months compared to the old DynamoDB price, and in a bit more than 3 months compared to the new one. Any company using Dynamo for anything serious will probably rush to get the reserved capacity, since they've probably been spending a multiple of the reserved cost per month anyway.
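(For anyone who wants to sanity-check the break-even math, a quick sketch. The $3600/month figure is from above; the upfront fee and effective reserved monthly rate are placeholder assumptions, so plug in the real numbers from the AWS price sheet.)

    # Months until a 1-year reservation beats paying on-demand.
    ON_DEMAND_MONTHLY = 3600.0  # old on-demand cost for 5000 writes/sec (quoted above)
    UPFRONT = 6000.0            # ASSUMED upfront fee for the 1-year reservation
    RESERVED_MONTHLY = 900.0    # ASSUMED effective monthly rate once reserved

    def breakeven_months(on_demand, upfront, reserved):
        # Reserved wins once the monthly savings have repaid the upfront fee.
        return upfront / (on_demand - reserved)

    print("break-even after %.1f months"
          % breakeven_months(ON_DEMAND_MONTHLY, UPFRONT, RESERVED_MONTHLY))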


But 70% for 1 year is an amazing price break when you think about it.

Run your project for a month, analyse the usage, and reserve that same capacity for a whole year, for the price of only 3 more months' usage!

If your project has fairly consistent (or growing) usage, and is going to run for more than 3 months, it really doesn't need much more thought.


I'm really excited about the price reduction and the reserved-throughput possibilities. Over the past six months I slowly but surely migrated most of the core data storage for the startup I work for to DynamoDB, and I have been very satisfied with the performance. That said, I agree with the post that DynamoDB isn't suited to every use case, just like MySQL and other relational databases aren't suited to every use case. We are still using MySQL for log tables and other tables where we need to do complex queries over time ranges with tests on multiple properties.

The sweet spot is combining DynamoDB and a relational database into a seamless system that lets you use the power and scalability of DynamoDB when all you need is a simple lookup, and the complex queries of SQL when you need a detailed time-range report.
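(Roughly, that split looks like the sketch below. Not anyone's production code, just the shape of it, with hypothetical table names and schema, using boto3 and pymysql.)

    # Point lookups -> DynamoDB; time-range reporting -> MySQL.
    import boto3
    import pymysql

    users = boto3.resource("dynamodb").Table("users")  # hypothetical table

    def get_user(user_id):
        # Single-key fetch: DynamoDB's sweet spot.
        return users.get_item(Key={"user_id": user_id}).get("Item")

    def daily_signups(conn, start, end):
        # Multi-predicate time-range aggregation: SQL's sweet spot.
        with conn.cursor() as cur:
            cur.execute(
                """SELECT DATE(created_at), COUNT(*)
                   FROM signups
                   WHERE created_at BETWEEN %s AND %s AND source = %s
                   GROUP BY DATE(created_at)""",
                (start, end, "organic"),
            )
            return cur.fetchall()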


I worked with DynamoDB maybe 6 months ago and found some of the restrictions very limiting. In MongoDB the current max size for a document is 16 MB, and there are workarounds for bigger data. In DynamoDB a single item cannot exceed 64 KB, ever.

They used to support only unsigned integers, so happy hacking when storing Unix times before 1970. However, this seems to have been fixed.

Personally, I will hold off another year before using DynamoDB again. It takes time for a project to mature.


I don't find these restrictions limiting at all. I barely ever need more than 1 kB per object. If I need to store more than a few kB, I put it in S3 and store just a pointer to it in DynamoDB.

Why would I store unstructured blobs in DynamoDB anyway, when I have the much cheaper S3 at my disposal?
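(The pointer pattern is only a few lines. A minimal sketch, assuming boto3 and hypothetical bucket/table names:)

    # Big blob -> S3; small, indexable pointer -> DynamoDB.
    import boto3

    s3 = boto3.client("s3")
    table = boto3.resource("dynamodb").Table("documents")  # hypothetical

    def put_document(doc_id, blob):
        key = "documents/%s" % doc_id
        s3.put_object(Bucket="my-blob-bucket", Key=key, Body=blob)  # no 64 KB limit here
        table.put_item(Item={"doc_id": doc_id, "s3_key": key})      # well under 64 KB

    def get_document(doc_id):
        item = table.get_item(Key={"doc_id": doc_id})["Item"]
        return s3.get_object(Bucket="my-blob-bucket", Key=item["s3_key"])["Body"].read()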


When Amazon makes Dynamo installable and usable outside the walled city of AWS, I'll give it a try. Dynamo just gives me the heebie-jeebies of vendor lock-in.


TFA talks about the fact that SQL simply doesn't scale to volumes as big as the ones companies like Amazon need to deal with.

It's not the first time I've read this: Google also had similar problems and fixed them with a DB (or DBs?) of their own. In Google's case, AFAICT, it's also a gigantic key/value store on which monstrous map/reduce jobs are run.

My question is simple: up to what amount of data, and for which use cases, does SQL still work? In other words, at what point do you need something other than SQL, simply because SQL ain't cutting it anymore?


You can run SQL on TBs of data if you throw fast enough nodes at it; it's not purely a data-size issue, but more of a scale-out issue.

Things like joins and foreign-key constraints can be very slow on widely distributed relational databases, so you end up dropping more and more functionality from your SQL to keep performance acceptable.

Eventually you pull so much functionality out of your relational database that you end up running individual, very narrow queries and very short transactions, at which point you more closely mirror a key/value store than a traditional RDBMS like Oracle.
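(Concretely, that end state looks like replacing joins with application-side key lookups. A toy sketch, with an invented schema and sqlite3 standing in for the sharded stores:)

    # Instead of JOINing users and orders (painful across shards),
    # issue two narrow primary-key lookups from the application.
    import sqlite3

    conn = sqlite3.connect(":memory:")  # stand-in for two separately sharded stores
    conn.executescript("""
        CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
        INSERT INTO users  VALUES (1, 'ada');
        INSERT INTO orders VALUES (10, 1, 42.0);
    """)

    def order_with_user(order_id):
        # Two key-only queries -- effectively key/value GETs.
        order = conn.execute("SELECT * FROM orders WHERE id = ?", (order_id,)).fetchone()
        user  = conn.execute("SELECT * FROM users  WHERE id = ?", (order[1],)).fetchone()
        return order, user

    print(order_with_user(10))  # -> ((10, 1, 42.0), (1, 'ada'))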

Of course, when you're someone like Amazon, TBs of data are considered fairly small.


Every third word in this article is DynamoDB. I'm sure this is an interesting post, but I stopped reading due to the DynamoDB stuffing. I really don't need your brand hammered into my brain in order to realize how great your product is.



