Amazon DocumentDB, with MongoDB compatibility

ahachete · on Jan 10, 2019

My bet is that it is built on top of Aurora PostgreSQL. By looking at the "Limits" section (https://docs.aws.amazon.com/documentdb/latest/developerguide...), identifiers are limited to 63 characters and the same characters that PostgreSQL limits identifiers to; and a collection size limit of 32TB, coincidentally maximum PostgreSQL table size.

Edit: I can confirm: does not allow the UTF-8 null character in strings: https://docs.aws.amazon.com/documentdb/latest/developerguide... ... It is written on top of PostgreSQL.

talawahtech · on Jan 10, 2019

It sounds like it is built on top of the Aurora storage subsystem that is used by both Aurora MySQl and Aurora Postgres[1].

I kinda expected them to build it on top of DynamoDb's backend and provide the same kind of "Serverless" on demand experience, but I guess the architecture didn't fit, or maybe this was just faster.

1. https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide...

mrep · on Jan 10, 2019

Definitely because it was faster. Amazon's strategy is to launch new features ASAP and then rely on everyone having to be on-call to fix shit when it inevitably breaks in prod because they rushed to launch. I will admit that while their "operational excellence" is shit, the security engineers do have quite a bit of power to block launches so their security isn't as bad as the reliability.

However, the fact that writes aren't horizontally scalable makes it a laughable nosql database but it probably satisfies the checkmark for enough of their enterprise customers that it will be a mild success and they'll keep it on life support like simpledb forever until they implement a proper solution assuming there is enough demand for it.

senderista · on Jan 10, 2019

I was there for the launch of a major AWS service where they had an entire separate team working on the next iteration since well before launch (because the initial design wasn’t even intended to be sustainable). They are happy to incur technical risk (and in this case, to eat major losses in hardware costs) in order to be first to market.

Corrado · on Jan 10, 2019

We used MongoDB in my last job and I just want to say that I would have given up management of that beast in a heartbeat. We didn't stress MongoDB nearly enough to warrant all the effort required to construct it, monitor it, back it up, etc. Even if the performance was crappy, I would have lobbied hard to change to DocumentDB ASAP.

halbritt · on Jan 10, 2019

While I don't love MongoDB, I don't find it to be especially difficult to run. I'm running ~200 instances of MongoDB with a small team and it consumes very little of my attention.

ElasticSearch on the other hand...

bloomburger · on Jan 15, 2019

Why is Elastic so difficult to run for you?

adaptiveValleys · on Jan 10, 2019

I'm sure they are, so they a pass those lovely negative externalities onto the customer because they know it's in demand and only they provide that service.

If only they had a competitor that could launch the same products a few months later but offered higher reliability off the bat, that could eventually force Amazon to improve their reliability or risk losing customers long term.

Being first to market doesn't ensure eventual market dominance. Sure, it could give you important feedback. But if your product is subpar, the feedback will have a ton of noise and possibly be useless. Plus it's not worth creating negative externalities and earning the reputation.

talawahtech · on Jan 10, 2019

You think AWS has a reliability problem for their database products? That's news to me. AWS often launches products with limited features, but security, durability and reliability tend to be the standard.

Reliability is the trickiest of the three because it requires the customer to architect their solution with multi-AZ support in mind, but AWS always provides the foundation for that architecture.

Could they, and should they provide more features and a better developer experience around building fault tolerant solutions? Absolutely! But I certainly don't think they have a bad reputation for reliability.

halbritt · on Jan 10, 2019

From my perspective, performance and scaling issues are most likely to occur.

nailer · on Jan 10, 2019

> If only they had a competitor that could launch the same products a few months later but offered higher reliability off the bat

Doesn't Azure Cosmos DB do this? From https://docs.microsoft.com/en-us/azure/cosmos-db/introductio...

> You can elastically scale throughput and storage, and take advantage of fast, single-digit-millisecond data access using your favorite API among SQL, MongoDB, Cassandra, Tables, or Gremlin.

Haven't used it though, so would welcome some real world experience.

HeavyStorm · on Jan 11, 2019

> If only they had a competitor that could launch the same products a few months later but offered higher reliability off the bat, that could eventually force Amazon to improve their reliability or risk losing customers long term.

They have, it's Azure. I'm even a little bit scared because no one here is mentioning CosmosDB... It seems to me that most of the community only knows aws products.

sokoloff · on Jan 10, 2019

For how many customers do all of AWS flaws combined represent more than 2% of their production outages? I think it’s a very small number.

HeavyStorm · on Jan 11, 2019

Well, they are second to market this time around, Cosmos has had mongo api compatibility for a long time.

talawahtech · on Jan 10, 2019

It definitely sounds like it sucks from the perspective of an internal AWS developer or SRE, but if the AWS systems are architected such that these internal failures aren't seen by end users then AWS's reliability reputation remains fully intact.

Customers are paying AWS so that their SREs don't get called, they don't care if the AWS SREs do as long as the system keeps running.

Based on the supporting quotes at launch from Capital One, Dow Jones and WaPo it sound like enough customers are ok with vertical write scalability and (pretty awesome) horizontal read scalability for now because it fits their use case and is better than what they had before.

Also consider that since the cluster management overhead has been removed from the customer, they can essentially "shard" by using a separate cluster for each sufficiently large service/org/dept, which might actually work out better for them in some respects.

Perfect is the enemy of good enough, the architecture might be laughable to you, but it is probably miles ahead of what the customer was using before.

int_19h · on Jan 10, 2019

I suspect that most MongoDB users never get to the point where they need to horizontally scale (i.e. it gets chosen for fad reasons, not because they actually have something big enough to scale).

And the nice thing about this hypothesis, you can test it by looking how successful DocumentDB will turn out to be. ~

tanilama · on Jan 10, 2019

AWS prioritizes launch above EVERYTHING. It is their strategy, to have market tells them what to build.

I think it works, and AWS has yet been brought down by this horizontal complexity. Quite an achievement, but might not be a satisfying experience for the engineers work there.

talawahtech · on Jan 10, 2019

It makes sense in terms of feeling out the market as well. If this version of the service takes off it validates the decision to proceed with a more complex/scalable version and it gives them more customer feedback. Standard MVP best practices.

The downside is that a lot of their products lack polish which sucks. On the flip side even when they are launched with minimal features, they do tend to be reliable, durable and secure, which is important when it comes to data related services.

danpalmer · on Jan 10, 2019

This is one of the main reasons why I don't like AWS services, everything just seems so half-finished. There's not a lot in AWS that I would trust enough to use in production.

I wonder how widespread this view is. I suspect it's more widespread than Amazon realise. They may have optimised into a local maximum where they get a lot of value from being first to market, but could potentially get more by being first to "viable to trust a business on".

talawahtech · on Jan 10, 2019

I certainly agree that they seem half finished in terms of features and developer experience, but from the point of view of security and data durability they have an excellent reputation. They typically have a pretty good reliability story as well, but it relies on the customer architecture their solution to take advantage of multiple AZs/Regions, which is often not trivial.

As far as being "viable to trust a business on" the numbers don't lie, AWS is number one because customers are running their businesses on AWS. The fact that DocumentDB launched with supporting quotes from Capital One, Dow Jones and WaPo shows that customers were clamoring to use it even before GA.

Remember a lot of these customers are coming to AWS because they tried doing themselves and stuggled. When it comes to data, customers trust AWS more than they trust themselves, and rightly so.

ctvo · on Jan 10, 2019

What's your definition of production ready? AWS services when launched "half-finished" still do not have outages, data lost or security issues. They also come with metrics and enough monitoring to support them in production. Those are the major checkboxes for production ready.

AWS also has not had a reputation for deprecating services it launches. I find very little risk in taking a dependency on something AWS releases.

jjeaff · on Jan 11, 2019

You mean if they used a different strategy, they might have more than the entire one third market share of the entire cloud hosting industry?

manigandham · on Jan 11, 2019

>> "viable to trust a business on"

They already are viable and trusted by multiple billion-dollar companies and governments.

manigandham · on Jan 10, 2019

Apparently SimpleDB is still used quite a lot internally. As for their market tactics, there's no denying it works as their pace is accelerating and leaving everyone else in the dust. Most customers just want to pay some money and have a solution ready to go, they don't need infinite scaling from day 1, if ever.

This focus on actually meeting needs today is what keeps AWS on top while the others take 2 years to launch minor service upgrades.

drb91 · on Jan 10, 2019

MongoDB is not horizontally scalable either is it?

mrep · on Jan 10, 2019

Yes it is: https://docs.mongodb.com/manual/sharding/

evil-olive · on Jan 10, 2019

DynamoDB is written on top of MySQL (more specifically, MySQL's storage engine, not the query engine) so using Aurora which has a newer design would make sense.

talawahtech · on Jan 10, 2019

Saying DynamoDB is built on top of InnoDB is a pretty big oversimplification of a much more complex distributed system[1] and for all we know they could have switched out the low level the storage engine on the backend to something like RocksDB or WiredTiger.

The Aurora storage subsystem is much more limited in terms of horizontal scalability and performance, they probably chose it because it was a better/quicker fit.

1. https://youtu.be/yvBR71D0nAQ

evil-olive · on Jan 10, 2019

Yeah, I used to work on DynamoDB, I know it's more complicated (much more complicated than that video makes out - their code quality was atrocious, like 2000-5000 line Java classes in 3 or 4 deep inheritance hierarchies; no unit tests, only "smoke tests" that took 2 hours to run and were so prone to race conditions that common advice was to close everything else on your machine, run them, then leave them alone while you went to meetings)

There was work underway at the time I left to replace InnoDB with WiredTiger. It seemed to be very slow going, and I suspect WiredTiger being acquired by 10gen had a part in it. They also had only 1-2 engineers on the project of ripping out MySQL and replacing it, in a long-lived branch that constantly dealt with merge conflicts from more active feature development happening on mainline.

Aurora, simply by virtue of being newer and learning from DDB's mistakes (in the same way DDB learned from SimpleDB and the original Dynamo) probably has better extension points for supporting (MySQL, Postgres, Mongo) in a sane way.

talawahtech · on Jan 10, 2019

Interesting, how long ago was that? I would be curious to know if the WiredTiger switch ever happened, and what that support relationship looks like not given the contentious relationship between MongoDB and AWS. The old Wired Tiger Inc website[1] still lists AWS as a customer.

Then again, the relationship between AWS and Oracle is even more contentious and Aurora MySQL is one of AWS's most popular products so I don't think they are terribly worried about building on competitor's technologies.

1. http://www.wiredtiger.com/

evil-olive · on Jan 10, 2019

3+ years ago, so it's entirely possible that things have changed since I left. I don't have any more recent information on the state of the system.

At least when I was there, the strong focus was always on adding new features (global & local secondary indexes, change streams, cross-region replication, and so on) to keep up with the Joneses (MongoDB et al).

Meanwhile, a bunch of internal Amazon teams were taking a dependency on it instead of being their own DBAs, and those teams didn't care that much about the whiz-bang features, they just wanted a reliable scale-out datastore that someone else would get paged about when some component failed.

Adding features at a breakneck pace while keeping up umpteen-nines reliability and handful-of-milliseconds performance meant tech debt and non-user-facing improvements, including WiredTiger, all got sidelined. Around the time I left, our page load was around 200 per week. That's one page every 50 minutes, 24/7, if you're keeping score at home.

manigandham · on Jan 10, 2019

According to this post [1] the WiredTiger project seems to have been cancelled after the acquisition.

https://news.ycombinator.com/item?id=13170746#13173927

talawahtech · on Jan 10, 2019

Given the scale and popularity of DynamoDB and the distributed nature you would think that they could hire multiple teams just to work on improving it, but I guess it isn't as simple as that.

I would love to get a behind the scenes look at the process of gradually improving the components of DynamoDB with better technologies, while still maintaining reliability and performance.

Kiro · on Jan 10, 2019

People downvoting one of the guys who worked on DynamoDB at Amazon, somehow thinking they know better. HN in a nutshell.

qorrect · on Jan 10, 2019

You have been downvoted.

erikig · on Jan 10, 2019

It would be nice if Amazon provided an API to access the data via SQL alongside the MongoDB API; I've seen quite a number of organizations migrate from mongo to Postgres once they get out of the rapid development phase. This would make that transition butter smooth.

ahachete · on Jan 10, 2019

That would make the internal representation used an "API" and thus won't be able to change it in the future.

Apparently, they are using a 1:1 mapping between a collection and a table. Either by flattening the document or by using jsonb or equivalent. I'm not a big believer this is good for performance reasons, at least compared to a more normalized approach like the one we did for https://www.torodb.com But they may change it in the future --if they don't expose the SQL API to their internal representation.

scarface74 · on Jan 10, 2019

Anecdote:

I led a C# project where we could seamlessly switch back and forth between Mongo and SQL Server without changing the underlying LINQ expressions.

We sent the expressions to the Mongo driver and they got translated to MongoQuery we sent the expressions to Entity Framework and they got translated to Sql Server.

manigandham · on Jan 10, 2019

C# is ahead of the game with LINQ, expression syntax, and the entire Rosyln platform. Passing an IQueryable<> around that can be interpreted and transformed for multiple backends is a incredibly productive. I wish more people knew about this, and .NET in general.

scarface74 · on Jan 10, 2019

And I’ve seen a few Java and Javascript libraries that purport to “implement LINQ” and they don’t get the power of LINQ is not the syntax, it’s that LINQ translates your code to expressions that can be parsed and translated to any back end query - it’s not just an ORM.

I’ve seen a LINQ to REST API provider.

talawahtech · on Jan 10, 2019

I doubt that they actually built this on top of Postgres. They probably just integrated the WiredTiger[1] storage engine used by Mongo with their Aurora storage subsystem.

I am however really hoping Amazon provides a MySQL 8.0 compatible version of Aurora with full support for its new hybrid SQL and Document Store interfaces[2] courtesy of the X DevAPI[3] and lightweight "serverless" friendly connections courtesy of the new X Protocol.

That way your don't have to choose just one approach, and you can have your data in one place with high reliability and durability.

My ultimate pipe dream would be that they also provided a redis compatible key/value interface that allows you to fetch simple values directly from the underlying innodb storage engine without going thru the SQL layer, similar to how the memcached plugin currently works[4]

1. https://github.com/wiredtiger/wiredtiger

2. https://mysqlserverteam.com/mysql-8-0-announcing-ga-of-the-m...

3. https://dev.mysql.com/doc/x-devapi-userguide/en/devapi-users...

4. https://dev.mysql.com/doc/refman/8.0/en/innodb-memcached.htm...

UlfWendel · on Jan 23, 2019

What's the motivation for a faster access path to InnoDB: performance?

X DevAPI and X Protocol/X Plugin could team up and map K/V style access to the server internal InnoDB API instead of using a SQL service as it is currently done. They could try to do it "transparently" or let you set hints. Whatever is desired from an application standpoint.

mrep · on Jan 10, 2019

> I doubt that they actually built this on top of Postgres.

Maybe not (but OP makes a lot of good points for why it is), but it is still based on the aurora limits, 64TB of size, 15 low latency read replicas in minutes, and presumably 1 write capacity which makes it a laughable nosql system since it cannot scale past 1 servers write capacity.

sorenbs · on Jan 10, 2019

Are you aware that they are working on multi-master for Aurora? https://aws.amazon.com/about-aws/whats-new/2017/11/sign-up-f...

StreamBright · on Jan 10, 2019

And there are organizations who can do rapid development in Postgres.

preetamjinka · on Jan 10, 2019

I think they're built on a common storage system, just like the MySQL compatible version too.

Scarbutt · on Jan 10, 2019

Aurora Postgres isn't really Postgres(only compatible), or is it?

idunno246 · on Jan 10, 2019

The storage engine is different, but the frontend is actual Postgres

garyclarke27 · on Jan 10, 2019

Interesting- Reminds me, I wish Postgres would increase this default identifier limit to 255 - or make it easily user configurable. It can be done by a sophisticated user, but only via special compilation and only when first installed, which is a right pain. I find Long identifier names useful for constraint names and foreign key names auto generated by code.

ahachete · on Jan 10, 2019

Corollary: PostgreSQL is also web-scale! ;P

moltar · on Jan 10, 2019

Wasn’t latest Mongo built on Postgres backend too?

sbr464 · on Jan 10, 2019

I think your thinking of the BI Connector for analytics/SQL compatibility.

From the docs:

Changed in version 2.0: Version 2.0 of the MongoDB Connector for BI introduces a new architecture that replaces the previous PostgreSQL foreign data wrapper with the new mongosqld.

manigandham · on Jan 10, 2019

No, MongoDB has its own storage engines, it's not built on top of anything else.

abrookewood · on Jan 10, 2019

I was reading a post [0] by Brian Cantrill that predicted this would be the result of licences like the SSPL. I instinctively disagreed with him, but it turns out he was right: "The cloud services providers are currently reproprietarizing all of computing — they are making their own CPUs for crying out loud! — reimplementing the bits of your software that they need in the name of the service that their customers want (and will pay for!) won’t even move the needle in terms of their effort."

[0] http://dtrace.org/blogs/bmc/2018/12/14/open-source-confronts...

andyidsinga · on Jan 10, 2019

I liked that post - this especially near the end, referring to Adam Jacob and some of his posts.

> Adam has endured the challenges of the open core model, and is refreshingly frank about its economic and psychic tradeoffs. And if he doesn’t make it explicit, Adam’s fundamental optimism serves to remind us, too, that any perceived “danger” to open source is overblown: open source is going to endure, as no company is going to be able to repeal the economics of software. That said, as we collectively internalize that open source is not a business model on its own, we will likely see fewer VC-funded open source companies (though I’m honestly not sure that that’s a bad thing).

api · on Jan 10, 2019

Years ago I realized that a hidden driver for the growth of cloud is this. The cloud is DRM, and almost uncrackable DRM at that since you have neither the code nor the hardware.

ZainRiz · on Jan 10, 2019

Basically. The one who controls the servers is King.

The code needed to run those servers is the secret sauce and a huge competitive advantage, but with open source software you're giving away the secret sauce and the business victory goes to the one with the most business friendly servers

(There are many dimensions to "business friendly", a big one of which is "it's easy for us to start using this additional service since we're already paying this company for other services")

aortenzi · on Jan 10, 2019

Software and the control plane is the razor, compute resources are the blades. Amazon's software is its loss leader.

halbritt · on Jan 10, 2019

Not really. There's roughly a 50% premium vs raw EC2 instances for any RDS related service. The crux is keeping the operating cost below that delta.

kemitchell · on Jan 10, 2019

If you haven’t already, do a search on FSF.org for “service as a software substitute”.

dman · on Jan 10, 2019

What is the way out? Would love to hear from people.

Lazare · on Jan 10, 2019

Ultimately, Cantrill put it well:

> ...for those open source companies that still harbor magical beliefs, let me put this to you as directly as possible: cloud services providers are emphatically not going to license your proprietary software. I mean, you knew that, right?

MongoDB Inc cannot make Amazon pay commercial license fees. That is not a thing that will happen. They have a lever in front of them with two positions, one of which is "large cloud companies might use your software for free", and the other is "large cloud companies will not use your software at all". They didn't like the first option, so they gave the lever a yank, but they're not going to like the second option, and there is no third option.

The way out is not to try and build a business on the assumption that people who have no interest, requirement or reason to give you large amounts of money will inexplicably do so anyhow. :)

This thread already has people eyeing up DocumentDB's pricing and comparing it favourably to MongoDB's competing Atlas service, and it's almost unthinkable to suggest that Atlas can compete on price with Amazon. The way to win this game is not to play; the rules are not in your favour.

ProblemFactory · on Jan 10, 2019

> They have a lever in front of them with two positions, one of which is "large cloud companies might use your software for free", and the other is "large cloud companies will not use your software at all".

Was that even the goal? My impression of the licensing change was not that they expected to Amazon to pay fees for offering a hosted MongoDB service. It was instead to lock Amazon out, and keep MongoDB Inc. as the only "cloud provider" of a hosted MongoDB service (perhaps still on top of AWS but with separate management interface).

Lazare · on Jan 10, 2019

> My impression of the licensing change was not that they expected to Amazon to pay fees for offering a hosted MongoDB service. It was instead to lock Amazon out, and keep MongoDB Inc. as the only "cloud provider" of a hosted MongoDB service.

Oh absolutely. I don't think they really thought they could force Amazon to license MongoDB, but I do think they believed they could force Amazon to not offer something that competed directly with Atlas.

That hasn't worked out for them very well.

(Not that I think leaving the license alone would have worked out any better. To the best of my knowledge, the MySQL, Postgres, Redis, and Memcache projects have not particularly benefited from Amazon building RDS and Elasticache on top of them, and I see no reason to think Amazon would have contributed a bunch of great patches upstream for MongoDB either.)

ProblemFactory · on Jan 10, 2019

I think PostgreSQL does benefit from Amazon RDS and Google Cloud SQL indirectly.

Unlike MongoDB, it is a real volunteer-led open-source projects, and the goal is to provide an excellent database to users rather than make money. Having easy-to-use cloud hosted versions available helps with attracting users, mindshare, and perhaps in the long run developers to the project itself. Having cloud hosted versions from big vendors means that it's easy to justify "we'll use PostgreSQL for this project" to management or clients.

voidfunc · on Jan 10, 2019

Could Mongo or other companies use the Oracle v. Google precedent regarding API copyright to extract money from competitive vultures like Amazon?

Lazare · on Jan 10, 2019

I hope like hell that horrible precedent doesn't stand; if you find yourself on Oracle's side you may want to rethink some of your priors. Regardless of the rights and wrongs of this specific issue, some solutions are worse than the problems they solve.

jahewson · on Jan 10, 2019

No, because the API was made open source as it’s just part of the MongoDB source code. Future changes to the API made under Mongo’s new license would in theory be eligible for such protection - but what that means in practice is anyone’s guess. For starters they would need to be “substantial”. I can’t imagine Mongo going down that road.

naner · on Jan 10, 2019

Oracle are the good guys in this scenario?

spullara · on Jan 10, 2019

They always were the OK guys in that argument. Google invented a whole new VM and bastardized the language just to get out of a $1/device licensing fee for mobile uses. The Java ecosystem has been irreparably harmed by Dalvik and its lack of support for more modern versions of Java.

On another note, anyone that doesn't think API design is a creative endeavor and worthy of protection probably has never made a great API before. It may be OK to accept that and also let other people use the API for free but I think ruling that it isn't is BS.

elygre · on Jan 10, 2019

I also always found it amusing that people thought API design was not creative and protectable.

Like, “how many ways can you do a date api”, and then turn around to look at the original java Date api, the Calendar api, JodaTime and JSR310.

int_19h · on Jan 10, 2019

An API is just a collection of facts of the form, "if the system gets input X, the system produces output Y". And facts shouldn't be copyrightable.

deanCommie · on Jan 10, 2019

You could describe inventions as "facts" too, are you saying that inventions shouldn't be patentable as well?

Maybe the fundamental properties of the universe aren't copyrightable/trademarkable/patentable, but what you CHOOSE to do with those - what API you design or what widget you build out of it certainly is.

int_19h · on Jan 11, 2019

Patents and copyrights are two very different things, though. I don't know if APIs are patentable, but that's a very different question. Has anybody ever successfully patented an API?

zbentley · on Jan 10, 2019

> The Java ecosystem has been irreparably harmed by Dalvik and its lack of support for more modern versions of Java.

So if ReactOS gets popular but doesn't support Windows 10 APIs, will it be harming the windows ecosystem? If popular implementations of a tool exist that don't chase other (official or not) implementations' features but still get lots of users, that probably means that the popular implementations provide other benefits.

> API design is a creative endeavor

I agree with that.

> and worthy of [legal] protection

But not that.

spullara · on Jan 11, 2019

With current copyright law you basically can't agree with both of those statements as they are mutually exclusive.

sfifs · on Jan 10, 2019

Not likely because the Apache 2 licence version they are compatible with includes an explicit copyright and patent licence grant.

Cpoll · on Jan 10, 2019

I don't think so, isn't the API version they're using still covered under an Apache license?

drb91 · on Jan 10, 2019

Even if true this would not be a win for open software.

ifcologne · on Jan 10, 2019

Pricing for smaller workloads is better on MongoDB Atlas right now. The DocumentDB performance pays of for super large collections and really high read/write workloads.

zimbatm · on Jan 10, 2019

As a DevOps consultant; if Amazon is already setup as a vendor, I would just use DocumentDB. Setting up a vendor can be a major hassle and is not worth the saving of a few $$ per month. It's also much cheaper than spinning up and managing a EC2 instance with MongoDB installed on it since most of the operational knowledge can be deferred to AWS.

SpicyLemonZest · on Jan 10, 2019

There's no secret formula to stop people from competing with you. If MongoDB Inc is successful, it should be because they run a good document-database-as-a-service people want to use, not because they earn indefinite seigniorage from launching a popular open source project.

sipos · on Jan 10, 2019

> If MongoDB Inc is successful, it should be because they run a good document-database-as-a-service people want to use

Unfortunately, something that is good, and something people want to use, are not the same thing. People will use AWS's offering even if it is worse and harder to use, because it is bundled as part of AWS. That is a safe option (it can't be that bad if AWS has released it) and an easy one (no need to think about what to use, you are using AWS already.

Being a big provider of virtual machines puts them in a very strong position to sell loads of other stuff.

KaoruAoiShiho · on Jan 10, 2019

But isn't it wrong to place all economic value in the hosting layer rather than the software layer?

systoll · on Jan 10, 2019

Maybe, but MongoDB did that by themselves by making the software layer free.

gaius · on Jan 10, 2019

because they run a good document-database-as-a-service

Spoiler: they do not

tebs1200 · on Jan 10, 2019

They don't?

I've been using Atlas for over a year now and I don't have any complaints. It was super quick to set up and I've never had a single issue in terms of performance or availability.

What have your issues with Atlas been?

winrid · on Jan 10, 2019

In comparison to what?

manigandham · on Jan 10, 2019

The problem is VC backed companies expecting ridiculous multiples.

There are thousands of very successful and profitable software companies that make proprietary products and offer managed services, training, support, etc. It's a great business, but it's not going to offer 100x wild startup growth.

These companies would all do fine if they bootstrapped or took a small seed/loan instead of taking on 100s of millions.

DannyBee · on Jan 10, 2019

Stop assuming the value in the development ecosystem belongs to you (and should be extractable as money). It doesn't.

Realistically, the next step you will see, unless something changes, is that they will start going after people for API duplication. They have precedent (currently) on their side in the US.

None of the reasonable players will touch this, but you can be sure some VC backed "open source" player will be willing to touch this 3rd rail in exchange for a Series A.

amyjess · on Jan 10, 2019

> Realistically, the next step you will see, unless something changes, is that they will start going after people for API duplication. They have precedent (currently) on their side in the US.

IANAL, but since they already released the API as open-source under the Apache 2.0 license, this avenue is closed off to them.

metheus · on Jan 10, 2019

The API is implemented on the server side; the licensing of the MongoDB drivers is irrelevant.

kemitchell · on Jan 10, 2019

Arguendo: Cloud providers seem to be assuming that the development ecosystem belongs to them. They are extracting tons of money. Why don’t they just accept that hosting storage and compute will become commodity services, driving margins toward zero, and give up?

Plenty of reasonable players will touch Oracle v. Google going forward. I’m as eager to debate the opinion as other counsel. But procedural history demonstrates directly, not theoretically, that it’s effective against tech giants.

In the matter of API Owner v. Google, if API Owner touches that “third rail”, Google gets the shock.

dman · on Jan 10, 2019

By "they" do you mean Mongodb?

cortesoft · on Jan 10, 2019

The linked blog sorts of hints at it, but the way out is to not try to build business models around people paying directly for some sort of license.

Successful open source does not require someone making money off developing it. It is successful when it is something that helps a profitable company but is not core to their business; then, they benefit from making it open source and having everyone contribute to its development and maintenance.

Or, you make money off support and consulting.

The key take away is, you aren't going to make money off selling licenses for open source. Which is good, I think.

kemitchell · on Jan 10, 2019

SSPL, the new license for Mongo, isn’t written to force developers already using Mongo to build apps to pay license fees. It’s designed to stop cloud companies from offering managed Mongo with closed service rigging.

I suppose Mongo could sell exceptions to cloud companies, the way other companies dual license libraries or frameworks. But even Mongo’s bread and butter paid deals aren’t primarily about alternative license rights for open code. They’re about closed add-on code and services, as you describe.

Dual licensing, on its own, is an old and plenty good model for funding development of open source code. I’ve heard wind of dual licensing deals done decades and decades ago, maybe even before GPLv2.

cortesoft · on Jan 10, 2019

Right, but the point of the article the GP linked was that expecting a cloud company to pay for a license for add-on code... instead, they are just going to write their own versions to work with the open source parts.

kemitchell · on Jan 10, 2019

You’re right about the article. But the SSPL approach is different from what we’ve seen from Redis Labs, Elastic, Cofluent, and Cockroach. SSPL applies to Mongo’s “open core” itself. The other companies have applied new terms to previously “closed shell” add-ons.

The question is whether giants will pay the cost of reimplementing entire stacks, core and shell. I don’t have the time myself, so I’ll have to wait on a report about how compatible AWS DocumentDB really is.

Given AWS history, I’d expect they’ll get most of the popular functionality, most of the way, but gotchas will abound, and they’ll never hit 100%. Switching cost of code won’t bottom out unless DocumentDB takes lead mindshare, which closed clones rarely manage.

bilbo0s · on Jan 10, 2019

Not sure this would fall under SSPL in any case. It's clear that what Amazon is doing is using Postgres under the hood, not really mongo. So I'm not sure how that would work if you make an interface shim to make postgres look like mongo, are you then subject to the mongo license? the postgres license? the apache 2.0 mongo api license? all of them? what if clauses of them are mutually exclusive? etc etc etc.

Just at a cursory glance it certainly seems like only the apache 2.0 mongo api license would apply. But I guess mongo could try to force the sspl on amazon?

ISV_Damocles · on Jan 10, 2019

Now I kinda hope Oracle decides to buy out MongoDB and integrate it into their own cloud. Then Oracle can decide to pull the same bullshit that they did with Google over the Java APIs with the MongoDB APIs but now against their current enemy Amazon (and Microsoft, too).

Then a combined Google + Amazon + Microsoft may finally be able to reverse the API Copyright insanity that is hovering ominously over the tech industry, and Oracle can continue to be a shining city upon a hill of shitty technologies you should never allow your business to adopt.

wmf · on Jan 10, 2019

AWS is preemptively defensive about API licensing claims: "Amazon DocumentDB implements the Apache 2.0 open source MongoDB 3.6 API".

christkv · on Jan 10, 2019

I think they are referencing the drivers which are licensed under Apache 2.0.

giorgioz · on Jan 10, 2019

I've always seen Google+Android as the good guys that gave Java new life while I saw Oracle has the bad guys that bought Sun and killed Java.

buremba · on Jan 10, 2019

I don't understand why people are reacting to it so aggressively. That's basically how AWS works, they did the same to Apach Kafka with Kinesis, Prestodb with Athena, PostgreSQL and MySQL with Aurora, Redis with ElastiCache and many others over the last 4 years so it's not new.

It took too long for the open-source community to figure out that the cloud providers are killing them, now it's too late. Well played, AWS.

driverdan · on Jan 10, 2019

> It took too long for the open-source community to figure out that the cloud providers are killing them

How are service providers killing FOSS? That doesn't make sense. Permissive FOSS licensing allows anyone to use their software, regardless of how it's used, and that's how it should be.

keepper · on Jan 10, 2019

Do you get to see AWS's source code for these services?

No...

That's how it's killing "FOSS". Extend and Extinguish. This is not a new playbook.

SilasX · on Jan 10, 2019

I don't think you get to see MongoDB's source code for the enterprise edition either (though I couldn't quickly verify on Google).

keepper · on Jan 10, 2019

Of course they provide source! Source rpms and tgz are downloadable.

Enterprise isn’t gpl, but source is provided.

(This could have been easily answered with a google search, as you pointed out)

SilasX · on Jan 10, 2019

No, I meant that I did search it on Google, and couldn't easily see from the results which case it is. Google "mongodb enterprise source code" -- which one answers the question?

If it were so easy, you could have provided the citation yourself in that comment.

konschubert · on Jan 10, 2019

Well, people are angry about that, that’s why they react aggressively.

buremba · on Jan 10, 2019

Open-source gave AWS the ability to monetize their software so the software companies should be careful enough to prevent any other big company to steal their software and use their name to make money.

I think that it's too late considering AWS already did that to most of the industries but here is Hazelcast's take: https://www.linkedin.com/pulse/open-source-needs-protect-its...

dstaley · on Jan 9, 2019

If I'm reading the pricing page correctly, DocumentDB would run a _minimum_ of $200/month. That's for the smallest instance and no storage or I/O. Kind of steep if you ask me.

Trisell · on Jan 10, 2019

We were paying $5k a month for Atlas. So while it's not 'cheap' for a hosted solution it's cheaper. And the autoscale and RR is better DR is super configurable. And then there is this line.

'Together with optimizations like advanced query processing, connection pooling, and optimized recovery and rebuild, Amazon DocumentDB achieves twice the throughput of currently available MongoDB managed services.`

rafaelturk · on Jan 10, 2019

Can you please elaborate? This was launched today, you had access to the new feature in advance?

geofft · on Jan 10, 2019

MongoDB Atlas is the name of the cloud service run by MongoDB themselves, with which Amazon DocumentDB competes. https://www.mongodb.com/cloud/atlas

owenmarshall · on Jan 10, 2019

Not OP, but that wouldn’t necessarily be a surprise. As customers make product requests to AWS they can be tapped to test upcoming launches - anything from pre-release testing to very early alphas.

jon-wood · on Jan 10, 2019

I can confirm that this is very much a thing that they do. We have an account manager able to bump feature requests over to the appropriate product managers, and have been involved in pre-release testing of features that we expressed interest in.

Trisell · on Jan 10, 2019

As others have said we had MongoDB Atlas. And it is basically mongodb ran in aws with a pretty interface to do basic things like whitelist ips and another such functions.

a13n · on Jan 9, 2019

Yeah, that's pricy. They're definitely not going after early-stage startups then.

But if you have a medium-sized data set (eg. 50+ GB), this is definitely competitively priced. More RAM, storage, compute than Mongo Atlas and Compose for less money.

Here's hoping they introduce cheaper options!

orf · on Jan 9, 2019

50GB is a really small data set.

a13n · on Jan 10, 2019

Eh, by what measure? Realistically it's probably bigger than 90% of all Mongo datasets.

It's tiny if you're a massive company and it's massive if you're a tiny startup.

q3k · on Jan 10, 2019

50GB easily fits in RAM. It's a small dataset.

orf · on Jan 10, 2019

If you can run the dataset comfortably on a Macbook then it's very, very small.

Heck, you can even just use grep over 50GB reading straight from disk. It's tiny.

wpietri · on Jan 10, 2019

Is an argument based on the premise that relative terms have absolute meanings a good use of people's time here?

kstrauser · on Jan 10, 2019

A recent work Slack chat had a dev asking what a particular table contained. They were going through our data inventory and found a randomly-named table 18TB in size. When I ran "select count()" against it, I got back 5,325,451,020,708 rows (that's a copy-and-paste).

50GB isn't trivial, but it's utterly manageable.

Aeolun · on Jan 10, 2019

It seems a bit wrong if you have a 18TB table but no idea what it contains...

kstrauser · on Jan 10, 2019

It was a temp table that we hadn't garbage collected yet. We don't make a habit of leaving that much junk data around, but it bumped our monthly storage bill several percent, not like tripled it.

apta · on Jan 11, 2019

Was this a relational or NoSQL DB?

kstrauser · on Jan 14, 2019

It's primarily in things like Spark and Snowflake that act like relational DBs as long as you squint the right way.

djohnston · on Jan 10, 2019

in my experience it qualifies as "medium"

orf · on Jan 10, 2019

If it can be stuck in a sqlite database and run on a developer laptop, then no, it is not medium by any standard.

Please elaborate why you think 50Gb is anything other than a small dataset that can fit in memory on any half-decent server though.

djohnston · on Jan 10, 2019

[edit] in the spirit of not being a condescending tool to you, i'll replace my original reply with this: https://en.wikipedia.org/wiki/Long_tail

privateSFacct · on Jan 10, 2019

I'm assuming this is a joke. You can run databases that size without any of the fancy scalability stuff - no sharding no anything. I'd actually recommend that, it's makes admin super easy!

rafaelturk · on Jan 10, 2019

Besides that AWS will charge per transaction (at 0.2 per million) outrageous given that you already pay per instance.

Correct pricing strategy needs to be per request or per instance, AWS is charging for both

znep · on Jan 10, 2019

I would guess the pricing model is actually closely related to the main dimensions of their costs and is quite valid.

The key point is illustrated by this quote from their main landing page: "storage and compute are decoupled, allowing each to scale independently".

This suggests it is built on top of the Aurora storage layer, or something similar, as other comments have suggested. This means there is a real cost per I/O operation because you aren't limited by the physical hardware of the compute instances, you get "free" storage nodes underneath that do much more than traditional storage and thus have to be built into the pricing structure.

It is definitely not going to be the cheapest possible solution for all use cases, but do the math before you reject it. If it does follow the Aurora pattern, then the number of I/O operations you are billed for will be a lot less than you may think because, to use another quote from their product page, "Amazon DocumentDB reduces database I/O by writing only database changes to the storage layer, avoiding slow, inefficient, and expensive data replication across network links". I think that quote is harder to understand without background as it sounds like market speak, but lines up very well with some of their in depth Aurora whitepapers, such as https://www.allthingsdistributed.com/files/p1041-verbitski.p... Again, I haven't seen evidence this is based on Aurora but the details they talk about line up really well.

mabbo · on Jan 10, 2019

The correct pricing strategy of any product is "whatever the customer is willing to pay for it". If you feel the price is too steep for your use-case, then don't buy it.

LamaOfRuin · on Jan 10, 2019

That's rarely actually true for anyone that wants to operate for more than a short time period. There are significant costs to gouging your customers. Anything from it being illegal, to it encouraging competition and your customers being motivated to actively flee you and shit on your reputation. The correct pricing strategy for people that don't have a long term enforceable monopoly is "whatever most customers are willing to reasonably happily pay"

solatic · on Jan 10, 2019

The minute that you have customers paying any amount at all, you set yourself up for possible competition undercutting you on price. The truth is, whether you have a great or poor relationship with your customers, unless you have legal protections you have very little control over whether competitors will eventually enter your market or not. So you need to always operate as if there is competition breathing down your neck.

Pricing strategy has little to do with customer happiness in aggregate. Every price will make some customers happy, and other customers feel gouged, because different customers extract different amounts of value from your product. The key to protect yourself from competition isn't to spend time worrying about how pricing affects your aggregate customer volume, but about whether your customers are happy. Maybe some customers are unhappy because they feel gouged. Maybe you could make them happier by reducing prices. But maybe, you're better off letting them go, if they represent a small minority of your users, and instead focus on what a majority of your users might appreciate more - better service, relevant features, etc. which make them happier.

kllvql · on Jan 10, 2019

I think you've hit on what always bothers me about this sentiment. It is obvious that at any point in time you can charge the maximum customers are willing to pay, but that allows for disruption through the channels like competition. The opposite where you charge the minimum to continue providing the goods or services seems optimal, though, leads to a company with zero profits that is unattractive to investment. Is there any literature on how to identify the optimal point of "whatever customers are willing to reasonably happily pay"? Businesses successfully exist on many points in the spectrum of zero profits to most profits the market will bear, but I'd be interested in anything discussing optimality.

[Edit] Amazon employee working in Physical Consumer (not AWS). Asking out of personal curiosity.

LamaOfRuin · on Jan 10, 2019

I'm not an economist, and can't point to to anything in particular, but I would be skeptical of anything that claimed a general approach to that. "Optimal" depends entirely on what you're optimizing for, which is basically an infinite possibility space. I could need a significant amount of revenue immediately to accomplish a desired business development, or I could have plenty of cash and want to build a large and loyal long term customer base at the cost of immediate profit. As you say, successful businesses exist doing pretty much everything. The only limiting factor is being a viable ongoing concern (and that can just mean having a rich backer). I'm sure there are things discussing optimizing based on small slices of the possibility space though (but all the normal caveats about economists making dumb assumptions that rarely apply to humans apply even to those).

mabbo · on Jan 10, 2019

> Amazon employee working in Physical Consumer (not AWS)

You too? I'm in AFT. I posted the original "whatever the customer is willing to pay" comment. Mostly just offhand and yeah there's a lot of nuance to it.

I don't mean that anyone should want to individually gouge each customer, but when running a business one should pick a price whereby the total long term profit is maximized.

Your pricing determines the number of customers. Your pricing also determines the profit on each customer. But choosing your pricing strategy correctly, you should have some people who won't buy your product.

manigandham · on Jan 11, 2019

>> "whatever most customers are willing to reasonably happily pay"

Do you have a better idea of what this is then they do?

Considering they already have launch customers actively using this product and there are several comments on this page saying pricing is better than MongoDB?

danpalmer · on Jan 10, 2019

AWS very often charges across multiple axes. I tried to model out our Cloudfront charges and they charge there for 3-4 different factors, each of which varies in pricing by region.

I think the idea is that by charging precisely where they incur costs, they can be much more reactive to different usage patterns, and therefore be more competitively priced overall.

Although it certainly does create lock-in due to not being able to figure out your billing and accurately model alternatives.

ma2rten · on Jan 10, 2019

Seems to be really aimed at businesses which want to get off of MongoDB desperately.

threeseed · on Jan 10, 2019

Not really at all.

It's targeted at enterprises like mine who currently use MongoDB on premise and are looking for a managed solution. The advantage of AWS over Atlas is you can use the same security and governance approaches e.g. IAM policies, ADFS/SAML integration, Cloudwatch/Cloudtrail etc.

spydum · on Jan 10, 2019

Exactly- atlas kills me without any type of SSO options for the control plane.

Also I feel that they HAD to offer this to counter Azure CosmoDB

Aeolun · on Jan 10, 2019

So they’ll have a huge target market.

jshen · on Jan 10, 2019

I’m not a fan of mongo, but I’ve run a fairly sizable enterprise platform on it for nearly a decade and we haven’t had any major issues that would make replacing it an urgent desire.

RcouF1uZ4gsC · on Jan 10, 2019

Once there is AWS version, it seems like a matter of time before it becomes the safe choice. Nobody got fired for using AWS.

vaer-k · on Jan 9, 2019

Finally an AWS service where the name makes sense and describes what it is. I hope this is the start of a trend.

kerng · on Jan 10, 2019

Yeah, they copied Azure basically. DocumentDB was the name of an Azure service in past, interestingly it offered MongoDB, Gremlin and other API gateway options. Its called Azure CosmosDB now.

rjbwork · on Jan 9, 2019

I love Azure for this. The names are almost all extremely straightforward. There are a handful that have made the jump from confusing to straight forward, and a handful that have made the jump from straightforward to confusing (CosmosDB, formerly DocumentDB, chiefly comes to mind).

Someone1234 · on Jan 9, 2019

Agreed. Too bad the Azure portal is the polar opposite. AWS, for all its faults, is mostly just a boring HTML portal but it works.

Azure tried to get fancy, with side sliding panels all over the place, and it is barely useable. The nicest thing I can say is it is "quirky." It isn't really productive however, particularly not on my 1080p monitor at Windows 10's default 125% DPI.

I literally quit Azure's Application Insights and went back to Google Analytics simply because I hated the Azure UI with a burning passion of a thousand suns.

The concept of writing queries is good, but if that's the only way you can get at your data you better make it damn easy, and they didn't. I'm sure for full time data pros it is a dream however.

dharmab · on Jan 10, 2019

Azure Portal feels like if someone tried to make the Xbox 360 blade interface into an admin tool, without first asking the admins what they needed.

rjbwork · on Jan 10, 2019

That's interesting. I actually quite like it. I can build monitoring dashboards for our various services an see how something I don't need to monitor is doing just by going to the panel for it. To each his own I suppose.

aynsof · on Jan 10, 2019

Except for the fact that Azure names seem to change once per year.

Our Azure SA was giving us a presentation and actually got confused himself. "So that's TFS... I mean VSTS... Actually wait, it's Azure DevOps now?"

innocentoldguy · on Jan 10, 2019

I worked at Microsoft for a while and I swear most of their "upgrades" are nothing more than renaming things and juggling menu items around so people can't find them.

coredog64 · on Jan 10, 2019

Just wait until next year when it’s called TFSHub.

titanix2 · on Jan 10, 2019

Yes the names are good but they still suffer for the MS illness: they change every few years. I also agree with other comments about the portal UI. Heck, it's supposed to be a professional tool...

napsterbr · on Jan 9, 2019

Surprised they didn't go with a 3-letter acronym. AWS DDB.

Recently I made a typo on a formal document. Wrote "AMI" when I meant to to write "IAM". Oops.

tj-teej · on Jan 9, 2019

DDB is typically used to denote DynamoDB

stevehawk · on Jan 10, 2019

i honestly thought that was his joke

jkchu · on Jan 10, 2019

I would have preferred AWS D2B.

Corrado · on Jan 10, 2019

My brain keeps trying to parse this into either DB2 or some Star Wars reference (R2DB, RD2B, etc.)

bnjmn · on Jan 10, 2019

Looking through the supported APIs (https://docs.aws.amazon.com/documentdb/latest/developerguide...), it appears DocumentDB has no support for Mongo's oplog (https://docs.mongodb.com/manual/core/replica-set-oplog/), or change streams (https://docs.mongodb.com/manual/changeStreams), which I guess is no surprise because change streams were introduced in Mongo 4, whereas DocumentDB copied the 3.6 API. So DocumentDB seems much less useful as a reactive data store than MongoDB.

In other words, DocumentDB is only a drop-in replacement for MongoDB if you weren't using any of the features Amazon decided not to support.

Happy to be corrected if I'm misreading the documentation!

InspiredIdiot · on Jan 10, 2019

Also the aggregation pipeline is seriously hobbled with way more No-s than Yes-es over here https://docs.aws.amazon.com/documentdb/latest/developerguide...

ahachete · on Jan 10, 2019

I agree with you.

Having said that, when we were working on https://www.torodb.com we discussed how we'd implement the oplog. And actually, based on PostgreSQL's logical decoding (LD), it wouldn't have been a great deal (there are some gotchas, but LD brings much of what you need. So I won't be surprised if this would be implemented sooner than later.

indogooner · on Jan 10, 2019

Interesting. I think the sole purpose of this product is to wean existing Mongo customers (3.6-). And only those customers who are happy with Mongo API but not MongoDB itself. Is that such a huge market? Would be curious to see how this solution is adopted.

0xFFFF0000 · on Jan 10, 2019

Weird question: Could Microsoft sue Amazon here for infringing on the DocumentDB name? I mean Microsoft's DocumentDB was among the first to even have such a MongoDB layer also) and that was like 3 years ago.

Given that current Amazon leaders actually came from Microsoft's data platform group this leaves a bit of a bad taste behind.

I'm not working for either company.

jon-wood · on Jan 10, 2019

My assumption is that DocumentDB falls into a category of being so generic you can't trademark it, or otherwise claim exclusivity to it. Its literally just describing the fact this is a database for documents.

SmellyGeekBoy · on Jan 10, 2019

Bear in mind we're talking about the company that trademarked "Word" and "Excel" here.

amyjess · on Jan 10, 2019

For all any of us know, Amazon's lawyers already talked to Microsoft's lawyers about it and got permission beforehand.

See: Apple licensing the iOS name from Cisco before announcing the name change.

preetamjinka · on Jan 9, 2019

Sounds like this runs on the same storage service as Aurora.

preetamjinka · on Jan 9, 2019

Not sure why I'm getting downvoted. The characteristics sound exactly like Aurora.

- "replicates six copies of your data across three AWS Availability Zones (AZs)" [0]

- "Amazon DocumentDB uses a distributed, fault-tolerant, self-healing storage system that auto-scales up to 64 TB per database cluster." [0]

- "When writing to storage, Amazon DocumentDB only persists a write-ahead logs, and does not need to write full buffer page syncs." [1]

[0] https://aws.amazon.com/documentdb/

[1] https://aws.amazon.com/documentdb/faqs/

jkchu · on Jan 10, 2019

I'm guessing if you would have included this reasoning in your original comment, then it wouldn't have been downvoted.

MrTonyD · on Jan 10, 2019

Yeah, downvoting is totally broken on Hacker News. Seems like anything that isn't immediately agreed-with gets downvoted. I don't know where all the small-minded people come from, but they seem to have found their home here.

ma2rten · on Jan 10, 2019

There is just some random noise. It might just have gotten one downvote. Now it's the top-comment.

ec109685 · on Jan 10, 2019

Good article on Aurora's design: https://www.allthingsdistributed.com/files/p1041-verbitski.p...

XorNot · on Jan 10, 2019

I'm pretty sure this is going to kill Mongo as a company dead. With this in existence there's literally no reason to use Atlas.

If they wanted to twist the knife they should get to work implementing a pass through migration option.

_wmd · on Jan 10, 2019

MongoDB still have a strong hand

- control of the client and particularly its exposed featureset

- due to that, also control of the protocol and the ability to, for example, insert legally protected strings in the style of the Apple SMC signature into the handshake.

- ability to gate new features on the presence of an object like a copyrighted text, trademark, or even a crypto signature

- ownership of the name. AWS are pissing in Mongo's pool marketing themselves as compatible, and there are a variety of ways it could be made to backfire, if it were in Mongo's interests to encourage that outcome

- AWS focuses on breadth and very rarely nails any particular service. Their hosted Postgres for example still does not expose core features years later

- Following from that, AWS services on the whole are rarely best-in-class in terms of raw performance. I imagine Mongo could continue to easily compete on benchmark results running on AWS own infrastructure

I think this is a really interesting case, far more interesting than the technical minutia of Just Yet Another AWS service. It does not sit well with me whatsoever that they're basically ripping off a much smaller company's core tech while simultaneously borrowing their trademark (in a legally acceptable manner) as part of the marketing, but I also find it hard not to see a ton of potential upside from this for Mongo

ec109685 · on Jan 10, 2019

The client change would never work. The client is licensed as lgpl, so if they tried to pull any funny business like that, it would be instantly forked and if’d out.

_wmd · on Jan 10, 2019

As the person suggesting it, it's difficult to imagine how it could never work considering I haven't managed to figure out all the possible combinations in which such a strategy could be applied. Finally, it is quite exasperating to call this kind of strategy "funny business" in a thread about their core tech being ripped off by a megacorp

AmericanChopper · on Jan 10, 2019

Mongo and Amazon are both large companies, but mongo are the only ones here trying to stiff their customers. Selling a product that uses open source software is not ripping anybody off, and the only party in this situation who are upholding open source values are Amazon. The only thing I can take away from this is that if I get too successful using Mongo technology, that they’re happy to change their license to try extort money from me. I find this to be especially greasy since open source product like this become successful because of their open source nature. They exist because the community that exists around them, and for them to turn around and decide to spit in our faces by dictating how we can consume the product just makes me hope their product is forked and that they go under.

ec109685 · on Jan 10, 2019

Their API is core tech? That’s like saying there shouldn’t be separate implementations of Java, right?

alittletooraph · on Jan 10, 2019

Have you heard of Amazon Elasticsearch Service, launched in 2015? Elastic is doing fine.

juliansimioni · on Jan 10, 2019

This is hopefully a good counterexample. Amazon's Elasticsearch Service is pretty bad (poor general performance, very slow to make cluster changes/launch new clusters, etc).

But I can't help but think Amazon can and would easily fix those things if they mattered. Amazon's hosted Elasticsearch is a lot cheaper than Elastic's, and I'll bet that's enough to get people to use it.

ignoramous · on Jan 10, 2019

> poor general performance

By poor performance, I assume you mean IO? AWS Elasticsearch has supported i3 instance type (nvme on-instance storage) for well over a year now [0]. Additionally, you could enable slow-logs to catch perf issues yourself [1]

> very slow to make cluster changes

Scale-out and access-policy changes happen in-place now and so happen much faster than they used to be.

> launch new clusters

In my experience, it depends on the cluster size, but usually, I see cluster being up in 20m. That's nice given that it sets up pretty much everything (spin up instances, apply access policies, run health checks, enable cloudwatch monitoring, snapshots, create route53 records, integrate with cognito, enc-at-rest via KMS, spin up load balancers, setup vpc resources etc) on my behalf.

[0] https://docs.aws.amazon.com/elasticsearch-service/latest/dev...

[1] https://aws.amazon.com/blogs/database/analyzing-amazon-elast...

juliansimioni · on Jan 11, 2019

Actually not IO performance, but mostly CPU. Last time I tested (which admittedly was about a year ago), an AWS ES cluster was about 20% slower than a self-made cluster with the same instance types. Given that AWS ES clusters still cannot use C5 instances, which offer FAR better cost/$, the performance disparity today might be even larger.

I can also launch an Elasticsearch cluster myself in about 2 minutes via terraform, so 20 minutes is not super impressive.

That said I recognize Elasticsearch is actually quite a finicky beast to set up, and my setup only has to deal with the needs I have, and probably would be set up horribly for certain other people. I can see how a hosted system that has to deal with all the weird edge-cases of a few thousand customers would take longer to set things up.

scarface74 · on Jan 10, 2019

Elastic Co isn’t profitable by definition it isn’t “doing fine”.

From their SEC filing:

https://www.sec.gov/Archives/edgar/data/1707753/000119312518...

We have a history of losses and may not be able to achieve profitability or positive cash flows on a consistent basis. If we cannot achieve profitability or positive cash flows, our business, financial condition, and results of operations may suffer.

alittletooraph · on Jan 10, 2019

You're right they're not profitable — and neither is MongoDB — but the point is that AWS launched an Elasticsearch service 3 years prior to Elastic having a very successful IPO supported by stellar metrics (also found in the SEC filing you linked). So the statements made at the beginning of this thread are probably a bit premature.

scarface74 · on Jan 10, 2019

Only in tech do people think that a money losing company is “successful” because they were able to convince investors to buy stock instead of defining success as having a business model where income is greater than expenses.

In reality long term profitability is the only metric that matters for a corporation

privateSFacct · on Jan 10, 2019

And in contrast to these startups with their "success" AWS is printing cash for amazon which releases surprisingly few "metrics" beyond $ in and $ out.

scarface74 · on Jan 10, 2019

And at the end of the day. What else matters when measuring whether a profit seeking corporation is successful?

zwily · on Jan 10, 2019

Literally every S1 filing will have some sort of language like that. They are required to list the risks that may harm them.

scarface74 · on Jan 10, 2019

They haven’t shown a profit yet. So you don’t have any proof that they have a sustainable business model.

codepopacy · on Jan 11, 2019

Dj from MongoDB here. We have, obviously, been keeping up with this and other threads, but we've also been busy testing out Amazon DocumentDB's correctness and performance. While we're getting that together to bring you an official response in a few days, complete with test results and methodology, I'd like to pick up on a couple of points and some inaccuracies that have been repeated in various threads:

This move shows MongoDB’s approach to document databases is compelling. We’ve thought so for a long time.

A cloud-hosted, truly global and managed MongoDB, MongoDB Atlas, has existed for the last two and a half years and has been serving more and more satisfied users every day with some massive workloads.

MongoDB Atlas runs the full implementation of MongoDB in the cloud.

Many features of MongoDB are documented as not being implemented by DocumentDB: these include change streams, many aggregation operators including $lookup and $graphlookup. But beyond that, well let’s just say we’ve been staggered by how many tests DocumentDB has failed (no spoilers!).

The MongoDB API is not under an Apache license.

MongoDB drivers are still under the Apache license. The MongoDB server used to be licensed under AGPL and is now licensed under SSPL. The source code is open to all, as it has always been, at https://github.com/mongodb/mongo

DocumentDB is not cheaper than MongoDB Atlas. Preliminary estimates show this to only be the case with very large collections and very, very high read/write workloads.

There’ll be more next week over on the MongoDB blogs.

Dj

therealdrag0 · on Jan 11, 2019

Any idea when Atlas will expand support for Sharding configurations and taggable zones? My impression is Atlas ONLY supports a 2 field shard, and the first shard MUST be location. Also it's impossible for clients to set write-concern to tags, because you don't support custom tags as MongoDB itself does.

snissn · on Jan 11, 2019

SSPL feels a lot like a bait and switch to me

keepper · on Jan 10, 2019

The current biggest threat to Free and Open Source software is cloud computing. Plain and simple.[0]

I know this is a blunt and harsh statement to make, but when you sell a service, you have zero native incentives to Open Source the way your system works. It just opens up Competition. This is not unique to AWS/Amazon. But their success gives them the power to have wide OSS damage.

This is, to me, the biggest reason why cloud portability should be something that every customer of a cloud service should have in their plans. Amazon as a company has shown no timidness in both "embracing, extending, extinguishing" their competition.

OSS literally built the internet and opened up the wold wild communication age, let's not be so short sighted that we don't see proliferation of cloud services ( specifically one having so much dominance), for what it really is.

[0] http://dtrace.org/blogs/bmc/2018/12/14/open-source-confronts...

cthalupa · on Jan 11, 2019

This comment is pretty bizarre when taken into account with the link you referenced. Unless I'm completely misreading what Cantrill is saying in that blog, I don't think he agrees with you.

>And while they’re at it, it would be great if they could please stop making outlandish threats about the demise of open source

>Adam’s fundamental optimism serves to remind us, too, that any perceived “danger” to open source is overblown: open source is going to endure

>and in the end, open source will survive its midlife questioning just as people in midlife get through theirs: by returning to its core values and by finding rejuvenation in its communities