Hacker News
A proof of concept MongoDB clone built on Postgres (github.com/jerrysievert)
142 points by pablobaz on April 20, 2015 | 49 comments



You can turn a Postgres db into an API with PostgREST, and it supports jsonb to retrieve json properties.

https://github.com/begriffs/postgrest
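
For reference, a minimal sketch of what that jsonb property retrieval looks like at the SQL level underneath (table and column names are made up for illustration):

    CREATE TABLE articles (id serial PRIMARY KEY, doc jsonb);

    INSERT INTO articles (doc)
    VALUES ('{"title": "hello", "tags": ["pg", "json"], "author": {"name": "ann"}}');

    SELECT doc->>'title'         AS title,       -- property as text
           doc->'tags'->0        AS first_tag,   -- array element as jsonb
           doc#>>'{author,name}' AS author_name  -- path extraction, as text
      FROM articles;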


This is a really cool project. Thanks for pointing it out.


How are JSON updates handled?

For those who aren't aware, Postgres currently lacks support for doing updates to JSON fields via SQL[1]. For many this isn't a problem, but I'd imagine that people expecting a MongoDB clone would need it.

[1] But even though you can individually address the various fields within the JSON document, you can’t update a single field. Well, actually you can, but by extracting the entire JSON document out, appending the new values and writing it back, letting the JSON parser sort out the duplicates. https://blog.compose.io/is-postgresql-your-next-json-databas...
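
A hedged sketch of that workaround on 9.4 (table and column names are illustrative): the whole document is read out, modified by the application, and written back in full.

    BEGIN;
    -- pull the full document out, locking the row so the read-modify-write
    -- round trip doesn't lose concurrent updates
    SELECT doc FROM articles WHERE id = 1 FOR UPDATE;

    -- ...application code changes a single field in the fetched JSON...

    -- write the entire document back, not just the changed field
    UPDATE articles
       SET doc = '{"title": "new title", "tags": ["pg", "json"]}'
     WHERE id = 1;
    COMMIT;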


I don't know the answer to your question but JSON update is being worked on by the PG community. I think it is unlikely to make 9.5, but we will see.

http://www.postgresql.org/message-id/flat/54E00D1F.9060101@d...


Write a function in PLV8?


Or Python, or any language that has a PL in PostgreSQL. Unfortunately a lot of people consider that a hack, and they're not wrong.

If we can address fields with operators (->, ->>, etc.) then why can't those operators modify, too? It works for everything else, after all. It's fairly counter-intuitive from a dev perspective.

That, and PLV8 still doesn't support JSONB. With JSONB being so much more efficient, faster, and so on, not being able to interact with it using the most logical procedural language is an EPIC oversight.


> If we can address fields with operators (->, ->>, etc.) then why can't those operators modify, too? It works for everything else, after all. It's fairly counter-intuitive from a dev perspective.

It certainly doesn't work for "everything else". You can update individual "parts" of arrays and record types, that's it. And those exceptions are hard-coded, so someone would either have to hard-code similar exceptions for JSON values (ugly, inflexible, generally a bad idea) or generalize support for custom operators on the left-hand side of assignments in UPDATE. All this while trying to push out a release that was already months late.
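
For comparison, here's roughly what the hard-coded array case looks like, with a made-up table; nothing equivalent exists for json/jsonb columns:

    CREATE TABLE t (id int, sizes int[], doc jsonb);

    -- updating a single array element is supported in UPDATE
    UPDATE t SET sizes[2] = 34 WHERE id = 1;

    -- but there is no analogous syntax for a single jsonb field, e.g.
    -- UPDATE t SET doc->'size' = '34' ...   -- not valid SQL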

You can call everything an oversight, but with very limited resources available and the need to cut a release at some point, this is far from the truth in this case.


Why is special-casing json or jsonb worse than special-casing arrays?


There have been calls to move toward jsonb, as well as a pull request for newer v8, but development seems to have stalled.


You can also write an update function in plain SQL. See my example in this comment on a recent Planet PostgreSQL post: http://bonesmoses.org/2015/04/10/pg-phriday-functions-and-ad...
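
Not the function from the linked post, but a rough sketch of the idea in plain SQL on 9.4: rebuild the document with one top-level key replaced (the name jsonb_set_key is made up):

    CREATE OR REPLACE FUNCTION jsonb_set_key(doc jsonb, key text, val jsonb)
    RETURNS jsonb
    LANGUAGE sql
    IMMUTABLE
    AS $$
      SELECT json_object_agg(k, v)::jsonb
        FROM (
          -- keep every existing pair except the one being replaced
          SELECT t.key, t.value::json
            FROM jsonb_each(doc) AS t
           WHERE t.key <> jsonb_set_key.key
          UNION ALL
          -- add the new value for that key
          SELECT jsonb_set_key.key, val::json
        ) AS pairs(k, v);
    $$;

    -- usage:
    -- UPDATE articles SET doc = jsonb_set_key(doc, 'title', '"new title"');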


currently, that is exactly how they are handled. the JSON structure is brought into memory, parsed, modified, and then written again to the database.

this is non-optimal, but currently the only method available.


That wouldn't change even if there was a specialized syntax, because of MVCC. So the only problem is that syntactic sugar is lacking, not that the performance isn't as good as it could be.


Use a transaction maybe?



FWIW, I've recently been implementing the MongoDB query language as well:

https://github.com/zumero/Elmo

In F#. Not even remotely close to usable or production-ready.

The approach here is somewhat different, as this implementation is built on SQLite, which it treats as a simple key-value storage layer.


Thanks for posting that, it's interesting and instructive to see examples of F# code 'out in the wild'. Much appreciated!


Note also that the Meteor project has "minimongo", which is yet another implementation of the MongoDB query language.


I'd use it if it supported $near ...


That's excellent news for CouchDB, frankly writing map/reduce functions was a bit hard. CouchDB has excellent features (multi-master sync, ...), and I look forward to using CouchDB again where it makes sense.


Álvaro here from ToroDB.

Check out ToroDB (github.com/torodb/torodb). It is a Mongo implementation based on PostgreSQL that transforms JSON documents into relational tables. This has many advantages, like significant storage reduction, less I/O, and support for updates (a concern raised in some comments below). Please check it out! :)
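
(Purely illustrative, not ToroDB's actual schema: the general idea of mapping a document onto tables looks something like this, with the nested object split out into its own table of typed columns.)

    -- e.g. {"name": "ann", "address": {"city": "Madrid", "zip": "28001"}}
    CREATE TABLE customer (
        did   serial PRIMARY KEY,   -- document id
        name  text
    );

    CREATE TABLE customer_address (
        did   int REFERENCES customer(did),
        city  text,
        zip   text
    );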


I was going to post that this would have been cool a year ago, but now that MongoDB has interchangeable backends there's not much to gain from a high level project like this.

But then I saw that this project started two years ago, and the last commit was a year ago.


From my experience with MySQL's pluggable engines, I learned that this actually complicates things: the user interface ends up being awkward, optimization is harder, and adding features becomes much harder.

Investigating standalone alternatives therefore seems like a good thing, even for projects starting today.


| MongoDB has interchangeable backends

Do you have a source on this? I can't find anything at the moment but it sounds familiar.


The feature is called "pluggable storage engine", if you want something to Google.



I've been working on something similar for the last while, but using jsonb and plpgsql instead of plv8.

So far, the basic crud stuff works and there's a python driver with decent test coverage. Progress is slow, but it's fun!

http://bedquiltdb.github.io
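
If you're curious what the jsonb side of that looks like, here's a rough sketch (not BedquiltDB's actual API; names are made up): a table per collection, documents in a jsonb column, and find-by-example via the @> containment operator, backed by a GIN index.

    CREATE TABLE things (
        id  serial PRIMARY KEY,
        doc jsonb NOT NULL
    );

    -- insert a document
    INSERT INTO things (doc) VALUES ('{"kind": "widget", "size": 3}');

    -- find documents matching a query document, MongoDB-style
    SELECT doc FROM things WHERE doc @> '{"kind": "widget"}';

    -- a GIN index makes the containment query cheap
    CREATE INDEX things_doc_idx ON things USING gin (doc);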


Something else that provides a simple API for working with JSON in Postgres: https://github.com/robconery/massive-js (node.js)


That looks pretty awesome, although fairly orthogonal to this discussion. I submitted it as a story on its own. https://news.ycombinator.com/item?id=9407782


At the bottom of the README it says "Follow along at http://legitimatesounding.com/blog/", which 404's. I want to follow along! Please fix. Thanks.


Fixed.


Interesting experiment! How does it compare in performance, cpu and memory footprint?


There's also Mongres which was inspired by this project.

https://github.com/umitanuki/mongres


I thought MongoDB's main selling point was the simple scale-out model? At least that's what stuck with me. I'm well aware of the non-robustness properties of Mongo, but to me it seems like calling this a clone without the scale-out capabilities would be missing the point.


It's a proof of concept. The author's essentially saying, "here, it can be done, this is one way of doing it." It would need more bits and pieces and more tuning to be useful in a production case, but this shows that it at least can work.


I think you missed my point. The USP of MongoDB in my view is the simplicity and its scalability model. At least these are the things which are interesting to me and which are hard to implement. They have cloned a subset of the functionality - the subset which is easy - and left out the hard parts. I don't agree that this proves anything.


Python has a nice non-SQL style interface to SQL engines too, called Dataset: http://dataset.readthedocs.org/en/latest/


Benchmarks please! Response times, mem usage, disk usage.

Does it use HSTORE?

Nice project!


ELI5: does this essentially mean I can use the Mongo API to work with Postgres?


Makes sense they're porting MongoDB to Postgres, given that:

"As of version 9.4, PostgreSQL benchmarks faster than MongoDB for both inserting and querying JSON data." :P


Why?


Developer here (woke up surprised it was on hn again), why not?

The blog posts go into more details: http://legitimatesounding.com/blog/building_a_mongodb_clone_... and http://legitimatesounding.com/blog/building_a_mongodb_clone_...


I suspect there are lots of people who built applications on top of MongoDB who would like to transition away from Mongo onto a different database, or would at least like the option to. This may provide a good mechanism for a transition.

Which leads to a question for the OP: does this work with mongoose.js?


i started a mongo client implementation in node.js, but it fell off my radar - the goal was to get it to work with mongoose, etc: https://github.com/JerrySievert/mongolike-client (it is not yet complete).

in the end, i just wrote my own database system, with a query language that could act as an intermediary from multiple sources (SQL, mongo, etc).

but, this stuff is mostly just for fun, my day job keeps me plenty busy.


Could be interesting for meteor.com; the MongoDB-only requirement prevents me from using it.


Problem: Meteor relies on a specific MongoDB feature, i.e. polling the db for any change. I've heard there is experimental support for PG, but it seems to rely on a complex hack using triggers... So the issue isn't really about Mongo queries, but whether it is possible to track db edits from third parties or not.


There's infrastructure for that in postgres, since 9.4. See

http://www.postgresql.org/docs/devel/static/logicaldecoding.... for the (somewhat low level) description of the feature. You'd have to write an output plugin that formats the output as json, but that should be pretty easy.

Disclaimer: I'm the author of the feature ;)
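
A minimal sketch of the SQL-level interface, using the built-in test_decoding example plugin (a real integration would use a custom output plugin that emits JSON instead); it assumes wal_level = logical and max_replication_slots > 0 in postgresql.conf:

    -- create a logical replication slot that decodes WAL with test_decoding
    SELECT * FROM pg_create_logical_replication_slot('demo_slot', 'test_decoding');

    -- ...make some changes in another session...

    -- read (and consume) the decoded change stream
    SELECT * FROM pg_logical_slot_get_changes('demo_slot', NULL, NULL);

    -- clean up
    SELECT pg_drop_replication_slot('demo_slot');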


That specific MongoDB feature is "tailing the oplog", where the oplog is the capped collection with the idempotent commands that represent the changes in the database. This is what Meteor uses to receive (be pushed) changes (not polling).

Certainly, logical decoding provides the necessary infrastructure to implement it (and also normal tailable cursors): thank you Andres, really nice work! :) But some work is also needed to transform the representation you are using in PostgreSQL into MongoDB's oplog entries.

This is definitely what we are using in ToroDB to emulate it (currently under development).


> I suspect there are lots of people who built applications on top of MongoDB who would like to transition away from Mongo onto a different database

What happened to separating the data access layer from the business logic layer? Oh yeah, "good architectural practices are so JEE ..." /s


Taking full advantage of the capabilities of the data layer has effects on the business logic layer. Sometimes it's worth it, sometimes it isn't.



