
Nice! (Non snarky question) does that scale?



That's a good question. Relaxo is a database designed around immutable, transactional structures where convenience is more important than scale. Think of things like comments on a blog, items for sale in a small shop - https://github.com/ioquatix/financier is an example of an actual project which is in production.

Some things which I personally find useful about Relaxo:

- Easy to move data around, merge and fork data (it's just a git repository).

- Easy to roll back or inspect changes. If you make a mistake, just reset HEAD.

- Easy to back up (guaranteed consistency on disk).

- Better grouping of changes by transactions, which have a description, date, and information about who committed it (can even tie to currently logged in user for a web app, for example).

In theory Relaxo could scale up. Using libgit2 as the backend, it wouldn't be hard to use redis as an object store for git. The git data structure on disk is really just a key-value store with some specific data structures.
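To illustrate the "git is really a key-value store" point, here is a minimal Python sketch (not Relaxo or libgit2 code). It computes a blob's object ID the way git does, as the SHA-1 of a `blob <size>\0` header followed by the content, and stores it in a plain dict. The `ObjectStore` class is hypothetical; the dict stands in for whatever backend you would plug in (redis, for example), and real git also compresses objects and stores trees and commits the same way.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class ObjectStore:
    """Hypothetical content-addressed store: key = object ID, value = bytes."""

    def __init__(self):
        # A dict stands in for any key-value backend.
        self._objects = {}

    def put(self, content: bytes) -> str:
        oid = git_blob_id(content)
        self._objects[oid] = content
        return oid

    def get(self, oid: str) -> bytes:
        return self._objects[oid]


store = ObjectStore()
oid = store.put(b"hello world\n")
print(oid)  # same ID that `git hash-object` reports for this content
```

Because the key is derived from the content, writing the same document twice costs nothing extra, and any backend that can map a 40-character key to bytes would do.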

The main issue with Relaxo is query performance and indexes. Simple queries like fetching a document are fast. Complex queries including subsets, aggregations, and joins require supporting indexes to work efficiently, and this is something that is hard to build into a pure document storage system. The naive solution is to load all the documents and filter them, which is actually fine until you get a large number of documents (e.g. 1,000+).

However, git does provide one useful guarantee - it will sort directory entries. With this in mind, it's possible to make radix-sorted indexes (e.g. /invoices/by_date/2017/07/). You can use this to do basic indexes, but it's still not as good as a traditional SQL database in this regard.
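A rough Python sketch of why sorted entries help (illustrative only, not how Relaxo implements it). It assumes index keys are flattened into one sorted list of paths; the invoice paths below are made up, extending the `/invoices/by_date/2017/07/` example. A binary search finds the start of the prefix, so the scan touches only matching entries instead of every document.

```python
from bisect import bisect_left


def prefix_scan(sorted_keys, prefix):
    """Return all keys starting with prefix, assuming sorted_keys is sorted."""
    start = bisect_left(sorted_keys, prefix)  # first key >= prefix
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # sorted order means no later key can match
        matches.append(key)
    return matches


def naive_scan(keys, prefix):
    """The load-everything-and-filter approach, for comparison."""
    return [key for key in keys if key.startswith(prefix)]


# Hypothetical index entries: date components first, document name last.
index = sorted([
    "invoices/by_date/2017/06/30/inv-001",
    "invoices/by_date/2017/07/01/inv-002",
    "invoices/by_date/2017/07/15/inv-003",
    "invoices/by_date/2017/08/02/inv-004",
])

july = prefix_scan(index, "invoices/by_date/2017/07/")
print(july)
```

This only covers queries that follow the key order (here, by date). Anything else still needs another index or a full scan, which is the gap compared to a SQL database.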


I have seen a growth of such "vcs-like" databases, but I think the preponderance remains SQL stores like MySQL/MSSQL/Postgres or NoSQL like Mongo/Cassandra/Redis/Couch/etc. For those - or anything that has its own model of storage or processing and, in the end, is backed by filesystem-type storage, dotmesh provides a really nice solution.

I haven't used Relaxo itself, but personally, I like the fact that independent groups are thinking of version control semantics for data. Tells me it is heading in a positive direction.


Relaxo actually grew out of CouchDB.

Relaxo used to be a CouchDB query server (https://github.com/ioquatix/relaxo-query-server - not so useful any more) and Ruby front end (https://github.com/ioquatix/relaxo-model - still useful). But I got frustrated with the direction of CouchDB 2.x so I rewrote it to do everything in-process and use git as the document store. It organically grew from that.

Unless you are operating at scale, doing things in-process is vastly more convenient. Sending Ruby code to the query server to perform map-reduce was a cumbersome process at best. It's easier just to write model code and have it work as expected.

Systems like Postgres are great when you have a single database and multiple front-end consumers though. You'd need to put a front-end on top of Relaxo in order to gain the same benefits, but it would be pretty trivial to do so - it's just never been something that I've needed to implement. The API you'd actually want is one that interfaces directly with your Ruby model instances, rather than database tables and rows. I think there is room for improvement here - probably implementing a websocket API that exposes the raw git object model and then allowing consumers to work on top of that.


Pretty cool. Is there a write-up on architecture and usage models? I’d like to see it.

I was a happy couch 1.x user, but moved away with 2.0. Nothing specific about it, just needs and timing.


Thanks for being so interested.

The architecture is super simple; I'd suggest that the first place to look is the source code.

There are really only two ways of accessing the underlying data store - a read-only dataset and a read/write changeset which can be committed.

It's purely a key-value storage at the core - a key being a path and a value being whatever you want.
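A toy Python sketch of that split, as an assumption-heavy illustration rather than Relaxo's real API: `Store`, `Changeset`, and every method name here are hypothetical. Reads go through an immutable snapshot; writes are staged in a changeset and only become visible when committed with a message. Conflict detection between concurrent changesets is left out.

```python
from types import MappingProxyType


class Changeset:
    """Hypothetical read/write view: stages changes on top of a snapshot."""

    def __init__(self, base):
        self._base = base
        self._writes = {}
        self._deletes = set()

    def __getitem__(self, path):
        if path in self._deletes:
            raise KeyError(path)
        if path in self._writes:
            return self._writes[path]
        return self._base[path]

    def write(self, path, value):
        self._deletes.discard(path)
        self._writes[path] = value

    def delete(self, path):
        self._writes.pop(path, None)
        self._deletes.add(path)

    def apply(self):
        """Build the new snapshot without mutating the base."""
        merged = dict(self._base)
        merged.update(self._writes)
        for path in self._deletes:
            merged.pop(path, None)
        return merged


class Store:
    """Hypothetical path -> value store with commit history."""

    def __init__(self):
        self._head = MappingProxyType({})
        self.history = []  # commit messages, oldest first

    def dataset(self):
        # Read-only snapshot; stays valid after later commits.
        return self._head

    def changeset(self):
        return Changeset(self._head)

    def commit(self, changeset, message):
        # Replace the head rather than mutate it, so old snapshots survive.
        self._head = MappingProxyType(changeset.apply())
        self.history.append(message)


store = Store()
before = store.dataset()
changes = store.changeset()
changes.write("comments/1", b"First!")
store.commit(changes, "Add first comment")
print(store.dataset()["comments/1"], "comments/1" in before)
```

The old snapshot never sees the new comment, which is the property that makes rollback and inspection cheap in a git-backed design.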

On top of that you can build more complex things, e.g. https://github.com/ioquatix/relaxo-model which provides relational object storage and basic indexes (e.g. has one, has many, etc.).


From the readme: "Relaxo is designed to scale to the hundreds of thousands of documents. It's designed around the git persistent data store, and therefore has some performance and concurrency limitations due to the underlying implementation... Relaxo can do anywhere from 1000-10,000 inserts per second depending on how you structure the workload."



