We used Datomic in production at Time Inc. around 2016. The idea of an immutable database where you can track changes over time, or query the state of the universe at any given point, sounded amazing for marketing or compliance use cases. Unfortunately, from a dev standpoint it did not feel like a mature system, and the performance was not where we needed it to be.
Probably the most advanced triple-store database these days is RDFox ( https://www.youtube.com/watch?v=-DnmuHtywFs ). While Datomic uses Datalog for querying, RDFox uses Datalog for database reasoning and SPARQL, a W3C standard, for querying. As you add data to the database you can infer new facts. If you want immutability, simply add data in append-only mode with a timestamp. But this idea that you can add the business rules/logic to the database, and have it incrementally apply that logic as you add data, is a recent advance by Oxford AI research.
For the uninitiated: Nikita has created an in-memory database with a mostly compatible API (Datalog) for ClojureScript or JavaScript. My impression is that his variant actually has more widespread use than the original Datomic, based on the number of open-source projects that use each.
I used DataScript for a while to get familiar with graph database querying. I was fascinated by how easy it is to construct queries that mine obscure relations between distantly related entities. I hope I get to use similar tech again.
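For a flavour of what I mean, here's a minimal DataScript sketch (the :person/* attributes, conn, and me-id are made up) that walks two hops of friendship and joins back on a shared hobby:

    (require '[datascript.core :as d])

    ;; hypothetical schema: :person/friend is a ref, :person/hobby a keyword
    ;; @conn is the DataScript db value, me-id a known entity id
    (d/q '[:find ?name
           :in $ ?me
           :where
           [?me  :person/friend ?f]
           [?f   :person/friend ?fof]
           [?me  :person/hobby  ?h]
           [?fof :person/hobby  ?h]
           [?fof :person/name   ?name]]
         @conn me-id)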
> This simplicity enables Datomic to do more than any relational DB or KV storage can ever afford
> Datomic does not manage persistence itself, instead, it outsources storage problems to databases implemented by other people. Data can be kept, at your expense, in DynamoDB, Riak, Infinispan, Couchbase or SQL database.
From what I understand, Datomic’s model is far more flexible than that of many other databases, and it has built-in time-travel capability due to its accretion of immutable data.
Its architecture does, in fact, allow you to choose the storage provider; storage is considered an external concern.
Those are very compelling reasons to use it, but part of the trade-off is write scalability and, potentially, raw performance.
So maybe it’s “do more” within certain limitations (and it’s up to you to decide whether those limitations are a deal breaker).
If the argument is that Datomic can't be simpler than a relational DB because it can use one for persistence, then you'd have to argue that a relational DB can't be simpler than directly using a hard drive for your storage solution.
One thing I've never understood is why all the indexes have transaction last. One of the selling points of Datomic is that it supports as-of queries, but using the EAVT or AEVT indexes requires it to scan all historic values of that attribute, right?
In most situations this is probably fine, but if you have data that changes frequently it seems like this could slow queries down compared to an EATV or AETV index.
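For readers who haven't used it, the as-of query itself is trivial to write; this is the real Datomic API with a made-up attribute:

    ;; db is a database value obtained from (d/db conn);
    ;; as-of gives you the database as it looked at some past point in time
    (def db-2019 (d/as-of db #inst "2019-06-01"))

    (d/q '[:find ?email
           :where [?e :person/email ?email]]
         db-2019)

The question is what the index has to scan underneath to answer that.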
It's also likely that the people who made Datomic are both smarter about this stuff than me and put more thought into it than I have, so I'd love to know what the reasoning behind the choice of index is.
(PS @dang it would be nice to have (2014) in the title)
I'm not sure EATV/AETV could be used fully instead of EAVT/AEVT as you would then lose the ability to have efficient range seeks across values. I do agree though that scanning all historical values in EAVT/AEVT is unsatisfactory for many use-cases as it makes the performance of ad-hoc as-of queries unpredictable.
By contrast, Crux [0] uses two dedicated temporal indexes: EVtTtC and EZC (Z-curve index) to make bitemporal as-of queries as fast as possible. These are distinct from the various triple indexes, which don't concern themselves with time at all. (Vt = valid time, Tt = transaction time, and C = the document hash for the version of an entity at a given coordinate.)
In the article it mentions that while indexes are conceptually monolithic, in practice they're partitioned into 3 spaces: historical, current, and in memory.
New data gets written to the log for durability and updates the in memory portion for queries. Periodically indexes are rebuilt, creating new segments for current, and shifting historical data out of current. This limits how much of the log must be replayed on recovery, and allows garbage collection of data that falls out of the retention window.
It's not that dissimilar to the solutions used by traditional MVCC databases.
The log index supports as-of if you know the actual transaction ID, but if you want to look up by entity/attribute efficiently it's not much help because you don't know when the data point you're interested in was last modified.
I think in this case you’d find all datoms via the normal EAVT index and then sort the results by transaction id, dropping everything after your desired transaction.
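Something roughly like this, I'd guess (a sketch against the on-prem peer API, not a claim about Datomic internals; eid, :person/email, and target-tx are made up):

    ;; walk the full history for one entity/attribute via EAVT,
    ;; then keep only datoms at or before the target transaction
    (->> (d/datoms (d/history db) :eavt eid :person/email)
         (filter #(<= (:tx %) target-tx))
         (sort-by :tx))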
To determine whether a datom is being retracted or added, there is a fifth element in the tuple [0].
There are many similarities to modelling temporal data in SQL [1]. But datoms are simpler and more open, as you can freely build relations between them (composable), similar to a graph DB.
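For reference, a full tuple looks something like this (ids made up); the trailing boolean is the add/retract flag:

    ;; [e              a              v                  tx             added?]
    [17592186045418 :person/email "old@example.com" 13194139534312 false]  ; retraction
    [17592186045418 :person/email "new@example.com" 13194139534312 true]   ; assertion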
Too late. I read the whole thing before coming back here to see your suggestion; now everything here, and everywhere else, looks yellowish, no matter what I set in the style editor.
I went to a Clojure meetup one time and they all went on about how using Datomic in production is a nightmare and it's generally an over-engineered product that isn't worth the trouble in the end. Do most people who have dealt with Datomic in production feel this way?
Yes, and that's exactly why Nubank acquired Cognitect. They are too deep into the tech to migrate to something else, cheaper to just buy the authors.
So a company with deep technical debt, serious scaling issues and bugs everywhere (Datomic/Nubank) and a burnt-out company (Datomic/Cognitect) get together. Makes sense.
Burnt out because their "Datomic Cloud" product didn't work out; it was just a horribly complex AWS CloudFormation template that forces you to click through tens of AWS web pages. It was more complex to manage and to develop for than on-premise, but you still had all the same issues and bugs.
Nubank got into Datomic not because of Clojure, but the other way around: they got into Clojure because of Datomic. If you watch their videos, the reason they picked Datomic was that they thought it had "time travel", which is quite different from having a "history" of transactions, used mostly for auditing and troubleshooting, not for real time-travel queries.
In the end, I guess things did work out for Cognitect, and Hickey is now laughing all the way to the bank.
I have been following Datomic for a year because of a system I inherited.
This seems like an excessively uncharitable read of the situation. I've never used Nubank's software, but I have used (on-prem) Datomic and I certainly wouldn't say it has bugs everywhere. In fact, in my (admittedly low-volume and simple) usage of the system I haven't come across any bugs I can remember. Calling Cognitect a "burnout" company is inaccurate and rude.
I agree with you that the Datomic cloud stuff comes across as being frighteningly complex. I think they probably just need to work on the documentation, like making it more obvious what the differences and tradeoffs are between the deployment scenarios.
Did you inherit a Datomic system that was previously developed by a small team or a small company? Because inheriting a system that's hard to understand and change transcends languages and databases. It is the tie that binds us all as software developers.
Fair enough, hitting that bug would have pissed me off too.
On your last point, I agree that it still has a way to go. It's good for some (many?) production use cases now, as Nubank's success demonstrates, and hopefully with Nubank's resources it'll start to live up more to its promise.
Anecdotally, I know of one company that is also in the same boat: they generally regret their usage of Datomic and were trying to move away from it last I talked with them. However, there are also people on HN like dustingetz who have had a great time with Datomic and use it as a core component of their product.
I just wish Cognitect would allow people to run public benchmarks of Datomic to make it easier to evaluate its tradeoffs.
What the company ran into? Unfortunately not :/. It was a quick chat in an informal setting with their VP of engineering (I think?) that really was just a "huh, interesting" moment for me (although I've coded in Clojure for a full-time job before, I have essentially no personal experience with Datomic).
As for the positive side, I think dustingetz monitors Clojure and Datomic threads pretty closely so maybe they can chime in here.
> The Licensee hereby agrees, without the prior written consent of Cognitect, which may be withheld or conditioned at Cognitect’s sole discretion, it will not... publicly display or communicate the results of internal performance testing or other benchmarking or performance evaluation of the Software
That's just vile. Is there any /good/ defense of this kind of agreement, other than a 'think of the children' argument that people might make a mistake in their performance evaluations?
That article only lists MS and Oracle though. Apart from IBM, I don't think CockroachDB Enterprise has such a prohibition, nor does Google Spanner (I think?), nor does Amazon Aurora (again I think?). And of course all the open source competitors don't have this clause.
Basically my impression is that DeWitt clauses are common enough to be well-known, but still in the distinct minority. That's just an impression though.
Never had any real trouble with it. Maybe it's just that I've used it for a long time, but I enjoy the simplicity of using it.
My biggest complaint is performance for certain use cases. Say you're trying to pull a lot of attributes on hundreds of thousands of datoms: it's going to be rather slow (even though it's supposed to be in-memory already). But again, for these kinds of use cases I'd probably go with a completely different kind of database either way.
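To be concrete, I mean calls roughly like this (real pull API; the wildcard pattern and eids are made up), where pulling a wide pattern across a huge collection of entity ids gets slow:

    ;; pull every attribute for a large collection of entity ids
    (d/pull-many db '[*] eids)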
The story around deletions/excisions isn't that great either. Honestly, the whole log/history aspect of Datomic sounds nice, but I never really used it other than for reverting stupid mistakes.
The #1 thing I love is the freedom of querying you get with Datomic. You insert your data in a way that makes sense for your data, and querying is pretty much a completely separate concern. For the most part you don't need to structure your schema around the querying capabilities of your database, which I love. Say, back in the day I liked Mongo because you could just insert whatever you wanted [0], but eventually you'd hit problems where you couldn't easily query your data (maybe it has changed over the years, no idea).
And the syntax is just a pleasure to work with. I'd love a version of Datomic that kept the same interface but dropped some of the more esoteric features in favor of performance.
Also, I noticed some of the people reporting issues used the cloud version. I've never used that, so I can't speak to it. On-prem is free and has all the features. As long as you don't redistribute it there's no problem.
[0] Yes, in Datomic you do have to have a schema. But it's pretty much a simple global list of possible attributes. If you need to add something later or make a change it's pretty straightforward.
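For example, adding a new attribute later is just another transaction (standard schema attribute keys; :order/note is a made-up name):

    @(d/transact conn
       [{:db/ident       :order/note
         :db/valueType   :db.type/string
         :db/cardinality :db.cardinality/one
         :db/doc         "Free-form note attached to an order"}])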
FYI, Datalevin has faster queries than DataScript, because Datalevin has given up the "database as a value" doctrine that both Datomic and DataScript share, so it can cache aggressively to achieve better performance.
Datomic's learning curve is relatively steep, like that of many higher-level and more abstract things in the Clojure ecosystem in general, and you definitely need to know how to cook it.
After figuring out all the whys and hows, it works like a charm.
However, I do indeed find the Datomic Cloud version unnecessarily complex for most applications. It is probably still a good corporate sales product for Cognitect. The Datomic On-Premise version is much friendlier for small-to-medium (and somewhat larger) use cases. The Cloud version is also an AWS thing, so it locks you in there, which is also not good.
I have heard multiple times that it's rather slow, but I haven't seen any benchmarks. It would make sense: as a dynamically typed, garbage-collected language, Clojure is not the greatest fit to implement a database in.
The question is, are the things you gain worth it?
I would be surprised if Datomic's core code was written in Clojure rather than Java (and these days Java's performance can get you pretty far in implementing a database, see e.g. Cassandra).
Most highly performance-sensitive code in the Clojure ecosystem is a Clojure wrapper around a Java core.
But yes as I said elsewhere, it would be great if Cognitect allowed people to post benchmark results.
I've used Cassandra; it's not that impressive. It's much slower than the C++ rewrite (ScyllaDB), has latency issues due to GC, and can't hold a candle to ClickHouse. And they've been optimizing it for a long time now.
Cassandra and ClickHouse are designed to do different things. To flip things around, have you compared the latency of a single-row update or delete in Cassandra vs ClickHouse?
If you care about the latency of a single-row update or delete, ClickHouse is definitely the wrong tool for the job. First, it doesn't really have deletes (AFAIK). Second, you need to batch updates aggressively to get good throughput.
But you're right, C* and CH are designed to do different things. I just found the difference in general performance across everything (startup, schema changes, throughput, query performance, optimization opportunities) to be quite pronounced. One feels like a race car, the other not so much.
Idiomatic Clojure is slower than JS, but you can make Clojure somewhat close to Java by writing Java with parentheses (lots of interop from Clojure). One of the Datomic devs bragged about how it was only 200 KLOC of Clojure, but if you extract the Datomic tar, the lib dir probably has more than 2 million LOC of open-source Java libs.
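What "Java with parentheses" usually means in practice is type hints, primitive arrays, and avoiding reflection, something like this made-up hot loop:

    (set! *warn-on-reflection* true)

    ;; type-hinted, primitive summing of a long array; with the hints in
    ;; place this compiles to a plain primitive loop instead of reflective,
    ;; boxed arithmetic
    (defn sum-longs ^long [^longs xs]
      (let [n (alength xs)]
        (loop [i 0, acc 0]
          (if (< i n)
            (recur (unchecked-inc i) (+ acc (aget xs i)))
            acc))))

    (sum-longs (long-array [1 2 3]))  ;; => 6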