Datomic is Free (datomic.com)
1099 points by xmlblog on April 27, 2023 | 368 comments



Datomic is perfect for probably 90% of small-ish backoffice systems that never have to be web scale (i.e. most of what I do at work).

Writing in a single thread removes a whole host of problems in understanding (and implementing) how data changes over time. (And a busy MVCC SQL database spends around 75% of its time doing coordination, not actual writes, so a single thread applying a queue of transactions in sequence can be faster than your gut feeling might tell you.)

Having transactions as first-class entities of the system means you can easily add metadata to every change in the system recording who made the change and why, so you'll never again have to wonder "hmm, why does that column have that value, and how did it happen". Once you get used to this, doing UPDATE in SQL feels pretty weird, as the default mode of operation of your _business data_ is to delete data, with no trace of who and why!
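
For example, annotating a change is just a couple of extra datoms aimed at the transaction entity itself (a minimal sketch with the peer API; conn, user-id and the :audit/* attributes are made up, and the latter would have to exist in your schema):

  (require '[datomic.api :as d])

  ;; assert the new fact and annotate the transaction that carries it, in one call
  @(d/transact conn
     [[:db/add user-id :user/email "new@example.com"]
      ;; "datomic.tx" is the tempid of the transaction entity being created
      [:db/add "datomic.tx" :audit/user   "alice"]
      [:db/add "datomic.tx" :audit/reason "user asked to change their email"]])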

Having the value of the entire database at a point in time available to your business logic as a (lazy) immutable value you can run queries on opens up completely new ways of writing code, and lets your database follow "functional core, imperative shell". Someone needs to have the working set of your database in memory, why shouldn't it be your app server and business logic?
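
Concretely, something like this (again a sketch with the peer API; the attribute names are invented):

  (require '[datomic.api :as d])

  (let [db     (d/db conn)                       ;; immutable value of the whole DB, right now
        before (d/as-of db #inst "2023-01-01")]  ;; the same DB as of an earlier point in time
    ;; both values can be passed to pure functions and queried in-process
    (d/q '[:find ?id ?email
           :where [?e :user/id ?id]
                  [?e :user/email ?email]]
         before))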

Looking forward to seeing what this does for the adoption of Datomic!


> Someone needs to have the working set of your database in memory, why shouldn't it be your app server and business logic?

This one confused me. The obvious reason why you don't want the whole working set of your database in the app server's memory is because you have lots of app servers, whereas you only have one database[1]. This suggests that you put the working set of the database in the database, so that you still only need the one copy, not in the app servers where you'd need N copies of it.

The rest of your post makes sense to me but the thing about keeping the database's working set in your app server's memory does not. That's something we specifically work to avoid.

[1] Still talking about "non-webscale" office usage here, that's the world I live in as well. One big central database server, lots of apps and app servers strewn about.


Consider this use case: in addition to your web app, you have a reporting service that generates heavy-duty reports; if you run one at a bad time, bad things might happen, like users not being able to log in or do any other important work, because the database is busy with the reports.

So with a traditional DB you might have a DBA set up a reporting database so the operational one is not affected. Using Datomic, the reporting service gets a Datomic peer with its own copy of the DB in memory, without any extra DBA work and without affecting any web services. This also works nicely with batch jobs, or in any situation where you don't want different services to affect each other's performance.

It's true that a lot more memory gets used, but memory is relatively cheap; usually the biggest cost when hosting in the cloud is the vCPUs. And in a typical Clojure/Datomic web application you don't need to put cache services like Redis in front of your DB.

The assumption here is that the usual bottleneck for most information systems and business applications is reading and querying data.


I appreciated this insight into other people's use cases, thank you for that! This architecture brings RethinkDB to mind, which also had some ability to run your client as a cluster node that you alone get to query. (Although there it was more about receiving the live feed than about caching a local working set.)


> which also had some ability to run your client as a cluster node that you alone get to query

FoundationDB does this as well.


Do you have a pointer to some doc explaining how to do that in FoundationDB?


https://blog.the-pans.com/notes-on-the-foundationdb-paper/ is one:

> Client (notice not Proxy) caches uncommitted writes to support read-uncommitted-writes in the same transaction. This type of read-repair is only feasible for a simple k/v data model. Anything slightly more complicated, e.g. a graph data model, would introduce a significant amount of complexity. Caching is done on the client, so read queries can bypass the entire transaction system. Reads can be served either locally from client cache or from storage nodes.


Is RethinkDB still around?

They actually have recent commits, and a release last year.


The company is gone, but the open source project lives on. We still use it in production.


How's it been for production?

Would you recommend using it, or would it be better to go with a safer option?


RethinkDB user here. I've been running it in production for the last 8 years or so. It works. It doesn't lose data. It doesn't corrupt data (like most distributed databases do, read the Jepsen reports).

I am worried about it being unmaintained. I do have some issues that are more smells than anything else — like things becoming slower after an uptime of around three weeks (I now reboot my systems every 14 days). I could also do with improved performance.

I'm disappointed that the Winds of Internet Fashion haven't been kind to RethinkDB. It was always a much better proposition than, say, MongoDB, but got less publicity and sort of became marginalized. I guess correct and working are not that important.

I'm slowly rebuilding my application to use FoundationDB. This lets me implement changefeeds, is a correct distributed database with fantastic guarantees (you get strict serializability in a fully distributed db!), and lets me avoid the unneeded serialization to/from JSON as a transport format.


We've never had any issue with it on a typical three-node install in Kubernetes. It requires essentially no ongoing management. That said, it can't be ignored that the company went under and now it's in community maintenance mode. If you don't have a specific good use for Rethink's changefeed functionality, which sets it apart from the alternatives, I'm not sure I could recommend it for new development. We've chosen not to rip it out, but we're not developing new things on it.


Interesting thank you!

I remember back when it came out, it was a big deal that it could easily scale master-master nodes, and the object format was a big thing because of Mongo back then.

That was back before k8s was a thing, and most of the failover tooling for other databases wasn't a thing just yet either. I'm too scared to use it because, while they have a community, it's obviously nowhere near as active as the Postgres and other communities.


With AWS you can create a replica with a few clicks. Latency is measured in milliseconds for the three-nines observed usage.

I don’t think this is a huge challenge (anymore) for Postgres or whatever traditional database.


That is a read replica; Datomic peers can do writes as well, which further expands the possible use cases.


A few milliseconds per query adds up if you're doing one query per item in a list with 100s or 1000s of elements :)


ah okay. in that case add in a couple hours to refactor those N+1 queries and you’re all good


> A few milliseconds per query adds up if you're doing one query per item in a list with 100s or 1000s of elements :)

If you're doing this, then you need to stop :)


when do you ever have to do one query per item in a list? genuinely curious


I suppose I think of this the other way around. When the query engine is inside your app, the query engine doesn't need to do loop-like things. So you can have a much simpler querying language and mix it with plain code, kind of similar to how you don't need a templating language in React (and in Clojure when you use hiccup to represent HTML)

Additionally, this laziness means your business logic can dynamically choose which data to pull in based on the results of queries, and you end up running fewer queries, as you'll never need to pull in extra data in one huge query just in case the business logic needs it.


It's definitely a trade-off! If you have 10s or 100s of app servers that have the exact same working set in memory, it's probably not worth it.

But if you have a handful of app servers, it's much more reasonable. The relatively low-scale back-office systems I tend to work with typically have 2, max 3. Also, spinning up an extra instance that does some data crunching does not affect the performance of the app servers, as they don't have to coordinate.

There's also the performance and practicality benefits you get from not having to do round-trips in order to query. You can now actually do 100 queries in a loop, instead of having to formulate everything as a single query.
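
For instance, something like this is perfectly reasonable with a peer, because each lookup is served from the in-process index rather than a network round-trip (a sketch; order-ids and the attributes are hypothetical):

  (require '[datomic.api :as d])

  ;; db is a (d/db conn) value; one in-memory lookup per order, no N+1 round-trips
  (->> order-ids
       (map #(d/pull db [:order/status :order/total] %))
       (filter #(= :order.status/open (:order/status %)))
       (map :order/total))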

And if you have many different apps that operate on the same DB, it becomes a benefit as well. Each app server will only have the _actual_ working set it queries in memory, not the sum of the working sets across all of the app servers.

If this becomes a problem, you can always architect your way around it as well, by having two beefy API app servers that your 10s or 100s of other app servers talk to.


SQLite provides a similar benefit with tremendous results using its in-process database engine, although the benefit there is more muted by default because of the very small default cache size. We do have one app where we do this. There's no database server, the app server uses SQLite to talk directly to S3 and the app server itself caches its working set in memory. I can definitely see the benefit of some situations, but for us it was a pretty unusual situation that we might not ever encounter again.

All that said... can't Datomic also do traditional query execution on the server? I thought it had support for scale-out compute for that. AIUI, you have the option to run as either a full peer or just an RPC client on a case-by-case basis? I thought you wouldn't need to resort to writing your own API intermediary, you could just connect to Datomic via RPC, right?


AIUI, the full peer is Datomic; the RPC server is just a full peer that exposes the API over http and is mainly intended to be used with clients that don't run on the JVM (and so can't run Datomic itself in-process).


> If you have 10s or 100s of app servers that have the exact same working set in memory, it's probably not worth it.

The introduction of intelligent application-level partitioning [1] and routing schemes can help one balance cost and performance.

[1] https://blog.datomic.com/2023/04/implicit-partitions.html


I think the point is that treating your database as an arms-length, RPC component that's independent from your "application" isn't necessarily the only pattern.


Strong agree. There are massive cost savings and performance advantages to be had if the model is that a shard of the dataset is in memory and the data persistence problem is the part that's made external. The only reason we are where we are today is that doing that well is hard.


Is this not the case already? Database drivers (or just your application code) are allowed to cache results if they like. The problem is cache invalidation.


Caches aren't the same. In the shard-in-memory case, the shard is the thing serving the queries, meaning it's not a cache, it _is_ the live data.


Understood. For a single-node or read-only system it sounds fine, but then there are a variety of ways to solve that (e.g. a preloaded in-memory sqlite).


If it's in memory you must live with the fact that it might be gone at any moment though.


Hence "the data persistence part."


This sounds like a write-back / write-through cache with extra terminology. What's the difference?


The difference is the 10,000x slower performance if you move the actual DB transaction leader for the shard to over-the-wire access.

Anything can be anywhere if we ignore latency and throughput.


I'm not saying that. I'm asking if this is just new terminology for a previous technology idea, or whether it's a new concept.


Having the working set present on app servers means they don't put load on a precious centralized resource which becomes a bottleneck for reads. The peer model allows app servers to service reads directly, avoiding the cost of contention and an additional network hop, allowing for massive read scale.


This is true, but the tradeoff is that now your central DB is a bottleneck that is difficult to scale.

Having the applications keep a cached version of the db means that when one of them runs a complex or resource intensive query, it's not affecting everyone else.


> Datomic is perfect for probably 90% of small-ish backoffice systems that never have to be web scale (i.e. most of what I do at work).

So is any cloud-managed DB offering, and at that scale we're talking very small costs anyway.

Why datomic instead?


Because of the reasons I list :) Anything in particular that wasn't clear/relevant?


> Datomic is perfect for probably 90% of small-ish backoffice systems that never have to be web scale (i.e. most of what I do at work).

I don’t think I agree with this as stated. It is too squishy and subjective to say “perfect”.

More broadly, the above is not and should not be a cognitive “anchor point” for reasonable use cases for Datomic. Making that kind of claim requires a lot more analysis and persuasion.


I agree, I mostly phrased it that way for effect. My "analysis" is 100% subjective, opinionated and anecdotal :)


> Someone needs to have the working set of your database in memory, why shouldn't it be your app server and business logic?

This is Ions in the Cloud version, or the in-process peer library in the on-prem version.


Datomic always seemed like a really cool thing to use. However, I'm not familiar with Clojure or any other JVM based language, nor do I have the time to learn it. And I can't find any supported way to use it with other languages (I'm not even talking about popular frameworks), or am I missing something?

It doesn't feel like the people behind Datomic actually want to have users outside of the Clojure world, which will be rather limiting to adoption.


Something I've been curious about: how well (or badly) would it scale to do something similar on a normal relational DB (say, Postgres)?

You could have one or more append-only tables that store events/transactions/whatever you want to call them, and then materialized-views (or whatever) which gather that history into a "current state" of "entities", as needed

If eventual-consistency is acceptable, it seems like you could aggressively cache and/or distribute reads. Maybe you could even do clever stuff like recomputing state only from the last event you had, instead of from scratch every time

How bad of an idea is this?


Datomic already sort of does this :) You configure a storage backend (Datomic does not write to disk directly), which can be DynamoDB, Riak, or any JDBC database including Postgres. You won't get readable data in PG though, as Datomic stores opaque compressed chunks in a key/value structure. The chunks are addressable via the small handful of built-in indexes that Datomic provides for querying, and the indexes are covering, i.e. data is duplicated for each index.


Interesting! I assumed Datomic was entirely custom

Now I'm even more curious if you could skip Datomic and just do something like this directly with a relational DB in production


But why?! The whole point of Datomic is that it implements this entire immutable framework for you, on top of mutable storage.

So YOU can focus on building your own specific business logic, instead of re-implementing the immutable DB wheel.


Because, for example, your application is not tied to the JVM? You are uncomfortable using closed source software for such a critical piece of infra? As far as I can tell they don't even have a searchable bug report database! I'd hate to be the one debugging an issue involving datomic.


Yes, but you end up rewriting Datomic!


Well, except it sounds like Datomic is closed-source :)


Nah, our (not very good) implementation existed 10 years before Datomic :-)


That's a pretty common pattern in event-sourcing architectures. It is a completely viable way to do things as long as "eventual-consistency is acceptable" is actually true.


> Datomic is perfect for probably 90% of small-ish backoffice systems that never have to be web scale

How do they scale it for Nubank? (millions of users)


Good question! I don't have any personal experience in that regard. I would probably have paid up for enterprise support (or bought the entire company ;))


I don't know how they do it, but the obvious answer is probably sharding. Their cloud costs must be no joke. Peers require tons of memory, and I can only guess they must have thousands of transactors to support that workload, and who knows how many peers. Add to this that they probably need something like Kafka for integrating/pipelining all this data.


> Peers require tons of memory

As do most distributed databases. Even when you don't store your entire database (or working set) in memory, you'll likely still have to add quite a bit of memory to be used as I/O cache.


sharding, microservices, they run many instances of Datomic to handle different functionality


One thing that is quite hard to do in Datomic is simple pagination on a large sorted dataset, as one can easily do with LIMIT/OFFSET in MySQL, for example. There are solutions for some of the cases, but the general case is not solved, as far as I remember (it's been a while since I used it extensively).


It depends! If you want to lazily walk data, you can read directly from the index (keep in mind, the index = the data = lives in your app), or use the index-pull API which is a bit more convenient.

However, if you want to paginate data that you need to sort first, and the data isn't sorted the way you want in the index, you have to read all of the data first, and then sort it. But this is also what a database server would need to do :)
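
For the "lazily walk the index" case, the peer API lets you resume from a cursor, roughly like this (a sketch; it assumes :article/published-at is an indexed attribute, and last-seen / page-size come from the previous page):

  (require '[datomic.api :as d])

  ;; walk :avet entries for one attribute in index order, starting at the previous page's cursor
  (->> (d/seek-datoms db :avet :article/published-at last-seen)
       (take page-size)
       (map (fn [datom] [(:e datom) (:v datom)])))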


Yep, I am well aware of these specifics and workarounds, but there is no general solution to the question asked here, for example [0]. And for big datasets with complex sorting it will take some effort to implement a seemingly simple feature.

Guess it is just one of the tradeoffs: while some features Datomic has out of the box are hard to replicate in RDBMSes, things like pagination, which are often taken for granted, are a bit of work in Datomic. So it is something to keep in mind when considering the switch.

[0] https://forum.datomic.com/t/idiomatic-pagination-using-lates...


Interesting link, thanks for posting!

Datomic's covering indexes are heavily based on their built-in ordering and don't really have much flexibility in different ways to sort and walk data.

Personally, I'm a fan of hybrid database approaches. In the world of serverless, I really enjoy the combo of DynamoDB and Elasticsearch, for example, where Dynamo handles everything that's performance critical, and Elasticsearch handles everything where dynamic queries (and ordering, and aggregation, and ....) are required. I've never done this with Datomic, but I'd imagine mirroring the "current" value of entities without historical data is relatively easy to set up.


there must be a relational sort? ah yes, there is, and the ability to feed a relational output into a Clojure sort


You seem to be describing the Event Sourcing paradigm rather than a database :)


The main difference between event sourcing and Datomic is the indexes and the "schema", which provide full SQL-like relational query powers out of the box, as well as point-in-time snapshots for every "event" (transactions of facts).

So, "events" in Datomic are structured and Datomic uses them to give you query powers, they're not opaque blobs of data.


> doing UPDATE in SQL feels pretty weird, as the default mode of operation of your _business data_ is to delete data, with no trace of who and why!

It's a good idea to version your schema changes in git using something like Liquibase; that gets rid of at least some of those pains. Liquibase works on a wide variety of databases, even graphs like Neo4j.

I got the same feeling in Erlang many times, once write operations start getting parallel you worry about atomic operations, and making an Erlang process centralize writes through its message queue always feels natural and easy to reason about.


I guess NuBank (Cognitect's owners) have concluded that the paid licensing business wasn't worth the hassle compared to having the developer time involved spent on other things.

Releasing only binaries, while I understand people being grumpy about it, seems like an interesting way of keeping their options open going forwards. Since it was always closed source, it now being 'closed source but free' is still a net win.

The Datomic/Cognitect/NuBank relationship is an interesting symbiotic dynamic and while I'm sure we can all think of ways it might go horribly wrong in future I rather hope it doesn't.


Very probably they understood that having a proprietary database makes hiring and onboarding more difficult than necessary.

Open sourcing the database helps with that.


They didn’t open source the DB, just the binaries.


How can you "open source" something that doesn't include the sources?


Yeah, they didn't open source anything. They just made it free.


Ah! Yes, but not quite! It’s not freeware. The binaries are technically open sourced, you can do with them as you please within the confines of the Apache license.


I take it that means reverse engineering, decompilation, and extracting and reusing any bytecode you want.


It seems like it, which is a bit of a change in direction for them. I've poked at the data from the clojure side in the past (inspecting objects), just to learn how it works, but the license was strongly worded against reverse engineering.

I've also taken a look at generated clojure bytecode. It looked like the codegen is pretty straightforward with minimal optimization. It looked like it wouldn't be too hard to reverse with maybe a little bit of backtracking (essentially a parsing problem, I believe). You'd then need a separate step to redo the macros.

It sounded like it might be a fun little project (just to see if it can be done and try my hand at decompiling), but I would have wanted to decompile datomic to make it interesting and the license precluded that.


That’s my take as well. It is definitely an odd license for a binary, I’ll grant you that.


The Apache license says I can do things with the source of it.

> You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and **in Source or Object form**, provided that (...)


…only if you have the source


That's parent's point, they didn't. They only made the binaries available under the Apache 2.0 license


Well, _technically_ you are now free to modify the binary and redistribute the result as per the Apache 2.0 license. That’s different than giving something as freeware, which would not allow/cover modification/redistribution.


Which actually does make a difference for JARs because the bytecode is portable and modifiable, not trivially so but definitely way more easily than native code.


they haven't open sourced the database though?


Question to people having used Datomic:

Based on experience with Prolog, I always thought using Datalog in a database like Datomic would mean being able to model your data model using stored queries as a very expressive way of creating "classes". And that by modeling your data model using nested such queries, you alleviate the need for an ORM, and all the boilerplate and duplication of defining classes both in SQL and as objects in OO code ... since you already modelled your data model in the database.

Does Datomic live up to that vision?


You can definitely use Datomic in the way you describe, in at least a few different ways. I haven't often seen queries stored in the database itself. It's more common to have the queries as data in the application. Since queries are ordinary Clojure data structures, it's even more common to see functions that build queries from inputs or functions that transform queries (e.g., adding a tenant-id filter clause).

Datomic also supports rules, including recursive rules. I wrote a library to do OWL-style inference about classes of entities using rules. You can see an example here (https://github.com/cognitect-labs/onto/blob/master/src/onto/...). This is a rule that infers all the classes that apply to an entity from an attribute that assigns it one class.
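
The general shape looks roughly like this (illustrative names, not the actual rules from that library): a rule is just a named set of clauses, repeating the same head gives you a disjunction, and a rule can call itself, so transitive "superclass" inference only takes a few lines:

  (require '[datomic.api :as d])

  (def class-rules
    '[;; an entity has any class it was directly assigned
      [(has-class ?e ?c)
       [?e :entity/class ?c]]
      ;; ...and, recursively, every superclass of a class it already has
      [(has-class ?e ?c)
       (has-class ?e ?sub)
       [?sub :class/superclass ?c]]])

  ;; all classes that apply to entity-id
  (d/q '[:find ?c
         :in $ % ?e
         :where (has-class ?e ?c)]
       db class-rules entity-id)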

I would also say that building an "entity type definition" system as entities in Datomic is almost the first thing every dev tries after the tutorial. It works... but you almost never _need_ it later.


Great, many thanks for this elaboration!


Clojure in general is all about passing around little maps of data and, in particular, not using OO to model things. So Datomic naturally continues that by returning maps of nested structures to represent your query results, and it side-steps the ORM completely.


Question: What's the difference between nested OOP objects (composition over inheritance) vs maps of nested structures?


Rich Hickey has a great talk about how Objects are data structures with unique interfaces that are unnecessary complexity. He showed an example of a web server with a web request object with request headers etc. Doing simple things like collecting information out of that nested object structure is bespoke and harder than it should be, for no real gain. If everything is a map or list or set, it becomes completely trivial to extract and manipulate data from a structure. It's a subtle difference at first, but when everything is like that it makes your system far simpler. Unfortunately I can't remember exactly where the part of this talk is.



Yes, this hit home so hard. All objects are bespoke mini languages that add little to no value and simultaneously make it hard to remember and hard to use. Just use maps!


An object is a schema, though.

A map is... Whatever you put in it.


This is the benefit. Languages like Clojure have excellent support for manipulating maps and I'd wager that something like 40-60% of Clojure code is similar-looking map manipulation stuff for this very reason.

If you need that validation, you just validate the map? You can use established methods like Malli or Clojure Spec for this. If you need to use a record with a fixed schema, just use a record instead of map. In Clojure, you can use most of the map functions for records too.


This comment is talking specifically in the context of Clojure.

In the Clojure culture (so to speak) maps may also have a schema, as used by various schema checking tools, which are richer than runtime type checks. (Not the same as database schema)

Nit: I would not say that a JVM object is a schema, because there’s more to it. Rich is well known for saying that the idea of an object complects two things: (1) a record-like (struct) data structure with the (2) code to manipulate it.

Sometimes it’s even more complected because in some languages classes can make assumptions about state across all objects and threading.


Constraints always have value. Without constraints programmers are confronted with the naked complexity of mapping between the set of all possible states of N bits which is (2^N)!.


More constraints don’t necessarily have more value, though. Each constraint needs to be thought through, not just have its value assumed.


> Unfortunately I can't remember exactly where the part of this talk is.

If you or anyone else remembers, I'd love to watch


Here is this specific part cut out from the larger talk "Clojure made simple": https://www.youtube.com/watch?v=aSEQfqNYNAc



Most of the time, the class name is used just as a nominal type identifier. That's what we do at least


“No real gain”

I don't agree with this. IIRC Rack ultimately uses an array to represent HTTP responses. It has three members: the status code, the headers, and the response body.

If you’re shipping a new change, is it easier to mistake response[0] or response.headers?

This is a trivial example, but the general class (ha) of trade-off is amplified with more complex objects.

I love clojure and lisp but the blindness behind a statement like “no real gain” has always kneecapped adoption.


> If you’re shipping a new change, is it easier to mistake response[0] or response.headers

False dichotomy. There are many options other than arrays. Clojure in particular is fond of hashmaps. You can have your response.headers without OOP.


In Clojure, response.headers is still data :) You just use the built-in ways of reading named keys, such as (:headers response) or (get headers :response).


Errata: (get response :headers)


I think there is a need for objects. An active connection, talking to the GPU, etc are not data -- identity is essential for their operation.

OOP is probably the best way to model such... well, objects, allowing them a private, encapsulated state, and making it modifiable, or even viewable, only through a well-defined public interface that enforces all its invariants.


I think OOP (object oriented programming) is abused and is not the optimal paradigm for most software services. It succeeds best at providing inter-operability structure for API design. "Objects", as mentioned here, are an abstraction. Stateful data can be organized and manipulated elegantly without use of the OOP paradigm. Many small systems that employ OOP hamper their maintenance and extendability by unnecessary dependence upon encapsulation and data ownership. Mutability is a villain here as well -- when data structures are immutable, there is little fear of panoptic architectures designed without ownership constraints. Here software is no longer corralled into walled gardens of "objects"; large complex types and their brittle method associations are avoided, greatly simplifying software architectures as a result.


Clojure uses objects for connections and things, but POJOs are harmful IMHO because it makes manipulation and collecting data a bespoke task every time. Every time you change the data, you have to change a class to represent the JSON and the ORM class and .... this is all just data


Well, that’s what Java records are for. As for having to change the type description, that’s more of a static typing discussion, though it can be generated — having a single source of truth and generating others is imo the best approach.


Java records weren't even a gleam in the eye of James Gosling when Clojure was solving these problems at its inception.


In his eyes? It definitely was a thing, records are just nominal product types, these are probably the single most used building block of programming languages.

I really like Clojure, but I really don’t know what some of its fans think (also true of other lisps), like there is a healthy pollination of ideas between languages, lisps are not God’s language.


Standard ML had records since the '70s. Both Clojure and Java would benefit from taking more from what came before, though Java at least had the excuse of being designed for low-powered set-top boxes.


According to the link below, ML records are mostly handled by hash-maps in Clojure, except that there’s no canonical key/val order or strict typing by default.

ML record:

  {first = "John", last = "Doe", age = 150, balance = 0.12}
Clojure hash-map:

  {:first "John", :last "Doe", :age 150, :balance 0.12}
Destructuring a record in an ML function:

  fun full_name{first:string,last:string,age:int,balance:real}:string =
    first ^ " " ^ last
(It’s unclear from the example whether or not all of the destructured values are required in the function signature. I hope they are not, but I left them in since I don’t know. The caret positioning raises further questions.)

Destructuring a map in a Clojure function:

  (defn full-name [{:keys [first last]}]
    (str first " " last))
I don’t know if I’m missing something that ML offers with its records aside from more strict typing, which you can also have in Clojure when it’s useful. In both cases, it looks like it’s applied at the function declaration and not the record declaration.

https://www.cs.cornell.edu/courses/cs312/2008sp/recitations/...


Clojure has records too.


Connection pools exist precisely because the code outside of the connection management piece shouldn’t have to care much whether or not there is an active connection. It’s all boilerplate, except for handling the “unable to connect” case.

When you call a connection or connection pool object, you’re querying its current state. This is absolutely data.


"It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures."

Alan Perlis.


To paraphrase a well-known saying, objects are the poor person's nested maps and vice versa.


Thanks for the comment!

I was more thinking of the means to define your data "classes" (or whatever it is called in this context), though, rather than how it is passed around.


It emphatically does. It has pull syntax just for nested entities, obviating the need for an ORM. I am surprised I had to scroll so far down to see this. This is its biggest selling point IMO.
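
E.g. a nested pull pattern comes back as plain nested maps, with no mapping layer in between (a sketch; the attribute names and order-eid are invented):

  (require '[datomic.api :as d])

  (d/pull db
          '[:order/id
            {:order/customer [:customer/name :customer/email]}
            {:order/lines    [:line/sku :line/quantity]}]
          order-eid)
  ;; => {:order/id ..., :order/customer {:customer/name ...}, :order/lines [{:line/sku ...} ...]}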


From experience:

Datomic Cloud is slow, expensive, resource intensive, designed in the baroque style of massively over-complicated CloudFormation astronautics. Hard to diagnose performance issues. Impossible to back up. Ran into one scenario where apparently we weren't quick enough to migrate to the latest version, AWS had dropped support for $runtime in Lambda, and it became impossible to upgrade the CloudFormation template. Had to write application code to export/reimport prod data from one cluster to another—there was no other migration path (and yes, we were talking to their enterprise support).

We migrated to Postgres and are now using a 10th of the compute resources. Our p99 response times went from 1.3-1.5s to under 300ms once all the read traffic was cut over.

Mother Postgres can do no wrong.

Still, Datomic seems like a cool idea.


As someone who is using Datomic Pro in production for many years now I must agree with you. One time I began a project with Datomic Cloud and it was a disaster similar to what you described. I learned a lot about AWS, but after about half a year we switched to Datomic Pro.

There were some cool ideas in Datomic Cloud, like IONs and its integrated deployment CLI. But the dev workflow with Datomic Pro in the REPL, potentially connected to your live or staging database is much more interactive and fun than waiting for CodeDeploy. I guess there is a reason Datomic Pro is the featured product on datomic.com again. It appears that Cognitect took a big bet with Datomic Cloud and it didn't take off. Soon after the NuBank acquisition happened. That being said, Datomic Cloud was not a bad idea, it just turned out that Datomic Pro/onPrem is much easier to use. Also of all their APIs, the "Peer API" of Pro is just the best IME, especially with `d/entity` vs. "pull" etc.


I don't doubt your story of course, and I love Postgres, but comparing apples to oranges no?

Datomic's killer feature is time travel.

Did you simply not use that feature once you moved off Datomic (and if so why'd you pick Datomic in the first place)? Or are you using Postgres using some extension to add in?


We implemented it in Postgres with 'created_at' and 'deleted_at' columns on everything and filtering to make sure that the object 'exists' at the time the query is concerned with. Changes in relationships between objects are modeled as join tables with a boolean indicating whether the relationship is made or broken and at what time.

Our data model is not large and we had a very complete test suite already, so it was easy to produce another implementation backed by postgres, RAM, etc.


Yeah, it seems you could substitute thoughtful schema design that avoids updates/deletes for time travel as a feature.

I wonder if anyone has made a collection of reference examples implemented this way (and in general I think that a substantial compendium of good examples of DB schemas, and the thinking behind them, could be worthwhile).


It’s called a slowly changing dimension. In this example, it’s a type-2.

https://en.m.wikipedia.org/wiki/Slowly_changing_dimension


I'm moderately confident you could mechanically transform a time-oblivious schema into a history-preserving one, and then write a view on top of it which gave a slice at a particular time. Moderately.


That is essentially what MVCC does.


Yes, although AFAIK those hidden MVCC columns (xmin, xmax?) aren't very usable from an application standpoint -- the obsoleted rows only hang around until the next VACUUM, right?

I realize you're not claiming those columns are useful from an application perspective. Just curious to know if I'm wrong and they are useful.

Because as I understand it, the selling point of Datomic is its audit-trail functionality, and that is admittedly a bit onerous to implement in an RDBMS. Even though I feel like every project needs/requires that eventually.


I meant MVCC is the proof that you can automate the transform of a schema into a versioned schema. How and if the DBMS exposes that is another concern.

The garbage collection / VACUUM part of an MVCC system is the harder part, saving all versions and querying a point in time is the easy one.


Oracle lets you use the MVCC data to query past states of the database, called "flashback":

https://docs.oracle.com/en/database/oracle/oracle-database/2...

You can configure how long the old data is kept:

https://docs.oracle.com/en/database/oracle/oracle-database/2...

Worked examples:

http://www.dba-oracle.com/t_rman_149_flasbback_query.htm


Wow, super informative. Thank you so much.


That is an extremely good point.


Snodgrass wrote a whole book on that topic: Developing Time-Oriented Database Applications in SQL (1999).

It is available as PDF on his publications page:

https://www2.cs.arizona.edu/~rts/publications.html


Maybe search around on bitemporal database table modeling.


I've built a couple of systems that would have been Datomic's bread and butter.

Each time the company was more comfortable with mainstream DBs, so we ended up going with something like you're talking about, built on top of a DB. A couple of the projects happened because a mainstream DB wouldn't scale.

The systems definitely worked, but it was also a lot of implementation complexity on an otherwise simple business prop: "store this data as facts"


https://www.postgresql.org/docs/11/contrib-spi.html#id-1.11.... discusses a model for implementing time travel in Postgres <12 using SPI. https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit... discusses why it was removed in Postgres 12 - it seems logical that it's more maintainable to implement in plpgsql, though as far as I can tell there aren't off-the-shelf implementations of this.

We use https://django-simple-history.readthedocs.io/en/latest/ (with some custom tooling for diff generation) for audit logs and resettability, and while you can't move an entire set of tables back in time simultaneously, it's usually sufficient for understanding data history.


Datomic's 'time travel' is an audit feature, not something for your application/business logic to depend on. Performance reasons make it impractical, unless you only have like 10 users and very little data.


That's certainly not how it sells and markets itself.

The first feature on benefits (and the only reason I've ever heard Datomic brought up and/or considered it myself for production workflows) is using that stuff in application workflows: https://docs.datomic.com/pro/time/filters.html#history

Could be you're saying it in fact doesn't work well performance-wise; that'd surprise me, but it would certainly explain why it's not more popular -- still, I think it's clear it wants you to use this as an application feature.


Welcome to sales tactics ;)

Datomic is great but, as another commenter said, is good for "small-ish backoffice systems that never have to be web scale". You can probably rely on querying history for internal applications. I think their primary market was companies using it internally, but they never made this clear.


Ironically, Hickey fired the one marketer they hired for Datomic.

He lucked out when a unicorn went all in on it. Word around Cognitect was, Datomic was barely breaking even.


> "small-ish backoffice systems that never has to be web scale". Doesn't production use of Datomic by Nubank and Netflix (to mention just two examples) belie this assertion?


Alternatively, Datomic wasn't performing up to snuff, and they found it cheaper to buy Cognitect than do a DB migration :D


Do those companies specify what they use it for? They probably have their own internal "small-ish backoffice systems".


Nubank is one thing, but for Netflix, just like for any big company 10000 DB technologies are probably in use at the same time.

And 9996 of them are used for stuff like the internal HR DB or other minor projects.


Are they _forcing_ you to use CloudFormation? Or is it just the officially supported mechanism?

> Mother Postgres can do no wrong.

I'll say that Postgres is usually the answer for the vast majority of use-cases. Even when you think you need something else to do something different, it's probably still a good enough solution. I've seen teams pitching other systems just because they wanted to push a bunch of JSON. Guess what, PG can handle that fine and can even run SQL queries against it. PG can access other database systems with its foreign data wrappers (https://wiki.postgresql.org/wiki/Foreign_data_wrappers).

The main difficulty is that horizontally scaling it is not trivial (although not impossible, and that can be improved with third-party companies).


Yes. Postgres is such a reliable and known quantity that IMO it should be the default choice for just about anything.

Don't misunderstand me. There are plenty of times when something else is the right choice. I'm just saying, when I have a say in the matter, folks need to clear that bar -- "tell me why tool xyz is going to be so much better than postgres for this use case that it justifies the overhead of adding another piece of software infrastructure."

Like, you want to add a document database? Obviously Mongo, Elasticsearch, etc are "best of breed." But Postgres is pretty capable and this team is already good at it. Are we ever going to have so many documents that e.g. Elasticsearch's mostly-effortless horizontal scaling even comes into play? If you don't ever see yourself scaling past 1,000 documents then adding a new piece of infra is a total joke. I see that kind of thing all the time. I can't tell if developers truly do not understand scale, or if they simply do not give a f--- and simply want to play with shiny new toys and enrich their resumes.

I mean, I've literally had devops guys telling me we need a Redis cluster even though we were only storing a few kilobytes of data, that was read dozens of times daily with zero plans to scale. That could have been a f'in Postgres table. Devops guy defended that choice hard even when pressed by mgmt to reduce AWS spend. WTF?


> Postgres is such a reliable and known quantity that IMO it should be the default choice for just about anything.

This is being repeated so often. And yet — the above is true, IF (and that's a big if for some of us) you are OK with having your database on a single machine.

If you want a distributed database with strict serializability, where some nodes can go down and you still get correct answers, Postgres is not it.


Totally agree. That's really my thinking as well. Default to Postgres unless you have a reason not to choose it, and a need for distributed serializability is one of those cases where Postgres is an easy "nope, not suitable."

But I've also been burned by people reflexively reaching for $SHINY_NEW_TOY by default, when really there is no need. Architects and senior-level devs are the worst offenders. They throw a bunch of needlessly buzzword-compliant infra at a problem and then move on. They have the time and freedom to learn $SHINY_NEW_TOY well enough to MVP a product, but then the project is passed on to people who don't have that luxury.

I feel like there's a progression that often happens:

1. Early engineers: stick to Postgres or another RDBMS because it's all they know

2. Mid-stage engineers with "senior" in their title for the first time: reach for $SHINY_NEW_TOY

3. Late-stage engineers: stick to Postgres because it's something the whole team already knows and they recognize the true long-term cost of throwing multiple new bits of software infra into the mix


Where does sqlite fit into the heuristic? I thought for toy apps or small sites it might be an even easier option and cheaper.


> Datomic Cloud is slow, expensive, resource intensive, designed in the baroque style of massively over-complicated CloudFormation astronautics. Hard to diagnose performance issues. Impossible to backup.

You should give TerminusDB a go (https://terminusdb.com/); it's really OSS, the cloud version is cheap and fast, there aren't tons of baroque settings, and it's easy to back up using clone.

TerminusDB is a graph database with a git-like model, with push/pull/clone semantics as well as a Datalog query language.


I guess this is why the datomic.com front page now defaults to Datomic Pro and not Cloud.


They should have just focused on having a great Kubernetes setup experience, their focus on this cloud stuff always seemed strange to me.


Why were backups impossible? Couldn't you have backed up the storage resources?


As far as I could tell, there was no straightforward way to point a new instance of the compute resources back at the old storage resources since they're all provisioned in one CF template.


> Mother Postgres can do no wrong.

Simple, eloquent, damn true.


Only the binaries are made available, not the source, which is interesting.

I guess they don't claim to be open source, they're claiming to be free, which is - in itself - awesome.

Last time I checked, you couldn't push binaries to maven central, without also releasing the source. That may have changed.


They say it's under the Apache 2 licence, so it is open source.

EDIT: I was wrong. They actually released binaries under the Apache licence, not the source code. Which is, mildly said, deceptive. I don't even have an idea what that actually means.


They say the binaries are being made under Apache 2.

They don't say anything about the source code being published. That's why (to me) this is so interesting. I've never seen binaries released without source code before.


We used to call it Public Domain and Shareware (with variations like Coffeeware, Beerware, Postware,...)


What is even the point of releasing binaries under Apache 2? When I patch the binaries, do I need to release a hexdiff too to fulfill my Apache obligations? Very weird.


I suppose you could run it through a Java decompiler and clean up the results with an LLM. How long would that take? 5 years to make it useful?


I wonder if they took a page out of Fabrice Bellard's book: https://bellard.org/ts_server/

  The CPU version is released as binary code under the MIT license.


They licensed the binary under Apache. It's a publicity stunt.


Making your product available for free isn't a publicity stunt, it's a huge step for a business. And, in practice, it's not that much different for the average user if only the binaries are Apache licensed. When was the last time you needed to open up the Postgres source code and modify something?


If it wasn't a publicity stunt, it certainly had the effects of one: I've never heard of Datomic before and here they are at the top of hackernews!

> And, in practice, it's not that much different for the average user if only the binaries are Apache licensed. When was the last time you needed to open up the Postgres source code and modify something?

Sure, if you're playing a game it probably doesn't make a difference. If I'm building my IT infrastructure on a product, it makes a huge difference whether I get an open-source-licensed "binary" or access to the source:

- the package they distribute contains no less than 960 different jars. Most of those are the standard apache-project-everything-and-the-kitchen-sink-style dependencies. Say I'd like to update log4j because it contains a catastrophic vulnerability that Datomic decides not to fix. (not that that sort of thing ever happens)

- or say Datomic decides to abandon the product altogether or goes out of business

- or say I'm not happy with their quality of service contract around their DB they support and would like to work with a different company


Rich Hickey started Datomic (along with Stuart Halloway & Justin Gehtland). He also created the Clojure programming language and has been on Hacker News numerous times with many popular talks. In fact they have all made famous contributions.

Many businesses use Microsoft SQL Server or Oracle and don't need access to the source. I'm not saying open source isn't nice, but it is absolutely not a requirement for IT infrastructure.

I'd imagine people rely on many cloud services that are in fact, not open source.


Gehtland was the CEO of Cognitect, and Relevance before that. I doubt he's written a line of code in years.


Again, with your hypotheticals—when was the last time you needed to do any of that with Postgres or another FOSS DBMS?

For the vast majority of use cases, a FOSS DBMS and a free-as-in-beer DBMS are indistinguishable. If you're in a category where they're not, then don't use Datomic, but this is still far more than a publicity stunt.


We must be working in different worlds. In all my career I've not once worked with a serious business that did not have a support contract for their database system, open source or not.

Most of those had escrow agreements for central closed source components with vendors in case the vendor went out of business. (obviously only for things perceived as critical and from companies with some perceived risk of failure).

And god knows how many times have I experienced companies biting themselves because they bought into a product that turned out not to deliver what was promised after the contracts were signed.


Free beer binaries are not mutually exclusive of Enterprise support agreements featuring all those things you mentioned above _for people that need that_.


Completely agree. I'm fine with a free beer license. The context of the post is that the binary is licensed using an Open Source license which leads to confusion.


Never. On the other hand, I have considerable confidence that I could do so, and that if something goes wrong with upstream development, someone is likely to do so.

If I use a free-binary-but-no-source product, I’m much more likely to get stuck.

(Of course, as a regretful MySQL user, I am pretty stuck, but largely because MySQL is, in many respects, a terrible product. It does, quite reliably, get security updates and the kind of general maintenance that keeps it working no worse than it ever did.)


Today I looked up pgvector's NixOS availability. For the past 15 years I have relied on postgis source being available and improved by the community for my day to day business.

My point is that the option to modify the source results in software being available and community-maintained in a way that binary-only isn't. Even if I change the source myself just twice a decade.


Someone (I forget who, but he worked there) was giving a presentation on Datomic at some downtown (NYC) bank circa 2014, IIRC. Per the presenter -- IIRC someone asked a specific technical question -- even people working for the company don't get to see the full source. Only a small team has access to the full source, and he said he wasn't one of them.


Maven actually has a tutorial on publishing binaries without source. So I assume it's ok when they tell you how to do it.


Sure, Maven makes this possible.

But Maven Central has strict rules around what can be published there. I just double checked and it's a requirement to publish the source as well as the binaries:

https://central.sonatype.org/publish/requirements/#supply-ja...


it seems you're right, but it also says the following, so i'm confused on whether it's a hard requirement?

"If, for some reason (for example, license issue or it's a Scala project), you can not provide -sources.jar or -javadoc.jar , please make fake -sources.jar or -javadoc.jar with simple README inside to pass the checking. We do not want to disable the rules because some people tend to skip it if they have an option and we want to keep the quality of the user experience as high as possible."


Datomic is an event-sourced DB, and it makes it hard to introduce retroactive corrections to the data when your program's semantics already rely on Datomic's time-travelling abilities: at some point you'll need to distinguish between event time and recording time, as explained in this excellent blog post:

https://vvvvalvalval.github.io/posts/2018-11-12-datomic-even...

This is why I'd rather use XTDB [1], a database similar to Datomic in spirit, but with bitemporality baked in.

[1] https://www.xtdb.com


Event time vs. recording time - I think this was the link you were looking to provide: https://vvvvalvalval.github.io/posts/2017-07-08-Datomic-this...


What is Datomic, you ask? It's a database written in Clojure. https://hn.algolia.com/?q=datomic

  Datomic is an operational database management system - designed for transactional, domain-specific data. It is not designed to be a data warehouse, nor a high-churn high-throughput system (such as a time-series database or log store).
  It is a good fit for systems that store valuable information of record, require developer and operational flexibility, need history and audit capabilities, and require read scalability.
(via https://docs.datomic.com/pro/getting-started/brief-overview....)


For 90% of the web devs that just spin up Postgres/MySQL, why would you use Datomic over that?


Just the temporal properties alone make it very useful for anything where it matters like billing, finance, inventory. Else you are in views/schema/indexing hell to do it on top of SQL.

There is some SQL temporal support but it's not great and varies a lot. Also since it's not native to the storage it has a lot of complexity issues under the rug making it not great.

Many financial systems use Event Sourcing (OOP + ORM). I had to suffer this at a previous employer.

See https://vvvvalvalval.github.io/posts/2018-11-12-datomic-even...


The temporal support seems handy, but time is still going to be really tricky for financial systems. Datomic only covers what the physical state of the database was at a particular time, but there's also the effective legal time (maybe a payment was dated a day before the system actually processed it) as well as requirements to remove data after a period of time (including point in time stuff).


There's a bitemporal Datomic-like database from JUXT that does exactly that, I believe https://www.xtdb.com


Indeed, it depends a lot on the domain. Datomic only has "technical" database time, and doesn't have any built-in way of modelling domain time. You can set the transaction timestamp manually when you write, but you can't set it to be earlier than the latest transaction that was committed. So, if you want your domain modelling to piggyback on Datomic time travelling, you can only do things like delaying writes for, say, an hour, and hope you have all the data by the time you commit to db.


> Just the temporal properties alone make it very useful for anything where it matters like billing, finance, inventory.

you can easily create a data model to have this in SQL DBs: create table transaction_history(..., execution_time timestamp);


This is not an answer, it's the beginning of a question. Yes sure, we know `create table` and we know it's a good idea to record the execution timestamp. What exactly do you put in the place of the three dots?


Fyi, Datomic lets you look at the entire database at any point in time, as an immutable value. Also, you can annotate transactions with metadata, and query for "which tx wrote this specific value for this row/column" and look at custom metadata you added to the tx to reason about your system. Doing all of that in SQL is not trivial, to say the least.
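
A sketch of that "who wrote this and why" query with the peer API (it assumes you annotated your transactions with a custom :audit/reason attribute, as discussed elsewhere in the thread):

  (require '[datomic.api :as d])

  ;; every assertion/retraction of :user/email for this entity, with the tx time and the recorded reason
  (d/q '[:find ?v ?added ?when ?reason
         :in $ ?e
         :where
         [?e :user/email ?v ?tx ?added]
         [?tx :db/txInstant ?when]
         [?tx :audit/reason ?reason]]
       (d/history db) user-eid)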


You have to add a lot of scaffolding to Postgres to make it semi-immutable. Datomic just is: want to know a user's previous email? Just go back and see. Out of the box, without thinking about it.
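
Roughly, with the Peer API and a hypothetical :user/email attribute (user-eid stands in for the entity id), the "previous emails" question is a single query over the history database:

  (require '[datomic.api :as d])

  ;; Every value :user/email has ever held for this entity, with the
  ;; transaction time and whether it was asserted or retracted.
  (d/q '[:find ?email ?when ?added
         :in $ ?user
         :where
         [?user :user/email ?email ?tx ?added]
         [?tx   :db/txInstant ?when]]
       (d/history (d/db conn))
       user-eid)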


Immutability is certainly tempting for certain kinds of data. Does it handle use-cases where data needs to be deleted though? i.e., privacy compliance.


It looks like they call that excision, which leaves behind a breadcrumb saying something was deleted and which can't itself be deleted.


Temporality in general becomes super handy if you have something like reports that need to be consistent across time. Or if you want to ask questions about the past. Or questions about the future without affecting the present.
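
A small sketch of both directions with the Peer API (report-query, proposed-tx-data and conn are placeholders):

  (require '[datomic.api :as d])

  (def db (d/db conn))

  ;; The past: run the same report against the database as of a date;
  ;; the result stays consistent no matter what is written afterwards.
  (d/q report-query (d/as-of db #inst "2023-01-01"))

  ;; The "future": speculatively apply tx-data without committing it,
  ;; then query the resulting value. Nothing touches storage.
  (let [what-if (:db-after (d/with db proposed-tx-data))]
    (d/q report-query what-if))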


Super interesting, thanks.


That's not easy to answer. It's a question of:

- mutable data vs immutable data

- tables, row based vs triple store, attribute based (EAV/RDF)

- table schemas vs attribute schemas

- relational connections vs graph connections

- SQL vs Datalog

- nested queries vs flat queries and rules

- remote access vs client side index

etc.


Because it's a different model of integrating your database and your app.

It allows you to write queries in a pull style, as Datalog, trigger-based, or via raw index access. It's immutable by default and allows historical queries. It allows metadata on the transactions themselves.
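
For example, roughly (Peer API, made-up schema), the same data is reachable by pull, by Datalog, or straight from the indexes:

  (require '[datomic.api :as d])

  (def db (d/db conn))

  ;; Pull style: declare the shape you want back.
  (d/pull db [:user/name {:user/orders [:order/id :order/total]}] user-eid)

  ;; Datalog: join across attributes.
  (d/q '[:find ?name ?total
         :where
         [?u :user/name   ?name]
         [?u :user/orders ?o]
         [?o :order/total ?total]]
       db)

  ;; Raw index access: walk the EAVT index for one entity's datoms.
  (seq (d/datoms db :eavt user-eid))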

A lot of the time the user builds much of that themselves or relies on frameworks to do it.


A tangent, but it would be interesting to see survey data on how many devs reach for SQL first by default these days. A lot of people use various other kinds of DB models which are perceived to have smoother learning curves.


> Is it Open Source?

> Datomic binaries are provided under the Apache 2 license which grants all the same rights to a work delivered in object form.

So... no?

(I say that, but "Datomic binaries" presumably refers to compiled JVM class files; and JVM bytecode is notoriously easy to decompile back to legible source code, with almost all identifiers intact. Would Apache-licensing a binary imply that you have the right to decompile it, publish an Apache-licensed source-code repo of said decompilation, and then run your own FOSS project off of that?)


It should have been a simple yes/no question, but they didn't manage even that :D


Aside, I remember HN in 2009 or so where Clojure was a daily homepage staple and Rich Hickey was putting out his talks about Clojure and code design.

I watched a lot of that and used Clojure fulltime for five years. Wonder what he's up to these days.



Oh, that's today and tomorrow. I had been waiting for that, I could have sworn earlier this year they said there was going to be a live stream available. Maybe it didn't pan out.

Edit: Oh, there are streaming tickets for $20.


Those talks inspired far more than Clojure itself. It sort of started this movement toward simplicity as a value.


If that's the case then all the people running around praising the ""simplicity"" of golang got the wrong end of the stick completely from Rich's classic presentations.


The thing is, Hickey was entirely right to reject this idea of “Simplicity” being the same thing as “Easy”. But he then decided to conflate it with “comprehensible”, which it turns out is very much a matter of aesthetics.

Turns out if you really focus on composability above other concerns you get Haskell.


> But he then decided to conflate it with “comprehensible”, which it turns out is very much a matter of aesthetics.

Did he? I seem to remember a quip in one of the presentations about German not being comprehensible to him being his own problem, because he never learned German.


The problem is his disdain for "easy". It's one thing to create composable building blocks, but frequently, 99% of users will use them in the exact same way, so you might as well produce the easy thing on top of the simple thing, too.

The Clojure tools.build process has this exact problem. It's too low-level, so everyone got to write their own build scripts on top of it to do the same thing as everyone else. Now there's a situation of 1000s of bespoke build scripts and 3-4 different front-ends, all effectively doing the same thing.

"In my line of work, sister, sometimes a second chance is a chance to mistake the same mistake twice" - State and Main


In Hickey's world comprehensible is a subset of easy.


"Easy" vs "comprehensible" is exactly the trade-off Haskell makes: it might be difficult to learn, but it's easy to reason about once you learn it. (Of course, both sides of that equation come with their own caveats...)


Probably spending a lot of time in hammocks[1]

[1] https://www.youtube.com/watch?v=f84n5oFoZBc


>used Clojure fulltime for five years.

How did that work out for you? Usually following a hype cycle there is a negative hype cycle, e.g. Mongo is webscale, then Mongo is a buggy mess.

Clojure seemed to just fade away. Did it turn out well or are there interesting pitfalls that make it not as great as advertised?


Clojure is just really niche. Even an ecosystem that stays the same size over time is dwarfed by the continually growing, continually polished competitor ecosystems that will pull people away from the small niches.

The best things about Clojure are things you don't really appreciate until you've already done the work to learn them.

For example, I never would have known how amazing it was to evaluate code inside the editor until I did the work of learning Emacs + evil-mode + nrepl/cider + whatever so that I could spin up my http server in-process and then modify code without restarting everything. Even today I'm doing `nodemon index.ts` like a goofball.
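
For anyone who hasn't seen that workflow, a minimal sketch of the idea using Ring/Jetty (just an example stack, not necessarily what the parent used):

  (require '[ring.adapter.jetty :as jetty])

  (defn app [request]
    {:status 200 :headers {"Content-Type" "text/plain"} :body "hello"})

  ;; Start the server once from the REPL. Passing the var #'app means the
  ;; running server always calls the latest definition.
  (defonce server (jetty/run-jetty #'app {:port 3000 :join? false}))

  ;; Later: edit `app`, re-evaluate just that form from the editor, and the
  ;; change is live immediately - no restart, no nodemon.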

I stopped using Clojure simply when I met someone who wanted to build some big projects with me and, despite appreciating that Clojure was probably amazing, they simply couldn't be bothered to learn it. Fair enough. It was when Javascript was just getting `yield` coroutines (before async/await) which finally made Javascript bearable for me enough to switch to it for server work.

Clojure just has increasingly compelling languages and ecosystems to compete with, yet it has a huge ramp-up, being a Lisp, that makes it hard for people to choose it.

Just consider how, to even write Clojure comfortably, you really need something like Paredit. Else, what exactly are you going to do if you want to nest or unnest a (form) deeper in the AST structure? Manually rebalance parens? Cut and paste it? Only Paredit lets you easily move a (form) around the AST. And it's amazing, but it's yet another tool you have to learn just to truly evaluate Clojure.


People who use Parinfer rather than Paredit when learning Clojure really like it. The concept is available in the Clojure mode in all the usual suspects of IDEs (IntelliJ Cursive, emacs Cider, etc.)

https://shaunlebron.github.io/parinfer/

The size of the community is not as important as the value of what is available. It's just not widely known, or maybe there's still much potential yet unrealized.

This is an oldie but a goodie on how interactive an experience you can have when using Clojure on the front end and backend of a web app:

https://figwheel.org/

This project is the most recent and most promising iteration of someone making a very visual, interactive tool for introspecting data structures in the REPL environment:

https://github.com/djblue/portal


You don't need paredit. I use Sublime and ctrl+shift+space. Been writing Clojure for 10 years or so.


Parinfer is also amazing but easier to get started with.


As far as the metrics state, Clojure never faded away. It's on an upward trajectory still, albeit slowly. Perhaps it faded away in terms of "HN hype of the day".

That said, I've been (and currently am) a Clojure engineer for the past 5 years and loving it. Quite a lot of jobs out there, more and more each time I look, healthy ecosystem and friendly community. It doesn't hurt that it's the highest-paid programming language as well.


The jobs seem to be pretty available in Europe, but I've been struggling to find Clojure work over here on the west coast of North America. I've had to bite the bullet and am looking for work in other languages now, though I'd really rather stay in the ecosystem.


Are you active on the Clojurians Slack channel? On #jobs and #remote-jobs channels every now and then a US job pops up there. You can also post on those channels that you are open for work, might help. I'd also check out some EU job offers (unless the salary is way lower compared to your location) because there are companies that do async and wouldn't care for timezones as much.


I've used Clojure personally for >10 years and done a bit with it commercially along the way. (I also have a personal affection for Lisp-like languages that goes back to at least the mid-90's.)

> How did that work out for you?

For the personal projects, it's been incredibly useful. The language fits the way I think, and being built on the JVM, it has both a performant runtime and lots of access to a wide ecosystem of libraries.

The Clojure-specific library ecosystem has been accused of being stagnant. I tend to take a more charitable view. The libraries I've used have tended to be smaller in scope and more well defined in terms of feature set. This makes it easier for them to converge on a feature-complete state, and many of them have done just that. If you don't mind assembling multiple smaller libraries into the useful whole you need, this can provide a stable platform on which to build and develop expertise and higher-level libraries.

For larger scale commercial work, it's a harder sell. As you've pointed out, Clojure is not hugely popular, so it's fundamentally a minority language. This can make VC's touchy about funding. This is true to the extent I'm aware of at least one organization that started moving away from Clojure for that reason.

There's also the shape of the learning curve. It can be hard to get started with Clojure because of the issues around the syntax and associated tooling. The more piecemeal aspect of the library ecosystem can then make it harder to hit the early successes a larger framework-oriented approach can give you out of the box. You can get there, but it at least takes more initial effort. The same is true for all the abstractive power of Clojure (and other Lisps). Abstractions are nice, but they take time to develop and the payoff comes on a considerable lag. The useful rule about waiting to abstract until after you see 2 or 3 instances of a pattern means you need to have spent enough time to see those 2 or 3 instances (and maybe a few more) before you really start to see the payoff in your own code.

The net of all this is that it's a language that may make it more difficult to get funding, will be initially somewhat confusing to most developers, and the payoff may well be deferred to the point you don't see it before you give up (either out of frustration or due to external factors). All in all, a considerable set of headwinds.

So what does that mean? It's probably better for projects on a longer time horizon that have a team willing and able to put in extended effort to effectively use the language. (And if the team is not self-funded, good to have a funder with some ability to accept the risk of a non-conventional solution). Not saying these projects don't exist, just that they're not common enough to build a 'popular/mass-market' ecosystem on.


Working on Clojure and Datomic!


The whole Clojure ecosystem and the wonderful tools around it never really took off due to unclear documentation, poor onboarding and too few evangelists. Datomic and other products are really cool but are now being given away as scrapware due to this lack of effort to make the whole ecosystem more palatable, colorful and easy to get into for new audiences.


Even as a Clojure hobbyist I feel like all of these points are off? Between 4clojure, clojuredocs and the slack channel and the surprising number of books available, the onboarding and docs are great. And when I think of my favorite lang evangelists, hickey and nolen are absolutely #1 and #2, and have influenced me heavily, despite my day job not involving Clojure at all.


Every single day I wish the architects at my current job had chosen Datomic instead of Postgresql. It would have saved us so so much time and trouble. The time traveling ability alone would have been so useful so many times.

Also the ability to annotate transactions is awesome.

So many goodies.

Here's a good summary:

https://medium.com/@val.vvalval/what-datomic-brings-to-busin...


I am almost certain you don't.


Don't what?


Datomic LOOKED cool 10-12 years ago when it first came out. But they started from day 1 with a price, so most people, me included, just passed over it.

I think they went commercial way too fast, and needed to go to a freemium model to actually get market share.


This doesn’t quite reflect the history. Datomic had various free/trial options. They evolved a little bit. Someone who watched the pricing and licenses very closely probably could do a better timeline than I could.


Right but it was always very clear “you can have it in dev for free, but prod is $$$$$”. It was not something like “use it free in prod as much as you want, or pay us for support”

I had a few projects it would have been cool on, but I just did postgres instead and won in the long run.


It notably powers Roam, a kind of interesting notes/mini Notion product.

There's a reasonably interesting writeup of the tech details that helps show off Datomic's value some, https://www.zsolt.blog/2021/01/Roam-Data-Structure-Query.htm... https://news.ycombinator.com/item?id=29295532


Not complaining about the actual announcement itself here: seems pretty sweet all things considered, But: the "Is it Open Source?" section should lead with "No." It's not a complicated question, and it's not a complicated answer. I think it's weird to talk about having "all the same rights" without explaining why that matters particularly (it does matter, it's just not explained much!) but it is somewhat tangential to the question being posed which has a very clear and straightforward answer.

I hope more companies consider this unusual arrangement at least as an alternative to other approaches. Permissively licensed binaries can come in handy, though it certainly comes with its risks. For example, Microsoft released the binaries for its WebView2 SDK under the BSD license; this is nice of course, but the side-effect is that we can (and did) reverse engineer the loader binary back to source code. I suspect that's unlikely to happen for any substantially large commercial product, and I am not a lawyer so I can't be sure this isn't still legally dubious, but it's still worth considering: the protections of a EULA are completely gone here, if you just distribute binaries under a vanilla permissive open source license.


I felt the exact same thing. I was pretty sure the answer was "No" but they were too chicken to write it.

Were they hoping that people wouldn't notice?


It worked for Docker Desktop and BSC.

The products even have github repos! There's just no source in them.

It's open source cosplay.


> It's open source cosplay.

I am stealing that phrase. It's perfect.


You'd think they could have just left that question out of their FAQ and licensed it with a "free-as-in-beer" click-through license.


Probably. They were hoping that the less-informed (like me) would see "The Datomic binaries are being released under the Apache 2.0 license" and think "Datomic is relicensed under Apache 2.0".


There are also similar datalog immutable databases with fully open source codebases out there too like https://www.xtdb.com/ or https://datahike.io/


Came here to search for the mention of XTDB (formerly known as Crux). I have never tried Datomic and only trivially played with XTDB, does anyone know if there's a comprehensive comparison of the two?

update: So, I asked chatGPT and this is what I got:

----

ChatGPT> Do you know how Datomic and XTDB differ?

Yes, Datomic and XTDB are both database systems, but they differ in a few ways:

1. Architecture: Datomic follows a client-server architecture, where the client sends requests to the server for data manipulation. On the other hand, XTDB is designed as a distributed database system, where data is automatically partitioned across multiple nodes for high availability and scalability.

2. Query Language: Datomic uses the Datalog query language, which is similar to Prolog, for querying the database. XTDB, on the other hand, provides a modern and expressive query language based on Clojure.

3. Data Model: Datomic stores data in the form of entities, attributes, and values. XTDB uses a document-based data model, where each document is a self-contained unit of data.

4. Distributed Transactions: XTDB supports distributed transactions while Datomic does not. This means that XTDB can guarantee the consistency of data across multiple nodes in a distributed system, while Datomic cannot.

5. Open Source: XTDB is open-source, while Datomic is a commercial product, which requires a license to use.

Overall, both Datomic and XTDB have their own unique features and advantages, and the choice between the two depends on the specific requirements and use case of the application.

----

I dunno, it sounds like XTDB is a clear winner here, no?


I'm tickled you asked ChatGPT but I think it misses the forest for the trees somewhat.

1. Architecture - XTDB is designed to work with Kafka, but it also works with typical JDBC databases as a datastore. Datomic can use SQL databases, DynamoDB or Cassandra if you want distribution. I personally would think long and hard before I introduced a distributed database to my organisation "because it scales".

Part of the value proposition of Datomic is it easily scales read workloads horizontally and by isolating read and writes into separate processes it improves write performance significantly. The metric usually thrown around is 75% of CPU cycles in a traditional RDBMS are concurrency coordination, which is avoided by the Datomic model. That number is quite old now so I don't know if it's still accurate as of 2023.

2. Query language - both use Datalog and support the Datomic `pull` syntax. XTDB also supports SQL.

3. Datomic's EAVT datoms are a compelling feature because they are so generic and can be used/re-used in many contexts. A document database would have to fit your use case pretty directly.

4. Datomic has a single transactor process. Do you need distributed transactions? Does Datomic need distributed transactions? You'd have to find someone from say, Nubank, and ask them for war stories. :-)

5. Datomic is now free-as-in-beer.

In my unqualified opinion XTDB is appropriate to choose in the following situations:

- You need to model "valid time" as part of your domain.

- You want a document database and are happy with everything that entails.

- You need access to the source code of your database.

- You have existing analysts who know SQL but don't know or can't learn Datalog.


1, 2 and 4 are not to be trusted ;)


I didn't realize it was possible to release binaries under a different license from the source code that generated them. In this case, is the "source code" just the physical machine code or bytecode in the binary?


You can attach any license text to anything, but most open-source licenses make little sense when applied to binaries. Like in this case, the Apache 2 license doesn't make a distinction between the source and binary, referring to both as "the Work":

> You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that (...)

Applying this only to their binaries directly contradicts what the license says.


I suppose you could modify the binaries directly and redistribute and it would be legally allowed? Not like anyone is going to do that though.


Disclaimer; I'm not a lawyer, nor do I play one on TV.

A copyright license is a copyright license: in theory, all a copyright license does is give you additional rights to use something. Using a license like Apache 2 for binaries is somewhat unconventional, but it's totally possible. It (obviously) does not give you access to the source code, and I think this could never work with the GPL and other copyleft licenses because they use wording that implies you need to distribute the source code, which you don't have.

The copyright owner, of course, has ownership, so their obligations don't really change by virtue of giving someone a copyright license. As far as I know, they could give someone a license to use something that is completely invalid and could never actually be used, and they can definitely do things like stop distributing under one license and switch to another. They own the source code, and they own the binaries (I believe the binaries would be considered a sort of derivative work in copyright terms, but again, not a lawyer.) So when they distribute a binary under a given license, it's unrelated to any particular distribution of source code. The only time this gets complex is when the ownership of an asset is split among many disparate parties, at which point everyone is pretty much beholden to the copyright licenses; like open source projects without CLAs. But if they own the source code entirely, they could, for example, distribute some source code under GPL, but then distribute modified binaries under a commercial license with a EULA, and not redistribute the modified source code, since it's their code to license to others, not a license they are subjected to themselves.


If you think about it, that's happening every time that you get a closed-source binary distributed to you. They're giving you a license to the binary, but not to the source.

It's certainly weird for the binary license to be Apache, rather than some proprietary EULA, though.


I am not a lawyer, but doesn't the Apache license specifically grant the right to redistribute the source code?

> You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form


But what if you never had the source form? You can’t redistribute what you never had…

This is actually an interesting question. But I can’t see how a binary only distribution would be in the spirit of the Apache license.


It runs on the JVM, right? Presumably it could be decompiled and cleaned up (might be a massive task - but possible), and the reconstructed source would fall under the Apache license.

What a weird choice to make.


It is the very reason the license exists.

It is in the spirit and the letter of the license to do binary only distribution.

Why do you think so many companies are pushing the Apache License 2.0 over the GPL?


I thought the reason was the explicit patent language and lack of license virality.

The Apache license 2 is pretty clear that binary only distribution is allowed, but I think it’s also clear that the assumption is that source is available in some form. Otherwise, why would you care about derivative works?

As is, it would be possible to decompile the JVM code into something resembling source code and then distribute that with or without modification. Which just seems odd to me.


They lead with, "The Datomic binaries are being released under the Apache 2.0 license."

I can feel the internal open vs. closed source argument from here.


Saying “no” also would provide a nice opportunity for Nubank to explain why.

To some, the answer is “open source” no matter the question. Hello wagging tail, meet dog.


Eh, it's Java bytecode and it's Apache 2. So you can patch Datomic and extend it without breaking the license.


I wonder if public benchmarks are now allowed with the binaries.


Somebody asked about the DeWitt Clause in the Slack, and got crickets.


You can also probably decompile it


Decompile it to Java - yes, decompile it to Clojure - highly doubtful.


There are projects in Clojure to decompile Clojure-derived Java bytecode to Clojure. It works pretty well overall since Clojure's compiler isn't too fancy.


I'm not sure, but I think from a marketing perspective, you do not want to write "No" anywhere. I might be wrong, and I'd love if someone with knowledge in this area could answer. It's something I recall from the back of my mind...


Dishonesty and subterfuge are bad marketing tactics.


Marketing to developers is also decidedly different than for non devs. We've been baited and switched, and screwed over a million times (thanks Google). It takes a lot of trust building to get me to trust any organization.

I especially don't trust free shit if I don't have access and control of the future of the code. Even then it's not a sure thing.


Is it though? Seems to me that most marketing today is subterfuge.


Is most marketing good? I don't think so.


I don't know what you are calling "good". That is a super loaded word. I would say it is effective though. The baseline evidence for that is how much money gets spent in that industry, I guess.


> I would say it is effective though.

Really depends on your metrics. Very little of it builds long-term trust that didn't already exist. Much of it abuses long-term trust.

See the discussion on https://news.ycombinator.com/item?id=35590734


> The baseline evidence for that is how much money gets spent in that industry, I guess.

I can't imagine a clearer example of circular reasoning.

I can lift exactly as much weight as I decide to put on the barbell.


While I see your point, please note people actually spending money don't just look at how much money is in the industry. They have internal metrics and can track performance of individual marketing campaigns. I called it a baseline argument because I'm not a marketing expert, so this is the only real KPI __I__ understand.


My argument is that you are not alone: people in the industry don't have much else to go on either.

Marketing performance is an incredibly confoundable variable.

For example, I have heard one commercial at least 5 times in the last week: advertising open positions at McDonald's. Part of that very commercial stated that, "1 in 8 people have worked at a McDonald's." I am literally one of those people! How could anyone possibly measure the effectiveness of that commercial? Is it even meant to be effective at all?

The overwhelming majority of advertising I see is from "household name brands". The notion I have heard is that the goal is not to introduce themselves to new customers, or even drive more traffic to their brand: it's to keep their status as a "household name". Do they do this because it is effective, or because they are simply big enough to afford it?


Being a marketing tactic makes it more scummy, not less.


Given the recent bait and switch moves by many companies from free-for-everyone open source to free-for-users-only (eg apache to agpl), it seems like doing this with "binaries only" practically admits to a bait and switch plan.


> Is it Open Source?

> Datomic binaries are provided under the Apache 2 license which grants all the same rights to a work delivered in object form.

That doesn't answer the question at all. I assume the answer is no, because otherwise they would just say yes, and have a link to the source code somewhere. But that is such a weird, and possibly duplicitous way to answer.


I really like Clojure and the ideas behind Datomic but free without source is a trap, every time. They have to make money somehow, but they already sold to a bank. If that bank wants devs willing to work on their systems after the current generation moves on, I think they'd be better off going open source and continuing to pay good devs to work on it. Everyone already knows lock-in is bad for businesses. Devs will seek non-proprietary solutions first, and if they can't find one, there are already plenty of proven proprietary solutions they'll settle on way before Datomic. Open the source, sell the support.


You could look into http://xtdb.com/ if you want an open source alternative


> Datomic Cloud will be available on AWS Marketplace with no additional software cost.

This is cool as well. It's a CloudFormation template based product you can deploy from AWS Marketplace.


Since the conversation seems to be focusing on the Apache 2.0 license, what would you do? Clearly there isn't a lot of precedent for "closed-source, free-to-use" licenses.

In this case Datomic maintains development control over their product and "source of truth" is still themselves, and the implicit assumption is that you enthusiastically use their product for free with no strings attached because you respect them as the source of truth.


> Clearly there isn't a lot of precedent for "closed-source, free-to-use" licenses.

Freeware has been a thing for a mere four decades now.

https://en.wikipedia.org/wiki/Freeware


The price is certainly right, but has anyone used this in production? What was your experience like?


My personal experience was using Datomic backed by DynamoDB, at the second Clojure company I worked at. In particular I remember feeling like it was hard to anticipate and understand its performance characteristics, and how indices can be leveraged effectively. Maybe if we had chosen Postgres as a backing store that would have been better? I dunno.

Using it was pretty nice at the scale of a small startup with a motivated team, but scaling it up organizationally-speaking was a challenge due to Datalog's relative idiosyncrasy and poor tooling around the database itself. This was compounded by the parallel challenge of keeping a Clojure codebase from going spaghetti-shaped, which happens in that language when teams scale without a lot of "convention and discipline"--it may be easier to manage otherwise. All of that said, this was years ago so maybe things have changed.

At this point I'd choose either PostgreSQL or SQLite for any project I'm getting started with, as they are both rock-solid, full-featured projects with great tooling and widespread adoption. If things need to scale a basic PostgreSQL setup can usually handle a lot until you need to move to e.g. RDS or whatever, and I'm probably biased but I think SQL is not really that much worse than Datalog for common use-cases. Datalog is nice though, don't get me wrong.

EDIT: one point I forgot to make: the killer feature of being an immutable data store that lets you go back in time is in fact super cool, and it's probably exactly what some organizations need, but it is also costly, and I suspect the number of organizations who really need that functionality is pretty small. The place I was at certainly didn't, which is probably part of the reason for the friction I experienced.


Newer releases have improved significantly in this area. It's now possible to understand perf implications with the addition of io-stats[1] and query-stats[2].

[1] https://docs.datomic.com/pro/api/io-stats.html [2] https://docs.datomic.com/pro/api/query-stats.html


Although it is true that "time traveling" queries are relatively rare for production needs, the basic architecture supports things that many applications really need:

- It is possible to make queries against the database PLUS additional data not yet added, that is, "what if" queries

- Having a stable database-as-value is really useful for paginating results; you don't have to worry about new values being inserted into your results during execution, the way you do with traditional databases, no matter how long (minutes, hours, even days) you take to traverse the data

- Reified transactions makes it possible to store extra data with each transaction, trivially, such as who made the update and why

- Immutability is amazing for caching at all layers


https://sayartii.com/ is using Datomic stored on postgres that I have set up on Linode. That was all done back in 2020 and haven't needed to touch it. Site now gets ~180M monthly reqs and I store an enormous amount of analytic data on Datomic (was supposed to be temporary) so users can see impressions/clicks per day for each advertisement. I'm surprised it's still working.

Development experience is extremely nice using clojure. I've used it for two other projects and has been very reliable. My latest project didn't really need any of its features compared to a traditional rdbms but I opted for it anyways so I don't have to write sql.


Was it expensive to run? Now that it is free, I guess that's less of a concern!


Yes, it's used by many companies in production. There's a partial list here:

https://www.datomic.com/customers.html


Not a good look when four out of the six companies behind the "customer stories" do not exist anymore (or at least their website is dead ...)


to be honest the testimonials are terrible.

> “Datomic added to DynamoDB was the only option that didn’t force us to sacrifice features or add the expense of developer time. Without it, we would have had to push back a lot more, as the features would have been too difficult.”

(https://www.datomic.com/the-brigades-story.html)

like, what? effectively useless information.

some of the other testimonials mention keeping revision history, which is neat, but why Datomic vs. others? it's pretty easy to keep revision history with other databases too.


It's not simply revision history, it's a complete record of everything with time, without re-architecting your data or app. IIRC Datomic structures your data so that all transactions and state have a time dimension, so you can go forward or back in time trivially (no special query, no temporal SQL, etc.)


What happens when you have a legal obligation to delete or anonymise data for some reason?


There's https://docs.datomic.com/pro/reference/excision.html - but like in other data models you also might choose to not store sensitive information like PII in cleartext in the main DB at all. At least in earlier versions excision wasn't supported in the Datomic Cloud version.


Excision still isn't supported for cloud, it is / was on their plan but there was never any movement in that direction.


I wonder if this is related to the lack of guarantees about actual data erasure on delete in the backing storage. A lot of users probably don't take this into account when building on top of cloud storage services.



Worth keeping in mind that Nubank owns the company that makes Datomic, so that might colour their opinion. On the flip side they probably wouldn't have bought the company if they thought their product was crap.


My guess is they bought the company because they were already too invested in Datomic. So it was kind of forced to minimize risk.


It also helps them tune it for their own use cases, where other users have to live with it being closed source.


I haven't used it but I guess Nubank uses it. https://www.youtube.com/watch?v=qIdrT6r77gA


Netflix, Facebook (Meta), Nubank, and many others.


I couldn't find a single reference in any codebase to "[^te]datomic" at one of those MAANGs mentioned.


Congratulations to Rich Hickey's children!! I hope your college experience was excellent. Disclaimer: that is how Rich explained why Datomic stayed closed source.


If you work with Datomic, my little helper functions may be useful: https://github.com/avodonosov/datomic-helpers


There are already a few open-source alternatives that run Datalog-variant query languages. I'd point the curious towards TerminusDB [1] and TypeDB [2]. TerminusDB is implemented in Prolog (and Rust), so it's an alternative with datalog at its heart.

[1] https://github.com/terminusdb/terminusdb [2] https://github.com/vaticle/typedb


The licensing was the biggest thing hurting Datomic adoption. This is a smart (albeit late) move.


It's free as in beer not as in freedom and the way they formulated that part made me wary of their intentions. (I'm still a huge fan of Clojure)


Free as in beer. Not free as in speech.

> Is it Open Source?

Datomic binaries are provided under the Apache 2 license which grants all the same rights to a work delivered in object form.

Datomic will continue to be developed at Nubank, where it is a critical piece of our infrastructure.


So...no, it is not open source. I wish they would just answer that in a straightforward way.

"No, the source is not available, and the product will continue to be developed by us, internally. However, binaries are provided..."


Yeah the answer on the page is a total meme. Kind of disappointing, since I like Datomic as a piece of tech. :(


I find that answer confusing. I don't think I've ever seen binaries being licensed under Apache 2 without the relevant source code before.


It gives the licensee the ability to distribute the binary, use it, or include it in their products in the same way as an open-source product. It just prevents modification without decompiling (which I assume is not easy given it's Clojure, not to mention obfuscation?). And presumably it makes it less likely someone would just produce a competing product if they should choose to re-monetize it?


Why would it prevent modification or decompiling?


Only due to difficulty, not by license.


Yeah, it's a complete bullshit move. Mongoose OS (an embedded iot Plattform not the db) does something similar. It's extremely weasely and doesn't instill trust at all.


Trust is not the same for everyone. One model of trust is: A trusts B to do C.

Nubank is releasing the binary permissively. You might want more, but this is not a breach of trust.

Datomic has been around for more than 10 years, so there is ample data to base expectations.


[flagged]


There is nothing wrong with saying "It is not open source but you can freely use the binaries". That is the same thing, just upfront about it; this instead feels like open-source-washing.


Nothing wrong with that at all. But that's not what was said. Answering the question (that you posed to yourself in your own FAQ): "Is it Open Source?" by stating: "we've licensed the binary using an Open Source license" seems a bit disingenuous. A more matter-of-factly answer would have been: "No".


I saw Fabrice Bellard do it this year (MIT): The CPU version is released as binary code under the MIT license. The GPU version is commercial software. Please contact fabrice at bellard dot org for the exact terms.

https://bellard.org/ts_server/


> perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.

Doesn't this mean, that, as soon as I (somehow) get hold of the source code, I can distribute it as I want?


Probably not, because the source is not a derivative. Having said that, if you decompile the binary I bet you could distribute that source.


That would be very ugly source, as Datomic is written in Clojure and AOT compiled to Java bytecode. Due to the architecture of Clojure (especially, the use of macros) it is not exactly possible to work backwards from JVM bytecode to anything that looks like the original source code. It's not like Java where a clever decompiler can exploit output patterns generated by the Java compiler to make reasonable guesses at the structure of the source code.

But this is all beside the point; Datomic is now free (as in beer) with a great license (Apache 2.0). You can use this amazing tool for free, and you have as much need to look at the source to do so as you might need to look at PostgreSQL's source.

Some of us have been hoping for this day since Datomic was first announced, but even as an insider (I have been working at NuBank NA for less than a year) I was stunned at the speed with which this decision was made and implemented.


> Due to the architecture of Clojure (especially, the use of macros) it is not exactly possible to work backwards from JVM bytecode to anything that looks like the original source code.

I mean, presumably if you run a Java decompiler over it you'll get perfectly-legible Java source code. Just Java source code that makes a lot of calls to methods defined in the Clojure stdlib.

I'm guessing it would look a lot like what an Objective-C or Swift program looks like when you throw it into a C decompiler: a lot of boilerplate for imperatively building stack-local data structure temporaries to be passed into runtime functions, but otherwise the same code you'd expect.

> But this is all beside the point; Datomic is now free (as in beer) with a great license (Apache 2.0). You can use this amazing tool for free, and you have as much need to look at the source to do so as you might need to look at PostgreSQL's source.

Personally, I don't want to use Datomic as a tool; I want to use Datomic as a bag of parts. I want to pilfer the major components and libraries within the Datomic codebase, reusing them inside other things. (You know how CockroachDB and Clickhouse both implemented PostgreSQL-syntax-compatible binary wire protocol in part by taking the parser.y file directly from Postgres's codebase? I want to do that kind of thing, with Datomic. And probably using some pieces of Neo4j and/or Apache BEAM at the same time.) I also want to study the data structures and algorithms used to implement Datomic, to better port the concepts used in Datomic to other databases.

If Datomic was a true FOSS project, doing all of that would be simple/easy. With just a binary, though, I can't do any of that.


Reverse engineering:

Some of the things that Clojure generates are valid bytecode for which there is no Java source code equivalent.

Reusing the pieces:

In general, Clojure works so well because it is all of a piece, with many decisions and subsystems working together. Datomic's source is the same way, you can't really consume just part of the elephant, even if you had the source code. Many things that Datomic does simply don't make sense at all out of context.


Maybe, rather than taking pieces out of Datomic to use elsewhere, I want to take pieces of other things and stick them inside Datomic :)


"distribute the Work [..] in Source [..] form"


They didn't license the source form to you, so still no.


I took that to mean the form provided by the authors - eg the "binary source"


There are already some open source alternatives to Datomic. TerminusDB (https://github.com/terminusdb/terminusdb) for example is implemented in Prolog (and Rust), so it has the Datalog-variant query power that makes Datomic so powerful. If you want free as in speech (though I love free beer).


XTDB is also worth mentioning, especially since they’re on the HN front page with a v2 early access announcement. There are differences in how they do things. I can’t meaningfully comment on business usage of either or what the trade-offs between them are.

https://news.ycombinator.com/item?id=35733515


Hasn't everyone learned that "store all the history of changes" is an anti-feature? Legal departments generally do not care for this (it's just more data to make sure you deleted). And it makes schema migrations more painful, as not only do you have to migrate the data you have now, but all of your historical data too! If you add a new property, do you backfill it in your old data (to keep your code working)? Or start special-casing old versions in your code? Neither is pretty.

If you want historical audit trails, make them intentional and subject to the same rules and patterns as your regular data.


What is the benefit of having it closed source?

My view is that Datomic is a novel upstart in the persistence space. Most of their competition - Postgres, Mongo, Cassandra - is open-source, so they're just shooting themselves in the foot. The "pay us extra for convenient hosting and consulting" model isn't threatened by open-source in the slightest.

The only thing I can think of is that they're trying to compete with Oracle/Db2/SQL server, but I can't imagine an enterprise eyeing any of those solutions ever giving Datomic a chance.


I suspect they don’t want to deal with pull requests from people they didn’t hire.


Heh. Even most Cognitects had no access to the Datomic source.


Sounds fantastic - I'd love to try it. I've been keeping an eye on Clojure and Datomic for years.

I always wonder if this sort of move portends an exit of some of the core technical team, who would very much like to fork the codebase and move on, but in this case with only the binaries being opened up, it feels more as though they want some more people to try Datomic out. Databases such as Neo4J do this as well - free to run, but you'll probably want to pay for support.


I saw some people say that they love Datomic but still felt that Datomic's performance is not as good as Postgres, especially in OLAP queries.

Actually, you can get the best of both worlds. Plenish is a library that allows you to sync the content in Datomic to Postgres. https://github.com/lambdaisland/plenish


Where's the source? Why does it say "Pricing" on the website if it's free? This isn't free, it's just an attempt at free publicity.


The pricing page mentions it's free.

> This section only applies to Datomic 990-9202 and lower. Newer versions of Datomic Cloud will be free of licensing related costs, and you will only pay for the hardware that you use to run the system.


RDFox is worth a try as an alternative, also Datalog but C++ based, with incremental reasoning and explainability. It's a database but also a rules engine that can chain any number of rules. As far as I know Datomic is unique for its "query the database at any point in history", incremental tracking of schema changes, and easy-to-use UDFs; it really shines above other databases in that context.


For those who haven't followed the story:

2007 - the Clojure programming language is announced by Rich Hickey and gains quite a bit of traction over the next 5 or 6 years. It never becomes a "top 5" language, but it could still today be arguably considered a mainstream language. It's been endorsed as "the best general purpose programming language out there" by "Uncle" Bob Martin[1] (author of Clean Code) and Gene Kim[2] (author of The Phoenix Project, the seminal DevOps book). The fact that Rich spent two years working on it without pay and without the commercial backing many other languages enjoy is a real testament to his commitment and his vision. A Clojure-related emacs package[3] quotes Rich when it starts a REPL: "Design is about pulling things apart."

2012 - the Datomic database is announced by Rich Hickey's company. The database is praised for its ingenuity and its "time travel" features. It was designed to be deployed anywhere in the beginning, but, over time, it became difficult to deploy outside of AWS environments and even the AWS deployment path was quite cumbersome--the Datomic marketing page used to feature a maze-like diagram of all the AWS-specific features needed to make the thing work (it would be nice to find a link to that picture); I'd think most companies would have trouble digesting that and integrating it into their technology "stack".

2020 - Nubank (a Brazilian fintech backed by at least one US venture firm and a large production user of Datomic) acquires Rich Hickey's company. It appears Datomic never gained much use outside of a handful of companies. Making it free of charge (2023) may be the cost-effective thing to do in such a situation if it costs more to handle billing and payments than are brought in. The reason they're not releasing the source code could be a legal one or simply the fact that open sourcing a large piece of software takes a lot of effort--something a for-profit financial services company like Nubank doesn't prioritize (rightly so).

1: https://blog.cleancoder.com/uncle-bob/2019/08/22/WhyClojure.... 2: https://itrevolution.com/articles/love-letter-to-clojure-par... 3: https://github.com/clojure-emacs/cider/blob/master/cider-uti...


The 2012 section seems not correct. In the time between 2012 and 2020 I deployed Datomic in various non AWS environments. Datomic was never particularly tied to AWS. I think your timeline also misses Datomic Cloud, which was an AWS exclusive product that launched much later than 2012.


If I recall correctly, Datomic gives you the ability to query the database at a given timestamp. Are there other DBs with this feature that folks are aware of?


A related comment in a similar HN discussion:

https://news.ycombinator.com/item?id=14715279


sounds like they're not releasing the source code, just the binaries, even though the binaries are under the apache license

so, less than useful if you want to study and modify datomic; you may have the legal "right to repair" but not the practical possibility


Is SQLite Wasm on the horizon?


Curious why you're asking this question when it seems to have little to do with Datomic going "free"? Did you mean Datomic WASM on the horizon? Or am I missing some other connection between SQLite and Datomic?


Datomic Wasm backed by SQLite.


Interesting!

What are you using SQLite for? If it's analytics, perhaps DuckDB WASM might be an option?



Wow I wonder what their incentive is in making this public and free


> Datomic's is perfect for probably 90% of small-ish backoffice systems that never has to be web scale (i.e. most of what I do at work).

I will also argue that 90% of those don't need this. Just seeing the term "web scale" makes me shy away.


Anecdotally, all the ones I've worked on where I've used SQL have needed it. I've always ended up wondering when, why and by whom an attribute was changed to its current value, but that's not knowable unless you jump through hoops and manually implement it.


Amazing piece of work; it's nice to see it's free.


It seems that they give the binaries for free but they won't release the source code. Can somebody explain to me what's the point of keeping the source closed in this case? I really can't think of any reason


Licensing issues, keeping the door open to making it not free again, greed, lack of understanding from management/lawyers/whatever, not wanting to deal with contributions (though here you can do what SQLite does), false sense of security, etc.


Perhaps security by obscurity, it being now a proprietary component of a bank?


There are valid reasons not to release the source code that have nothing to do with “security by obscurity”: legal, various notions of “control”, and more


And I’d add that security by obscurity is also a valid reason. It’s bad as a standalone strategy, but good as a complementary strategy.

Related: https://news.ycombinator.com/item?id=24444497


> And I’d add that security by obscurity is also a valid reason. It’s bad as a standalone strategy, but good as a complementary strategy.

As the thread you link mentions, the phrase “security by obscurity” historically means (more of less) “security primarily by obscurity”. But sometimes this point gets lost. The thread you mention is interesting.

Wikipedia:

> Security through obscurity (or security by obscurity) is the reliance in security engineering on design or implementation secrecy as the main method of providing security to a system or component.

Summary:

Layers of security (which can include a wide range of techniques, including obfuscation, etc): useful, because delaying attacks and/or making them less likely is useful.

Obscurity as a main method: theatre, because it often leads to self-deception about the true risks involved


Sorry, we’re both editing at the same time :) I added a related link to the parent comment.


Yep :) That’s a great thread, thank you!


Nubank’s ability to keep the Datomic source code private rests primarily on IP law and internal security controls (on employees, contractors, and possibly obfuscating compilation). Disagree?


> Disagree?

No idea! Just speculating…


> I really can't think of any reason

What have you thought about or read so far?


Wow, that came out of left-field.

Hope it goes open-source as well later on.

Thanks guys!


is Datomic Cloud used by Nubank?


Nubank bought Cognitect/Datomic a few years back, so they own it I believe


I've been told by the Datomic team only on-prem is used within Nubank. Could be wrong tho


Heavily, according to people who work there. There appears to be a Conj talk about it today.


This is very good news …


Nice. Now everyone can experience the pain of running S3 + pgsql + a huge binary blob eating container RAM! /rant - it's an awesome piece of software regardless.


Can you elaborate?


Is it Day-Toe-Mike? Duh-Tommic? Can somebody in PR or marketing help me pronounce this correctly in my head? It's causing me stress.


Day-tomic I believe


Die Toma Toe I think.


day-tah-mick (think datum/atomic)


How is Datomic better/worse than simply using DynamoDB?


https://www.datomic.com/get-datomic.html still seems to show licensing fees at the time of this comment.


General PSA: Any open source startup touting their AGPL license is delusional. They will never be used at Google, Meta, or similar because of specific legal directives to avoid these kinds of licenses.


Good thing this is not open source. They are licensing binaries.



