Hacker News new | past | comments | ask | show | jobs | submit login

Please think twice before choosing an embeddable database for hostable software. I see so many open-source projects that get hampered own the line because they chose sqlite and there's now a SPoF that would require a significant rewrite to eliminate - especially as sqlite depending on configuration either corrupts or becomes extremely slow when stored on distributed/network file systems. I assume this goes for duckdb as well.

Embeddable databases absolutely have legit use-cases, but I see them being used for projects where it'd have saved so much trouble to use a proper ORM (perhaps with sqlite support!) from the beginning.




Not sure where an ORM would help in this case. If anything, it might get in the way sometimes (in terms of performance), as some queries might not be well optimised.

But yeah, people shouldn't just "follow the hype". They should always ask why would use this and not that.


I think ORM might have been brought up because in many embedded DBs there is some level of language native bindings that eliminate the need for an ORM. A couple of examples that come to mind would be mnesia (KV store rather than SQL but still an example) or some of the SQLite providers for c# which give you their own mini orm (that you will have to refactor if you change out dbs instead of using sqlite with an orm from the get go)

My preference for ORMs is still on the 'give me a SQL DSL and map my objects' side. Let me use SQL as SQL, give me convenience in interfacing with the DB. Complex features like object tracking tend to be nice up front but require extra diligence down the line to avoid pitfalls, and tend to be risky from a standpoint of composability.

Sorry. Almost went on an ORM rant there.


I guess the thought behind using an ORM here was as an abstraction over the database/storage layer, so theoretically you could change switch out the database easily.

In practice though, real projects often have to sidestep the ORM and handwrite SQL, sometimes using database-specific features.


What I meant more specifically was a database-independent RBDMS ORM (e.g EF Core, sqlalchemy) - for cases when there aren't resources or need for separate db driver implementations.


> especially as sqlite depending on configuration either corrupts or becomes extremely slow when stored on distributed/network file systems

I had Plex running via Docker, and the Plex configuration/metadata stored via NFS on my NAS. For quite a while seeking would be intermittently slow, or plain crash the Plex server requiring a restart of the container. Other weird issues as well.

Finally I had enough and did some debugging and searching, and found that Plex uses sqlite in a way that is not compatible with networked storage. So I moved it to an iSCSI volume, mounted as a regular drive on the Docker host and it's been rock solid since.

But yeah, didn't expect that.


Yup. I was surprised and did the same sleuthing for PhotoStructure when libraries are stored on remote filesystems. I had to add a "local db replica" mode to work around this issue.

When running PhotoStructure under docker, it turned out to be simpler to just assume the DB is stored on a remote filesystem, and enable this mode by default. When running outside a container I can detect if the host volume is remote automatically and set the mode appropriately.

The reason why, other than missing file locking semantics, seems to be from WAL mode's use of an SHM file. Unsurprisingly, memory-mapped files don't work well from remote shares.


You just saved me a lot of time. In the middle of migrating homelab from day from docker/swarm to kube and had not thought of this. Ty.




Consider applying for YC's W25 batch! Applications are open till Nov 12.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: