I Migrated from a Postgres Cluster to Distributed SQLite with LiteFS (kentcdodds.com)
129 points by Fudgel on Jan 5, 2023 | 59 comments



I get that relational dbs like Postgres and SQLite are extremely battle-tested and reliable, incredible pieces of tech with tons of features. But what I’m reading here is a minefield of footguns for fairly trivial logic, so I’m wondering whether these tools were really designed for these use cases.

Now, I don’t know if every server-side app NEEDS to run in 10 regions, auto-refresh session cookies, notify when someone starts typing a comment, report metrics every time the user moves the mouse pointer (or whatever else the kids do for fun these days). But let’s assume for a moment that's the common case.

My understanding is that these relational dbs require a lot of config, awareness of the difficult distributed issues, and other layers such as caching, batching and so on, all of which is incredibly error-prone.

In fact, the most important parts of the ground laws of data manipulation aren’t even part of a schema! I could worry less about enums and nulls, in a world with orders of magnitude more dangerous creatures: losing writes because I wrote to a replica, reading or writing inconsistent state, duplicate writes etc etc. It feels like we have a lot of structure around the easy stuff, but no structure around the hard parts.

As a mere dabbler with databases, can someone help calm my nerves?


LiteFS author here. I agree with you that most applications don't need to have all the distributed bells and whistles. Most projects that I run start out with just SQLite on a single node with Litestream and/or a cron backup. However, the number of applications that need to reduce latency in multiple regions is growing and the options tend to be pretty complicated or expensive. LiteFS aims to be a low-cost, self-hosted alternative to those options.

> I could worry less about enums and nulls, in a world with orders of magnitude more dangerous creatures: losing writes because I wrote to a replica, reading or writing inconsistent state, duplicate writes etc etc

LiteFS uses a primary/replica setup (as do many distributed databases) where the primary can perform writes and replicas are read-only. You won't be able to lose writes written to a replica because LiteFS doesn't allow that. As for inconsistent state, there is a transaction ID that lets you track your replication position[1]. You can implement strict serializability across your cluster using that. Then for duplicate writes, all writes go to the primary so it has the same serializable guarantees as regular SQLite.

I hope that explanation helps. Let me know if you have any other questions.

[1]: https://fly.io/docs/litefs/position/
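
For illustration, a rough sketch of how an app might use that position file to avoid stale reads on a replica. It assumes the "<database>-pos" file format described in the linked docs (hex "txid/checksum"); the helper names and timing are made up:

  import { readFile } from "node:fs/promises";

  // LiteFS keeps a "<database>-pos" file next to each database containing the
  // current replication position as "<txid>/<checksum>" in hex.
  async function currentTxid(posPath: string): Promise<bigint> {
    const pos = await readFile(posPath, "utf8");
    return BigInt("0x" + pos.trim().split("/")[0]);
  }

  // Hypothetical helper: before serving a read on a replica, wait until it has
  // replayed at least the txid the primary reported after a write (e.g. echoed
  // back to the client in a cookie or header).
  async function waitForTxid(posPath: string, minTxid: bigint, timeoutMs = 500) {
    const deadline = Date.now() + timeoutMs;
    while (Date.now() < deadline) {
      if ((await currentTxid(posPath)) >= minTxid) return;
      await new Promise((resolve) => setTimeout(resolve, 10));
    }
    throw new Error("replica did not catch up in time");
  }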


Very cool, and thanks for taking the time, it is indeed calming to a db novice like myself. In fact, I’m using Fly right now to try out the platform in general.

While I have you, do you have any high level thoughts on reactivity within a db like sqlite? (My app needed some real-time features so I have mostly been looking at NATS.io plus websockets). Are there any real-time like features (reactivity, subscriptions, CDC, materialized views) that work well and ergonomically within that context, either today or in the future? If so it could be really, really useful for my app.

Thanks again!


If you're running a single-node SQLite application then you can handle reactivity within your application code itself and use either websockets or SSE to propagate it to the client.
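
As a rough sketch of that single-node pattern (the table, handler, and event names are illustrative, not from the article): do the write, emit an in-process event, and stream it to clients over SSE.

  import { EventEmitter } from "node:events";
  import type { IncomingMessage, ServerResponse } from "node:http";
  import Database from "better-sqlite3";

  const db = new Database("app.db"); // assumes a `comments` table already exists
  const changes = new EventEmitter();

  // After a write, emit an in-process event describing what changed.
  function createComment(postId: number, body: string) {
    db.prepare("INSERT INTO comments (post_id, body) VALUES (?, ?)").run(postId, body);
    changes.emit("comment", { postId });
  }

  // SSE endpoint: every connected client gets a message per change event.
  function commentsStream(req: IncomingMessage, res: ServerResponse) {
    res.writeHead(200, { "Content-Type": "text/event-stream" });
    const listener = (payload: unknown) => res.write(`data: ${JSON.stringify(payload)}\n\n`);
    changes.on("comment", listener);
    req.on("close", () => changes.off("comment", listener));
  }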

It definitely gets more complicated when you have a distributed application. Using something like NATS works but you'd need to synchronize between the NATS message and your database state on your replicas (e.g. NATS could arrive before data is propagated to replicas).

We do have plans to add event data to the LiteFS transaction files[1]. That would let you write out a message like "user_updated id=123" during your transaction and it would get bundled with the transaction payload on commit. Then your application on your replica can listen for events and they are already synced up with your database state.

[1]: https://github.com/superfly/litefs/issues/20


Kent's recommended NodeJS module, `better-sqlite3`, has some very nifty features including the creation of JavaScript user-defined functions[0] that (if I understand this right) can be called from SQLite. Combined with TRIGGERs, I wonder if it might fire a function within the app when an UPDATE/INSERT happens from a different process? (This is me wondering out loud, I don't actually know.)

I also recommend checking out Replicache[1] and alternatives, which may be a better way to handle the networking and database replication so that it doesn't rely on the underlying DB.

[0] https://github.com/WiseLibs/better-sqlite3/blob/HEAD/docs/ap... [1] https://replicache.dev/


This will not work as hoped: a trigger cannot execute code in another process, and the JavaScript user-defined function only exists in the connection that registered it.

But you could use triggers to write to a table (as any process that writes will run that trigger), and then poll that table from another process.
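
A minimal sketch of that approach with better-sqlite3 (the table and trigger names are made up; it assumes a `comments` table already exists):

  import Database from "better-sqlite3";

  const db = new Database("app.db");

  // Triggers fire for writes from any connection/process, so record each change
  // in a side table...
  db.exec(`
    CREATE TABLE IF NOT EXISTS changes (id INTEGER PRIMARY KEY, tbl TEXT, row_id INTEGER);
    CREATE TRIGGER IF NOT EXISTS comments_changed AFTER INSERT ON comments
    BEGIN
      INSERT INTO changes (tbl, row_id) VALUES ('comments', NEW.id);
    END;
  `);

  // ...and poll it from the other process, remembering the last id seen.
  let lastSeen = 0;
  setInterval(() => {
    const rows = db
      .prepare("SELECT id, tbl, row_id FROM changes WHERE id > ? ORDER BY id")
      .all(lastSeen) as { id: number; tbl: string; row_id: number }[];
    for (const row of rows) {
      lastSeen = row.id;
      // notify websocket/SSE clients here
    }
  }, 250);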


I have never considered reacting to the database like that. If I need something to happen on db writes, then I cobble together some kind of event system, or delegate those writes to a function that takes care of the side effects. Things like websockets are app layer. There are many db features that I have never tried. Have you used a reactive database in production?


Real-time is a vague term.

Webassembly is a compile target for SQLite, AFAIR, so you might use that in the client to get the fastest possible response without pulling in too many other libraries/services.

But without more information, it is hard to help you.


> Real-time is a vague term.

Yeah I know. Sorry. Think chat application latency. I think I should have just used event-driven or reactive instead.


The core issue is usually multiple writable nodes, and therefore conflicts and acting on outdated data. Then there are isolation level issues. More distributed solutions usually opt for eventual consistency.

Enforcing a single writable node / primary can go a long way. Of course you've got a lot more options if your data naturally partitions, with few or no relationships among the partitions.


> As a mere dabbler with databases, can someone help calm my nerves?

Just use one of the cloud or SaaS database options which are battle tested, have well-defined guarantees and 24/7 support. Running your own database seems to be a fool's errand if data loss or corruption worries you at all.


I have a couple of questions...

- So LiteFS turns SQLite into a client-server database. But SQLite being embedded is what gives it its low latency in the first place. So, I kind of have to ask: what's the point? Just preferring the way SQLite does SQL over Postgres?

- What language is that database model under "Evaluating the existing data"?


I haven't tried LiteFS yet, but as far as I understand, you still get the embedded performance for reads, and on the primary node you get that for writes as well.

SQLite can do things PostgreSQL can't. Like the "insert or replace" is superior to PostgreSQL upsert for maintaining seed data. And the flexible typing is great for a settings table where you can just have key/value columns and store numbers/text/bool/dates/etc... in the value column. Also, if you want to do a graph style query then SQLite is better than PostgreSQL.
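
For example, a key/value settings table that leans on SQLite's flexible typing might look like this (an illustrative schema, not one from the article):

  import Database from "better-sqlite3";

  const db = new Database("app.db");
  // The untyped `value` column happily stores text, numbers, booleans-as-integers, etc.
  db.exec("CREATE TABLE IF NOT EXISTS settings (key TEXT PRIMARY KEY, value)");

  const put = db.prepare("INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)");
  put.run("site_name", "my-app");
  put.run("posts_per_page", 25);
  put.run("maintenance_mode", 0);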

The way I see it, both PostgreSQL and SQLite have their strengths and LiteFS expands the areas where SQLite can be used.


Postgres has MERGE and INSERT ON CONFLICT. Pg also has JSONB type which can store a variety of types within it.

How are SQLite's replace and typing superior to these?


It all depends upon what you are trying to do.

If you want to store lots of JSON data then yes, JSONB is the better option.

As for MERGE and INSERT ON CONFLICT vs INSERT OR REPLACE, they all work differently; they are not the same. The SQLite option is simpler and better if you just want to ensure a full row is there in a specific state. If you want to preserve certain columns then the PostgreSQL options are better.
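
Concretely, the difference is roughly this (`db` is a better-sqlite3 connection and the table is invented for the example):

  // SQLite: INSERT OR REPLACE deletes the conflicting row and inserts a fresh
  // one, so any column you don't supply falls back to its default -- handy for
  // forcing seed rows into a known state.
  db.prepare("INSERT OR REPLACE INTO roles (id, name) VALUES (?, ?)").run(1, "admin");

  // SQLite and Postgres: ON CONFLICT ... DO UPDATE only touches the listed
  // columns and preserves everything else on the existing row.
  db.prepare(
    "INSERT INTO roles (id, name) VALUES (?, ?) " +
      "ON CONFLICT (id) DO UPDATE SET name = excluded.name"
  ).run(1, "admin");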

Edit: I do wish that PostgreSQL had the SQLite options and that SQLite had the PostgreSQL options, because they are good for different things and it would be nice to have access to all of it regardless of which DB I am working with (I use both in different apps).


If you make a column JSONB you can store strings, numbers, arrays, objects, booleans. They don't have to be nested in an object. Though I suppose if you don't want to decode strings that's an extra step.

I find duck typing in a DB more of a problem than a solution, unless the dataset is so small and variable I don't even need an RDBMS at all.


I prefer SQLite's flexible types for a small settings table (other tables I will set up as strict). Anyway, it's nice having multiple options of how to do things.


sqlite has json1 enabled by default these days

https://www.sqlite.org/json1.html


The Litestream and LiteFS sections near the beginning of the article cover your first question.

And I guess the second one is covered by the "generator" entry, but:

https://www.prisma.io/docs/concepts/components/prisma-client


> The Litestream and LiteFS sections near the beginning of the article cover your first question.

I don't feel they did.


Do you like Prisma?

I think it's cool, but it sucks that their Prisma schema doesn't support constraints and more complex things. It is very basic, and for any real-world use migrations need to be used.

And I've had issues using it with SQLite. It doesn't support case-insensitive filtering when using SQLite (I know, neither does SQLite). But as a result, I have to build a custom query, which is pretty much what I wanted to avoid by using Prisma.


I really like Prisma, and talk about it to anyone who'll listen, but it is still very new, and as such I think it doesn't have a lot of features figured out yet; some really surprising things do come up.

One I discovered just the other day [0] was that if you pass an undefined value into a field in a where clause of findFirst, it's treated the same as if you didn't pass anything at all.

If you do this with an id, for example, this can result in just the first record in the table being returned, because it's as though you have no where clause at all. Naturally this can lead to some surprising bugs if undefined values crop up at runtime, even potentially some really bad security bugs (imagine if this happens on the account table for instance).
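
A sketch of that footgun (the model and field names are made up; `prisma` is your PrismaClient instance):

  // `userId` came from request input and happens to be undefined at runtime.
  const user = await prisma.user.findFirst({
    where: { id: userId },
  });
  // Prisma ignores undefined fields in `where`, so this behaves like a
  // findFirst() with no filter at all and can return the first user in the table.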

I don't mean this as a criticism, I think it is just a normal part of being new and exciting software, maturing over time. But it's always a tradeoff to be aware of.

[0] https://github.com/prisma/prisma/issues/5149#issuecomment-13...


I think the use case is the new generation of platforms as a service (PaaS) popping up: Cloudflare Workers/Pages, Netlify, Vercel, etc.

They all deploy web apps at the edge, so they are distributed by default. A lot of the web apps deployed on these services have small volumes of data and are read-heavy with few writes. They also have low system resources, so an embedded db makes sense?

Running SQLite local to the application will give extremely low read latency, with most of these apps being read-heavy/biased. The sales pitch for deploying at the edge is that you can have hundreds of instances of your app running across the globe, near your users, transparently.

I’m fairly interested in LiteFS as a distributed cache. I’ve seen systems use a message queue to listen for updates and then update local caches, which is a bunch of custom code/plumbing that could be replaced with LiteFS. With LiteFS, the read replicas are essentially materialised views.


> I’m fairly interested in LiteFS as a distributed cache.

We've been finding a lot of use cases for this at Fly.io. We'll use the static leasing in LiteFS for this as it's simpler for that use case.


The real tragedy is that Microsoft and Mozilla conspired to kill WebSQL in the window of opportunity before they both became completely irrelevant in the browser space, and we got the misbegotten monstrosity that is IndexedDB instead of a full-featured SQLite in the browser.


> So LiteFS turns SQLite into a client-server database. But SQLite being embedded is what gives it its low latency in the first place. So, I kind of have to ask: what's the point? Just preferring the way SQLite does SQL over Postgres?

LiteFS author here. Some other folks responded but I'll try to clarify here. LiteFS keeps SQLite as an embedded database so it maintains low latency. It injects itself as a passthrough file system so it can track changes with each transaction. On the replica side, it mostly acts like a separate SQLite process would in that it obtains the same locks as SQLite and applies page change sets to the database.

tl;dr from the application perspective, it just looks like a SQLite database. I mean, it is. We just capture change sets and propagate them behind the scenes.


Seems there is a bit of vendor lock-in here with flyio, the way he's throwing custom http error codes for their infra. Can I just run LiteFS on a few virtual machines and forward writes back to master?

With LiteFS, what happens if master goes down? Can a new one be elected?


> Seems there is a bit of vendor lock-in here with flyio, the way he's throwing custom http error codes for their infra. Can I just run LiteFS on a few virtual machines and forward writes back to master?

LiteFS author here. You can run LiteFS anywhere -- even on a laptop with a few instances. There's no special Fly.io lock-in. The "fly-replay" header is some syntactic sugar for our proxy that makes it easy to redirect requests, but you could do the same with regular HTTP proxying.
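
A rough sketch of the pattern (the status code and helper functions are illustrative; see Fly's fly-replay docs for the exact semantics):

  // On a replica, instead of writing locally, ask Fly's proxy to replay the
  // whole request on the node currently holding the primary lease.
  async function handleWrite(req: Request): Promise<Response> {
    // Hypothetical helper, e.g. by reading the ".primary" file LiteFS exposes on replicas.
    const primary = await getPrimaryInstance();
    if (primary) {
      // This node is a replica; bounce the request to the primary.
      return new Response(null, {
        status: 409,
        headers: { "fly-replay": `instance=${primary}` },
      });
    }
    return performWriteLocally(req); // hypothetical
  }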

> With LiteFS, what happens if master goes down? Can a new one be elected?

Yes, if you use Consul-based leasing then a new primary will come up automatically. Consul sessions have a minimum TTL of 10 seconds so you could lose write availability for up to that amount of time in the event of unexpectedly losing the primary. If you cleanly shut down the primary then it will release its session and another node will pick up the lease right away.

The other option is static leasing where a single node will always be primary. This is a simpler option but you do lose write availability if your primary node goes down.


You can indeed run LiteFS by yourself, without Consul, as a sidecar / wrapper around your application. We do it in our project and have a Docker Compose example at [0]. In this case, you specify a specific known leader node. We haven't tried getting it running independently with Consul to do leader election / failover.

[0] https://github.com/splitgraph/seafowl/blob/main/examples/lit...


Good writeup on using Litestream at this early stage. Based on a skim I wouldn't say it's production-ready, though it might be a good fit for read-heavy apps which don't have a lot of data.

One thing that's not clear to me is whether Litestream+SQLite works in-memory. If Litestream wrote to disk on the primary node, but kept the full DB in-memory for facilitating reads, I can definitely see how it would be a lot faster than Postgres (as the author claims).

Though I'd also like to know if any RDBMSs have been designed for that use case, as I'd imagine traditional DB optimizations have been made for a high incidence of disk access that would no longer hold true if the entire DB were in memory.


LiteFS replicates the underlying file system writes by storing the file on a FUSE file system that understands SQLite writes/transactions; it doesn't look like it changes how SQLite runs on top of that file system.

The underlying OS page cache is still available, as long as the FUSE file system doesn't mess with it[1], so I think this is regular SQLite performance with FUSE overhead on reads/writes.

- [0] https://github.com/superfly/litefs/blob/main/docs/ARCHITECTU...

- [1] https://github.com/superfly/litefs/issues/149


Even reading from disk ought to be way faster than the typical Postgres use case of going over the network, shouldn't it?


I would expect Postgres to be faster for complex operations over large datasets (of course with a helping of "It depends!"), but very interested in knowing if this is generally untrue.


This is generally untrue, depending on what you mean by "faster".

The way most web apps use DBs, SQLite is incredibly fast, because most web apps make multiple queries, often sequentially, and they're very sensitive to DB latency. The network overhead adds up: https://twitter.com/benbjohnson/status/1514969870529560580
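
To put illustrative numbers on it (not benchmarks from the thread): a page that runs 30 small queries sequentially against a Postgres server 1 ms away pays roughly 30 × 1 ms ≈ 30 ms in network round trips alone, while the same queries against an in-process SQLite file each complete in microseconds.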


SQLite is faster even if the PostgreSQL instance is on the same box. If PostgreSQL is on a different box, then SQLite is a lot faster.


I’d wait until the VFS version (non-FUSE) is ready before giving it a whirl.


Interesting article covering the nitty-gritty details of production deployments. I can’t help but think something is wrong with how anonymous users are handled, though.

Loading the site was noticeably slow in MEL despite a SYD deployment. It took ~8s to load the page on a fast home fibre connection, which is abnormally slow for local sites and typical for US-only deployments.

I wonder if anonymous users are unnecessarily being given a session for tracking purposes, and that is causing the slowness. Cycling session IDs for authenticated users makes sense for the UX benefit of not having to log in again, but that speed issue is terrible for low intent anonymous users landing on the site for the first time.


Off topic I guess, but I've been thinking about a collaborative spreadsheet that uses something like SQLite over LiteFS, where each spreadsheet is just a SQLite file. Not sure what the architecture would look like though. Ideally the main server would just keep the files in sync, perhaps with some way to update each user without having to send the whole file. This has the advantage of programmability and a familiar UI, all at just the cost of hosting the file and a webserver.


LiteFS is meant to get write forwarding in an upcoming release, so you could maybe use that to sync, assuming the writes are less than about 10/s.


You will find that maintaining schema consistency during migrations will be more trouble than it's worth.


Hey thanks for your reply. Can you expound on what the issue would be? What do you mean by migrations?


Let’s say you have a hundred databases and you want to do a schema migration. You could naively do all of them at once, but what if they take a long time, or if one fails? What then?

So you implement something to do rollbacks and retries. Now you've created a complicated monstrosity, because you will have users dealing with potentially different databases with different schemas.

You were better off with a single database with a sharding key.


Hmmm, thanks for the reply. Would that be true even if each database is just a single table? In my mind I shouldn't need to do any schema migrations because I'm really just a file server and a UI. But I guess with adding more tables (akin to sheets in Excel world) I would need to keep that in mind.

I may try this when I get some free time, thanks for the feedback!


The problem appears less when creating new tables; creating tables is relatively safe. But let's say in your one table per db you keep track of a few things (columns). For example, you have a column for the name of the sheet, another column for the time it was created, and some other miscellaneous columns. Later on, once you have a hundred users or so, you get a persistent feature request: keep track of the time it was last updated. You write up a simple migration; it looks something like

  ALTER TABLE your_table ADD updated_at TEXT;
This is also relatively harmless, but now you have to figure out how to run this migration across your 100 databases, and deal with the case where maybe some fail. What do you do then? Rollback everywhere? But what if some users already have rows that use the new column? Do you delete their data? Etc. Now consider a more complicated migration, maybe it renames a column, drops a column, adds a default value, drops an index, etc. You can see where problems might come from.
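
To make that concrete, a naive version of that loop might look like this (a sketch using better-sqlite3; the per-database `user_version` bookkeeping is just one possible approach):

  import Database from "better-sqlite3";

  const MIGRATION_VERSION = 2;

  function migrate(path: string) {
    const db = new Database(path);
    try {
      const version = db.pragma("user_version", { simple: true }) as number;
      if (version >= MIGRATION_VERSION) return; // already migrated
      db.exec("ALTER TABLE your_table ADD updated_at TEXT");
      db.pragma(`user_version = ${MIGRATION_VERSION}`);
    } finally {
      db.close();
    }
  }

  const failed: string[] = [];
  for (const path of allSpreadsheetFiles()) { // hypothetical: paths of the ~100 db files
    try {
      migrate(path);
    } catch {
      // Now what? Retry, roll back the others, or live with mixed schemas?
      failed.push(path);
    }
  }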

That said, I do really like your idea. I've gotten into SQLite a lot lately (I work on the database team at Fly.io); I'm just responding to your questions specifically about migrations.


Thanks for the detailed answer. I was thinking of a more hands-off approach in terms of what I do vs what the user does. So in terms of this feature request, it would actually be implemented client side. That is, I would update the client to ask the user if they want this new feature, then use IF EXISTS to see if the table already exists, and if it does, tell the user to fix it themselves. Or could we keep all the metadata in a separate table and JOIN it on the ID? In any event, I'm literally just a reliable file server and a UI in this equation (plus backups).

For context, this would really be aimed at developers and admins to manage data for their clients or co-workers. Stuff that currently is an Excel sheet, but needs to leave Excel world to do some mundane task, such as filling a PDF and uploading it somewhere. You could do this with Word mail merge + Excel, but now you (the admin) are in charge of performing this mundane task every week. And believe me, this kind of thing is very common. In fact, the one special thing I may do is somehow write the data into an Excel file and somehow allow them to sync to each other (not the logic part from either side though).

Not even sure if I would try to sell it or just make it for free. It wouldn't be something streamlined like Airtable, although if it got interest maybe I would take your approach.

If you like my idea btw, feel free to use it! I have nothing but respect for Fly.io; I used it briefly for a side project and it was really simple to set up. You saying you like my idea actually gives me confidence in trying it out, thank you!


Hah, I've got my own backlog of side projects, don't tempt me, I'm like a year behind on it :). If you ever get around to building it or something like it I'd love to hear about it. You should check out tools like https://github.com/nocodb/nocodb which expose a spreadsheet-like interface on top of a database.


Does this mean an entire copy of the DB is stored in-app? Isn't that a bad thing for large sites?


Yep, one full copy of the DB on each app server. So yes, it's bad if your data is 100s of GB, but if you only have in the low 10s of GB of data or less then it's fine. It all depends upon how many servers you need to run your app. If you only have one or two servers then even very large databases of 100s of GB could be just fine.


It's often better than you'd expect, because the alternative is usually in memory caching. Disk is cheap, RAM is less cheap.


Interestingly, our SRE people specifically have a rule "never share the DB and the app on the same server", because apparently a high load on the DB can decrease the overall performance of the server. It also allows them to freely add new DB servers (sharding) without touching the app server.


That makes sense with client/server databases and if you are trying to separately manage database resources from app resources. When you use SQLite you have basically decided that the database and the app are one integrated unit that you will manage as a whole.


Great writeup. Thanks for covering DB writes. I was curious how that worked with SQLite and LiteFS.


Site takes several seconds to load. Author completely overhauls an application just over one year old which was marketed at the time as a prescription for modern web development. Migration plagued by errors and implemented with obvious inefficiencies in data handling. Feature parity with old application not yet reached. Handling analytics with a relational db.


What page are you hitting that's slow? Everything I have tried so far is instant.

Edit: Maybe the author needed to scale up the site infrastructure for Hacker News?


Site loads instantly for me with JS disabled, but after that I've never seen anything like it: it's all narrow, overlapping columns.


Kent is a great developer specialized in frontend, but his whole full-stack experiment is rather overengineered.


What's the problem with handling analytics data with a relational db?


Go read about OLAP vs OLTP; those keywords will lead to the answer you are seeking.


It's more about transactional vs analytical systems. You want those decoupled. You can absolutely still use a relational DB for your analytics database, it can even be column oriented and vectorized if you want.



