For an app running on one VPS? Sure, this might make a decent amount of sense.
But... for an app running multiple instances, the article suggests a lot of extra complexity. Managing that complexity makes less sense to me than just firing up a managed database server (RDS, or whatever).
Even if you're self hosting, I think running a MySQL/Postgres cluster is a lot less complex than the options this article calls out.
edit: and more to the point - MySQL / Postgres replication is boring. It's done by a ton of deployments, and isn't going to surprise you, at least at smaller scale.
Yeah. I've gotten into discussions on HN before about this. sqlite is awesome when you run it in one instance on a VM or something, but solving migration between hosts or backup via the filesystem seems like a mess. Sure LiteFS helps, but is it better than a managed networked database?
> One huge benefit to SQLite is the fact that it runs as an embedded part of your application
But all production workloads I've been involved with the last ≈5 years have either been containerised apps offloading state to databases/caches/blob storage etc, or have strived towards that.
What I do think would be awesome would be an embeddable Postgres library/binary that could use a single state file on your local filesystem for development and use a networked database in production. Getting the benefits I like most about sqlite locally, and not having to deal with files in production.
> sqlite is awesome when you run it in one instance on a VM or something,
This article really needs to be taken in the context of the author’s target audience: He sells courses that primarily target web dev juniors doing learning projects and small projects. That’s why he includes the “most of you reading this” disclaimer in the article:
> So, can you use SQLite? For the vast majority of you reading this, the answer is “yes.” Should you use SQLite? I’d say that still for the majority of you reading this, the answer is also “yes.”
Pushing strong contrarian opinions is part of his social media marketing style. He’s also pushing a “why I won’t use next.js” article and opinion on his social media, while conveniently omitting the obvious conflict of interest that he sells courses that don’t happen to cover Next.js.
Even with his bias you still saw a lot of agreement to his criticisms on the NextJS subreddit (which you would assume was biased towards NextJS) and the React subreddit.
Last time I truly used NextJS was in 2020 (it was v9), and that was only to make a statically generated brochure type site. I had started the site with Gatsby and didn't love Gatsby, so I switched to NextJS and loved it.
Recently I did start up a Next v13 project using the new App Router. But I haven't gotten that far past the initial scaffold and creating a couple components. Seeing all of these complaints about v13, especially the App Router, are making me want to scrap that and use something else.
By the time you're labeling stating your opinions in public and then acting consistently with them as a "conflict of interest", you've abandoned any ability to get anything useful out of the term.
So if I like apples more than oranges, I write a blog post saying I like apples and won't eat oranges, and I also have a business selling apples...it's a conflict of interest?
If a company sells apples and they start pushing social media campaigns about why oranges are bad and they think you should be buying apples instead, then that’s a conflict of interest.
I think some people are missing the conflict of interest because the author blurs the lines between his personal opinion and his business. He turned his personal brand into his business and uses his name as his brand.
EDIT: I think people are missing the meaning of "conflict of interest". Having a conflict of interest doesn't mean something is wrong. You can be right and have a conflict of interest.
The issue is that conflicts of interest exist whenever someone's paycheck depends on an opinion being true. If someone on Twitter was alternating between trying to sell you vitamin supplements and then posting articles about why those vitamin supplements are good for you, HN would have no problem with pointing out the conflict of interest and taking it into consideration in the context of evaluating the claims.
If someone's entire job and personal brand are built around selling courses for particular stacks, that's important context to bring up when they start writing about why other technologies are bad. Agree or disagree with the conclusion, but you have to acknowledge that the writings should be read with the conflict of interest taken into account.
We're missing the "conflict of interest" because you are abusing the meaning of the term.
What you describe is simply a bias.
Conflict of interest would be if I take money from parties A and B. Party A pays me to give a professional opinion about which nutritional supplement to take, which frontend javascript framework to use, etc. Party B pays me to endorse their special Snake Oil Pills and Ointment or React or something else. I take Party B's money and make those endorsements to Party A without their knowledge and without any actual consideration of what is truly best for Party A. There is a conflict of my interests in Parties A and B.
Your definition is excessively broad and could be used to suggest, e.g., that Nike has a conflict of interest because it is their opinion that you should purchase their shoes.
> e.g., that Nike has a conflict of interest because it is their opinion that you should purchase their shoes.
It's even worse: that Nike has a conflict of interest because they think that wearing shoes while running is good, even though it would be extremely weird if a running shoe company thought that running with shoes is bad.
Remove the business from the equation and the opinion is fine — it's the existence of a business that aligns with the opinion that creates the conflict! Galaxy brain definition.
> EDIT: I think people are missing the meaning of "conflict of interest". Having a conflict of interest doesn't mean something is wrong. You can be right and have a conflict of interest.
He has motivations that make him not impartial, but that's not usually termed a "conflict of interest."
You're just using the term strangely. Usually the phrase "conflict of interest" implies the person has some sort of official commitment / obligation / duty to another party that is put in jeopardy because of another conflicting interest.
But this is just some guy selling stuff. He hasn't made any official commitments; he doesn't have any official duty or obligation to remain impartial.
By your definition, anyone selling anything has a conflict of interest because they want money from their customers, which may be against the customers' best interests.
> If a company sells apples and they start pushing social media campaigns about why oranges are bad and they think you should be buying apples instead, then that’s a conflict of interest.
Pretty sure that's just marketing.
A lot of product pages will spell out features they have and list competitors that don't have those features.
> I think some people are missing the conflict of interest because the author blurs the lines between his personal opinion and his business.
No, what's going on is that you don't understand what "conflict of interest" refers to.
Different people having different interests, as when the apple company says you should buy apples even though you prefer oranges, is just a regular conflict. A conflict of interest is when one person has two different interests. The apple company isn't experiencing a conflict in your example.
He made his courses before Next.js rose to popularity.
I wouldn’t be surprised if he offered Next.js courses if it becomes popular enough in the future. For now, the anti-Nextjs push seems like a clear defensive play to steer people back to the course material he already had for sale.
> What I do think would be awesome would be an embeddable Postgres library/binary that could use a single state file on your local filesystem for development
I wired Postgres up for our local dev. I don't believe in mocking the database, so all our tests and local dev run against a real Postgres instance.
The main tricks:
- Store the Postgres installation in a known location.
- Set the dynamic loader path to the installation lib dir, e.g., LD_PRELOAD.
- Don't run CI as root (or patch out Postgres' check for root)
- Create and cache the data directory based on the source code migrations.
- Use clone file to duplicate the cached data directory to give to individual tests.
One thing I'd like to pursue is to store the Postgres data dir in SQLite [1]. Then, I can reset the "file system" using SQL after each test instead of copying the entire datadir.
Yep, I used that trick as well. The evolution we went through:
1. Each test runs all migrations on a fresh database. Each test spends 2.1 seconds on db setup.
2. Each test suite runs all migrations. Each test copies the template database from the test suite. Each suite takes 2.1 seconds to run migrations, but cloning a template database takes 300 ms.
3. Bazel caches the data dir and rebuilds it once for all test suites. Reduces the initial test suite setup from 2.1 seconds to 160 ms (to copy the datadir).
4. Each test in a suite uses clonefile to copy the datadir. Reduces db setup overhead from 300 ms (to copy a template database) to 80 ms (to clonefile a datadir).
Currently, most of our testing overhead is clonefile and cleaning up the datadir. I'm interested in a single file sqlite FS because I could be clever and use LD_PRELOAD to replace Postgres's file system operations with sqlite, avoiding
most syscalls altogether.
Nice, any chance you’ll package all this up one day? I appear to be getting better numbers than these right now but I suspect it’s because the DB is simpler and fairly empty to begin with, so presumably we could be going even faster.
I've tried--macOS doesn't make it easy. It's not a clear performance win since tmpfs with Postgres fsync=off is mostly memory-backed storage via the page cache already.
Another problem is that parallel tests can exhaust memory.
The slowest part of our tests is syscall overhead. A mostly empty Postgres data dir for a medium-sized database with a few hundred tables consists of 3k files. On my M1 macOS, it takes 120 ms to delete the entire data dir. Copying is cheaper at 80 ms using clonefile.
> What I do think would be awesome would be an embeddable Postgres library/binary that could use a single state file on your local filesystem for development and use a networked database in production.
I mean this is more of less what using a local container for Postgres is. It’s not exactly one file but it’s hidden in a volume somewhere where you don’t need to go file digging, so for all intents and purposes sure, it’s one file.
Most ORMs also will let you swap out your RDBMS so you can run sqlite locally and Postgres in prod via the same code. I think this is a bit of an anti pattern though. While the two are similar in most standard use cases, it’s not always apparent when you veer off that standard use case and into something that’s not supported by sqlite. And then all of the sudden you either have a divergence between what works locally vs in production or you start to intentionally limit your app’s capability to only what works with sqlite because you can’t locally validate it.
My general pattern these days is a compose file with postgres and another container to run my migrations (along with anything else I need). Then I have an integration test suite that hits all the code paths that touch the database. The whole workflow gets run again in CI but with the new image via a profile in my compose file. Its so fast and easy to do at this point.
> What I do think would be awesome would be an embeddable Postgres library/binary that could use a single state file on your local filesystem for development and use a networked database in production.
Absolutely! And also, short of that but incrementally building toward it, some sort of simplification of initial db setup and configuration, including really solid, thoughtful defaults (many of them are but I am guessing they could be substantially improved).
I feel like half of people’s attraction to SQLite is the easy setup and first deploy story. Postgres will never match that but could get closer.
In production, use a managed service (RDS, Cloud SQL, tons of others...)
In dev, set up a Docker container.
You won't spend a lot of time on either of these things.
I mean the filesystem is going to be used somewhere of course, and everything depends on your setup / the scale of things.
My point was more that with a networked database, you can easily centralise your backing up of databases. If you are in the cloud with managed databases, the backing up is kinda solved out of the box. If you are doing it yourself or on-prem, you probably have a database cluster of some kind where backing up can be managed (which of course involves a filesystem at some point).
But if you are using sqlite, every sqlite file across your organisation needs to be backed up. You'll probably have some scripts creating the actual snapshot and then some other scripts shipping that away somewhere. Those scripts probably need to be quite bespoke for different hosting solutions. Are you going to ssh into the vm's running the apps and do the backup that way, or will you be running the backing up scripts on each vm?
Of course if you have a single VM hosting a few apps with one sqlite database each, none of this is an issue, and it might be easier in the short term than a networked database. Or even if you are operating at "scale" and treating your VM's more like pets than cattle, it's probably not going to be an issue either.
Why is this considered "not using the filesystem"?
What's the conceptual difference between backing up your Oracle database to an Oracle Cloud Infrastructure bucket and backing up your sqlite database to an S3 bucket?
Your comment implies that running a database cluster is easy, or easier than what OP describes. And while I do not endorse his suggestion of using SQLite in a multi node setup, I can tell you from personal experience that running a multi-node RDMS is hard as hell.
In $oldjob I designed and managed a system that was eventually running many hundreds of MySQL 3-node groups. MySQL in particular has a lot of nobs that you need to understand to set up a cluster that doesn't lose data during a fail over (semi sync, sync bin log, trx commit flush, ....). And then of course: neither MySQL nor PostgreSQL have a built-in mechanism to handle fail overs at all. It's another software stack around that, or even something custom built.
Again, not saying I agree with the author. I'm just saying you're portraying it easier than it actually is.
Tangentially related: my personal open source project (ntfy.sh) is starting to outgrow a one node SQLite-backed setup, and I am now faced with the decision we are talking about here (something I would not have imagined, I might add): do I use some SQLite-replication thing, or do I rewrite it all to work with a RDMS and deal with managing that? It's quite the annoying choice, so I really wish there was a great SQLite backed solution, hehe.
If by "outgrow" you mean you want to improve concurrent read/write performance, availability, and throughput via replication, then I would -- personally, for peace of mind and ease -- look at just buying a managed RDBMS from someone with predictable pricing and using that. Buy PlanetScale, Aurora, Neon, CockroachDB Cloud, etc and let them handle it. PlanetScale and Aurora in particular have plans that let you fix the I/O costs and only pay for compute and storage, which is pretty attractive. They also use standard MySQL/Postgres, so you can migrate in/out to other providers later. It does mean you must spend time on the migration though, and spend time understanding the implications of your new storage choice (latency, failure modes, etc.)
If you just want peace of mind and better disaster handling, and your current uptime, concurrent workload split, and performance is fine -- maybe your hosting provider eats it for an hour or two and you don't want to be hamstrung -- I would suggest just using a tool like Litestream, replicating the DB to S3, and setting up a hot standby server in some alternative region that you can fail over to and that actively synchronizes the working set. Always keep them up to date with deployment automation. Set up some alerting, and if a failure happens, terminate the main instance with prejudice, let the replica catch up by replaying any latest changes, and reorient your load balancer to point to it (Cloudflare tunnels are a good low-tech solution you could use to do that.) This might imply a small downtime window for the hot failover. For bonus points, you can automate this whole task, and actively perform it regularly, swapping between servers on regular cadence -- thus turning the design from having a primary/standby to simply having two interchangeable systems that swap roles. And then you have good confidence in hot-failover disaster recovery. e.g. just do this whole dance every week on Monday at 1am UTC, and you can have confidence it works and will stay working.
I do not know what other architectural constraints you might have. These are just suggestions. Good luck!
I know it's heresy, but could you defer the decision by sharding? E.g. customers with an email that hashes to a positive number goes to instance A; others to instance B. That might buy you some time.
I have and it works really well. This is less about SQLite not being able to handle it -- because it does -- and more about reliability and availability of the overall system. I don't want one server being down to affect the entire system.
The vagueness of this comment and its perceived derogatory nature is merely a result of my ignorance around these technologies. It was not meant the way you perceived it.
That said, I looked at 2/3 a while ago:
- rqlite looks great and mature, though it has a HTTP interface, so I'll need to rewrite my database layer anyway to use it. At that point might as well to Postgres
- dqlite seemed like a failed Canonical experiment to me last time I checked, but looking again I think I may have had the wrong impression.
There is one edge case where comparison makes sense: read only workloads. If you have a large, mostly static database (e.g. a car parts website) it can make sense to just sync a Sqlite file across all your servers.
On the other hand, read replicas are also the easy case for database clusters. And just making your one database server bigger also scales a lot, there is nothing stopping you from adding a terabyte of RAM.
And just making your one database server
bigger also scales a lot, there is nothing
stopping you from adding a terabyte of RAM.
Underrated development strategy. If your max envisioned workload won't exceed what a single server can reasonably handle, this is orders of magnitude easier&cheaper&performant than anything else.
Additionally, I think a large percentage of the HN crowd underestimates what a single large-ish single server can handle these days. Probably because we are all used to only seeing tiny virtualized slices these days.
192+ threads, 2TB of RAM, terabytes of crazy fast PCIe storage. This was multiple racks of servers not too long ago, now in ~2U and it costs less than a car.
You can get Dual AMD EPYC™ 9634 Processor 84-core 2.25GHz 384MB Cache
with 3TB RAM and 200TB of SSD storage for less than a car. So 336 threads in 2u box it's really crazy.
> I can tell you from personal experience that running a multi-node RDMS is hard as hell
There exist varying levels of complexity. Here at my company, we have a very small cluster comprised of a main PostgreSQL instance and four replicas. It just works and requires very low maintenance effort.
But I imagine that things can get really messy if you're dealing with dozens or hundreds of instances, for sure, especially if you need sharding.
> we have a very small cluster comprised of a main PostgreSQL instance and four replicas
As you said, it depends on what you want:
- If you are okay with losing data in the event of a failover, then a simple setup is fine.
- If you want lossless failovers (i.e. not a single transaction is lost), then that's hard.
- If you want entirely automatic failovers, that's also very hard.
- ...
My experience stems from MySQL, and not PostgreSQL, so this may or may not apply to pg. I wrote about my experience in this blog post: https://blog.heckel.io/2021/10/19/lossless-mysql-semi-sync-r... --- TLDR: Not losing data with MySQL is really really hard.
PG fully consistent automatic failovers are also pretty hard, at least they were 3-4 years ago when we did such a setup (initially it seemed easy, but tests showed its limits, and getting it reliable was pretty hard). Now I heard that there are Kubernetes based solution which are said to automate it pretty well, but have not tried those yet.
What concerns me more than anything with a process like LiteFS is the 'unknown unknowns' perspective of what can go wrong - whereas I think MySQL and Postgres have generally more understood failure modes.
I also think that a lot of people in middle sized projects where they're not big enough to have a dedicated database team, etc will end up using some kind of managed database solution - something like Amazon Aurora, or RDS, or even someone like DigitalOcean's managed database product? Those generally take care of sensible configuration, backup, and failover at least.
edit: FWIW, I've done both ways - self-managed MySQL & Postgres clusters and managed services. You're right - getting MySQL in particular set up to be crash-safe etc requires careful documentation reading, and it's very frustrating that default installs of MySQL don't come already configured like this. In my eyes, safe-but-slow is a much better default than fast-but-risky :/
I think the excitement about SQLite comes from LAMP stack fatigue
Every Wordpress site could have been built on sqlite just fine, with better security too. But beyond that, you probably want a real RDMS with auth and alter table.
Back in the day, WordPress always ran on Apache with mod_php, correct? Wouldn't that be unsafe to use sqlite, since every request ran in a separate thread?
Writes and changes to the database itself are locked to a single process at a time, per your link.
If the load is very read heavy the write locks aren't an issue, but if you're tracking page views, comments or any other write-heavy feature, it'll be harder to get good performance.
If you had something that needed very easy setup but also scalability, you could do it all as a monolith and not have anything else to set up besides just the one app and your certificates, and they'd just all automatically find each other.
I'm not sure how else you could do that besides a DHT without a central database or manually assigning certain pieces of data to specific servers.
Honestly, how big your application needs to be to outgrow SQLite? If you can serve X customers from one server with SQLite, good for you. If you need to serve more people, you should be in a position to afford migration to something more complicated and good for you too. I don’t see a problem here. Let’s start small with the right tools and don’t fall in the premature optimization trap. You don’t need DB cluster, Kubernetes and other fancy stuff when you start your project.
SQLite is not easy to use with migrations, because it doesn't support many ALTER TABLE options [1]: you need to create a new table instead of modifying a column, for example. Also, foreign keys are ignored by default and you need to explicitly enable them after connecting.
Also, column types are not checked and you can easily insert a string into numeric column.
Also it doesn't allow you to use multiple application servers.
SQLite attempts to coerce the data into the appropriate type using the usual affinity rules, as PostgreSQL, MySQL, SQL Server, and Oracle all do. If the value cannot be losslessly converted in the specified datatype, then an SQLITE_CONSTRAINT_DATATYPE error is raised.
To clarify the bit about coercion: "SQLite attempts to coerce the data into the appropriate type using the usual affinity rules, as PostgreSQL, MySQL, SQL Server, and Oracle all do."
Worth mentioning that numeric in Sqlite is still just a float.[0] A table of two rows where column a is 0.1 and 0.2, respectively, sum(a) will not yield 0.3.
In Microsoft SQL Server you should use MONEY or DECIMAL.
In Oracle you should use NUMBER.
In Postgres you should use MONEY or NUMERIC.
In MySQL you should use NUMERIC or DECIMAL.
In IBM DB2 you should use DECIMAL.
These are all backed by integer data storage and arithmetic, but the database handles scaling the values for you, to whatever number of decimal places you have configured. SQL Server and Postgres MONEY type will additionally format values with a currency symbol, when converted to a character string.
In SQLite you're out of luck - if you want accounting values you'll have to store them as integers and scale the values yourself.
Source: I work on a mobile app with offline storage of pricing and weighed quantities, using SQLite in the app, and SQL Server on the back end.
You should still not use MONEY in Postgres, even their own website says so [1].
I still wouldn't use floats/numerics/decimals to store currency either in any db generally, as said by others you're going to end up with inaccurate numbers [2].
Therefore using integers is in fact very good for this use-case, especially if you are in accounting or book-keeping!
Source: I work for a Fintech company that processes millions of payments.
At Prisma, the "twelve easy steps" for performing a table change in SQLite documented on https://sqlite.org/lang_altertable.html is a bit of a running joke.
SQLite is super versatile, but very different from other more traditional RDMBS.
A bit of unfair comment. These steps apply only to ALTER statements that are NOT supported natively by SQLite.
The natively supported statements are as easy to perform as in MySQL or Postgres, for example.
> "The only schema altering commands directly supported by SQLite are the 'rename table', 'rename column', 'add column', 'drop column' commands shown above. However, applications can make other arbitrary changes to the format of a table using a simple sequence of operations.
> Also it doesn't allow you to use multiple application servers.
Not in the same way you use postgres etc, but you can do it with sharding or with LiteFS, but you do have to consider carefully how you scale your app.
I'm not _really_ disagreeing with you, but I think you're painting with a bit too broad of a brush.
There's a workaround for the alter table flaw, but it's a bit involved and takes longer - create a new table with the updated fields, move all data, remove old table, rename the new table. I'm sure there's a historical or valid reason for it, it's just annoying.
Not a solid guarantee. More prone to bugs, errors, etc. What is someone changes something using the comnandline client and forgets to issue the pragma command?
I would feel a lot happier using SQLite if this was a per DB setting rather than a per connection one.
Judging by the documentation, if you issue a PRAGMA foreign_keys; and no row is returned containing a 0 or 1, then you are using an unsupported version of SQLite, or the library was compiled with foreign key support disabled. I am struggling to find any documentation that states if anything will occur if enabling foreign keys in the connection string, when the version does not support foreign keys.
It’s also only two years old which is forever in the web world and brand new by Databases ops/maturity standards. There are likely still warts waiting to be discovered (there always are, but the discovery rate tapers over decades).
One thing I don't get ... are people really having problems getting PostgreSQL, or even MySQL, up and running? Is this REALLY a dev/devops/ops hurdle for people?
Just seems strange that is actually a problem.
Don't get me wrong, I use SQLite ... it's just not the database for my company's management web portal.
Just let the HN hype wave wash over you my friend :)
I see the sqlite hype as a (legitimate) pendulum swing away from defaulting to 'web scale' everything to a realization that most of the systems we design don't need it.
On the other hand, when you start mixing in new tech like litefs to paper over shortcomings in the fundamental nature of an alternative, I start to question the sanity of the choice.
We have screwdrivers, hammers, wrenches, let's not select a wrench and wrap it in a 'screwdriver adapter as a service'.
> I see the sqlite hype as a (legitimate) pendulum swing away from defaulting to 'web scale' everything to a realization that most of the systems we design don't need it.
That was exactly my thought, the majority of work that gets put out there isn't going to need to scale to many different nodes to serve thousands+ of concurrent users. If you're actually serving tonnes of traffic and need clustering or high end redundancy, have at 'er. But for most of the professional work I've done, SQLite is more than enough (provided you back it up properly!)
I think this is also influenced by my philosophy of keep things simple and only scale when you need to. I haven't had need of k8s or anything like that and suspect many companies serving small-medium traffic loads don't either.
> We have screwdrivers, hammers, wrenches, let's not select a wrench and wrap it in a 'screwdriver adapter as a service'.
My hats off to you on this comment, even more so if you are the originator. Personal experience shows that I have added more complexity by spinning up a postgres server 'in case I need it' more often than not.
Why do we need to hype what amounts to a core software for the planet. Sqlite is good in some places. Mysql is good in others. Were all smart enough here to make that decision when to reach for what
I question your metaphor - SQLite has been popular here since I started reading this site. Wave suggests a rise and fall of popularity (possibly in a long lasting cycle)... sqlite has always just been popular and well regarded.
You may be noticing it more out of a frequency bias (baader-meinhoff effect) situation.
"Just starting a postgres" is possible but it's a lot harder than it should be.
Generally, you can just run the `postgres` command, but it creates many unnecessary hurdles:
* Just want to invoke `postgres`? Not possible, you need to invoke `initdb` first. Why can't `postges` do that for automatically you if the DB dir doesn't exist, like other servers do (e.g. `consul`)?
* Depending on your invocation, you're likely to run into errors mentioning "initdb: invalid locale settings; check LANG and LC_* environment variables". Why isn't there a simple UTF-8 default?
* Postgres refuses to run as `root`; that's unoverridably hardcoded in its code. So you can't "just start it" when you're `root` inside a container or VM in which only postgres runs. It can't run well under `unshare` when you want things to shut down reliably on cancellation in CI. This constraint should just be removed; no other software refuses to run as root. This mindset is indeed from 27 years ago.
* You can't start an in-memory postgres for quick testing or CI. This feature does not exist.
All of this is fixable, but it currently creats the "oh you can 'just run' postgres, but not in random conditions A, B, C and D".
I always put a docker-compose.yaml into the repo and in the readme some instructions to install docker and docker-compose and that they should run `docker-compose up`
Works fine for almost any team. Obviously it would be great if they would try to understand it, but it's really not necessary.
Not like most web devs know how most of the tools they use work (Source: All the teams I worked at)
“docker run -ti -p 5432:5432 -e POSTGRES_PASSWORD=dev postgres”? Easy enough to type on my phone. That downloads and starts everything you need and wipes the data on exit. Ideal for prototyping. If you want to keep the data, add a -v flag or use a compose file as suggested below, even easier.
Devs not knowing docker? Just install docker on their machine and ask them run the command above. One day has to be the first.
I agree. SQLite is great for some simple purposes, but this article acts like launching PostgreSQL is impossibly difficult and can only be done on a server 100ms away from your web server.
For simple apps I just run Postgres on the same server and size things appropriately in settings. It’s really not hard and it’s well-trodden territory.
If someone is making a single-tenant app that runs on an embedded device or something then SQLite is awesome.
The way this developer concludes that “most of you reading this” should use SQLite is very strange. Is that indicative of his audience building small or toy websites rather than actual production websites?
Exactly the point here. There’s a bunch of brilliant reasons to use SQLite. One good reason is that by the time you outgrow it, you’re probably shifting enough data for it to be a worthwhile use of an engineering team’s time to plan a migration.
If you use an ORM from the beginning, you might not even have to change very much of your application code.
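A hedged sketch of what that seam can look like even without an ORM: a single hypothetical `connect_from_url` helper routes a DATABASE_URL to the right driver, so application code never sees which engine it's on. An ORM does the same thing, just more thoroughly; `psycopg2` here is an assumption about the production driver.

```python
import sqlite3
from urllib.parse import urlparse

def connect_from_url(url):
    """Hypothetical seam: one connection string decides the engine."""
    scheme = urlparse(url).scheme
    if scheme == "sqlite":
        # sqlite:///app.db -> the path after the triple slash
        return sqlite3.connect(url[len("sqlite:///"):] or ":memory:")
    if scheme == "postgresql":
        import psycopg2  # assumed installed in the Postgres deployment
        return psycopg2.connect(url)
    raise ValueError(f"unsupported scheme: {scheme}")

# Development uses a file (or memory); production only changes the string.
conn = connect_from_url("sqlite:///:memory:")
one = conn.execute("SELECT 1").fetchone()[0]
```

The migration-day diff is then one environment variable, plus whatever SQL dialect differences your queries actually exercise.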
I had one use case at my previous job. It was a bit of management software installed on premises / on physical or virtual servers (it was a bit old fashioned). The total installation involved setting up PHP, Apache, MySQL, and then installing the application itself, which also involved running some migration scripts written in shell scripts.
What I replaced it with was a Go binary with SQLite, where installing it would be a matter of installing the package or just the files and starting it up, it would take care of the rest.
SQLite is great for systems you don't control or have to set up, but maybe not for webservers. Common use cases are apps' internal storage (does not need to be shared with other applications or distributed), you don't want to have to install, configure and run MySQL or Postgres in those cases.
I think you vastly underestimate how many people use SQLite for this. The number of times I've seen someone boot up an entire MySQL server for a dedicated app is far larger than the number of times I've seen them just reach for SQLite.
Indeed, and the constraints that allow for SQLite also allow for running a local PostgreSQL instance, so no service discovery, complicated auth flows, etc. are required. It should be even easier.
SQLite is fantastic, it absolutely should be used where appropriate.
I don't however understand those who argue that replacing the likes of PostgreSQL / SQL Server with it is generally appropriate.
The vast majority of my development time for any project is setting up a server, database, web framework, etc. It's a massive pain in the ass and gets in the way of what I'm actually trying to do.
Administering an RDBMS server is only part of the problem. It's also mentioned quite clearly there that by having data locality (just disk I/O) you get massive efficiency gains in queries and effectively don't have to worry about the N+1 query problem. Queries run in the order of microseconds, not milliseconds because of network latency.
A DB server won't have to go to disk for every query; SQLite almost always will, or at least has to reach from user space into the OS page cache, which is still slower than reading from the server's own buffer pool.
So ya, in process is faster for trivial stuff. But it's not faster when doing heavier workloads that benefit from things being in memory or having a better query planner.
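To make the N+1 shape concrete, here is a small sketch (table and column names are made up). With in-process SQLite each extra query costs roughly a function call; over a network each one would add a round trip, which is why the same code pattern behaves so differently:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
""")

# N+1 shape: one query for the list, then one more per row.
titles_n1 = []
for (author_id,) in conn.execute("SELECT id FROM authors"):
    titles_n1 += [t for (t,) in conn.execute(
        "SELECT title FROM posts WHERE author_id = ?", (author_id,))]

# The single-query rewrite that any engine's planner prefers:
titles_join = [t for (t,) in conn.execute(
    "SELECT p.title FROM posts p JOIN authors a ON a.id = p.author_id "
    "ORDER BY p.id")]
```

Same results either way; the argument in this thread is only about how much the N extra calls cost.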
The query planner is still not nearly as good as PG's, but it's okay for simple applications.
I have actually used sqlite "at scale"... I'm still not sure people know that system (30k+qps dumping data into kafka) is running on sqlite on an EBS volume lol. But in this case it's just streaming stuff from disk, so pretty much any DB would have worked.
There's so, so much missing context from almost every "you should use this tech" post. Software engineering is such an incredibly broad profession. People working on firmware for embedded medical devices and people making line of business software for an insurance company have wildly different choke points on productivity and utility. Budgets vary too. Maybe scaling SQLite can be done for low hosting costs if you just invest 200 hours of labor, but for most orgs, the 200 hours of labor are way more expensive than the hosting costs, so just pay for Postgres on AWS and don't worry about it.
It's not about having problems, it's about the work that it takes to do. It's an extra hurdle that simply doesn't exist for SQLite. For a lot of projects that's worth it.
Your comment is basically the famous HN Dropbox comment but for databases: "are people really having problems getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem?"
It likely has more to do with the “data” team being different than the “application” team at a lot of big companies. It leads to many handoffs and inefficiencies and so people do a lot of things to route around it.
Not a knock against SQLite at all, I use it in production for my own projects and some professionally as well.
Note that your operating system caches file system reads so SQLite doesn't need to put anything extra in RAM. Most reads will be from RAM, so no disk access required.
True but the new hotness is edge and your database can be really far from where you actually deploy. This is what this article is trying to push back against. Mainly because Kent is all about Remix and Remix is more about just deploying to traditional servers.
Yes, the net can be very fast. Of course when you start adding IP, TCP, SSL, HTTP, json serialization, chatty protocols, less than stellar implementation languages... suddenly [1] the network is the bottleneck.
Also, SQLite has a configurable local page cache which it maintains directly in your application's process. There's a very good chance that you're just hitting that in-memory cache directly, let me emphasize, in your own process.
In-process memory cache hit is faster than separate-process memory cache hit over ethernet.
I'm just as much a corner-cutter as the next "just yeet it in prod and we'll deal with it later" lazy grug brained dev, AND I fucking love SQLite...
But saying SQLite is good because there's no latency is weird, and saying that THAT is good because you can keep writing avoidable "I am not just lazy but I don't care about thinking about my database for 30 minutes" code... now that's a Leaning Tower of Pisa of arguments that I can't follow.
Yes. We have a bunch of n+1’s in our product, but they are in places where it literally does not matter. A single record getting pulled down, on a page deep in a UX flow.
It’s not worth our time unless they work into pages with large data sets or core flows.
I have "ignore this n+1 warning" in a few places in our unit tests for exactly this reason (details page, so n will never be more than 1). But these don't cause a problem on any database. So, again, not sure how this is a win for SQLite.
AFAIK SQLite can scale quite a bit with WAL (and the experimental WAL2 branch), and then it is probably easy to upgrade to Postgres. If your data model was bad from the start, it doesn't matter whether you used Postgres initially.
If you find out your data model is bad and a constraint on scaling, you can find that out without spending a lot on insanely scaled RDB instances.
I'm for the build one to throw away approach, and SQLite is often a good tool in the toolbelt for the initial attempt of this.
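For reference, the concurrency headroom mentioned above mostly comes from the write-ahead log, which is one PRAGMA away (WAL2 only exists on a separate SQLite branch; this is plain WAL):

```python
import os
import sqlite3
import tempfile

# WAL only applies to on-disk databases, so use a temp file for the demo.
path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(path)

# WAL lets readers proceed while a writer commits, which is the main knob
# for getting read concurrency out of a single SQLite file.
(mode,) = conn.execute("PRAGMA journal_mode=WAL").fetchone()
print(mode)  # wal
```

The setting is persistent per database file, so it only needs to run once at creation time.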
Now we're getting silly. If you have n+1 issues you already have a bad data model which is why what you are really saying is "SQLite lets you have a bad data model which gives you velocity!!!!!" which is obviously (I hope it's obvious) not a good thing.
From a business perspective: You might not have to solve that problem at all in 6 months, because you might not get so many users for the product, that scalability is a problem. You often need velocity more than clean code, because you are not sure if the idea will get traction at all. In that case, from a business perspective, to be able to experiment you need velocity more than a good data model.
I did this kind of exploratory development both with SQLite and Postgres, and both had different strengths and weaknesses. In one case the purist, clean data model in Postgres cost more than it should have when the product had to pivot. I have also seen the scalability limits of SQLite. I'm just now shutting down a POC project backed by SQLite, which failed to get traction mostly because of business execution problems; SQLite saved a lot of implementation effort, and using a "cleaner" and "more scalable" solution would only have reduced velocity without getting us more users. The system worked fine this way (it even has N+1 queries in some places). Had we got to the point where SQLite became a bottleneck, we could have afforded to clean up the data layer (we had reworked it countless times already as requirements were in flux) and move to a more scalable solution. Others might stick Firebase or MongoDB there, and that also works for a while, and is also a valid decision IMHO from a business perspective.
I think this is a kind of topic where the premature optimization is the root of all evil meme can apply, depending on the product you are making and the available resource pool. Not for the hello world examples of course, and not for the well defined well funded project, but there are lot of exploratory attempts out of the bigco/vc funded unicorn world.
It's frustratingly close to correct. Since it's on local disk, it winds up being cached in in-process RAM, which is..... still not zero latency... but has so much less latency than talking to the database over a network stack... that I'm alllllllllmost sorta kinda okay with it.
Yeah, it's interesting to see what happens when people with confirmation bias hit an unexpected opinion and can't explain it away. So all they can do is ignore it and ask everyone else to pretend it doesn't exist.
Saying it's not uncommon for a database to be tens or even hundreds of milliseconds away is ignorant at best. If you care about performance, you're not jumping across the US for every database query.
Honestly this is the kind of stuff written to fluff out a social presence and offers nothing to a technical audience.
There could/can be cases to use SQLite over SQL Server, MySQL, Postgres, etc. It really comes down to reasoning with the problem.
For me personally, I have only used SQLite for 2 scenarios.
1) When developing. SQLite is a simple method to getting something together, before building a proper MySQL/SQLServer/Postgres one.
2) Client software. Some client software needs a temporary data store, and I find SQLite great for this. These are not complicated databases; they barely have about 5 tables. Nothing more than a queue system, really: items go over the wire via REST API to a messaging queue, and once sent with a reply of "success", the item is removed from SQLite. It is a nice, simple system.
There can be a few other reasons.
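The queue pattern described in point 2 can be sketched in a few lines; `send` below is a stand-in for the REST call to the messaging queue:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a file path in the real client
conn.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT)")

def enqueue(payload):
    conn.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))
    conn.commit()

def flush(send):
    """Send each queued row; delete only after the server replies success."""
    rows = conn.execute("SELECT id, payload FROM outbox ORDER BY id").fetchall()
    for row_id, payload in rows:
        if send(payload):  # stand-in for the REST call
            conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
            conn.commit()

enqueue("event-1")
enqueue("event-2")
flush(lambda p: True)  # pretend the server acknowledged everything
remaining = conn.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
```

Because rows survive process restarts when backed by a file, nothing is lost if the client dies mid-flush; unacknowledged items are simply retried.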
Other than the above, I don't think I would be confident using an SQLite file for a moderately sized database, likely to back some web portal or CMS system we created, where more than one request could be talking to the database at the same time. This is where I'd stick to those central, relational databases.
MySQL and Postgres, especially, are not that difficult to install, whether on the same system or its own dedicated server.
I hope my comment is not seen as a negative for SQLite. If anything, my attitude is the opposite. I really like the work and effort that has gone into SQLite. A virtual pat-on-the-back by those involved! It is a great approach to having data in 1 file, rather than writing your own.. or having lots of "records" in their own files (in a directory) which I have done.
Admittedly, I cannot comment on replication options for SQLite; I have not bothered to even look into them. To me, that already goes beyond the purpose of what makes SQLite great: being a powerful SQL database in one file. Great for programs that need a storage solution that doesn't rely on a connection to another machine.
In my client programs using Sqlite, it has been designed in such a way that if the Sqlite file is corrupted then it is not the end of the world. We know what didn't get sent to the server so we recreate the file and resend, etc.
(However, our Sqlite files have been very reliable with the exception of software updates, but we ensure the sqlite db tables are empty before you can update the client software)
That's why I specified "client software" in my original comment, but I don't see why server-side daemons couldn't use it as well. If more than one program needs it for a mix of CRUD operations, though, I would recommend something else.
SQLite is an awesome tool, but I wish we’d stop trying to force it into situations it wasn’t made for. It doesn’t belong anywhere near a web app deployment that currently or will span two or more machines. Once you have a “distributed system” and the complexities of a DR scenario, reach for a tool that was built for such. You can’t go wrong with PG/MySQL. You can go very wrong with a VC-hyped replicated file system from a company known for outages.
Yeah it kinda feels like we're just going back to how things were before everyone decided 12-factor was the new hotness.
SQLite has its uses for sure - just not on a system that might need horizontal scaling, HA, or separate worker processes.
Sure, sure, you can duct tape a few SQLites together but then you're dealing with this
> LiteFS only allows a single node to be the primary at any given time. The primary node is the only one that can write data to the database. The other nodes are called replicas and they provide a read-only copy.
Might as well have just set up a separate DB server, less faff. At the point where you need to scale MySQL/postgres beyond vertical scaling you can probably afford to pay someone to do it for you
I've seen people use a few hundred lines of javascript to create full blown distributed systems that are offline and end to end encrypted and backed by sqlite. They only use the server as a glorified caching system to sync to\from. It's esoteric (logical clocks, conflict free replicated data types), but that's a hell of a lot more distributed than a client-server architecture and it's backed entirely with sqlite.
I mean, this is kind of what I'm trying to get at. You need to build all this bespoke infra in order to fit SQLite into the round hole where a database with native replication is a far more proven and battle-tested solution.
When I say a few hundred lines of javascript (and I'm no fan of javascript) I literally mean a few hundred lines of javascript. No frameworks. No complicated server logic. Just CRDTs and a hybrid logical clock (written in fifty or so lines of JS) and a server running a little message bus. I once heard JLong talk about this, and I demoed the capability to a military contractor in an office in Old Town Alexandria, and they thought it was hilarious. In just a couple of crappy javascript files I was syncing data offline between twenty-five different clients regardless of internal system clock speeds (I used a laptop that was running hot and had a fast system clock) or other bullshit. It was magical. And all of that ran through SQLite. I even used an SQLite PostGIS-like extension during the demo. So yes, SQLite can be distributed in the purest sense of the word.
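A toy sketch of the ingredients being described, not the actual JS implementation: a hybrid logical clock plus a last-writer-wins merge persisted in SQLite. All names and the schema are invented for illustration:

```python
import sqlite3
import time

def hlc_tick(state, wall=None):
    """Hybrid logical clock as (wall_ms, counter). Stays monotonic even if
    the system clock stalls or runs hot, which is the point of the demo."""
    wall_ms = int((wall if wall is not None else time.time()) * 1000)
    last_wall, counter = state
    if wall_ms > last_wall:
        return (wall_ms, 0)
    return (last_wall, counter + 1)

state = (0, 0)
state = hlc_tick(state, wall=1.0)  # fresh wall time -> (1000, 0)
state = hlc_tick(state, wall=1.0)  # same wall time -> counter bumps

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE kv (
    key TEXT PRIMARY KEY, value TEXT, hlc_wall INTEGER, hlc_count INTEGER)""")

def merge(key, value, hlc):
    # Last-writer-wins by HLC order: an upsert that keeps the newer stamp,
    # so replicas converge regardless of message arrival order.
    conn.execute("""
        INSERT INTO kv VALUES (?, ?, ?, ?)
        ON CONFLICT(key) DO UPDATE SET
            value = excluded.value,
            hlc_wall = excluded.hlc_wall,
            hlc_count = excluded.hlc_count
        WHERE excluded.hlc_wall > hlc_wall
           OR (excluded.hlc_wall = hlc_wall AND excluded.hlc_count > hlc_count)
    """, (key, value, hlc[0], hlc[1]))
    conn.commit()

# Two "replicas" write the same key; the later HLC wins either way.
merge("color", "red",  (1000, 0))
merge("color", "blue", (1000, 1))  # newer, applied
merge("color", "old",  (999, 5))   # older, ignored
value = conn.execute("SELECT value FROM kv WHERE key='color'").fetchone()[0]
```

A real system layers message transport and per-field CRDT semantics on top, but this is roughly the whole trick for the clock-and-merge core.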
Running instances, not machines. There's like 15 years of architectural advancement in running web apps that says that more than one copy of stateless software is the best practice. SQLite was not designed to be used for web apps.
HN's bizarre groupthink occasionally decides to subvert architectural norms for fun, and any blog post that's trending on the front page suddenly counts as "good advice". Even if it's literally insane, if it's on the front page it's considered sage wisdom, and naysayers are downvoted for being negative about the insanity.
Those 15 years of architectural norms started at a time when a multi-core 2GB server was an expensive luxury!
We know a LOT more today about building web applications than we did 15 years ago. That's why some of us are ready to say that for a lot of applications, a single server running SQLite (with streaming backups to S3 via Litestream) is actually a pretty great option.
I don't mean use 15-year old patterns, I mean we've learned over 15 years what's good practice and what isn't.
We used to use .csv and .ini files that way. After that we used BerkeleyDB, since it was built into so many tools. Or serialized objects. If you got fancy you'd read/write them to a temporary filesystem, and occasionally sync the files to NFS; if the app restarts, sync the files from NFS back to temporary storage, or if the temporary file isn't there, read it from NFS and write it to temporary storage.
Today, if you have to write a web app that has dynamic content and writes and reads user data, you can do it a bunch of ways:
- "Mostly-JS": Write your app in JavaScript. Some hosted provider out there allows you to submit data to some backend app, and read it from the backend app. (This, by the way, is how CGI-BIN web apps worked 20 years ago)
- "Mostly-CGI": Use some managed hosting provider to host a web app with some framework they support, like Wordpress, or Django, or something else. Typically they bundle a networked relational database with it, but I suppose you could trade that out for an sqlite file. Probably wouldn't support Litestream but you could probably have a cron job copy the file to s3.
- "Mostly-standalone": Buy some virtual machine or physical server, set it up, maintain it, write your web app, run a web server that runs your web app, write the file to local disk, set up Litestream. Never touch it, patch it, upgrade it, etc, so it always has security vulns, but it will just keep running, so it will be much easier to get going.
- "Mostly-managed": Pay for some hosted provider to give you the ability to run your app in containers. You build it, you push it, and then set up the thing to run it. Like the previous option, you can opt to never patch or upgrade anything so there's nothing to maintain. But since you can use a bunch of SaaS providers to build, test, and deploy automatically, it's way less to maintain and much more automated. Again, typically you'd use a networked relational database, since they're largely designed to run one app at a time, but you can customize them to have some background task try to upload your sqlite file out of the container or a sidecar.
- "Completely managed": You upload your code. They completely manage the versions of your app platform, doing patches, upgrades, running different instances of your app, databases, web servers, whatever you need. You basically don't have to think about anything but code. You could use sqlite but there's probably no way to back it up.
If you don't need to read and write user data, you don't need a web app at all. A static site works just fine. Hell, you don't even need a static site generator anymore, just link to the right css or js files in your html to DRY up your static content.
Vast vast majority of web apps run fine on a relatively modest VPS. And almost all will run fine on a beefy dedicated server. No amount of best practice changes that.
Best practice is the practice that works the best. That isn't groupthink. If it works, it works, regardless of whether a single person or a group is doing it. Groupthink would be whatever the group thinks is best, which isn't necessarily what actually works well at all.
The reason why running an app on a VPS is not best practice isn't that the VPS can't handle it; it's not a question of scale. The issue is that there are many different problems with running an app on a VPS, and it is much better to avoid those problems by not using a VPS at all, rather than to spend a lot of time trying to mitigate those problems on the VPS. If there is an alternative that is lower cost than the cost of mitigating the issues on the VPS, then that's the better solution.
It turns out that there are solutions like that, among them Heroku/DigitalOcean App Platform, DigitalOcean K8s/AWS EKS/GCloud GKE, Fly.io/AWS Fargate, Google Cloud Run/AWS App Runner, AWS Lambda/Google Cloud Functions, and many more. Almost all of these are better than running your own VPS, because they are designed to remove the problems you will inevitably run into with your own VPS.
Some people buy cheap things and replace them frequently. Some people invest in better quality that lasts longer. If you'd rather patch the holes in things than buy something that simply lasts longer, a VPS may be the solution you want. But it's certainly not the best solution.
I’ve been using Pocketbase[0] on my projects and I can’t recommend it enough.
It is built on a SQLite db and has real-time pub/sub capabilities. Its JS SDK is incredibly easy to use and setup for CRUD as well. For side projects and some medium tasks, I’d say SQLite/Pocketbase has been super easy to work with.
One possible strategy is to have one directory/file per customer which is one SQLite file. But then as the user logs in, you have to look up first what database they should be connected to.
OR somehow derive it from the user ID/username. Keeping all the customer databases in a single directory/disk and then constantly "lite streaming" to S3.
Because each user is isolated, they'll be writing to their own database. But migrations would be a pain. They will have to be rolled out to each database separately.
One upside is, you can give users the ability to take their data with them, any time. It is just a single file.
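The derive-from-user-ID option can be as small as normalizing the ID into a filename. A hedged sketch (directory layout and schema invented):

```python
import os
import re
import sqlite3
import tempfile

DATA_DIR = tempfile.mkdtemp()  # stand-in for the real per-tenant directory

def tenant_db(user_id):
    """Derive the database path from the user ID, so no lookup table is
    needed; normalize defensively so IDs can't escape the directory."""
    safe = re.sub(r"[^a-zA-Z0-9_-]", "_", user_id)
    conn = sqlite3.connect(os.path.join(DATA_DIR, f"{safe}.db"))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)")
    return conn

a = tenant_db("alice@example.com")
a.execute("INSERT INTO notes (body) VALUES ('hers')")
a.commit()

b = tenant_db("bob")
isolated = b.execute("SELECT COUNT(*) FROM notes").fetchone()[0]  # bob sees nothing
```

The downside called out above still applies: migrations mean looping over every file in DATA_DIR, but "export your data" really is just handing over the one file.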
As a guy running a few Debian web servers for some small client projects and a few personal projects (some of which are medium sized, e.g. a website that gets 10 million pageviews / month) I actually tried to move from MariaDB to SQLite. The idea seemed great, at least in theory. Mainly to get away from mysqld, which actually crashed on me and corrupted some data once or twice in a timespan of 15 years.. think it was due to package updates every time.
Anyway, query and app logic modifications aside, I quickly ran into two unsolvable issues for me.
1. The lack of a robust web based admin tool comparable to phpmyadmin (no, phpliteadmin doesn't even come close)
2. Almost no ALTER TABLE support (you can rename things and add columns, but not much else). This hit pretty hard.
Plus lots of other idiosyncrasies - no enums, no type for dates, etc.
In the end I decided it's not worth the extra work, trouble and risks and stayed on Maria.
My Datasette tool doesn't quite serve the same purpose as phpMyAdmin unless you're OK with read-only access, in which case it works great: https://datasette.io
I'm finding Sqlite DBs in S3 to be a great solution for ETL tasks.
It's a project where multiple other applications want to process data produced by my application. Instead of forcing them to make API calls to retrieve all of the data, with all the network overhead and latency that entails, we periodically write SQLite DBs to S3 and post an event that the database is available. You can create one database per customer, or whatever your natural partition key is. Then it's just: pull the database to local disk, pull all the data you need with SQL queries, and delete from the bucket once every client has had a chance to process it.
I believe it offers much better throughput than going through an API.
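A sketch of that handoff, with a local directory standing in for S3; the real pipeline would add the boto3 upload/download and the availability event:

```python
import os
import sqlite3
import tempfile

BUCKET = tempfile.mkdtemp()  # stand-in for the S3 bucket

def publish(customer_id, rows):
    """Producer: write one self-contained SQLite file per customer, then
    'upload' it (here, just leave it in BUCKET)."""
    path = os.path.join(BUCKET, f"{customer_id}.db")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, data TEXT)")
    conn.executemany("INSERT INTO events (data) VALUES (?)",
                     [(r,) for r in rows])
    conn.commit()
    conn.close()
    return path

def consume(path):
    # Consumer: 'download' and then query locally with full SQL, no paging.
    conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)  # read-only
    return [d for (d,) in conn.execute("SELECT data FROM events ORDER BY id")]

path = publish("acme", ["a", "b", "c"])
data = consume(path)
```

The file format doubles as the wire format, which is where the throughput win over a paginated API comes from.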
The following is our current strategy. We used to be all-in on SQLite but we have discovered some nuance as we grow. We do B2B SaaS/PaaS for banking sector.
If we have a customer that is OK with RPO measured in minutes-to-hours and they have fewer than 5k active users, we are perfectly happy running with SQLite on a customer-managed VM and instructing them to use snapshots for backup & restore. Everything on one simple cheap VM in Azure/AWS/on-prem. One storage volume to snapshot. No weird tricks.
If the customer demands stronger consistency between their underlying record system and our system (specifically at time of catastrophe) and/or has more than 5k active users, then we are starting to push towards a separate cloud-native stack. Azure SQL Server Hyperscale with geographic replication to another region within ~100ms of the primary region. I looked long & hard at some of the SQLite replication options, but there is no way in hell I could get them to pass due diligence with the kinds of CTOs I have to argue with regularly.
SQLite is incredible for keeping it light & simple, and we still advocate for this where it fits the overall technical strategy. In terms of performance, before you reach a certain breakpoint, it is very hard to beat SQLite. You have to do it wrong on purpose to make it slower than a single node hosted solution on equivalent hardware.
I'm a huge fan of SQLite! My org's apps use it heavily, often via this simple key-value interface built on sqlite: https://github.com/aaviator42/StorX
Handles tens of thousands of requests a day very smoothly! :)
As an aside, has anyone tried using a RAM-disk as the storage medium for SQLite DB files? We've started experimenting with it lately and results have been promising!
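The key-value-on-SQLite pattern can be sketched in a few lines (this is the general idea, not StorX's actual API):

```python
import json
import sqlite3

class KV:
    """Tiny key-value store on SQLite; values are JSON-serialized."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")

    def set(self, key, value):
        # Upsert: insert, or overwrite the existing value for this key.
        self.conn.execute(
            "INSERT INTO kv VALUES (?, ?) "
            "ON CONFLICT(k) DO UPDATE SET v = excluded.v",
            (key, json.dumps(value)))
        self.conn.commit()

    def get(self, key, default=None):
        row = self.conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        return json.loads(row[0]) if row else default

store = KV()
store.set("user:1", {"name": "ann"})
name = store.get("user:1")["name"]
```

You get transactions, crash safety, and a single portable file for free, which plain JSON-on-disk stores have to reinvent.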
Yes! Like I mentioned in the other comment, multiple processes cannot access the same :memory: sqlite db, which makes it unsuitable for our use cases.
SQLite is cute and very appropriate in certain cases:
- Local storage for a mobile/desktop app, instead of a DIY file type.
- Local cache storage for a distributed system, instead of a DIY file type.
While you can use it as a general backend, and some people work hard to make it usable in distributed systems, it continues to be a fancy elaborate project to turn SQLite into something it isn't. It's a thought experiment that happens to get corporate funding. It's fun and interesting in the same way as Dogecoin is.
There's plenty of free DBaaS these days that are frankly incredible and FREE. I work for one, so I might sound like I'm shilling that space. But I recently went to work for one exactly because I think the new offerings are an incredible leap forward in developer productivity, and there's a lot of cool stuff to be done here. I could instead have gone to work for one of those who are working on SQLite, but I just don't believe they're real products for real web workloads. They're a marketing catch for curious learners and experimenters. I respect the intellectual curiosity but it's not how I would build anything.
If you're building a business, the Free tier should be a starting point, not a forever solution. Either the business dies, or it grows and you'll graduate to a paid option very quickly.
Sure, but building a business isn't the only reason for people to select a hosting solution.
Personal side projects are the most obvious example.
I do a lot of work in journalism. Newsrooms don't have a budget for maintaining interactive apps they built for a story that ran ten years ago, so they often made the sensible (at the time) decision to deploy them as a free, scale-to-zero Heroku instance.
Those apps are all gone now, thanks to Heroku bait-and-switching.
Many of those same newsrooms were paying customers of Heroku for other purposes!
I don't know what SQLite offers over a 5-line docker-compose.yml running PG locally for my project. It's ready for scale-up when needed, and supports more features: foreign keys without manually setting a flag, proper types, etc.
I get SQLite for mobile and toy projects but really, a PG set up in docker-compose.yml in your project is the easiest thing to do these days.
Can you expand on that? my use case is pretty simple and I haven't been able to find something on it that really bothers me.
Also, about PG: I run each project on a different port. Not using PG's default 5432 makes it easy to run several projects at once without clashing, just like you would with several SQLite files. I don't really see any drawbacks in using a little docker compose for your local dev set-up.
For something as simple as having your own little DB side by side to your project, it is really not that complicated. A 5 line docker-compose.yml gets you a fully encapsulated PG instance on a custom port. One per project.
I hope you give it a try someday if you get the chance - also, it's a process in a "jails"-like environment, not a VM, not another full blown kernel running - so really the performance overhead is minimal to none - just like if you were running PG "locally" (well, you actually are)
I actually get to see a little more of databases than most so:
"SQLite is a sql-based database with a particularly unique feature: the entire database is in a single file."
Digital Equipment Corporation (DEC) developed a SQL database known as Rdb for VMS. It could store everything in a single file. Oracle bought it, and maintains it here:
Serverless doesn't mean hard-drive-less, it just means that you, the person deploying the software, don't have to manage servers, only your application which may or may not access files on the filesystem that may or may not be a local disk.
If I was going to give folks advice, it would be to use an ORM, instead. That let me build low-investment prototypes using SQLite. When I outgrew it, I could easily switch to PostgreSQL by adding the appropriate test framework plugin and changing the database connection string.
SQLite, for all its features, has some pretty significant limits. In my case, I needed better date/time and decimal type handling than what's on offer in SQLite. I still test my code against SQLite because that is a useful way to lint my data model, plus it's nice for small-scale proof-of-concept deployments, but I expect deployers to use PostgreSQL or MariaDB in production.
Edit: I've had a chance to read the article. The comment about "zero latency" is wrong because it ignores disk I/O. And yes, I know SSDs are super fast. Still not zero. The bit about Docker Compose rings hollow. It sounds like they need to learn how to use multiple Compose files, e.g., https://mindbyte.nl/2018/04/04/overwrite-ports-in-docker-com.... As for development and testing, mock database setup should be handled by your unit test framework. I'm familiar with pytest, which makes testing the same model or ETL across multiple database fixtures very, very easy. SQLite has nothing to do with that.
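Agreed on the test-framework point; the same idea in stdlib unittest (a pytest fixture parameterized over sqlite/postgres URLs is the fancier version). The function and schema here are invented for illustration:

```python
import sqlite3
import unittest

def top_customers(conn, limit):
    # Code under test; trivially simple here just to illustrate the fixture.
    return [n for (n,) in conn.execute(
        "SELECT name FROM customers ORDER BY revenue DESC LIMIT ?", (limit,))]

class TopCustomersTest(unittest.TestCase):
    def setUp(self):
        # Fresh in-memory database per test: the framework owns DB setup,
        # not the application, so SQLite here implies nothing about prod.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE customers (name TEXT, revenue REAL)")
        self.conn.executemany(
            "INSERT INTO customers VALUES (?, ?)",
            [("ann", 90.0), ("bob", 10.0), ("cat", 50.0)])

    def test_ordering(self):
        self.assertEqual(top_customers(self.conn, 2), ["ann", "cat"])

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TopCustomersTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Swapping the fixture's connection for a Postgres one exercises the same model code against the production engine, which is the linting-both-ways benefit described above.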
I see posts like this and I can't help but think: This sounds exactly like something an academic would recommend, and then you hit the reality of industry engineering.
Its a similar take to a lot of the negativity surrounding Next 14's stabilized server actions. The negativity is academic; the productivity is industrial.
Here's my counter-hype take: SQLite actually kind of sucks. It has a place, but that place isn't significantly different from where it was five years ago despite all the "serverless read replicated VC funded hacker news hype startups" work that's happened since then.
If you're in crazy-enterprise hell-engineering; no one is going to reach for SQLite when more robust alternatives like Postgres exist.
If you're trying to get something out the door fast, you've got Postgres on Supabase, you've got MySQL on Planetscale, you've got Firebase, fifteen years of database-platform development. All of these aren't just cheap, they're free, and zero maintenance. You're not going to pay more money to do more work for a worse product by setting up SQLite on your single DigitalOcean VPS. You probably won't even reach for something like Cloudflare D1; sure, it's interesting, but why? It's just Planetscale, but not better in any way and worse in plenty (it's not even really much cheaper).
SQLite barely supports ALTER TABLE (you can add, rename, and drop columns, but little else). It doesn't have a date-time type. It doesn't enforce column types outside of strict mode, which isn't enabled on e.g. Cloudflare D1. Someone stop me. Everyone says "KISS, you can get so much performance out of one application server with SQLite running locally," but you can get the same f^cking performance out of one application server and a DBaaS like Planetscale, and it's easier, and it's cheaper, and you get backups, and you get full MySQL, and you get a paved road to actually paying them $30/mo if you need to, instead of hitting a bottleneck and suddenly having to wonder how the hell you're going to scale SQLite beyond this one instance ("we need VC funding for more engineers," you'll think in that moment).
Kent's statement "However, SQLite is capable of handling databases that are an Exabyte in size" actually makes me mad. It's so ridiculously academic I'm aghast anyone takes this seriously. "Oh, well, hc-tree uses 48-bit page numbers so it's an exabyte, and you'll hit other application problems first anyway." No; you'll hit problems with SQLite first, ten times out of ten, long before you hit an exabyte. It'll be before you even hit 100 GB.
I’ve helped too many teams migrate from SQLite (after a ton of growing pains and suffering from tech debt) to agree with this.
I think SQLite is awesome for some things. It’s also hard to beat in some cases. But if you use it in a non-ideal situation, you need to know when, why, and what your refactor path will be to get to the right technology. And there should be a good reason for doing that, and I can’t think of one that makes sense.
The teams I helped often said something to the effect of “SQLite is great because it got us this far!”, and while I appreciate the optimism, they wouldn’t have gone a shorter distance with MySQL or Postgres. They made the wrong choice, period, and lost a huge amount of productivity to it.
Yet there are people out there who have chosen something like firebase where SQLite would be perfectly fine and far simpler, too. I love SQLite, but I think it’s promoted incorrectly at times. It makes for some brutal growing pains when it’s used in the wrong places.
I get where you're coming from, but a few of these examples didn't grow in that sense. It's more that they evolved and discovered how SQLite couldn't meet their needs efficiently (or at all). It became technical debt before they could properly get off the ground.
I fully agree that there are projects out there which don't need to grow, won't grow, or otherwise perfectly suit the use of SQLite. I've encountered a lot of cases where it was simply the wrong choice, though. Growth or no growth.
I think Kent likes doctrine and tends to encourage practices without addressing nuance sufficiently. The kinds of people he teaches aren't likely to understand where SQLite falls short, and it's also difficult to explain to people with less experience. The advice rubs me the wrong way as a result; in practice, I don't see people use SQLite properly more often than not.
I'd say the same thing about MongoDB. It gets abused like crazy. It has a great fit in some applications and I love using it when that's the case. Yet it was promoted as the easy and scalable database for ages, and I can't count how many projects I've encountered which were badly encumbered by the misuse of that database. Is it a bad database? No, it works well in the right place. In the wrong place, it's a remarkably poor database though. SQLite is much the same (though I'd argue a better piece of technology all around).
^ NixOS one-liner to get a local Postgres going. I tried SQLite for a single VPC startup MVP but then just used Postgres because it was just as easy and it's just better to build backend software with.
And then when it made sense to use a managed Postgres instance, it was trivial to migrate.
The first two items are just wrong. I use PostgreSQL like SQLite all the time. That is: colocated with the application, connected only via UNIX socket. In these cases, the database is not meant to be a separate application, so I don't treat it as something separate.
Ironically, this is also the easiest way to set up PostgreSQL. At least in the distributions I use, it is already configured to connect locally over UNIX sockets, and all I have to do is create a system user to connect with.
PostgreSQL has a bad reputation for being difficult to set up because people expect it to be exposed on the network by default, and then they have to dig into the obscurely named pg_hba.conf and change multiple lines.
There is ongoing cargo cult wisdom that "the RDBMS HAS to be a standalone service" which is just nonsense and extremely counterproductive for the median use case.
I'm still soyfacing over sqlite and how elegant and powerful it is, and I'll continue deciding software architecture based on random blog posts. I have learned nothing.
WAL Mode worked as advertised. But the project ended up being split into several autonomous workers, and even with WAL mode, sqlite gets problematic when you have several separate processes trying to read from and write to the same file.
I have a desktop app that stores all its info in a single XML file. The XML file is read into memory and written to disk when they save. I lock the XML file so that only one user can have it open to write. Customers never need more than a few thousand records and it works fine. But some of my customers want to be able to read/update data from more than one PC/Mac on a local network. Performance requirements are low (generally <10 concurrent users and <1 update per second). I don't want to use a heavyweight database. I wondered if SQLite could be a way to handle this? But the SQLite FAQ seems to have quite a few caveats about accessing it from multiple processes across a network. Or is there a simpler/more reliable way to transition a single-user app to multi-user?
This is one of the few cases where I would recommend steering clear of SQLite, because the maintainers are very clear that running it on a network file system with multiple processes accessing it is a bad idea.
SQLite is great, but the article misses the greatest weakness: It is almost entirely untyped. Feel free to specify fancy types for your columns. But SQLite doesn't care in the least. You can create an INTEGER column and write text to it. Or bytes. Or anything you'd like!
The other thing I miss often is better support for schema migrations. For a lot of things (like adding certain constraints after table creation) the "create new table, copy data over, delete old table" workaround is needed. However, it's only a minor annoyance, and if it allows the codebase to be more maintainable, that's OK with me (given the fantastic quality track record of SQLite).
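The rebuild workaround described above is mechanical enough to script. A minimal sketch in Python (the table and the constraint being added are invented for illustration): adding a CHECK constraint by creating a new table, copying the rows, dropping the old table, and renaming — all inside one transaction.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
INSERT INTO users VALUES (1, 'a@example.com');
""")

# SQLite has no ALTER TABLE ... ADD CONSTRAINT, so rebuild the table:
con.executescript("""
BEGIN;
CREATE TABLE users_new (
    id INTEGER PRIMARY KEY,
    email TEXT CHECK (email LIKE '%@%')
);
INSERT INTO users_new SELECT id, email FROM users;
DROP TABLE users;
ALTER TABLE users_new RENAME TO users;
COMMIT;
""")

# The data survived, and the new constraint is now enforced.
rows = con.execute("SELECT id, email FROM users").fetchall()
```

(The SQLite documentation describes a fuller version of this procedure, including handling indexes, triggers, and foreign keys, which a real migration would need to follow.)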
SQLite has a solution for this! They're called STRICT tables, and they tell SQLite to get mad at you when you try to put a string into a number column, etc.
I think this speaks more to the target audience of the courses he sells:
> So, can you use SQLite? For the vast majority of you reading this, the answer is “yes.” Should you use SQLite? I’d say that still for the majority of you reading this, the answer is also “yes.”
If you’re making a simple proof of concept app, a side project you just need to get working, or a toy project for learning then SQLite will be fine.
If you’re trying to learn how to build maintainable web apps for a business, getting PostgreSQL or MySQL or similar up and running shouldn’t be that big of a hurdle.
He also either doesn’t understand the limitations and complexities of using SQLite (migrations, foreign key quirks, other issues mentioned in this thread) or he’s deliberately ignoring them because it would weaken his point.
For what it’s worth, we’ve had to un-teach some of this author’s material to junior devs after they read into it a little too literally. He likes to push his way as the “right” way to do things, which can turn into a cargo cult mentality for junior devs buying his courses. He pushes his new “Epic web stack” in the same way.
I'm logging traffic data (IP-pairs and packet+byte-count) into a SQLite database on a couple of machines, and what bothers me is that I'm unable to query them whenever I want. So the files end up being rotated whenever I want to inspect one closer.
While it's good to have a simple storage solution, it does have its drawbacks.
Then again, yesterday I was working a bit with MBTiles, which are SQLite databases with per-z,x,y protobuf blocks of vector data, and in that case it does make sense to use SQLite. Or the way browsers use them to store bookmarks and stuff.
Started a Minimum Lovable Product last week that helps people work through their anxieties or fears.
Using SQLite as a backend for everything from session management to logging. No ELK stack, no separate DB. Just a single file.
Development is so simple so far. Maybe I’ll need to move off of it later, but who knows maybe not. And the speed of development and simplicity is what matters more now.
A type system like what programming languages have doesn't make sense for databases.
For inserts, sure (but then, it's straightforward to use the host language for that).
But once you move beyond simple cases, data changes over time, not atomically. It's also common for data to be distributed. A static type system can't handle either of these things.
If your database can't go down for the duration of a migration (which will need more time now, to type-check all the data), and you don't have a way to ensure all clients are updated at the same time also, then you can't use static type checking like programming languages have.
I’ve used sqlx and SQLite for a lot of projects, and sqlx’s macros have prevented me from inserting the wrong types into columns, but I agree that it’s not a viable long-term solution, especially if you don’t use sqlx macros in the first place, which is not something you can always do.
On the contrary, I think that everything should be fine with Diesel since that leverages the rust ecosystem to manage types and migrations of the sqlite database.
Postgres is so incredibly easy to setup and has really low system requirements for simple applications and small quantities of data. SQLite has some really interesting features when it comes to portability - it's a great solution if you want to be able to trivially bundle your application and data state together and distribute it as a binary... but the barrier to running a postgres instance is incredibly low.
For simple applications and small quantities of data Postgres is a lot more complicated than SQLite. Just the requirement of a database server is a hurdle. Factor in that the data is stored across multiple files in some different directory, and you’ve got a lot of headache when you’re just dealing with a simple application and small amount of data.
True, but it’s still a separate server process, with access control and interprocess communication requirements. SQLite is just a C library so there’s no separate process to talk to.
On a serious note, there is definitely a time and a place for SQLite - it's a great technology - but the engineering being done to make it scalable across multiple servers etc. seems ill-placed. For applications of moderate complexity or scale, I suspect developers will quickly see issues trying to monitor and interact with SQLite in ways that they wouldn't with other tech.
SQLite is the MS Access of this generation. MS Access was an absolutely wonderful database when you were hosting multiple small sites on a server and wanted quick deployment, easy setup, and segregation. It connected easily with ODBC to a website and was robust and feature-rich enough for hobby projects. There was no way I would want to bet the farm on it for an enterprise project.
That's demonstrably not true, and not a useful comparison.
A growing number of real-world production apps are running on SQLite these days, precisely because it's so mature, robust and performant at this point - and, unlike Microsoft Access, is very actively maintained.
SQLite is by far my favorite database for any client-side application - mobile and desktop (and perhaps embedded). I still wouldn't use it server-side for anything serious.
While you can do it, especially with tech such as LiteFS, I much prefer using database systems with built-in support for replication, failover, and splitting data storage from the app server.
For side projects or non-prod environments where a single instance is fine, I use sqlite + litestream and it works reasonably well. I've looked into the more complicated setups with LiteFS/Consul/RAFT and at that point I'm not sure you're saving yourself much headache over running your own DB.
Zero Latency - irrelevant for 99% of use cases
One Less Service - premature optimization - saves almost nothing
Multi-instance replication - where is the advantage over say postgres?
Database size - irrelevant for 99% of use cases today
Development and Testing - no advantage here
It's always much easier to work with fully hosted turnkey-ready solutions like Supabase, Planetscale, Neon, or the like. And you get all of their huge benefits. Latencies of a few ms are hardly an issue for most use cases.
SQLite is great, the article is not. The author seems to be compiling a bunch of marketing claims and trying to sell them as best practices. Anyone seen exabyte-sized SQLite databases in production?
This post would have said "MongoDB" years ago, and soon one will say "Postgres" instead, and then after that whatever's trendy next. Probably CSV files.
You should probably be architecting your application to use the data management solution that fits your use case.
Honest question from someone currently in the middle of implementing my first SQLite deployment for a local data store in a desktop application (and is rather new to software development.)
Is SQLite not ergonomic? I’ve previously played with mongodb through work and it was super simple and intuitive. But SQLAlchemy is super verbose and not at all intuitive despite being intellectually easy to grasp. It feels to me like it has decades of cruft built into the syntax and while I can appreciate that it means options it has not at all been nice to work with. I understand the issues with mongodb, but I managed to get it working in 30 minutes. Meanwhile, I’m 3 days into trying to get SQLAlchemy working. It feels to me that perhaps there may be a hole in the market for “GPT” friendly databases. Because SQLAlchemy is so verbose it kills the context window of GPT lmao.
For local/single-instance apps? Heck yeah, sqlite is a great choice. Just wrap all your queries behind functions in a db.py file and never think about application state until you’re forced to. SQLAlchemy is tempting, but honestly sql is already easy enough that I’ve found I spend far less time just writing plain sql queries.
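A minimal sketch of what that db.py might look like (the schema and function names are invented for illustration — the point is just that plain SQL behind a few functions is the whole data layer):

```python
# db.py -- a thin, hand-rolled data layer; no ORM required
import sqlite3


def connect(path="app.db"):
    con = sqlite3.connect(path)
    con.row_factory = sqlite3.Row  # rows behave like dicts
    con.execute("PRAGMA foreign_keys = ON")
    con.execute(
        """CREATE TABLE IF NOT EXISTS notes (
            id INTEGER PRIMARY KEY,
            body TEXT NOT NULL
        )"""
    )
    return con


def add_note(con, body):
    cur = con.execute("INSERT INTO notes (body) VALUES (?)", (body,))
    con.commit()
    return cur.lastrowid


def list_notes(con):
    return [dict(r) for r in con.execute("SELECT id, body FROM notes")]
```

The application only ever calls `add_note` / `list_notes`, so if you later outgrow SQLite, the migration is confined to this one file.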
Most important thing is to get your actual application idea functioning quickly. Sqlite is fast to work with and low maintenance (your db is one file), and later if you found you made a mistake (too slow, no support for advanced sql features you need, you want concurrent access across multiple boxes) you can write a program to migrate your data in a single afternoon. Wasting time on more complex solutions early on is a pitfall that has stolen years of project time from me, just avoid misusing dependencies too badly and get your thing working first.
I can't believe that video games still have a "saving data. Please do not turn off your computer" like two-phase commit hasn't been a thing since 1987.
> SQLite being a file on disk does make connecting from external clients effectively impossible.
Not quite impossible (if you design an appropriate VFS), but it doesn't work as well as a real client/server database, since SQLite is not designed for this use.
> SQLite does not support enums which means you're forced to use strings. ... The main drawback to this is when it comes to the typings for the client which doesn't allow you to ensure all values of a column are only within a set of specific possible values for the string.
In SQLite you can use CHECK in a table definition to require all values of a column within a set of specific possible values.
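For example (table and values are illustrative), a CHECK constraint emulating an enum column:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Emulate an enum: the status column only accepts these three strings.
con.execute(
    """CREATE TABLE jobs (
        id INTEGER PRIMARY KEY,
        status TEXT NOT NULL CHECK (status IN ('queued', 'running', 'done'))
    )"""
)
con.execute("INSERT INTO jobs (status) VALUES ('queued')")  # accepted
try:
    con.execute("INSERT INTO jobs (status) VALUES ('paused')")
except sqlite3.IntegrityError:
    print("rejected")  # not in the allowed set
```

This enforces the value set at the database level, though as the quoted comment notes, it doesn't help client-side typings know the allowed values.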
However, I think there are some actual flaws in SQLite, such as:
1. It uses Unicode (but only partially; case-insensitive is only with ASCII). This can make it less efficient than it should be if you are dealing with ASCII only, and makes it difficult (and somewhat inefficient) to deal with non-Unicode text. Although it is possible to store non-Unicode text in a TEXT value or in a BLOB value, and to use CAST everywhere or to override the built-in functions with your own, neither is really ideal, for several reasons (including causing some optimizations to not work, making your code even longer and less efficient, etc). It is also possible to patch SQLite to do this, but then if you upgrade, you must patch that one too.
2. The URI file name mechanism is a bit messy. Using a separate argument for the parameters might be less messy.
3. There is no standard way to define the time zone used for functions that deal with local time. (You can override the function to find the current time by the VFS, but overriding conversion to local time is only possible by use of an undocumented function (which is not guaranteed to stay the same in future versions).) Better (in my opinion) would be to add such a function into the VFS (since the current time function itself is in the VFS anyways, so it should go together).
4. It might have been better to store the journal in the same file as the database. One idea how this might be done: Set the read version to 1 and the write version to 3. If you begin a write transaction, lock the database and then make copy-on-write of any modified pages into free pages, but do not change the references to the free pages in the header (so other programs still believe they are free). To commit a transaction, make a table of the required page linking changes in the file, and then temporarily change the read version to 3, and then change all of the links (so that the pages containing the old data would now be considered free), and then change the read version back to 1 (the write version will still be 3) and then unlock the file. To roll back a transaction, simply unlock the file (which will be done automatically if the program crashes); there is no need to write or delete any file. However, one disadvantage of this method is that the database file is now up to twice as big (depending on how many pages need to be modified for each transaction), and there may be other disadvantages too.
5. You need file names (which is necessary for separate journal files, anyways). However, in my opinion it might be better to pass file descriptors or stream objects. (SQLite and some other libraries annoy me that they do not do such a thing, and require file names.)
I always use SQLite as the first database when I start a new project. It is a great way to get started quickly and not waste time while you are figuring out what problem you are solving and how you want to structure your data. On some projects I've been able to make do with SQLite only for over a year before having to bother with other databases.
However, I always design with the intent to use PostgreSQL or some other database. Sometimes with the intent of using two different data stores in the scaled up version. For instance an SQL database for stuff that has low volumes, and perhaps a time-series database for the high volume stuff.
This is why I always design a persistence interface. An internal persistence API layer inside the application. I also write the test suite to this interface. This makes adding support for other databases much easier later.
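A minimal sketch of that idea (the commenter works in Go; this is Python for illustration, with made-up names): the application and the test suite both depend on the interface, never on SQLite directly.

```python
import sqlite3
from typing import Optional, Protocol


class UserStore(Protocol):
    """The persistence interface the application codes against."""

    def add_user(self, name: str) -> int: ...
    def get_user(self, user_id: int) -> Optional[str]: ...


class SqliteUserStore:
    """SQLite implementation; a PostgreSQL one would satisfy the same Protocol."""

    def __init__(self, path: str = ":memory:"):
        self.con = sqlite3.connect(path)
        self.con.execute(
            "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)"
        )

    def add_user(self, name: str) -> int:
        cur = self.con.execute("INSERT INTO users (name) VALUES (?)", (name,))
        self.con.commit()
        return cur.lastrowid

    def get_user(self, user_id: int) -> Optional[str]:
        row = self.con.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None


def check_store(store: UserStore) -> None:
    # Written against the interface, so the same checks run
    # unchanged against any future backend.
    uid = store.add_user("ada")
    assert store.get_user(uid) == "ada"
```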
Note that I say "add", because I usually keep the SQLite support after I've added support for the databases I want to use in production. Keeping SQLite is extremely useful for when people want to do integration testing or even when they are learning to use your server/service. Rather than having to fire up a database, people can just run with the built in SQLite. You download and run the binary and stuff just works. It also makes cleanup a lot easier. Not least if you run an in-memory database (use ":memory:" as the DB spec).
In the persistence layer I tend to avoid using too much abstraction. The interface is my persistence abstraction. I don't need more layers. I think ORMs are incredibly limiting and unnecessary. However, I do use sqlx (Golang) to make the SQLite implementation of the persistence layer less verbose (faster turnaround when figuring stuff out). I usually use it for PostgreSQL too since the performance penalty usually isn't significant enough to matter. (If you haven't used sqlx before: it is essentially like using the DB driver directly, but with a lot less boilerplate). sqlx is trivial to just rip it out if you think it introduces too much overhead.
Usually you will know ahead of time when you potentially need more than one data store. I tend to use SQLite for storing everything during initial development, but the API is usually designed in a way that doesn't make it awkward to split the store in two or more domains.
I sometimes write storage API middleware. For instance statistics, adapters that can combine different permutations of different storage domains, sharding, throttling, specialized logging etc. I've done ACLs, failover, housekeeping middleware as well, but usually they do not belong in persistence middleware.
I never put business logic or "cleverness" in the persistence layer. It should worry about storing and retrieving data.
You can probably use SQLite as your database for much longer than you think. And some projects honestly will never need more than SQLite. However, I think that if you start doing sharding and replication, and dealing with situations where throughput and concurrency are high, you really want to use something a bit more beefy. You have to weigh how much time you'd spend getting SQLite to do these things against how little effort it is to just fire up database instance(s).
No, it isn't. If we want to be on the front page, we'll publish a blog post; we literally rate limit ourselves on that. We're just a piece of the stack, and this person happens to use us to make a point about SQLite.
LiteFS isn't a Fly.io feature; it's Ben Johnson's open source project, and it runs just fine on GCP and AWS. Try it out!