
I’m really interested in this model where each application server has a copy of a SQLite database file which is then replaced on a scheduled basis.

Here it’s being used for web application firewall rules.

Another place I’ve thought about using this is feature flag configuration. Feature flags can be checked dozens of times per request and often need the kind of queries (user is a member of group A and has an IP located in country B) which could be well served by a local SQLite - and feature flags have a tolerance for updates taking a few seconds (or longer) to roll out.
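To make that concrete, here's a rough sketch of the kind of check I have in mind, against a hypothetical local flags.db replica (schema and names invented for illustration):

```
import sqlite3

# Hypothetical local replica, swapped out wholesale on a schedule.
conn = sqlite3.connect("file:flags.db?mode=ro", uri=True)

def flag_enabled(flag_name, user_id, country_code):
    # "user is a member of group A and has an IP located in country B"
    row = conn.execute(
        """
        SELECT 1
        FROM flag_rules r
        JOIN group_members g ON g.group_id = r.group_id
        WHERE r.flag = ? AND g.user_id = ? AND r.country = ?
        LIMIT 1
        """,
        (flag_name, user_id, country_code),
    ).fetchone()
    return row is not None
```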




> I’m really interested in this model where each application server has a copy of a SQLite database file which is then replaced on a scheduled basis.

BTW, this is also the model used by all CDNs, where the global configuration file containing the certificates, HTTP routing rules etc. for all customers will be updated into a single-file b-tree structure*, and that "bundle" is distributed among all edge locations frequently.

* I've yet to see someone use SQLite for this purpose; it's usually DBM-style databases like LMDB or Kyoto Cabinet.
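The read path on an edge node ends up looking something like this with the py-lmdb bindings (the bundle path and key layout are invented for illustration):

```
import lmdb

# Read-only view of the distributed config bundle.
env = lmdb.open("/var/cache/edge/config.lmdb", readonly=True, lock=False)

def lookup_routing_rule(hostname):
    with env.begin() as txn:              # cheap read-only transaction
        return txn.get(b"route:" + hostname.encode())
```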


> Kyoto Cabinet

Now, that's a name I've not heard in a long time.

Are people still using Kyoto Cabinet in new projects? Are people still using DBM-style storage generally? I thought that whole branch of the evolutionary tree had sort of died out.


> Are people still using Kyoto Cabinet in new projects?

Cloudflare used to use Kyoto Cabinet[1] and moved to LMDB[2] in 2020; other implementations that I'm familiar with (but don't have a link to share) also use LMDB.

> Are people still using DBM-style storage generally?

It's fairly common in these scenarios, and it's also the underlying key-value store for popular software like Consul[3].

[1] https://blog.cloudflare.com/kyoto-tycoon-secure-replication/

[2] https://blog.cloudflare.com/introducing-quicksilver-configur...

[3] https://github.com/hashicorp/consul/issues/8442


Cloudflare moved from LMDB to RocksDB for production.

https://blog.cloudflare.com/moving-quicksilver-into-producti...


Yeah, that plus ZeroMQ. Very curious... I always thought Tokyo/Kyoto Cabinet and similar would become more popular with the adoption of ORMs.


I’ve worked on a project a long time ago where we did this with BerkeleyDB files. BDB was used to store configuration data that was frequently looked up.

Periodically we would run a full sync to replace the database. Between the periodic full syncs, we had a background process keep changes applied on a rolling basis.

All-in-all, it worked pretty well at the time! The full database file sync ensured a bad database was timeboxed and we got a bootstrapping mechanism for free.


The query engine in SQLite can be weak, in particular where JOINs across multiple columns are concerned. You really do need to be aware of this when designing indexes into SQLite files.

In any case, SQLite would serve this solution, but almost certainly with a measurable level of inefficiency built in.
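As a concrete example (schema invented for illustration): if a JOIN matches on two columns, each side generally wants a composite index covering both, because SQLite will typically use at most one index per table in a given query:

```
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Invented schema: a JOIN that matches on two columns.
    CREATE TABLE rules(tenant_id INTEGER, path TEXT, action TEXT);
    CREATE TABLE overrides(tenant_id INTEGER, path TEXT, enabled INTEGER);

    -- One composite index per table covers the two-column match;
    -- two separate single-column indexes generally won't be combined.
    CREATE INDEX rules_tenant_path ON rules(tenant_id, path);
    CREATE INDEX overrides_tenant_path ON overrides(tenant_id, path);
    """
)

rows = conn.execute(
    """
    SELECT r.action
    FROM rules r
    JOIN overrides o ON o.tenant_id = r.tenant_id AND o.path = r.path
    WHERE o.enabled = 1
    """
).fetchall()
```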


This is the first I've heard of SQLite JOIN performance being "weak". I just spent 10 minutes scouring the annals of the web and didn't turn up anything relevant. Are there any additional links or other information you can share about this claim?

Edit: @akira2501: SQLite comes with different tradeoffs, sometimes superior and other times less so, depending on the use case. Blanket statements without evidence are unproductive in progressing the conversation.


Weeellll... I think it's safe to say SQLite's planner is simple, and trends towards "you get what you wrote" rather than having more sophisticated runtime monitoring and adjusting.

But as with all things SQLite, the answer is generally "it's easy to predict so just improve your query" and/or "there's a plugin for that" so it tends to win in the end with a bit of care.


https://www.sqlite.org/queryplanner-ng.html

SQLite's planner is anything but simple. It runs a beam search over different plans; basically, it approximates the full NP-hard plan search using bounded space and time.


Seriously?

https://www.sqlite.org/optoverview.html

Chapter 2. Chapter 7. Chapter 8.

It should be _zero_ surprise to you that SQLite is not as powerful as other full database query engines.


I don’t understand how reading that documentation page makes it clear that other relational engines have better performance for joins.


I'm with you. I've encountered numerous situations where the query planner caused issues. I don't know if the alternatives are better, but it's definitely an issue. I've written multiple databases wrapping SQLite, and multiple bindings to SQLite. There are dragons.


I had to use SQLite once for something that would do a ton of read queries on a particular table. Sometimes just swapping the order of operands to an AND operation generated a completely different query plan. I had to scour their optimizer document and painstakingly check the query plans of, and then manually tune, each SQL statement.

But then it was quite fast.
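For what it's worth, the checking itself is easy to script. Something along these lines (table and query invented for illustration) is what I ran against each statement:

```
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE events(tenant_id INTEGER, created_at TEXT, payload TEXT);
    CREATE INDEX events_tenant_created ON events(tenant_id, created_at);
    """
)

# Ask SQLite how it intends to execute the statement before relying on it.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT payload FROM events WHERE tenant_id = ? AND created_at > ?",
    (42, "2024-01-01"),
).fetchall()

for row in plan:
    print(row)  # look for SEARCH ... USING INDEX rather than SCAN
```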


> will be updated into a single-file b-tree structure

I'm not knowledgeable on this, but my understanding was a b-tree is a way of sorting values that could be ordered in a certain way. Like this would be a b-tree of IDs

```
            [8]
           /   \
      [3, 5]   [10, 12]
      / | \      /  \
  [1] [4] [6,7]  [9] [11, 13]
```

You traverse by comparing your needle to the root node and going left or right depending on the results.

How is that done with configuration options? That seems like it would just be a regular hashmap which is already efficient to read. What would a b-tree of key/values even look like that wouldn't be less efficient than a hashmap?


Each number in your btree would actually be a key-value pair. So you can find the key fast, and then you have the value.

Databases including SQLite usually use b+tree for tables (a variant where only the leaves have data, the interior nodes only have keys) and regular btrees for indexes.


A hash table makes sense in memory. If it's loaded just right for fast access, it has holes - empty entries. That makes little sense if you are building a file that will be transferred to many places over the internet. Bandwidth waste would be significant.

So it might seem that simply enumerating the data (sorted or not) would be a better option for a file. (After all, the receiver will read everything anyway.) I guess frequent updates make this inefficient, so a tree helps.
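A toy version of the "sorted and densely packed" idea, just to make it concrete (in memory here rather than in a file):

```
import bisect

# Sorted key-value records with no empty slots to ship over the wire.
records = sorted([("host.a", "1.2.3.4"), ("host.b", "5.6.7.8"), ("host.c", "9.9.9.9")])
keys = [k for k, _ in records]

def lookup(key):
    i = bisect.bisect_left(keys, key)   # binary search, much like a tree descent
    if i < len(keys) and keys[i] == key:
        return records[i][1]
    return None

print(lookup("host.b"))  # -> "5.6.7.8"
```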



We used this model to distribute translations, feature flags, configuration, search indexes, etc at Airbnb. But instead of SQLite we used Sparkey, a KV file format developed by Spotify. In early years there was a cron job on every box that pulled that service’s thingies; then once we switched to Kubernetes we used a daemonset & host tagging (taints?) to pull a variety of thingies to each host and then ensure the services that use the thingies only ran on the hosts that had the thingies.

In Ruby we called this “hammerspace” https://github.com/airbnb/hammerspace


In Kubernetes, pod affinities (not taints) are the typical and preferred mechanism used to ensure pods are scheduled on the same node as their dependencies.


1. Very cool

2. If you were making it today would you just use SQLite?


Yes, we would. And likely will switch to SQLite at some point in the future. (Source: I work adjacent to these systems at Airbnb.)


> Feature flags can be checked dozens of times per request

My strategy for resolving this is to fetch the flag value once, but to store it in the request object, so that a) you never have to take the expensive lookup hit more than once per request, and b) there's no risk of an inconsistent value if the flag is updated mid-request.
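Roughly like this (names invented; the lookup function is whatever your flag backend exposes):

```
class Request:
    def __init__(self):
        self._flag_cache = {}   # flag values pinned for the lifetime of the request

    def flag(self, name, lookup):
        # First check hits the (possibly expensive) backend; repeat checks are
        # dict reads, and the value can't flip if the flag changes mid-request.
        if name not in self._flag_cache:
            self._flag_cache[name] = lookup(name)
        return self._flag_cache[name]
```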


Where is the “session object” stored?


Apologies, I meant "request object". Corrected above.


What’s the use case for re-checking the same feature flag in a single session?

I can see why you need to check multiple different flags in a session and I understand the parent point about looking in SQLite for them (effectively a function call into a library in process address space rather than a call over the network for each flag).


Sorry, s/session/request/g; corrected above.

One example is a multistep transaction processing request. The feature flag could gate several branch points.

A memory-mapped SQLite file is great too, but the strategy I describe above is less code to write, adds no new dependencies, is quicker to implement, avoids the SQLite file distribution/availability issues, and should get you a very similar performance improvement.


Having now spent enough years thinking about feature flags during an extended migration (from our own internal system to LaunchDarkly), I've become convinced that a really useful primitive would be:

* SQLite, but every change you make increments a global version counter.

* Some way to retrieve and/or apply just the changes between version counter numbers.

Then, you could layer all manner of distribution mechanisms on top of that. You could imagine gossip networks, an explicit tree-shaped cache/distribution network, etc. Anyone who has a later version than you would be able to give you the update.
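A minimal sketch of that primitive, assuming you build it as a trigger-maintained change-log table rather than anything built into SQLite (all names here are illustrative):

```
import sqlite3

conn = sqlite3.connect("flags.db")
conn.executescript(
    """
    CREATE TABLE IF NOT EXISTS flags(key TEXT PRIMARY KEY, value TEXT);

    -- Global, monotonically increasing version counter doubling as a change log.
    CREATE TABLE IF NOT EXISTS changes(
        version INTEGER PRIMARY KEY AUTOINCREMENT,
        key     TEXT,
        value   TEXT            -- NULL means the key was deleted
    );

    CREATE TRIGGER IF NOT EXISTS flags_ins AFTER INSERT ON flags BEGIN
        INSERT INTO changes(key, value) VALUES (NEW.key, NEW.value);
    END;
    CREATE TRIGGER IF NOT EXISTS flags_upd AFTER UPDATE ON flags BEGIN
        INSERT INTO changes(key, value) VALUES (NEW.key, NEW.value);
    END;
    CREATE TRIGGER IF NOT EXISTS flags_del AFTER DELETE ON flags BEGIN
        INSERT INTO changes(key, value) VALUES (OLD.key, NULL);
    END;
    """
)

def delta_since(version):
    # What a client would send its ETag-style counter to and receive back.
    return conn.execute(
        "SELECT version, key, value FROM changes WHERE version > ? ORDER BY version",
        (version,),
    ).fetchall()
```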

What would this get you?

* You can bundle the current version of a DB into your app artifact, but efficiently update it once your app runs. This would let you fall back to much better defaults than having no data at all. (eg. For feature flags or live config, this would let you fall back to a recent state rather than code-level defaults.)

* Any kind of client can send the global counter as an ETAG and get just the delta

* Reconnections if the network blips are really simple and cheap

* If the system also let you keep a couple of minutes of history, you could annotate the counter onto a microservice call (using W3C Baggage headers, for example), and evaluate your flags/config/data/whatever at a single version number across multiple microservices. Even without evaluate-at-time functionality, logging the generation number would help a lot when debugging what happened after the fact


Really neat ideas. Has something like this been implemented anywhere?


SQLite for distribution is neat. FWIW, this is at least partially inspired by your Datasette project, which we may still try to do something with later on the reporting and data exploration side of things.


Years ago I had a conversation with a friend of a friend that worked at one of the big chip fabs in their lab dept. He mentioned they made very heavy use of sqlite as a file format for distribution. This was back in the "xml all the things" era and it struck me as such a smart breath of fresh air.

I'm honestly surprised it isn't more pervasive.


I've wanted to implement this on a distributed web server environment I manage. Right now there's a centralized MySQL database that the web servers read from when rendering a web page, but there can be lots of queries for a single render (page, sections, snippets, attributes, assets, etc.), and sending that all over the wire, while fast, is slower than reading from a database running on the same host. It'd be great to be able to copy the "master" database onto each web server instance, maybe once per minute, or just on-demand when a change to the data is made. I imagine this would make reads much faster.
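If I ever build it, the refresh step on each web server could be as simple as writing the downloaded snapshot next to the live file and atomically renaming it into place, so connections opened after the swap see the new data (paths invented, download step omitted):

```
import os, shutil, tempfile

DB_PATH = "/var/lib/app/content.db"   # what the page renderer opens read-only

def swap_in_snapshot(downloaded_path):
    # Copy next to the live file, then rename atomically (same filesystem),
    # so readers see either the complete old database or the complete new one.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(DB_PATH), suffix=".db")
    os.close(fd)
    shutil.copyfile(downloaded_path, tmp_path)
    os.replace(tmp_path, DB_PATH)
```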


That's how https://github.com/backtrace-labs/verneuil 's read replication is meant to be used. There's a command-line tool to recreate a sqlite DB file from a snapshot's manifest, with an optional local cache to avoid fetching unchanged pages, or you can directly use a replica in memory, with pragmas for (async) refreshes.

The write tracking needs to intercept all writes with a custom VFS, but once registered and configured, it's regular in-memory SQLite (no additional daemon).


rqlite[1] could basically do this, if you use read-only nodes[2]. It's not quite a drop-in replacement for SQLite on the write side, but from the point of view of clients at the edge, they see a SQLite database being updated which they can directly read[3].

That said, it may not be practical to have hundreds of read-only nodes, but for moderate-size needs, should work fine.

Disclaimer: I'm the creator of rqlite.

[1] https://rqlite.io/

[2] https://rqlite.io/docs/clustering/read-only-nodes/

[3] https://rqlite.io/docs/guides/direct-access/


(author here) I had looked at both Rqlite and the different commercial versions of this, but I didn't pursue them as they all seemed to require running an additional service on the host machines.


Yes, that is right, it would require a new service running on the host machines.

That said, I do think it depends on what you consider important, and what your experience has been in the past. I used to value simplicity above all, so reducing the number of moving pieces was important to my designs. For the purpose of this discussion let's count a service as a single moving piece.

But over time I've decided that I also value reliability. Operators don't necessarily want simplicity. What they want is reliability and ease-of-use. Simplicity sometimes helps you get there, but not always.

So, yes, rqlite means another service. But I put a lot of emphasis on reliability when it comes to rqlite, and ease-of-operation. Because often when folks want something "simple" what they really want is "something that just works, works really well, and which I don't have to think about". SQLite certainly meets that requirement, that is true.


rqlite is a great project! Sometimes I have been wondering how hard it would be to embed it into a web server process, like nginx or Apache, as a module.


What would be the challenge with hundreds of read nodes?


You may want to check out LiteFS and LiteStream by benbjohnson. There was a time in 2022 where he was trending every week for his projects. The following snippet is taken from the LiteFS webpage. "LiteFS is a distributed file system that transparently replicates SQLite databases. You can run your application like it’s running against a local on-disk SQLite database but behind the scenes the database is replicated to all the nodes in your cluster."


Why SQLite and not something like RocksDB, which is optimised for NVMe?

While SQLite can be used on SSDs, it is not as heavily optimized for SSDs as RocksDB. SQLite is a general-purpose embedded database and was primarily designed for use on devices like mobile phones or lightweight embedded systems, where storage is often slower and smaller (e.g., flash storage or basic SSDs).

SQLite’s traditional B-tree structure involves more random I/O compared to RocksDB's LSM tree, which is less ideal for maximizing the performance of high-speed SSDs like NVMe.

SQLite’s limited concurrency (single-writer, multiple-readers) also means it cannot fully utilize the parallelism and high throughput potential of NVMe SSDs.


This is a high read, low write application. And SSDs offer very fast random I/O performance. SQLite is more than sufficient and the entire database may well fit into the page cache anyway.


We use something similar for our event based databases. We project millions of events into an in-memory object (usually a dictionary), and periodically persist that state as json in S3. It's guaranteed to be consistent across all service instances because the persisted state contains the event stream positions to catchup from. The only drawback of course is that it can use up to several GB of memory depending on how big the state is, but it's extremely fast and doing in-memory operations is trivial compared to using SQL or a library with api calls.
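A stripped-down sketch of that pattern, with the S3 upload swapped for a local file to keep it self-contained:

```
import json

state = {}       # projection of millions of events, e.g. {aggregate_id: data}
position = 0     # last event stream position folded into `state`

def apply(event):
    global position
    state[event["id"]] = event["data"]
    position = event["position"]

def persist_snapshot(path="snapshot.json"):
    # The snapshot carries the stream position, so any instance that loads it
    # knows exactly where to resume catching up from.
    with open(path, "w") as f:
        json.dump({"position": position, "state": state}, f)

def load_snapshot(path="snapshot.json"):
    global state, position
    with open(path) as f:
        snap = json.load(f)
    state, position = snap["state"], snap["position"]
```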


In a past life, I used this pattern in hadoop mapreduce clusters. A job would do "small-data" sql queries locally to pull configs, maps of facts related to the business domain, then pack them into sqlite db files and ship them up with the job. Hadoop already has a process called "job localization" where it can download files sent up with the job down to the PWD of the mapper/reducer. And then the mapper/reducer can use it read-only while doing big-data things.


> a SQLite database file which is then replaced on a scheduled basis.

You could look into WAL replication if you wanted an efficient way to update the copies. Something like Litestream.


litestream is great. Dead simple and rock solid IME.


I’ve utilized this in Lambda@Edge for a use case that's half feature flag-y, half HTTP routing/proxy serving rules, as mentioned in a sibling comment. Lambdas pick up ~50-200MB of data on first boot, and keep their copy through their lifecycle.

As requests come in, we gather their features, convert them to (effectively) an int vector, filter for the row that matches, and serve the request.


This is the type of architecture we use for feature flagging, but it's just a JSON file.


> Feature flags can be checked dozens of times per request and often need the kind of queries (user is a member of group A and has an IP located in country B) which could be well served by a local SQLite - and feature flags have a tolerance for updates taking a few seconds (or longer) to roll out.

This doesn't sound right. A feature flag only requires checking if a request comes from a user that is in a specific feature group. This is a single key:value check.

The business logic lies in assigning a user to a specific feature group, which in the simplest case means pre-assigning the user, and in the most complex cases takes place at app start/first request to dynamically control dialups.

Either way, it's a single key: value check where the key is user ID+feature ID, or session ID + feature ID.

I mean, I guess you can ship a boatload of data and perform the same complex query over and over again. But that requires not having invested any thought into the issue and insisting on making things very hard for yourself, QAs, and users too. Read your own description: why are you making the exact same complex query over and over again, multiple times in the same request? At most, do it once, cache the result, and from there on just do a key:value check. You can use SQLite for that if you'd like.


I've worked at places where the feature flag system was much more dynamic than that, considering way more than just membership in a group.

This meant you could roll features out to:

- Specific user IDs

- Every user ID in a specific group

- Every object owned by a specific user ID (feature flags might apply to nested objects in the system)

- Requests from IP addresses in certain countries

- Requests served by specific website TLDs

- Users who are paid members of a specific plan

- etc etc etc

It was an enormously complicated system, that had evolved over 5-10 years.

Not saying that level of complexity is recommended, but that's what we had.

Looks like I gave a talk about this back in 2014: https://speakerdeck.com/simon/feature-flags


> This meant you could roll features out to:

That's how feature flag services work.

The whole point is that this is not a querying problem. That's a group assignment problem.

Users/sessions are assigned to a feature group according to business rules. These can be as complex as anyone wants them to be.

Once a user/session is assigned to a feature group, the problem of getting features is a key-value query.


GP's comment is talking about checking multiple feature flags, not checking a single feature flag multiple times.


GP referred specifically to queries checking if "user is a member of group A and has an IP located in country B".

The number of feature flags is irrelevant. In fact, the feature flag and A/B testing services I used always returned all default treatment overrides in a single request.



