Interesting! We tunnel SQLite over JSON in the Notion iOS/Android/Mac/Windows apps, and the interface presented here looks pretty similar to our second iteration. Here’s the things I think are missing:
- Precondition. One or more statements to run to check if the transaction body can proceed. We use this to check pragma user_version which we use to mark schema version level. Prevents old processes from writing to the DB once another process has started a forward migration. You can also use it to assert about transaction state or disk space remaining.
- Userspace transaction batching. We found some use-cases (like as part of migrations) where we want to control transactions explicitly. Forcing each batch into a transaction is a good default, but ideally the JSON api doesn’t restrict any DB semantics.
- Websockets! I thought “ws” could mean websocket but web server is fine too. Websockets work better for our low latency local-only connection use case, and have a more traditional socket-pool vibe. They also avoid retransmitting headers etc, and can allow the server to better understand client “sessions”.
- Extensions. I guess portability is a fine goal, but there’s so much great stuff in the extensions. Full text search, JSON support (although the latest version mainlined it), bulk data load. The main one for us is JSON.
- Precondition: this is a goal, once I find a good balance between allowing to do things and not to reinvent a scripting language. But yes, it's on the roadmap.
- Userspace transaction batching: interesting. I don't feel comfortable about letting a transaction "escape" the request, I don't want to introduce timeouts, sessions or such. Could you state an use case? This could be a goal.
- Uh, websockets are promising. Thanks!
- Extension: basically, the point here is that Go doesn't allow for a "clean" cross-distribution compilation if I enable them, and I cannot chase all the possible glibc configurations. But it's an open point for sure.
Here's our precondition, it expects a single row with precondition_result to be 0/1. You could add an error reporting string column to the return type, but that's about all we needed. Very basic:
SELECT
CASE user_version
WHEN ${schema.pragmas.user_version} THEN 1
ELSE 0 END AS precondition_result
FROM pragma_user_version() LIMIT 1
On extensions: Yes, JSON is included in trunk. Is the latest trunk included in your build? Or are you linking against the platform's sqlite3? At least for the built-in extensions, you can pre-build an amalgamation .c file that contains all the first-party extensions with `make sqlite3.c` and a few env vars. But maybe you're using a pre-built dependency ¯\_(ツ)_/¯.
One extension we like is LIMIT for update/delete. I believe (at least in the version of SQLite we build with), that we need to enable that one when producing the amalgamation.
On userspace transaction batching: our usecase is internal inside an app. When we built the second version of our SQLite bridge, we were replacing a safe but very rigid API, so we decided to build something with as few limitations as possible. The main use-case today is batching multiple transactions into a single IPC message/JSON request object. But, for our platforms, there's exactly 1 client of the JSON bridge at a time, and we know out of band if that client dies or resets -- so we can terminate hanging transactions appropriately. You could also add an assertion that a batch never leaves a transaction open.
For an HTTP use-case, perhaps not a great idea because of multiple clients, sessions, etc. Maybe something to consider for the web socket API, which would give you a protocol session concept to reason with.
thanks for your comments, really interesting. I'll come back to this post for ideas ;-). I like the precondition, maybe I'd like to structure it so that it passes when 1+ rows are returned and fails if 0 rows... should be equivalent and doesn't make any assumption on rows _and_ columns being there... but I'll think about it. Thank you again!
You can also query sqlite databases directly from an http server that supports range requests (like s3). There are a bunch of implementations of this in different languages including Go[0] and Javascript[1].
Author here. This is a pet project of mine, and what a surprise to see on HN! :-) Your discussions on PostgREST and on using SQLite for different use cases are what motivated me for this. So thanks!
This is an amazing project!
I always wanted to leverage sqlite behind a REST API, this will make my life easier.
Massive Kudos to the well-written info/documentation full of useful links to dive deeper into security and best practices about databases.
In my testing of a similar project I found that WAL and multiple connections could still get errors with concurrent writes, what I found was that a single writer with multiple readers performed extremely well. Have you seen anything similar?
I didn't find any concurrency issue; it isn't based on sqlite alone, the underlying layer (go-sqlite3) has a connection pool that I set to 1. I have a test case on this, and it works fine (as far as I can see). If you care to take a look at it, I'd like to have your opinion.
About performances: I was blown out, yes. SQLite is an amazing piece of software, I just had to do my stuff while treating it with respect :-).
You can do bulk/batch inserts, and because of the single request = single query behavior you can much more easily cache responses.
It's a tradeoff, but ws4sqlite just sees HTTP as a tunnel for SQL-in-json while postgrest tries to represent SQL in REST. Which you prefer is a matter of taste.
(author) Yes, it's a tradeoff. ws4sqlite was meant to give an unstructured way to access sqlite, that cannot be accessed on the net; while I see PostgREST (a wonderful product that I also use) as a way to implement persistence for a model you already have in your application. They're not "direct competitors", as I see it, and PostgREST is much more ambitious (and complex, arguably).
I would just like to warn people that you shouldn't use this in any production or client-facing app, it's trivial for a malicious actor to change the SQL and drop your database or whatever.
If you need to access a sqlite database remotely, try https://github.com/rqlite/rqlite, it offers clustering on top of a basic SQLite database as well as a HTTP(s) API.
Author here. Thank you for this comment! Yes, it's probably a bad idea at this stage of maturity; there are security provisions put in place, please see https://germ.gitbook.io/ws4sqlite/security and of course never do anything of the sort without a reverse proxy. If you care to read it, I'd be interested in discussing things further.
If I'm not mistaking rsqlite is a different beast, it's a RDBMS which happens to use SQLite for storage. You can't just use it to share over the Internet a live database while it's being used by another regular sqlite application.
Not much different than a remote invocation of any other database (ok, this is more "tool-friendly" so it may open up to more attacks). But your point stands.
this reminded me of that p2p sqlite project...i can't recall the name but seems like something like this that runs on mesh networks would be a game changer
you could create databases that can't ever be taken down
Something very interesting is happening with SQLite. Has always been widely adopted (in the background?), but lately there's a lot of usecase development. Litestream for no-hassle replication (and live-read replicas coming), querying static DB with VFS, torrent fetching, and now no-setup DB to API.
Ironically one thing I reach for with this is how to achieve immutability with sqlite that is decentralized? Can't anybody and anyone read and write? or is it possible with another column that references the previous column with a cryptographic keys from multiple peers....my god we have blockchain on sqlite
Hi! It doesn't add much to what SQLite allows. It's allowed if database is set as read only and you don't opt-out of default WAL mode; if not, the transactions are serialized. Even so, you rarely notice it, because SQLite is quite fast.
Concurrency at the HTTP layer is permitted, though. Basically what is done is to set 1 max connection on the connection pool.
- Precondition. One or more statements to run to check if the transaction body can proceed. We use this to check pragma user_version which we use to mark schema version level. Prevents old processes from writing to the DB once another process has started a forward migration. You can also use it to assert about transaction state or disk space remaining.
- Userspace transaction batching. We found some use-cases (like as part of migrations) where we want to control transactions explicitly. Forcing each batch into a transaction is a good default, but ideally the JSON api doesn’t restrict any DB semantics.
- Websockets! I thought “ws” could mean websocket but web server is fine too. Websockets work better for our low latency local-only connection use case, and have a more traditional socket-pool vibe. They also avoid retransmitting headers etc, and can allow the server to better understand client “sessions”.
- Extensions. I guess portability is a fine goal, but there’s so much great stuff in the extensions. Full text search, JSON support (although the latest version mainlined it), bulk data load. The main one for us is JSON.