Ws4sqlite: Query SQLite via HTTP (github.com/proofrock)
132 points by thunderbong on March 11, 2022 | hide | past | favorite | 41 comments



Interesting! We tunnel SQLite over JSON in the Notion iOS/Android/Mac/Windows apps, and the interface presented here looks pretty similar to our second iteration. Here are the things I think are missing:

- Precondition. One or more statements to run to check whether the transaction body can proceed. We use this to check pragma user_version, which we use to mark the schema version level. It prevents old processes from writing to the DB once another process has started a forward migration. You can also use it to assert on transaction state or remaining disk space.

- Userspace transaction batching. We found some use cases (such as during migrations) where we want to control transactions explicitly. Forcing each batch into a transaction is a good default, but ideally the JSON API doesn't restrict any DB semantics.

- Websockets! I thought “ws” could mean websocket but web server is fine too. Websockets work better for our low latency local-only connection use case, and have a more traditional socket-pool vibe. They also avoid retransmitting headers etc, and can allow the server to better understand client “sessions”.

- Extensions. I guess portability is a fine goal, but there’s so much great stuff in the extensions. Full text search, JSON support (although the latest version mainlined it), bulk data load. The main one for us is JSON.


Thanks for the comment. Some great points here.

- Precondition: this is a goal, once I find a good balance between allowing flexibility and not reinventing a scripting language. But yes, it's on the roadmap.

- Userspace transaction batching: interesting. I don't feel comfortable letting a transaction "escape" the request; I don't want to introduce timeouts, sessions or such. Could you describe a use case? This could be a goal.

- Uh, websockets are promising. Thanks!

- Extensions: basically, the point here is that Go doesn't allow a "clean" cross-distribution compilation if I enable them, and I cannot chase all the possible glibc configurations. But it's an open point for sure.

Isn't JSON included in the "main trunk" now?


Here's our precondition, it expects a single row with precondition_result to be 0/1. You could add an error reporting string column to the return type, but that's about all we needed. Very basic:

    SELECT
      CASE user_version
        WHEN ${schema.pragmas.user_version} THEN 1
        ELSE 0
      END AS precondition_result
    FROM pragma_user_version() LIMIT 1
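For reference, the same check runs against Python's stdlib sqlite3 (the ${schema.pragmas.user_version} template placeholder becomes a bound parameter; the helper name is mine):

```python
import sqlite3

def precondition_ok(conn: sqlite3.Connection, expected_version: int) -> bool:
    # Same shape as the query above; the ${...} template placeholder
    # is replaced by a bound parameter here.
    row = conn.execute(
        """
        SELECT CASE user_version
               WHEN ? THEN 1
               ELSE 0 END AS precondition_result
        FROM pragma_user_version() LIMIT 1
        """,
        (expected_version,),
    ).fetchone()
    return row[0] == 1

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA user_version = 5")  # mark the schema version level
print(precondition_ok(conn, 5))  # True: versions match, writes may proceed
print(precondition_ok(conn, 6))  # False: an old process must not write
```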
On extensions: Yes, JSON is included in trunk. Is the latest trunk included in your build? Or are you linking against the platform's sqlite3? At least for the built-in extensions, you can pre-build an amalgamation .c file that contains all the first-party extensions with `make sqlite3.c` and a few env vars. But maybe you're using a pre-built dependency ¯\_(ツ)_/¯.

One extension we like is LIMIT for UPDATE/DELETE. I believe (at least in the version of SQLite we build with) that we need to enable that one when producing the amalgamation.

On userspace transaction batching: our use case is internal to an app. When we built the second version of our SQLite bridge, we were replacing a safe but very rigid API, so we decided to build something with as few limitations as possible. The main use case today is batching multiple transactions into a single IPC message/JSON request object. But, for our platforms, there's exactly 1 client of the JSON bridge at a time, and we know out of band if that client dies or resets -- so we can terminate hanging transactions appropriately. You could also add an assertion that a batch never leaves a transaction open.

For an HTTP use-case, perhaps not a great idea because of multiple clients, sessions, etc. Maybe something to consider for the web socket API, which would give you a protocol session concept to reason with.
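The batching idea boils down to letting BEGIN/COMMIT appear in the batch itself instead of wrapping every batch in an implicit transaction. A minimal sketch with Python's stdlib sqlite3 in autocommit mode (the batch shape is hypothetical, not ws4sqlite's actual API):

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode, so transaction
# boundaries are exactly the BEGIN/COMMIT statements in the batch.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

# One "request" carrying two explicit transactions, the way a batched
# JSON payload might (hypothetical shape).
batch = [
    "BEGIN",
    "INSERT INTO kv VALUES ('a', '1')",
    "COMMIT",
    "BEGIN",
    "INSERT INTO kv VALUES ('b', '2')",
    "COMMIT",
]
for stmt in batch:
    conn.execute(stmt)

# A server could assert no transaction is left open after the batch:
assert not conn.in_transaction
print(conn.execute("SELECT count(*) FROM kv").fetchone()[0])  # 2
```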


Sorry, forgot to comment: SQLite is currently 3.38.0, see https://github.com/mattn/go-sqlite3/blob/master/sqlite3-bind... I don't package it directly, but via the go bindings by mattn (go-sqlite3).


Thanks for your comments, really interesting. I'll come back to this post for ideas ;-). I like the precondition; maybe I'd structure it so that it passes when 1+ rows are returned and fails on 0 rows... that should be equivalent, and it doesn't make any assumption about particular rows _and_ columns being there... but I'll think about it. Thank you again!
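That row-count variant could look like this (a sketch of the idea, not the actual planned feature; names are mine): the precondition passes iff the query returns at least one row, so no particular column needs to exist.

```python
import sqlite3

def precondition_passes(conn, sql, params=()):
    # Pass when the query yields 1+ rows, fail on 0 rows;
    # no assumption about which columns are present.
    return conn.execute(sql, params).fetchone() is not None

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA user_version = 5")

check = "SELECT 1 FROM pragma_user_version() WHERE user_version = ?"
print(precondition_passes(conn, check, (5,)))  # True
print(precondition_passes(conn, check, (6,)))  # False
```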


You can also query sqlite databases directly from an http server that supports range requests (like s3). There are a bunch of implementations of this in different languages including Go[0] and Javascript[1].

[0]: https://github.com/psanford/sqlite3vfshttp

[1]: https://github.com/phiresky/sql.js-httpvfs
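The trick behind these VFS-over-HTTP readers is that SQLite reads its file in fixed-size pages, which map directly onto HTTP Range requests. A rough illustration of the byte math (the helper name is mine):

```python
def page_byte_range(page_no: int, page_size: int) -> tuple:
    """Inclusive byte range of a 1-indexed SQLite page, as it would
    appear in an HTTP 'Range: bytes=start-end' header."""
    start = (page_no - 1) * page_size
    return (start, start + page_size - 1)

# Page 1 of a database with 4096-byte pages (it also holds the 100-byte header):
print(page_byte_range(1, 4096))  # (0, 4095)
print(page_byte_range(3, 4096))  # (8192, 12287)
```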


And Python - https://github.com/michalc/sqlite-s3-query

(although this is specific to S3 rather than generic HTTP)

(shameless plug - made mostly by me)


Author here. This is a pet project of mine, and what a surprise to see on HN! :-) Your discussions on PostgREST and on using SQLite for different use cases are what motivated me for this. So thanks!


This is an amazing project! I always wanted to leverage sqlite behind a REST API, this will make my life easier. Massive Kudos to the well-written info/documentation full of useful links to dive deeper into security and best practices about databases.


Thanks a lot!


How do you handle multiple readers and writers? Have you benchmarked performance? How is authentication and authorization handled?


Performance figures are here: https://germ.gitbook.io/ws4sqlite/features/performances. Reads and writes are serialized under the hood, but concurrency is allowed if the db is in read-only mode and WAL is not disabled.

Security discussion here https://germ.gitbook.io/ws4sqlite/security and here https://germ.gitbook.io/ws4sqlite/features#security-features

In particular, authentication https://germ.gitbook.io/ws4sqlite/documentation/authenticati...


In my testing of a similar project I found that WAL and multiple connections could still produce errors on concurrent writes; what I found was that a single writer with multiple readers performed extremely well. Have you seen anything similar?


I didn't find any concurrency issue; it isn't based on SQLite alone: the underlying layer (go-sqlite3) has a connection pool that I set to 1. I have a test case for this, and it works fine (as far as I can see). If you care to take a look at it, I'd like to hear your opinion.
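A connection pool capped at 1 is effectively a single shared connection guarded by a mutex, so writes can't race. A rough Python equivalent of that pattern (a sketch, not the project's actual Go code):

```python
import sqlite3
import threading

# One connection shared across threads, all access serialized by a lock --
# roughly what a max-1 connection pool gives you.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE t (n INTEGER)")
conn.commit()
write_lock = threading.Lock()

def writer(n_rows: int) -> None:
    for _ in range(n_rows):
        with write_lock:  # only one thread touches the connection at a time
            conn.execute("INSERT INTO t VALUES (1)")
            conn.commit()

threads = [threading.Thread(target=writer, args=(50,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(conn.execute("SELECT count(*) FROM t").fetchone()[0])  # 200
```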

About performance: I was blown away, yes. SQLite is an amazing piece of software; I just had to do my part while treating it with respect :-).


Or you can just use my HTTP native database: http://root.rupy.se

A complete JSON file database over HTTP in 200 lines of code.


(author) Different use case, but thanks, I'll take a look


This is just SQL in JSON; I prefer mapping SQL semantics onto HTTP semantics like PostgREST does. I'm not aware of any project that does that for SQLite.


https://github.com/bradleyboy/tuql converts an sqlite database to a GraphQL API


In a short time I'll release exactly that: PostgREST semantics backed by SQLite :)


No batches there. And only simple queries...

> For more complicated filters you will have to create a new view in the database, or use a stored procedure


You can do bulk/batch inserts, and because of the single request = single query behavior you can much more easily cache responses.

It's a tradeoff, but ws4sqlite just sees HTTP as a tunnel for SQL-in-JSON, while PostgREST tries to represent SQL in REST. Which you prefer is a matter of taste.


(author) Yes, it's a tradeoff. ws4sqlite was meant to give unstructured access to a SQLite database that otherwise cannot be reached over the net; whereas I see PostgREST (a wonderful product that I also use) as a way to implement persistence for a model you already have in your application. They're not "direct competitors", as I see it, and PostgREST is much more ambitious (and arguably more complex).


What does that look like?



I would just like to warn people that you shouldn't use this in any production or client-facing app, it's trivial for a malicious actor to change the SQL and drop your database or whatever.

If you need to access a sqlite database remotely, try https://github.com/rqlite/rqlite, it offers clustering on top of a basic SQLite database as well as a HTTP(s) API.


Author here. Thank you for this comment! Yes, it's probably a bad idea at this stage of maturity; there are security provisions put in place, please see https://germ.gitbook.io/ws4sqlite/security and of course never do anything of the sort without a reverse proxy. If you care to read it, I'd be interested in discussing things further.


If I'm not mistaken, rqlite is a different beast: it's an RDBMS which happens to use SQLite for storage. You can't just use it to share a live database over the Internet while it's being used by another regular SQLite application.


rqlite author here. Yes, that is a correct statement about rqlite.

https://github.com/rqlite/rqlite/blob/master/DOC/FAQ.md


Sending arbitrary SQL for remote execution feels very much like a remote shell, and can quickly turn into one with a misconfigured or buggy extension.


Not much different from a remote invocation of any other database (OK, this is more "tool-friendly", so it may open up more attacks). But your point stands.


Not using a version in the request, or response, or URL? That's a paddlin'.


Very true, thanks. It's on my to-do list.


This reminded me of that p2p SQLite project... I can't recall the name, but it seems like something like this running on mesh networks would be a game changer.

You could create databases that can't ever be taken down.


This one? https://news.ycombinator.com/item?id=30605252

Something very interesting is happening with SQLite. It has always been widely adopted (in the background?), but lately there's a lot of use-case development. Litestream for no-hassle replication (with live read replicas coming), querying static DBs with a VFS, torrent fetching, and now no-setup DB-to-API.


This might be the true decentralization platform that takes off. No Merkle tree nonsense.

Although the GUN ecosystem already seems quite popular, it's a graph DB, which makes it tough to use.


I definitely prefer this flavor of decentralization. Yeah not having to learn Graph DB would be a big selling point.


Ironically, one thing I keep wondering about is how to achieve immutability with a decentralized SQLite. Can't anybody read and write? Or is it possible with a column that references the previous record via cryptographic keys from multiple peers... my god, we have blockchain on SQLite.


This is the most interesting project I have seen in a while. Well done!


What about concurrency?


Hi! It doesn't add much to what SQLite allows. Concurrency is allowed if the database is set as read-only and you don't opt out of the default WAL mode; otherwise, transactions are serialized. Even so, you rarely notice it, because SQLite is quite fast.

Concurrency at the HTTP layer is permitted, though. Basically, the connection pool is capped at one connection.


Looks like a clean API.



