Hacker News new | past | comments | ask | show | jobs | submit | mumoshu's comments login

Anyone interested in this topic would also be interested in the discussion at https://news.ycombinator.com/item?id=32285435. mvsqlite might be able to handle cross-database transactions in the storage layer (i.e., mvsqlite's mvstore, backed by FoundationDB's fully serializable transactions) rather than in sqlite's native one, so I think there should be no theoretical limit due to SQLITE_MAX_ATTACHED.

Update:

But yeah, mvsqlite as of today seems to be tied to one sqlite instance per mvstore, so attaching databases to that sqlite instance is the only way to deal with multiple databases. So it will naturally be affected by SQLITE_MAX_ATTACHED.

Perhaps a potential big idea would be to have a sqlite instance per database and a "proxy" layer above the sqlite instances to (1) obtain/set a global transaction ID per multi-db transaction and (2) redirect queries to each involved sqlite database, while serializability is guaranteed in FoundationDB.
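That hypothetical proxy could be sketched roughly like this (all names here are made up for illustration; none of this exists in mvsqlite today):

```python
class FakeSqlite:
    """Stand-in for a per-database sqlite instance (one instance per DB)."""

    def __init__(self, name):
        self.name = name

    def execute(self, query):
        # A real instance would run the query; this just echoes it.
        return f"{self.name}: {query}"


class MultiDbProxy:
    """Hypothetical proxy coordinating one transaction across multiple DBs."""

    def __init__(self, sqlite_instances):
        # db_name -> per-database sqlite instance
        self.instances = sqlite_instances
        self.next_txn_id = 0

    def begin(self, db_names):
        # (1) obtain one global transaction ID shared by every involved DB
        self.next_txn_id += 1
        return {
            "id": self.next_txn_id,
            "dbs": {name: self.instances[name] for name in db_names},
        }

    def execute(self, txn, db_name, query):
        # (2) redirect the query to the involved sqlite database
        return txn["dbs"][db_name].execute(query)
```

Serializability across the databases would still have to come from the storage layer (FoundationDB), not from the proxy itself.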


Awesome! I've been wishing for something like FDB's record-layer without the Java/JVM dependency, and this has a lot of potential.

I have a question though: how does it handle conflicts within a single sqlite DB and among multiple sqlite DBs?

I believe FDB maintains strict serializability by checking conflict ranges to detect conflicts among concurrent transactions.

I read the doc and the mvstore code, and perhaps it works by writing the deltas of changed pages along with the read version obtained at sqlite transaction creation?

If that's the case, I'm still unsure what you add to the conflict ranges other than the deltas of the pages. To make it actually serializable, wouldn't you need to add conflict ranges for all the pages that are `select`ed within the sqlite transaction?


TLDR: The conflict range is the entire SQLite database. mvsqlite does not support concurrent read-write transactions to the same DB. If multiple RW transactions with overlapping [read_version, commit_version] ranges are requested to be committed, only one of the commits will succeed.
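In other words, the whole database acts as one coarse conflict range. A toy model of that commit rule (hypothetical names, not mvstore's actual code):

```python
class Database:
    """Toy model: the conflict range is the entire database, so a RW
    transaction commits only if nothing else committed after its
    read_version (coarse-grained optimistic concurrency control)."""

    def __init__(self):
        self.committed_version = 0  # version of the last successful commit

    def begin(self):
        # Snapshot the current version as the transaction's read_version.
        return self.committed_version

    def commit(self, read_version):
        # Another RW transaction committed in (read_version, now] -> conflict.
        if self.committed_version != read_version:
            return False
        self.committed_version += 1
        return True
```

Two transactions that both start before either commits have overlapping version ranges, so exactly one `commit` returns `True`.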

To scale out writes, you can use smaller databases - one database per user, for example. It is possible to do multi-database serializable transactions with mvsqlite (not yet implemented, but the logic shouldn't be complex).

mvsqlite actually doesn't just use FDB's native transaction, as I would like to avoid FDB's low txn size and time limits. Instead, there are two separate keyspaces for each SQLite DB - one "page index" keyspace, and one content-addressed store keyspace.

For reads: Pages are fully versioned, so they are always snapshot-readable in the future. The read version is fetched from `mvstore` when each SQLite transaction starts, and is used as the upper bound of the per-page range scan over the page index in subsequent read requests.
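As a guess at how such a versioned page index could look (the key layout and names here are assumptions, not mvstore's actual schema):

```python
import bisect


class PageIndex:
    """Guessed sketch: each page number maps to a version-sorted list of
    (version, page_hash) entries, and a reader scans up to its
    transaction's read_version to find the newest visible page."""

    def __init__(self):
        self.index = {}  # page_no -> sorted list of (version, page_hash)

    def put(self, page_no, version, page_hash):
        bisect.insort(self.index.setdefault(page_no, []), (version, page_hash))

    def get(self, page_no, read_version):
        # Range scan with read_version as the upper bound: keep the
        # newest entry whose version <= read_version.
        best = None
        for version, page_hash in self.index.get(page_no, []):
            if version <= read_version:
                best = page_hash
            else:
                break
        return best
```

Because old (version, hash) entries are never overwritten, any past read version remains snapshot-readable.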

For writes: Pages are first written to the content-addressed store, keyed by the page's hash. At commit, the hash of each page written in the SQLite transaction is written to the page index in a single FDB transaction to preserve atomicity. With 8K pages and ~60 B per key-value entry in the page index, each SQLite transaction can be as large as ~1.3 GB (compared to FDB's native txn size limit of 10 MB).
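The ~1.3 GB figure follows directly from the numbers above; a quick sanity check:

```python
# Only the ~60 B page-index entries go into the single FDB commit
# transaction; the 8 KiB page contents are written beforehand to the
# content-addressed store. So the FDB size limit bounds the number of
# entries, not the total page data.
FDB_TXN_LIMIT = 10 * 1000 * 1000   # FDB's native ~10 MB transaction limit
ENTRY_SIZE = 60                    # ~60 B per page-index key-value entry
PAGE_SIZE = 8 * 1024               # 8 KiB SQLite pages

max_entries = FDB_TXN_LIMIT // ENTRY_SIZE   # pages referenced per commit
max_txn_bytes = max_entries * PAGE_SIZE     # total page data per SQLite txn
print(max_txn_bytes / 1e9)                  # roughly 1.3-1.4 GB
```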

So actually, you can do one page read or write per FDB transaction and still preserve ACID properties.


Versioned pages backed by a content-addressed store, and transactions over the page index rather than the pages themselves! That totally makes sense to me.

Before you managed to produce mvsqlite, I was wondering if it would be possible to rebase Dolt's (https://github.com/dolthub/dolt) content-addressed page store (implemented with a Prolly Tree over a standard OS filesystem) onto FDB, so that there would be a MySQL-compatible DB with properties similar to mvsqlite's, where actual page updates can be done outside FDB transactions to overcome the 5-second FDB limit. Apparently, you have already materialized a similar idea in a more sophisticated, practical, and complete way.

Thanks a lot for clarifying and keep up the great work. Your work is totally awesome!

