Hacker News new | past | comments | ask | show | jobs | submit login
Show HN: Minimal versioned log structured relational DB in Common Lisp (github.com/codr7)
110 points by codr7 on June 28, 2021 | hide | past | favorite | 15 comments



Cool project. Have you known that PostgreSQL was initially written in Lisp?

> By a process of elimination, we decided to try writing POSTGRES in LISP. We expected that it would be especially easy to write the optimizer and inference engine in LISP, since both are mostly tree processing modules. Moreover, we were seduced by AI claims of high programmer productivity for applications written in LISP.

> We soon realized that parts of the system were more easily coded in C, for example the buffer manager which moves 8K pages back and forth to the disk and uses a modified LRU algorithm to control what pages are resident. Hence, we adopted the policy that we would use both C and LISP and code modules of POSTGRES in whichever language was most appropriate. By the time Version 1 was operational, it contained about 17000 lines of LISP and about 63000 lines of C.

src: https://dsf.berkeley.edu/papers/ERL-M90-34.pdf

It also had time travel (from the same paper):

> Lastly, POSTGRES supports the notion of time travel. This feature allows a user to run historical queries. For example to find the salary of Sam at time T one would query:

    retrieve (EMP.salary)
    using EMP [T]
    where EMP.name = "Sam"

> POSTGRES will automatically find the version of Sam’s record valid at the correct time and get the appropriate salary.

Regarding your project:

> Databases are implemented as directories containing one file per table.

PostgreSQL splits tables into 1 GB segments (1 segment is one file). This is done for filesystems that can't have large files. Isn't one file per DB too limiting in that regard?

Are you aiming to be ACID compliant? Can you expand on 'versioned'? PostgreSQL achieves that with copy-on-write, did you take the same approach? I quickly looked over the code, it's been a while since I read some lisp. The on-disk representation seems to be just key value records where each record can be a sexp(?). Where is the relational part and do you plan to implement a query language?


Didn't know that about PostgreSQL, thanks.

I'm aiming to be as ACID as makes sense for a project this scale, it's slowly getting there.

The disk format is straight Lisp, I want it human readable and easy to process. Versioning is a consequence of the log format and tracking the different versions in memory.

By relational I mean based on tables, or relations; take away SQL and it's more or less the same thing.

No query languages planned, I strongly prefer an API to a query language for interfacing with the database.


Author here:

Whirlog is a minimal versioned log structured relational DB implemented in Common Lisp.

It's a design I've been playing around with in several languages to solve my issues with traditional databases, complexity and poor integration being two of them.

I've also grown fond of versioning, being able to track changes through time; which is uncommon and tricky to implement well on the application side.

I've written two non-trivial webapps on top of similar designs that have been running without a hitch for over a year now.

I'm more than happy to answer any questions posted in this thread!


Is this inspired by what Chris Okasaki calls persistent data structures in his marvelous little book?


To add more detail, I seem to remember, that "persistent" in that book means the following:

Once you have a handle on the data structure, you can perform any pure operation, which is intended for that data structure and your handle will still point to the same data structure, because "modified" versions are not really "modified" as in someone performing a mutation on the data structure, but actually new instances of the data structure, which might share parts with the original one.

This property is very nice to have for working with the data structure.


Sorry, no.

I bought the book at one point during my Haskell years but found it too dry for my taste at the time.


Check out Datomic


I'm having trouble understanding the term "log structured". Wikipedia has an entry for it, but that seems to describe a very specific mathematical theory that flies right above my head. Is it related to that or does it mean

    "This database shares characteristics with a log, in the sense of an append-mostly data structure"
?

And if I squint at, it's something like a "transaction log" or an "event sourcing" just at a different abstraction level ?


Yes, "log structured" basically means "append only (mostly)"

Traditionally on-disk data-structures are modified in-place. But random writes keep disks busy (especially spinning disks, though sequentially writes are still a bit faster in solid state drives), and are also very hard to do transactionally.

So "log structured" data-structures try to make do with only appends (and usually background "compaction"). This also has the benefits of automatically providing history (global versioned snapshots!) and an audit trail "for free", if you want that.

"Log structured merge trees" are a now popular data structure that implements a key-value map using several sorted, append only sequences, instead of the traditional hashtable or b-trees which require in-place modifications



I found these: https://github.com/barrucadu/logdb http://vldb.org/pvldb/vol5/p1004_hoangtamvo_vldb2012.pdf

Basically, what you said. Append-only, and in this case apparently just S-expressions.


Looks like a useful and fun side project, just wish I had time to play with it!


How does it compare to other similar software?


Could you be a tiny bit more specific about what software you're referring to? A link or two would help.


Off topic but can you please email hn@ycombinator.com? I want to put your post in the second chance pool (https://news.ycombinator.com/pool, explained at https://news.ycombinator.com/item?id=26998308) but it needs a bit more information.

Edit: thanks!




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: