
> If I understand it well, Loki doesn't entirely do what I'd want: it does not save the data to disk on every change, but less often, if I get it right.

From a cursory glance at the source, it doesn't look like anything internal ever calls .save()/.saveToDisk(), so it is up to you to decide how often you want to be creating your on-disk checkpoints.

EDIT: Seems I am wrong: "In a nodejs/node-webkit environment LokiJS also persists to disk whenever an insert, update or remove is performed." [1]. But I still can't spot where .save[/ToDisk] is called in the source [2]. Anyone?

> That might be good enough if my problem allows for many little independent Loki databases, but if the dataset is a gigabyte and persistence means flushing that whole gigabyte to disk after every data change, it probably won't work very well.

If you're looking at gigabyte+ datasets, then something designed to be in-memory is probably not your best bet. Aside from saving, (one imagines) this would also affect load times (reading->parsing->importing a multi-GB JSON file at launch can't be quick).

From [1]:

> LokiJS is ideal for the following scenarios:

* where a lightweight in-memory db is ideal

* cross-platform mobile apps where you can leverage the power of javascript and avoid interacting with native databases

* data sets are not so large that it wouldn't be a problem loading the entire db from a server and synchronising at the end of the work session

[1] https://github.com/techfort/LokiJS

[2] https://github.com/techfort/LokiJS/blob/master/src/lokijs.js




> If you're looking at gigabyte+ datasets, then something designed to be in-memory is probably not your best bet. Aside from saving, (one imagines) this would also affect load times (reading->parsing->importing a multi-GB JSON file at launch can't be quick).

Why not? I only need to load when my program starts, i.e. after a crash or maybe an upgrade when hot code swapping is not available. Sounds like there may be plenty of persistence schemes that make this not entirely painful (load latest data first, or fall back to disk reads if the data isn't entirely loaded yet, which makes the service slower but still available right after a crash).

Note, I'm really just dreaming here, and I appreciate you dreaming along. Dreaming is good! Some frontend dev dreamed "i just want to rebuild the entire page whenever data changes!" and then React happened.

Thanks for your dig into the code btw, nice findings! I can't find the save() calls either, so I suspect that they used to be there but aren't now, or the other way around. It's alpha, after all.


My (unseen) emphasis was on "best bet".

It would work, but I think it would be dangerous and/or inconvenient.

The danger comes from having a sync-to-disk operation that lasts any considerable amount of time: the longer it lasts, the greater the likelihood that an inopportune crash leaves you with an incomplete (read: corrupted) JSON file. A DB built for fast disk persistence would only update the relevant records, keeping the disk writes as small as possible. I don't think Loki has any option other than writing the entire thing each time (with the current JSON savefiles, that is). Since save() is an expensive op, you also can't call it on every update (it wouldn't even work! The first call would lock the file for writing, and every subsequent save() would fail until the first one is done). So it would be inherently unsafe unless you committed GBs to disk for each update!
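To make the "incomplete JSON file" risk concrete, here is a minimal sketch of the usual mitigation: write the snapshot to a temporary file, flush it, then rename it over the old one. This is plain Node.js and not anything Loki does; `saveAtomically` and the file names are made up for illustration, and the rename is only expected to be atomic when both paths are on the same file system (POSIX semantics, Windows may behave differently).

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: never overwrite the live snapshot in place.
function saveAtomically(filePath, data) {
  const tmpPath = `${filePath}.${process.pid}.tmp`;
  const fd = fs.openSync(tmpPath, "w");
  try {
    fs.writeFileSync(fd, JSON.stringify(data)); // still writes the whole dataset
    fs.fsyncSync(fd); // ask the OS to flush the bytes before we rename
  } finally {
    fs.closeSync(fd);
  }
  // A crash before this line leaves the previous snapshot untouched.
  fs.renameSync(tmpPath, filePath);
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "snapshot-"));
const file = path.join(dir, "db.json");

saveAtomically(file, { users: [{ name: "odin" }] });
saveAtomically(file, { users: [{ name: "odin" }, { name: "thor" }] });

const restored = JSON.parse(fs.readFileSync(file, "utf8"));
console.log(restored.users.length); // 2
```

Note that this only protects against a half-written file. It does nothing about the cost: every save still serializes and writes the entire dataset.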

So, this might be somewhat dangerous, but we can mitigate that, right? We'll save frequently, but not too frequently, and somehow version-control our JSON files. Which are GB+ in size. So we'll also compress them? And when we need to restore, we'd... hmm... start reading from the latest one until a valid one is found? (<-- inconvenience)
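For what it's worth, the "read from the latest until a valid one is found" scheme could look roughly like the sketch below. Everything here is an assumption for illustration: the `snapshot-NNNNNNNN.json.gz` naming, the `writeSnapshot`/`restoreLatest` helpers, and the toy data. It uses only Node's built-in `fs` and `zlib`.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");
const zlib = require("zlib");

// Hypothetical naming scheme: zero-padded version so names sort chronologically.
function writeSnapshot(dir, version, data) {
  const name = `snapshot-${String(version).padStart(8, "0")}.json.gz`;
  fs.writeFileSync(path.join(dir, name), zlib.gzipSync(JSON.stringify(data)));
  return name;
}

// Walk snapshots from newest to oldest and return the first one that
// both decompresses and parses. Returns null if none is usable.
function restoreLatest(dir) {
  const names = fs
    .readdirSync(dir)
    .filter((n) => n.startsWith("snapshot-") && n.endsWith(".json.gz"))
    .sort()
    .reverse();
  for (const name of names) {
    try {
      const raw = zlib.gunzipSync(fs.readFileSync(path.join(dir, name)));
      return { name, data: JSON.parse(raw.toString("utf8")) };
    } catch (err) {
      // Truncated or corrupted snapshot: fall through to the next older one.
    }
  }
  return null;
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "snapshots-"));
writeSnapshot(dir, 1, { count: 1 });
writeSnapshot(dir, 2, { count: 2 });

// Simulate a crash halfway through writing snapshot 3.
const full = zlib.gzipSync(JSON.stringify({ count: 3 }));
fs.writeFileSync(
  path.join(dir, "snapshot-00000003.json.gz"),
  full.subarray(0, Math.floor(full.length / 2))
);

const restored = restoreLatest(dir);
console.log(restored.name, restored.data.count);
```

With GB-sized snapshots each failed attempt costs a full read and decompress, which is exactly the inconvenience.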

> Note, I'm really just dreaming here, and I appreciate you dreaming along.

Likewise!

> Sounds like there may be plenty of persistence schemes that make this not entirely painful (load latest data first, or fall back to disk reads if the data isn't entirely loaded yet, which makes the service slower but still available right after a crash).

While those certainly exist, the one currently chosen by Loki can't do any of these things. Since you can't parse half a JSON file (especially in this format, which appends index info etc. at the end of the file), you have to read the entire thing from disk, JSON.parse it, and then feed it into Loki. Only after all of that will you know that you are actually restoring from meaningful (and complete) data files and not junk.
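A tiny illustration of the "can't parse half a JSON file" point, using only the built-in `JSON.parse`. The document shape below is made up and is not Loki's actual save format; `tryParse` is a hypothetical helper.

```javascript
// Hypothetical helper: report whether a string parses as JSON.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

// Made-up save file with the index info trailing at the end.
const full = JSON.stringify({
  collections: [{ name: "users", data: [{ id: 1 }, { id: 2 }] }],
  indices: { users: ["id"] },
});

// What a crash halfway through the write would leave on disk.
const half = full.slice(0, Math.floor(full.length / 2));

const fullResult = tryParse(full);
const halfResult = tryParse(half);
console.log(fullResult.ok, halfResult.ok, halfResult.error);
```

The parser gives you all or nothing: there is no partial result to start serving from while the rest loads.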

The way I see it, this would be great for throwaway prototypes on the server side (or for small or unimportant datasets), but its real value would show on the client side. You basically get a mini-MongoDB in 2K LOC that you can embed in anything that speaks ECMAScript 3.

What's even cooler IMHO is the future: in their SlideShare presentation they list replication and horizontal scaling on their roadmap. While I have no idea how they envisage implementing those, I would love to experiment with combining the Meteor.js concepts and this: providing a full, fast cache of the user's data on the client side and then throwing differential updates around.

edit: spelling


I downloaded Loki and ran some tests. The README is wrong: it does not do any automatic persisting to disk at any time. That might be a planned feature, perhaps.

However, it does emit events, so it would not be hard to implement this behavior yourself with a simple event listener.
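A rough sketch of what such a listener could look like. To avoid guessing at Loki's actual event API, this uses Node's built-in `EventEmitter` as a stand-in database; the event names (`insert`, `update`, `delete`), the `attachAutosave` helper, and the `save` callback are all assumptions. It coalesces a burst of changes into a single save rather than saving on every change.

```javascript
const { EventEmitter } = require("events");

// Hypothetical helper: call save() at most once per delayMs, no matter
// how many change events arrive in between.
function attachAutosave(emitter, save, options = {}) {
  const events = options.events || ["insert", "update", "delete"];
  const delayMs = options.delayMs || 1000;
  let timer = null;
  let dirty = false;

  function flush() {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
    if (!dirty) return; // nothing changed since the last save
    dirty = false;
    save();
  }

  function onChange() {
    dirty = true;
    if (timer === null) {
      timer = setTimeout(flush, delayMs);
      timer.unref(); // a pending save should not keep the process alive
    }
  }

  for (const name of events) emitter.on(name, onChange);
  return { flush };
}

// Stand-in for a database object that emits change events.
const db = new EventEmitter();
let saveCount = 0;
const autosave = attachAutosave(db, () => { saveCount += 1; }, { delayMs: 50 });

db.emit("insert", { id: 1 });
db.emit("update", { id: 1 });
db.emit("delete", { id: 1 });

autosave.flush(); // force the pending save, e.g. right before shutdown
console.log(saveCount); // 1
```

Because the timer is unref'd, anything still pending when the process exits is lost unless `flush()` is called first, so this trades durability for fewer full writes.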


I was looking for that too, and it does indeed seem to be missing. Opened an issue [1].

[1] https://github.com/techfort/LokiJS/issues/14



