
Not really. Because then once you restart that process you lose everything.

And it's far more likely you are continuously upgrading your process than Redis.




If you need that, you can use an embedded data store like leveldb/rocksdb or sqlite. Why bring another application running as its own process into the equation?
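For illustration, a minimal sketch of the embedded approach using Python's standard-library sqlite3; the file name and schema here are made up:

    import sqlite3

    # The store lives inside the application's own process; no separate
    # server process is involved.
    conn = sqlite3.connect("cache.db")
    conn.execute("CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)")

    def put(key, value):
        # Committing the transaction makes the write durable before returning.
        with conn:
            conn.execute("INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
                         (key, value))

    def get(key):
        row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
        return row[0] if row else None

    put("user:42", "alice")
    print(get("user:42"))  # -> alice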


But having separate applications can give you better performance and isolation than one single-threaded process?

Or do you compare using separate processes to having multiple threads in a single process?


There's no universe where a single-threaded embedded persistence implementation is slower than a single-threaded application synchronously talking to a single-threaded database over the network stack.

As far as isolation goes, if you are worried about the properties of reading and writing data to the disk then I simply don't know what to tell you. Isolation from what?


Why the network stack? On the same host, IPC over shared memory is a normal thing.

Performance-wise, I do not know of a nice portable way of flushing changes to disk securely that does not block (like, e.g. fsync does).

If you own the whole system and can tune the whole kernel and userspace to run a single application, sure, why overengineer. Otherwise software faults (bugs, a crash due to memory overcommit, the OOM killer, etc.) take down a single process, and that can be less disruptive than a full stop/start.


> On the same host, IPC over shared memory is a normal thing.

Not for Redis.

> I do not know of a nice portable way of flushing changes to disk securely that does not block (like, e.g. fsync does).

If you use Redis, you're either not waiting for writes to be acknowledged or you're waiting on fsync. You always fsync, whether it's in-process or not, or you're risking losing data.

Which process blocks doesn't affect performance; the cost is getting the data onto disk in the first place.
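To make that concrete, a rough sketch (in Python, names made up) of a durable in-process append; Redis with appendfsync always pays the same fsync cost on its side of the socket:

    import os

    def append_durably(path, record):
        # O_APPEND writes go to the end of the file atomically.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        try:
            os.write(fd, record)
            # write() only reaches the page cache; fsync blocks until the data
            # is on stable storage. Skipping it is the "not waiting for writes
            # to be acknowledged" case.
            os.fsync(fd)
        finally:
            os.close(fd)

    append_durably("ops.log", b"SET user:42 alice\n")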

> Otherwise software faults (bugs, a crash due to memory overcommit, the OOM killer, etc.) take down a single process, and that can be less disruptive than a full stop/start.

Even worse: Redis crashes and now your application (which hasn't crashed) can't read or write data, perhaps in the middle of ongoing operations. You have a whole new class of failure modes.


Running in its own process, and, better yet, in its own cgroup (container), makes potential bugs in it, including RCEs, harder to exploit. It also makes it easier to limit the resources it consumes, monitor its functioning, etc. Upgrading it does not require you to rebuild and redeploy your app, which may be important if a bug or performance regression is triggered and you need a quick upgrade or downgrade with (near) zero downtime.

Ideally every significant part should live in its own universe, only interacting with other parts via well-defined interfaces. Sadly, it's either more expensive (even as Unix processes, to say nothing of Windows), slower (Erlang / Elixir), or both (microservices).


At the cost of requiring a lot of IPC and memory copying.


Either fork the process so the forked copy can dump its data (I think Redis itself does something like this), or launch a new process (with updated code if desired), then migrate the data and open sockets to it through some combination of unix domain sockets and maybe mmap. Or, if we had OSes that really used the x86 segmentation capability as it was designed (this is one thing the 386 and later did cleverly), it could all be done with segments.
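A rough sketch of the socket-migration part, assuming Python 3.9+'s socket.send_fds/recv_fds (which wrap SCM_RIGHTS over a unix domain socket); the path and names are illustrative:

    import socket

    CONTROL_PATH = "/tmp/handover.sock"  # hypothetical control socket path

    # Old process: hand the listening socket's fd to the replacement process,
    # so the listening socket (and its queue) survives the swap; per-connection
    # fds can be passed the same way.
    def hand_over(listener):
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as ctrl:
            ctrl.connect(CONTROL_PATH)
            socket.send_fds(ctrl, [b"listener"], [listener.fileno()])

    # New (upgraded) process: receive the fd and wrap it back into a socket.
    def take_over():
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv:
            srv.bind(CONTROL_PATH)
            srv.listen(1)
            conn, _ = srv.accept()
            with conn:
                _, fds, _, _ = socket.recv_fds(conn, 1024, 1)
        return socket.socket(fileno=fds[0])  # same underlying listening socket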


Now you're making it complicated again.

I think a good rule of thumb is:

* If you need it from a single process and it doesn't need to survive process restarts, just use basic in-process data structures.

* If you need it from multiple processes or it needs to survive process restarts, but not system restarts, use Redis (a small sketch follows the list).

* If it needs persistence across system restarts, use a real DB.
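For that middle case, a minimal sketch with the redis-py client (connection details assumed, Redis persistence left at its defaults):

    import redis  # pip install redis

    # Any process on the host can reach this state, and it outlives restarts
    # of the application process; it does not outlive a Redis/host restart
    # unless Redis persistence is turned on.
    r = redis.Redis(host="localhost", port=6379, decode_responses=True)

    r.set("session:42", "alice", ex=3600)  # shared value with a 1-hour TTL
    print(r.get("session:42"))             # -> alice
    r.incr("pageviews")                    # atomic counter across processes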


Redis is nice but you take a huge speed hit depending on what you're doing, compared to using in-memory structures. Note that Redis can also persist to disk, either by forking or by writing an append-only op log that can be replayed for reloading or replication. Anyway, you've forgotten the case where you want not only the data structures, but also the open network connections to persist across process restarts. That's what passing file descriptors through unix domain sockets lets you do. It gets you the equivalent of Erlang's hot upgrades.


> I think Redis itself does something like this

It does... and it's a blocking operation (Bad). It's used when storing data to a file to allow restarts without losing said data.


Do you mean fork is a blocking operation? That's surprising but hmm ok. I didn't realize that. I thought that the process's pages all became copy-on-write, so that page updates could trigger faults handled by in-memory operations. Maybe those could be considered blocking (I hadn't thought of it like that) or maybe you mean something different?
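For what it's worth, a toy sketch (Unix-only, not Redis's actual code) of that copy-on-write snapshot pattern:

    import json
    import os

    data = {"user:42": "alice", "counter": 7}  # in-memory state to snapshot

    def background_save(path):
        pid = os.fork()  # pages become copy-on-write; nothing is copied eagerly
        if pid == 0:
            # Child: sees a frozen view of `data` as of the fork and can take
            # its time writing it out.
            with open(path, "w") as f:
                json.dump(data, f)
            os._exit(0)  # skip normal interpreter teardown in the child
        # Parent: returns immediately and keeps serving writes; only the pages
        # it subsequently mutates get physically copied.

    background_save("snapshot.json")
    data["counter"] += 1  # parent mutates freely after the fork
    os.wait()             # reap the child when convenient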



