The context is exactly once in a distributed system. When you construct that hig...

xg15 · on March 1, 2023

I mean, the property of distributed systems is that they crop up everywhere you have an unreliable transport and generally only bring downsides. If you could un-distribute your system that easily, I bet that would make a lot of people very happy.

The OP defines "distributed systems" like this:

> Web browser and server? Distributed. Server and database? Distributed. Server and message queue? Distributed.

By that definition, as soon as I have a server, a client and an unreliable connection between them, I have a distributed system. In that context, nothing stops me from counting IDs.

karpierz · on March 1, 2023

When you receive a message in your scheme, you have to do the following:

1. Some action with a side-effect (ex: update an entry on disk, send a message out to some third party, etc.). This might be a bank transaction, or a note saying "you gotta ship package X to person Y".

2. Some action to note that you've received ID X (ex: write to disk, send a message out to your DB, etc.)

How do you set up your server to deterministically do neither in the event of a crash, and both in the event that your code runs to completion?

iainmerrick · on March 1, 2023

If the action is sending a message, I’d say the answer is to just send it again, and defer the “exactly-once” problem to the point where the message is actually used. (It seems like we’re in a context where sending messages is always unreliable, so that send could fail, so there needs to be some protocol for retrying it anyway.)
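To make the "just send it again" idea concrete, here is a minimal sketch, not anyone's actual implementation. The names `Receiver`, `send_with_retry` and `ack_lost` are hypothetical, and the unreliable link is simulated by a list of booleans that says which acks get lost. The sender retries until it sees an ack, and the receiver deduplicates by message ID, so the payload is applied once even though it is delivered several times.

```python
class Receiver:
    """Applies each message ID at most once (in-memory only, so NOT crash-safe)."""

    def __init__(self):
        self.seen = set()      # IDs already processed
        self.applied = []      # payloads that were actually acted on

    def deliver(self, msg_id, payload):
        if msg_id not in self.seen:
            self.seen.add(msg_id)
            self.applied.append(payload)
        return "ack"           # always ack, even for duplicates


def send_with_retry(receiver, msg_id, payload, ack_lost, max_attempts=5):
    """Resend until an ack arrives.

    ack_lost is a hypothetical stand-in for the network: one bool per
    attempt, True meaning the message arrived but its ack was dropped.
    Returns the number of attempts used.
    """
    for attempt, lost in zip(range(1, max_attempts + 1), ack_lost):
        receiver.deliver(msg_id, payload)
        if not lost:
            return attempt
    raise TimeoutError(f"no ack for {msg_id!r}")


receiver = Receiver()
attempts = send_with_retry(receiver, "m1", "ship X to Y", [True, True, False])
print(attempts, receiver.applied)
```

The `seen` set lives in memory here, so this only covers the lossy-network half of the problem. Making the "mark as seen" step survive a crash is the part karpierz is asking about.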

So the action is some local action. If it’s purely digital, it’s fiddly but surely not impossible to ensure that the action and the record of the action either both take place or both don’t. It’s a database with a transaction log.
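As a rough sketch of what "a database with a transaction log" could look like for the purely digital case, assuming both the side effect and the "I've seen ID X" record live in the same SQLite database: the table names, `handle`, and the `crash_before_commit` flag are all made up for illustration. Both writes go in one transaction, so a failure before commit rolls back both, and a redelivered ID trips the primary key and is skipped.

```python
import sqlite3


def make_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS seen (msg_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS balance "
        "(account TEXT PRIMARY KEY, amount INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def handle(conn, msg_id, account, delta, crash_before_commit=False):
    """Return True if the message was applied, False if it was a duplicate."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Step 2 from the parent comment: note that we've received this ID.
            conn.execute("INSERT INTO seen (msg_id) VALUES (?)", (msg_id,))
            # Step 1: the side effect, in the same transaction.
            conn.execute(
                "INSERT OR IGNORE INTO balance (account, amount) VALUES (?, 0)",
                (account,),
            )
            conn.execute(
                "UPDATE balance SET amount = amount + ? WHERE account = ?",
                (delta, account),
            )
            if crash_before_commit:
                raise RuntimeError("simulated crash")  # hypothetical test hook
    except sqlite3.IntegrityError:
        return False  # msg_id already in `seen`: duplicate delivery
    return True


def balance(conn, account):
    row = conn.execute(
        "SELECT amount FROM balance WHERE account = ?", (account,)
    ).fetchone()
    return row[0] if row else 0
```

This only works because the action and the record share one transactional store. Once the side effect is a message to a third party or a physical device, the rollback no longer covers it, which is where the rest of this comment picks up.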

If the action is some irrevocable physical thing - remotely controlling a printer, say - you need to make a best effort to handle errors gracefully, sure.

I’d concede that it’s impossible to ensure that a document is printed out exactly once, say - maybe there’s a paper jam, and it’s debatable whether the jammed paper counts as a valid printout. But that’s not very surprising and I don’t think it tells you much. It’s mostly a problem for printer manufacturers rather than distributed system architects.