What you describe sounds more like Orleans grains. Erlang's lightweight processe...

mungobungo · on Nov 9, 2022

What I had in mind is some kind of gen_fsm or gen_statem.

For stateless applications it is pretty obvious how supervision trees improve reliability. Essentially the only ‘state’ there is the request itself. In the worst case the client just retries.

But once some state is involved, it is not as simple.

Essentially, the answer to my question would probably be something like: “You should store the state in an external system and design your system so that the stored state is always consistent. In case of failure, the supervisor will respawn the process, and it will recreate what it needs from the saved state.”
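
To make that concrete, here is a rough sketch of the idea in Python rather than Erlang, just to show the shape of it. Everything in it is made up for illustration: `ExternalStore` stands in for whatever external system holds the state, `CounterWorker` is a toy stateful process, and `supervise` is a very crude imitation of a supervisor restart loop, not how OTP supervisors are actually implemented.

```python
class ExternalStore:
    """Hypothetical stand-in for an external system (database, disk, ...)."""

    def __init__(self):
        self._data = {}

    def load(self, key, default):
        return self._data.get(key, default)

    def save(self, key, value):
        self._data[key] = value


class CounterWorker:
    """Toy stateful process: keeps a running total."""

    def __init__(self, store, key="counter"):
        self.store = store
        self.key = key
        # On (re)start, rebuild in-memory state from the last saved value.
        self.total = store.load(key, 0)

    def handle(self, amount):
        if amount < 0:
            # Simulated crash on input the worker does not know how to handle.
            raise ValueError("unexpected negative amount")
        new_total = self.total + amount
        # Persist before updating memory, so the stored state is never
        # ahead of or inconsistent with what was successfully processed.
        self.store.save(self.key, new_total)
        self.total = new_total
        return new_total


def supervise(store, messages, max_restarts=3):
    """Crude supervisor: respawn the worker when it crashes."""
    worker = CounterWorker(store)
    restarts = 0
    for msg in messages:
        try:
            worker.handle(msg)
        except Exception:
            restarts += 1
            if restarts > max_restarts:
                raise  # give up, like a supervisor exceeding its restart limit
            # The crashing message is dropped; the new worker reloads saved state.
            worker = CounterWorker(store)
    return worker.total, restarts


store = ExternalStore()
total, restarts = supervise(store, [1, 2, -1, 4])
print(total, restarts)
```

The worker crashes on the third message, gets respawned, picks up the saved total and carries on with the fourth. The part that needs real design work is the "always consistent" bit: here it is trivial because each update is a single write, but with multi-step updates the store needs transactions or some equivalent.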

ramchip · on Nov 9, 2022

You may find these interesting...

- "The Onion Layer Theory" https://learnyousomeerlang.com/building-applications-with-ot...

- "On Erlang, State and Crashes" http://jlouisramblings.blogspot.com/2010/11/on-erlang-state-...

- "Why Restarting Works" https://ferd.ca/the-zen-of-erlang.html (search for "Heisenbug")

> you should store the state in the external system

Disk works too, but if you're multi-node this means you now have a distributed database embedded in your system, which may or may not be your goal :)

RabbitMQ does this, they developed a library for "persistent, fault-tolerant and replicated state machines" based on Raft: https://github.com/rabbitmq/ra.

mungobungo · on Nov 9, 2022

Ahh great. “On Erlang, State and Crashes” answers my question exactly.

As for disks: in the good old times, when servers were pets, not cattle, that was a good idea. But now that servers are as ephemeral as actors, we need to approach it differently, hence my original question.

Sidenote: I have a strange relationship with Erlang. I first learned it in 2006, liked the idea, and was hoping it would eat the world as scale increased. I even contacted Joe Armstrong in the hope of translating his thesis. There were zero Erlang books in the world at the time. Then I did some load tests using Tsung in 2012. Then I used Akka.NET in 2018. But to this day I have never had a chance to properly use it in production.

h0l0cube · on Nov 9, 2022

> you should store the state in the external system

It's an old talk, but this one has a strategy for robustly persisting state across ephemeral containers. The example is running a multiplayer game server through code updates etc., using Horde and CRDTs.

https://www.youtube.com/watch?v=nLApFANtkHs

kungfooguru · on Nov 9, 2022

Agreed, sounds like Orleans. See https://github.com/erleans/erleans :)

Also, for general writing on Erlang and k8s: https://adoptingerlang.org/docs/production/ -- I try to explain why the two work at different levels and so complement each other.