
What you describe sounds more like Orleans grains. Erlang's lightweight processes serve a different purpose than actors or objects; they're really about fault tolerance.

A typical Phoenix app spawns one process for every HTTP request, websocket channel, or DB connection, for example. All these things have a lifetime shorter than or equal to the pod's.

The VM gives guarantees and tools to ensure that one of these processes crashing will not affect the others, unless you want it to. For instance, if a request crashes, its memory will be reclaimed, open files will be closed, DB connections will be re-opened, the exit reason will be logged, error metrics incremented, etc. This is possible with no defensive programming like try/catch and so on, because the processes are isolated. For example, they don't share memory, so it's always safe to deallocate something if its owner process died.
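A rough analogy in Python, to make the idea concrete. This is only a sketch and not how the BEAM implements isolation; `Task`, `handle_request` and `supervise` are hypothetical names. The handler contains no defensive code, and a single supervising boundary records the exit reason, releases what the failed task owned, and lets the other requests proceed.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Stand-in for an isolated process: it owns its resources outright."""
    name: str
    resources: list = field(default_factory=list)


def handle_request(task, payload):
    # No defensive code here: a bad payload simply crashes the task.
    task.resources.append(f"file-for-{task.name}")
    return 100 // payload


def supervise(payloads):
    """Run each request in its own Task; one crash never touches the others."""
    results, exit_log = {}, []
    for i, payload in enumerate(payloads):
        task = Task(name=f"req-{i}")
        try:
            results[task.name] = handle_request(task, payload)
        except Exception as exc:  # the only catch lives at the boundary
            exit_log.append((task.name, repr(exc)))
        finally:
            # Safe to release: nothing else shares this task's resources.
            task.resources.clear()
    return results, exit_log


# The second request divides by zero; the first and third still complete.
results, exit_log = supervise([5, 0, 20])
```

In Erlang the boundary is provided by the VM and the supervisor behaviours rather than by a try/except you write yourself, which is the point of the comment above.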

This gives a two-layer safety net that allows for very reliable apps, where the VM protects you against bugs in your own code, and Kubernetes protects you against bugs in Erlang, or really catastrophic failures. There's a good blog post on this: http://blog.plataformatec.com.br/2019/10/kubernetes-and-the-...




What I had in mind is some kind of gen_fsm or gen_statem.

For stateless applications it is pretty obvious how supervision trees can improve reliability. Essentially the only ‘state’ there is the request itself. In the worst case, the client would just retry.

But once some state is involved, it is not as simple.

Essentially, the answer to my question would probably be something like: “you should store the state in the external system, and design your system in such a way that the stored state is always consistent. In case of failure, the supervisor will respawn the process, and it will recreate what it needs from the saved state.”
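A minimal sketch of that pattern in Python, under stated assumptions: the `store` dict stands in for whatever external system you use, and `Counter` and `supervise` are hypothetical names, not an OTP API. The worker reloads its state when it starts, persists each transition before acknowledging it, and is simply recreated after a crash.

```python
class Counter:
    """Stateful worker that rebuilds itself from an external store on start."""

    def __init__(self, store, key):
        self.store, self.key = store, key
        # Recover the last saved state, or start fresh.
        self.count = store.get(key, 0)

    def handle(self, event):
        if event == "boom":
            raise RuntimeError("simulated bug")
        new_count = self.count + event
        # Persist before updating memory: a crash in between leaves the
        # store holding a complete, valid state.
        self.store[self.key] = new_count
        self.count = new_count


def supervise(store, key, events, max_restarts=3):
    """Restart the worker on a crash; the failing event is dropped."""
    worker, restarts = Counter(store, key), 0
    for event in events:
        try:
            worker.handle(event)
        except Exception:
            restarts += 1
            if restarts > max_restarts:
                raise  # give up, roughly as a supervisor would escalate
            worker = Counter(store, key)  # fresh worker, saved state
    return worker, restarts


store = {}  # assumption: stand-in for an external database
worker, restarts = supervise(store, "counter:1", [1, 2, "boom", 4])
```

The restarted worker picks up from the saved count rather than from zero, which only works because every write to the store is a complete, consistent state.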


You may find these interesting...

- "The Onion Layer Theory" https://learnyousomeerlang.com/building-applications-with-ot...

- "On Erlang, State and Crashes" http://jlouisramblings.blogspot.com/2010/11/on-erlang-state-...

- "Why Restarting Works" https://ferd.ca/the-zen-of-erlang.html (search for "Heisenbug")

> you should store the state in the external system

Disk works too, but if you're multi-node this means you now have a distributed database embedded in your system, which may or may not be your goal :)

RabbitMQ does this, they developed a library for "persistent, fault-tolerant and replicated state machines" based on Raft: https://github.com/rabbitmq/ra.


Ahh great. “On Erlang, State and Crashes” answers my question exactly.

As for disks: in the good old days, when servers were pets, not cattle, that was a good idea. But now that servers are as ephemeral as actors, we need to approach it differently, hence my original question.

Side note: I have a strange relationship with Erlang. I first learned it in 2006, liked the idea, and was hoping it would eat the world as scale increased. I even contacted Joe Armstrong in the hope of translating his thesis. There were zero Erlang books in the world at the time. Then I did some load tests using Tsung in 2012. Then I used Akka.NET in 2018. But to this day I have never had a chance to properly use it in production.


> you should store the state in the external system

It's an old talk, but this one has a strategy for robustly persisting state across ephemeral containers. The example is running a multiplayer game server through code updates and the like, using Horde and CRDTs.

https://www.youtube.com/watch?v=nLApFANtkHs
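For anyone unfamiliar with CRDTs, here is a minimal sketch of the simplest one, a grow-only counter, in Python. It is not Horde's API or the data structure the talk uses; `GCounter` and the pod names are hypothetical. It only illustrates why replicas on ephemeral nodes can converge without coordination.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per node, merged by element-wise max."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}

    def increment(self, amount=1):
        # Each node only ever writes to its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def merge(self, other):
        # max is commutative, associative and idempotent, so replicas
        # converge regardless of merge order or repeated delivery.
        for node, n in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), n)

    def value(self):
        return sum(self.counts.values())


a, b = GCounter("pod-a"), GCounter("pod-b")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)
b.merge(a)  # a duplicate merge changes nothing
```

After the merges both replicas hold the same slots, so a replacement pod that merges with any surviving replica recovers the shared state.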


Agreed, sounds like Orleans. See https://github.com/erleans/erleans :)

Also, for general writing on Erlang and k8s: https://adoptingerlang.org/docs/production/ -- I try to explain why the two work at different levels and so complement each other.



