Big fan of formalizing state machines and designing backends around them. I’ve been building a similar thing [0] but taking it a step further by enabling async tasks to run in response to transitions.
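To make "async tasks in response to transitions" concrete, here is a minimal sketch in plain Go. It is not taken from the linked project; `Machine`, `Allow`, `OnEnter`, `Fire` and the order states are all hypothetical names chosen for illustration. The machine owns the state, rejects events that are not allowed in the current state, and starts any registered tasks in goroutines once a transition has committed.

```go
package main

import (
	"fmt"
	"sync"
)

// State, Event and Task are hypothetical types for this sketch.
type State string
type Event string

// Task is a side effect started after a transition has committed.
type Task func(from, to State)

type transitionKey struct {
	from  State
	event Event
}

// Machine owns the current state; callers can only change it via Fire.
type Machine struct {
	mu          sync.Mutex
	state       State
	transitions map[transitionKey]State
	tasks       map[State][]Task
	wg          sync.WaitGroup
}

func NewMachine(initial State) *Machine {
	return &Machine{
		state:       initial,
		transitions: make(map[transitionKey]State),
		tasks:       make(map[State][]Task),
	}
}

// Allow registers a legal transition: in state from, event ev leads to state to.
func (m *Machine) Allow(from State, ev Event, to State) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.transitions[transitionKey{from: from, event: ev}] = to
}

// OnEnter registers a task to run asynchronously whenever state s is entered.
func (m *Machine) OnEnter(s State, t Task) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.tasks[s] = append(m.tasks[s], t)
}

// Fire applies an event. Illegal events leave the state untouched and return an error.
func (m *Machine) Fire(ev Event) error {
	m.mu.Lock()
	from := m.state
	to, ok := m.transitions[transitionKey{from: from, event: ev}]
	if !ok {
		m.mu.Unlock()
		return fmt.Errorf("event %q not allowed in state %q", ev, from)
	}
	m.state = to
	tasks := m.tasks[to]
	m.wg.Add(len(tasks))
	m.mu.Unlock()

	// Tasks run outside the lock so they cannot block other transitions.
	for _, t := range tasks {
		go func(t Task) {
			defer m.wg.Done()
			t(from, to)
		}(t)
	}
	return nil
}

// State returns the current state.
func (m *Machine) State() State {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.state
}

// Wait blocks until all tasks started so far have finished.
func (m *Machine) Wait() { m.wg.Wait() }

func main() {
	m := NewMachine("pending")
	m.Allow("pending", "pay", "paid")
	m.Allow("paid", "ship", "shipped")

	seen := make(chan string, 1)
	m.OnEnter("paid", func(from, to State) {
		seen <- fmt.Sprintf("%s -> %s", from, to)
	})

	// Shipping before payment is rejected; the state stays "pending".
	if err := m.Fire("ship"); err != nil {
		fmt.Println("rejected:", err)
	}
	if err := m.Fire("pay"); err != nil {
		fmt.Println("unexpected:", err)
	}
	m.Wait()
	fmt.Println("task saw:", <-seen)
	fmt.Println("state:", m.State())
}
```

This in-memory version loses everything on restart and never retries a failed task. Those are exactly the parts a durable backend would need to add.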
Yes, it is a “capital F” Framework. Meaning, at least with the Go SDK, a lot of my existing Go knowledge (I had been developing Go professionally for about 6 years before my first Temporal project) wasn’t as much help when learning Temporal as I thought it would be. Mainly, I was very comfortable using things like goroutines and channels, but picking up the Temporal versions of these in order to avoid non-determinism took a while. So there were a number of situations where another dev would ask me “how do I do X in Go”, and frequently I knew the answer but had to figure out how to make it work in a Temporal workflow. This wasn’t helped by the fact that, at least as of recently, their documentation was incredibly lackluster (the SDK is well commented though).
Additionally, error handling is significantly more difficult in Temporal than in normal Go code in my experience, and I ended up needing to toss a lot of patterns around returning errors to avoid making handling on the caller side incredibly verbose. And I mean verbose by Go error handling standards.
I will say that the overall product is really good, and I would definitely recommend it for use cases where you have long-running workflows. We have workflows running for several months, and being able to do things like call ‘time.sleep(10 days)’ in normal code, without having to think about the internals of scheduling resumption ourselves, is huge. I’ve just heard other teams in my org wanting to use Temporal when there isn’t the same time (or “temporal”) component to the problem they’re trying to solve, usually just because they heard about auto retries, and I’m not convinced it’s worth it in those cases.
I don’t consider the sleep and auto retry features that interesting, but definitely nice to have.
The advantages to me are that it lets you easily have long-lived “workflows” written in the language of your choice (5 SDKs at the moment), removes the need for storing state in your application database (and even removes the need for a database altogether for some services), handles the rough edges around distributed concurrency, and provides out-of-the-box visibility into workflow states.
It does seem like it would be overkill for simple use cases, but with the SaaS offering being so compelling IMHO, I think it can make sense to put smaller use cases there if there is a chance of more use cases in the future.
It does sound like you have more production experience with Temporal than me though, so please push back if my glasses are a bit rosy.
I do definitely think Temporal is a good product, and I hope my original comment didn’t come across as overly negative. I also only have production experience with the Go SDK, so maybe some of the pain points are less relevant in other languages. The main thing is that in order to accomplish all of the things it does, using Temporal effectively adds a second runtime to your application: a runtime that most developers in your language of choice are not familiar with and that, at least in my experience, has a fraction of the documentation.
I’d still definitely recommend Temporal if your use case involves long-running workflows. I’ve just seen a decent number of developers viewing Temporal as some sort of silver bullet but, as always, there are trade-offs.
So happy that others are getting behind this idea. Protecting critical backend state by establishing a state machine that owns it makes so many things so much easier to reason about.
Recently opened up access to a backend as a service [0] for running arbitrary state machines as reliable workflows or real-time backends. After building a bunch of systems on top of this, I'm definitely convinced that this is a better way to structure a lot of systems.
[0] https://github.com/fabianlindfors/restate