This is billed as revolutionary forward progress, but that’s exactly how we used to do async in $lang - including Rust - before language support for async/await landed.
The biggest productivity boost to my rust embedded firmware development was when I could stop manually implementing state machines and marshalling all local variables into custom state after custom state between each I/O operation, and let rust do that for me by using async/await syntax!
That’s, after all, what async desugars to in rust: an automatic state machine that saves values across I/O (await) points for you.
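For anyone who hasn't written one by hand, here's a rough sketch of what that desugaring amounts to. The names are made up and the real compiler output goes through Future/Pin/Waker rather than a plain resume function, but the shape is the same: one enum variant per await point, carrying the locals that have to survive suspension.

// You write (pseudo-code):
//
//   async fn read_sensor(dev) -> u32 {
//       let raw = dev.read_register(TEMP).await; // await #1
//       let cal = dev.read_register(CAL).await;  // await #2: `raw` must survive this
//       raw * cal
//   }
//
// The compiler generates something shaped like this:

enum ReadSensor {
    AwaitingTemp,              // nothing to carry yet
    AwaitingCal { raw: u32 },  // `raw` is saved across the second await
    Done,
}

impl ReadSensor {
    // Hand-rolled "resume": each call feeds in the result of the last
    // I/O operation and moves the machine forward one step.
    fn resume(&mut self, io_result: u32) -> Option<u32> {
        match *self {
            ReadSensor::AwaitingTemp => {
                *self = ReadSensor::AwaitingCal { raw: io_result };
                None // still waiting on the next read
            }
            ReadSensor::AwaitingCal { raw } => {
                *self = ReadSensor::Done;
                Some(raw * io_result)
            }
            ReadSensor::Done => panic!("resumed after completion"),
        }
    }
}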
I tried to address this at the end of the post: If what you are implementing is mostly _sequential_ IO operations, then this model becomes a bit painful.
That isn't always the case though. In more packet-oriented use cases (QUIC, WebRTC & IP), doing the actual IO bit is easy: send & receive individual packets / datagrams.
There isn't really much the compiler can generate for you because you don't end up with many `.await` points. At the same time, the state management across all these futures becomes spaghetti code because many of these aspects should run concurrently and thus need to be in their own future / task.
I don't think that's quite true. The lift here is that the state machine does not do any IO on its own. It always delegates that work to the event loop that's hosting it, which allows it to be interpreted in different contexts. That makes it more testable and more composable as it makes fewer assumptions about the runtime environment.
Theoretically, you could do the same thing with async/await constructing the state machines for you, although in practice it's pretty painful and most async/await code is impure.
There are lots of more experimental languages with exceptional support for this style of programming (Eff, Koka, Frank). Underlying all of Haskell's IO discourse is a very deep investment into several breeds of this kind of technology (free monads and their variants).
Lately, Unison has been a really interesting language which explores lots of new concepts but also has at its core an extensible effects system that provides excellent language-level support for this kind of coding.
> I don't think that's quite true. The lift here is that the state machine does not do any IO on its own.
Here is a simple counterexample. Suppose you have to process a packet that contains many sequences (strings/binary blobs), each prefixed by 4 bytes of length.
You are not always guaranteed to get the length bytes or the string all in one go. In a sequential system you'd accumulate the string as follows
handle_input(...)
    while not received 4 bytes
        accumulate in buf
    len = toInt(buf[0..4])
    while not received len bytes
        accumulate in buf
Written with async/await, this needs just two await points to assemble the string. Flattening it out into a state machine manually is a pain.
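Something like this (a rough sketch, assuming tokio's AsyncReadExt and a big-endian length; untested):

use tokio::io::{AsyncRead, AsyncReadExt};

// The compiler stores `len_buf`, `len` and `payload` across the two await
// points for you; no hand-written states.
async fn read_frame<R: AsyncRead + Unpin>(stream: &mut R) -> std::io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    stream.read_exact(&mut len_buf).await?;   // await #1
    let len = u32::from_be_bytes(len_buf) as usize;
    let mut payload = vec![0u8; len];
    stream.read_exact(&mut payload).await?;   // await #2
    Ok(payload)
}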
I'm not sure what part of that is supposed to be a pain. The sans-io equivalent would be:
handle_input(buf) -> Result {
    if len(buf) < 4 { return Error::IncompletePacket }
    len = toInt(buf[0..4])
    if len(buf) < 4 + len { return Error::IncompletePacket }
    packet = buf[4..(4 + len)]
    return Ok { packet: packet, consumed: 4 + len }
}
where the semantics of `Error::IncompletePacket` are that the caller reads more into the buffer from its actual IO layer and then calls handle_input() again. So your "while not received required bytes: accumulate in buf" simply becomes "if len < required: return Error::IncompletePacket".
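In actual Rust that might look something like this (the error type and the `Parsed` struct are just illustrative, not from any particular crate):

// Sketch of the same thing in Rust.
enum Error {
    IncompletePacket, // caller should read more bytes and call again
}

struct Parsed<'a> {
    packet: &'a [u8],
    consumed: usize,
}

fn handle_input(buf: &[u8]) -> Result<Parsed<'_>, Error> {
    if buf.len() < 4 {
        return Err(Error::IncompletePacket);
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    if buf.len() < 4 + len {
        return Err(Error::IncompletePacket);
    }
    Ok(Parsed {
        packet: &buf[4..4 + len],
        consumed: 4 + len,
    })
}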
I don't think that implementation is particularly good, although it touches on one of the genuinely tricky questions with Sans-IO: is the event loop responsible for buffering the input bytes? Or are the state machines?
In effect, you have to be thoughtful (and explicit!) about the event loop semantics demanded by each state machine and, as the event loop implementer, you have to satisfy all of those semantics faithfully.
A few alternatives include your version; one where `handle_input` returns something like `Result<Option<Packet>>`, covering both error cases and successful-but-partial consumption; and one where `handle_input` tells the event loop how much additional input it needs whenever it finishes parsing a length field, and requires that the event loop not call it again until it can hand over exactly that many bytes.
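Sketching the latter two as a made-up trait (none of these names come from a real crate) makes the difference in contract visible:

trait FrameParser {
    type Frame;
    type Error;

    // Alternative 2: incomplete input is not an error. Ok(None) means "feed
    // me more bytes and call me again", Ok(Some(_)) is a complete frame.
    fn handle_input(&mut self, buf: &[u8]) -> Result<Option<Self::Frame>, Self::Error>;

    // Alternative 3: after a length field has been parsed, report how many
    // more bytes the event loop must accumulate before calling back.
    fn bytes_needed(&self) -> usize;
}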
This can all be pretty non-trivial. And then you'd want to compose state machines with different anticipated semantics. It's not obvious how to do this well.
Fair enough. So let's complicate it a little. If you have hierarchical variable-sized structures within structures (e.g. a Java class file), then you need a stack of work in progress (pointers plus length) at every level. In fact, the moment you need a stack to simulate what would otherwise have been a series of function calls, it becomes a pain.
Or let's say you have a loop ("retry three times before giving up"), then you have to store the index in a recoverable struct. Put this inside a nested loop, and you know what I mean.
I have run into these situations enough that a flat state machine becomes a pain to deal with.
These are nicely solved using coroutines. That way you can have function-related temporary state, IO-related state, and stacks all taken care of simply.
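E.g. a rough sketch of the retry case, using only std: the attempt counter and the in-flight state are ordinary locals that the generated future stores for you, with no hand-written struct needed.

// Retry an async operation up to 3 times. The loop index and the last error
// are plain locals; the compiler persists them across the await for us.
async fn retry3<T, E, F, Fut>(mut op: F) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T, E>>,
{
    let mut last_err = None;
    for _attempt in 0..3 {
        match op().await {
            Ok(v) => return Ok(v),
            Err(e) => last_err = Some(e),
        }
    }
    Err(last_err.expect("at least one attempt ran"))
}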
I agree totally; it wasn't my intention to say that there aren't protocols which require non-trivial state machines to implement their behavior.
To be more clear, I'm contesting that the only thing being discussed in the article is this convenience around writing state machines. I think whether or not you have to write non-trivial state machines by hand or have them generated by some convenient syntax is orthogonal to the bigger insight of what Sans-IO is going after.
I think the most important part here is that you write these state machines such that they perform no impure calculation on their own. In other words, you write state machines that must be driven by an event loop which is responsible for interpreting commands from those state machines, and all IO (and more generally, all impure calculation) is performed exclusively by that event loop.
It's much more possible to compose machines like this because they don't make as many assumptions on the runtime. It's not that they're reading from a blocking or non-blocking socket. It's that they process some chunk of bytes and _possibly_ want to send some chunk of bytes back. The event loop, constructed by the user of the state machine, is responsible for deciding how to read/write those bytes.
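Concretely, I mean something shaped like this (a made-up trait, just to show where the boundary sits):

use std::time::Instant;

// The machine never touches a socket. It only consumes bytes and emits
// bytes/time requests; the event loop does all of the actual I/O.
trait Protocol {
    // Feed bytes that the event loop read from *somewhere*.
    fn handle_input(&mut self, now: Instant, data: &[u8]);

    // Bytes the machine wants written, if any. The event loop decides whether
    // that happens on a blocking socket, mio, tokio, or a test harness.
    fn poll_output(&mut self) -> Option<Vec<u8>>;

    // When the machine next needs to be woken even without input
    // (retransmit timers and the like).
    fn poll_timeout(&self) -> Option<Instant>;
}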
Yep. The only thing about async that bothers me is the need to write ".await" everywhere. I wish there were a way to invert this and just run ".await" by default, with a special construct to opt out.
It’s important to be able to see where the async function might pause execution.
For example, if you’re holding a mutex lock, you probably want to avoid holding it “across” an await point so that it’s not locked for longer than necessary if the function is paused.
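For example (a sketch with std's Mutex; `some_io` is just a stand-in for any async call):

use std::sync::Mutex;

// The explicit `.await` makes it visible that the guard would otherwise be
// held while this task is suspended.
async fn update(counter: &Mutex<u64>) {
    {
        let mut guard = counter.lock().unwrap();
        *guard += 1;
    } // guard dropped here, *before* the await point

    some_io().await; // if the lock were still held here, every other task
                     // touching `counter` would stall until we resume
}

async fn some_io() { /* stand-in for a real I/O call */ }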
To play devil's advocate though, Mutex specifically has the advantage that MutexGuard is !Send, so it's a compiler error if a MutexGuard is held across an await point in a Send future. But yes, if your future is also !Send then the compiler will allow it. In that case, your only recourse is that clippy has lints for holding Mutex and RefCell guards across await points, as long as you're running it and paying attention to it, of course.
I disagree. It should be a compiler warning, maybe a "clippy" one, in such cases.
Btw the problem of sync code blocking async code is very real and also needs to be resolved; adding an explicit `.blocking` to every blocking call would be just as bad as an explicit .await on every line.
Also, I like the Haskell approach of being able to introduce syntax extensions at a file level: for code that'd benefit from explicit await, I'd rather let the author keep it explicit.