> I would have been more interested in seeing how they could implement an encaps...

hardwaresofton · 2024-07-04T11:45:23 1720093523

> The "encapsulated function" is the `StunBinding` struct. It represents the functionality of a STUN binding. It isn't a single function you can just call, instead it requires an eventloop.
>
> The point though is, that `StunBinding` could live in a library and you would be able to use it in your application by composing it into your program's state machine (assuming you are also structuring it in a sans-IO style).

What I was thinking was that the functionality being executed in main could just as easily be in a library function -- that's what I meant by encapsulated function, maybe I should have said "encapsulated functionality".

If the thing I want to do is the incredibly common read or timeout pattern, how do I do that in a sans-IO way? This is why I was quite surprised to see the inclusion of tokio::select -- that's not very sans IO, but is absolutely the domain of a random library function that you might want to expose.

It's a bit jarring to introduce the concept as not requiring choices like async vs not, then immediately require the use of async in the event loop (required to drive the state machine to completion).

Or is the point that the event loop should be async? That's a reasonable expectation for event loops that are I/O bound -- it's the whole point of an event loop/reactor pattern. Maybe I'm missing some example where you show an event loop that is not async, to show that you can drive this no matter whether you want or don't want async?

So if I try to condense & rephrase:

If I want to write a function that listens or times out in sans-IO style, should I use tokio::select? If so, where is the async runtime coming from, and how will the caller of the function be able to avoid caring?

wh33zle · 2024-07-04T12:10:22 1720095022

> If I want to write a function that listens or times out in sans-IO style, should I use tokio::select? If so, where is the async runtime coming from, and how will the caller of the function be able to avoid caring?

To "time-out" in sans-IO style means that your state machine has an `Instant` internally and, once called at a specific point in the future, compares the provided `now` parameter with the internal timeout and changes its state accordingly. See [0] for an example.
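A minimal sketch of that time-out idea, not the code linked in [0]. `ReadOrTimeout`, `State` and the method names are assumptions chosen for illustration. The state machine never reads a clock itself; it only compares the `now` it is handed against the deadline it stored.

```rust
use std::time::{Duration, Instant};

/// Hypothetical states of a "read or time out" component.
#[derive(Debug, PartialEq)]
enum State {
    Waiting { deadline: Instant },
    Received(Vec<u8>),
    TimedOut,
}

/// Hypothetical sans-IO state machine: no sockets, no clock, no async.
struct ReadOrTimeout {
    state: State,
}

impl ReadOrTimeout {
    fn new(now: Instant, timeout: Duration) -> Self {
        Self { state: State::Waiting { deadline: now + timeout } }
    }

    /// Tells the caller when it should next call `handle_timeout`, if ever.
    fn poll_timeout(&self) -> Option<Instant> {
        match &self.state {
            State::Waiting { deadline } => Some(*deadline),
            _ => None,
        }
    }

    /// The caller supplies the current time; the deadline check is pure logic.
    fn handle_timeout(&mut self, now: Instant) {
        let expired = matches!(&self.state, State::Waiting { deadline } if now >= *deadline);
        if expired {
            self.state = State::TimedOut;
        }
    }

    /// The caller supplies bytes it read from wherever it likes.
    fn handle_input(&mut self, packet: &[u8]) {
        if matches!(self.state, State::Waiting { .. }) {
            self.state = State::Received(packet.to_vec());
        }
    }
}
```

Because time is a parameter, a test can jump straight to the deadline by passing `start + timeout` instead of sleeping.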

> but is absolutely the domain of a random library function that you might want to expose.

That entire `main` function is _not_ what you would expose as a library. The event loop should always live as high up in the stack as possible, thereby deferring the use of blocking or non-blocking IO and allowing composition with other sans-IO components.

You can absolutely write an event loop without async. You can set the read-timeout of the socket to the value of `poll_timeout() - Instant::now` and call `handle_timeout` in case your `UdpSocket::recv` call errors with a timeout. str0m has an example [1] like that in their repository.
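A rough sketch of such a blocking loop, using only `std`. This is not the str0m example from [1]; the `SansIo` trait and its `is_done` method are assumptions made up for this sketch, with method names borrowed from the thread.

```rust
use std::io::{self, ErrorKind};
use std::net::UdpSocket;
use std::time::{Duration, Instant};

/// Hypothetical minimal interface of a sans-IO state machine.
trait SansIo {
    fn poll_timeout(&self) -> Option<Instant>;
    fn handle_timeout(&mut self, now: Instant);
    fn handle_input(&mut self, packet: &[u8], now: Instant);
    /// Assumed helper so the loop knows when to stop.
    fn is_done(&self) -> bool;
}

/// Drives any `SansIo` value with blocking IO; no async runtime involved.
fn run_blocking<S: SansIo>(socket: &UdpSocket, sm: &mut S) -> io::Result<()> {
    let mut buf = [0u8; 1500];
    while !sm.is_done() {
        // Block in `recv` no longer than the state machine's next deadline.
        // `set_read_timeout` rejects a zero duration, so clamp to 1 ms.
        let wait = sm.poll_timeout().map(|deadline| {
            deadline
                .saturating_duration_since(Instant::now())
                .max(Duration::from_millis(1))
        });
        socket.set_read_timeout(wait)?;

        match socket.recv(&mut buf) {
            Ok(n) => sm.handle_input(&buf[..n], Instant::now()),
            // A read timeout surfaces as `WouldBlock` or `TimedOut`,
            // depending on the platform.
            Err(e) if matches!(e.kind(), ErrorKind::WouldBlock | ErrorKind::TimedOut) => {
                sm.handle_timeout(Instant::now())
            }
            Err(e) => return Err(e),
        }
    }
    Ok(())
}
```

The same `SansIo` value could just as well be driven from an async loop; only this function would change.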

> It's a bit jarring to introduce the concept as not requiring choices like async vs not, then immediately require the use of async in the event loop (required to drive the state machine to completion).

All the event loops you see in the post are solely there to ensure we have a working program but are otherwise irrelevant, esp. implementation details like using `tokio::select` and the like. Perhaps I should have made that clearer.

[0]: https://github.com/firezone/firezone/blob/1e7d3a40d213c9524a...

[1]: https://github.com/algesten/str0m/blob/5b100e8a675cd8838cdd8...

hardwaresofton · 2024-07-04T12:51:03 1720097463

> To "time-out" in sans-IO style means that your state machine has an `Instant` internally and, once called at a specific point in the future, compares the provided `now` parameter with the internal timeout and changes its state accordingly. See [0] for an example.

This part of the post was clear -- I didn't ask for any clarification on that. My point was about what I see as "read or timeout", a reasonable piece of functionality to expose as an external-facing function.

The question is still "If I want to read or timeout, from inside a function I expose in a library that uses sans-IO style, how do I do that?".

It seems like the answer is "if you want to accomplish read or timeout at the library function level, you either busy wait or pull in an async runtime, but whatever calls your state machine has to take care of that at a higher level".

You see how this doesn't really work for me? Now I have to decide whether the read_or_timeout() function I expose is sync by default (and I have to figure out how long to wait, etc), or async.

It seems in sans-IO style read_or_timeout() would be sync, and do the necessary synchronous waiting internally, without the benefit of being able to run other tasks from unrelated state machines in the meantime.

> That entire `main` function is _not_ what you would expose as a library.

Disagree -- it's entirely reasonable to expose "read your public IP via STUN" as a library function. I think we can agree to disagree here.

> The event loop should always live as high up in the stack as possible, thereby deferring the use of blocking or non-blocking IO and allowing composition with other sans-IO components.

Sure... but that means the code you showed me should never be made into a library (we can agree to disagree there), and I think it's reasonable functionality for a library...

What am I missing here? From unrelated code, I want to call `get_ip_via_stun_or_timeout(hostnames: &[String], timeout: Duration) -> Option<String>`, is what I'm missing that I need to wrap this state machine in another to pass it up to the level above? That I need to essentially move the who-must-implement-the-event-loop one level up?

> You can absolutely write an event loop without async. You can set the read-timeout of the socket to the value of `poll_timeout() - Instant::now` and call `handle_timeout` in case your `UdpSocket::recv` call errors with a timeout. str0m has an example [1] like that in their repository.

Didn't say you couldn't!

What you've described is looping with an operation-supported timeout, which requires timeout integration at the function call level below you to return control. I get that this is a potential solution (I mentioned it in my edits on the first comment), but not mentioning it in the article was surprising to me.

The code I was expecting to find in that example is like the bit in str0m:

https://github.com/algesten/str0m/blob/5b100e8a675cd8838cdd8...

Clearly (IMO evidenced by the article using this method), the most ergonomic way to do that is with a tokio::select, and that's what I would reach for as well -- but I thought a major point was to do it sans IO (where "IO" here basically means "async runtime").

Want to note again, this is not to do with the state machine (it's clear how you would use a passed in Instant to short circuit), but more about the implications of abstracting the use of the state machine.

> All the event loops you see in the post are solely there to ensure we have a working program but are otherwise irrelevant, esp. implementation details like using `tokio::select` and the like. Perhaps I should have made that clearer.

I personally think it exposes a downside of this method -- while I'm not a fan of simply opting in to either async (and whichever runtime smol/tokio/async-std/etc) or sync, it seems like this pattern will force me to do one of the following:

- Write all code as sync

- Write sync code that does its waiting via operations that yield back control early

- Hold my own tokio runtime so I can do concurrent things (this, you argue against)

Async certainly can be hard to use and have many footguns, but this approach is certainly not free either.

At this point if I think I want to write a library that supports both sync and async use cases it feels like feature flags & separate implementations might produce an easier to understand outcome for me -- the sync version can even start as mostly `tokio::runtime::Runtime::block_on`s, and graduate to a more performant version with better custom-tailored efficiency (i.e. busy waiting).

Of course, I'm not disparaging the type state pattern here/using state machines -- just that I'd probably use that from inside async/sync-gated modules (and be able to share that code between the two impls).

wh33zle · 2024-07-04T13:33:52 1720100032

> What am I missing here? From unrelated code, I want to call `get_ip_via_stun_or_timeout(hostnames: &[String], timeout: Duration) -> Option<String>`, is what I'm missing that I need to wrap this state machine in another to pass it up to the level above? That I need to essentially move the who-must-implement-the-event-loop one level up?

Essentially yes! For such a simple example as STUN, it may appear silly because the code that is abstracted away in a state machine is almost shorter than the event loop itself.

That very quickly changes as the complexity of your protocol increases though. The event loop is always roughly the same size yet the protocol can be almost arbitrarily nested and still reduces down to an API of `handle/poll_timeout`, `handle_input` & `handle_transmit`.
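To illustrate how nesting collapses into the same small API, here is a hedged sketch with only the timer half of that interface. `Timer` and `Node` are hypothetical names and do not come from `snownet`; a real component would also forward `handle_input` and outgoing packets in the same way.

```rust
use std::time::{Duration, Instant};

/// Hypothetical child component: fires every `interval` (assumed non-zero).
struct Timer {
    next: Instant,
    interval: Duration,
    fired: u32,
}

impl Timer {
    fn new(now: Instant, interval: Duration) -> Self {
        Self { next: now + interval, interval, fired: 0 }
    }

    fn poll_timeout(&self) -> Option<Instant> {
        Some(self.next)
    }

    fn handle_timeout(&mut self, now: Instant) {
        // Catch up on every tick that is due at `now`.
        while now >= self.next {
            self.fired += 1;
            self.next += self.interval;
        }
    }
}

/// Hypothetical parent: exposes the same API as its children,
/// so the event loop does not grow when a child is added.
struct Node {
    keepalive: Timer,
    refresh: Timer,
}

impl Node {
    /// The earliest child deadline is the parent's deadline.
    fn poll_timeout(&self) -> Option<Instant> {
        match (self.keepalive.poll_timeout(), self.refresh.poll_timeout()) {
            (Some(a), Some(b)) => Some(a.min(b)),
            (a, b) => a.or(b),
        }
    }

    /// Fan the current time out to every child; each decides if it is due.
    fn handle_timeout(&mut self, now: Instant) {
        self.keepalive.handle_timeout(now);
        self.refresh.handle_timeout(now);
    }
}
```

Adding a third child means one more field and one more line in each method, while the loop driving `Node` stays untouched.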

For example, we've been considering adding a QUIC stack next to the WireGuard tunnels as a control protocol in `snownet`. By using a sans-IO QUIC implementation like quinn, I can do that entirely as an implementation detail because it just slots into the existing state machine, next to ICE & WireGuard.

> At this point if I think I want to write a library that supports both sync and async use cases it feels like feature flags & separate implementations might produce an easier to understand outcome for me -- the sync version can even start as mostly `tokio::runtime::Runtime::block_on`s, and graduate to a more performant version with better custom-tailored efficiency (i.e. busy waiting).

> Of course, I'm not disparaging the type state pattern here/using state machines -- just that I'd probably use that from inside async/sync-gated modules (and be able to share that code between the two impls).

This is what quinn does: It uses tokio + async to expose an API that uses `AsyncRead` and `AsyncWrite` and thus fully buys into the async ecosystem. The actual protocol implementation however - quinn-proto - is sans-IO.

The way I see this is that you can always build more convenience layers; whether or not they live in the same crate doesn't really matter. The key thing is that they should be optional. The problems of function colouring only exist if you don't focus on building the right thing: an IO-free implementation of your protocol. The protocol implementation is usually the hard bit, the one that needs to be correct and well-tested. Integration with blocking or non-blocking IO is just plumbing work that isn't difficult to write.
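As a sketch of what such an optional layer might look like, here is a blocking `read_or_timeout` built on a tiny IO-free core. `Core` and the function signature are assumptions for illustration, not quinn's or firezone's API; an async shell could wrap the same `Core` without touching it.

```rust
use std::io::{self, ErrorKind};
use std::net::UdpSocket;
use std::time::{Duration, Instant};

/// Hypothetical IO-free core: in a real library, the protocol logic lives here.
enum Core {
    Waiting { deadline: Instant },
    Done(Option<Vec<u8>>),
}

impl Core {
    fn handle_input(&mut self, packet: &[u8]) {
        *self = Core::Done(Some(packet.to_vec()));
    }

    fn handle_timeout(&mut self, now: Instant) {
        if let Core::Waiting { deadline } = *self {
            if now >= deadline {
                *self = Core::Done(None);
            }
        }
    }
}

/// Optional blocking convenience layer: plumbing only, no protocol logic.
/// Note that it overwrites the socket's read timeout as a side effect.
fn read_or_timeout(socket: &UdpSocket, timeout: Duration) -> io::Result<Option<Vec<u8>>> {
    let mut core = Core::Waiting { deadline: Instant::now() + timeout };
    let mut buf = [0u8; 1500];
    loop {
        let deadline = match core {
            Core::Done(result) => return Ok(result),
            Core::Waiting { deadline } => deadline,
        };
        // A zero read timeout is rejected by std, so wait at least 1 ms.
        let wait = deadline
            .saturating_duration_since(Instant::now())
            .max(Duration::from_millis(1));
        socket.set_read_timeout(Some(wait))?;

        match socket.recv(&mut buf) {
            Ok(n) => core.handle_input(&buf[..n]),
            Err(e) if matches!(e.kind(), ErrorKind::WouldBlock | ErrorKind::TimedOut) => {
                core.handle_timeout(Instant::now())
            }
            Err(e) => return Err(e),
        }
    }
}
```

Callers who want the simple function get it, and callers who want to compose `Core` into their own event loop can ignore this layer.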

hardwaresofton · 2024-07-04T13:50:42 1720101042

Ahh thanks for clarifying this! Makes a ton of sense now -- I need to try writing some programs in this style (in the high perf Rust style) to see how they feel.

> For example, we've been considering adding a QUIC stack next to the WireGuard tunnels as a control protocol in `snownet`. By using a sans-IO QUIC implementation like quinn, I can do that entirely as an implementation detail because it just slots into the existing state machine, next to ICE & WireGuard.

Have you found that this introduces a learning curve for new contributors? Being able to easily stand up another transport is pretty important, and I feel like I can whip together an async-required interface for a new protocol very easily (given I did a decent job with the required traits and used the typestate pattern), whereas sans-IO might be harder to reason about.

Thanks for pointing out quinn-proto (numerous times at this point) as well -- I'll take a look at the codebase and see what I can learn from it (as well as str0m).

[EDIT]

> The problems of function colouring only exist if you don't focus on building the right thing: an IO-free implementation of your protocol. The protocol implementation is usually the hard bit, the one that needs to be correct and well-tested.

The post, in a couple lines!

[EDIT2] Any good recommendations of a tiny protocol that might be a good walk through intro to this?

Something even simpler than Gopher or SMTP? Would be nice to have a really small thing to do a tiny project in.

wh33zle · 2024-07-04T14:53:18 1720104798

> [EDIT2] Any good recommendations of a tiny protocol that might be a good walk through intro to this?
>
> Something even simpler than Gopher or SMTP? Would be nice to have a really small thing to do a tiny project in.

I only have experience in packet-oriented ones so I'd suggest sticking to that. Perhaps WireGuard could be simple enough? It has a handshake and timers so some complexity but nothing too crazy.

DNS could be interesting too, because you may need to contact upstream resolvers if you don't have something cached.

joshka · 2024-07-05T00:40:27 1720140027

If you want a web protocol, try oauth2. There's complexity in the number of things you can support, but in essence there's a state machine that can be modeled.

hardwaresofton · 2024-07-05T01:06:38 1720141598

Ahh didn't even think of that level of the stack… It is true that the OAuth2 tango can be represented by a state machine…

I’d probably do CAS instead, it’s simpler IMO.

joshka · 2024-07-05T03:06:34 1720148794

The STUN protocol is surprisingly easy to implement, but see my comment at https://news.ycombinator.com/item?id=40879547 about why I'd just use async instead of making my own event loop system.

https://gist.github.com/joshka/af299be87dbd1f64060e47227b577...

hardwaresofton · 2024-07-05T04:21:26 1720153286

Thanks for the code! Going to pore over this.

I read the comment and I definitely agree (though it took me a while to get to where you landed), though I think the sans-IO approach does have some benefits:

- More controllable/easy to reason about cancel safety (though this gets pushed up the stack somewhat). You just can't cancel a thread, but it turns out a ton of places in an async function are cancel points (everywhere you or some function calls .await, most obviously), and that can cause surprising problems.

- Ability to easily slap on both sync and async shells (I personally think it's not unforgivable to smuggle a tokio current thread runtime in as a dep and use block_on for async things internally, since callers are none the wiser)

Great comment though, very succinctly explained what I was getting at... I personally land on the "just make everything async" side of things. Not necessarily everything should be Send + Sync, but similar to Option/Result, I'd rather just start using async everywhere than try to make a sync world work.

There are also libraries like agnostic[0] that make it somewhat easier to support multiple runtimes (though I've done it in the past with feature flags).

> The problem with the approach suggested in the article is that it splits the flow (event loop) and logic (statemachine) from places where the flow is the logic (send a stun binding request, get an answer).

Very concisely put -- if I'm understanding OP's point of view, the answer to this might be "don't make the flow the logic"? Basically, encode the flow as a state machine instead and pass that up to an upper event loop (essentially requiring the upper layer to drive it).

Feels like there are at least 3 points in this design space:

- Sync only state machines (event loop must be at the outermost layer)

- Sync state machines with possibly internal async (event loops could be anywhere)

- Async everything (event loops are everywhere)

[0]: https://crates.io/crates/agnostic

hardwaresofton · 2024-07-05T04:38:07 1720154287

also a bit late, but you've seen anyhow & miette right? noticed the color_eyre usage and was just wondering

joshka · 2024-07-05T04:57:44 1720155464

yep - color_eyre is a better anyhow (and there are plans afoot to merge them into just one at some point[1]). Miette occupies a space that I generally don't need (except when processing data), while color-eyre is in the goldilocks zone.

[1]: https://github.com/eyre-rs/eyre/issues/177

hardwaresofton · 2024-07-05T05:31:52 1720157512

Thanks for the pointer! Rustaceans are spoiled for choice with good error handling and libraries, great to have so many great choices.