Rusty Typestates – Starting Out

nllsh · on Dec 3, 2021

I just read about typestates in Rust on Cliff Biffle's blog here: http://cliffle.com/blog/rust-typestate/#variation-state-type...

He covers a few additional parts of the typestate pattern, such as isolating data in specific states as well as sharing common implementations across a subset of states.

I'd also like to note that typestates also show up in functional programming under Indexed Monads, where a function might take a struct from an initial typestate, unwrap its data, and return a final(likely different) typestate. You can search Indexed Monad for more explanation there. If you work primarily in typescript you can find a production ready implementation of typestate programming here: https://github.com/DenisFrezzato/hyper-ts

Klasiaster · on Dec 3, 2021

Since this often matters when implementing protocols, there is research about session types and verification. Here is a Rust library for session types: https://github.com/Munksgaard/session-types

I believe both have some overlap.

snowytrees · on Dec 3, 2021

The same site has a great introduction (not for rust though) to session types: https://rustype.github.io/notes/notes/session-types.html

thechao · on Dec 3, 2021

My gut feeling is that the combination of typestates and stackless coroutines (really: CSP) would make for a kick-ass driver design. It's pretty solid for the toys I've designed for OpenGLES-2-class drivers. My gut feeling is that a lot of the command-buffer shenanigans in Vulkan, Metal, and DX45035+ come from trying to make reasoning about driver development tractable, while also not making the users' lives awful.

ohazi · on Dec 4, 2021

In Drone<Hovering>, shouldn't land be:

    fn land(self) -> Drone<Idle> {
        Drone::<Idle>::from(self)
    }

instead of:

    fn land(self) -> Drone<Idle> {
        Drone::<Idle>::new()
    }

Otherwise the position assertions in drone_flies fail, because it's looking at the position of a brand new Drone<Idle>, rather than one based on the position you just flew to.

Edit:

Yes it should. [1]

[1] https://github.com/rustype/drone/blob/main/src/drone.rs#L53

afranchuk · on Dec 4, 2021

Typestates are an excellent way to provide stronger compile time guarantees, and I wish I saw them more in the wild (I have used them myself when appropriate, though I was unaware of the terminology).

However, the article mentions:

The attentive reader may have noticed that the consumption and conversion of self into other types implies the values are moved (and in some cases, copied) around.

This is untrue; those moves will definitely be optimized away and the typestates will end up being zero cost.

nextaccountic · on Dec 4, 2021

There's no guarantee that moves will be optimized away. They won't if you compile without the --release flag, for example.

The fact that Rust debug builds are unreasonably slower than C debug builds is a pet peeve of mine. And unoptimized moves are a big part of the problem. Or saying otherwise: "optimizing out" simple moves shouldn't be the job of an optimizer that sometimes doesn't run. It should be always done!

afranchuk · on Dec 5, 2021

I'm sorry, I didn't qualify my statement well. I did mean in release situations when optimizations are enabled. But I'm not sure whether such optimizations are appropriate in debug builds. I suspect there may be cases where that would interfere with debugging. I know one shouldn't necessarily assume certain optimizations will occur, but after all the point of optimizations is to improve generated code while retaining functionality, and the article was stating that the moves will be an issue, which isn't always true (especially not in typical scenarios of using --release code).

However for completeness, below is the article's example, where you can see it inlines pretty much everything as you'd expect. But even if you make it so it won't inline functions (e.g. add println to them) it still optimizes the moves away, which is good.

https://rust.godbolt.org/z/3s1KcPa7d

brundolf · on Dec 4, 2021

I'm pretty sure it doesn't do this right now even in optimized builds: https://stackoverflow.com/a/38571602/11392896

Of course the language semantics do still leave the door open for this optimization to happen some time in the future

afranchuk · on Dec 5, 2021

That answer is from 5.5 years ago. Rust was barely 1.0 then. A lot has changed in 5.5 years. See my godbolt link in the other reply for verification that it does optimize the moves away.

Besides LLVM improvements, I suspect the moves might be optimized out at the MIR level, though that's just a hunch.

comex · on Dec 4, 2021

LLVM has at least two optimizations that can elide moves in different circumstances, MemCpyOpt and SROA. But it depends on the situation. If you call a function that moves its argument to its return value, that move definitely won't get elided if the function is not inlined. But simple functions are likely to be inlined.

nextaccountic · on Dec 4, 2021

> If you call a function that moves its argument to its return value, that move definitely won't get elided if the function is not inlined.

Wow that's worse than I thought. This means that any pub function that does this pattern of taking ownership then giving it back should have at least #[inline] to enable inlining across crates, or maybe even #[inline(always)].

The difference to C to C++ is that in Rust, such pattern is sometimes necessary to work around the borrow checker.

naasking · on Dec 4, 2021

Wouldn't that inhibit debugging which is the whole point of a debug build?

nextaccountic · on Dec 4, 2021

It wouldn't. Moves are just memcpy which has no side effects besides copying something to another address. Eliding moves doesn't break debugging, it actually makes it cleaner IMO. Actual operations on data still happen.

FranchuFranchu · on Dec 4, 2021

I wonder if it would be a feasible to have some sort of internal compile-time variable in each binding that could make compile-time states not have to use hacky methods like this one.

Essentially, instead of the compile-time value attached to a binding only being a type (with generics, optionally), it could also have a regular Rust value attached to it.

naasking · on Dec 4, 2021

> Essentially, instead of the compile-time value attached to a binding only being a type (with generics, optionally), it could also have a regular Rust value attached to it.

You've just reinvented dependent typing and all of its glory and challenges.

ajkjk · on Dec 4, 2021

This is the feature I most want in every language I use. The main usecase is having 'initialized' / 'ready' / 'destroying' states for objects, which tends to be useful in all kinds of gluecode in lots of settings.

adav · on Dec 3, 2021

This approach would be a great addition to the AWS CDK. I was trying that out recently and, without constantly referring back to the AWS console or to the docs, it feels like flying blind.

lliamander · on Dec 3, 2021

Note that you can do something like Typestates in Java using static inner classes that have no public constructors, though undoubtedly there are advantages to having first class language support. Behold:

  Scanner.OpenScanner sOpen = Scanner.open(/*some source*/);
  //do some processing
  Scanner.ClosedScanner sClosed = sOpen.close();
  sClosed.nextLine() //type error

I sometimes do this in combination with the Builder pattern. This "typesafe builder" allows for a fluent API that guides the user as to its use.

The point is, you can still use types to represent states.

tstack · on Dec 3, 2021

Yes, one of the advantage of language support is that "sOpen" will be consumed by the "close()" call and can't be reused. I don't think that's really possible in Java without an extra checker. But, it's usually not a problem in practice if the user sticks to using the API in a fluent manner.

valenterry · on Dec 4, 2021

Instead of trying to create an extra language-feature for that, I like the resource-style-solution. I.e., you don't have a "Scanner", you have "Resource<Scanner>" and you can't use it directly. You have to do this:

    Resource<Scanner> scannerResource = ...
    scannerResource.use( scanner => 
       ... // use the scanner
    )

After the closing parenthesis the .close() is called automatically.

This allows for a lot of nice things, such as providing methods for stacking and combining resources, thus making sure they automatically close in the correct order. Callers never have to care about closing resource explicitly. Error handling is much easier to do.

Etc.

lliamander · on Dec 4, 2021

I was just thinking about this earlier today! But then isn't that just the same thing as try-with-resouces?

valenterry · on Dec 4, 2021

Very similar but not the same. try-with-resources is essentially a "hack" that tries to emulate what I described (and it mostly succeeds with it). That way, the ecosystem doesn't have to be changed. "My" solution requires that libraries use/support this pattern, and it also requires the language to have nice syntax, something that Java isn't very good at.

ajkjk · on Dec 4, 2021

I read that pseudocode and all I think is: this sounds like it's making up for a missing language feature.

valenterry · on Dec 4, 2021

Well, it's pseudocode, but it's close enough.

Here's a runnable example: https://scastie.scala-lang.org/5klYiw6cT7yckJbn2zGmJg

The gist is: this does everything needed for resources, including preventing errors at compile-time, being composable/reusable and handling errors correctly (you can throw an exception in the middle and the scanners will still be closed).

That being said, having it as a language feature might make certain use-cases nicer and more ergonomic. However, it also makes the language much more complex; I don't think resources alone are a sufficient reason to fundamentally change the language.

One thing that Rust has gotten fairly right so far is not having added too many special cases and rely on general language features mostly. For example, don't have special syntax/handling for optional/nullable values, but just use the existing macrosystem to deal with it. I hope this philosophy will continue.

Rust is here to stay and these kind of decisions make a big difference in the long run.

ketralnis · on Dec 3, 2021

Is sOpen.nextLine() after the close also a compile time error?

lliamander · on Dec 3, 2021

No. That's a good point and one of those things that having built-in language support helps. But if you are using method chaining (i.e. fluent API) that matters less.

baby · on Dec 4, 2021

I think they should have started with the poor man typestate, which is to create a type per state (can use wrapper types if needed)

brunoqc · on Dec 3, 2021

I wonder how it improved since the last article.

solmag · on Dec 4, 2021

I believe there is a library that can generate type-state machines for you in Rust, ie. you don't have to code it by hand.

baby · on Dec 4, 2021

I think you're talking about what Pin does? But that's unrelated to these custom state machines.

solmag · on Dec 5, 2021

https://github.com/eugene-babichenko/rust-fsm - I think I meant this.