Rust traits for developer friendly libraries

ndarilek · on May 25, 2015

I've been using Scala for years and have been eying Rust for a while. Nice to see lots of the things I like from Scala carry over. In particular this seems like a less magical version of Scala implicits, which seem incredibly cool at first until you realize that a particular library or framework implements implicits for everything, and tracking down the source for a given function involves guessing the signature or chasing down an implicit conversion chain that gets you to one.

One thing I'm not sure about, though: the article talks about implementing the Into trait, then quickly segues over to From. When would I use Into, when From, and do they both lead to the same end (i.e. a function that takes Into<Whatever>)? I've looked at the docs for each, and maybe I just haven't had enough coffee yet, but the distinction isn't too clear.

steveklabnik · on May 25, 2015

> When would I use Into, when From

The standard library contains this code:

    // From implies Into
    #[stable(feature = "rust1", since = "1.0.0")]
    impl<T, U> Into<U> for T where U: From<T> {
        fn into(self) -> U {
            U::from(self)
        }
    }

So, you basically only ever implement From, and you get the equivalent Into for free.
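A minimal sketch of what "for free" means here, assuming two made-up unit types (`Meters` and `Millimeters` are hypothetical and not from the article). Only `From` is written by hand; the `.into()` call resolves through the blanket impl quoted above.

```rust
// Hypothetical unit types, for illustration only.
#[derive(Debug, PartialEq)]
struct Meters(u32);

#[derive(Debug, PartialEq)]
struct Millimeters(u32);

// Only From is implemented by hand.
impl From<Meters> for Millimeters {
    fn from(m: Meters) -> Millimeters {
        Millimeters(m.0 * 1000)
    }
}

fn main() {
    // Explicit conversion through From.
    let a = Millimeters::from(Meters(2));
    // Into<Millimeters> for Meters comes from the standard library's blanket impl.
    let b: Millimeters = Meters(2).into();
    assert_eq!(a, b);
    assert_eq!(a, Millimeters(2000));
}
```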

More specifically, you would usually use `Into` to bound a generic function:

    fn is_hello<T: Into<Vec<u8>>>(s: T) {
        let bytes = b"hello".to_vec();
        assert_eq!(bytes, s.into());
    }
    
    let s = "hello".to_string();
    is_hello(s);

Whereas you'd use From to perform a conversion:

    let string = "hello".to_string();
    let other_string = String::from("hello");
    
    assert_eq!(string, other_string);

These APIs are pretty new; they landed _right_ before beta happened. We used to have FromFoo traits, and this generic system replaced them.

jimmyhmiller · on May 26, 2015

Maybe a bit off topic.

I've always heard that traits in rust are basically just type classes in haskell. But I'm not quite sure how you would implement this trait as a typeclass.

Specifically, from what I'm seeing here is that if you just implement From you've also now implemented Into. Is there an analog in haskell? I also can't figure out how to specify a type class that is parametrized the way into is.

Can anyone help by shedding some light on how you might go about this in haskell? Or where traits in rust differ from typeclasses in haskell?

rapala · on May 26, 2015

Standard Haskell has only single parameter type classes. With MultiParamTypeClasses you can define

  class Into a b where
    into :: a -> b

Note that From would be the exact same class. I guess the distinction in Rust is relevant because of ownership.

rbalicki · on May 25, 2015

Thank you! This is the best explanation I've been able to find for how to use From, instead of Into.

One thing that's useful about From is that you can implement From<SomePublicStruct> for YourPrivateStruct, but the compiler complains about Into<YourPrivateStruct> for SomePublicStruct.

SomePublicStruct could be (i32, i32) and YourPrivateStruct could be MyPoint, for example.

Making YourPrivateStruct public (with pub struct) fixes things, but that's not always what you want.
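Roughly what that looks like, as a sketch: `MyPoint` and the tuple are the names used above, while the field layout and the `manhattan` helper are assumptions made up for the example.

```rust
// Private struct; the x/y fields are an assumed layout.
#[derive(Debug, PartialEq)]
struct MyPoint {
    x: i32,
    y: i32,
}

// From<public type> for the private type.
impl From<(i32, i32)> for MyPoint {
    fn from(pair: (i32, i32)) -> MyPoint {
        MyPoint { x: pair.0, y: pair.1 }
    }
}

// Hypothetical private helper that accepts anything convertible to MyPoint.
fn manhattan<P: Into<MyPoint>>(p: P) -> i32 {
    let p: MyPoint = p.into();
    p.x.abs() + p.y.abs()
}

fn main() {
    assert_eq!(MyPoint::from((3, -4)), MyPoint { x: 3, y: -4 });
    // A tuple goes straight in; MyPoint itself also works via the identity conversion.
    assert_eq!(manhattan((3, -4)), 7);
    assert_eq!(manhattan(MyPoint { x: 1, y: 1 }), 2);
}
```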

steveklabnik · on May 25, 2015

Any time. I myself checked in with IRC as I posted. :) Like I said, these traits are really new, and so they don't have great docs yet...

novocaine · on May 25, 2015

I still have the scars from libraries implementing implicit conversions in c++.

The question is - for people reading the code at the call site, how easy is it to grep for what's actually happening?

I guess this rust trait at least hangs off the geobox class, but I think I might prefer the explicit ctor for non write-only code.

steveklabnik · on May 25, 2015

Coherence helps with this: you can only implement a trait for a type if you've defined either the trait or the type. So a third party library can't `impl From<MyType> for YourType`, for example.

In general, I find Rust pretty grep-able, though I'm a bit biased.

jblow · on May 25, 2015

This way of doing things sure sounds like it has massive performance implications.

arthursilva · on May 25, 2015

Those are very lightweight abstractions as the compiler will generate optimized code for each variant separately. That's the entire point of Rust Trait abstractions.

jblow · on May 25, 2015

I am talking about the actual conversion of an unowned object to an owned object, which as nearly as I can tell involves copying the object? Implicitly? All the time?

pcwalton · on May 25, 2015

Well, it's not implicit. It's explicit both inside the function body (".into()") and in the function signature (where "into" alerts the user that something may be going on). You can also avoid the copy by just supplying String in the first place, in which case it gets moved (avoiding the allocation), not copied.
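To make the move-versus-copy point concrete, here is a small sketch. The `keep` function is hypothetical (it stands in for any API that needs to hold on to the string); the check compares heap buffer addresses before and after the call.

```rust
// Hypothetical API that keeps the string, so it asks for ownership.
fn keep<S: Into<String>>(s: S) -> String {
    s.into()
}

fn main() {
    // Passing a String: the value is moved, so the heap buffer is reused.
    let owned = String::from("hello");
    let buffer = owned.as_ptr();
    let kept = keep(owned);
    assert_eq!(kept.as_ptr(), buffer);

    // Passing a &str: a new String is allocated and the bytes are copied.
    let borrowed: &str = "hello";
    let copied = keep(borrowed);
    assert_ne!(copied.as_ptr(), borrowed.as_ptr());
    assert_eq!(copied, "hello");
}
```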

jblow · on May 25, 2015

I see. Having to say .into() makes me feel a little better about it. But it does make it clear there is a runtime performance cost to insisting on a strict ownership model.

pcwalton · on May 25, 2015

If the API has to hang onto the string, sure. But not all APIs have to do that. The only reason why the Elasticsearch API requires owned strings in this case is that the JSON API does, and this is a trait that many JSON APIs share, regardless of language. (For example, the first Google result for "c json api" pulls up this API [1], which also copies strings.)

You could write a JSON API that doesn't insist on a strict ownership model, if you wanted. There is even a type, MaybeOwned [2], in the standard library to support this kind of API. In such a library, the JSON type would have a lifetime parameter, which could be 'static for owned strings, but which could be non-static for JSON types that contain references to strings.

[1]: https://jansson.readthedocs.org/en/2.7/apiref.html#string

[2]: http://doc.rust-lang.org/0.11.0/collections/str/type.MaybeOw...

dbaupp · on May 26, 2015

Incidentally, MaybeOwned is now written `Cow<str>`. http://doc.rust-lang.org/std/borrow/enum.Cow.html
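A sketch of the kind of API described above, assuming a made-up `JsonString` type with a lifetime parameter (this is not taken from any real JSON crate). Borrowed input is stored as a reference, owned input is moved in, and neither path copies the bytes.

```rust
use std::borrow::Cow;

// Hypothetical JSON-like string value; not from a real library.
#[derive(Debug)]
struct JsonString<'a> {
    value: Cow<'a, str>,
}

impl<'a> JsonString<'a> {
    // Accepts &'a str (borrowed) or String (owned) without copying either.
    fn new<S: Into<Cow<'a, str>>>(s: S) -> JsonString<'a> {
        JsonString { value: s.into() }
    }

    fn is_borrowed(&self) -> bool {
        matches!(self.value, Cow::Borrowed(_))
    }
}

fn main() {
    let text = String::from("hello");

    // Non-static lifetime: the value only borrows from `text`.
    let borrowed = JsonString::new(text.as_str());
    assert!(borrowed.is_borrowed());

    // Owned data can use the 'static lifetime.
    let owned: JsonString<'static> = JsonString::new(String::from("hello"));
    assert!(!owned.is_borrowed());

    assert_eq!(borrowed.value, owned.value);
}
```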

Rusky · on May 25, 2015

The only runtime performance cost here is one you would need regardless for correctness: you only need to copy if you're going to hold onto the data longer than a borrow would allow.

In fact, Rust's semantics allow fewer copies than naive use of std::string, since an already-owned value won't be copied, as mentioned above.

Manishearth · on May 26, 2015

I don't see what you mean. Borrowed pointers (and `AsRef`) work fine in Rust.

In fact, Rust's borrowing model provides some perf gains easily where it would be hard to do so in C++ (http://manishearth.github.io/blog/2015/05/03/where-rust-real...)

steveklabnik · on May 25, 2015

I would be really interested in hearing you elaborate on specifics here, possibly even with code.

(Also, I can't wait until I can download Jai and give it a go, keep putting out videos.)

jblow · on May 25, 2015

Well, keeping in mind that I don't know much of the specifics of Rust, and am just making a guess at what it's like to use, this is what I mean:

Actually predicting where data is really going to go involves solving the halting problem. So by necessity any static analysis of ownership is going to be conservative, in the sense that it has to err on the side of safety.

So there's a process of structuring things so that it's not just the programmer who understands, but the compiler who understands. Structuring the code in alternative ways so that ownership is made clear and/or ambiguous cases are resolved. Sometimes this could be a small amount of work, but sometimes it could be a very large amount of work (analogous to the simpler situation in C++ where you are using const everywhere but need to change something deep down in the call tree and now either everyone has to lose their consts or you have to do unsavory things).

At points, it might be possible to structure things so that the compiler would understand them and let you do it, but it would take a large amount of refactoring that one doesn't want to do right now (especially if one has had a few experiences of starting that refactor and having it fail), so instead one might punt and just say "this parameter is owned, problem solved". And that's great, you can stop refactoring, but you just took a performance hit.

Now, in some cases it is probably the case that this is in reality an ambiguous and dangerous ownership situation and the language just did you a favor. But there are also going to be cases where it's not a favor, it's just the understanding of ownership being conservative (because it has to be), and therefore diverging from reality. But I want to get work done today so I make the parameter owned, now the compiler is happy, but there is a performance cost there. If I were not so eager on getting work done today, I might be able to avoid this performance hit by wrestling with the factoring. But I might deem the cost of that prohibitive.

That's all I mean. But like I said, I have never written a large program in Rust so I am not speaking from experience.

jblow · on May 25, 2015

(And really my point is that I perceive there is an ambient pressure toward copying function parameters in general in order to minimize refactoring ... which is what I mean by there being an overall performance impact).

steveklabnik · on May 25, 2015

Thanks for elaborating, this was really helpful.

I don't know if I agree, per se, but I will say that Rust is still in such early stages that we'll be seeing how it shakes out as more people work with Rust. I haven't found this personally to be an issue, but I've also been doing Rust so long that it's hard to see things from a fresh perspective, you know?

Manishearth · on May 26, 2015

I don't think there is. It's quite common to use references as usual, and unnecessary param copying happens pretty rarely in most Rust code.

pcwalton · on May 26, 2015

I think there is, honestly—the JSON API is a legitimate example of that—but I think it's no worse in Rust than in C, where equivalent pressure already exists.

Dewie3 · on May 25, 2015

Invoking The Halting Problem is a pretty big sledge hammer. Especially for a language that you claim you don't know much (of the specifics) about. And your argument is so broad and without specifics that it just boils down to "expressing things in such a way that the compiler believes you", with no mention of Rust except that it might cost some performance if you are not able to communicate this. But how hard will it really be to communicate? All semantic properties are not created equal across languages -- some languages make a set of them easy to check in all realistic cases, others very hard or impractical.

For instance, there is definitely a concrete conversation to be had about the ownership model and implementing data structures -- you even have to use `unsafe {}` for implementing something as simple as a doubly linked list[1]. So that's a concrete example of how single-ownership makes something conceptually simple and safe hard to express in safe code. But in this case, there is so much vagueness about the supposed "massive performance implications" (HP + laziness = massively inefficient) that it comes off as trying a bit hard to... let's just say "to be negative".

> (And really my point is that I perceive there is an ambient pressure toward copying function parameters in general in order to minimize refactoring ... which is what I mean by there being an overall performance impact).

Well admittedly this argument is more concrete.

[1] But that's just a burden for the implementer, since you can expose a safe interface to the data structure.

jblow · on May 25, 2015

This is a waste of time. I am out of here.

tatterdemalion · on May 26, 2015

Rust also has the AsRef<T> trait, which converts to a reference, hopefully without copying. Functions which do not need to take ownership of the object should take an AsRef<T> instead of an Into<T>.
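For comparison with the `Into<String>` style, a minimal sketch of a function that only needs to read its argument. `count_words` is a hypothetical name; `AsRef<str>` is implemented in the standard library for `str`, `String`, and references to them.

```rust
// Hypothetical read-only function: it never needs ownership.
fn count_words<S: AsRef<str>>(s: S) -> usize {
    s.as_ref().split_whitespace().count()
}

fn main() {
    let owned = String::from("borrowed pointers work fine");

    // Passing a reference: no copy, and `owned` stays usable afterwards.
    assert_eq!(count_words(&owned), 4);
    assert_eq!(owned.len(), 27);

    // String literals and owned Strings are accepted too.
    assert_eq!(count_words("two words"), 2);
    assert_eq!(count_words(owned), 4);
}
```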

MichaelGG · on May 25, 2015

Only if the user is providing borrowed strings in the first place, in which case, there's not much choice. If the user can provide an owned string, it'll just use that, no copy.

SamReidHughes · on May 25, 2015

That's no different from having a C++ function that takes a std::string, or overloads for const std::string& and std::string&&, instead of only accepting one. Only you have to go out of your way to have Into<String> to do it. And we're talking about an API to make queries to send remotely.

Manishearth · on May 26, 2015

Overloading doesn't scale when there are tons of ways of making a `String`. I don't see why `Into<String>` is "out of your way"; it's actually less verbose than overloading.

SamReidHughes · on May 26, 2015

Into<String> is "out of your way" compared to having a String parameter in Rust. (Thus the person writing the code is expressly asking for the interface to allow implicit copying.) It is not being compared to C++ overloading. (The point is, this is more "out of your way" than you have to go in C++, where using an unoverloaded argument type of std::string will get you copying with opportunistic moving, and where a const std::string& will get you copying somewhere on the inside of the function. Thus Rust protects you from accidental expensive copies.)

jblow · on May 25, 2015

Yeah, and this is one of the many reasons why performance programmers usually do not touch std::string.

SamReidHughes · on May 25, 2015

Yeah, they might want some sort of string type that you can't implicitly copy, like Rust's.

steveklabnik · on May 25, 2015

Rust's String does not implement Copy. &str does. (That means Strings aren't ever implicitly copied.)

SamReidHughes · on May 25, 2015

That's what I said.

steveklabnik · on May 25, 2015

Ah, now that you say that, I see that my brain produced a _very_ poor parse. Sorry :(

SamReidHughes · on May 25, 2015

I was speaking in TI-85 and you were reading in TI-83 :)

nightpool · on May 25, 2015

As another commenter implies, trait methods are statically dispatched, so there's no indirection, and very little runtime overhead[0]

[0] I'm not that great at Rust, so if there's some hidden cost to this beyond just the conversion methods themselves, someone please correct me.

Tyr42 · on May 25, 2015

There are two ways of being polymorphic over traits in Rust, using static dispatch or boxed traits.

Static dispatch is very analogous to C++ templates (with the not-yet-standardized "Concepts" playing the role of traits), so you get a copy of the function specialized for each type. (We get nicer error messages than templates in C++ because we require you to declare up front what methods you are expecting the type to have at function definition time instead of checking at specialization time that everything is defined.) There is no runtime cost to using a statically dispatched trait over a hand specialized version of the function.

The other method, boxed traits, is very similar to vtables. It has some runtime overhead: since the size of the type is not known, you must have a pointer to it (hence "boxed"). I think Rust currently uses fat pointers for this, that is, a pair of pointers, one to the object and the other to the vtable. Since you can add new instances to types in other crates, it'd be tricky to have a complete vtable for all methods of all the traits the type implements in one place.
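A sketch of the two styles side by side, using a made-up `Shape` trait (all names here are hypothetical). It uses the later `dyn` spelling for trait objects, and the pointer-size checks reflect how current rustc lays out references rather than a documented language guarantee.

```rust
use std::mem::size_of;

// Hypothetical trait and types, for illustration only.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// Static dispatch: the compiler generates one copy per concrete S.
fn area_static<S: Shape>(s: &S) -> f64 {
    s.area()
}

// Dynamic dispatch: one copy, with the method looked up through a vtable.
fn area_dynamic(s: &dyn Shape) -> f64 {
    s.area()
}

fn main() {
    assert_eq!(area_static(&Square(2.0)), 4.0);
    assert_eq!(area_dynamic(&Rect(2.0, 3.0)), 6.0);

    // Boxed trait objects let different types share one collection.
    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Rect(2.0, 3.0))];
    let total: f64 = shapes.iter().map(|s| s.area()).sum();
    assert_eq!(total, 10.0);

    // A trait-object reference is a fat pointer: data pointer plus vtable pointer.
    assert_eq!(size_of::<&dyn Shape>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<&Square>(), size_of::<usize>());
}
```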

jdub · on May 25, 2015

Hmm. Can you...

  impl From<T> for GeoBox

... outside the module/crate in which GeoBox is defined (as suggested at the end)?

benashford · on May 25, 2015

Yes, as long as the implementation is in the same module/crate as the T.

It's only where both the left and right-hand side are in external crates that would be a problem.

jdub · on May 25, 2015

Epic, thanks!

untothebreach · on May 25, 2015

Yep, that is an advantage rust traits have over, say, java interfaces. (Though I hear java 8 is going to be able to do this as well)

twic · on May 25, 2015

Java 8 is out now and doesn't have this. I'm not aware of anything like this being on the cards for Java 9. Could you tell us more about what you heard?