How I bail myself out of Rust lifetime problems as somebody who probably learned...

insanitybit · on Dec 5, 2023

> who probably learned Rust the wrong way (by just trying to build stuff as if it were C/node.js and run into problems instead of slowing down + reading):

This is the right way to learn. I'm quite familiar with lifetimes and whatnot now, but I didn't bother with them much at all when learning; I just "clone"'d.

This allowed me to learn 95% of the language, and then once I did that, learn the last 5% (lifetimes).

Highly recommend.

inferiorhuman · on Dec 5, 2023

I think there's a lot to be gained by not writing Rust like C, to the extent that it might be worth taking some time to pick up another language (maybe a lisp variant?) first.

2.) Be careful because &String is not &str, although in many cases you can pretend thanks to the magic of AsRef/Deref.
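A minimal sketch of where the pretending works and where it stops, assuming two hypothetical helpers (`shout` and `byte_len`) that are not from the thread:

```rust
// Hypothetical helper: takes the more general `&str`.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

// Hypothetical generic helper: accepts anything that can be viewed as a `str`.
fn byte_len<T: AsRef<str>>(s: T) -> usize {
    s.as_ref().len()
}

fn main() {
    let owned = String::from("hello");

    // `&String` coerces to `&str` here because `String` derefs to `str`.
    println!("{}", shout(&owned));

    // `AsRef<str>` is implemented for both `String` and `str`.
    println!("{}", byte_len(&owned));
    println!("{}", byte_len("hello"));

    // The coercion does not reach inside other types: an `Option<&String>`
    // is not an `Option<&str>`, so convert explicitly with `as_deref`.
    let maybe: Option<String> = Some(owned);
    let borrowed: Option<&str> = maybe.as_deref();
    println!("{:?}", borrowed);
}
```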

4.) If you find yourself calling is_none, rethink things a bit. Pattern matching and the various destructuring (e.g. if let) features are some of the most powerful tools at your disposal with Rust. This is where experience with something else first (e.g. Elixir) can be helpful.
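As a rough illustration of matching instead of calling is_none, here is a sketch with a hypothetical `find_port` lookup (the names and data are made up for the example):

```rust
// Hypothetical lookup that may not find anything.
fn find_port(config: &[(&str, u16)], key: &str) -> Option<u16> {
    config.iter().find(|(k, _)| *k == key).map(|(_, port)| *port)
}

// Rather than `if x.is_none() { ... } else { x.unwrap() ... }`,
// match on the Option so both cases are handled in one place.
fn describe(config: &[(&str, u16)], key: &str) -> String {
    match find_port(config, key) {
        Some(port) => format!("{} listens on {}", key, port),
        None => format!("{} is not configured", key),
    }
}

fn main() {
    let config: [(&str, u16); 2] = [("http", 80), ("https", 443)];

    // `if let` is handy when only the `Some` case matters.
    if let Some(port) = find_port(&config, "https") {
        println!("https port: {}", port);
    }
    println!("{}", describe(&config, "ftp"));
}
```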

> I rarely introduce a lifetime to a struct/impl.

IMO the big use case here is &str.

> Arc kind of bails you out of a lot

While that's totally reasonable it's good to remember that you're essentially trading compile time guarantees for runtime ones. If you can figure out how to beat the borrow checker into submission that's one less thing to debug at runtime.

> primitive like u32 and it's borrowed and you had to dereference it or clone it.

The primitive types should all implement Copy which means you should (almost?) never have to explicitly clone them. Dereferencing is another story tho.
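A small sketch of the Copy point, using hypothetical functions `double` and `sum_doubled`: dereferencing a borrowed u32 copies it, so no clone() is needed.

```rust
// Hypothetical function that takes a `u32` by value.
fn double(n: u32) -> u32 {
    n * 2
}

fn sum_doubled(values: &[u32]) -> u32 {
    let mut total = 0;
    for v in values {
        // `v` is a `&u32`. Because `u32` is `Copy`, `*v` just copies the
        // number out of the reference; the slice is left untouched.
        total += double(*v);
    }
    total
}

fn main() {
    let values = [1, 2, 3];
    println!("{}", sum_doubled(&values));
    // `values` is still usable here: nothing was moved out of it.
    println!("{:?}", values);
}
```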

lmm · on Dec 5, 2023

> maybe a lisp variant?

Nah, go with an ML-family language. Maybe even Standard ML, because it will nudge you away from writing "C in ML" and encourage you to pick up the idiomatic way of doing things. (Lawrence Paulson's book has an online version available for free on his homepage.)

girvo · on Dec 5, 2023

SML is great, but I always suggest OCaml. Still my favourite language that I never get to write these days!

lmm · on Dec 5, 2023

For practical programming I'd probably agree, but if the point is to learn a non-Algol way of thinking then I think SML is a better way to go; OCaml makes it easier to write imperative-style code, for better and for worse.

girvo · on Dec 7, 2023

Yeah that's a fair point!

packetlost · on Dec 5, 2023

This is how my team writes production Rust code. Knowing which one to use and when is important, but there's nothing wrong with using the tools available to you.

Non-lexical lifetimes are, in my experience, pretty uncommon in most non-library code. You don't really need them until you really need them.

inejge · on Dec 5, 2023

> Non-lexical lifetimes are, in my experience, pretty uncommon in most non-library code.

To avoid confusing the newcomers: lifetimes are always non-lexical (see [1] for the pedantic details.) I suppose you meant that explicit lifetime annotations are pretty uncommon, which is not wrong.

[1] https://blog.rust-lang.org/2022/08/05/nll-by-default.html

packetlost · on Dec 5, 2023

Oh yeah, wrong terminology. My bad!

ForkMeOnTinder · on Dec 5, 2023

Good advice, though I'd recommend Rc<RefCell<T>> instead of Arc<Mutex<T>> if you're not sharing the data between threads, to avoid synchronization overhead. I use Arc pretty rarely.
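For anyone who hasn't seen the single-threaded variant, a minimal sketch (the `shared_count` function is hypothetical) of two handles mutating the same value through Rc<RefCell<T>>:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical example: two owners of one counter, no threads involved.
fn shared_count() -> u32 {
    let counter = Rc::new(RefCell::new(0u32));
    // Cloning the `Rc` bumps a non-atomic reference count.
    let handle = Rc::clone(&counter);

    // Each `borrow_mut` is checked at runtime and released at the end
    // of its statement.
    *handle.borrow_mut() += 1;
    *counter.borrow_mut() += 1;

    let n = *counter.borrow();
    n
}

fn main() {
    println!("{}", shared_count());
}
```

Swapping Rc for Arc and RefCell for Mutex gives the thread-safe version with the same overall shape.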

gnulinux · on Dec 5, 2023

The overhead of an uncontested lock is not much more than a memory operation, and it lets you use the same code in a threaded context such as tokio async, which is a huge benefit. Unless you need the optimization (i.e. you profiled and determined that Arc in a hot loop is slowing you down) I think it's fine to use Arc in general.

spacechild1 · on Dec 5, 2023

> The overhead of an uncontested lock is not much more than a memory operation

An atomic memory operation! These can be orders of magnitude slower than regular memory operations.

gpderetta · on Dec 5, 2023

An atomic read-modify-write. Atomic non-seq-cst load/stores can be cheap.

/Overly pedantic

spacechild1 · on Dec 5, 2023

> An atomic read-modify-write.

No, this also applies to (non-relaxed) atomic loads and stores, depending on the platform.

> Atomic non-seq-cst load/stores can be cheap.

Relaxed atomic loads and stores are always cheap, but anything above requires additional memory order instructions on many platforms, most notably on ARM.

Here we are talking specifically about mutexes, which follow acquire release semantics.

To be clear: locking an uncontended mutex is indeed much, much cheaper than an actual call into the kernel, but it is not free either.
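To make the orderings being discussed concrete, here is a sketch using Rust's std atomics. The function name `publish_and_read` is made up, and it only shows where Relaxed and Acquire/Release appear in code; it says nothing about what they cost on a given platform.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical example: one thread publishes a value, another reads it.
fn publish_and_read() -> u32 {
    let data = Arc::new(AtomicU32::new(0));
    let ready = Arc::new(AtomicBool::new(false));
    let (d, r) = (Arc::clone(&data), Arc::clone(&ready));

    let producer = thread::spawn(move || {
        // Relaxed store: atomic, but imposes no ordering by itself.
        d.store(42, Ordering::Relaxed);
        // Release store: makes the write above visible to any thread
        // that observes `true` with an Acquire load.
        r.store(true, Ordering::Release);
    });

    // Acquire load pairs with the Release store in the producer.
    while !ready.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    let value = data.load(Ordering::Relaxed);

    producer.join().unwrap();
    value
}

fn main() {
    println!("{}", publish_and_read());
}
```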

gpderetta · on Dec 5, 2023

Ok, technically we both used the weasel word 'can' so we are both right.

But even on ARM, these days store releases and load acquires, while not as free as on x86, are very cheap.

To make my statement more precise, typically what is still expensive pretty much everywhere is anything with #StoreLoad barrier semantics, which is what you need to acquire a mutex.

afdbcreid · on Dec 5, 2023

`RefCell` does have one big advantage, though: it'll panic instead of deadlock for reentrant borrow.
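A sketch of that difference using the non-blocking variants, so the example neither panics nor hangs (both function names are hypothetical). A plain `borrow_mut()` in the first function would panic; the std docs leave a same-thread second `lock()` unspecified beyond saying it will not return, so it might panic or deadlock.

```rust
use std::cell::RefCell;
use std::sync::Mutex;

// Hypothetical example: a second mutable borrow while the first is alive.
fn reentrant_refcell_is_detected() -> bool {
    let cell = RefCell::new(vec![1, 2, 3]);
    let _first = cell.borrow_mut();
    // `try_borrow_mut` reports the conflict as an `Err` instead of panicking.
    let conflict = cell.try_borrow_mut().is_err();
    conflict
}

// Hypothetical example: trying to take a lock this thread already holds.
fn reentrant_mutex_would_block() -> bool {
    let mutex = Mutex::new(0);
    let _guard = mutex.lock().unwrap();
    // `try_lock` returns an error rather than waiting for the lock.
    let blocked = mutex.try_lock().is_err();
    blocked
}

fn main() {
    println!("RefCell conflict detected: {}", reentrant_refcell_is_detected());
    println!("Mutex would block: {}", reentrant_mutex_would_block());
}
```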

jlffwoymasdf · on Dec 5, 2023

tokio is so widespread now that Arc<Mutex<T>> is coincidentally the right choice.

I'm not saying that's a good thing.

ben-schaaf · on Dec 5, 2023

Doesn't tokio have a single-threaded runtime where that's not needed?

zackangelo · on Dec 5, 2023

Yes but Send + Sync is required everywhere regardless.

vgatherps · on Dec 5, 2023

This is not true, you can run non-send futures using Tokio: https://docs.rs/tokio/latest/tokio/task/struct.LocalSet.html

packetlost · on Dec 5, 2023

Eh, I don't think the overhead of an uncontested lock acquire is all that much.

Georgelemental · on Dec 5, 2023

Using `clone` etc when it's easier is actually common advice, it's perfectly OK to not have borrowing everywhere if you don't need it. My usual starting point / default / rule of thumb is to take references as parameters and return owned values (for example, take `&str` as an argument, return an owned `String`).
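A minimal sketch of that rule of thumb, with a hypothetical `greeting` function: borrow on the way in, own on the way out, and no lifetime annotations are needed.

```rust
// Hypothetical function: takes a borrowed `&str`, returns an owned `String`.
fn greeting(name: &str) -> String {
    format!("Hello, {}!", name.trim())
}

fn main() {
    let owned = String::from("  Ferris ");
    // Works with a borrowed `String` as well as with a string literal.
    println!("{}", greeting(&owned));
    println!("{}", greeting("world"));
}
```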

inferiorhuman · on Dec 5, 2023

IMO stringy data is a good example of where you should think about what you're returning in part because common APIs (e.g. regex) will take a more nuanced approach.

If you're creating a new string, then sure return String. But if you have a path where you could just return the original input, consider returning a Cow<str>.
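A sketch of the Cow<str> idea with a hypothetical `normalize_spaces` function: it allocates only when it actually has to change the input.

```rust
use std::borrow::Cow;

// Hypothetical function: replaces tabs with spaces, borrowing when possible.
fn normalize_spaces(input: &str) -> Cow<'_, str> {
    if input.contains('\t') {
        // A new string is needed, so return an owned value.
        Cow::Owned(input.replace('\t', " "))
    } else {
        // Nothing to change: hand back the original input without copying.
        Cow::Borrowed(input)
    }
}

fn main() {
    for input in ["no tabs here", "one\ttab"] {
        match normalize_spaces(input) {
            Cow::Borrowed(s) => println!("borrowed: {}", s),
            Cow::Owned(s) => println!("owned: {}", s),
        }
    }
}
```

Callers that just need a &str can deref the result; callers that need a String can call into_owned().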

chrismorgan · on Dec 5, 2023

burntsushi actually regrets making regex replace return a Cow<str>: https://github.com/rust-lang/regex/issues/676#issuecomment-6.... I’m glad it does, and wish it took an impl Into<Cow<str>> there, for the reasons discussed in the issue, but burntsushi has a lot more experience of the practical outcomes of this. Just something more to think about.

inferiorhuman · on Dec 5, 2023

So from reading those comments, I'd come to the opposite conclusion: Cow<str> is absolutely the right choice and perhaps String should really have been Cow<str>.

As for taking an impl Into, burntsushi linked to a Rust playground demonstrating where that approach falls down. In general (heh), taking arguments, especially options or stringy ones, that are generalized over an Into impl is one of those things that seems real nice at first but gets real unpleasant pretty quick IMO.

burntsushi · on Dec 5, 2023

In my defense, I said I occasionally regret the choice. But in rebuttal, I certainly do not have your confidence that returning Cow<str> is the right choice. Basically, when it comes down to it, I'm not 100% convinced that it pulls its weight. But like I said in the issue, it's decently motivated.

I don't think String could be a Cow<str>. Remember, Cow<str> is actually a Cow<'a, str>, and if you want to borrow a &str from a Cow, the lifetime of that &str is not 'a, but rather, attached to the Cow itself. (This is necessary because the Cow<str> may contain an owned String.) This in turn would effectively kill string slicing.

In order for something like Cow<str> to be the default, you need more infrastructure. Maybe something like hipstr[1]. It is a nice abstraction, but not one that would be appropriate for std.

[1]: https://docs.rs/hipstr/latest/hipstr/

chrismorgan · on Dec 5, 2023

(Sorry I missed the word “occasionally” there!)

brundolf · on Dec 5, 2023

This is totally valid. Lifetimes can be an optimization you go back and add later as needed; in structs, especially, they require a ton of code changes to add and a ton of code changes to change your mind about later, so they should be used judiciously.

Other things I would add to this list:

- For structs that don't own anything on the heap, derive Copy. Then you can just pass them around willy-nilly without explicit clone()s

- Using a functional style where it makes sense to helps a lot; it can be really easy to pass things by reference when they only need to be temporarily, immutably used by that one function. And if you make Copyable structs, you can pass owned structs and return owned structs and not worry about any of it
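A sketch of both points, using a hypothetical `Point` struct: with Copy derived, passing and returning owned values needs neither references nor clone().

```rust
// Hypothetical struct that owns nothing on the heap, so it can be `Copy`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

// Takes an owned `Point` and returns a new owned `Point`.
fn translate(p: Point, dx: f64, dy: f64) -> Point {
    Point { x: p.x + dx, y: p.y + dy }
}

fn main() {
    let origin = Point { x: 0.0, y: 0.0 };
    let moved = translate(origin, 1.0, 2.0);
    // `origin` was copied into `translate`, not moved, so it is still usable.
    println!("{:?} -> {:?}", origin, moved);
}
```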

zaphirplane · on Dec 5, 2023

I know there was a thread involving a Rust team member saying that clone / to_owned is OK to start with. The memory copying just nags at me and distracts me from moving on.

burntsushi · on Dec 5, 2023

Yes, it absolutely is.

And it's even okay beyond just starting with.

Search the regex crate repository for clones. There are a lot of them. Hell, Regex::new accepts a &str for a pattern, and one of the first things it does is convert it to a String.
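To illustrate the shape of that pattern (this is a hypothetical `Matcher` type doing plain substring search, not the regex crate's implementation): the constructor borrows a &str and stores its own String, so the struct needs no lifetime parameter.

```rust
// Hypothetical type, NOT how the regex crate is implemented.
struct Matcher {
    pattern: String,
}

impl Matcher {
    // Borrow the pattern, then copy it so the struct owns its data.
    fn new(pattern: &str) -> Matcher {
        Matcher { pattern: pattern.to_owned() }
    }

    fn is_match(&self, haystack: &str) -> bool {
        haystack.contains(self.pattern.as_str())
    }
}

// The temporary pattern string is dropped at the end of this function,
// but the returned `Matcher` stays valid because it holds its own copy.
fn build_matcher() -> Matcher {
    let temporary = String::from("ab");
    Matcher::new(&temporary)
}

fn main() {
    let matcher = build_matcher();
    println!("{}", matcher.is_match("cab"));
}
```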

diarrhea · on Dec 6, 2023

Interesting. Most regexes being short, I reckon that copy is very cheap. Still I wonder, wouldn’t Cow be an acceptable middle ground, “best of both worlds” style (only copy when needed)?

burntsushi · on Dec 11, 2023

No, because the clone is always a marginal cost, no matter how big the pattern is.

`Cow` would be a needless and gratuitously bad type to accept for Regex::new. There's no point. It would very likely suffer the same class of problems as using Into<String>, because we'd need to use Into<Cow<'_, str>>, and thus it is susceptible to annoying inference failures.

It's always important to contextualize costs. In this case, regardless of the size of the pattern string, cloning it is always marginal relative to the other costs incurred. (One possible exception to this is if the pattern is just a simple literal with no regex meta characters. There in theory could be a fast path to side-step regex parsing and other things, but in practice there's not much need for that.)

estebank · on Dec 5, 2023

I'm pretty sure everyone in the teams would endorse that statement :)

tialaramex · on Dec 5, 2023

Cloning things which are Copy (such as u32) is futile and Clippy will tell you not to bother where it can see this is definitely Copy. If you don't use Clippy, I'd suggest trying it for a while.

Rc will be faster (if that matters to you) than Arc but it can't cross threads. (Safe) Rust will check you didn't get this wrong, so there's no danger but obviously knowing ahead of time avoids writing an Rc that you then need to be an Arc instead.
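A sketch of the case where Arc is actually required, with a hypothetical `count_across_threads` function. Replacing Arc with Rc here would be rejected at compile time, because Rc is not Send.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical example: several threads increment one shared counter.
fn count_across_threads(workers: u32) -> u32 {
    let counter = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread gets its own handle to the same counter.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", count_across_threads(4));
}
```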

Sometimes it's tidier to write the borrow in the pattern of a match, e.g. `if let Some(&foo) = ...` means you won't need to dereference foo inside the block.
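A short sketch of that pattern, using a hypothetical `largest_plus_one` function. It works here because u32 is Copy, so the pattern can copy the value out of the reference.

```rust
// Hypothetical function: returns the largest value plus one, if any.
fn largest_plus_one(values: &[u32]) -> Option<u32> {
    // `iter().max()` yields an `Option<&u32>`. The `&max` pattern strips
    // the reference, so `max` is a plain `u32` inside the block.
    if let Some(&max) = values.iter().max() {
        Some(max + 1)
    } else {
        None
    }
}

fn main() {
    println!("{:?}", largest_plus_one(&[3, 7, 5]));
    println!("{:?}", largest_plus_one(&[]));
}
```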

farmeroy · on Dec 5, 2023

Once I started taking this advice, Rust became manageable to me! Currently, trait implementations have been more of a stumbling block for me than the borrow checker.

outside1234 · on Dec 5, 2023

This is the way. The real misconception about lifetimes is that people have to use them often. You don’t unless you are writing libraries or system code.

estebank · on Dec 5, 2023

For the edit: if you have an idea what the output should look like instead, please file a ticket.