Rust Language Cheat Sheet (cheats.rs)
370 points by arunc on April 25, 2021 | 101 comments



My favorite is still the Rust Container Cheat Sheet[0] from the Rust Performance Book[1]. Great overview on just one page - that's what I call a cheat sheet.

Edit: Also recommended: The Little Book of Rust Macros[2]

[0] https://docs.google.com/presentation/d/1q-c7UAyrUlM-eZyTo1pd...

[1] https://nnethercote.github.io/perf-book/title-page.html

[2] https://veykril.github.io/tlborm/introduction.html


Thanks, appreciate the shout-out.

I'll give a little background on how that came about. At the time, I was promoting the use of Rust within Fuchsia. The team was largely composed of badass C++ programmers, who of course were very familiar with reasoning about programs in terms of exactly how things were laid out in memory. The official Rust documentation was at a more abstract level with respect to a lot of this stuff, in many cases careful not to overspecify, because a lot of the details of memory layout are not normative and may in fact change. In explaining the most common and important Rust data types, I found these kinds of diagrams useful, and it grew into the page you see there.



Thanks, that looks helpful too. Note your second link seems to be missing a dot between "old" and "reddit".


> Note your second link seems to be missing a dot between "old" and "reddit".

Oh indeed. I visited the link while logged in, but when sharing links I make the pre-redesign URL explicit. I guess a mistake occurred during that. Can't edit the post any more, but the fixed link is:

https://old.reddit.com/r/rust/comments/idwlqu/rust_memory_co...


Cool, but it doesn't really make Rust less intimidating for novices, I'm afraid. I mean, it's been some time since I last saw a language that needed a cheat-sheet entry for concatenating strings:

  format!("{}{}", x, y)


A simple "+" works as well, as long as "x" is a String, and not a &str.
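A minimal sketch of both spellings, assuming a hypothetical `concat_plus` helper (the name is mine, not from the thread): `+` requires an owned `String` on the left, while `format!` accepts anything that implements `Display`.

```rust
// `+` needs an owned String on the left and a &str on the right;
// format! works with any Display types.
fn concat_plus(x: String, y: &str) -> String {
    x + y // consumes x, re-using its buffer
}

fn main() {
    let a = concat_plus(String::from("foo"), "bar");
    let b = format!("{}{}", "foo", "bar");
    assert_eq!(a, b);
    println!("{}", a);
}
```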

Also, in my experience, concatenating strings for any purpose other than printing them is rarely done, no matter the language. So I don't see why we should optimize the ergonomics of something like that.


String concatenation is an antipattern in general. Making it too convenient is not a great idea anyways.


Why would concatenating strings be an antipattern?


There are lots of reasons it can be. Here's an example: in many languages, concatenation requires reallocation to generate a new string object. This is especially the case where strings are immutable. That can get expensive quickly as you're producing lots of garbage for little benefit, and each concatenation requires progressively larger and larger reallocations. You end up seeing quadratic runtime costs in terms of both memory and speed. There are a few different approaches taken to this. In Python, the popular option is to append everything to a list, and use `''.join(lst)` at the end to concatenate everything. Java and C#, on the other hand, have StringBuffer/StringBuilder types to accomplish something similar.

Generally, you want to avoid simple string concatenation in loops because of this, and it's not always obvious when string concatenation is being done in a loop, such as where the string concatenation is being done in a method that's being called by some other code.


String concatenation isn't expensive in Rust. Since the + operator consumes the first argument, it reuses that argument's buffer and grows it geometrically, avoiding quadratic reallocation. The only possible performance downside is that it can't count all of the lengths in advance and do a single allocation.
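The buffer reuse can be observed directly: if enough capacity is reserved up front, `+` appends in place and the allocation's address doesn't change. A minimal sketch:

```rust
fn main() {
    // Reserve enough capacity up front so the append cannot reallocate.
    let mut x = String::with_capacity(16);
    x.push_str("foo");
    let before = x.as_ptr();

    let z = x + "bar"; // `+` consumes x and appends into its buffer

    assert_eq!(z, "foobar");
    assert_eq!(z.as_ptr(), before); // same allocation: the buffer was reused
    println!("{}", z);
}
```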


Add<&str> for String in a loop isn't as problematic as adding two, for example, Java Strings because of the capacity growth curve and buffer reuse, but it is still worse than not having a `String::with_capacity` call before the loop. I am sad to see that clippy doesn't have any lints for this, but I can see how it would be hard to make a general good suggestion for it.
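The `String::with_capacity` pattern mentioned above can be sketched like this (the `join_all` helper name is mine): summing the lengths first means the loop never triggers a reallocation.

```rust
fn join_all(parts: &[&str]) -> String {
    // Size the buffer once, up front, so the loop never reallocates.
    let total: usize = parts.iter().map(|s| s.len()).sum();
    let mut out = String::with_capacity(total);
    for p in parts {
        out.push_str(p);
    }
    out
}

fn main() {
    let s = join_all(&["alpha", "beta", "gamma"]);
    assert_eq!(s, "alphabetagamma");
    println!("{}", s);
}
```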


Keep in mind that the potential expense of string concatenation was only _one_ example.

There's a reason why I explicitly mentioned immutability: Rust's `String` type has more in common with C#'s `StringBuilder` class than it does with strings in most languages. Meanwhile, the behaviour of str/&str is very different.


Because it's very easy for it to become accidentally quadratic in a loop.


I really want to like Rust, I like the intention of what the language is trying to do. But in a universe where languages like Smalltalk, Eiffel and Lisp exist, it just feels like a step backwards. Surely the computer should be doing the heavy lifting? What happened to my bicycle for the mind? Just my opinion. I admire the work going on in Rust but the complexity makes me sad.


My experience with Rust is that it tries to squeeze out every last drop of performance, but instead of relying on simplicity (C) or unsafety (C++; the reason Mozilla started it was to reduce C++ memory-related bugs), it introduces language complexity (lifetimes, "prove that your code is safe, or fall back to a runtime overhead such as Mutex") and a little bit of runtime overhead (explicit Option<T>/enum checks). In my mind, it doesn't play in the same league as high(er)-level garbage-collected languages. It was made for robust low-level applications, which is why it's being considered for the Linux kernel, and not Lisp.

Concatenating strings in a GC language is easy: you just "+" them together, and the cleanup will happen eventually. Concatenating in low-level languages means you have to think about the memory allocation(s) taking place, and format!("{}{}") is a lot more powerful for anything beyond just adding two strings. What if you want to prepend "https://"? In GC languages, this might be an extra string being allocated and discarded; in Rust, it's a constant hard-coded into the binary. It just happens that the language has attracted a lot of people who are used to higher-level abstractions, due to its relatively pleasant syntax, type system and abstractions, ecosystem and performance.


C++ is also 'just "+" them together', and it's not unsafe.



string_view is a new feature designed on purpose to be unsafe.


The idea was to copy the slices model from memory-safe languages; naturally, C++ being under C's influence where performance trumps safety, string_view couldn't be made as safe as the idea it was copying from.

Another example is span, while C#'s Span and Microsoft's gsl::span do bounds checking, ISO C++17 span naturally leaves it to the developer's responsibility.


> The idea was to copy slices model from memory safe languages

No, the idea was to make a simple one-to-one wrapper for the "char, char" pairs programmers were already using all over the place, nothing more.


This is why, while I toy with Rust, I can only see it replacing C and C++ (long term) in the context of kernels, device drivers and real-time scenarios where any kind of memory allocation is a no-go (MISRA style).

For anything else, it isn't worth the productivity drop compared to automated memory management (in whatever form) coupled with value types.


If you don't allocate memory in the first place, Rust also loses a lot of appeal though, because dynamic memory management is often the root cause of memory corruption issues in C and C++ (while working with an upfront-defined static memory layout avoids some typical footguns, because of things like stable memory locations throughout the lifetime of the application). I'm aware that Rust also helps to make accessing "static" memory safe, but Rust's safety guarantees are much more important with complex memory management patterns.


While it is true that dynamic memory management in C and C++ is a source of memory corruption issues, the other great source is precisely the reluctance of many C and C++ programmers to use dynamic memory allocation instead of buffers of statically-defined sizes. E.g.

1. Functions whose output is written into a byte buffer passed in as a pointer argument (for example, converting a number to a string). This allows C developers to pass pointers to statically-sized arrays of bytes, which leads to buffer overflows. (Inb4 someone shows up with "but that's the programmer's fault for not using an array that's big enough".) Example: strcpy.

2. Functions which return pointers to static storage, which makes them thread-unsafe, and even in single-threaded cases is error-prone, when you call one such function multiple times in succession (aliasing). Example: gmtime.

Both cases are very prominent in the standard library.

If instead functions allocated memory dynamically, then they would avoid buffer overflow, because size would be handled exclusively within those functions, allocating as much memory as is needed, no more, no less; instead of forcing the programmer to worry about the size at every call-site. They could also be easily made thread-safe and less error-prone (eliminating aliasing), provided that the allocator is thread-safe.

Another issue with the first case is that it leads to uninitialised variables. Often a pattern of "bool NumToString(int n, char* dst, int size);" is employed instead of "Optional<std::string> NumToString(int n);". This invites the possibility of using "dst" without checking the return value, which says if the conversion succeeded. In this case dst is most likely uninitialised, and definitely incorrect. An "Optional<std::string>" solves both issues.


All true, but on the other hand you have much more dangerous (because hidden) memory corruption problems in C++ caused by hidden dynamic memory management in the C++ stdlib like iterator invalidation, or dangling std::string_views. And IMHO those cases where C++ suddenly pulls the rug from under your references are much more dangerous than working with a static memory layout in C.

The C stdlib string API is rubbish of course (along with most other C stdlib APIs). But at least it's quite simple in C to ignore the stdlib (and use better 3rd-party libs instead) without losing important language features.


The experience with CVEs in MISRA-C(++) vs Java and Ada embedded deployments shows that there is still room for improvement by pushing C and C++ completely out of the picture.


Rust also helps prevent many types of concurrency bugs, and the type system is much more expressive than C's or C++'s, which can lead to cleaner code as well as preventing bugs like null references.


Those exposed via thread access to in-memory data, for sure.

However, we are slowly moving back to multiple processes as a way to avoid in-process security exploits and improve overall application stability.

In these kinds of scenarios, there is little that the Rust type system can help with regarding races in IPC.


I’m pretty sure that’s the point. But since it’s trendy people try to apply it to everything.


> But since it’s trendy people try to apply it to everything.

When people do things you don't understand, don't just say that they're just doing it because it's fashionable. Sure, they might be — or they have their own reasons.

I use Rust in a couple of places outside of its original "systems programming" niche, where, yeah, I don't really need to track every allocation, I don't care about GC pauses, I don't need to ship one single binary, and the overhead of a runtime wouldn't bother me. Things like Web servers for side projects, or small scripts to do a task I need to automate.

However, I found that:

• The effort it took to bring Rust out of its niche, and the time it took to learn the domain-specific libraries for my use cases (Rocket for Web stuff; duct for shell script stuff) was less than I thought;

• The amount of knowledge I needed to retain to use a programming language effectively — the components of its standard library, common third-party helper libraries, how to navigate the documentation, how to fix mistakes, how to avoid traps and pitfalls, how to structure your program, how to handle differences between language versions — was much larger than I thought!

So I stick with Rust for non-systems tasks because the benefits outweigh the detriments for me.

(Granted, I was only able to do this because I already knew my way around the language and the borrow checker; if you already know, say, Python, you can make this exact same argument in reverse. But then you need to know how to wield Python for low-level programming as well as high-level programming.)

I read an article a few years ago called Java for Everything[1] (which was discussed here on HN[2]) that makes the same point, only with Java. If I had to pick an "everything language", I don't think it would be 2014-era Java, but the article did sell me on the benefits of having an "everything language" in the first place, and I feel the same benefits apply here with Rust.

[1]: https://www.teamten.com/lawrence/writings/java-for-everythin... [2]: https://news.ycombinator.com/item?id=8677556


Some go as far as saying that "Rust is pretty accessible to developers coming from scripting languages (JavaScript, Python, etc)." [1].

Meanwhile in the real world: "I find myself hanging for long periods of time on borrow checker errors. One of the errors has stopped my progress dead for a week now, I swear it worked a week ago and then Rust decided that a borrow I was doing was no good." [2]

[1] https://news.ycombinator.com/item?id=26929012

[2] https://news.ycombinator.com/item?id=23807574


The borrow checker is deterministic, so the commenter in [2] is mistaken about it “deciding” that a previously-OK borrow was suddenly no good.

I do recall the steep learning curve and the frustrations with the borrow checker, but I haven’t had it impede me at all in probably two years now and I’m writing Rust most of the day every day.

Also, most of the knowledge I gained by learning to work with it is general knowledge that made me a better programmer, not specific knowledge of how to “work around” the borrow checker.


Well, I think they're very different languages for different purposes. Rust is trying to replace C/C++, and in those languages you do have to pay a lot more attention, otherwise you'll have the occasional use-after-free or unchecked memory leak etc. These can be extremely problematic, and for a human it's impossible to catch everything.

Rust helps lift that mental load, because if you're in safe Rust then you know the compiler will catch these problems for you. It also makes it quite explicit when you're on the stack or heap.

I also love that the explicit mut keyword makes functions very easy to reason about. In Go I find pointers being used for three purposes: to act as a reference, to act as an optional (nullable), or to act as a mutable parameter, and it's impossible to tell which by looking at the function signature. I can't tell you how many bugs that's caused in our codebase.


In my experience lisps are orders of magnitude more complex than Rust, and require extensive debugging to be certain that the code works. Rust is mostly a compile and run experience, idiomatic Rust is hard to screw up.


I think Rust was targeted at C and C++. Do you like any more recent languages, or do they all feel like they went the wrong way? I find Julia interesting, but I haven't delved deep.


> What happened to my bicycle for the mind?

The bicycle you pick depends on the problem you're trying to solve. You can divide up all programming languages into two categories:

1. Those that allow you to state your problem in terms of ideas. You write your code, the compiler or interpreter does some optimisations, and the end result runs fast enough for your use case.

2. Those that allow you to state your problem in terms of machine instructions. You demand a certain level of performance, or want every memory allocation to be explicit. The compiler or interpreter still does some optimisations, but instead of inserting more instructions implicitly if it has to, you'd rather your program straight-up fail to build.

The choices made by a language's designers will make it seem unnecessarily complicated if you're looking at a (2) language when you have a (1) problem, and will seem like it's making too many assumptions for you if you're looking at a (1) language with a (2) problem.

For example, let's go back to string concatenation, which, as you saw in Rust, looks like this:

    format!("{}{}", x, y)
And, yes, in other languages it looks like one of these two things:

    x + y
    concat(x, y)
If all you want is to end up with a "String" that's made up of those two other strings in it, this is good enough. I've done this bajillions of times over the course of my career. (Truly, I am a string-concatenating expert.)

But a String in Rust, under the hood, contains a pointer to a heap-allocated buffer. And getting one of those requires a memory allocation, and Rust makes those explicit. So the method of concatenating two strings you choose depends on what machine instructions you pick:

• If you don't really care, or if you need a new heap-allocated buffer, then `format!("{}{}", x, y)` will give you what you want.

• If `x` is already a heap-allocated String, and you know for a fact that you aren't going to need to use it again, then you can re-use the existing buffer with simply `x + y` (or something like `x.push_str(y)`).

• If `y` is a heap-allocated String but `x` isn't, you can re-use that existing buffer with `y.insert_str(0, x)` (but this requires it to copy the bytes in the buffer to make room for the string you're inserting).

• If neither is a heap-allocated String, and you don't want to allocate any more memory, you'll have to skip concatenation altogether and do something else (for example, if you're just writing the concatenated string somewhere, and don't need a buffer to put it in, then you can do `write!(somewhere, "{}{}", x, y)` which won't allocate).
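The options above can be exercised side by side; all four end up with the same text, they just differ in which buffer does the work. A sketch under those assumptions:

```rust
use std::fmt::Write;

fn main() {
    // 1. Fresh allocation, works for anything Display:
    let fresh = format!("{}{}", "Hello, ", "world");

    // 2. Re-use the left-hand String's buffer (consumes it):
    let owned = String::from("Hello, ") + "world";

    // 3. Re-use the right-hand String's buffer, shifting its bytes right:
    let mut tail = String::from("world");
    tail.insert_str(0, "Hello, ");

    // 4. Write into an existing buffer instead of creating a new String:
    let mut sink = String::new();
    write!(sink, "{}{}", "Hello, ", "world").unwrap();

    for s in [&fresh, &owned, &tail, &sink] {
        assert_eq!(s.as_str(), "Hello, world");
    }
    println!("{}", fresh);
}
```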

> Surely the computer should be doing the heavy lifting?

I find that the Rust compiler does an excellent job of doing the heavy lifting for me: I don't have to worry about iterator invalidation or use-after-free, surprise memory allocations or numeric type conversions, or unexpected performance drops when I make a small change. But this is all because I'm expecting a type (2) language, rather than a type (1) language.

It's the last part I'd like to focus on, because you may have read the above list and thought "a compiler could choose the appropriate optimisation for me!". And, well, it probably could! After all, if the only difference between the first option `format!("{}{}", x, y)` and the second option `x + y` is whether the string `x` gets used after concatenation or not, a compiler could certainly check whether this happens and optimise out the allocation if necessary.

But then you need to worry about keeping this optimisation when the code changes. Sticking with the example, suppose you have code like this in some hypothetical (1) language that allows you to express ideas but optimises them the best it can, where `x` and `y` are Strings still:

    var frobozz = x + y;
And then someone on your team makes this change:

    var frobozz = x + y;
    /* skip a couple dozen lines of code */
    var gunther = x + z;
A code change that starts using `x` for a second time now results in an allocation being introduced a couple dozen lines above from where the code change was, because it can now no longer be optimised out. If this is still good enough for you, because you're solving a (1) problem, then there's nothing wrong with this. But if you're solving a (2) problem, then the language has kind of... let you down.

So don't be sad! It just depends on what problem you want the language to solve for you.


The only fallacy is when one then goes to Godbolt and reaches the conclusion that, regardless of the language, the generated assembly can be the same, depending on the proficiency of the author with the language features at their disposal and the compiler backend being used.


I think it's fair to say the compiler does a lot of heavy lifting in Rust


Isn't it true that Rust is meant to solve problems in spaces where those languages really can't play? I'll admit to plenty of ignorance about all three of those languages, so I'm genuinely curious what you think.


Smalltalk is still a thing?


Yes, https://pharo.org/

As for commercial versions,

https://www.cincomsmalltalk.com/main

https://www.exept.de/en/smalltalk-x.html

https://www.instantiations.com/vast-platform

And its IDE spirit lives on in Java IDEs and Visual Studio Code (as Erich Gamma is on the team).


To be fair, when doing the naïve thing, the compiler will tell you what you need to do and (tersely) explain why it is needed:

    error[E0369]: cannot add `&str` to `&str`
     --> src/main.rs:4:15
      |
    4 |     let z = x + y;
      |             - ^ - &str
      |             | |
      |             | `+` cannot be used to concatenate two `&str` strings
      |             &str
      |
    help: `to_owned()` can be used to create an owned `String` from a string reference. String concatenation appends the string on the right to the string on the left and may require reallocation. This requires ownership of the string on the left
      |
    4 |     let z = x.to_owned() + y;
      |             ^^^^^^^^^^^^
This used not to be the case and it tripped people up. Some, even after being told why it is like this, disagree that anyone should care.

Rust is aiming at being performant. The thesis is that having clarity about the actual behavior of code when reading it helps on this front. Because of this, you need to write more code than you would in other languages, while the upside is that when reading Rust code you can be fairly certain of what the code is doing in the final binary, without hidden thresholds changing the generated code from, for example, monomorphization to dynamic dispatch because you added a method or yet another instance.

It's a trade-off. A language that didn't target the embedded and systems space could take all of Rust and radically simplify it, making it closer in experience and performance to Swift. That doesn't mean that the Rust project won't attempt to make things easier to use and write. A possible future move could be to make `Cow<str>` the default string people actually use, with `&str` continuing to be the preferred argument type and relegating `String` to specialized cases, probably mostly data buffers.
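The appeal of `Cow<str>` can be sketched with a hypothetical helper that only allocates when it has to (the `ensure_https` name and logic are illustrative, not from the thread):

```rust
use std::borrow::Cow;

// Borrow when no change is needed; allocate only when it is.
fn ensure_https(url: &str) -> Cow<'_, str> {
    if url.starts_with("https://") {
        Cow::Borrowed(url) // zero allocations
    } else {
        Cow::Owned(format!("https://{}", url))
    }
}

fn main() {
    assert_eq!(ensure_https("https://cheats.rs"), "https://cheats.rs");
    assert_eq!(ensure_https("cheats.rs"), "https://cheats.rs");
    // The already-prefixed input really did avoid an allocation:
    assert!(matches!(ensure_https("https://cheats.rs"), Cow::Borrowed(_)));
}
```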


Reminds me of this cheat sheet [0] for slices in Go, which includes a snippet to delete an element from a slice:

    a = append(a[:i], a[i+1:]...)
[0] https://github.com/golang/go/wiki/SliceTricks


Wow, I forgot how unergonomic Go can be. And that example is even a potential memory leak...


> And that example even is a potential memory leak…

Yup, more generally a slice is not an actual vector: the backing array lives independently of the slice, meaning that if you "reset" a slice

    a = a[:0]
all the elements are still part of the backing array, and thus still referenced.

That's also why write-sharing (which is trivial in Go) is super dangerous e.g.

    a := GetSlice()
    b := a[:]
    a = append(a, 1)
    b = append(b, 2)
depending on the capacity of the backing array associated with the slice `GetSlice` returns (i.e. the cap() of the slice), the last element of `a` might be 1 or 2.


Um, what? My impression was always that the last line would copy the underlying slice since someone has already appended to the original backing array. That's the whole point of why append is not in-place. You get a new slice value back because the extension may have happened by copying data into a new backing array.


> My impression was always that the last line would copy the underlying slice since someone has already appended to the original backing array.

Nope. append only copies if there’s no room in the backing array, if there is it will just extend the slice in-place and set the new elements.

So if there’s room, a will be extended in-place, with the new element set, then b will be extended in-place, overwriting the element set by a.

Hell, how would append even know that somebody else appended to the backing array? As far as it can see, that might as well be leftover garbage.

> You get a new slice value back because the extension may have happened by copying data into a new backing array.

That only happens if the backing array is full aka len() + new > cap().


> how would append even know that somebody else appended to the backing array? As far is it can see that might as well be leftover garbage.

Well, the backing array knows its length and capacity, and the slice knows its length, so it's quite trivial to see that there are elements in the array beyond the length of the slice.

EDIT: I'm shocked and appalled by the fact that you're completely right. https://play.golang.org/p/UZSlBMUlc_E


> Well, the backing array knows its length and capacity

arrays don’t have capacity. The capacity of a slice is the length of its backing array (minus the offset of the slice in the array, which is why slices do need to store a cap separately).

> it's quite trivial to see that there are elements in the array beyond the length of the slice.

There are always elements in the array beyond the length of the slice (unless the slice is « at capacity »).

> EDIT: I'm shocked and appalled by the fact that you're completely right. https://play.golang.org/p/UZSlBMUlc_E

Yup. Go slices are fun. And by fun I mean error-prone.

Well, most of the trouble really comes from them doing double duty as actual slices and ad-hoc pseudo-vectors.


17 different ways to concatenate strings in Rust, with benchmarks:

https://github.com/hoodie/concatenation_benchmarks-rs


Summary: There are several ways to concatenate strings in Rust. The method listed (using the format! macro) is one of the slowest but also the most flexible: it will cheerfully concatenate anything that either is a string or can be formatted as one.

The very fastest "method" is to just tell Rust you're quite confident that all these strings happen to be contiguous in memory anyway, so it can go ahead and treat that memory as one string. Rust won't let you pretend it said this was safe, and in practice the assumption will almost invariably break, but this is often what equivalent C code is trying for, it genuinely is fast, and if you really can arrange for the assumption to be true, Rust lets you express that in an unsafe block so you get the same performance as C.


To be fair, this is more a Rust Syntax Reference Sheet than any kind of cheat-sheet. It basically has everything.

It's useful to quickly lookup the syntax for something you never use, for example. Or to browse and discover things you didn't know were even possible.


Whilst it may appear that concatenation of strings is easier in other (more declarative) languages, in practice understanding the runtime performance/complexity and memory usage/lifetimes is far from trivial. Sure, a more declarative language might still be preferable, but for those who need better control and more ability to reason about how the program will execute, the Rust approach makes sense.


To drive the point home on how strings work in Rust and what the pitfalls it addresses are, https://fasterthanli.me/articles/working-with-strings-in-rus...


I'm not sure I understand why format! is a macro in Rust while print and its equivalents in most other programming languages are regular functions. There's some complexity there that I guess I just don't know.

For the second part, Rust should probably add some sugar à la Python f-strings (or just plain old string interpolation dating back to the Bourne shell):

format!("{x}{y}")


> There's some complexity there that I guess I just don't know.

It's because the format string itself is checked to be consistent with the types of arguments given. If there's an inconsistency, you get a compile time error. Most printf-style functions in other programming languages don't do this.


Ah, that's awesome and it makes a lot of sense, plus it's super helpful.


There are other more "negative" reasons too:

The language doesn't have varargs (to say nothing of type-safe generic variable-type args), so you'd have to pass in a slice of references e.g.

    print!(..., a, b, c)
would become

    print(..., &[&a, &b, &c])
Which is not exactly sexy, plus all these references would have to implement a matching trait, whereas currently formatting different values can involve different traits depending on the format specifier e.g. a plain `{}` is Display but `{:?}` is Debug and `{:x}` is LowerHex. You can have a type which implements LowerHex but neither Debug nor Display if that makes sense.
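That last point can be sketched concretely: a type implementing only LowerHex (the `RawId` name is mine) formats fine with `{:x}`, while `{}` or `{:?}` on it would be compile-time errors.

```rust
use std::fmt;

struct RawId(u32);

// Only LowerHex is implemented; Display (`{}`) and Debug (`{:?}`)
// are intentionally absent.
impl fmt::LowerHex for RawId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::LowerHex::fmt(&self.0, f) // delegate to u32's impl
    }
}

fn main() {
    assert_eq!(format!("{:x}", RawId(255)), "ff");
    println!("{:x}", RawId(255));
}
```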


This is actually planned already. Not sure how far along the implementation and progress to stabilization is though.


When you say planned, do you mean it's in some roadmap? I couldn't find anything after a quick googling session.

Edit: Found the RFC: https://rust-lang.github.io/rfcs/2795-format-args-implicit-i... - but I don't see any sign that it has been accepted :-?

Edit II: I had missed the PR and issue at the top :-)



Ah, nice! Anything that makes Rust more approachable will help.


format! and co (write!, print!, ...) are macros because Rust doesn't have variadic functions. Also, the format string and its parameters are checked at compile time (the number of parameters matches the number of fields, and the types match); other languages I know don't have this feature.


> Rust doesn't have variadic functions

By design or because nobody has implemented them, yet?


Nobody has implemented or designed them yet. There's no obvious reason you couldn't come up with a satisfying design, but there's consensus-building work around "this is a good feature and the right design" as well as implementation work.


It's a hard design problem, actually. But I think with const generics we are getting very close; variadics could be:

    fn foo<const N: usize, T>(varied: [&T; N]) { ... }
    // usage
    foo([&a, &b, &c])
    foo([&a, &b, &c, &d])


If I'm reading your intent right, that would monomorphize to an array of a specific N and a single element type T, meaning you couldn't express format!("{}{}", 1, 1.0)


I think you're reading the comment you replied to right, but I don't (completely) agree with your conclusion.

There are two magic things the format macro does apart from varargs: it checks the format string against the number and format types of the arguments at compile time, and it automatically borrows the arguments. Implementing format with varargs means you need to lose those.

Once you lose those, calling it looks like

    format("{}{}", &1, &1.0)
under the above proposal the type signature of format would look like

    fn format<const N: usize>(fmt_str: &str, args: [&dyn Format; N]) -> Result<String, Error>;
Where `Format` replaces the various `Display`, `Debug`, etc traits and looks something like this

    trait Format {
        fn display(&self, fmt: &mut Formatter) -> Result<(), Error> {
             // Default implementation of this method
             return Err(Error::NotImplemented)
        }
    
        fn debug(&self, fmt: &mut Formatter) -> Result<(), Error> {
             // Default implementation of this method
             return Err(Error::NotImplemented)
        }
    
        ...
    }
I.e. the trick is that the conversion to a trait object happens (implicitly) on the calling side when you take a reference, so all the arguments are all of the same type.
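The trick is runnable today without const generics, using a plain slice of trait objects; a minimal sketch with a hypothetical `join_args` helper (using a single `Display`-based trait rather than the thread's combined `Format` trait):

```rust
use std::fmt::{self, Write};

// Each argument is coerced to `&dyn Display` at the call site,
// so the slice elements all share one type.
fn join_args(args: &[&dyn fmt::Display]) -> String {
    let mut out = String::new();
    for a in args {
        write!(out, "{}", a).unwrap();
    }
    out
}

fn main() {
    // Mixed types work because the trait object erases them.
    let s = join_args(&[&1, &1.0, &"x"]);
    assert_eq!(s, "11x"); // Display prints 1.0f64 as "1"
    println!("{}", s);
}
```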


That's a much better way of putting it. I was mostly trying to point out the missing dyn, but I never explicitly said that.


> I'm not sure I understand why format is a macro and print and equivalent in most other programming languages is a regular function. There's some complexity there that I guess I just don't know.

It's a really good question! It's a little bit complex but hopefully not too much, and it shines a light on some interesting differences between languages.

The answer is roughly that Rust is strongly typed, and println!, format!, etc. all accept arbitrary types. They all also accept a varying number of arguments, and Rust only allows you to define functions with a fixed number of arguments. Python, sh, etc. have neither of these restrictions.

What would the type (function signature) of println/format/etc. be if they were regular functions?

    fn println(format: &str, arg1: SomeType?, arg2??);
You could solve the variable-number-of-arguments problem in Rust itself somehow, maybe

    fn println(format: &str, args...: SomeType);
but you'd have a few problems remaining.

First, you'd need to figure out what that type is. The formatting macros don't just accept things that implement Display - they also accept things that implement Debug if you use {:?}, things that implement LowerHex if you use {:x}, etc. And the macro wants to return a compile-time error if you mismatch types and the formatting string, like

    println!("Hex {:x}, debug {:?}", [1, 2, 3], 4);
And second, and a little more simply, the macro wants to return a compile-time error if you just have the wrong number of things being formatted:

    println!("Three things: {} {} {}", 1, 2);
In order to do that level of checking based on the value of the string, it has to be some sort of compile-time extension. In Python, you'd just get a runtime error and in bash, you'd just get empty strings.

    >>> "{:x}".format("hi")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: Unknown format code 'x' for object of type 'str'
    >>> "{} {} {}".format(1, 2)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IndexError: tuple index out of range
Regular functions in Rust (and any other language) don't get to see their arguments at compile time; they only see their arguments when they're called.

C, of course, just ignores this problem entirely. You don't get an error if you mess up the format string; you just get invalid reads to memory.

    printf("Is this a pointer? %s", 1234);
    printf("What's the third thing? %d %d %d", 1, 2);
Also (especially) in C, it's very easy to accidentally let the format string be input external to your program:

    printf(username);
which means that an attacker can specify the format string and cause reads to memory you weren't expecting and cause data to be leaked (or worse, an attacker can cause writes: C's %n conversion stores the number of bytes written so far into the next argument to printf, interpreted as a pointer - or the next thing that happens to be in memory, if you didn't pass an argument for it). Rust protects against this by making you use a string literal as the first argument.

Python also protects against this by having f-strings be a special part of the language syntax. You can't have an f-string that's a variable, so an attacker can't set their username to f"{DATABASE_PASSWORD}". So, in that sense, the f sigil in Python is also a macro, like in Rust - it's just shorter syntax.


> C, of course, just ignores this problem entirely.

Many C/C++ compilers special-case printf, and can actually catch some printf format string problems at compile time. For example, gcc 5.4.0 and msvc 19 warned about both of the dodgy printf usages in your example (the type mismatch and the wrong number of arguments).



<3 Rust

The only thing I find difficult in Rust is lifetimes of e.g. "String"s in data structures, e.g. in Scala I have:

    case class CustomerSale(name:String, item:String)
and because of GC I don't have to think about life times, copying etc. - data gets removed after a request or when nothing is used any longer.

In Rust it is

    struct CustomerSale {
        name: String,
        item: String,
    }
which rarely works that simply. I need to think about the lifetimes of name and item, e.g. when I get them from a database and don't want to own them, or whether I want to copy them or use &str, etc. And web frameworks in Rust have no notion of request lifetimes, which would make things much easier.


If you use String, which is heap-allocated, you don't really need to worry about lifetimes.

&str on the other hand


Yes, String was an oversimplification. String means I either own or copy it, both things I usually don't want to do, so I end up - as you've said - with &str.


You might want to look into using `Arc<str>`. It's an immutable string slice that can have many owners and will be dropped once the last one is done with it.


Thanks, I will try that - fearing my code is littered with Arc ;-) - but sarcasm aside, thanks.


I usually just have a `type Text = std::sync::Arc<str>` at the root, specifically during prototyping :)


How does this differ from just sharing &str?


A `&str` is a borrow of the string slice data, and has borrowing semantics. An `std::sync::Arc<str>` on the other hand isn't borrowed but has shared ownership and is atomically reference counted.

Does that answer the question? Could you expand on what you mean by sharing `&str`?


I suppose my question is what motivates the use of an atomically reference-counted string slice rather than borrowing?


You can store it independently without it being borrowed. So your structs don't need lifetimes.

I consider it a midpoint between `String` and `&str`. Most of the convenience of `String` (barring mutation) but less costly.


"Cheat Sheet" prints as a 64 page PDF. Yeah, that sounds about right.


Pretty cool! Is there something like this for other languages? I know that Khronos publishes this pdf for WebGL*

* https://www.khronos.org/files/webgl20-reference-guide.pdf


you can look at this: https://overapi.com/#more


Great work!

"Allocate T bytes on stack bound as x" may sound a bit like T is a size instead of a type.


Similar cheatsheet for most of modern languages: http://learnxinyminutes.com


This is amazing!

But there seems to be something missing: error handling. After using Rust on and off for a couple of years, a thing I still don’t understand is how to write idiomatic code which deals with multiple different error types.


If you are writing your own library then your best bet is to define your own Error enum in each module. Below each Error enum, write a bunch of From trait implementations to convert errors into a custom variant in your Error enum. This allows you to use the "?" operator to "bubble" errors up through your library and build some sort of useful printable error tree. Don't use boxed errors and don't use std::io::Error for your own custom purposes. Doing things this way means that consumers of your library will not be bound to some error handling library that they may not want to use. It is also very fast.

If this is not a library then feel free to use any number of error handling helper libraries out there. It would be unfair to single one of them out. However, once I got the hang of writing my own custom errors I ended up using them for my binary applications too.
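A minimal sketch of this hand-rolled pattern (`ConfigError` and `read_port` are made-up names): one enum per module, plus `From` impls so `?` converts underlying errors automatically.

```rust
use std::fmt;

#[derive(Debug)]
enum ConfigError {
    Parse(std::num::ParseIntError),
    Range(u32),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Parse(e) => write!(f, "failed to parse port: {}", e),
            ConfigError::Range(n) => write!(f, "port {} out of range", n),
        }
    }
}

impl std::error::Error for ConfigError {}

// The From impl is what lets `?` convert a ParseIntError
// into this module's error type.
impl From<std::num::ParseIntError> for ConfigError {
    fn from(e: std::num::ParseIntError) -> Self {
        ConfigError::Parse(e)
    }
}

fn read_port(raw: &str) -> Result<u16, ConfigError> {
    let n: u32 = raw.trim().parse()?; // ParseIntError -> ConfigError::Parse
    u16::try_from(n).map_err(|_| ConfigError::Range(n))
}

fn main() {
    assert_eq!(read_port("8080\n").unwrap(), 8080);
    assert!(matches!(read_port("nope"), Err(ConfigError::Parse(_))));
    assert!(matches!(read_port("99999"), Err(ConfigError::Range(_))));
    println!("ok");
}
```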


> If you are writing your own library then your best bet is to define your own Error enum in each module

This is exactly the path that my SNAFU library takes. It's a procedural macro attached to an enum that generates implementations of the From, Display, and Error traits for you.

This way, consumers of your code don’t need to know anything beyond the fact that you implement Error and any inherent public methods or traits you choose to expose. The implementation provided by SNAFU is internal to your code.

https://docs.rs/snafu/0.6.10/snafu/


Thanks, that's very helpful and may well be the encouragement I apparently needed! It is a standalone CLI application but it might want to offer a library in the future. And in any case I am attracted to following your suggestion regarding custom enums, partly because it seems more likely to dispel my lack of understanding of the topic.


Cheers! It's a daunting topic with many different ways of doing things and it is difficult to find reading material on the topic of idiomatic error handling. I just read through a lot of popular code bases and picked out what I liked. I found that using custom errors was the simplest approach for me albeit a little verbose.


My sibling posts a great general overview, and they make a good point about not biasing specific libraries. When I was starting out in Rust, however, I found value in just doing what everyone else was doing and then deciding if I wanted to do things my own way once I was confident I could do the popular way. In that vein, here's doing error handling like a lemming.

Library: thiserror

Application: anyhow (or eyre if you want the same thing but with fancy colors and nicer backtraces).

Make sure you use #[from] to derive From a lot. The value is in the chains. Here's an example I just wrote:

    #[derive(Debug, thiserror::Error)]
    enum LoadError {
        #[error("Failed to parse")]
        Parse(#[from] ParseError),
        #[error("Failed to acquire lock")]
        Lock(#[from] file_lock::AcquireError),
        #[error("IO error")]
        Io(#[from] std::io::Error),
    }
ParseError and AcquireError are defined in a similar style.

Then I have a function

    fn load(...) -> Result<..., LoadError> {
        let lock = FileLock::acquire(...)?;
        let parsed = parse(...)?;
        Ok(parsed)
    }
The magic is that ? calls From::from on the error, converting it into the error type the function returns. This means the ParseError gets converted into a LoadError::Parse, for example.

Now we've got that nice semantic cause chain, we use anyhow to display it.

    fn main() -> anyhow::Result<()> {
        let index = load(...)?;
        // ...
        Ok(())
    }
If load fails I'll get something like this printed:

    Error: Failed to acquire lock

    Caused by:
        Failed to open lock
        No such file or directory (OS error 2)
It feels like a superpower to have a high-level function fail and see the massive list of errors all the way down to the root cause.


Thank you!


> a thing I still don't understand is how to write idiomatic code which deals with multiple different error types

I am not sure if it is idiomatic, but I have been following the guidelines in this article for general error handling:

https://nick.groenen.me/posts/rust-error-handling/

And this article for having context:

https://kazlauskas.me/entries/errors.html


Thank you. (I'd come across the first but not the second). It looks like there is nothing blocking me apart from the need to find a clear morning to which to devote myself to the task of doing error handling properly in my project.


Enums?


Some people have asked for cheat sheets for other languages. You can look at this: https://overapi.com/#more


Thanks. One of the best cheat sheets I have ever seen.


Any recommendations for a good C++ cheat sheet?


Does something like this exist for Kotlin?



