I'm a network engineer, and I've done C++ for over a decade now. One of the nice things about Rust is that they decided to go with async over fibers, which is in line with how most high performance C++ is written. The Rust team also isn't rushing Future out the door, so it's coming along much nicer than the C++ Future, which is usually replaced because it's not monadic.
Rust is great, and I highly recommend learning Rust if you want to become a better C++ developer. C++ introduces a lot of unnecessary headaches that Rust completely solves. Rust is worth the switch if only because reading error messages from expanded templates is horrible compared to a type system that actually understands type parameters.
I'm really bullish on Rust becoming even more mainstream, to the point anyone can pick it up and use it in their business without a second thought by 2020.
I am a big fan of Rust, but I don't think async with the "await" keyword, or using Futures/promises, is the way to go.
I wrote a large web app in Scala using Futures only, and it turned into a monster because of it. Even if you don't use callbacks, you still need to manually unwrap them, and they pollute your method signatures.
Then more recently I wrote a pub/sub server in C# using the "await" keyword. While much better, it still pollutes your method signatures and makes the debugger and stack traces useless. Also, you still have to think about not doing anything that could block or spinlock, and not calling APIs that are not async (e.g. CreateFile).
Compare this to Golang, where projects like groupcache (1) have super clean code that replaced a thousand-line C++ system at Google.
Why do you think Golang fibers/goroutines are worse?
The answer to this question is quite involved. As you may already know, Rust at one time had a fiber model, but it was removed in favor of Futures.
One big advantage of Futures is they are decoupled from their execution environment.
Different types of executors can be used depending on the type of work. For example, you may have an IOThreadPoolExecutor to handle accepting and responding to connections, which should spend very little time blocking, and a CPUThreadPoolExecutor for offloading heavy processing. In Go, you have no say in how goroutines are scheduled; you sacrifice control to reduce complexity.
Another big one in network programming is the polling mechanism. In Go, you have no control over your polling mechanism, whereas a Future is decoupled from the event loop, and you can write your own event loop on top of any experimental kernel APIs you'd like.
Rust made the right move because in Rust, just as in C++, we want maximum control. Futures are more primitive, and it's possible to build coroutine models on top of them, but not vice versa.
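That decoupling can be sketched in miniature. This is not how Rust's real `Future` trait works (the names below are made up, and plain closures stand in for futures to keep it std-only); the point is only that *what* runs is separate from *where* it runs:

```rust
use std::thread;

// Hypothetical trait: the task describes the work, the caller picks the executor.
trait Executor {
    fn spawn(&self, task: Box<dyn FnOnce() + Send>);
}

// Runs tasks inline on the caller's thread (think: lightweight I/O callbacks).
struct InlineExecutor;
impl Executor for InlineExecutor {
    fn spawn(&self, task: Box<dyn FnOnce() + Send>) {
        task();
    }
}

// Offloads each task to its own OS thread (think: heavy CPU work).
struct ThreadExecutor;
impl Executor for ThreadExecutor {
    fn spawn(&self, task: Box<dyn FnOnce() + Send>) {
        // joined immediately only to keep this demo deterministic
        thread::spawn(task).join().unwrap();
    }
}

fn run_on(ex: &dyn Executor) {
    ex.spawn(Box::new(|| println!("task ran")));
}

fn main() {
    run_on(&InlineExecutor);
    run_on(&ThreadExecutor);
}
```

In Go, by contrast, there is exactly one scheduler, and every goroutine goes through it.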
I agree with you.
I guess the best solution would be something in between "await" and goroutines.
My understanding is that the main issue with the "await" model is that it only abstracts the task model.
Running tasks don't have a real ID and stack; in Go and Erlang you can list all running fibers and see how much memory and CPU each one uses...
I believe you could easily write your own event loop in Go if they exposed the internal API used to pause and resume goroutines. You can already pause a goroutine by having it wait on a mutex, and resume it from your custom event loop by releasing that mutex.
> Rust at one time had a fiber model, but it was removed in favor of Futures.
No, fibers were removed with nothing to replace them. The futures as we know them today did not exist in the far, far past when fibers and the whole runtime thing got dropped.
Futures and fibers are not mutually exclusive, as shown in the latest (2018) Oracle Code One keynote, where they demoed Project Loom (basically, you can wrap a fiber in a CompletableFuture, in the same fashion you can wrap a thread/execution context and expose it as a CompletableFuture). That said, the JVM team seems to have come to a similar conclusion: fibers are the least invasive option in terms of syntax, compared to async.
There are fewer context switches, because you can write data to several distinct file descriptors using a single system call.
You can also read data from several file descriptors in a single system call.
This way you significantly reduce the number of system calls, instead of doing one blocking read() per connection.
I believe a context-switch round trip (from user space to kernel to user space) is much more expensive than simply switching between goroutines of the same process.
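The batching idea can be sketched with Rust's vectored-write API. A `Vec<u8>` stands in for a socket here; on a real socket or file, `write_vectored` maps down to a single writev(2) system call:

```rust
use std::io::{IoSlice, Write};

fn main() {
    // Three logical buffers, flushed with one call instead of three write()s.
    let bufs = [
        IoSlice::new(b"HTTP/1.1 200 OK\r\n"),
        IoSlice::new(b"Content-Length: 2\r\n\r\n"),
        IoSlice::new(b"ok"),
    ];
    let mut sink: Vec<u8> = Vec::new(); // stand-in for a TcpStream
    let n = sink.write_vectored(&bufs).unwrap();
    assert_eq!(n, sink.len());
    assert!(sink.ends_with(b"ok"));
    println!("wrote {} bytes in one call", n);
}
```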
Chrome manages to do async stack traces for their implementation of the similar JavaScript feature. I wonder if this would be possible for C# and Rust.
I've looked at ilovecaching's comment history (at least up to ~60 days ago) and they seem consistent on being a networking programmer who uses C++. Was there a particular comment you found which would indicate this isn't the truth?
https://news.ycombinator.com/item?id=18185795 "I've been using Rust in production for a little over half a year, and my team and I have run into very few issues. [...]
We're all extremely glad we chucked C++ and Go and switched to Rust. "
It really isn't that ridiculous that someone who has worked as a network engineer for a decade would use Erlang, C++, and Go. Even Haskell isn't too much of a stretch, considering the user's company has functional programming expertise in Erlang. Rust is a neat language that, depending on their workload, could replace all of those languages. Why the presumption of bad faith?
Not seeing any major inconsistencies. Three months ago they exclusively used Go and Rust, at some point before that they used Erlang and Haskell and they used C++ for a total of over 10 years.
Talk about cherry-picking. It's pretty clear from those comments that they've worked with Erlang before, as well as Haskell and C++, and that they've transitioned to use Rust.
For example, in the same comment that begins with "As a long time Erlang developer", they later continue: "I’m more optimistic about Rust ... We’ve been able to scale Rust to handle the same load as our Erlang cluster". So there's no contradiction here.
I guess I'm glad that someone sits around looking through comment histories skeptically, but I don't see what about these comments seem suspicious to you.
One thing I don't like about Rust is how taking a slice of a string can cause a runtime panic if the start or end of the slice ends up intersecting a multi-byte UTF-8 char.
I would prefer it if this feature didn't exist at all rather than cause runtime panics.
It's not a problem in practice, because you'd use something like `.char_indices()` iterator, or result from a substring search, etc. to get correct offsets in the first place.
It's not useful to blindly read at random offsets in UTF-8 strings. If it didn't panic, you'd get garbage. If offsets were automatically moved to skip over garbage, you wouldn't know what you're getting, and your overall algorithm would likely end up with nonsense (duplicated or skipped chars).
For algorithms that don't care about characters or UTF-8 validity, there's zero-cost `.as_bytes()`.
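A small sketch of the "get correct offsets in the first place" point: byte offsets produced by `find` or `char_indices` always land on character boundaries, so slicing with them can't panic:

```rust
fn main() {
    let s = "ab早c"; // '早' occupies bytes 2..5
    // find() returns a *byte* offset guaranteed to be a char boundary
    let i = s.find('早').unwrap();
    assert_eq!(i, 2);
    assert_eq!(&s[i..], "早c");
    // char_indices() likewise yields (byte_offset, char) pairs
    let offsets: Vec<usize> = s.char_indices().map(|(i, _)| i).collect();
    assert_eq!(offsets, [0, 1, 2, 5]);
}
```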
Couldn't syntax like `a_string[..3]` be made to result in compilation errors in Rust? Since that'd almost always be a bug? (right?)
And in the rare cases, when it's not a bug, then one can just use `as_bytes` which would be good to do in any case, to indicate to other humans that this is not a bug.
B.t.w. I love the error message `[..3]` generates: "thread 'main' panicked at 'byte index 3 is not a char boundary; it is inside '早' (bytes 2..5) of `ab早`'" — I've never seen such easy to understand error messages in any language (except for in a few cases in Scala).
What does zero-cost mean in this context? It must cost something to run, no? Or is it basically a compiler hint instructing the next function to treat the data as pure bytes?
In this particular context, you can think of going from a `&str` to a `&[u8]` via `string.as_bytes()` as a safe cast. The in-memory representation remains the same, and the function call will almost certainly be inlined because its implementation is trivial.
It is a common pattern in Rust to use [] for lookups that are expected to succeed (and will panic otherwise), and a method for lookups that can fail and return Option or Result.
e.g. my_hashmap["foo"] will panic at runtime if the key "foo" is not present, or return the associated value if it is. But my_hashmap.get("foo") will return None if "foo" is not present and Some(value) if it is.
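A quick illustration of the two styles:

```rust
use std::collections::HashMap;

fn main() {
    let mut m = HashMap::new();
    m.insert("foo", 1);

    // [] means "this key must be here; absence is a bug" — panics otherwise
    assert_eq!(m["foo"], 1);

    // .get() means "absence is a normal case I want to handle"
    assert_eq!(m.get("foo"), Some(&1));
    assert_eq!(m.get("bar"), None);
}
```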
What's the point of the [] version then? It seems inherently more dangerous, and Rust emphasizes safety. I know it wants to be pragmatic as well as safe, but this seems like a strange default.
First of all, panics are perfectly safe. None of this has to do with safety guarantees.
Second, the [] syntax is controlled by the Index trait, which returns an &T, not an Option<&T>. It does this due to Rust's error handling philosophy. There's two kinds of errors: recoverable and unrecoverable errors. When something shouldn't fail, unless there's a bug, you shouldn't be using Option/Result, you should panic. When something may normally fail, and you want to be able to handle that explicitly, you should use Option/Result.
If [] always returned an Option, you'd be seeing tons and tons and tons of unwraps. It's not the right default here. However, that's why the .get method also exists: If you do think that this may fail, but not due to a bug, then you should use .get instead, which does give you an option.
TL;DR: everything is tradeoffs, and we picked a specific set of them, and that's how they all play out together.
Personal commentary: this is the kind of thing that's largely concerning until you actually use the language more, IMHO. Dealing with Options all the time here would feel really bad. Consider the other sub-thread about floats; it often feels like boilerplate for no good reason. That would introduce this for every single time you want to index something, which is a very common operation.
Does Rust support a monadic coding style (like Haskell "do" blocks or F# computation expressions)? That would allow you to work with Options without having to explicitly unwrap them.
Not generic monads, but it does have the `?` operator for Option (similar to Haskell Maybe) and Result (similar to Haskell Either) which would support a similar syntax to using `do` with the Maybe monad
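A small sketch of how `?` chains Options, much like Maybe in a Haskell `do` block (the function name here is made up):

```rust
// Each `?` early-returns None, so no explicit unwrapping is needed.
fn first_char_upper(s: Option<&str>) -> Option<char> {
    let c = s?.chars().next()?;
    Some(c.to_ascii_uppercase())
}

fn main() {
    assert_eq!(first_char_upper(Some("hi")), Some('H'));
    assert_eq!(first_char_upper(Some("")), None); // inner `?` short-circuits
    assert_eq!(first_char_upper(None), None);     // outer `?` short-circuits
}
```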
Scala programmers would recognize this as the difference between () and .get(). I wish Rust had copied Scala's syntax — it's much cleaner — rather than trying to be nice to the established systems languages (C/C++).
This would also free up [] to be used for generics and avoid syntactical warts like the ::<> parsing.
This seems specious to me. The only way to get an invalid index in a string in any language is that you either have an array index arithmetic error or you are blindly operating on a string you haven't validated.
If you want all the data after a : character, you slice on the index of the :. The character after it is going to be the beginning of a UTF-8 character.
You do not under any circumstances guess that the colon is at position 6 in the string. That's not safe. Why are you going cowboy in a language that is so obsessed with safety?
I just realized that I have a bug in my GPS driver. It operates on ASCII data, so the [] operator is safe, BUT the data can be corrupted (low chance, but non-zero) and could form a valid multi-byte character, so my code will panic on it while trying to parse and validate an NMEA message.
Truncating a string to fit in a fixed-size storage field is probably the most common reason to split at a particular byte position. If you’re throwing data away anyway, you probably don’t care too much about the little bit of corruption.
Granted, this is certainly incorrect but has little to do with safety, especially if the downstream code has to revalidate everything anyway.
String slicing using byte indices has to exist in some form, since it is the only thing that is efficient (O(1)). But, I guess it could have used syntax other than somestring[...].
That means one loses all the conveniences and guarantees of the string types and, in many cases, is forced into an immediate revalidation of the byte slice as UTF-8 to get back to &str, which is O(n). Furthermore, this is also rather clunky.
I suppose one could have it return StrWithInvalidSurrounds, where just the first (at most) 3 and last (at most) 3 bytes might be invalid, which would then allow for O(1) revalidation to a &str, and even other operations like continuing to slice... But this is even more clunky for actual use!
I think a moderately less clunky API might have been to not use integers for byte indexing, but instead some ByteIndex wrapper type that string operations return, meaning one can't just write `s[..5]` in an attempt to get the first 5 characters of the string.
Does this bug exist because it would be too expensive to check every string before slicing? (Being Rust-ignorant), can you not type a binary as UTF-8? Are there 2 versions of string functions, fast ones that assume ASCII and slow ones that assume UTF-8?
Every string is checked. But UTF-8 is a multi-byte encoding, and slicing works on bytes, so if you slice in the middle of a multi-byte character, you may get nonsense. The error happens because of this checking, not in spite of it.
String always assumes full UTF-8. You could make an AsciiString type if you wanted, but it's not provided by the standard library.
The obvious follow up question would be: so why is slicing a string a byte-wise operation and not a character-wise operation? If a string is an array of characters, why does it let me refer to individual bytes without explicitly casting it to a byte array? How often comparatively do you want the nth byte compared to the nth character? I would suspect that's pretty rare.
> How often comparatively do you want the nth byte compared to the nth character? I would suspect that's pretty rare.
It's exactly the opposite of what you expect. Getting the nth codepoint is often (not always) semantically incorrect since a codepoint isn't necessarily one character. Multiple codepoints might combine to form one character. (In Unicode, these are called grapheme clusters.)
Byte offsets are used a ton because you might often have the index to a position in the string from some routine, like, say, a search[1].
I've been working on text related things in both Rust and Go for several years. Both languages got this part of their strings exactly right given that their representation in memory is always a sequence of bytes.
I still think that using the common [] operator for this is a mistake. Strings shouldn't offer [] at all, and instead should provide methods like codepoints(), bytes(), grapheme_clusters() etc for indexing, slicing, and iterating.
The reason being that the behavior of [] for string varies widely in different languages, and so this is something that's best made explicit, both to force the author of the code to consider whether their assumptions are valid and reasonable for what they're trying to do, and to give additional context to anyone else reading the code.
As it is, I suspect a common class of bugs for Rust will be with people assuming that [] slices codepoints, because it seems to work that way for ASCII.
I'm quite thankful that Rust has succinct notation for slicing strings. Do note that `string[n]` is not supported, so you'll stumble over an inconsistency in your mental model quite quickly if you think slicing is by codepoint.
The lack of direct indexing is a good point. But strings aren't sliced on byte boundaries all that often either - it's far more common to use higher-level APIs like split(), that deal with offsets under the hood, so that sugar mostly ends up being used in the implementation of such APIs. And, really, would something like s.slice_u8(x, y) be that unwieldy over s[x..y]?
How often do you actually want the nth character as opposed to the nth grapheme?
There is pretty much no case where indexing by character actually makes sense because it is almost always incorrect and it is always inefficient.
Indexing by byte is rarely useful, but it does have some usefulness, since it can be used correctly and efficiently: you can find the previous or next character boundary by scanning at most four bytes for one that is not a continuation byte (i.e. whose top two bits are not `10`). If you want to do something like get a &str that would fit in an n-byte buffer, then byte indices let you do that efficiently and correctly.
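Here's a sketch of that fit-in-a-buffer case (the helper name is made up), using `str::is_char_boundary` to back up at most three bytes so no codepoint gets split:

```rust
// Longest prefix of `s` that fits in `max` bytes without splitting a codepoint.
fn truncate_to_fit(s: &str, max: usize) -> &str {
    if s.len() <= max {
        return s;
    }
    let mut end = max;
    // a UTF-8 sequence is at most 4 bytes, so this loops at most 3 times
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}

fn main() {
    assert_eq!(truncate_to_fit("ab早", 4), "ab"); // '早' is bytes 2..5
    assert_eq!(truncate_to_fit("ab早", 5), "ab早");
    assert_eq!(truncate_to_fit("abc", 10), "abc");
}
```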
panics are safe. You expect the “I don’t have a bug” case to be fast.
You can, but it depends on what you mean by “character”, as that’s not a concept in Unicode. Every kind of thing you could mean has a method, specific to it, since they’re different things.
(char in Rust is a Unicode scalar value, and you can collect into a Vec<char> and then slice it, as an example of one of those things. And that’s still O(1) at the cost of using up to four times the memory.)
It's not clear to me what you're suggesting; is it that String shouldn't have supported indexing in the first place? That code does work, but you have a &[u8] not a &str.
But neither does AsciiString. It has an as_str method, but it's still a kludge.
This example was basically a suggestion to throw0u1t: if they want to cut in the middle of a UTF-8 sequence for whatever reason, they can [edit:] do it without extra crates.
What I don't understand is why slices are indexed in bytes and not in objects. If String has the ability to check that we're cutting in the middle of a character sequence, why doesn't it provide the ability to take 3 fully formed characters?
I think the rust designers want to keep the implicit contract that indexing into a string is fast and O(1).
If you want to find the one millionth codepoint of a UTF8-encoded string, you have to more or less (1) visit every byte of the string.
If, on the other hand, you want to find the codepoint that covers the millionth byte, you have to read at most four bytes: read the millionth byte, and there are three cases:
- it's a full codepoint. If so, you're done.
- it is the first byte of a multi-byte codepoint. If so, read forwards in the string for up to 3 continuation bytes.
- it is a continuation byte. If so, search backwards in the string for the first byte, then, if necessary, read forwards to find more continuation bytes.
So, that is O(1)
(1) you can skip continuation bytes, but these typically are rare.
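That backward scan is easy to sketch with `str::is_char_boundary` (the helper name is made up):

```rust
// Decode the codepoint covering byte index `i` of `s` (panics if out of range).
fn char_at_byte(s: &str, i: usize) -> char {
    let mut start = i;
    // walk back over continuation bytes (0b10xxxxxx) — at most 3 steps
    while !s.is_char_boundary(start) {
        start -= 1;
    }
    s[start..].chars().next().unwrap()
}

fn main() {
    let s = "ab早c"; // '早' occupies bytes 2..5
    assert_eq!(char_at_byte(s, 0), 'a');
    assert_eq!(char_at_byte(s, 3), '早'); // a continuation byte maps back
    assert_eq!(char_at_byte(s, 5), 'c');
}
```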
It does: `s.chars().take(3)`. It just does it with iterators rather than with indexes because that better communicates the performance characteristics.
I think he's suggesting that slicing on strings should be by character, and if you want to slice on bytes, you should explicitly ask to treat the string as a byte array. It makes more sense semantically, and it's safe.
That seems like taking it too far. It's like using pointer arithmetic to index a linked list on the assumption that the nodes happen to be allocated contiguously in memory. I mean, I guess the thinking is, indexing a Unicode string isn't cheap, but indexing strings used to be cheap once upon a time, when strings were encoded in fixed one-byte-per-character representations, so let's pretend that's still the case and panic if it doesn't work out.... That's weirdly antithetical to Rust's purported focus on safety.
Also, you can get the same performance from an operation that returns a byte array instead of a string. If that kind of performance is what you want, then a Unicode string is simply not the right type to use.
Indexing a Unicode string is cheap... if you have a byte index. If you want to count out some fixed number of codepoints, then of course you've just moved the cost to calculating the corresponding byte index. But counting codepoints is almost always the wrong thing to do anyway [1]. In practice, it's more common to obtain indices by inspecting the string itself, e.g. searching for a substring or regex match. In that case, it's faster for the search to just return a byte index; there's no benefit to having it return a codepoint index, and then having to do an O(n) lookup when you try to use the index. And byte indices obtained that way will always be valid character boundaries, so you can use [] without worrying about panics.
You suggest just using a byte array instead, but then you'd lose the guarantee that what you're working with is valid Unicode. Contrary to your assertion, it is useful to have a type that provides that guarantee, yet which can still be operated on efficiently.
Panics are not unsafe. Panic exists in Rust because they are safe. If you don't want a panic on index, just don't index.
Indexing into a UTF-8 string doesn't serve any reasonable consistent purpose anyway, because it is an abstraction of text that doesn't provide support to the notion that a "character" is more fundamental than a word or paragraph, etc. Rust's string slicing exists solely to make ASCII text easy to handle. If your text is not ASCII, then you shouldn't be slicing it at all. Thus the panic.
> Indexing into a UTF-8 string doesn't serve any reasonable consistent purpose anyway
If that's true, isn't it the job of a type system to help avoid such nonsensical operations? If "slice" only makes sense for byte arrays and ASCII strings, it could be provided on those types without being defined on UTF-8 strings.
> Panics are not unsafe. Panic exists in Rust because they are safe.
That's "safe" by a very limited definition of safety. It's one step up from undefined behavior, granted, but it's not a very high standard. In practice, in most programs, you'd want to ensure that such a panic would never happen, and personally I think the language's unhelpfulness in that regard is a wart.
>If that's true, isn't it the job of a type system to help avoid such nonsensical operations?
It's not strictly true, because there are situations where you want to slice UTF-8. For instance, if you already know where the code point boundaries are for newlines. But if you know that, then you've run something like a regex with >O(1) behavior and you certainly wouldn't want string slicing to do redundant work.
> That's "safe" by a very limited definition of safety
That's the definition of safe that is used. Safety in the context of Rust means memory safety. (Division can panic, btw.) If you don't see why undefined behavior is so much worse than a panic, then do some research on it. If you want programs that never fail, you need a comprehensive plan that takes into account things like hardware failure. A programming language can't do that.
Go indexes bytes on strings, even though there's a builtin type called rune, which represents a Unicode code point. This is yet another footgun. Is there a language that doesn't handle this poorly?
In Rust, you're supposed to use `unicode-segmentation` [1] if you need to split on logical characters (grapheme clusters in the Unicode standard). Otherwise, the iterator `.bytes` emits raw bytes, and `.chars` emits Unicode code points.
Basically, string indexing is a lot harder than it seems at first glance, depending on what you want.
One nitpick: `.chars`[1] gives you an iterator[2] of `char`s[3], each of which is a 4-byte value holding a valid Unicode scalar value. This means that `"asdf".chars().collect()` will have a different size in memory than `"asdf"` and `"asdf".chars().as_str()`. `.chars()` will never give you an incomplete codepoint, but it will give you incomplete characters, as you could have many c̶̼̟̏ó̷̘̉n̴̖̞̏̇t̸̡̃ĭ̸̻̬n̴̯͉̂͑ṵ̴̑a̷̛̫̳ẗ̸͕́i̷̱̫̓̋ǫ̸̑ǹ̶̼̅s̸̩̾̌ to represent what is visually a single character.
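A concrete sketch of the size differences, and of codepoints vs. visible characters:

```rust
fn main() {
    let s = "asdf";
    assert_eq!(s.len(), 4); // 4 bytes in the &str
    let chars: Vec<char> = s.chars().collect();
    assert_eq!(chars.len() * std::mem::size_of::<char>(), 16); // 4 bytes per char

    // one visible character, but two Unicode scalar values
    let combining = "e\u{0301}"; // 'e' + combining acute accent: "é"
    assert_eq!(combining.chars().count(), 2);
    assert_eq!(combining.len(), 3); // 1-byte 'e' + 2-byte combining mark
}
```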
UTF-8 is at odds with efficient array indexing. I like Python's approach, where bytes and strings are distinct types, though I have no idea what it is doing under the hood.
Modern Python uses whatever representation is sufficient to ensure 1-unit-per-codepoint for a given string (which it can do on creation, since strings are immutable). So you get ASCII, UTF-16 sans surrogate pairs, or UTF-32.
This is great for high-level code, but painful to work with from native code, because it usually needs some specific encoding to call into other libraries, and it's usually UTF-8 - so you need to re-encode all the time.
I actually had to work with Python strings at the C level recently, and their approach is pretty clever. IIRC, the runtime can take any common form of Unicode, and will store it. When you access that string, the accessor requests a specific encoding, and the runtime will convert if need be, and then store it in the string object.
So it handles the (very) common case of needing the same encoding multiple times (e.g. for all file paths on Windows), while not introducing too much overhead in memory or speed.
I could be mistaken on exact details though, especially since I recall there being multiple implementations even within py3.x.
> A lot of great, well loved, crates don’t have a lot of Github stars.
This was a really important learning for me. When I'm looking for crates to solve a problem and there's only a handful of them, I almost always go through every single one of them, even if some have thousands and the others only tens of downloads.
It's an incredible feeling to find those unknown diamonds..
OP here -- this post is pretty dated. I currently use the OrderedFloat [1] crate to solve this problem which works quite well and plays nicely with NaN
To me this feels quite pedantic. Being able to do less than on NaNs and just having it do something vaguely sensible (as in your linked solution) is a far more common requirement than need to handle NaNs specially.
You could even say the reason NaN exists is so that you don't have to check for NaN constantly. Rust is being technically correct but practically really annoying, for basically no benefit.
I'm curious as to why it isn't implemented in hardware. Is it really so rare to need to sort floats, or so common to need a different ordering when you do?
Of course sorting floats happens a lot. In practice one rarely encounters NaN's and ±Inf's, so fast comparison for concrete values is the default. I don't know why the 'slow' total order is not implemented in hardware though.
But fortunately in comparison sort algorithms that run in O(n lg n) you can get away with doing an O(n) partitioning of the array into [-, +, NaN] and then applying a fast integer comparison operator to the negative values (-) and positive values (+).
In fact the above idea ties in neatly with QuickSort, which is already based on partitioning & sorting recursively.
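These days Rust also ships a software total order for floats, `f64::total_cmp` (stabilized well after this discussion), which makes sorting with NaNs a one-liner, though it doesn't use the partition trick above:

```rust
fn main() {
    let mut v = vec![3.0_f64, f64::NAN, -1.0, f64::INFINITY, 0.5];
    // IEEE 754 totalOrder: -NaN < -Inf < ... < +Inf < +NaN
    v.sort_by(f64::total_cmp);
    assert_eq!(&v[..4], &[-1.0, 0.5, 3.0, f64::INFINITY]);
    assert!(v[4].is_nan()); // positive NaN sorts last
}
```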
Yes, the literal < in the language induces a partial order by convention. What I'm getting at in my comment is that you can define a sensible total ordering.
The challenge becomes what code to emit when you see those operators: native target comparisons, or the software implementation of your total ordering? The latter is safe and slow and the former is fast and IMO idiomatic.
So since no one needs this often enough to emit the soft-float comparison code, we should emit the fast code. If folks need different behavior, they should use different types. This is similar to the behavior with integer overflow, which you can opt into by using checked types or checked operations. Though in Rust we have a convenience that the overflow-detecting code is emitted for debug targets.
It is possible to redefine NaN comparisons to something different from what IEEE 754 specifies, but it will be surprising to some users, and it will come at a performance cost, because you can no longer let the hardware handle all float comparisons directly.
Rust is one of the few times where the language/ecosystem does precisely what I wished it would do. For instance, being able to partially destructure JSON into a struct in a typesafe manner is just awesome.
I’m basically talking about Serde. Read the section titled “Parsing JSON as strongly typed data structures”. What’s nice is that you don’t have to include all the fields in the JSON in the struct. Serde will only give you the ones defined by the struct.
The issue does not mention memory safety, and neither did I. Honestly, knee-jerk reactions like "BUT MUH MEMORY SAFETY" don't inspire confidence, especially when it couldn't save that dev from the troubles and bugs documented in the issue. To quote a few:
> it was a hassle to track down because Rust itself didn't complain and the panic message during serialization wouldn't tell me which file of the hundreds of thousands was causing it to die. For lack of a purpose-built tool, I had to manually bisect it until I narrowed it down.
> That said, definitely a footgun in the standard library to be remedied.
> My main concern here is getting rid of the footgun if at all possible. I really don't want to have to maintain a special "Never allow these types to creep into structs I'm deriving Serialize/Deserialize on, because the compiler certainly won't warn you" audit list.
If that's considered safe in Rust's standards then I rest my case.
One thing Rust doesn't seem to be doing very well yet is guard clauses, specifically when handling Option<T>.
I've seen and appreciated the use of guard clauses in many languages, as a good way to quickly check for a few conditions at the top of a function, and return early if those conditions aren't met.
Since Option<T> is recommended in Rust, there are a lot of times you want to quickly return if `Some(x)` isn't there (i.e. it's `None`), and if it is, continue through the function without the unnecessary indentation from extra brackets.
There seems to be a good amount of smart discussion on handling those [1][2]. Some threads are more than a year old, but it seems to be making progress.
You can use ? on an Option if your type returns an Option. If it returns a Result, you can use ok_or()?, and at some point in the nearish future, you can just use ?.
I see — I'm not very familiar with either of those idioms, `?` and `ok_or()?`.
My current understanding is that those would return an Error only?
I was more describing cases where you do want to return, but not necessarily return an `Error`.
For instance in a simplified example function that returns a boolean, you could decide to return `false`. is it possible there?
// Function that returns a boolean value
fn is_equal_to_ten(n: Option<u32>) -> bool {
    // one-liner that checks for None; if it's not None, binds `x` when `n` matches `Some(x)`:
    let Some(x) = n else { return false; /* what to do in case it's a None */ };
    // `x` is available here:
    x == 10
}
The question mark operator works via a trait, Try. Both Option and Result implement Try. If you use ? on a None value, it will return None, just like using ? on an Err returns an Err.
ok_or is a method on Option that would let you manually convert it to a Result. You could then combine it with ?, turning a None into a specific Err.
It won’t help for stuff that returns bool, it’s true.
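A small sketch of the `ok_or` path (the error type is made up):

```rust
#[derive(Debug, PartialEq)]
struct MissingValue; // hypothetical error type

fn double_it(n: Option<u32>) -> Result<u32, MissingValue> {
    let x = n.ok_or(MissingValue)?; // None becomes Err(MissingValue)
    Ok(x * 2)
}

fn main() {
    assert_eq!(double_it(Some(3)), Ok(6));
    assert_eq!(double_it(None), Err(MissingValue));
}
```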
I'm using `.ok_or(SomeError)?`, which converts an Option into a Result and makes the Ok() value available for the rest of the scope, short-circuiting on Err(). I know about Swift's `guard` statement, but I've never yet had a case where I didn't want to return Result in this kind of code, so ok_or has worked well for me.
Related to the discussion in the second link, it sounds like Mr Pearce got to coin the phrase 'flow typing' to describe this situation but this is something people have been talking about for a long time.
Pseudocode:
if (foo is a String) {
foo.someStringMethod();
}
Flip that around a little bit:
if (foo is not a String) {
return "error";
}
foo.someStringMethod();
And you've got a guard clause that's fundamentally the same kind you're asking for. I've wanted this structure in a language for a very long time. I was happy to see it pop up in Kotlin and would love to see it in Rust as well.
Swift does, but typically for Optional promotion. You see stuff like `guard let delegate = delegate else { return }` inside functions all the time, shadowing a property with a local & promoting the type from Optional<T> to T.
It's not the same as the Rust example because you're shadowing an ivar to a local, but since `self.` is implicit you're still shadowing.
That all makes sense. I've only done a couple days of Rust, so my recollection is spotty. The shadowing makes sense because if a local variable was moved out, then you're not really shadowing it anymore. Is my understanding of that correct?
> do you mean that Rust's `self.` is implicit in Swift?
Swift's `self.` is implicit in Swift – in most contexts, to access a property the `self.` is not required. `self.name = "John"` and `name = "John"` are equivalent (assuming self is an object with a name property).
There are places where explicit `self.` is required though – when you want to differentiate between a shadowed local and a property (obviously), or when you're inside a closure (to make it clear that the closure is capturing self, not just capturing a reference to the property).
> The shadowing makes sense because if a local variable was moved out, then you're not really shadowing it anymore. Is my understanding of that correct?
This is kind of a philosophical corner: can you shadow something that isn't there anymore? Once you've moved out of something, if you attempt to use the old name, then you'll get a compiler error different from "no such variable", so it's still there in some sense.
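A tiny sketch of that corner (the `consume` function here is hypothetical): shadowing a moved-out name compiles fine, and if you deleted the second `let` and tried to use the old `s`, the compiler would complain about a use after move rather than an unknown variable:

```rust
fn consume(s: String) -> usize {
    s.len() // takes ownership; the caller's binding is moved out
}

fn main() {
    let s = String::from("hello");
    // `s` is moved into `consume`; the new `let` shadows the moved-out
    // name, so the old `s` is unreachable but still known to the compiler.
    let s = consume(s);
    assert_eq!(s, 5);
    println!("ok");
}
```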
I actually really try to avoid this pattern. For one, it pollutes the function scope with a shadowed binding, and it's pretty unnecessary boilerplate. Almost always you can map across the optional and use the unwrapped version as the block-scoped argument identifier, e.g. `$0`. When you can't, or you want to handle both cases, I find a simple `if let delegate = delegate {} else {}` is more explicit and scopes the binding to the block instead of polluting the current scope. I'm not saying `guard` is bad; I use it all the time when there are good names to bind things to. But I dislike how much it proliferates, and how often it doesn't actually provide value and just lets the program silently fail instead of loudly fail, which half the time is worse than a NullPointerException anyway...
let foo = "...";
let foo = parse(foo);
let foo = escaped(foo);
...
doSomethingWith(foo);
I don't see how this is helpful for avoiding the bug described. The most common bug with this type of code is mistaking which form "foo" represents at a given line of code, or that form changing as the code evolves. For example, if one programmer writes
let foo = "...";
let foo = parse(foo);
...
doBarWithFoo(foo);
and another programmer comes along, doesn't notice the call to doBarWithFoo, and needs an escaped version of foo:
let foo = "...";
let foo = parse(foo);
...
let foo = escaped(foo);
doBazWithFoo(foo);
...
doBarWithFoo(foo); // This still looks correct in isolation
This is a classic problem with mutable variables that frequently causes hard-to-spot bugs. Whereas if each form has a distinct meaningful name, this change wouldn't introduce a bug, and if somehow a bug were introduced, good names will make it possible to spot even considering the line in isolation:
doBarWithFoo(fooEscaped); // Hey! The input to doBarWithFoo shouldn't be escaped!
If the only useful form of foo is the final one, then you can avoid having accessible names for the invalid intermediate forms like this:
let foo = escaped(parse("..."));
Or like this for more complicated logic (I'm not a Rust programmer (yet) so this may not be the right syntax):
let foo = {
    // Complicated logic
    finalForm // no trailing semicolon: the block evaluates to this value
};
I feel like if I ever write a lot of code in Rust I'll find a linter rule that warns about shadowing variables and use it religiously. But maybe I'm missing a use case where it's crucial.
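For what it's worth, Clippy does ship allow-by-default lints for exactly this, which you can opt into per crate (lint names as they exist in Clippy today):

```rust
// In the crate root, opt in to Clippy's shadowing lints:
#![warn(clippy::shadow_same)]      // e.g. `let x = x;`
#![warn(clippy::shadow_reuse)]     // e.g. `let x = f(x);`
#![warn(clippy::shadow_unrelated)] // e.g. `let x = something_else;`
```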
I believe the way this is solved is by having the type of escaped(foo) be different from that of parse(foo), and only accepting an EscapedString in doBaz and a ParsedString in doBar.
Your type structure at no extra runtime cost is
String : ParsedString : EscapedString.
This ensures you don't escape strings before parsing them too. Nice!
I think the example only really works if `parse` and `escaped` return something other than a string. If that was the case, the type system will save you.
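A minimal sketch of that newtype idea, with all function names and the parsing/escaping logic as stand-ins: the wrappers cost nothing at runtime, and passing the wrong form becomes a compile error instead of a latent bug.

```rust
// Zero-cost newtype wrappers: each processing stage gets its own type.
struct ParsedString(String);
struct EscapedString(String);

fn parse(raw: &str) -> ParsedString {
    ParsedString(raw.trim().to_string()) // stand-in for real parsing
}

fn escaped(p: &ParsedString) -> EscapedString {
    EscapedString(p.0.replace('&', "&amp;")) // stand-in for real escaping
}

fn do_bar(p: &ParsedString) -> usize {
    p.0.len() // only accepts the parsed (unescaped) form
}

fn do_baz(e: &EscapedString) -> usize {
    e.0.len() // only accepts the escaped form
}

fn main() {
    let foo = parse("  a&b  ");
    assert_eq!(do_bar(&foo), 3);
    let esc = escaped(&foo);
    assert_eq!(do_baz(&esc), 7);
    // do_bar(&esc); // would not compile: expected ParsedString, found EscapedString
    println!("ok");
}
```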
I've not done any Rust dev before, but the whole variable-shadowing thing really scares me, in that it makes it not obvious where the initial declaration comes from. This seems difficult to reason about.
Two things: 1) the compiler doesn't really allow you to screw it up in really scary ways (though I suppose you can); 2) it allows you to rebind a variable name to a new type, or a new state, during execution without coming up with a hokey name for the new thing that's really just the original modified in some way.
It is shadowing (though it could be converted into mutation by a sufficiently smart compiler...). ReasonML does something similar with their take on Ocaml.
> Like Go, Rust can compile statically linked linux binaries.
The GNU C library (which programs in many languages, not just C, rely on) doesn't really support static linking, so the only way this is possible is to use another C library entirely, or raw inlined syscalls (where applicable).
Yeah, Rust's musl support is great. We use it heavily at work for all sorts of CLI tools. Many thanks to everybody who worked on this.
(And if you need to link against common C libraries like OpenSSL or PostgreSQL, I maintain a Docker image with the necessary C toolchains, and instructions on how to use it: https://github.com/emk/rust-musl-builder. There are a couple of similar images out there, too, I think.)
By default, Rust links dynamically against glibc, while the Rust code itself is statically linked. Which I think is a reasonably good default: most Linux systems provide glibc, and you can deploy binaries to them without those systems needing a Rust installation.
Of course, there's the usual issue with glibc symbol versioning, so you probably want to build your binaries on the oldest system you support (say, a CentOS 6 container).
Great job. Just a nit/request for that gif illustrating agrind's use -- I keep watching it to see what args you've passed to see how it's being used but the output finishes and loops before I understand what I'm seeing. Not sure if it's a property of the image or my browser but if you can turn off the looping in the image that might work better. Or add static frames at the end for the slow folks like me ;)