The whole point of Rust is memory safety and security, but I don’t care too much about either of them. I mostly work on offline desktop apps. When I do work on network exposed servers, I use C# which delivers even better safety and security for a reasonable performance overhead.
In pursuit of that memory safety, Rust sacrificed pretty much everything else.
The language is incredibly hard to learn. The compiler is even slower than C++’s. It’s hard to do R&D in Rust because memory safety turns minor changes into large refactors. The recent push to async/await is controversial, to say the least.
It’s impossible to implement any non-trivial data structures in safe Rust. In my line of work, designing data structures is more important than writing executable code, because the gap between RAM bandwidth and processor performance has been growing for decades now.
I found Rust easy to learn after ~25 years of C/C++. It was definitely easier than learning to use Python properly. Maybe it was because I was often writing multi-threaded shared-memory code in C++, and that forces you to think in a very Rust-like way.
Rust makes it difficult to implement data structures that rely on pointers. But in practice, you often want to minimize the use of actual pointers, even in data structures that use pointers on an abstract level. Pointers are large, following them is slow, and they encourage making many small allocations, which has a large space overhead. And if you define the interface in terms of pointers, you can't easily switch to an implementation where you can't point directly to any individual item stored in the structure.
> Maybe it was because I was often writing multi-threaded shared-memory code in C++
I’ve been writing C/C++ for a living for about 25 years, also often writing multi-threaded shared-memory code in C++. However, it seems I worked on very different projects, because I think in a very Rust-incompatible way.
Here’s an example. Quite often I need to compute long arrays of numbers, and the problem is parallel, like multiplication of large matrices. A good way to do that is slicing the output into blocks, and computing different blocks on different CPU cores, using OpenMP or some other thread pool. Different CPU cores need concurrent write access to the same vector, this is illegal in Rust.
> But in practice, you often want to minimize the use of actual pointers
In practice I often want actual pointers because graphs and trees are everywhere in computing. Many of them mutable, like DOM tree of this web page we’re visiting.
Pointer chasing is generally slow compared to arithmetic instructions, but much faster than hash maps, which can be used to implement the same thing. A hash map lookup chases at least one pointer, usually several (depending on the implementation), and before that it spends time computing the hash.
> they encourage making many small allocations
Pointers and allocations are orthogonal. It’s possible to design a pointer-based data structure like a tree or graph where the nodes are owned by containers, as opposed to by other nodes.
> Here’s an example. Quite often I need to compute long arrays of numbers, and the problem is parallel, like multiplication of large matrices. A good way to do that is slicing the output into blocks, and computing different blocks on different CPU cores, using OpenMP or some other thread pool. Different CPU cores need concurrent write access to the same vector, this is illegal in Rust.
This is actually easier in Rust than in C++, because of par_iter_mut() [1] from Rayon.
(In any case, usually if you want to do that sort of thing quickly then you'd use ndarray which can use BLAS.)
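A minimal sketch of that pattern (assuming the rayon crate; par_chunks_mut hands each task an exclusive mutable block of the output, so there is no aliasing for the borrow checker to object to):

    use rayon::prelude::*;

    // Fill a large output buffer in disjoint blocks, one block per task.
    fn compute_blocks(out: &mut [f32], block: usize) {
        out.par_chunks_mut(block)
            .enumerate()
            .for_each(|(i, chunk)| {
                for (j, x) in chunk.iter_mut().enumerate() {
                    *x = (i * block + j) as f32; // stand-in for the real per-block work
                }
            });
    }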
> Pointer chasing is generally slow compared to arithmetic instructions, but much faster than hash maps, which can be used to implement the same thing. A hash map lookup chases at least one pointer, usually several (depending on the implementation), and before that it spends time computing the hash.
Usually in Rust you use indices into arrays instead, which can be folded into the addressing mode on most architectures. If you really want to use a hash map, there's slotmap which precomputes the hash.
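For illustration, a common index-based layout in safe Rust (hypothetical Node/Tree types): all nodes live in one Vec, and the "pointers" are u32 indices into it, so traversal is an indexed load into a single allocation.

    struct Node {
        value: u64,
        children: Vec<u32>, // indices into Tree::nodes
    }

    struct Tree {
        nodes: Vec<Node>,
    }

    impl Tree {
        // Sum the subtree rooted at `root` by following indices instead of pointers.
        fn sum(&self, root: u32) -> u64 {
            let node = &self.nodes[root as usize];
            node.value + node.children.iter().map(|&c| self.sum(c)).sum::<u64>()
        }
    }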
> usually if you want to do that sort of thing quickly then you'd use ndarray which can use BLAS
In C++ I’d usually use Eigen, because its expression templates save memory allocations and the bandwidth spent storing/reloading temporary matrices. It’s sometimes much faster than BLAS libraries with a C API. I’m not sure Rust has an equivalent.
> indices into arrays instead, which can be folded into the addressing mode on most architectures
For some applications of graphs and trees it’s useful to have nodes polymorphic. An example is a visual tree in GUI: different nodes are instances of different classes. Array elements are of the same type.
On AMD64 that’s only true when the size of the elements being addressed is 1/2/4/8 bytes; the SIB byte only has 2 bits for the scale. For any other element size, addressing these arrays requires multiplying (or, if you’re lucky, at least left-shifting) these integers.
Even when the elements are 8 bytes so the indexing can be merged, you need to either spend a register for the base address, or load it from memory with another instruction.
It’s relatively expensive to split or merge linked lists/trees/graphs stored that way. If the tree/graph is long lived, mutable, and changes a lot, eventually you might need to compact or even garbage collect these arrays.
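To make "compact or even garbage collect these arrays" concrete, here is a rough sketch of a free-list slab (hypothetical Slab type, not a specific crate): removals leave holes that inserts reuse, and compaction squeezes the holes out, returning a remapping table so the caller can rewrite the indices stored in its nodes.

    struct Slab<T> {
        items: Vec<Option<T>>, // None marks a hole left by a removal
        free: Vec<usize>,      // indices of the holes, reused by insert()
    }

    impl<T> Slab<T> {
        fn new() -> Self {
            Slab { items: Vec::new(), free: Vec::new() }
        }

        fn insert(&mut self, value: T) -> usize {
            match self.free.pop() {
                Some(i) => { self.items[i] = Some(value); i }
                None => { self.items.push(Some(value)); self.items.len() - 1 }
            }
        }

        fn remove(&mut self, i: usize) -> Option<T> {
            let v = self.items[i].take();
            if v.is_some() { self.free.push(i); }
            v
        }

        // Drop the holes; the returned table maps old indices to new ones so that
        // any stored links can be rewritten after compaction.
        fn compact(&mut self) -> Vec<Option<usize>> {
            let mut remap = vec![None; self.items.len()];
            let mut next = 0;
            for (old, slot) in self.items.iter().enumerate() {
                if slot.is_some() {
                    remap[old] = Some(next);
                    next += 1;
                }
            }
            self.items.retain(|slot| slot.is_some());
            self.free.clear();
            remap
        }
    }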
> In C++ I’d usually use Eigen, because its expression templates save memory allocations and the bandwidth spent storing/reloading temporary matrices. It’s sometimes much faster than BLAS libraries with a C API. I’m not sure Rust has an equivalent.
That equivalent would be ndarray.
> For some applications of graphs and trees it’s useful to have nodes polymorphic. An example is a visual tree in GUI: different nodes are instances of different classes. Array elements are of the same type.
And in that case you can use Box (or Rc/Arc).
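A hedged sketch of that combination (hypothetical Widget trait): the nodes stay in a flat, index-linked array, and only the polymorphic payload sits behind a Box<dyn ...>.

    trait Widget {
        fn draw(&self);
    }

    struct Button;
    impl Widget for Button {
        fn draw(&self) { println!("button"); }
    }

    struct Label(String);
    impl Widget for Label {
        fn draw(&self) { println!("label: {}", self.0); }
    }

    // Visual-tree nodes stored by index; the boxed trait object carries the polymorphism.
    struct VisualNode {
        widget: Box<dyn Widget>,
        children: Vec<u32>, // indices into the node array
    }

    fn draw_all(nodes: &[VisualNode]) {
        for n in nodes {
            n.widget.draw();
        }
    }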
> Even when the elements are 8 bytes so the indexing can be merged, you need to either spend a register for the base address, or load it from memory with another instruction.
I've never seen this be a performance problem in practice; the cost of doing a shift and add is incredibly low compared to the cost of actually fetching the memory.
> It’s relatively expensive to split or merge linked lists/trees/graphs stored that way. If the tree/graph is long lived, mutable, and changes a lot, eventually you might need to compact or even garbage collect these arrays.
Which is the same thing modern thread-caching mallocs also have to do, except that compacting and garbage collecting is actually possible with the arena approach (not that I think it's terribly important either way).
It seems ndarray is conceptually similar to C libraries, it doesn’t have expression templates.
When you write r = a * b * c with ndarray, you allocate, store, and then load a temporary array holding a * b. When you write r = a * b * c with Eigen, depending on the types it sometimes skips the temporary and instead computes the complete expression in one shot. For some use cases, this tactic yields a substantial performance win.
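To show where that temporary comes from, here is a deliberately simplified element-wise example with plain Vec (not Eigen or ndarray, and not a matrix product): the naive version materializes a*b, the fused version does the whole expression in one pass.

    fn main() {
        let a = vec![1.0_f64; 1024];
        let b = vec![2.0_f64; 1024];
        let c = vec![3.0_f64; 1024];

        // Naive r = a * b * c: the a*b intermediate is stored and then reloaded.
        let tmp: Vec<f64> = a.iter().zip(&b).map(|(x, y)| x * y).collect();
        let r1: Vec<f64> = tmp.iter().zip(&c).map(|(t, z)| t * z).collect();

        // Fused: one pass over the inputs, no temporary array.
        let r2: Vec<f64> = a.iter().zip(&b).zip(&c).map(|((x, y), z)| x * y * z).collect();

        assert_eq!(r1, r2);
    }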
> Box (or Rc/Arc)
An array of boxes will cause another pointer chase: first to load the pointer, another one to reach the payload.
All data structures are compromises. If you want something (such as the ability to interleave queries and updates), you lose something else (such as performance or space-efficiency). Instead of using a single general-purpose data structure, I've found it useful to have several specialized structures making different compromises, with efficient conversions between them.
When it comes to graphs, the naive hash map representation has its uses. But I more often use representations based on conceptual arrays. The representations could be row-based or column-based, the arrays could store bytes or structs, and the concrete arrays could be vectors or something similar to B+ trees. Not all combinations are relevant, but several of them are.
And then there are overlays. If one representation is otherwise ideal but it doesn't support a specific operation (such as mapping between graph positions and positions on certain paths), an overlay can fix that. Another overlay could use the graph to represent a subgraph induced by certain nodes. And another could represent the same graph after a transformation, such as merging unary paths into single nodes.
When you have several graph implementations and overlays, the interface has to be pretty generic. You probably want to use either node identifiers or opaque handles. The latter could be node identifiers, array offsets, or pointers, depending on the concrete graph.
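A rough Rust sketch of such an interface (hypothetical Graph trait): the handle is an associated type, so one implementation can hand out indices and another can hand out identifiers or pointers without changing callers.

    trait Graph {
        type Handle: Copy;

        fn root(&self) -> Self::Handle;
        fn successors(&self, node: Self::Handle) -> Vec<Self::Handle>;
    }

    // One concrete graph: adjacency lists in a Vec, handles are plain indices.
    struct VecGraph {
        adj: Vec<Vec<u32>>,
    }

    impl Graph for VecGraph {
        type Handle = u32;

        fn root(&self) -> u32 { 0 }
        fn successors(&self, node: u32) -> Vec<u32> {
            self.adj[node as usize].clone()
        }
    }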
Memory safety and security are not "the whole point of Rust". It's a nice low-level language with proper sum types (!), a neat sum-type approach to errors, proper and easy to use iterators and much else, including nice tooling, dep management, very high standards in packages and docs overall, etc.
C++ lacks pretty much all of those, so memory safety or no, personally I can't picture myself choosing to do C++ for low-level work at will these days. Unless it's something really low-level like DPDK or GPU kernels.
> the language is incredibly hard to learn
Not really. A few months from zero and you'll be up and running, provided you have prior C++ experience (based on cases I've observed in teams I've worked in). Maybe not including async rust though.
> it's impossible to implement any non-trivial data structures in safe rust
ANY, really? Any examples? (Other than linked list or other self-dependent structures)
The usability is not great, too opinionated. For example, they have decided their strings are guaranteed to contain valid UTF-8. Many external components (like the Linux kernel) don’t have these guarantees, so Rust has a separate OsString type for them. I think Rust is the only language which does that. The rest of them are fine keeping invalid UTF-8 / UTF-16 in their strings.
> proper and easy to use iterators
C++11 introduced range-based for loops; they make iterators very easy to use.
> nice tooling
Packages are OK I guess, but I think other classes of tools are lacking. For example, I can’t imagine working on any performance critical low-level code without a good sampling profiler.
By comparison, modern C# is memory safe all the way down, possible to implement even very complicated data structures without a single line of unsafe code. The collection classes from the standard library don’t use unsafe code either.
> I think Rust is the only language which does that.
Python 3 strings are enforced valid Unicode (but not technically UTF-8, as they can contain unpaired surrogates).
> Packages are OK I guess, but I think other classes of tools are lacking. For example, I can’t imagine working on any performance critical low-level code without a good sampling profiler.
I've been using sampling profilers with Rust since before the language was publicly announced!
> Yes. Even very simple data structures like std::vec::Vec require unsafe code, they call unsafe APIs like alloc::alloc for memory management. This can, and did, cause security bugs:
This is yet another example of "Rust's liberal use of 'security bug' gives people the wrong impression". Although I wish they didn't, Rust users often use "security bug" to mean any memory safety problem in safe code. By this measure every C++ program in existence is a security bug. The question is whether any applications were exploitable because of this, and I don't know of any.
> By comparison, modern C# is memory safe all the way down, possible to implement even very complicated data structures without a single line of unsafe code.
Only because the runtime has hundreds of thousands of lines of unsafe code written in C++, including the garbage collector that makes it possible to write those collections.
> Even very simple data structures like std::vec::Vec require unsafe code, they call unsafe APIs like alloc::alloc for memory management. This can, and did, cause security bugs
That's way better than everyone writing their own unsafe vec implementation in C or C++, and then having to deal with those same security bugs over and over and over.
And I'm sure GNU's libstdc++, LLVM's libc++, and whatever Microsoft calls their C++ stdlib have had a laundry list of security bugs in their implementations of the fundamental data structures over time. It's just that they were found and fixed 20 years ago, in a time when the security issue du jour wasn't huge news like it is today.
> By comparison, modern C# is memory safe all the way down, possible to implement even very complicated data structures without a single line of unsafe code
You really can't compare a GC'd language to Rust (or C or C++), at least not without acknowledging the inherent tradeoffs being made. And obviously that C# runtime has a bunch of unsafe C/C++ underneath it that lets you do those things.
> everyone writing their own unsafe vec implementation in C or C++
No one does that in C++, people use std::vector
> I'm sure GNU's libstdc++, LLVM's libc++, and whatever Microsoft calls their C++ stdlib have had a laundry list of security bugs in their implementations
Admittedly, it’s not too important for standard libraries because most bugs have already been found and fixed by other people. Still, it’s very important when designing custom data structures.
> You really can't compare a GC'd language to Rust (or C or C++), at least not without acknowledging the inherent tradeoffs being made
For most use cases these tradeoffs are not that bad, especially with modern .NET. I have great experience using .NET Core on 32-bit ARMv7 embedded Linux (not a hobby project, but commercially available equipment), with a touch-screen GUI and networking. Another example: here’s a library implementing a subset of ffmpeg for ARM Linux in memory-safe C#: https://github.com/Const-me/Vrmac/tree/master/VrmacVideo
That’s a large library with tons of C++ which implement much more than just a video player. See readme on the main page of that repository.
The video player in the VrmacVideo subfolder consumes Linux kernel APIs like libc, V4L2 and ALSA. It also uses C libraries which implement audio decoders. Most of the unsafe code you see in that subfolder is required to interop with these things.
The player itself, including parsers for Mpeg4 and MKV containers, is memory-safe C# apart from very small pieces, most of them in that static class: https://github.com/Const-me/Vrmac/blob/master/VrmacVideo/Con... And these functions are only used by Mpeg4 container parser. The parser for MKV containers didn’t need any unsafe at all.
That project is an interesting showcase for .NET because many people incorrectly believe that handling high-bitrate realtime audio/video requires C or C++, and that high-level JIT-compiled languages with garbage-collected runtimes are a no-go for use cases like that. However, as you see from the readme, the performance is slightly better than VLC player running on the same device.
> performance is slightly better than VLC player running on the same device
Interesting, this was back then with .NET Core 2.1, right? Codegen quality has massively improved since then, the difference is particularly stark between 6, 7 and 8 if you use structs (7 -> 8 goes from simplistic enregistration to physical promotion that closes the gap with LLVM in some scenarios).
Yeah, .NET Core 2.1. The target device was Raspberry Pi 4 running Debian Linux.
The code uses many structs defined by Mpeg4, MKV, and V4L2 kernel API.
However, my C# code doesn’t do much work during playback. It loads slices of bytes from the input file, sends video frames to V4L2 (the OS exposes an asynchronous API and supports memory mapping, so I can read bytes directly into V4L2 buffers), and sends compressed audio to a dedicated audio thread, which decodes it to PCM format into a memory-mapped ALSA buffer. Both threads then sleep on poll(), listening on multiple handles each and dispatching these events.
I didn’t have a profiler on the Pi 4, but I expect the majority of CPU time during playback is spent in external unmanaged code: device drivers and audio codecs. I’m not sure a modern runtime is going to help substantially.
It will probably improve seek, and especially loading. These use cases are implemented with a large pile of C# code because media files (mpeg4 format in particular) are ridiculously complicated.
What tradeoffs are you referring to here, exactly? C# (in the right hands) is more flexible than what you seem to be implying here, and the GP is likely in the small minority of experts that knows how to leverage it.
The limiting "tradeoffs" with C# have much more to do with the JIT than anything related to a GC. That’s why you don’t see constant expressions (but in the future that may change).
This one specifically is just how C# is today. Otherwise, it's like saying that the lack or presence of C++ constexpr is an LLVM feature - JIT/ILC have little to do with it.
Keep in mind that const patterns are a relatively niche feature and won't be bringing a C++ level of constexpr to C#. As it is currently, it's not really an issue outside of the subset of C++ developers who adopt C# and try to solve problems the C++ way instead of leaning on the ways offered by C# or .NET (static readonly values are baked into codegen on recompilation, struct generics returning statically known values lead to the code getting completely optimized away constexpr-style, etc.)
As long as you don't fight the tool, it will serve you well.
> What tradeoffs are you referring to here, exactly? C# (in the right hands) is more flexible than what you seem to be implying here, and the GP is likely in the small minority of experts that knows how to leverage it.
It's flexible, but if you want to write something that is GC-less, you're on your own. Most libraries (if they exist) don't care about allocation. Or performance.
I'm not talking HFT here, I'm talking about performance intensive games.
> if you want to write something that is GC-less, you're on your own
GC allocations only harm performance when they happen often. It’s totally fine to allocate stuff on the managed heap at startup, initialization, re-initialization, or similar times.
Another thing, modern C# is very usable even without GC allocations. Only some pieces of the standard library allocate garbage (most notably LINQ), the majority of the standard library doesn’t create garbage. Moreover, I believe that percentage grows over time as the standard library adds API methods operating on modern things like Span<byte> in addition to the managed arrays. Spans can be backed by stack memory, or unmanaged heap, or even weird stuff like the pointers received from mmap() Linux kernel call to a device driver.
> I'm not talking HFT here, I'm talking about performance intensive games.
You want to minimize and ideally eliminate the garbage created by `operator new` during gameplay of a C# game. But it’s exactly the same in C++, you want to minimize and ideally eliminate the malloc / free runtime library calls during gameplay of a C++ game.
In practice, games use pools, per-frame arena allocators, and other tricks to achieve that. This is language agnostic, C++ standard library doesn’t support any of these things either.
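A minimal, language-agnostic sketch of the pooling idea, written in Rust for consistency with the rest of the thread (ScratchPool is a hypothetical name; real engines would typically use an existing pool or bump allocator): buffers are allocated once and recycled every frame instead of hitting the allocator in the hot loop.

    // Per-frame scratch buffers: take() at the start of a frame, give_back() at the end.
    struct ScratchPool {
        free: Vec<Vec<f32>>,
    }

    impl ScratchPool {
        fn new() -> Self {
            ScratchPool { free: Vec::new() }
        }

        // Hand out a cleared buffer, reusing a previous allocation when possible.
        fn take(&mut self) -> Vec<f32> {
            let mut v = self.free.pop().unwrap_or_default();
            v.clear();
            v
        }

        // Return the buffer so the next frame can reuse its capacity.
        fn give_back(&mut self, v: Vec<f32>) {
            self.free.push(v);
        }
    }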
Most libraries used in typical LoB or web apps — yeah, probably. However, in performance-critical C#, LINQ often isn’t even applicable, because the Span<T> and ReadOnlySpan<T> collections don’t implement the IEnumerable<T> interface required by the LINQ functions.
Here are a couple of examples of relatively complicated logic that uses loops instead of LINQ:
Transient allocations that do not outlive Gen0 collections are fine. Hell, they are mostly a problem in games that want to reach a very high-frequency main loop, like osu!, which runs at 1000 Hz. There are also multiple sets of libraries that target different performance requirements - there are numerous libraries made specifically for gamedev that are allocation-aware. The performance culture in general has significantly improved in recent years. But you wouldn't know that, as you don't actively use C#.
> Transient allocations that do not outlive Gen0 collections are fine.
Depends on the game. What if you have huge numbers of entities (100k), pulling huge numbers of fields from yaml (hundreds if not thousands) that live almost as long as the game (unless you use hot reload, or run admin commands)?
This is a problem SS14 (a multiplayer simulation game with ~100-200 players) had with YamlDotNet. To quote the devs: it's slow to parse and allocates out of the ass.
What I don't understand is what makes you think it's an issue with .NET or its ecosystem: if you continuously re-parse a large number of entities defined in YAML, then any remotely naive approach is going to be expensive regardless of the underlying stack. Rust, C++, or C# and their respective serialization libraries can only do so much as define the threshold beyond which the scaling becomes impossible; hell, there's a good chance that .NET's SRV GC has been mitigating an otherwise even worse failure mode caused by high allocation traffic, and if the data were shared, you'd hate to see the flame graphs of entities shared through Arc or shared_ptr.
SS14 seems to have quite complex domain problem to solve, so a requirement for an entirely custom extensive (and impressive) implementation might seem unfortunate but not uncharacteristic.
> What I don't understand is what makes you think it's an issue with .NET or its ecosystem.
AFAIK the issue remains unsolved to this day; I've seen them complain about YamlDotNet as recently as a couple of months ago.
But shouldn't there be a zero-copy solution? .NET is way older than Rust and has the necessary tools, but no libs. Hell, Java oftentimes has better-maintained libs. I recall looking for a Roaring Bitmaps implementation, and the Java one was both a port rather than a wrapper, and much better maintained.
> SS14 seems to have quite complex domain problem to solve
> Please put your FUD elsewhere
Sure, I agree on some level that the domain is complex, but in the context of your previous messages it's not FUD. If you intend to work on a game that pushes boundaries in complexity and you pick C#, you're going to have a bad time.
Like sure you can write Cookie clicker in Python, but that doesn't mean you can write Star Citizen 3 in it.
Problem is, can you tell at the start of a project how complex it will be?
Your arguments do not make sense; they are cherry-picked and do not look at the overall state of the ecosystem (I could bring up the woeful state of Rust's stdlib memchr and other routines, or C++ being way slower than expected in general-purpose code because, as it turns out, GCC can't solve a lack of hands, but I didn't, because all these languages have their merits, which you seem to be incapable of internalizing). Roaring bitmaps are not even properly implementable in Java in their optimal SIMD form, which they are in C#. It's a matter of doing it.
I think I understand your position now - you think the grass is greener on the other side, which it's not. There is probably little I can say that would change your mind; you already made it up and don't care about the actual state of affairs. The only recourse is to downvote and flag, which would be the accurate assessment of the quality of your replies. Though it is funny how consistent gamedev-adjacent communities are in having members that act like this. Every time this happens, I think to myself "must be doing games", and it always ends up being accurate.
> Your arguments do not make sense; they are cherry-picked and do not look at the overall state of the ecosystem
My argument is my (indirect) experience in the C# ecosystem. I'm of the firm belief that I don't have to preface everything with IMO. But for clarity, I will elaborate below.
By indirect I mean people on SS14 cursing YamlDotNet, and asking for more things to be written as a struct, with more stackalloc.
By my experience I mean dabbling in C#, trying to find solutions SS14 maintainers could use: trying to find acceptable Roaring and Fluent localization implementations.
> Roaring bitmaps are not even properly implementable in Java in their optimal SIMD form, which they are in C#. It's a matter of doing it.
Plain wrong. No, it doesn't depend on SIMD; look into the Roaring Bitmaps paper. It's possible to implement it without any SIMD. The basis of the algorithm is the hybridization of bitset storage into three container types: bitmap, array, and run.
C# didn't even have the "naive" implementation at the time of writing. It did have bindings, but those were a no-go.
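For reference, the hybridization described in the paper boils down to something like this (a simplified, hypothetical Rust sketch, not a real crate): each 2^16-value chunk of the key space picks whichever container representation is smallest.

    // Each container covers one 2^16-value chunk of the 32-bit key space.
    enum Container {
        Array(Vec<u16>),          // sorted list of present values: good when sparse
        Bitmap(Box<[u64; 1024]>), // 65536-bit bitmap: good when dense
        Runs(Vec<(u16, u16)>),    // (start, length) runs: good when values cluster
    }

    struct RoaringLike {
        // High 16 bits of a key select the container, low 16 bits index inside it.
        containers: Vec<(u16, Container)>,
    }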
> could bring up the woeful state of Rust's stdlib memchr
Anyone worth their salt in Rust would tell you to just use the memchr crate. We were discussing the ecosystem, not just the stdlib.
> I think I understand your position now - you think the grass is greener on the other side, which it's not.
Grass is greener? I'm not a C# dev. I don't pine for something I know less well than Java. The essence of it is: if you want a high-perf game engine, better go with Rust, Zig, or C++. If you want to use Godot or Unity, then C# is already enforced.
That aside.
Greener or not greener, the SS14 project was forced to write their own localization library from scratch. If they had enough mad people, they would have rewritten the YAML parser as well. And god knows what else.
I did look into efficient bitmaps for their PVS system. But promptly gave up. And that's the extent of my contributions to SS14.
> Though it is funny how consistent gamedev-adjacent communities are in having members that act like this
Game dev adjacent? Are you sure that's the right term? I'm not affiliated with SS14 at all.
Yes. I made some minor contributions to it, but I also contributed to cargo, Servo, and Yaml.
Does that make me Yaml-adjacent or Browser-adjacent?
I do have an idea of what it is to write a C# engine. I was involved in some minor parts and frequented the dev chat. It's messy, and you've got to NIH lots of stuff.
Samply is awesome. It's my go-to profiler: I can throw it at any unstripped binary on any OS that I use and then just drill down with the nice Firefox Profiler UI (particularly because it lets me navigate multi-threaded traces well).
In fact, its predominant use for me so far has been collecting "full" traces of C# code that show the exact breakdown between time spent in user code and time spent in write barriers and GC, since .NET's AOT compilation target produces "classic" native binaries with a regular object/symbol format for a given platform.
> The usability is not great, too opinionated. For example, they have decided their strings are guaranteed to contain valid UTF-8.
Wait, why is that a problem? Using UTF-8 only is the closest to programmer's ideal. Basically, you turn the UTF-8 validator for Rust's str into a no-op.
> Yes. Even very simple data structures like std::vec::Vec require unsafe code
I guess this is the sort of glass-half-full vs. glass-half-empty situation.
In C++, calling std::vector::pop_back on an empty vector will forever be undefined behavior. It's like putting razor blades on your screwdriver's handle: sure, you can avoid them, but why not do the sane thing...
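The half-full reading, for contrast: Rust's Vec::pop returns an Option, so the empty case is a value you have to handle rather than undefined behavior.

    fn main() {
        let mut v: Vec<i32> = Vec::new();
        match v.pop() {
            Some(x) => println!("got {x}"),
            None => println!("vector was empty"), // the case that would be UB with pop_back in C++
        }
    }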
> Wait, why is that a problem? Using UTF-8 only is the closest to programmer's ideal.
I have a lot of archived data with various unknown encodings. APIs that blindly try to force an encoding when all I have is "bytes" tend to be a major pain and I usually end up bruteforcing my way around them with things like base64.
If that's the case, the winning move is to get the bytes and either convert them to UTF-8 or just process them as bytes. &str is a wrapper around &[u8] with strict UTF-8 guarantees.
Modern systems should be able to convert various encodings at GiB/s into UTF8.
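Concretely, with nothing but the standard library, you can keep the raw bytes around and only commit to UTF-8 at the boundary, either strictly or lossily:

    fn main() {
        let raw: &[u8] = b"caf\xC3\xA9 and \xFF junk";

        // Strict: fails if the bytes are not valid UTF-8.
        match std::str::from_utf8(raw) {
            Ok(s) => println!("valid: {s}"),
            Err(e) => println!("invalid UTF-8 starting at byte {}", e.valid_up_to()),
        }

        // Lossy: replaces invalid sequences with U+FFFD and never fails.
        let s = String::from_utf8_lossy(raw);
        println!("lossy: {s}");
    }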
> If that's the case, the winning move is to get the bytes and either convert them to UTF-8
That would require knowing the original encoding.
> or just process them as bytes
As long as the APIs you have to use take bytes and not UTF-8 strings.
> Modern systems should be able to convert various encodings at GiB/s into UTF8.
They might even guess the correct source encoding a quarter of the time and it will break any legacy application that still has to interact with the data. I guess that would be a win-win situation for Rust.
Yes, you can try to automatically guess the wrong encoding based on statistics that only work some of the time when given large amounts of free flowing text.
Also, interesting post and discussion: https://news.ycombinator.com/item?id=40172033