Hacker News | aldanor's comments

They say that modern LLMs have reached the intelligence of older kids. Perhaps it's the other way around...


That's the same thought I had --- if humans are doing no better than LLMs, it won't be long before they're replaced with one. I've seen many similar instances with much older humans (adults), unfortunately.


Memory safety isn't "the whole point of Rust". It's a nice low-level language with proper sum types (!), a neat sum-type approach to errors, proper and easy-to-use iterators, and much else besides: nice tooling, dependency management, and very high standards in packages and docs overall.

C++ lacks pretty much all of those, so memory safety or not, I personally can't picture myself choosing C++ for low-level work these days, unless it's something extremely low-level like DPDK or GPU kernels.
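
For what it's worth, here's a minimal, purely illustrative sketch (plain std, all names mine) of the sum-type error handling plus iterator combination being described:

```rust
use std::num::ParseIntError;

// Result is an ordinary sum type; calling `sum()` on an iterator of Results
// short-circuits on the first Err instead of silently dropping it.
fn sum_of_ints(input: &str) -> Result<i64, ParseIntError> {
    input
        .split_whitespace()
        .map(|tok| tok.parse::<i64>()) // Iterator<Item = Result<i64, _>>
        .sum() // folds into a single Result
}

fn main() {
    assert_eq!(sum_of_ints("1 2 3"), Ok(6));
    assert!(sum_of_ints("1 two 3").is_err());
}
```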

> the language is incredibly hard to learn

Not really. A few months from zero and you'll be up and running, provided you have prior C++ experience (based on cases I've observed in teams I've worked in). Maybe not including async rust though.

> it's impossible to implement any non-trivial data structures in safe rust

ANY, really? Any examples? (Other than linked list or other self-dependent structures)


> a nice low-level language

The usability is not great; it's too opinionated. For example, they decided their strings are guaranteed to contain valid UTF-8. Many external components (like the Linux kernel) don't offer these guarantees, so Rust has a separate OsString type for them. I think Rust is the only language that does this; the rest are fine keeping invalid UTF-8 / UTF-16 in their strings.
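
To illustrate the String/OsString split being discussed (a hedged sketch; the OsStrExt part is Unix-only):

```rust
use std::ffi::{OsStr, OsString};
use std::os::unix::ffi::OsStrExt; // Unix-only: view arbitrary bytes as an OsStr

fn main() {
    // 0xFF is not valid UTF-8, but the kernel may hand us such a filename.
    let raw: &[u8] = b"file_\xff.txt";
    let name: OsString = OsStr::from_bytes(raw).to_os_string();

    // Converting to a guaranteed-UTF-8 &str is fallible, so the invalid
    // byte can't silently end up inside a String.
    assert!(name.to_str().is_none());

    // A lossy conversion replaces the bad byte with U+FFFD.
    assert_eq!(name.to_string_lossy().as_ref(), "file_\u{FFFD}.txt");
}
```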

> proper and easy to use iterators

C++11 introduced range-based for loops, which make iterators very easy to use.

> nice tooling

Packages are OK I guess, but I think other classes of tools are lacking. For example, I can’t imagine working on any performance critical low-level code without a good sampling profiler.

> ANY, really?

Yes. Even very simple data structures like std::vec::Vec require unsafe code, they call unsafe APIs like alloc::alloc for memory management. This can, and did, cause security bugs: https://shnatsel.medium.com/how-rusts-standard-library-was-v...
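
For context, here's roughly the flavor of unsafe code involved; this is a deliberately tiny, fixed-capacity sketch of my own, not Vec's actual implementation:

```rust
use std::alloc::{alloc, dealloc, Layout};

// A minimal fixed-capacity "vec" showing where unsafe creeps in:
// raw allocation, pointer writes, and manual deallocation.
struct TinyVec {
    ptr: *mut i32,
    cap: usize,
    len: usize,
}

impl TinyVec {
    fn with_capacity(cap: usize) -> Self {
        assert!(cap > 0);
        let layout = Layout::array::<i32>(cap).unwrap();
        // UNSAFE: alloc returns a raw, possibly-null pointer with no lifetime tracking.
        let ptr = unsafe { alloc(layout) as *mut i32 };
        assert!(!ptr.is_null(), "allocation failed");
        Self { ptr, cap, len: 0 }
    }

    fn push(&mut self, v: i32) {
        assert!(self.len < self.cap, "capacity exceeded");
        // UNSAFE: writing through a raw pointer; the bounds check above is
        // exactly the kind of invariant that, if wrong, becomes a memory bug.
        unsafe { self.ptr.add(self.len).write(v) };
        self.len += 1;
    }

    fn get(&self, i: usize) -> Option<i32> {
        if i < self.len {
            Some(unsafe { self.ptr.add(i).read() })
        } else {
            None
        }
    }
}

impl Drop for TinyVec {
    fn drop(&mut self) {
        let layout = Layout::array::<i32>(self.cap).unwrap();
        // UNSAFE: we must pass back the exact same layout we allocated with.
        unsafe { dealloc(self.ptr as *mut u8, layout) };
    }
}

fn main() {
    let mut v = TinyVec::with_capacity(4);
    v.push(1);
    v.push(2);
    assert_eq!(v.get(0), Some(1));
    assert_eq!(v.get(2), None);
}
```

The point is that the unsafe blocks are small and auditable, while everything built on top of the safe API stays checked by the compiler.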

By comparison, modern C# is memory-safe all the way down; it's possible to implement even very complicated data structures without a single line of unsafe code. The collection classes in the standard library don't use unsafe code either.


> I think Rust is the only language which does that.

Python 3 strings are enforced valid Unicode (but not technically UTF-8, as they can contain unpaired surrogates).

> Packages are OK I guess, but I think other classes of tools are lacking. For example, I can’t imagine working on any performance critical low-level code without a good sampling profiler.

I've been using sampling profilers with Rust since before the language was publicly announced!

> Yes. Even very simple data structures like std::vec::Vec require unsafe code, they call unsafe APIs like alloc::alloc for memory management. This can, and did, cause security bugs:

This is yet another example of "Rust's liberal use of 'security bug' gives people the wrong impression". Although I wish they didn't, Rust users often use "security bug" to mean any memory safety problem in safe code. By this measure every C++ program in existence is a security bug. The question is whether any applications were exploitable because of this, and I don't know of any.

> By comparison, modern C# is memory safe all the way down, possible to implement even very complicated data structures without a single line of unsafe code.

Only because the runtime has hundreds of thousands of lines of unsafe code written in C++, including the garbage collector that makes it possible to write those collections.


> Even very simple data structures like std::vec::Vec require unsafe code, they call unsafe APIs like alloc::alloc for memory management. This can, and did, cause security bugs

That's way better than everyone writing their own unsafe vec implementation in C or C++, and then having to deal with those same security bugs over and over and over.

And I'm sure GNU's libstdc++, LLVM's libc++, and whatever Microsoft calls their C++ stdlib have had a laundry list of security bugs in their implementations of the fundamental data structures over time. It's just that they were found and fixed 20 years ago, back when the security issue du jour wasn't huge news like it is today.

> By comparison, modern C# is memory safe all the way down, possible to implement even very complicated data structures without a single line of unsafe code

You really can't compare a GC'd language to Rust (or C or C++), at least not without acknowledging the inherent tradeoffs being made. And obviously that C# runtime has a bunch of unsafe C/C++ underneath it that lets you do those things.


> everyone writing their own unsafe vec implementation in C or C++

No one does that in C++; people use std::vector.

> I'm sure GNU's libstdc++, LLVM's libc++, and whatever Microsoft calls their C++ stdlib have had a laundry list of security bugs in their implementations

True, C++ shares that thing with Rust. However, C# doesn’t. Their standard library is implemented in memory-safe code. Here’s the equivalent of std::vector, BTW https://source.dot.net/#System.Private.CoreLib/src/libraries...

Admittedly, it’s not too important for standard libraries, because most bugs have already been found and fixed by other people. Still, it’s very important when designing custom data structures.

> You really can't compare a GC'd language to Rust (or C or C++), at least not without acknowledging the inherent tradeoffs being made

For most use cases these tradeoffs are not that bad, especially with modern .NET. I have great experience using .NET Core on 32-bit ARMv7 embedded Linux (not a hobby project, a commercially available equipment), with touch screen GUI and networking. Another example, here’s a library implementing a subset of ffmpeg for ARM Linux in memory-safe C#: https://github.com/Const-me/Vrmac/tree/master/VrmacVideo


> Another example, here’s a library implementing a subset of ffmpeg for ARM Linux in memory-safe C#

What's with all those "unsafe" keywords then? https://github.com/search?q=repo%3AConst-me%2FVrmac+unsafe&t...


That’s a large library with tons of C++ that implements much more than just a video player. See the readme on the main page of that repository.

The video player in the VrmacVideo subfolder consumes Linux kernel APIs like libc, V4L2 and ALSA. It also uses C libraries which implement audio decoders. Most of the unsafe code you see in that subfolder is required to interop with these things.

The player itself, including parsers for Mpeg4 and MKV containers, is memory-safe C# apart from very small pieces, most of them in that static class: https://github.com/Const-me/Vrmac/blob/master/VrmacVideo/Con... And these functions are only used by Mpeg4 container parser. The parser for MKV containers didn’t need any unsafe at all.

That project is an interesting showcase for .NET because many people incorrectly believe that handling high-bitrate realtime audio/video requires C or C++, and that high-level JIT-compiled languages with garbage-collected runtimes are a no-go for use cases like that. However, as you can see from the readme, the performance is slightly better than the VLC player running on the same device.


> performance is slightly better than VLC player running on the same device

Interesting, this was back then with .NET Core 2.1, right? Codegen quality has massively improved since then, the difference is particularly stark between 6, 7 and 8 if you use structs (7 -> 8 goes from simplistic enregistration to physical promotion that closes the gap with LLVM in some scenarios).


Yeah, .NET Core 2.1. The target device was Raspberry Pi 4 running Debian Linux.

The code uses many structs defined by Mpeg4, MKV, and the V4L2 kernel API. However, my C# code doesn’t do much work during playback. It loads slices of bytes from the input file, sends video frames to V4L2 (the OS exposes an asynchronous API and supports memory-mapping, so I can read bytes directly into V4L2 buffers), and sends compressed audio to a dedicated audio thread, which decodes it to PCM into a memory-mapped ALSA buffer. Both threads then sleep on poll(), each listening on multiple handles and dispatching events.

I didn’t have a profiler on the Pi 4, but I expect the majority of CPU time during playback is spent in external unmanaged code, device drivers, and audio codecs. I’m not sure a modern runtime is going to help substantially.

It will probably improve seek, and especially loading. These use cases are implemented with a large pile of C# code because media files (mpeg4 format in particular) are ridiculously complicated.


What tradeoffs are you referring to here, exactly? C# (in the right hands) is more flexible than what you seem to be implying here, and the GP is likely in the small minority of experts that knows how to leverage it.

The limiting ”tradeoffs” with C# have much more to do with the JIT than anything related to a GC. That’s why you don’t see constant expressions (but in the future that may change).

https://github.com/dotnet/csharplang/issues/6926#issuecommen...


This one specifically is just how C# is today. Otherwise, it's like saying that the lack or presence of C++ constexpr is an LLVM feature - the JIT/ILC have little to do with it.

Keep in mind that const patterns are a relatively niche feature and won't bring C++'s level of constexpr to C#. As it is currently, it's not really an issue outside of the subset of C++ developers who adopt C# and try to solve problems the C++ way instead of leaning on the ways offered by C# or .NET (static readonly values are baked into codegen on recompilation, struct generics returning statically known values lead to the code getting completely optimized away constexpr-style, etc.)

As long as you don't fight the tool, it will serve you well.


> What tradeoffs are you referring to here, exactly? C# (in the right hands) is more flexible than what you seem to be implying here, and the GP is likely in the small minority of experts that knows how to leverage it.

It's flexible, but if you want to write something that is GC-less, you're on your own. Most libraries (if they exist) don't care about allocation. Or performance.

I'm not talking HFT here, I'm talking about performance intensive games.


> if you want to write something that is GC-less, you're on your own

GC allocations only harm performance when they happen often. It’s totally fine to allocate stuff on managed heap on startup, initialization, re-initialization, or similar.

Another thing: modern C# is very usable even without GC allocations. Only some pieces of the standard library allocate garbage (most notably LINQ); the majority of it doesn’t create garbage. Moreover, I believe that percentage grows over time as the standard library adds API methods operating on modern things like Span<byte> in addition to managed arrays. Spans can be backed by stack memory, unmanaged heap, or even weird stuff like pointers received from an mmap() Linux kernel call to a device driver.

> I'm not talking HFT here, I'm talking about performance intensive games.

You want to minimize and ideally eliminate the garbage created by `operator new` during gameplay of a C# game. But it’s exactly the same in C++, you want to minimize and ideally eliminate the malloc / free runtime library calls during gameplay of a C++ game.

In practice, games use pools, per-frame arena allocators, and other tricks to achieve that. This is language agnostic, C++ standard library doesn’t support any of these things either.
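
The per-frame arena idea really is language-agnostic; here's a minimal bump-allocator sketch (in Rust, names and design mine, purely illustrative):

```rust
// Per-frame bump arena: allocate by advancing an offset,
// "free" everything at once with reset() at the end of the frame.
struct FrameArena {
    buf: Vec<u8>,
    offset: usize,
}

impl FrameArena {
    fn with_capacity(cap: usize) -> Self {
        Self { buf: vec![0; cap], offset: 0 }
    }

    // Hand out a slice of `n` bytes, or None if the frame budget is spent.
    fn alloc(&mut self, n: usize) -> Option<&mut [u8]> {
        if self.offset + n > self.buf.len() {
            return None;
        }
        let start = self.offset;
        self.offset += n;
        Some(&mut self.buf[start..start + n])
    }

    // Called once per frame: frees every allocation in O(1),
    // with no per-object malloc/free traffic.
    fn reset(&mut self) {
        self.offset = 0;
    }
}

fn main() {
    let mut arena = FrameArena::with_capacity(8);
    assert!(arena.alloc(6).is_some());
    assert!(arena.alloc(6).is_none()); // frame budget spent
    arena.reset(); // next frame
    assert!(arena.alloc(6).is_some());
}
```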


> most notably LINQ

That's the culprit and most libraries use the hell out of it. I mean it's a widely touted feature of C#, it makes sense to use it.

Which is a problem once it becomes one of your dependencies and starts hogging your memory.

What to do to offending library:

A) Rewrite it? B) Work around it? C) Cry?


> most libraries use the hell out of it

Most libraries used in a typical LoB or web apps — yeah, probably. However, in performance-critical C# LINQ ain’t even applicable because Span<T> and ReadOnlySpan<T> collections don’t implement the IEnumerable<T> interface required by the LINQ functions.

Here are a couple of examples of relatively complicated logic that uses loops instead of LINQ:

https://github.com/Const-me/Vrmac/blob/master/VrmacVideo/Aud...

https://github.com/Const-me/Vrmac/blob/master/VrmacVideo/Con...


Transient allocations that do not outlive Gen0 collections are fine. Hell, they are mostly a problem in games that want to reach a very high-frequency main loop, like osu!, which runs at 1000 Hz. There are also multiple sets of libraries that target different performance requirements - numerous libraries made specifically for gamedev are allocation-aware. The performance culture in general has significantly improved in recent years. But you wouldn't know that, as you don't actively use C#.

Please put your FUD elsewhere.


> Transient allocations that do not outlive Gen0 collections are fine.

Depends on the game. What if you have huge numbers of entities (100k), pulling huge numbers of fields from yaml (hundreds if not thousands) that live almost as long as the game (unless you use hot reload, or run admin commands)?

This is a problem SS14 (a multiplayer simulation game with ~100-200 players) had with YamlDotNet. To quote the devs: it's slow to parse and allocates out of the ass.


I think I found the issue you are referring to: https://github.com/space-wizards/RobustToolbox/issues/891 (2019) and PRs that address it https://github.com/space-wizards/RobustToolbox/pull/1606 (2021), https://github.com/space-wizards/space-station-14/pull/3491 (2021)

What I don't understand is what makes you think it's an issue with .NET or its ecosystem: if you continuously re-parse a large number of entities defined in YAML, then any remotely naive approach is going to be expensive regardless of the underlying stack. Rust, C++, or C# and their respective serialization libraries can only do so much as move the threshold beyond which the scaling becomes impossible. Hell, there's a good chance that .NET's SRV GC was mitigating an otherwise even worse failure mode caused by high allocation traffic, and if the data were shared, you'd hate to see the flamegraphs of entities shared through Arc or shared_ptr.

SS14 seems to have quite a complex domain problem to solve, so the need for an entirely custom, extensive (and impressive) implementation might seem unfortunate, but it isn't uncharacteristic.


> What I don't understand is what makes you think it's an issue with .NET or its ecosystem.

AFAIK the issue remains unsolved to this day; I've seen them complain about YamlDotNet as recently as a couple of months ago.

But shouldn't there be a zero-copy solution? .NET is way older than Rust and has the necessary tools, but no libs. Hell, Java oftentimes has better-maintained libs. I recall looking for a Roaring Bitmaps implementation, and the Java one was both a port (rather than a wrapper) and much better maintained.

> SS14 seems to have quite complex domain problem to solve

> Please put your FUD elsewhere

Sure I agree on some level that domain is complex, but in context of your previous messages it's not FUD. If you intend to work on a game that pushes boundaries in complexity and you pick C#, you're going to have a bad time.

Like sure you can write Cookie clicker in Python, but that doesn't mean you can write Star Citizen 3 in it.

Problem is, can you tell at the start of a project how complex it will be?


Your arguments don't make sense; they are cherry-picked and don't look at the overall state of the ecosystem (I could bring up the woeful state of Rust's stdlib memchr and other routines, or C++ being way slower than expected in general-purpose code because it turns out GCC can't solve a lack of hands, but I didn't, because all these languages have their merits, which you seem incapable of internalizing). Roaring Bitmaps is not even properly implementable in Java in its optimal SIMD form, which it is in C#. It's a matter of doing it.

I think I understand your position now - you think the grass is greener on the other side, which it's not. There is probably little I can say that would change your mind, you already made it and don't care about actual state of affairs. The only recourse is downvote and flag, which would be the accurate assessment of the quality of your replies. Though it is funny how consistent gamedev-adjacent communities are in having members that act like this. Every time this happens, I think to myself "must be doing games", and it always ends up being accurate.


> Your arguments do not make sense, they are cherry picked and do not look on the overall state of ecosystem

My argument is my (indirect) experience in the C# ecosystem. I firmly believe I don't have to preface everything with IMO, but for clarity, I'll clarify below.

By indirect I mean people on SS14 cursing YamlDotNet, and asking for more things to be written as a struct, with more stackalloc.

By my experience I mean dabbling in C#, trying to find solutions the SS14 maintainers could use: acceptable Roaring Bitmaps and Fluent localization implementations.

> Roaring bitmaps is not even properly implementable in Java in regards to optimal SIMD form, which it is in C#. It's a matter of doing it.

Plain wrong. No, it doesn't depend on SIMD; look into the Roaring Bitmaps paper. It's possible to implement it without any SIMD. The basis of the algorithm is hybrid storage: a bitset is stored in one of three container types - bitmap, array, or run.

C# didn't even have the "naive" implementation at time of writing. It did have bindings, but those were a no go.

> could bring up the woeful state of Rust's stdlib memchr

Anyone worth their salt in Rust would tell you to just use the memchr crate. We were discussing the ecosystem, not just the stdlib.

> I think I understand your position now - you think the grass is greener on the other side, which it's not.

Grass is greener? I'm not a C# dev. I don't pine for something I know less well than Java. The essence of it is: if you want a high-perf game engine, better go with Rust, Zig, or C++. If you want to use Godot or Unity, then C# is already enforced.

That aside.

Greener or not greener, the SS14 project was forced to write their own localization library from scratch. If they had enough mad people, they would have rewritten the YAML parser as well. And god knows what else.

I did look into efficient bitmaps for their PVS system. But promptly gave up. And that's the extent of my contributions to SS14.

> Though it is funny how consistent gamedev-adjacent communities are in having members that act like this

Game dev adjacent? Are you sure that's the right term? I'm not affiliated with SS14 at all.

Yes. I made some minor contributions to it, but I also contributed to cargo, Servo, and Yaml.

Does that make me YAML-adjacent or browser-adjacent?


Maybe you shouldn’t talk about something you have little idea about then?


About what?

I do have an idea of what it is to write a C# engine. I was involved in some minor parts and frequented the dev chat. It's messy, and you've got to NIH lots of stuff.

I also got to port some zero copy parsers to C#.


> For example, I can’t imagine working on any performance critical low-level code without a good sampling profiler.

I'm pretty happy with https://github.com/mstange/samply.

It worked out of the box on Linux and macOS, and even on Windows after installing directly from the repo (a recently added feature).


Samply is awesome. It's my go-to profiler that I can throw at any un-stripped binary on any OS I use and then just drill down with the nice Firefox Profiler UI (in particular, it makes navigating multi-threaded traces easy).

In fact, its predominant use for me so far has been collecting "full" traces of C# code that show the exact breakdown between time spent in user code and in write barriers and GC, since .NET's AOT compilation target produces "classic" native binaries with the regular object/symbol format for a given platform.


> The usability is not great, too opinionated. For example, they have decided their strings are guaranteed to contain valid UTF-8.

Wait, why is that a problem? Using UTF-8 only is the closest to programmer's ideal. Basically, you turn the UTF-8 validator for Rust's str into a no-op.

> Yes. Even very simple data structures like std::vec::Vec require unsafe code

I guess this is the sort of glass-half-full vs. glass-half-empty thing. In C++, std::vector::pop_back on an empty vector will forever be undefined behavior. It's like putting razors on your screwdriver's handle: sure, you can avoid them, but why not do the sane thing...
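
Concretely, the Rust side of that comparison (std only):

```rust
fn main() {
    let mut v: Vec<i32> = vec![42];
    assert_eq!(v.pop(), Some(42));
    // In C++, pop_back() on an empty std::vector is undefined behavior;
    // Rust's Vec::pop just returns None, so the empty case is a value, not UB.
    assert_eq!(v.pop(), None);
}
```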


> Wait, why is that a problem? Using UTF-8 only is the closest to programmer's ideal.

I have a lot of archived data with various unknown encodings. APIs that blindly try to force an encoding when all I have is "bytes" tend to be a major pain and I usually end up bruteforcing my way around them with things like base64.


If that's the case, the winning move is to get the bytes and either convert them to UTF-8 or just process them as bytes. &str is a wrapper around &[u8] with a strict UTF-8 guarantee.

Modern systems should be able to convert various encodings into UTF-8 at GiB/s.
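
A small std-only sketch of both options (strict conversion vs. lossy replacement; the sample bytes are my own):

```rust
fn main() {
    // Valid UTF-8 "café", then a stray Latin-1 0xE9 byte.
    let bytes: &[u8] = b"caf\xC3\xA9 vs caf\xE9";

    // Strict conversion fails on the lone 0xE9.
    assert!(std::str::from_utf8(bytes).is_err());

    // Lossy conversion keeps the valid parts and replaces the rest with U+FFFD.
    let lossy = String::from_utf8_lossy(bytes);
    assert_eq!(lossy.as_ref(), "café vs caf\u{FFFD}");
}
```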


> If that's the case the winning move is get the bytes, and convert them to UTF8

That would require knowing the original encoding.

> or just process it as bytes

As long as the APIs you have to use take bytes and not UTF-8 strings.

> Modern systems should be able to convert various encodings at GiB/s into UTF8.

They might even guess the correct source encoding a quarter of the time and it will break any legacy application that still has to interact with the data. I guess that would be a win-win situation for Rust.


> That would require knowing the original encoding.

If you don't know, that's one more reason to get the bytes and try to figure out the encoding, usually using a lib like encodings.rs or WTF8.rs.

> As long as the APIs you have to use take bytes and not utf8 strings.

You can convert one into the other, albeit converting to str requires a check.


> That would require knowing the original encoding.

Or just use a library that can detect the encoding and spit out UTF-8. There are several of those.


Yes, you can try to automatically guess the wrong encoding based on statistics that only work some of the time when given large amounts of free flowing text.


That's why byte strings exist. You don't know the encoding? Well, it's a binary chunk and nothing else.


> I can’t imagine working on any performance critical low-level code without a good sampling profiler.

cargo-flamegraph: am I a joke to you?

https://github.com/flamegraph-rs/flamegraph


They were smart enough to invent a new term, Apple Intelligence. You know, AI


It's referring to the whole platform: the partially-on-device, partially-in-their-private-cloud thing.

During the presentation itself, they talked about machine learning, not "AI".


Arrow is columnar storage, nothing relevant here.

And capnproto is basically another version of pb... (see discussion above)


Look again at Arrow. Part of the concept is that the wire format is the same as the in-memory format: no de/serialization. Arrow isn't intended for storage; they point to Parquet for that.

Apache Arrow defines an inter-process communication (IPC) mechanism to transfer a collection of Arrow columnar arrays (called a “record batch”). It can be used synchronously between processes using the Arrow “stream format”, or asynchronously by first persisting data on storage using the Arrow “file format”.

The Arrow IPC mechanism is based on the Arrow in-memory format, such that there is no translation necessary between the on-disk representation and the in-memory representation. [0]

The Arrow spec aligns columnar data in memory to minimize cache misses and take advantage of the latest SIMD (single instruction, multiple data) and GPU operations on modern processors. [1]

0. https://arrow.apache.org/faq/

1. https://arrow.apache.org/docs/js/


Sufficiently BSD-like :)


Roguelikes are probably the best fit.


Too many games that call themselves "roguelike" these days are nothing at all like Rogue, though. I'm not exactly sure what the term really means anymore.


Yes, it doesn't have much to do with Rogue anymore (except for a few games). These days the meaning is along the lines of: the game consists of many repeating runs, each reasonably short (0.5-1h); failure is normal; you progress in many different ways, both within each run and globally, even when you fail; there's a predominant element of randomness; and there may be some element of exploration.


Right?! Cataclysm DDA is the first thing I think of when I think roguelike. And then someone says Slay the Spire is roguelike because, I dunno, you progress in ability. I suppose I'm just too old.


If Firefox natively integrated Tree Style Tab or Sidebery and made it lightning fast, searchable, and syncable, I would probably return to FF that day.

Definitely not yet another "reduction of visual clutter".


German, and Russian, and a few other languages...


Feel thoughts or think feelings?


Or, use sonic_rs::Value, which uses SIMD and arena allocation for keys; in my case, for big payloads, it was 8x faster than serde_json.

Or, use sonic_rs::LazyValue if you want to parse json on demand as you traverse it.


Hey, that's a nifty looking library. Thanks for pointing it out to me!

