
> Huge. Only checking array bounds on every access degrades performance considerably.

Citation needed, since all evidence points to the contrary.

Could you please point us to a Rust application (there are hundreds of thousands at this point) that gets noticeably faster when disabling bounds checks?

In Servo, a whole web browser written in Rust, the cost of doing this was negligible, to the point that it was barely measurable (1-3%, noise levels for such a big app).

Same for Firefox, which has a substantial amount of Rust.

Go ahead and give Fuchsia a try. You can enable bounds checks for a substantial part of Android's user space and not really notice it.

Same for Redox, or any operating system kernel written in Rust.

You have many large applications to choose from, so please, just point us to one for which this is the case.

---

Compared with other mitigations already in the kernel, which can cost you up to 50% in performance and which people seem to be OK with, bounds checking all array accesses seems like a no-brainer, given that ~70% of CVEs are caused by memory safety issues.

When most people think about bounds checking all array accesses, they think, for some "I can only think inside the box" reason, that this has to happen in hardware, for every single memory access.

But that is not how Rust works. Rust adds bounds checks "in Rust", and the Rust compiler and LLVM are really good at removing duplicate checks, hoisting many checks out of a loop into a single check at the beginning of the loop, and so on.
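As a rough sketch of what that looks like (made-up function, not from any particular codebase), one slice up front gives the optimizer a single check it can keep, and the per-iteration checks can typically be proven redundant and dropped:

    // Hypothetical example: one up-front check; the per-element checks inside
    // the loop can then usually be proven redundant and removed by LLVM.
    fn sum_first_n(data: &[u64], n: usize) -> u64 {
        let data = &data[..n];   // single bounds check (panics if n > data.len())
        let mut total = 0;
        for i in 0..data.len() {
            total += data[i];    // provably in range, so no residual check needed
        }
        total
    }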

People also think that this is an all-or-nothing approach, but Rust lets you access elements without bounds checks and do the hoisting manually. So if you find a function in which the checks make a big difference, you can just fix the performance issue there by hand, as in the sketch below.
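Something like this (again a hypothetical hot function) is the escape hatch being described: assert the invariant once yourself, then opt out of the automatic checks in just that one spot:

    // Hypothetical hot function: hoist the check by hand, then opt out locally.
    fn sum_unchecked(data: &[u64], n: usize) -> u64 {
        assert!(n <= data.len());   // the manually hoisted bounds check
        let mut total = 0;
        for i in 0..n {
            // SAFETY: i < n and n <= data.len() was asserted above.
            total += unsafe { *data.get_unchecked(i) };
        }
        total
    }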




For the optimisation, the compiler will even reason that e.g. iterating over the vector necessarily involves knowing the size of the vector and stopping before the end, so it doesn't need to add the bounds check at all because that's redundant. This is easier in Rust because the compiler knows nobody else can be mutating this vector while you're iterating over it - that's forbidden in the language.
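In code, that is just the idiomatic loop (a generic sketch, nothing project-specific):

    // The iterator already knows the vector's length, and the shared borrow of
    // `v` guarantees nothing can mutate (and so resize) it mid-loop, so there
    // is no bounds check left for the compiler to emit.
    fn total(v: &[u64]) -> u64 {
        let mut sum: u64 = 0;
        for &x in v {
            sum += x;
        }
        sum
    }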

So, in general, the idiomatic Rust "twiddle all these doodads" loop compiles to the same machine code as the idiomatic C++ for that problem, even though Rust bounds checked it and C++ didn't care. Lots of Rust checks are like this: they compile away to nothing, so long as what you did is necessarily correct. The Option<NonZeroU64> stuff a few days ago is another example. Same machine code as a C++ long integer using zero as a signal value, but with type safety.
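A quick standalone snippet to see that layout claim for yourself:

    use std::mem::size_of;
    use std::num::NonZeroU64;

    // Because NonZeroU64 can never hold zero, Option<NonZeroU64> reuses the
    // all-zero bit pattern for None, so the whole thing is still 8 bytes: the
    // same layout as a bare 64-bit integer with zero as the sentinel, but the
    // type system forces you to handle the None case.
    fn main() {
        assert_eq!(size_of::<Option<NonZeroU64>>(), size_of::<u64>()); // both 8
    }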


Same applies when using C++ with bounds checking enabled, but the FUD regarding bounds checking is deep.


What happens with mutation in the C++ case, though? Maybe I should just try it in Godbolt instead of bothering people on HN.


What mutation?

Naturally this only kind of works when everyone on the team goes safety first.

Doing some C-style coding will just bork it, similarly to any unsafe block or FFI call in other, better-suited languages.

But when it comes to making lemonade out of lemons, it is way better than plain C.


Right, so long as there isn't mutation, we're golden, which is why the machine code is the same.

This is, after all, why Godbolt was first invented, as I suspect you know (Matt Godbolt wondered if C++ iterators really do produce the same machine code as a hand-rolled C-style for loop, and rather than just trust an expert he built the earliest Compiler Explorer to show that yes, with real C++ code you get the same machine code, and any time spent hand-rolling such loops is time wasted).


Yeah, but since there are domains where C++ is unavoidable, this is the best we can do.

By the way, this should be a nice update on the state of affairs on Android (I have yet to watch it).

"Improving Memory Safety in Android 12 Using MTE"

https://devsummit.arm.com/en/sessions/57


I don't understand what you mean by "domains where C++ is unavoidable" in this context. C++ is a choice, presumably usually a reasonable choice, but a choice, so if they wanted to, people could avoid it.

Memory tagging (which is what MTE is about) reminds me of ASLR and password entropy requirements. They're slightly raising the bar which is not something I have much time for. I prefer to put the effort in to solve problems permanently so I can worry about something else instead. Whether that's a practical opportunity here is unclear though, and I think Rust is a big part of finding out.


I did once have a bizarre situation where removing a bounds check that always succeeded degraded performance by over 30%.

The bounds check wasn't being elided either. I checked, and it was there in the assembly, so I figured the function was so hot that an unchecked access might help things. Apparently not. The only thing I can think of is that the reduction in code size for that function had an unintended effect elsewhere, either on the optimizer or because it resulted in a hot bit of code crossing a cache line?


Or the always-succeeding check helped the CPU's branch predictor substantially.



