
So, I had been thinking about this post for a while, and Blow's video caused some more discussion that made me post it. But it's not a direct response: I still haven't watched the video, so I don't know what he actually said. If I wanted it to be a response, I would have linked to it.

> The symptoms are changed, but no less dangerous.

I would take issue with this sentiment. There's a world of difference between "logic error and/or panic" and "undefined behavior".

Yes, Rust doesn't fix all bugs. But it's still an improvement here.




>There's a world of difference between "logic error and/or panic" and "undefined behavior".

I recently started using Rust daily as a break from $dayjob because I've never really liked C++. I took the time to watch Blow's full rant because I think he made an interesting point, one that would take issue with your retort.

The naive version of West's "memory allocator" (without the generational index), in the context of her game, would have also had undefined behavior (in the game world). The naive system still defeats the borrow checker, and you can end up in a situation where you try to dereference something that no longer exists; worse still, since the object that now lives there has the same type as before, your state corruption is even more silent. Hence the need for a generational index; however, West only knew to use a generational index because she is an experienced game dev, not because the borrow checker told her to.
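
For anyone who hasn't watched the talk, here is a rough sketch of the generational-index idea as I understand it (the names and structure are mine, not West's actual code):

  #[derive(Clone, Copy)]
  struct Handle {
      index: usize,
      generation: u64,
  }

  struct Slot<T> {
      generation: u64,
      value: Option<T>,
  }

  struct Arena<T> {
      slots: Vec<Slot<T>>,
  }

  impl<T> Arena<T> {
      fn insert(&mut self, value: T) -> Handle {
          // Reuse the first free slot, bumping its generation, or append a new one.
          for (index, slot) in self.slots.iter_mut().enumerate() {
              if slot.value.is_none() {
                  slot.generation += 1;
                  slot.value = Some(value);
                  return Handle { index, generation: slot.generation };
              }
          }
          self.slots.push(Slot { generation: 0, value: Some(value) });
          Handle { index: self.slots.len() - 1, generation: 0 }
      }

      fn remove(&mut self, handle: Handle) {
          if let Some(slot) = self.slots.get_mut(handle.index) {
              if slot.generation == handle.generation {
                  slot.value = None;
              }
          }
      }

      // A stale handle fails the generation check and returns None instead of
      // silently aliasing whatever entity occupies the slot now.
      fn get(&self, handle: Handle) -> Option<&T> {
          self.slots.get(handle.index).and_then(|slot| {
              if slot.generation == handle.generation {
                  slot.value.as_ref()
              } else {
                  None
              }
          })
      }
  }

Without the generation counter, get() would happily hand back the new occupant of a reused slot, which is exactly the silent corruption described above.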

Blow (who hasn't written Rust) believes the borrow checker (and/or language) should prevent these kinds of logical bugs in his game code, and that bypassing it in this way effectively turns it off ("entity safety"), while the Rust borrow checker only really guarantees memory safety.

His final argument is then: since the borrow checker doesn't provide "entity safety", it impedes game development, because Blow (and any other modern game developer) would have been smart enough to start with a proper ECS anyway, so the borrow checker wouldn't have bought him anything. This final argument is where I disagree with Blow, but 1.) I don't think any programmer is smart enough 100% of the time, though I will concede he comes from a different world (game dev) with stricter deadlines than most other industries, so he may be more sensitive to tools like the borrow checker. 2.) Something I've noticed with Go as well: when the language developers say something like "you can't use this toy because you will shoot yourself", it becomes a personal attack on developer ego rather than a nuanced trade-off on system stability.


> would have also had undefined behavior (in the game world).

"Undefined behavior" is a term of art in programming languages. She would have a logic error, but not UB. See my comment over here: https://news.ycombinator.com/item?id=17995007


> His final argument is then: since the borrow checker doesn't provide "entity safety", it impedes game development, because Blow (and any other modern game developer) would have been smart enough to start with a proper ECS anyway, so the borrow checker wouldn't have bought him anything.

But even in C++ gamedev using a proper ECS, one is still using plenty of plain-old references all over the place for things unrelated to the world state, no? If so, then saying "the borrow checker doesn't help manage world state" doesn't imply that the borrow checker doesn't benefit the codebase elsewhere, especially since using an ECS sidesteps exactly the places where we've already determined that references (and hence the borrow checker) are poorly suited to model.


Thanks for the response. I wasn't trying to speculate on your intentions for publishing the blog post.

> There's a world of difference between "logic error and/or panic" and "undefined behavior".

Is it really so different for the programmer who wrote the bug?

If you have undefined behavior, the language implementation can do whatever it wants. It won't actively work against you, but the implementer is given permission to ignore what would happen if you violate their assumptions.

With a logic error in custom memory management, the program execution will still be following well-defined rules but the invariants assumed by the programmer will no longer hold. The resulting behavior appears effectively undefined to the programmer, because the point of invariants is to ignore what would happen when they are broken.

Defensive coding with panics/asserts will definitely help catch some of these mistakes during development.
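
For example, a made-up invariant check in Rust, just to illustrate:

  #[derive(Debug)]
  struct Player {
      health: i32,
      max_health: i32,
  }

  fn heal(player: &mut Player, amount: i32) {
      player.health = (player.health + amount).min(player.max_health);
      // In debug builds a violated invariant panics right here,
      // instead of surfacing as weird behavior much later.
      debug_assert!(player.health <= player.max_health);
  }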

> Yes, Rust doesn't fix all bugs. But it's still an improvement here.

I applaud your efforts with Rust, it's great to see someone actually trying to improve the state of programming languages.


> Is it really so different for the programmer who wrote the bug?

Yes, absolutely. The symptoms may be similar (though the logic error will still never lead to memory corruption or segfaults), but debugging is much easier when you can still rely on the language's invariants, if not your own.


It’s all good, it’s a totally reasonable thing, which was also brought up in all the other threads :)

> it won’t actively work against you

I guess it depends on what you mean by “active.” Consider the Option<NonNull<T>> case. We can do the null check in safe Rust. We know the check is done. Now consider the case with UB: https://blogs.msdn.microsoft.com/oldnewthing/20140627-00/?p=...

These kinds of things can cause lots of subtle issues. The Rust code won't.
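
To be concrete, something like this (a toy example of mine, not from the linked post):

  use std::ptr::NonNull;

  fn read(ptr: Option<NonNull<u32>>) -> u32 {
      // The Option forces the None case to be handled before the pointer
      // can be touched; there's no earlier dereference the optimizer can
      // point at to justify deleting the check.
      match ptr {
          Some(p) => unsafe { *p.as_ptr() },
          None => 0,
      }
  }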


> The resulting behavior appears effectively undefined to the programmer, because the point of invariants is to ignore what would happen when they are broken.

I still think there are big differences here, especially when we think about these things as security issues.

If you write a program that's supposed to draw some pixels to the screen, and you have a logic bug, your program is going to draw the Wrong Pixels (https://xkcd.com/722). But your program isn't going to install a rootkit on your machine, or mine bitcoins, or send all your spreadsheets to the bad guys. If you never call the `mine_bitcoins()` function anywhere in your program, there's no way a logic bug can make you somehow call that function.

Not so with undefined behavior. An attacker who exploits a buffer overrun in your program can make you do anything. This almost sounds paranoid, but as soon as your code is taking input from the internet, this is really truly the problem you have. This sort of problem is why projects like Chrome spend millions of dollars building security sandboxes for their C++ code, and researchers still keep coming up with ways to break it.


If you segfault, then the error is clear. You're crashing, because (as the debugger or valgrind will show you) you tried to dereference this memory after you already freed it. You can then figure out why you freed it this early and change that. If you're in an undefined (according to your application's internal logic) state, it can be much harder to track down why it's acting erratically.


Use-after-free and other memory errors won't necessarily segfault anywhere near the source of the problem. It could also limp along, corrupting memory in weird places, until something totally unrelated segfaults instead.

ECS might let you continue with an outdated index, but that problem is contained.


Logic bugs with undefined system state are much easier to debug than UB; that is why people use memory-safe programming languages.


Citation needed.

There is great tooling to pin down incorrect memory accesses when you are using the system allocator (valgrind, clang sanitizers). You're truly on your own if you access logically repurposed memory within a persistent system allocation.
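
To illustrate, here's a toy Rust version of the problem (a C object pool behaves the same way as far as the tooling is concerned):

  struct Enemy {
      hp: i32,
  }

  // A naive pool: "freeing" an entity just empties its slot, so the memory
  // stays live as far as the allocator (and valgrind/ASan) can tell.
  struct Pool {
      slots: Vec<Option<Enemy>>,
  }

  impl Pool {
      fn spawn(&mut self, enemy: Enemy) -> usize {
          for (i, slot) in self.slots.iter_mut().enumerate() {
              if slot.is_none() {
                  *slot = Some(enemy);
                  return i;
              }
          }
          self.slots.push(Some(enemy));
          self.slots.len() - 1
      }

      fn despawn(&mut self, index: usize) {
          self.slots[index] = None;
      }
  }

  fn main() {
      let mut pool = Pool { slots: Vec::new() };
      let old = pool.spawn(Enemy { hp: 10 });
      pool.despawn(old);
      pool.spawn(Enemy { hp: 99 }); // reuses slot `old`
      // A stale index now quietly reads the new occupant: perfectly
      // defined behavior, and invisible to memory-error tooling.
      if let Some(e) = pool.slots[old].as_ref() {
          println!("stale index sees hp = {}", e.hp); // prints 99
      }
  }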


> If you segfault, then the error is clear. You're crashing,

if you segfault. UB means anything can happen. Sometimes that's a segfault. Sometimes it means worse things.


To that point: if you free memory and reuse it (without nulling the reference), you likely won't segfault.

Simple example (compiled on OSX)

  #include <stdlib.h>
  #include <stdio.h>

  struct X {
      int x;
  };

  int main() {
      struct X *a = (struct X *) malloc(sizeof(struct X));
      a->x = 4;
      printf("%d\n", a->x);
      free(a);  // a is no longer a valid reference
      struct X *b = (struct X *) malloc(sizeof(struct X));
      printf("%d\n", b->x); // b is probably reusing the memory used by a
      a->x = 5; // updating a probably updates b
      printf("%d\n", b->x);
  }
If you compile this without optimization (clang test.c) you'll probably get

  4
  4
  5
'Probably' because this relies on both undefined behavior (which is partly why turning on -O2 changes the result) and the way malloc is implemented.

Fortunately, in a simple case like this, compiling your application with '-fsanitize=address' will give a very nice error. :)


Just for fun, on Windows, I get

  > cl.exe foo.c
  > foo
  4
  10372440
  5
  > cl.exe -O3 foo.c
  > foo
  4
  9025424
  9025424
I've seen stuff like this work on OS X, and segfault on Linux. Yay UB!


No, but valgrind will tell you.


Not necessarily. It can tell you if it is triggered by your tests; but it won't tell you if it isn't. So if you run your test suite under valgrind, and you don't trigger the problem, valgrind won't tell you that there is a potential issue for certain inputs. Which, in this case, will result in silent corruption of the heap.

So, trivially adding a check on argc to main,

    if (argc > 5) {
       free(a); // a is no longer a valid reference
    }
    // valgrind won't catch this issue
    b->x = 2; // valgrind complained about us dereferencing b before initializing
    a->x = 5; // valgrind won't complain about this if argc <= 5
results in a bug that valgrind won't catch. Valgrind is great, but your users won't be running it when they use your program.

Now sure, you can combine valgrind with other tools like afl (https://en.wikipedia.org/wiki/American_fuzzy_lop_(fuzzer)) or KLEE (https://klee.github.io/), and insist that your test suites have full coverage (though code coverage isn't the same as input-space coverage). But the point is, you're stuck doing runtime analysis (and you need to know that you need to do that analysis) to make sure you did this right. Baking this type of error checking into the type system itself is valuable.

Given that large projects like Google's Chrome keep hitting these issues, it seems reasonable to say that they aren't strictly trivial to solve. :)

https://www.cvedetails.com/cve/CVE-2017-5036/

https://vuldb.com/?id.100280



