> This is a very common misconception that stemmed from a recent conversation with one of my non-Rust engineer colleagues. For him, it was inconceivable that a Rust program would panic because of an out-of-bounds runtime memory fail.
I feel like this is just a confusion between segfault and panic (probably from a somewhat junior engineer, or at least one unfamiliar with low-level programming).
Proving that array-out-of-bounds never happens is just a special case of proving that an assertion error never happens, which is intractable in the most general case.
Consider a program in which an array was accessed based on the index provided by some user input -- how could this ever be proven to never go out of bounds?
Rust won't segfault here like some naive C will, but it will panic, because you hit the assertion that the index is less than the length of the array.
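A minimal sketch of that scenario (the command-line argument standing in for "user input" is contrived):

```rust
fn main() {
    let arr = [10, 20, 30];

    // Pretend this index arrived from user input at runtime.
    let i: usize = std::env::args()
        .nth(1)
        .and_then(|s| s.parse().ok())
        .unwrap_or(7);

    // No segfault: Rust's bounds check fires and the program panics with
    // something like "index out of bounds: the len is 3 but the index is 7".
    println!("{}", arr[i]);
}
```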
Formal verification to prove that a program will never panic is a very neat field unto itself. There is some work in this space in Rust, but it's bolted on top, not built into the language (this is the case for most formal verification of most non-research languages).
IMO, a program failing due to a parse error on the input, and a program failing due to a bounds check are functionally the same thing (provided you are using a language with bounds-checked arrays so you don't segfault). You're just moving the bounds checking from the array access path to the input collecting path. I could be misunderstanding you, though.
My referenced program is just proven for input sizes.
If I wanted to validate an entire program subset similarly, I'd make a type with the bounds I needed and feed that into the algorithm proved for those ranges. This isolates the proved code from user input, while the newtype with a range ensures those checks are done before anything is fed into the algorithm. You can't "prove" user input; there has to be validation. But you can force that validation to happen before entering the proved code.
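A minimal sketch of that newtype pattern (`Index8` and `proved_lookup` are made-up names): the range check happens exactly once, at construction, and the code "proved" for that range only ever receives valid values.

```rust
#[derive(Copy, Clone, Debug)]
struct Index8(u8); // invariant: 0 <= value < 8

impl Index8 {
    // The one place user input is validated.
    fn new(raw: u8) -> Option<Index8> {
        if raw < 8 { Some(Index8(raw)) } else { None }
    }

    fn get(self) -> usize {
        self.0 as usize
    }
}

// The "proved" algorithm: the type invariant guarantees the index
// is in range, so this access can never actually fail.
fn proved_lookup(table: &[u32; 8], idx: Index8) -> u32 {
    table[idx.get()]
}

fn main() {
    let table = [0, 10, 20, 30, 40, 50, 60, 70];

    // Validation is forced at the boundary, before entering proved code.
    match Index8::new(5) {
        Some(idx) => println!("{}", proved_lookup(&table, idx)),
        None => eprintln!("input out of range"),
    }
}
```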
Well, if you go for proof of absence of runtime errors, there's SPARK, which is a (compatible) subset of Ada and comes with a toolset to perform verification: proof of absence of array-bounds errors, or of any runtime error (well, except for OOM and stack overflow - for now).
I'd add that Frama-C also does this kind of proof, and IIRC Eve (Eiffel Verification Environment?) does too, though I find the SPARK workflow far easier and more advanced (but we use it more, so... I'm biased).
I also wanted some wiggle room for languages that are fairly researchy but some people do use them in real life for hosting blogs and such, like Idris.
There might be a few others. https://github.com/google/wuffs for example isn't general purpose or mainstream, but it's meant to solve practical problems, so I don't think I'd call it a research language. Opinions may vary.
> Consider a program in which an array was accessed based on the index provided by some user input -- how could this ever be proven to never go out of bounds?
In fairness, the approach to similar operations in Rust is usually by returning an Option or a Result, so it is arguably counterintuitive that a few specific operations like array indexing panic instead.
(you can do `Vec::get()`, but normally the "safe" behavior is the default and the "unsafe" behavior is opt-in and more verbose, e.g. `unwrap`)
IMO this kind of just moves the goalpost from "prove the program never panics" to "prove the program never hits the `None` or `Err` path" which is functionally identical in most use cases.
Obviously I am not suggesting that people use `arr[i]` everywhere and try to catch the panic, this is an antipattern -- use `.get(i)` if you don't know the bounds. I'm just talking about for the purposes of program formal verification.
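A quick sketch of the two styles side by side: indexing panics on out-of-bounds, while `.get()` returns an `Option` and lets the caller decide what failure means.

```rust
fn main() {
    let v = vec![1, 2, 3];

    // `v[10]` would panic; `v.get(10)` returns an Option instead,
    // so the caller chooses the failure behavior.
    match v.get(10) {
        Some(x) => println!("got {x}"),
        None => println!("index 10 is out of bounds"),
    }

    assert_eq!(v.get(1), Some(&2));
    assert_eq!(v.get(10), None);
}
```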
Aside:
I think the language design team making the most terse `arr[i]` code not need unwraps or anything was the right move. Doing `.get(i)?` or `.get(i).unwrap()` all over the place is too much visual clutter.
If the user really wants, they can make a custom slice wrapper with `Option`-returning indexing (though `Index::index` itself must return a reference, so in practice that means a `get`-style accessor or a custom trait).
> IMO this kind of just moves the goalpost from "prove the program never panics" to "prove the program never hits the `None` or `Err` path" which is functionally identical in most use cases.
If my goal is to show that a piece of code never panics, then it suffices to show that the None or Err path doesn't panic; I don't need to show that it's never hit.
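For example (a trivial sketch; `lookup` is a made-up function): showing this never panics only requires checking that the `None` arm is itself panic-free, not proving the arm is unreachable.

```rust
// This function can be shown to never panic without proving anything
// about `i`: the None path is handled, and handling it doesn't panic.
fn lookup(v: &[i32], i: usize) -> i32 {
    match v.get(i) {
        Some(x) => *x,
        None => -1, // panic-free error path; no need to show it's never hit
    }
}

fn main() {
    let v = [1, 2, 3];
    println!("{}", lookup(&v, 0));  // in bounds
    println!("{}", lookup(&v, 99)); // out of bounds, still no panic
}
```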
It's a common misconception due in large part to Rust's marketing.
Claim: Rust enables you to eliminate many classes of bugs at compile-time. (https://www.rust-lang.org, "Why Rust?", "Reliability")
Reality: Rust enables you to eliminate an extremely limited class of bugs at compile time, and also inserts if statements into your program to detect some other classes of bugs at runtime and crash cleanly.
While it would obviously be nice to error at compile time, the docs of the slice type clearly say the code must panic in this case. Depending on the context, `[T; 3]` may be better.
> Consider a program in which an array was accessed based on the index provided by some user input -- how could this ever be proven to never go out of bounds?
It's a type constraint on the index. In most programming languages we're used to, the index value is some machine type, like a 32-bit unsigned integer, but that's arbitrary; the language can say OK, this is an index into an array of sixteen things, so the index must be between 0 and 15 inclusive.
Now, depending on the software, you might just have to write the bounds checks yourself to satisfy this constraint, in which case you've only made more work for yourself. But if your software actually deals in types that can, and perhaps should, be constrained properly elsewhere, this approach means you get a compiler error, not a runtime failure.
Suppose we've got code that does foo[k], where foo is an array of eight things and k is an unsigned variable holding the bottom four bits of a value from an image file that's supposed to be the CGA medium-resolution colour palette ID (0, 1, or 2). Alas, the file might be corrupt or malicious, and those bottom bits could be, for example, 10.
In C and C++, foo[k] compiles, and when the corrupt file is used there is Undefined Behaviour.
In Rust, foo[k] compiles, and when the corrupt file is used the program panics on this line.
In WUFFS, foo[k] doesn't compile: variable k is the wrong type for indexing foo.
Now, WUFFS is not a general purpose programming language. You should not write an operating system, a web server, a compiler in WUFFS. But, you should write your thumbnail making routine, your PDF validation code, your audio compressor in WUFFS, because in choosing not to be a general purpose language WUFFS is freed to simply always be safe. That WUFFS thumbnail code might make the RCE attempt from Leet Hacker into a blue rectangle, or crop all the executives photos to just their nose, but it simply cannot accidentally create a reverse shell, or spew passwords into the thumbnails, or a billion other things which Rust would make difficult to do by mistake but never impossible.
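Ordinary Rust can get most of the way to the WUFFS behavior by making the palette ID its own type, so an out-of-range value is unrepresentable after parsing. A sketch (`PaletteId` and `from_byte` are illustrative names, not anyone's actual API):

```rust
// Make "index into three palettes" its own type: once parsing succeeds,
// downstream code can't hold an out-of-range value at all.
#[derive(Copy, Clone, Debug)]
enum PaletteId {
    P0,
    P1,
    P2,
}

impl PaletteId {
    // The one place corrupt input is handled.
    fn from_byte(b: u8) -> Option<PaletteId> {
        match b & 0x0f { // bottom four bits, as in the example above
            0 => Some(PaletteId::P0),
            1 => Some(PaletteId::P1),
            2 => Some(PaletteId::P2),
            _ => None, // e.g. the corrupt value 10
        }
    }
}

fn palette_index(id: PaletteId) -> usize {
    // Exhaustive match: no bounds check and no panic path.
    match id {
        PaletteId::P0 => 0,
        PaletteId::P1 => 1,
        PaletteId::P2 => 2,
    }
}

fn main() {
    match PaletteId::from_byte(0x0a) {
        Some(id) => println!("palette {}", palette_index(id)),
        None => println!("corrupt file: invalid palette ID"),
    }
}
```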
>> Consider a program in which an array was accessed based on the index provided by some user input -- how could this ever be proven to never go out of bounds?
> It's a type constraint on the index
Consider a program in which the index type is provided by some user input -- how could this ever be proven to never go out of bounds?
You can move the problem elsewhere (and indeed often should!), but eventually it becomes a condition check and error handling behavior appropriate to the scenario. The default error handling behavior is to halt the program.