A JS engine is a high-risk, high-reward problem for Rust. High-reward because JS...

johncolanduoni · on Jan 24, 2019

I'm curious what proportion of vulnerabilities in JS engines are due to mis-generated JIT code vs direct errors in their compiled code. Rust allows you to express some nice properties not always directly related to memory safety (e.g. checked consumption, convenient and safe ADTs), but unless there is a novel application of these facilities to the structure of a JIT engine it won't help a ton with the former kind of vulnerabilities.

I'm excited to see a practical programming language that implements full dependent typing; languages like Idris are actually really good at dealing with precisely the kinds of situations you mention.

ridiculous_fish · on Jan 24, 2019

JS engines have many parts implemented natively, which may be called from JS, and in turn call back into JS. An example is CVE-2015-6764: this grabs an array length, which quickly becomes stale, because accessing one of the array's elements invokes a custom toJSON which in turn modifies the array's length.

This feels like a hopeless problem; can any of Rust's powers be brought to bear here? Could Idris?

johncolanduoni · on Jan 25, 2019

F* is probably the best equipped at the moment to deal with situations like that CVE at the moment, since its library has a concept of heaps. Basically, any function that can access or modify the "heap" (which in F* is just a set of pointers that are guaranteed to point to a value and not alias any others outside of the same heap) must specify what properties of the state of the heap must be true at entry, and what properties are true afterwards. So in pseudo-types (, the functions for accessing a JavaScript array would be something along the lines of

    fn arrayLength(x: JSArray*) -> n: uint (requires nothing) (ensures length of x = n, changes nothing)
    fn callToJson(x: JSValue*) -> JSValue* (requires nothing) (ensures nothing)
    fn arrayAccess(x: JSArray*, m: uint) -> JSValue* (requires length of x > m) (changes nothing)

(NB: F* syntax doesn't look much like this, but I'm guessing this will be readable to more people on HN)

The stuff in parentheses after each function type are the preconditions and post-conditions respectively. So if you do something like:

    let x = arrayLength(someArray)
    for i in range(x) {
      let element = arrayAccess(x-1)
    }

It will typecheck just fine. But if you add the call to toJSON:

    let x = arrayLength(someArray)
    for i in range(x) {
      let element = arrayAccess(x-1)
      let transformed = callToJson(element)
      // ERROR: (requires length of x > m) not satisfied for all runs of loop body
    }

Since callToJson cannot ensure any property of the heap after it runs. In this way you can elide range checks when needed for performance without worrying that you've sacrificed safety.

Covering all the cases a JS engine would need without adding 10 million lines of proofs to the size of SpiderMonkey is still an open problem, but this general approach (known as Hoare Logic[1]) is very enticing, and the type systems that languages like Idris and F* have are definitely the closest to realizing it in more places. There are real software engineering efforts using descendants of Hoare logic like TLA+ (notably Amazon IIRC), but it's rare to see it even in huge projects like browsers.

It's also critical to note that the heap concept of F* is not a totally fixed part of the language; most of the specification of how heaps work are actually in the standard library. That level of flexibility is what I think makes these languages likely to become capable of tackling these problems: something like a JS engine or any optimizing compiler is exactly the kind of place where being able to come up with your own type-level verification model is worth the effort.

[1]: https://en.wikipedia.org/wiki/Hoare_logic