Rust syntax is weird. Weirdly good and sometimes bad. I'm currently designing my...

coolreader18 · on May 3, 2021

Because ideally your JSON schema validator would turn it into a type that mirrors the structure of the data. "Parse, don't validate"[0]

[0]: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...

linkdd · on May 3, 2021

But the Rust type system cannot fully express a JSON Schema:

  {
    "type": "object",
    "oneOf": [{
      "required": ["kind", "foobar"],
      "properties": {
        "kind": {"enum": ["foo"]},
        "foobar": {"type": "string"}
      }
    }, {
      "required": ["kind", "barbaz"],
      "properties": {
        "kind": {"enum": ["bar"]},
        "foobar": {"type": "number"},
        "barbaz": {"type": "string"}
      }
    }]
  }

Or am I wrong?

masklinn · on May 3, 2021

I'm sure you can design schemas screwy enough that Rust cannot even express them[0] but that one seems straightforward enough:

    use serde::{Deserialize, Serialize};

    #[derive(Serialize, Deserialize)]
    #[serde(tag = "kind", rename_all = "lowercase")]
    enum X {
        Foo { foobar: String },
        Bar {
            #[serde(skip_serializing_if = "Option::is_none")]
            foobar: Option<f64>,
            barbaz: String
        }
    }

[0] an enum of numbers would be an issue for instance, though I guess you could always use a `repr(C)` enum; it might look a bit odd and naming would be difficult.

remexre · on May 3, 2021

In general, JSON Schemas are (wrongly, in my view...) validation-oriented rather than type-oriented (for notions of types that would be familiar to Haskell, Rust, or Common Lisp programmers).

I think that schema in particular could be represented, though, as:

    enum Thing {
        foo { foobar: String },
        bar { foobar: Option<f32>, barbaz: String },
    }

linkdd · on May 3, 2021

What about user-supplied JSON schemas? You can't add types at runtime.

Also, JSON schemas allow you to encode semantics about the values, not only their types:

  {"type": "string", "format": "url"}

That's something I like about Typescript's type system btw:

  type Role = 'admin' | 'moderator' | 'member' | 'anonymous'

It's still a string, in Rust you would need an enum and a deserializer from the string to the enum.

masklinn · on May 3, 2021

> What about user-supplied JSON schemas? You can't add types at runtime.

That kinda sounds like you just launched the goalposts into the ocean right here.

> Also, JSON schemas allow you to encode semantics about the values, not only their types:

JSON schemas encode types as constraints, because "type" is just the "trivial" JSON type. "URL" has no reason not to be a type.

> in Rust you would need an enum

Yes? Enumerations get encoded as enums, that sounds logical.

> a deserializer from the string to the enum.

Here's how complex the deserializer is:

    #[derive(Deserialize)]
    #[serde(rename_all = "lowercase")]
    enum Role { Admin, Moderator, Member, Anonymous }

And the second line is only there because we want the internal Rust code to look like Rust.
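For comparison, a hand-written version using only the standard library might look roughly like the sketch below. `ParseRoleError` is a made-up name, and this only parses from a `&str`; what serde generates is more general since it works against any data format.

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    Admin,
    Moderator,
    Member,
    Anonymous,
}

// Hypothetical error type for this sketch; not part of serde or std.
#[derive(Debug, PartialEq, Eq)]
struct ParseRoleError(String);

impl fmt::Display for ParseRoleError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unknown role: {:?}", self.0)
    }
}

impl std::error::Error for ParseRoleError {}

impl FromStr for Role {
    type Err = ParseRoleError;

    // Accepts exactly the lowercase names, mirroring rename_all = "lowercase".
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "admin" => Ok(Role::Admin),
            "moderator" => Ok(Role::Moderator),
            "member" => Ok(Role::Member),
            "anonymous" => Ok(Role::Anonymous),
            other => Err(ParseRoleError(other.to_owned())),
        }
    }
}
```

With that in place, `"admin".parse::<Role>()` gives `Ok(Role::Admin)`, and anything outside the four names is rejected at the boundary instead of floating around as an arbitrary string.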

linkdd · on May 3, 2021

Yep, I'm still new to serde and the Deserialize workflow :)

I come from highly dynamic languages, and even when I was doing C/C++ 10 years ago, I would do more at runtime than what could be considered "best practice".

steveklabnik · on May 3, 2021

As the input gets more dynamic, so does the type system representation. If you want to handle user-supplied JSON schemas, in my understanding of JSON Schema, you'd have to use the serde_json::Value way: https://docs.serde.rs/serde_json/#operating-on-untyped-json-...

remexre · on May 3, 2021

> What about user-supplied JSON schemas? You can't add types at runtime.

Right, well, since they're validators anyway, might as well represent them as a defunctionalized validation function or something. Agreed that this is more-or-less past the point where the type system helps model the values you're validating, though a strong type system helps a lot implementing the validators!
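To make "defunctionalized validation function" a bit more concrete, here is a rough std-only sketch. `Json`, `Schema` and `validate` are names invented for illustration (in practice `serde_json::Value` would play the role of `Json`), and the rules are a heavily simplified subset of JSON Schema, not a faithful implementation.

```rust
use std::collections::BTreeMap;

// Minimal stand-in for a dynamic JSON value.
#[allow(dead_code)] // not every variant is exercised in this sketch
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

// Each variant stands for one validation function, stored as plain data
// so that a schema can be built (or loaded) at runtime.
#[derive(Debug, Clone)]
enum Schema {
    IsString,
    IsNumber,
    Enum(Vec<Json>),
    // Simplification: this variant also requires the value to be an object.
    Object {
        required: Vec<String>,
        properties: Vec<(String, Schema)>,
    },
    OneOf(Vec<Schema>),
}

// The interpreter that "applies" a defunctionalized validator to a value.
fn validate(schema: &Schema, value: &Json) -> bool {
    match schema {
        Schema::IsString => matches!(value, Json::String(_)),
        Schema::IsNumber => matches!(value, Json::Number(_)),
        Schema::Enum(allowed) => allowed.contains(value),
        Schema::Object { required, properties } => match value {
            Json::Object(map) => {
                required.iter().all(|key| map.contains_key(key))
                    && properties.iter().all(|(key, sub)| {
                        // A property that is absent is not checked.
                        map.get(key).map_or(true, |v| validate(sub, v))
                    })
            }
            _ => false,
        },
        // Exactly one alternative must accept the value.
        Schema::OneOf(alts) => alts.iter().filter(|s| validate(s, value)).count() == 1,
    }
}
```

The point is that `Schema` is ordinary data, so a user-supplied schema becomes a value of this type rather than a new Rust type, and `validate` only ever answers yes or no about a `Json`.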

> It's still a string, in Rust you would need an enum and a deserializer from the string to the enum.

Yep, though if you really wanted it to be a string at runtime, you could use smart constructors to make it so. The downsides would be, unless you normalized the string (at which point, just use an enum TBH), you're doing O(n) comparison, and you're keeping memory alive, whether by owning it, leaking it, reference counting, [...].
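A smart constructor for that could look something like the following sketch. The `Role` newtype, `ALLOWED`, `parse` and `as_str` are all names made up here; in real code the type would live in its own module so the inner field stays private to callers.

```rust
// Still a string at runtime, but callers are expected to go through `parse`.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Role(String);

impl Role {
    const ALLOWED: [&'static str; 4] = ["admin", "moderator", "member", "anonymous"];

    // Smart constructor: returns None for anything outside the allowed set.
    fn parse(s: &str) -> Option<Role> {
        if Self::ALLOWED.iter().any(|&allowed| allowed == s) {
            // Owns a copy of the input, which is the "keeping memory alive" cost.
            Some(Role(s.to_owned()))
        } else {
            None
        }
    }

    fn as_str(&self) -> &str {
        &self.0
    }
}
```

Note that the derived `PartialEq` compares the underlying strings, which is the O(n) comparison mentioned above.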

Thankfully, due to Rust's #[derive] feature, the programmer wouldn't need to write the serializer/deserializer; crates like strum[0] can generate it for you, such that you can simply write:

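    use std::str::FromStr; // brings Role::from_str into scope
    use strum::{AsRefStr, EnumString};

    // Debug is needed by assert_eq! to print values on failure.
    #[derive(AsRefStr, EnumString, Debug, PartialEq)]
    enum Role {
        Admin,
        Moderator,
        Member,
        Anonymous,
    }

    fn main() {
        assert_eq!(Role::from_str("Admin").unwrap(), Role::Admin);
        assert_eq!(Role::Member.as_ref(), "Member");
    }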

(strum also has a derive for the standard library Display trait, which provides a .to_string() method, but this has the disadvantage of heap allocating; AsRefStr (which provides .as_ref()) compiles in the strings, so no allocation is needed, and .as_ref() is a simple table lookup.)

[0]: https://docs.rs/strum/0.20.0/strum/index.html

masklinn · on May 3, 2021

> Yep, though if you really wanted it to be a string at runtime, you could use smart constructors to make it so. The downsides would be, unless you normalized the string (at which point, just use an enum TBH), you're doing O(n) comparison, and you're keeping memory alive, whether by owning it, leaking it, reference counting, [...].

Nit: it's more constraining but serde can deserialize to an &str, though that assumes the value has no escapes.

Ideally `Cow<str>` would be the solution, but while it kind-of is, that doesn't actually work out of the box: https://github.com/serde-rs/serde/issues/1852
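To illustrate why escapes are the sticking point, here's a toy, std-only sketch (this is not serde's code, and `unescape` is a made-up helper): when the raw text contains no escapes you can hand back a borrowed slice of the input, but as soon as one shows up the decoded string differs from the input bytes, so you need an owned buffer.

```rust
use std::borrow::Cow;

// Toy unescaper: every backslash just yields the character that follows it,
// so only \" and \\ are handled correctly (real JSON has more escapes).
fn unescape(raw: &str) -> Cow<'_, str> {
    if !raw.contains('\\') {
        // Nothing to decode: borrow straight from the input, no allocation.
        return Cow::Borrowed(raw);
    }
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            // Keep the escaped character; a trailing lone backslash is dropped.
            if let Some(next) = chars.next() {
                out.push(next);
            }
        } else {
            out.push(c);
        }
    }
    Cow::Owned(out)
}
```

A plain `&str` target can only ever represent the first branch, which is why it fails on escaped input, while `Cow<str>` can represent both.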

linkdd · on May 3, 2021

Thanks for the insight! Didn't know strum.

But yeah, I tend to do more work at runtime than compile-time, which is not really the way to go in Rust.

grumpyprole · on May 3, 2021

Unfortunately Rust is currently lacking structural records/structs and enums. I think they removed them early on in the design. So you'd have to name all the types. I hope they do add them back one day.