
Rust syntax is weird. Weirdly good and sometimes bad.

I'm currently designing my own toy language and writing the compiler (to LLVM IR) in Rust.

Representing the AST with Rust's sum types is so simple. Visiting that AST through pattern matching is great. But the "enum" keyword still bugs me.
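To make that concrete, here is a minimal sketch with a hypothetical `Expr` type. The variants and the `eval` function are invented for illustration and are not taken from the compiler mentioned above:

```rust
// Hypothetical toy AST: each variant is one kind of node.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Num(f64),
    Neg(Box<Expr>),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

// "Visiting" the tree is a recursive match; the compiler rejects the
// function if a variant is left unhandled.
fn eval(expr: &Expr) -> f64 {
    match expr {
        Expr::Num(n) => *n,
        Expr::Neg(inner) => -eval(inner),
        Expr::Add(lhs, rhs) => eval(lhs) + eval(rhs),
        Expr::Mul(lhs, rhs) => eval(lhs) * eval(rhs),
    }
}
```

Adding a new variant to `Expr` turns every non-exhaustive `match` into a compile error, which is most of what makes this style pleasant for compilers.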

The way you define product types (tuples, records, empty types) and then their implementation, just awesome. But the "struct" keyword still bugs me too.

It feels "high level" with some quirks.

Then you have references, Box, Rc, Arc, Cell, lifetimes etc... It feels (rightfully) "low level".

Then you have traits, the relative difficulty (mostly for Rust newbies like me) of composing Result types with distinct error types, etc...
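For what it's worth, one common way to compose distinct error types is a wrapper enum plus `From` impls, so that `?` does the conversion. This is only a sketch: `AppError` and `parse_port` are hypothetical names, and crates exist that generate this boilerplate for you.

```rust
use std::num::ParseIntError;
use std::str::Utf8Error;

// Hypothetical application error wrapping two unrelated std error types.
#[derive(Debug)]
enum AppError {
    Utf8(Utf8Error),
    Parse(ParseIntError),
}

// These conversions are what let `?` lift each error into AppError.
impl From<Utf8Error> for AppError {
    fn from(e: Utf8Error) -> Self {
        AppError::Utf8(e)
    }
}

impl From<ParseIntError> for AppError {
    fn from(e: ParseIntError) -> Self {
        AppError::Parse(e)
    }
}

// Two fallible steps with different error types, one return type.
fn parse_port(raw: &[u8]) -> Result<u16, AppError> {
    let text = std::str::from_utf8(raw)?; // Utf8Error -> AppError
    let port = text.trim().parse::<u16>()?; // ParseIntError -> AppError
    Ok(port)
}
```

The caller can then match on `AppError` to tell the two failure modes apart.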

It feels "somewhat high level but still low level".

Sometimes you can think only about your algorithm, some other times you have to know how the compiler works. It seems logical for such a language, but still bugs me.

The one thing I hate though, is the defensive programming pattern. I just validated that JSON structure with a JSON schema, so I KNOW that the data is valid. Why do I need to `.unwrap().as_string().unwrap()` everywhere I use this immutable data?




Because ideally your JSON schema validator would turn it into a type that mirrors the structure of the data. "Parse, don't validate"[0]
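A rough, std-only sketch of the idea follows. `Json` is a hypothetical stand-in for an untyped value (similar in spirit to serde_json's `Value`), and `User` and `parse_user` are invented for illustration:

```rust
use std::collections::HashMap;

// Hypothetical, reduced stand-in for an untyped JSON value.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Str(String),
    Num(f64),
    Obj(HashMap<String, Json>),
}

// The typed result: once you hold a `User`, both fields are known to exist.
#[derive(Debug, PartialEq)]
struct User {
    name: String,
    age: f64,
}

// Parsing performs the checks once and returns a precise type,
// instead of returning "valid" and leaving the data untyped.
fn parse_user(value: &Json) -> Result<User, String> {
    let obj = match value {
        Json::Obj(map) => map,
        _ => return Err("expected an object".to_string()),
    };
    let name = match obj.get("name") {
        Some(Json::Str(s)) => s.clone(),
        _ => return Err("missing or non-string `name`".to_string()),
    };
    let age = match obj.get("age") {
        Some(Json::Num(n)) => *n,
        _ => return Err("missing or non-numeric `age`".to_string()),
    };
    Ok(User { name, age })
}
```

After `parse_user` succeeds, the rest of the program reads `user.name` directly, with no `.unwrap()` chain at each use site.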

[0]: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...


But the Rust type system cannot fully express a JSON Schema:

  {
    "type": "object",
    "oneOf": [{
      "required": ["kind", "foobar"],
      "properties": {
        "kind": {"enum": ["foo"]},
        "foobar": {"type": "string"}
      }
    }, {
      "required": ["kind", "barbaz"],
      "properties": {
        "kind": {"enum": ["bar"]},
        "foobar": {"type": "number"},
        "barbaz": {"type": "string"}
      }
    }]
  }
Or am I wrong?


I'm sure you can design schemas screwy enough that Rust cannot even express them[0], but that one seems straightforward enough:

    #[derive(Serialize, Deserialize)]
    #[serde(tag = "kind", rename_all = "lowercase")]
    enum X {
        Foo { foobar: String },
        Bar {
            #[serde(skip_serializing_if = "Option::is_none")]
            foobar: Option<f64>, 
            barbaz: String
        }
    }
[0] An enum of numbers would be an issue, for instance. I guess you could always use a `repr(C)` enum, though it might look a bit odd and naming the variants would be difficult.


In general, JSON Schemas are (wrongly, in my view...) validation-oriented rather than type-oriented (for notions of types that would be familiar to Haskell, Rust, or Common Lisp programmers).

I think that schema in particular could be represented, though, as:

    enum Thing {
        foo { foobar: String },
        bar { foobar: Option<f32>, barbaz: String },
    }


What about user-supplied JSON schemas? You can't add types at runtime.

Also, JSON schemas allow you to encode semantics about the values, not only their types:

  {"type": "string", "format": "url"}
That's something I like about Typescript's type system btw:

  type Role = 'admin' | 'moderator' | 'member' | 'anonymous'
It's still a string, in Rust you would need an enum and a deserializer from the string to the enum.


> What about user-supplied JSON schemas? You can't add types at runtime.

That kinda sounds like you just launched the goalposts into the ocean right here.

> Also, JSON schemas allow you to encode semantics about the values, not only their types:

JSON schemas encode types as constraints, because "type" is just the "trivial" JSON type. "URL" has no reason not to be a type.

> in Rust you would need an enum

Yes? Enumerations get encoded as enums, that sounds logical.

> a deserializer from the string to the enum.

Here's how complex the deserializer is:

    #[derive(Deserialize)]
    #[serde(rename_all = "lowercase")]
    enum Role { Admin, Moderator, Member, Anonymous }
And the second line is only there because we want the internal Rust code to look like Rust.


Yep, I'm still new to serde and the Deserialize workflow :)

I come from highly dynamic languages, and even when I was doing C/C++ 10 years ago, I would do more at runtime than what could be considered "best practice".


As the input gets more dynamic, so does the type system representation. If you want to handle user-supplied JSON schemas, in my understanding of JSON Schema, you'd have to use the serde_json::Value way: https://docs.serde.rs/serde_json/#operating-on-untyped-json-...


> What about user-supplied JSON schemas? You can't add types at runtime.

Right, well, since they're validators anyway, might as well represent them as a defunctionalized validation function or something. Agreed that this is more-or-less past the point where the type system helps model the values you're validating, though a strong type system helps a lot when implementing the validators!
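A sketch of what "defunctionalized" could mean here: the validator is plain data that can be built at runtime (for example from a user-supplied schema), plus one interpreter function. `Json`, `Validator`, and `validate` are hypothetical and cover only a tiny fraction of JSON Schema:

```rust
// Hypothetical, heavily reduced stand-in for an untyped JSON value.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Str(String),
    Num(f64),
    Arr(Vec<Json>),
}

// Each variant is data describing a check, not a closure.
#[derive(Debug, Clone)]
enum Validator {
    IsString,
    IsNumber,
    OneOfStrings(Vec<String>), // loosely like {"enum": [...]}
    ArrayOf(Box<Validator>),
    AnyOf(Vec<Validator>),
}

// The interpreter that turns the description back into behaviour.
fn validate(v: &Validator, value: &Json) -> bool {
    match (v, value) {
        (Validator::IsString, Json::Str(_)) => true,
        (Validator::IsNumber, Json::Num(_)) => true,
        (Validator::OneOfStrings(allowed), Json::Str(s)) => allowed.iter().any(|a| a == s),
        (Validator::ArrayOf(item), Json::Arr(items)) => items.iter().all(|x| validate(item, x)),
        (Validator::AnyOf(options), _) => options.iter().any(|o| validate(o, value)),
        _ => false,
    }
}
```

The values being checked stay untyped, but the validator itself is an ordinary enum, so the exhaustive `match` still helps while writing it.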

> It's still a string, in Rust you would need an enum and a deserializer from the string to the enum.

Yep, though if you really wanted it to be a string at runtime, you could use smart constructors to make it so. The downsides would be, unless you normalized the string (at which point, just use an enum TBH), you're doing O(n) comparison, and you're keeping memory alive, whether by owning it, leaking it, reference counting, [...].
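A minimal sketch of the smart-constructor approach, assuming a hypothetical `Role` newtype whose only entry point is `Role::new`:

```rust
// Hypothetical newtype: the role stays a String at runtime, but the
// private field means other modules can only build it through `new`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Role(String);

impl Role {
    const ALLOWED: [&'static str; 4] = ["admin", "moderator", "member", "anonymous"];

    // The smart constructor: rejects anything outside the allowed set.
    pub fn new(raw: &str) -> Option<Role> {
        if Self::ALLOWED.iter().any(|a| *a == raw) {
            Some(Role(raw.to_owned()))
        } else {
            None
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

This version owns its string, so every `Role` carries an allocation, and comparing two roles is a string comparison rather than an integer one.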

Thankfully, due to Rust's #[derive] feature, the programmer wouldn't need to write the serializer/deserializer though; crates like strum[0] can generate it for you, such that you can simply write:

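    use std::str::FromStr; // brings Role::from_str into scope

    use strum::{AsRefStr, EnumString};

    // Debug is needed by assert_eq! to print the values on failure.
    #[derive(AsRefStr, Debug, EnumString, PartialEq)]
    enum Role {
        Admin,
        Moderator,
        Member,
        Anonymous,
    }

    fn main() {
        assert_eq!(Role::from_str("Admin").unwrap(), Role::Admin);
        assert_eq!(Role::Member.as_ref(), "Member");
    }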
(strum also has a derive for the standard library Display trait, which provides a .to_string() method, but this has the disadvantage of heap allocating; AsRefStr (which provides .as_ref()) compiles in the strings, so no allocation is needed, and .as_ref() is a simple table lookup.)
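For comparison, this is roughly the kind of code such derives save you from writing by hand. It is a std-only sketch, not strum's actual expansion, and the `String` error type is an arbitrary choice:

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
enum Role {
    Admin,
    Moderator,
    Member,
    Anonymous,
}

// String -> enum, written out by hand.
impl FromStr for Role {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "Admin" => Ok(Role::Admin),
            "Moderator" => Ok(Role::Moderator),
            "Member" => Ok(Role::Member),
            "Anonymous" => Ok(Role::Anonymous),
            other => Err(format!("unknown role: {}", other)),
        }
    }
}

// Enum -> string; the literals live in the binary, so nothing is allocated.
impl AsRef<str> for Role {
    fn as_ref(&self) -> &str {
        match self {
            Role::Admin => "Admin",
            Role::Moderator => "Moderator",
            Role::Member => "Member",
            Role::Anonymous => "Anonymous",
        }
    }
}
```

With these impls in place, `"Admin".parse::<Role>()` and `Role::Member.as_ref()` behave like the derived versions in the example above.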

[0]: https://docs.rs/strum/0.20.0/strum/index.html


> Yep, though if you really wanted it to be a string at runtime, you could use smart constructors to make it so. The downsides would be, unless you normalized the string (at which point, just use an enum TBH), you're doing O(n) comparison, and you're keeping memory alive, whether by owning it, leaking it, reference counting, [...].

Nit: it's more constraining but serde can deserialize to an &str, though that assumes the value has no escapes.

Ideally `Cow<str>` would be the solution, but while it kind-of is, that doesn't actually work out of the box: https://github.com/serde-rs/serde/issues/1852


Thanks for the insight! Didn't know strum.

But yeah, I tend to do more work at runtime than compile-time, which is not really the way to go in Rust.


Unfortunately, Rust is currently lacking structural records/structs and enums. I think they removed them early on in the design. So you'd have to name all the types. I hope they do add them back one day.



