Looking into Odin and Zig (ayende.com)
173 points by dsego on Sept 7, 2021 | 27 comments



We hosted [0] a conversation between the two creators. It includes closed captioning and a chapters menu -- their back-and-forth is worth a listen.

P.S. We're having another conversation this November.

[0] https://media.handmade-seattle.com/the-race-to-replace-c-and...


Very interesting and cool. Exciting things happening in this space.


Zig's `errdefer` is really cool. While I like the semantic simplicity of, e.g., Rust's Result for errors, and treating an error like any other value, having a little special language handling for errors is very nice. Zig's equivalent of "Result" with special notation for errors, plus `errdefer` and all that, simplifies a very common, important programming task. I think language design is all about finding those things that you do all the time and making them a little smoother to handle.

That said, one issue with the Zig approach is that all the application's errors boil down to an integer set (with easy mapping to names). If you want to pass data along with the error, you sort of have to fall back to a "result" value type and lose all the great error handling machinery, or do other workarounds. I think Zig's special handling of errors, but applied to arbitrary errors carrying data (like Result::Err in Rust), would be the best of both worlds. But there are probably significant implementation hurdles there that I don't know about.


There's a discussion about this here: https://github.com/ziglang/zig/issues/2647


Odin language creator here.

Thank you to the author of the article for taking the time to try out both Odin and Zig.

I want to clarify my position, as the comments you are commenting on lack a little nuance.

For the first one regarding OOM, Odin's allocators all support error values signalling things such as `Out_Of_Memory`, `Invalid_Pointer`, `Invalid_Argument`, and `Mode_Not_Implemented`. So if you want to handle those error states, there is nothing preventing you from doing so in Odin. However, there are two aspects to my original argument: you should know the constraints of the platform you are on and plan accordingly, as that is part of your problem; and what happens in the case when you have completely run out of memory (globally/system-wide)?
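
For illustration, a minimal sketch of handling those allocator error values (this assumes a recent Odin where `make` also returns an optional `Allocator_Error`; a sketch, not a definitive API reference):

  handle_alloc :: proc() {
    buf, err := make([]byte, 4096)
    defer delete(buf)
    switch err {
    case .None:
      // success: use buf
    case .Out_Of_Memory:
      // decide: fall back to a smaller buffer, flush caches, or crash
    case .Invalid_Pointer, .Invalid_Argument, .Mode_Not_Implemented:
      // allocator misuse, or a mode this allocator does not support
    }
  }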

> My reaction to that is “640KB is enough for everything”, right?

I've worked on systems with 16 KiB and that's just how much you had to deal with, meaning we had to plan what we needed so that we never ran out of memory. The program was heavily tested (both analytically and empirically) to make sure that it ran within the 16 KiB constraint. On such platforms, `malloc` was never called, and was in fact banned.
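
In Odin terms, that style might look like the following hypothetical sketch, using `core:mem`'s fixed-backing arena so the budget physically cannot be exceeded:

  import "core:mem"

  budgeted :: proc() {
    backing: [16 * 1024]byte
    arena: mem.Arena
    mem.arena_init(&arena, backing[:])
    context.allocator = mem.arena_allocator(&arena)
    // everything below allocates from the 16 KiB arena; exceeding it
    // yields .Out_Of_Memory rather than touching the system heap
  }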

> I have never had a program cause a system to run out of memory in real software (other than artificial stress tests).

In this comment, I say "program" and not _allocator_. I have many allocators that run out of memory _on purpose_. On desktop applications that I have worked on (I know this does not apply to all), if we did run out of memory, the better option is to just crash rather than try to recover. But of course some applications cannot do this, especially when dealing with third-party programs, such as databases (as you state). In those cases, the best you can do is empirically test how much is required and gracefully handle those cases when you do OOM (if possible). That's part of your problem, it's not an external thing. If you can control things, try to.

Minor note: Zig's allocators only return one possible error value, `error.OutOfMemory`, whilst Odin has 4 possible error values, giving Odin's allocator error value system finer granularity.

Onto the second comment you bring up regarding error value handling.

> ...My code, which is calling the service, need to be able to handle any / all of those

Yes. And you should handle them accordingly, and not just in a general "catch all", especially since these are wildly different kinds of failure states.

> ...Odin relies on the multiple return values system. We have seen how good that is with Go. In fact, one of the most common issues with Go is the issue with how much manual work it takes to do proper error handling.

Regarding error value propagation, I need to clarify my position a little more: in that article I was referring to an aggregate set of comments from a GitHub post, not a truly nuanced argument. Odin has the `or_return` operator which allows the user to easily propagate error values in code. It is similar to Rust's `?` or Zig's `try`, but complements Odin's multiple return values AND does not rely on having a concept of an "error value type".

Demo of `or_return`: <https://github.com/odin-lang/Odin/blob/fd256002b3190076bb91e...> And check out Odin's [`core:math/big`](https://github.com/odin-lang/Odin/tree/master/core/math/big), which makes extensive use of the concept.
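
For readers who have not seen it, the shape is roughly this (all types and procs below are hypothetical):

  Error :: enum {None, Not_Found, Corrupt}

  read_file :: proc(path: string) -> (data: []byte, err: Error) { return }
  parse_header :: proc(data: []byte) -> (size: int, err: Error) { return }

  load :: proc(path: string) -> (size: int, err: Error) {
    data := read_file(path) or_return  // early-returns unless err is the zero value (.None)
    size = parse_header(data) or_return
    return  // named results: (size, .None)
  }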

Even though I added this concept to Odin, I do believe that my general hypotheses are still correct regarding exception-like error value handling. The main points being:

* Error value propagation _ACROSS_ library bounds

* _Degenerate states_ due to type erasure or automatic inference

* Cultural lack of partial success states

The most important one is the degenerate state issue, where all values can degenerate to a single type. It appears that you and many others pretty much only want to know whether there was an error value or not and then pass that up the stack, writing your code as if it were purely the happy path and then handling any error value. Contrast this with Zig: Zig has a built-in concept of an error value type, and all error values are either inferred (`!`), belong to a specific error set, or degenerate to `anyerror`. In practice, from what I have seen of many Zig programmers, most people just don't handle error values and pass them up the stack to a very high place, then pretty much handle any error state as if they are all the same degenerate value: "error or not". This problem occurs in Go code too, because in practice everything degenerates to the `error` interface, making it equivalent to a fancy boolean.

For error values in Odin, you can use any data type you require for your problem, and even compose many together. It's very common to just use an `enum` or even a boolean, but sometimes you need something more complex or aggregate too. A good example of this is when you need to aggregate different possible error values from multiple different data types. The two common approaches are to make a mapping procedure to convert them to the specific `enum` OR to have a (discriminated) `union` of the `enum`s. This means Odin does not suffer from the degenerate state issue that Zig, Go, and many other languages do suffer from. This is because Odin doesn't have a built-in degenerate type which would be typically used for error values.
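
A hypothetical sketch of that `union` approach (note that a variant can even be a struct carrying data, which a plain `enum` cannot; all names below are made up):

  import "core:fmt"

  Syntax_Error :: struct {line, column: int}
  IO_Error :: enum {Not_Found, Permission_Denied}

  Error :: union {Syntax_Error, IO_Error}  // nil means success

  report :: proc(err: Error) {
    switch e in err {
    case Syntax_Error:
      fmt.printf("syntax error at %d:%d\n", e.line, e.column)
    case IO_Error:
      fmt.println("io error:", e)
    case:
      fmt.println("ok")  // nil: nothing went wrong
    }
  }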

One issue with Zig's approach is that, due to its error value inference system, it's very difficult to know what possible error values a procedure returns just from __reading__ the code (assuming `!T` or `anyerror!T` is used); you have to compile the code in order to determine that. This may be okay for a single developer, but if you are on a larger team you are now relying on either external documentation (which Zig can provide) or the compiler telling you this. With a specific (error) set this is not a problem because it's explicit to the reader.

Most of the time you are reading code, not writing code.

Odin's type system and feature set are rich enough that you can achieve everything you want _extremely cleanly_. You can draw many parallels between Odin and Zig: `errdefer` would be akin to `defer if err != nil {` in Odin, `try` would be akin to `or_return`, etc.
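
A hypothetical sketch of that `errdefer` parallel (with an `enum` error the check is against `.None` rather than `nil`):

  Session :: struct {connected: bool}
  Error :: enum {None, Connect_Failed}

  connect :: proc(s: ^Session) -> Error {
    s.connected = true
    return .None
  }

  open_session :: proc() -> (s: ^Session, err: Error) {
    s = new(Session)
    defer if err != .None { free(s); s = nil }  // runs only on the error path
    connect(s) or_return
    return
  }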

n.b. Someone recently wrote an article which is closer to my own position on error value handling (with some of the author's own flair and preferences) here: <http://tuukkapensala.com/files/software_does_not_contain_err...>

Thank you again for writing this article. And as I always say, I highly recommend people try out as many tools (which includes programming languages) as they can to find what are the best tools for your problems!

Regards,

Bill


Very well articulated. That link you posted is great as well. The issue I've always had with exception systems is the loss of control over the control flow, but the problem with errors is that it's not that clear what situations are errors. Libraries that are designed with exception systems are in the business of second-guessing what's an error for the caller.

For that reason, in practice, exceptions tend to (either not be handled or) be transformed/interpreted at (almost) every place before being forwarded up the call stack. In the end exceptions are only an additional mechanism that the programmer painfully has to deal with. It would have been better to use normal values describing the situation (error or not) without any exception semantics attached.

A perspective on the commonly seen argument that all those "if err != nil" are cumbersome is that you shouldn't have many of those, and most of them should be a "return" and nothing more. Complex error state (or any state for that matter) belongs not on the call stack (nor in return values) but it should be stored in data structures so it can be handled appropriately and at the appropriate time.


Allow me to include my mandatory comment about how other languages need to (re)discover Common Lisp's condition and restart system.

The condition system from Lisp gives handlers the option of running before the stack is unwound; other languages put various amounts of state in their exception objects, which only gives you a portion of the power of simply not destroying that state in the first place.

Restarts let callers communicate down the stack what to do when certain conditions happen, which removes the need for second-guessing and allows for better locality of code; having the "When X, Y" closer to the code that is responsible for the decision rather than the code that just happens to be closer to X is almost always a win.


Running various callbacks or whatever is still overthinking the problem, similar to throwing exceptions. There's still a preconceived notion that anybody would want to (or could) handle a certain state right then and there.

In general, some responsible code does something about the state that is created by the library. But in general it isn't the code that issued the request, nor should it happen right when the library detects the error.

In theory, such callbacks are at least as powerful as only storing the state so the client can inspect it later, because a trivial callback is free to do exactly that. However, in practice, callbacks lead to overly complex code (beyond some trivial lambdas), and having to provide a callback encourages bad program structure, creating a temporal coupling between the occurrence of some event and the handling of it.
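
The trivial case from that argument, sketched with hypothetical names: a callback that does nothing but store the state for later inspection.

  Event :: struct {code: int, detail: string}

  record :: proc(log: ^[dynamic]Event, e: Event) {
    append(log, e)
  }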


> Running various callbacks or whatever is still overthinking the problem, similar to throwing exceptions. There's still a preconceived notion that anybody would want to (or could) handle a certain state right then and there.

Sometimes you want to and can, other times you don't. Sometimes you want to log some state that will be destroyed when you unwind the stack. Not being allowed to handle an exception before unwinding the stack is a hindrance.

The common case is that you will unwind the stack; lisp even has a form that does so (handler-case vs handler-bind) because it is so common.

"Where" an exception is handled is two dimensional. There is "where in the source" and "when at runtime" Most exception systems unnecessarily couple the two. If you make an API call that you know will cause a network request somewhere downstack, and at that point in the code instruct it to retry up to 3 times when a connection-refused exception occurs, then you have a usable exception system.

If you can't do that, then just give up and return errors.
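
For contrast, the "just return errors" fallback might look like this hypothetical Odin sketch, with the retry decision passed down as an explicit argument:

  Net_Error :: enum {None, Connection_Refused, Timeout}

  try_fetch :: proc(url: string) -> (data: []byte, err: Net_Error) { return }

  fetch :: proc(url: string, max_attempts := 3) -> (data: []byte, err: Net_Error) {
    for _ in 0..<max_attempts {
      data, err = try_fetch(url)
      if err != .Connection_Refused do return  // success or a non-retryable error
    }
    return  // still refused after the allotted attempts
  }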


There are valid use cases for libraries to take callbacks to parameterize stuff, namely when the library needs something from the client that the client should always be able to provide synchronously, and it would be too much work to create an interface to let the client pass in the information asynchronously.

Example: memory allocation?

But it doesn't work the other way around: the library shouldn't try to anticipate and restrict all the things the user might want to do. Taking dozens of hooks to cover all possible situations will only make the API harder and harder to use.
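
Memory allocation is indeed the canonical good case; Odin bakes that idiom into the implicit context. A hypothetical sketch:

  // the caller supplies the one synchronous "callback" the library
  // needs: its allocator
  read_all :: proc(path: string, allocator := context.allocator) -> []byte {
    buf := make([dynamic]byte, 0, 64, allocator)
    // ... fill buf from path; every allocation goes through the
    // caller-provided allocator ...
    return buf[:]
  }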


You may be misunderstanding. These aren't explicit hooks any more than a thrown exception is. Imagine this pseudo-Python:

  tryNounwind:
    Foo()
  except SomeException as e:
    # This block is run as soon as SomeException is raised, before unwinding the stack

There is still an equivalent to try that doesn't unwind the stack. One major use for this is to call dynamic restart points that intermediate functions have provided. A function implementing an RPC shouldn't have to decide how to handle e.g. a network error, but there are several reasonable things that it could do (retry, fall back to a different host, &c.); it can provide those hooks while defaulting to propagating the error up, and since the caller of the function can run its exception handler without unwinding the stack, those hooks are accessible to it.

It also dovetails nicely with a debugger, because the restarts can be invoked interactively.


No no, I understand. It is callbacks. Or some fancy syntax that creates and registers a closure implicitly. But technically it's just a callback - the library calls back into the user code, synchronously. The syntax was never the problem, it's more like this syntax sugar (or in the case of LISP, let's call it semantic sugar) is making it way too easy to do something complicated like this. My strong opinion is that this is not a good idea to do.

RPC is bad, it's an extremely leaky abstraction (network requests pretending to be synchronous function calls). Maybe that's ok for scripts, or in a compute cluster that can guarantee high availability and low latency, but it's nonsensical for general programming (another reason for slowness and clunkiness of so much software).


> No no, I understand. It is callbacks. Or some fancy syntax that creates and registers a closure implicitly. But technically it's just a callback - the library calls back into the user code, synchronously. The syntax was never the problem, it's more like this syntax sugar (or in the case of LISP, let's call it semantic sugar) is making it way too easy to do something complicated like this. My strong opinion is that this is not a good idea to do.

This is essentially how traditional exceptions work, it's just that it also happens to include a non-local transfer of control first. Every place an exception is raised is an additional surface to the API.

> RPC is bad, it's an extremely leaky abstraction (network requests pretending to be synchronous function calls). Maybe that's ok for scripts, or in a compute cluster that can guarantee high availability and low latency, but it's nonsensical for general programming (another reason for slowness and clunkiness of so much software).

Perhaps this is overfocusing on my example? File system access, database requests, and memory allocation are other examples of things that can fail that have multiple plausible ways of handling the failure, and separating the implementation of handling the failure from the choice of how to handle it can be useful.

[edit]

I mostly agree with you on RPC; I picked that example specifically because it has the most non-local exceptional situations of things I could think of on the spot.


> Libraries that are designed with exception systems are in the business of second-guessing what's an error for the caller

Okay, let's take a concrete example: a library that does a network call in order to accomplish its task. Imagine it sends/receives all relevant data, and then the TCP connection is closed from the remote end with RST instead of FIN: to the library this looks like a RemoteHostClosedError being thrown. What next?

Some libraries would second-guess the caller that it's not an error, swallow this exception and return the data as normal — that's bad, we don't like those libraries.

Some libraries second-guess the caller that it is an error and rethrow it, throwing away the would-be returned data — that's bad, we don't like those libraries.

A good library would instead not second-guess the caller and do... what, exactly? Return both the resulting data and the exception? But that's the case when the error happened on the wind-down, what if it happened in the middle: should the library return some sort of "resumable context" together with an error? Or what?


So, the library received some data. It also received an RST. It should store both pieces of information in the connection handle (library data structure).

The caller can then inspect the state and decide what to do.
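
As a hypothetical sketch of that handle:

  Close_State :: enum {Open, Closed_Fin, Closed_Rst}

  Connection :: struct {
    received: [dynamic]byte,  // everything that arrived before the close
    state: Close_State,       // how the connection ended
  }

  // the caller, at a time of its choosing:
  got_everything :: proc(conn: ^Connection) -> bool {
    // all the data followed by an RST can still count as success here
    return conn.state == .Closed_Rst && len(conn.received) > 0
  }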

There are other options. The library could for example throw an exception and never report the received data. It's a different paradigm - making choices to present a convenient API that works in the common cases, at the cost of taking away control over the less common ones. APIs like this are popular and maybe useful for scripting languages, but probably less useful in larger systems.


I read the article and found myself saying "No, no, no!". Then I read your response here and completely agree with it. How does this difference come about? There's no way to prove who is right. Maybe people's experience of different domains is part of the cause. But mainly I believe the difference comes about due to the different way people's brains work. I think the most harmful thing you can do to a team is mix people together who naturally have different approaches. This aspect never seems to be tackled directly by the interview process.


It's less subjective than that. I find exceptions to sort-of work in "boring" software (short running programs like scripts, or systems comprised of short running processes), or in software with low quality standards (sorry to be blunt).

It's nice that I can bang out a 100 line Python batch script, and catch and transform that one exception, and otherwise just be happy when the script quits automatically because a file could not be created or some input data was malformed. Whatever, I'll "rm -rf" my temp directory and re-try.

Exceptions are also popular in enterprise software - all issues with real-world business logic aside, my idea is that enterprise software doesn't have a lot of complex state after all. There is a domain model that is pretty much single-state as far as most operations on it are concerned. Normally errors are not expected to happen, and for some exceptional situations programmers simply don't care about writing error handling code - the GC will probably clean up to a reasonable state to serve the next request, it will do so in reasonable time, and all will probably just work, and nobody needs to understand what went wrong. (Maybe someone will later have a manual look at the stack trace from when the exception was thrown).

If you're not a "happy path" programmer and need control over what situations are handled and when, I don't see that there is a good way with exceptions - because exceptions take away control (control flow) and take away detailed error information as gingerBill described. In my perception (while I don't read a lot of code and certainly not a lot of GC'ed code) most code never cares to catch exceptions, because it's inconvenient, and because that was the idea behind exceptions.

Hope I made a reasonable argument. Can't see anything subjective about it.


I agree with you here 100%. It's empirically true that all these things commonly happen. So the question is: how do you mitigate these issues, if there are better alternatives? That's what I've been trying to do with the design of Odin: nudge the user away from any of this nonsense whilst keeping the language enjoyable to program in.


Pretty much agreed. Especially web stuff is usually "boring" since it delegates the state to the database, so if something goes wrong there is usually not much interesting to do other than cleaning up resources.


I'm not even saying my approach is "the best" but that there is no silver bullet. Different people and teams will benefit from different styles and approaches. Pluralism in thought and design is better for finding good approaches to things.

If people prefer Odin's approach over Zig's approach or vice versa, that's fine and more power to them.


> One issue with Zig's approach is that, due to its error value inference system, it's very difficult to know what possible error values a procedure returns just from __reading__ the code (assuming `!T` or `anyerror!T` is used); you have to compile the code in order to determine that. This may be okay for a single developer, but if you are on a larger team you are now relying on either external documentation (which Zig can provide) or the compiler telling you this. With a specific (error) set this is not a problem because it's explicit to the reader.

While I see your point that this should ideally have been fixed at the language level, this is still solvable with a style guide and a linter that lets you forbid using !T / anyerror!T, right?

> The purpose of the programmer exists at the time of writing the software, and so errors in respect to that purpose are well-defined in a contained manner. A bug is an error relative to the programmer's purpose in writing the software. An error relative to programmer's purpose that is caused by the software not being executed as described in the source code is a compiler or a hardware bug. These well-defined error classes are categorically different from errors relative to the purpose of an arbitrary future software user, which is the focus of this article.

I'm a bit confused by this part of the abstract. Doesn't the focus on user error completely remove the relevance to programmers, who are more interested in the first "well-defined" class of errors?


" If you are a desktop machine and run out of memory, don’t try to recover from the panic, quit the program or even shut-down the computer."

Taken at face value and without additional context, this statement displays an appalling lack of respect for the user.


It would seem that the clarification in the parent agrees:

> On desktop applications that I have worked on (I know this does not apply to all), if we did run out of memory, the better option is to just crash rather than try to recover. But of course some applications cannot do this, especially when dealing with third-party programs, such as databases (as you state). In those cases, the best you can do is empirically test how much is required and gracefully handle those cases when you do OOM (if possible). That's part of your problem, it's not an external thing. If you can control things, try to.


Among the flaws mentioned in the article was error handling in Odin, suggesting the OP prefers Zig's way. It looks like Odin is actually experimenting with new approaches too [0].

[0]: https://www.gingerbill.org/article/2021/07/05/value-propagat...


>"I didn’t really know how to answer that at first ... Http ... Cert error ... "

Here's what I would have preferred to see:

- A review of design by contract (DbC) and Hoare logic. Getting lost in atom-sized details about how to return errors, rolling into language keywords and exceptions, is pointless without first establishing defined vs. undefined behavior. How should I deal with sqrt of -1? Exceptions? Error code? Errno? Logging? Some Zig keyword? First get straight that the check for negative numbers is the caller's problem, or should be (see the sketch after this list). If caller/callee have decent unit testing with assertions, most of this code can be defined out for a prod release. Passing a URL as a string vs. a URL object from a cache, where it was at least once known as a good URL that DNS likes, is as much about contracts as it is about nitty-gritty like error reporting.

- Some non-trivial C++ libs are exception-free except where a standard requires it. These libs use a pretty clean error API with heavy contract definition. See Bloomberg's open-source STL replacement, BDE. Big. Non-trivial. Clear API. And it works.

- Pigeon-dropping code is unsightly, but also a good measure of how much can go wrong if one bothers to think about it. If you never thought about the contract, lack of pigeon code is ignorance, not help. Here Go isn't the cancer people make it out to be.

- A good alternative to pigeon-dropping code is error callbacks. Rogue Wave's cross-platform C++ lib from the 90s gave me this idea in the context of DBs. Sick of doing a one-line SQL thing and then 10 lines of error checking? The API had a callback with good context to roll back, etc. This way the good- and bad-case code is more consolidated. Erlang supervisors may be like the MT version of this.
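
For the sqrt example in the first bullet, a hedged sketch of the contract style (in Odin, since the thread is about it; `checked_sqrt` is hypothetical):

  import "core:math"

  checked_sqrt :: proc(x: f64) -> f64 {
    // a negative x is the caller's bug, not a runtime error to report;
    // the assert documents the contract and can be compiled out for prod
    assert(x >= 0, "precondition violated: x must be non-negative")
    return math.sqrt(x)
  }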


> I have never had a program cause a system to run out of memory in real software

It's pretty common when your users sporadically upload GB-sized pdfs and you have code that quietly duplicates the pdf octet stream =D.

Luckily, our system is stateless, but it could be far, far worse.


odin is much nicer

zig feels like it forces me to type, constantly, all the time, i get tired at the end of the day.. and the code becomes a huge pain to manage

not to mention the "unused variable" = error; i gave up on the language the day they pushed this change.. very counterproductive



