I’m curious why we see so many papers and discussions about the difficulty in writing Rust relative to other languages. I would like to see more analyses of total cost of ownership.
Yes, Rust requires approaching writing code somewhat differently than most popular languages (different isn’t inherently bad), and it’s true borrow checker limitations cause it to reject code that is sound. But that increase in cost at the front end, in my experience, is significantly offset by the decrease in maintenance costs.
I’ve spent a non-trivial portion of my career hunting down and fixing memory corruption issues, race conditions, and other undefined behavior in extremely large codebases. Yes, we’ve seen great advances in tooling in Clang+LLVM, but writing that buggy C/C++ code is still quite easy (a point I didn’t see in the paper). Using validation tools to their full extent is costly from maintenance, configuration, and compute perspectives, and their correctness assessments are less strong than what an advanced type system provides.
Preventing such classes of bugs through an advanced type system seems like a better approach to reducing total cost of ownership because:
1. Some classes of (expensive to fix) bugs are eliminated by codifying system behavior into the type system.
2. The need for a variety of CI runs with AddressSanitizer, ThreadSanitizer, UndefinedBehaviorSanitizer, etc. and their associated corpus of coverage tests is greatly reduced because such issues are defined away by the language (see the sketch after this list). This saves on CI infra costs and requires engineers to spend less time writing/maintaining tests to exercise the tools.
3. More engineers will spend more time moving the product forward and less time hunting down hard-to-reproduce bugs.
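To make points 1 and 2 concrete, here's a minimal sketch (my own contrived example, not from the paper) of the kind of aliasing/invalidation bug a sanitizer can only catch at run time, and only if a test happens to exercise it, while rustc rejects it outright:
```
fn main() {
    let mut names = vec![String::from("alice"), String::from("bob")];

    // Hold a reference into the vector...
    let first = &names[0];

    // ...then mutate the vector. In C++ this is the classic pointer/iterator
    // invalidation bug a sanitizer might catch at run time if a test hits it.
    // rustc refuses to build it:
    // error[E0502]: cannot borrow `names` as mutable because it is
    // also borrowed as immutable
    names.push(String::from("carol"));

    println!("{first}");
}
```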
Because for most software 'good enough' is the best outcome.
Users don't care if you wrote it in Rust (unless the target audience is developers). All they care about is if it solves their actual problem: a real world, pain-in-the-neck problem.
If you take 20% longer to write it in Rust, that's short term cost for long term gain, but users don't think that way. Everyone from the users, sales, marketing, product and even engineering departments want the product out the door yesterday, so the pressure is always on.
Users don't care if you need 30% fewer CI servers. The CEO doesn't care if one additional dev has to spend his time making sure asan, ubsan, etc. are running properly. That cost is factored into pricing. Coverage metrics are used as proxy metrics for quality.
Meanwhile the devs are wrangling the complexity of the software itself, which requires mental resources. Any additional complexity introduced by Rust slows down development and frustrates the team and stakeholders.
And the bugs are still there. So many bugs come from devs who were sloppy or couldn't tame the complexity that the ±20% due to Rust doesn't matter. The QA team still needs to exist and test the product thoroughly. The issues still come back and require fixing, regardless of language used or frequency of bad bugs.
In the end, this software process produces acceptable, good enough software that gets fixed, patched, updated, etc. Users see improvements and progress.
Operating efficiency becomes important only after the product is out and stable, with paying customers and product-market fit, but by then, why boil the ocean with a rewrite? Sometimes leadership can be sold on a new and improved v2 (now in Rust), and sometimes they see it as a distraction from the current target (if it ain't broke, don't fix it, add more features instead).
That's why I think we need languages that make it less mentally taxing for developers to write product software (because the logistics are already complicated), while features like the borrow checker, though cool, increase the mental overhead.
> That's why I think we need languages that make it less mentally taxing for developers to write product software (because the logistics are already complicated), while features like the borrow checker, though cool, increase the mental overhead.
But all Rust does is expose those issues. For example, do you consider a type checker a burden or a help? Is being able to check at compile time that "hey, why are you trying to pass a nullable to a string? That doesn't make sense" helpful? I can't count how many times that has saved me time debugging issues.
I agree that Rust is not for everyone. Not everyone will like types even if they're good for them. I would argue that a language that can accept /dev/random as source would be very popular, especially because you can get started very easily, even if that means you will spend the rest of your life debugging it.
It's not whether it's a burden or help, those words have connotations associated with them.
The type system is objectively overhead, because it forces you to think about the types in your program, so it consumes mental bandwidth.
You and I might decide the cost is well worth it, but you cannot ignore that there is a cost. In the real world, many companies didn't want to incur that cost in the beginning, so they built their products using scripting languages (Python/PHP/Ruby, etc.) and only much later added types via annotations or features of the language. Even C++ recognized that there is a sizable overhead and introduced auto type deduction.
Your /dev/random example makes no sense because it would not generate a working program, so it would not be popular.
There is no 'good for you'. Is the type system good for you if it slows down your velocity? Are dynamic types good for you if you get an exception deep in the code one hour before launch because someone was adding a string to a number?
There are only trade-offs. You either make the right trade-off (even by luck) given some set of goals and limitations, or you don't and pay the price one way or the other.
>The type system is objectively overhead, because it forces you to think about the types in your program, so it consumes mental bandwidth.
I don't understand this point. Types are an overhead, yes, but it's an overhead that you _have_ to deal with. Would typing without knowing what letters would come up be a lesser burden? Seeing the code I just wrote seems to be an overhead too. Types are simply there to tell you what you wrote. Much like when I realize I mistyped `j` instead of `k`, I would also realize that I am trying to append an `Object` to a `number`.
> Your /dev/random example makes no sense because it would not generate a working program, so it would not be popular.
My point is that a language that will literally, and I mean literally, accept any and all source code and execute it is probably going to be popular. Just imagine a language designed with this goal in mind: to make the set of possible programs in the language the entire space of possible sequences of text. For example,
```
00000what????;
function hello() {
world
}
```
This seems meaningless. But with that mantra in mind we can simply define that any undecipherable line is a declaration of an unused string. Why not? So the first line is simply something like `let _ = "00000what????";`, and the second one is simply a function that returns the string `world`. Why? Oh, because it's so inconvenient to put quotation marks around a string, isn't it?
I think you'd agree that this language is horrible. It might be faster to get to a prototype but I personally can't imagine maintaining a codebase written in this language.
Have you considered that this is exactly what JS is like (in spirit)? It defines so many things that would otherwise not be defined in other languages. For example, what does this even mean? Without looking at the spec, can you tell me what it does?
```
function hello() {
  return 1.0 + "hello"
}
console.log(hello())
```
Does this program even make sense? Does it make any more sense than something like
```
"helloworld"();
```
Not even JS will run this; it throws a TypeError at run time, essentially a "function not found". But why draw the line there? Why not define this one too, since this clearly works:
```
const a = {
  "hello": () => { console.log("world") }
}
a["hello"]()
```
The set of possible programs in JS is much, much larger than in most other languages. There are so many little tricks the language provides that make it so powerful and so much worse.
You hit the nail on the head with long term vs short term gain, but I think there is an additional factor to consider: software generally isn’t frozen at v1. You should also consider if using Rust reduces the toil of incremental changes to your software after its gold release.
It’s always hard to convince management of the business value of a v2 rewrite, especially if v1 works “well enough” from the outside :( Sometimes it’s easier to just front load the cost to avoid significant churn later.
I also use a lot of these points when I explain why I use Node.js and avoid TypeScript for most of my projects.
The thing is though, it really depends on the project.
Most of my projects are web UI heavy with simple commodity CRUD RESTful APIs sprinkled with business logic.
Time to market tends to be the biggest constraint (I live in startup world), so "good enough" performance and a loose type system are perfect for that.
If I was writing SCADA controllers for a nuclear power plant, my choice of technology would be much different.
The sorts of "rah rah" debates we have on HN about languages frequently don't take the specific project characteristics into account.
Like with everything, it boils down to intelligently selecting the right tool for the job.
Totally agree! But they do care if Word corrupts a document that’s been open for 10 days (a bug that a principal engineer spent many, many weeks hunting down).
> If you take 20% longer to write it in Rust, that's short term cost for long term gain, but users don't think that way.
There are many steps between writing code and delivering a product to users. Sure, I can commit some C++ in less wall time than Rust (though I’m not actually sure that’s true!). But when we have to delay a major release by 2 weeks because there’s a heap corruption bug in the flagship feature, everyone loses.
> the pressure is always on
Sigh, yeah. It’s difficult for organizations to act in their own best interests. If a codebase primarily has architecture debt and little implementation quality debt, it’s much easier to keep taking the shortcuts to ship now. But with implementation quality debt, UB is lurking behind every corner and can cause the project to go off the rails at any time. This is why I would like to see more research about the value of Rust with respect to the predominant cost of engineering in extremely large systems: maintenance.
> Users don't care if you need 30% fewer CI servers
Sure, but DevOps does. When we have to drop jobs due to capacity constraints but mark the commit as stable anyway, it’s just a matter of time until a major regression sneaks in.
> The CEO doesn't care if one additional dev has to spend his time making sure asan, ubsan, etc are running properly
Haha, yeah, in practice it's always one dev maintaining the asan, ubsan, etc. loops! But it's up to the engineers writing code to make sure it's exercised by those systems, so in reality the loops are highly underutilized and eventually a hard-to-find UB bug slips in.
> Meanwhile the devs are wrangling the complexity of the software itself, which requires mental resources.
Yes. And encoding ownership and lifetimes into the type system is a huge reduction in the mental tax when working through code that hasn’t been touched in years, or wiring up some feature to an interface another team just landed that’s not documented at all. Not having to reverse engineer the code to discern ownership and concurrency patterns is a huge time saver.
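For example (hypothetical signatures, purely to illustrate), the types alone answer questions that in a large C++ codebase usually mean reading the implementation or tracking down whoever wrote it:
```
// Borrows the buffer read-only: the caller keeps ownership, nothing mutates.
fn checksum(data: &[u8]) -> u32 {
    data.iter().map(|&b| b as u32).sum()
}

// Needs exclusive access for the duration of the call: nobody else can be
// reading or writing `log` concurrently while this runs.
fn append_entry(log: &mut Vec<String>, entry: String) {
    log.push(entry);
}

// Takes ownership: the caller hands the connection off for good, so no other
// code can use it (or free it) behind this worker's back.
fn spawn_worker(conn: std::net::TcpStream) {
    std::thread::spawn(move || {
        // ... use conn; it is closed deterministically when the thread exits
        drop(conn);
    });
}
```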
> Any additional complexity introduced by Rust slows down development and frustrates the team
I agree the artificial complexity due to limitations of Rust's soundness checks is frustrating, but it's a cost I'm willing to pay given the long-term maintenance benefits. But the cost of system design complexity has to be paid at some point, and I'd rather pay as much of it as possible at compile time vs. discovering problems at run time.
> And the bugs are still there.
Certainly. But essentially eliminating classes of UB bugs that are hard to repro saves all the engineering teams time.
> Operating efficiency becomes important only after the product is out and stable, with paying customers and product-market fit
Yes. Total cost of ownership is what I want to see more research in.
And I think this is the source of our differing perspectives. My original comment wanted more research on TCO, but I don't think that's particularly applicable before product-market fit.
> but by then, why boil the ocean with a rewrite?
Who said anything about a rewrite? Use Rust for new development. And, as the system grows, components will be rewritten (whether due to new business needs or to improve maintainability). Don’t keep playing with fire when it’s no longer necessary!
> Sometimes leadership can be sold on a new and improved v2
Perhaps I’m jaded, but my motto is “Rewrites always fail.”
> we need languages that make it less mentally taxing for developers to write product software
For extremely large systems whose codebases have been written over decades, we need languages that make it less mentally taxing for developers to maintain product software. The complexity of maintaining these systems is why FAANG employs tens of thousands of engineers who seem to move slower than an aircraft carrier. The difficulty in reasoning about system behavior greatly impedes progress.
> Totally agree! But they do care if Word corrupts a document that’s been open for 10 days (a bug that a principal engineer spent many, many weeks hunting down).
Caused by memory corruption, or the remaining 30% of logical bugs present in any safe language?
1. Mismatched type definitions in different translation units caused an implicit 32 to 16 bit conversion.
2. An addition to the 16 bit value overflowed, causing it to become negative. (The “run for 10 days” thing helped the value get to the point of overflowing.)
3. The negative value led to an out of bounds write, corrupting a key bookkeeping data structure.
Rust would have failed to compile at step 1, and would have panicked at step 3 (and potentially step 2, depending on compilation settings). I'm not sure what other safe languages are candidates for this domain, but I suspect they would identify these issues similarly.
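Roughly sketched, with made-up values (the real code was obviously far more involved), here's how those three steps play out in Rust:
```
use std::convert::TryInto;

fn main() {
    // Step 1: no implicit narrowing. The 32 -> 16 bit conversion has to be
    // written out, so the cross-translation-unit mismatch is visible:
    let big: u32 = 40_000;
    // let narrow: u16 = big;              // error[E0308]: mismatched types
    let narrow: u16 = big.try_into().expect("doesn't fit in u16");

    // Step 2: `narrow + 40_000` panics in a debug build (and in release with
    // overflow-checks on). wrapping_add shows the garbage value the C++ code
    // silently ended up with:
    let index = narrow.wrapping_add(40_000); // wraps around to 14_464

    // Step 3: even with the bogus index, the write is bounds-checked and
    // panics instead of scribbling over a neighboring data structure:
    let mut table = vec![0u8; 1024];
    table[index as usize] = 1; // panic: index out of bounds
}
```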
That bug, unfortunately, was identified after release. The release-blocking heap corruption bug I mentioned was due to a C++ object being deleted while it was calling back into an event handler, and then writing to its own member variables after the handler returned. Ownership and lifetimes would have prevented this design error (which is surprisingly not that uncommon).
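Sketched very loosely (hypothetical Widget type, and deliberately non-compiling, because that's the point), the shape of that bug can't even be expressed in safe Rust:
```
struct Widget {
    count: u32,
}

impl Widget {
    fn on_event(&mut self, handler: impl FnOnce()) {
        handler();       // in the C++ version, the handler deleted the object
        self.count += 1; // ...and this line then wrote through freed memory
    }
}

fn main() {
    let mut w = Widget { count: 0 };
    // Expressing "the handler destroys the widget" doesn't borrow-check:
    // `w` is already exclusively borrowed for the duration of on_event.
    w.on_event(|| drop(w)); // error[E0505]: cannot move out of `w`
                            // because it is borrowed
}
```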
> Mismatched type definitions in different translation units caused an implicit 32 to 16 bit conversion.
Rust would not have failed to compile at step one necessarily. Or were you assuming a full revamping of the build and dependency management systems in addition to the rewrite of the program itself?
If we need to clean up the build and dependency management system to start using Rust correctly, isn't that two major migrations?
> Rust would not have failed to compile at step one necessarily.
True. In this case there was old code and new code. I assumed that in a hypothetical hybrid Rust codebase, bindgen would be used to bring the types from the old code into Rust, so the compiler would identify the type mismatch, perhaps in an impl From.
> If we need to clean up the build and dependency management system to start using Rust correctly, isn't that two major migrations?
At the scale of Office, managing the build system is an evergreen project staffed by dozens of engineers. Adding support for Rust integration is a typical deliverable for such an org.
> My point was that data can be corrupted due to logic bugs as well
For sure. But in the 10+ years I worked on Office, I spent far more time per UB bug than per logic bug. One UB bug could take weeks to fix. I don't recall any logic bug taking more than a few days.
Developer productivity would be much improved if logic bugs were the only class of bugs.
I think this comes from people trying to use Rust to replace fast GC languages (e.g. C#/Java). Rust is much lower cognitive overhead than writing correct C/C++, but it's a lot harder than writing code where memory is automagically dealt with for you.
I am still in awe that I could write an entire emulator in Rust without encountering heap corruption, use-after-frees or segfaults in host code (as opposed to guest code, the code under emulation). It's a truly amazing language.
Java makes it a lot more painful to interact with native code (which I have to do often), while offering much worse performance characteristics and no memory safety while doing so. I also wouldn't enjoy dealing with NullPointerExceptions all the time — not a memory safety issue, but also something Rust deals with better.
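For instance, a toy sketch (made-up find_user) of why Option is less of a foot-gun than null:
```
// A contrived lookup, standing in for a Java API that would return null:
fn find_user(id: u32) -> Option<String> {
    if id == 42 { Some("alice".to_string()) } else { None }
}

fn main() {
    // The "might be absent" case lives in the type, so forgetting to handle
    // it is a compile error instead of a NullPointerException in production.
    match find_user(7) {
        Some(name) => println!("hello, {name}"),
        None => println!("no such user"),
    }
}
```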
For an emulator, just go with SPARK :-) Ada didn't bring lots of memory safety until the recent work on ownership, but yes it's all getting there. If you have to manage registers, bitmasks, complex bit-precise data structures, and weird non power of 2 bit sizes or even specific bounds constraints, you'll be in paradise.
TCO is not that important in most cases compared to finding out what to build, establishing the user base, and not failing due to slow iteration speeds at the beginning. If the product takes off, the value derived from the software tends to be more than the cost, so discovering more things to build is better than optimizing software expenses in the second half of the lifecycle.
Of course there are exceptions: sometimes the potential upside delivered by the software is small and capped, the requirements are fixed in advance, and the implementation path is well trodden...
Because the language's resource allocation style is not so much a "cross-cutting concern" as something which is all-pervasive across every single non-trivial program written in that language. For a long time, we had four styles: Static allocation, purely manual management of dynamic resources, garbage collection (usually with manual management of some resources), and "don't worry about it" for truly high-level declarative languages. RAII adds something to this, but C++ didn't give up new and delete (or malloc and free... or anything whatsoever) so it isn't really a wholly new paradigm. Rust, however, adds something that's genuinely new to most programmers, and it requires a re-working of the thought-styles, so it's worth thinking about whether it's worth it.
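For anyone who hasn't tried it, a minimal sketch of the genuinely new part: ownership moves, and the compiler tracks who is responsible for freeing what:
```
fn consume(data: Vec<u8>) {
    println!("got {} bytes", data.len());
} // `data` is dropped (freed) here, deterministically, with no GC and no free()

fn main() {
    let buffer = vec![1u8, 2, 3];

    // Ownership moves into `consume`; it is now responsible for the memory.
    consume(buffer);

    // Touching `buffer` afterwards is a compile error, which is the part
    // that has no real analogue in the older allocation styles:
    // println!("{:?}", buffer); // error[E0382]: borrow of moved value
}
```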
> it's worth thinking about whether it's worth it.
Totally agree we need to evaluate the trade offs of various paradigms. But, I think the most important aspect to understand is the impact on maintenance as opposed to writing.
I’m not sure what my ratios of time spent on reading vs. writing vs. debugging code are, but I am certain writing is what I do least, which is why I’d like to see more research on total cost of ownership.