One issue I have had with Rust applications is the huge binary size (yes, I know this has improved a bit lately). Is there a good comparison between kernel C and kernel Rust code in this regard?
The thing is, you lose a lot of nice features when you do this, like panic unwinding, debug symbols, stdlib… for kernel and some embedded development it’s definitely important, but for most use cases, does it matter?
In ye olden days it was common to distribute a binary without debug symbols, but to keep a copy of them for every released build. If an application crashed (panicked, signalled, etc.) you got a core dump that you could debug using the stripped binary together with the symbol file. This gave you both smaller binary sizes and full debugging capability at the cost of some extra administration. I'm not sure if this is possible with "stock" Rust, but if you need lean binaries but want to do forensic investigation it's something to look into.
It's possible, but Rust follows platform conventions in only doing this by default on Windows. However, it is now easy to configure by setting split-debuginfo in your Cargo.toml [1].
Most size issues come from not using release builds, using too many dependencies, or overuse of generics. The Rust std lib being linked in statically also contributes. The kernel shouldn't suffer from any of these problems. Plenty of embedded projects are able to use Rust in highly constrained environments without size issues compared to C.
If you do release builds, strip debug symbols and turn LTO from thin to full, do dependencies and the static stdlib still matter? You should be only paying for code that's called at that point.
At that point I suspect the biggest culprits are overuse of monomorphisation, and often just more stuff happening compared to equivalent C++ code because the language makes larger code bases more maintainable. I'd also count some niceties in that category like better string formatting or panic handling, which is an insignificant cost in any larger software but appears big in tiny hello-world type programs.
Overuse of monomorphisation by the community is typically the right choice, given that the average use case is servers, where it need not make a meaningful difference. As with any ecosystem, choices folks make may not be suitable for everyone. Those differing requirements require fracturing into smaller ecosystems which share common requirements. Ultimately, that's what's happening with Rust and it's very healthy to see in my opinion. You can't force the overall ecosystem to optimize for a minority of users.
This is also true for everything in general. Having the one best thing for foo isn't as helpful as an array of choices, each with different tradeoffs. You simply choose the one best for your needs, whether it's cheese at the supermarket, a web server framework, operating system intrinsics, or command line argument handling. Some things can be standardized and serve as a common base for everyone, but it's challenging to do that without leaving out at least one person's requirements. Standards also always feature creep until someone tries to reset it with a new standard which is less complex, but I guess that's a different topic.
I believe the menu of options is undesirable until you actually know you have requirements you can evaluate them against. As much as possible even if I do have a choice, there should be a default and I needn't be asked. When I make a Rust project, cargo notices I have git and, since I didn't say otherwise, it mints a Git repo for the new project automatically. It doesn't insist on asking if I want one, and then asking if it should have the obvious name, and then asking if it should use my default local git settings, the defaults for all of these are IMNSHO obvious and my tacit approval via not having explicitly turned this off is enough.
Do you want a Doodad, a Gooba or a Wumsy? No idea? Me either. So until I care, I'd rather not be asked to choose. But once I discover that I need something with at least 40% Flounce, I can see that Doodads and Goobas both are rated at 50% Flounce, whereas Wumsy has only 10% Flounce, now we're making an informed choice, it should be easy enough to insist on a Doodad to meet my requirement.
If I measure that Monomorphization is out of hand in my codebase I can use dyn to get that back under control for a fair price, but I think the default here is sound.
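To make the tradeoff concrete, here is a minimal sketch of the same function written both ways. The function names are made up for illustration. The generic version gets one compiled copy per concrete type it is called with, while the `dyn` version is compiled once and called through a vtable.

```rust
use std::fmt::Display;

// Monomorphized: the compiler emits one copy of this function
// for every concrete T it is called with.
fn describe_generic<T: Display>(value: T) -> String {
    format!("value = {value}")
}

// Dynamically dispatched: a single compiled copy, with the
// Display implementation looked up through a vtable at run time.
fn describe_dyn(value: &dyn Display) -> String {
    format!("value = {value}")
}

fn main() {
    // Two instantiations: describe_generic::<i32> and describe_generic::<&str>.
    println!("{}", describe_generic(42));
    println!("{}", describe_generic("hello"));

    // One function body serves both calls.
    println!("{}", describe_dyn(&42));
    println!("{}", describe_dyn(&"hello"));
}
```

Whether the second form actually saves anything in a given codebase is something to measure (for example with `cargo-bloat`), since inlining and the size of the function body both matter.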
I agree, but I have yet to see a single real-world example of a Rust project meaningfully reducing its binary size by switching from monomorphization to dynamic dispatch in its own code. Many Rust developers boast that they virtually never use `dyn`, but then still appeal to it when arguing that Rust has dynamic dispatch so monomorphization is an avoidable cost.
Sometimes you can provide `T = Arc/Box<dyn Foo>` where `T: Foo` is required, but only if the trait is designed to be object-safe, not simply by default. If you get to design the trait and all of its consumers yourself, you might have this option, but it's very possible that you're using a library that does not make this possible. You can easily be the first person to bother trying the `dyn` for a trait and running into these limitations.
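A rough sketch of that `T = Box<dyn Foo>` trick, using a hypothetical `Shape` trait in place of `Foo`. The important part is the forwarding impl: unless the trait's author (or you, if you own the trait) provides it, `Box<dyn Shape>` does not satisfy a `T: Shape` bound.

```rust
// `Shape`, `Square` and `Circle` are hypothetical names for illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Circle(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.0 * self.0
    }
}

// Forwarding impl: this is what lets Box<dyn Shape> be used where
// `T: Shape` is required. It only works because Shape is object-safe.
impl Shape for Box<dyn Shape> {
    fn area(&self) -> f64 {
        (**self).area()
    }
}

// A generic consumer, as a library might define it.
fn total_area<T: Shape>(shapes: &[T]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    // A single instantiation, total_area::<Box<dyn Shape>>, covers every shape.
    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Circle(1.0))];
    println!("{:.2}", total_area(&shapes));
}
```

If the trait had a generic method or returned `Self`, the `dyn Shape` type would not exist at all, which is the limitation described above.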
Besides that, you might not even have that much control of the concrete type used. For example, if you are generating large schemas with serde, serde decides how that code is monomorphized, not you. In contrast, for better or worse, the path of least resistance in Go is to use a reflection-based serialization framework which has notable runtime costs (that may or may not matter to a given project) but successfully avoids compile time and binary size costs. (There are other reasons that Go binaries end up even larger than Rust ones, this just isn't one of them)
Despite Rust's general principle of giving its users informed choices here, I am not aware of any option that does 100% dynamic dispatch for (de)serialization, so in practice this is a largely unavoidable cost in each project that is decided only by how complex the schema is.
It's also only fair to point out that C++ tends to end up in this place too, mitigated only by dynamic linking and not any magical property of the language itself. Even C can head this way because monomorphizing with macros has the same effect, though due to how such code is structured, it's also less likely to be inlined than C++ or Rust.
That's a fair observation, I know when I was first writing Rust my inclination was to return impl IntoIterator<Item = T> from functions which are going to actually return a Vec<T> because hey, if I change my mind you can still iterate over whatever I give you now instead with no code changes.
But of course that's an anti-pattern because they are in reality likely to forever just return Vec<T> and knowing that helps you. My early choice only makes sense if either I can't tell you anything more specific than impl IntoIterator<Item = T> or I already know I intend to make a change later. So these days I almost always write down what exactly is returned unless either I can't name it or no reasonable person would care.
For serde in particular my guess is that if you need lots of dynamism serde is the wrong approach even though it's popular. It might be interesting to build a different project which focuses on dynamic dispatch for the same work and tries to re-use as much of the serde eco-system as possible. Not work which attracts me though.
Note that `impl Foo` return types don't actually cost anything extra with regards to code-size, the compiler knows what the actual type is and there is no dynamic dispatch. Only actual generics have an impact here, and `impl` in a return position doesn't count.
The code size cost doesn't live in my code, but in yours.
Because I didn't admit you were getting a Vec, if you actually need a Vec you can't just use the one I gave you. You must jump through hoops to turn whatever I gave you into a Vec, bloating your code.
The implementation is pretty clever, it is probably not going to meticulously take my Vec to pieces, throw it away and make you a new one, instead just giving the same Vec. But this trick is fragile, so much better not to even need it.
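For anyone who hasn't hit this, a small sketch of the two signatures and what the caller ends up writing. The function names are invented for the example.

```rust
// Hides the concrete type: callers only know they can iterate over it.
fn ids_opaque() -> impl IntoIterator<Item = u32> {
    vec![1, 2, 3]
}

// States the concrete type: callers get a Vec and all of its methods.
fn ids_concrete() -> Vec<u32> {
    vec![1, 2, 3]
}

fn main() {
    // The caller needs a Vec, so it has to rebuild one from the iterator,
    // even though the value was a Vec all along.
    let rebuilt: Vec<u32> = ids_opaque().into_iter().collect();

    // Here the Vec can be used directly, with no conversion code.
    let direct = ids_concrete();

    assert_eq!(rebuilt, direct);
    println!("{}", direct.len());
}
```

The `collect()` call is the hoop in question. The standard library may be able to reuse the original allocation in a case like this, but that is an optimization rather than something the signature promises.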
Maybe a more specific way to put it is: you only pay for the (combinations of) types you actually use, whether that's in argument position, return position, or even a local binding. So if it's always Vec<T> it's not costing much more in compile time or code size, but if it's sometimes another type then you do now pay for both.
Saying part of the problem is “using too many dependencies” is not an overly helpful thing if the ecosystem keeps on trying to download 3Gb of build dependencies because you tried to use some simple little library. The problem is obvious, it’s the solution that is much more difficult.
It's not a problem when you compare it to C. You have few available dependencies to choose from with C. If you are equally picky and constrain yourself to parts of the ecosystem which care about binary size, you still have more options and can avoid size issues.
For things like a kernel, it is moot as most deps are simply not possible to use anyway.
When you consider the full ecosystem, you really need to compare it to alternatives in largely managed languages like Java, Go, Node, etc. Those binaries are far larger.
> If you are equally picky and constrain yourself to parts of the ecosystem which care about binary size, you still have more options and can avoid size issues.
What's an example of this for, say, libcurl? On my system it has a tiny number of recursive dependencies, around a dozen. [0] Furthermore if I want to write a C program that uses libcurl I have to download zero bytes of data ... because it's a shared library that is already installed on my system, since so many programs already use it.
I don't really know the appropriate comparison for Rust. reqwest seems roughly comparable, but it's an HTTP client library, and not a general purpose network client like curl. Obviously curl can do a lot more. Even the list of direct dependencies for reqwest is quite long [1], and it's built on top of another http library [2] that has its own long list of dependencies, a list that includes tokio, no small library itself.
In terms of final binary size, the installed size of the curl package on my system, which includes both the command line tool and development dependencies for libcurl, is 1875.03 KiB.
[0] I'm excluding the dependency on the ca-certificates package, since this only provides the certificate chain for TLS and lots of programs rely on it.
> If you are equally picky and constrain yourself to parts of the ecosystem which care about binary size, you still have more options and can avoid size issues.
The market and your boss do not care about that. They want tasks X and Y done. You have no time to vet 15 alternatives and pick the most frugal one in terms of binary size. Not to mention that for many tasks you have no more than 3-4 alternatives anyway, and none of them prioritize binary size. What are you going to do? Roll your own? Deadline is looming ever closer, I hope you can live without sleep for several days then.
> if the ecosystem keeps on trying to download 3Gb of build dependencies because you tried to use some simple little library.
Downloading 3GB of dependencies is not a thing that happens in the Rust ecosystem. Reality is orders of magnitude smaller than that. Why are you exaggerating so much?
Some people bristle at the thought of external dependencies, but if you want to do common tasks it makes sense to pull in common dependencies. That’s life.
> Downloading 3GB of dependencies is not a thing that happens in the Rust ecosystem. Reality is orders of magnitude smaller than that.
Assuming they're talking about the built size of dependencies that are left lying around after cargo builds a binary, they're really not exaggerating by much. I have no difficulty believing that there are Rust projects that leave 3GB+ of dependency bloat on your file system after you build them.
To take the last Rust project I built, magic-wormhole.rs [1], the source code I downloaded from GitHub was 1.6 MB. After running `cargo build --release`, the build directory is now 618 MB and there's another 179 MB in ~/.cargo, for a total of about 800 MB used.
All this to build a little command line program that sends and receives files over the network over a simple protocol (build size 14 MB). God forbid I build something actually complicated written in Rust, like a text editor.
This is why Xcode, Android Studio/NDK, VC++ and co have the huge sizes people complain about: compiled binaries for all major variations of compile flags are part of the download.
Also why those GNU/Linux repos are actually multiple DVDs nowadays.
I'm not sure I understand your point with these, as of course no one ever installs the complete repository (e.g. all of Debian), because there's a ton of software in it you don't need or want. Assuming you mean the installation media, at the very least Arch Linux is still less than 1 GB.
Moreover, I think the point in comparing the behavior of Rust dependencies with other ecosystems (C, C++, Haskell, Python) is that most of this cruft is left behind in the individual directories used to build the software. I occasionally write programs to solve some problem, or for fun, and usually I have to download nothing at all, because I can rely on the dependencies supplied by my system and already installed on behalf of other programs (yes, I'm well aware that this doesn't cover all use cases). Rust is fundamentally not designed to work that way, and the large build sizes and huge dependency trees have a multiplying effect on that foundational issue.
I think it was a false equivalence between node_modules and Rust, as if any language where developers rely on a package manager to pull in libraries will necessarily end up at 3GB in size.
> One issue I have had with Rust applications is the huge binary size
Turn off the standard library and your binaries can be incredibly small. This is how it’s used in microcontrollers and the Linux Kernel doesn’t use the full standard library either.
Not quite. Every Rust program will have some code path that may panic, and the default panic handler uses debug formatting, which uses dynamic dispatch, which prevents elimination of the rest of the printing machinery.
There’s an unstable panic_immediate_abort setting that makes Rust panics crash as hard as a C segfault, and only then can you get rid of a good chunk of the stdlib.
The printing machinery is quite unfortunate. Beyond being large, dynamic dispatch makes any attempt at stack size analysis much harder.
I’ve used Rust for some embedded side projects and I really wish there was a way to just get some unique identifier that I could translate (using debug symbols) to a filename and line number for a crash. This would sort of be possible if you could get the compiler to put the filenames in a different binary section, as you could then just save the address of the string and strip out the actual strings - but today that’s not possible.
The printing machinery alone is quite large when you consider that it includes the code & raw data for Unicode, whether or not similar facilities were already available on the host libc. Though you're not likely to avoid that in any non-trivial Rust program anyway, as even a pretty barebones CLI will need Unicode-aware string processing.
I generally find Rust binaries to be "a few" megabytes if they don't have an async runtime, and a few more if they do. It has never bothered me on an individual program basis, but I can imagine it adding up over an entire distribution with hundreds of individual binaries. I see the very real concern there, but personally I would still not risk ABI hazards just to save on space.
So one issue I can imagine being the culprit with Rust is the specializing, C++-style semantics of Rust generics. Generic C code tends to be void* flavored or to point to a struct of function pointers, which will generate less code. Not sure how this translates to the kernel setting though.
That is true. Rust makes it easy to overuse monomorphisation. There are tools like `cargo-bloat` that find these.
However, most complaints are about size of “Hello World”, which in Rust is due to libstd always having debug info (to be fixed soon), and panic handling code that includes backtrace printing (because print to stdout can fail).
Printing of backtrace is very bloaty, because it parses and decompresses debug info.
Another thing just to mention here is `strip`, which IIRC `cargo build --release` doesn't do by default. I think `stripping` binaries can reduce binary size by up to 80-85% in some cases (but certainly not all; just tried it locally on a 1M rust binary and got 40% reduction).
You can strip C compiled binaries too, and that halves the binary size. The point is that, for example, a hello world Rust binary is 300KB after stripping while the C one is 15KB. A difference of 20 times.
Such comparison exaggerates the difference, because it’s a one-time constant overhead, not a multiplicative overhead.
i.e. all programs are larger by about 285KB (300KB minus 15KB), not larger by 20x.
Rust doesn’t have the privilege of having a system-wide shared stdlib to make hello world executables equally small.
The overhead comes from Rust having more complex type-safe printf, and error handling code for when the print fails. C doesn’t handle the print error, and C doesn’t print stack traces on error. Most of that 200KB Rust overhead is a parser for dwarf debug info to print the stack trace.
That might be true for Hello World, but libgcc_s is where a lot of builtins for C itself go, so you'll find it ends up linked into a lot of non-trivial C programs as well. See https://gcc.gnu.org/onlinedocs/gccint/Libgcc.html
That's not why rust statically links the runtime. The main benefit is that they don't have to try to design and maintain a stable ABI for it. Which is not moot.
More generally, you statically link something to avoid distribution hassles of various kinds, not because you care about the specific number.
It would be nice if it fit on a standard size 1.44MB floppy, but given that I haven't used a floppy drive in about a decade, yeah I guess it doesn't matter much.
I think my current motherboard does not have pins for a floppy drive, but every motherboard I've owned before that does. I just kept moving the floppy drive from chassis to chassis every time I upgraded just in case I needed it. IIRC the last time I used a floppy was either to archive old data to CD-ROM or boot the computer when I couldn't find a USB thumb drive.
I do still own my first computer, an IBM PS/2 Model 50Z, which still has its original floppy drive. Other parts I upgraded -- the 286 was replaced with a 386 SX/Now!, the 30MB ESDI was upgraded to 100MB, and it now has a full 2MB of RAM. I keep the floppy drive because it reads disks that no other floppy drive has been able to read.
Do I understand it correctly that upgrading to a new rust version is mostly implementing new best practices and new features, instead of needing to "fix" your code, as rust is backwards compatible?
I've only used rust nightly for my own projects and didn't give too much thought about rust versions
And you can use clippy to tell you about changes you should make.
For example, in my projects I run this in the CI pipeline:
cargo clippy --all-targets --all-features
and
cargo fmt --all --check
In addition to the regular test and build steps.
This both means that I follow clippy recommendations and cargo fmt in the first place, and also that my CI tells me about any clippy changes if I didn’t notice them myself as well as any formatting I’m not following. In my main IDE I auto format the code of course. But sometimes I make small changes in vim and don’t run the format step myself so it’s nice to have for that reason as well.
For the integration of Rust into the Linux kernel I imagine it’s a bit more convoluted.
For Rust on Linux it's a bit more involved because they use nightly features, which can change from day to day. In practice there are implicit tiers, with features that rarely change and features that frequently change in bursts. I would like it if we formalized that distinction a bit. We already mark when a feature is very likely to change (unstable_features), but not when it is very close to being stabilized.
Rust-for-Linux made it more complicated for themselves, because they chose to enable unstable/experimental features of the compiler without waiting until they’re released, so they don’t get the stability and compatibility guarantees that normal Rust projects get.
I was confused when reading this because I was pretty sure that using other allocators had been supported for a while in Rust. From refreshing myself on the details, it seems that replacing the default allocator is stable (https://doc.rust-lang.org/std/alloc/trait.GlobalAlloc.html), but the API for arbitrary allocators (which includes stuff like being able to do "zero-size" allocations) is not yet stable (https://doc.rust-lang.org/std/alloc/trait.Allocator.html). I guess if there were ever a project that needed fine-grained control over how allocators work, it would be the kernel.
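For reference, the stable half of that looks roughly like the sketch below. `CountingAlloc` is a made-up allocator that just forwards to the system allocator and counts requests; the `GlobalAlloc` trait and the `#[global_allocator]` attribute are the real, stable parts.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical allocator for illustration: forwards every request to the
// system allocator and counts how many allocations were made.
struct CountingAlloc;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        // A null return value would signal allocation failure.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Replaces the default allocator for the whole program.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

fn allocations_so_far() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}

fn main() {
    let before = allocations_so_far();
    // black_box keeps the optimizer from removing the unused allocation.
    let v: Vec<u64> = black_box(Vec::with_capacity(16));
    let after = allocations_so_far();
    println!("allocations during with_capacity: {}", after - before);
    drop(v);
}
```

What this cannot do on stable is pick a different allocator per collection, which is where the unstable `Allocator` trait comes in.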
There's also https://docs.rs/allocator-api2/latest/allocator_api2/ -- I end up using this more often than I end up using custom allocators, simply because it has a stable version of the currently-unstable functions for constructing uninitialized Vec<MaybeUninit<u8>>s.
GlobalAlloc can be used for fallible allocations: its functions just return a null pointer on failure, as with malloc() and realloc() in C. The main limitations are around the safe heap data structures in the standard library, which don't stably expose any fallible APIs except for Vec::try_reserve().
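As a quick illustration of the one fallible API that is stable, here is a sketch built on `Vec::try_reserve()`. The helper name `append_fallibly` is invented for the example.

```rust
use std::collections::TryReserveError;

// Grows the buffer fallibly instead of aborting on allocation failure.
fn append_fallibly(buf: &mut Vec<u8>, data: &[u8]) -> Result<(), TryReserveError> {
    buf.try_reserve(data.len())?;
    // Enough capacity is now reserved, so this will not reallocate.
    buf.extend_from_slice(data);
    Ok(())
}

fn main() {
    let mut buf = Vec::new();
    append_fallibly(&mut buf, b"hello").expect("small allocation should succeed");
    println!("{}", buf.len());

    // A request that overflows the maximum capacity is reported as an
    // error rather than a panic or abort.
    let mut huge: Vec<u8> = Vec::new();
    println!("{}", huge.try_reserve(usize::MAX).is_err());
}
```

Most other constructors, `Box::new` and `Vec::push` included, have no stable fallible counterpart, which is the gap being discussed.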
Hmm, I'd expect that would mean that it's possible to add those APIs today rather than requiring the `Allocator` trait. Is the idea that the Allocator would be a parameter (maybe a generic one) when calling `try_new`, so they don't want to stabilize anything now?
The Allocator type is an unstable parameter on the heap type; Vec<T> is unstably Vec<T, A>, Arc<T> is unstably Arc<T, A>, and so on. (The allocator "A" defaults to Global, which is an Allocator that forwards to the registered GlobalAlloc.) I think the Linux kernel also wants an Allocator trait for other reasons than fallibility, such as allocating different kinds of objects on different heaps.
It seems prudent to limit Rust usage in the kernel until that list can be burned down to zero. It makes sense that you need to at least get Rust in the kernel to find out what missing features you need to have implemented and stabilized, but excessive use will make folks' lives painful as they try to track upstream Rust releases.
Please bear in mind that Linux has used non-standard GCC extensions to C for decades as well. The tradeoffs here are their call to make.
Besides, at this stage, it makes perfect sense for Linux to use unstable Rust features. It was one thing to say Rust should be great for writing kernels, it's another to actually get feedback on how it needs to be better, and that's only possible if the potential improvements are motivated by those who need them and incubated without the constraints of backwards compatibility or the risks of locking in permanent tech debt.
Rust's unstable feature concept was designed for exactly this kind of freeform evolution and it's working exactly as intended. As for the specific tradeoffs being made in Linux, its contributors are in a much better position to weigh those than we are.
What you propose is exactly what's been done by the kernel. They are integrating the language in a non-mandatory way, to both exercise the kernel side and the language itself. The unstable features haven't been stabilized because either they have open questions on their implementation (and having a customer using them helps define them) or no-one has cared enough to complete them (and having a customer using them gives them the extra push). Either way what's happening now is exactly the process you are proposing.
The article is about updating the Rust version the kernel targets where a feature they use (offset_of) was stabilized.
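For anyone unfamiliar with it, `offset_of!` returns the byte offset of a field within a type, much like `offsetof` in C. A minimal sketch, assuming Rust 1.77 or later and a made-up `Header` struct:

```rust
use std::mem::offset_of;

// Hypothetical struct for illustration. #[repr(C)] fixes the field order
// and uses C's padding rules, so the offsets are predictable.
#[repr(C)]
struct Header {
    tag: u8,
    length: u32,
    id: u16,
}

fn main() {
    // tag sits at the start, length is padded up to its alignment,
    // and id follows directly after length.
    println!("tag:    {}", offset_of!(Header, tag));
    println!("length: {}", offset_of!(Header, length));
    println!("id:     {}", offset_of!(Header, id));
}
```

On common targets where `u32` is 4-byte aligned this gives offsets 0, 4 and 8. Kernel code needs this kind of thing for patterns like recovering a containing struct from a pointer to one of its fields.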
If you ignore dependencies and stick to stable features yes.
If you include dependencies then it can happen that a dependency relies on unstable features. In which case you might have to upgrade the library version (if they support the new compiler version). The library might have changed the API by then which would force you to change your code.
Except for the above use case, upgrades to the latest version of the compiler have been painless for me.
Dependencies are not allowed to use unstable features either with the stable compiler. The only exception is the standard library, which uses numerous unstable features even with a stable distribution of Rust.
Worth briefly explaining the rationale for this (stdlib gets to use unstable features)
Rust's stdlib is maintained with the rest of the language and by the same broad team, so, if you're tweaking unstable feature X, you are also responsible for ensuring the stdlib people using feature X sort that out. I'm not sure if Rust's internal policies mean you shouldn't land a change to the main tree without accompanying stdlib patches, or whether you're only required to give them adequate notice, but either way it's not going out the door in a stable release being incompatible with its own implementation.
This couldn't really work with 3rd party libraries.
There's a second category of unstable features to mention here.
Some of the features are essentially perma-unstable, because they're exposing some compiler intrinsics for the library to be able to use. This is the equivalent of things like __builtin_* for C compilers.
So this is a huge tangent but couldn't most of the uses of the non_null!() macro in this diff just be (safe!) pointer comparisons or subtractions, without the unsafe{} logic to convert a pointer value to a reference just for the purposes of comparison or subtraction? https://lore.kernel.org/lkml/20240217002717.57507-1-ojeda@ke...
Although I think I recall that the kernel doesn't use Rust's standard library (which is consistent with the diff you linked), it's possible that the standard library's documentation on the pointer subtraction might reference a concern they could share (https://doc.rust-lang.org/std/primitive.pointer.html#method....):
> If any of the following conditions are violated, the result is Undefined Behavior:
> * Both the starting and resulting pointer must be either in bounds or one byte past the end of the same allocated object.
> * The computed offset cannot exceed isize::MAX bytes.
> * The offset being in bounds cannot rely on “wrapping around” the address space. That is, the infinite-precision sum must fit in a usize.
> Most platforms fundamentally can’t even construct such an allocation. For instance, no known 64-bit platform can ever serve a request for 2^63 bytes due to page-table limitations or splitting the address space. However, some 32-bit and 16-bit platforms may successfully serve a request for more than isize::MAX bytes with things like Physical Address Extension. As such, memory acquired directly from allocators or memory mapped files may be too large to handle with this function.
> Consider using wrapping_sub instead if these constraints are difficult to satisfy. The only advantage of this method is that it enables more aggressive compiler optimizations.
If their pointer subtraction uses similar semantics, there might be issues if they want to compare pointers from different allocated objects, are worried about 32-bit or 16-bit platforms, or maybe even consider the performance concerns too worrisome. The Rust I write tends to be a bit higher-level and doesn't require unsafe, so my instinctual reaction to "using unsafe for performance" generally errs on the side of abject terror, but it's a fundamental part of what makes the safe side of the abstractions I use possible, and the kernel is probably one of those places that needs to do that sometimes. So I'd reluctantly have to admit I'm probably not qualified to evaluate whether these cases would merit unsafety for performance alone, but the first two concerns sound like legitimate things that the kernel would need to handle.
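To put the safe and unsafe options side by side, here is a small sketch in plain Rust (not kernel code, and `elements_between` is a name I made up). Comparing raw pointers and doing integer arithmetic on their addresses needs no `unsafe`; only `offset_from` carries the same-allocation requirements quoted above.

```rust
// Safe: subtracts the addresses as integers, so it never relies on the
// two pointers belonging to the same allocation.
fn elements_between(start: *const u32, end: *const u32) -> usize {
    (end as usize).wrapping_sub(start as usize) / std::mem::size_of::<u32>()
}

fn main() {
    let data = [10u32, 20, 30, 40];
    let start: *const u32 = data.as_ptr();
    let third: *const u32 = &data[2];

    // Comparing raw pointers is safe: nothing is dereferenced and no
    // reference is created.
    println!("{}", start < third);

    // Address arithmetic through integers is also safe.
    println!("{}", elements_between(start, third));

    // offset_from is unsafe: both pointers must point into the same
    // allocation, which holds here because both come from `data`.
    let apart = unsafe { third.offset_from(start) };
    println!("{}", apart);
}
```

The integer version gives up the optimizations that the documentation mentions, so whether it is an acceptable substitute in the kernel is a judgment call for the people maintaining that code.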
> Although I think I recall that the kernel doesn't use Rust's standard library
Rust's standard library has three elements:
core has the stuff you get with the Rust language itself. Like any use of Rust, Rust for Linux has core. You could technically implement Rust without core, or at least without most of it, but that's not really the Rust language; you've instead made your own weird fork.
[T]::sort_unstable() is a core function which sorts a slice of some Ordered type T but may re-arrange elements despite them comparing equal hence the word "unstable".
alloc depends on an allocator. You may not have an allocator, e.g. you're a tiny embedded controller, in which case you likely don't want and can't use this. Rust for Linux re-implements alloc, basically cloning the "official" alloc and fiddling with it.
Vec::try_reserve() is a feature found in alloc, it tries to allocate enough space to ensure your Vec has a certain amount of capacity beyond its current size, and if not reports it could not.
std further depends on an Operating System, it offers exciting features like knowing what the time is, reading a file, connecting to a remote service over TCP/IP, or making a thread. Rust for Linux does not provide std.
File::create() is a std function which creates files.
The function you were interested in is part of core (although you were looking at its re-export from std) and so yes, it exists in Rust for Linux.
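A small sketch of how the three layers show up in ordinary code. In a normal program everything is re-exported through std, but the same items can be named by the crate they actually live in. The helper function name is invented for the example.

```rust
extern crate alloc;

use alloc::vec::Vec; // alloc: needs an allocator
use core::cmp::Reverse; // core: available everywhere, even without an allocator
use std::time::Instant; // std: needs an operating system

// Sorting is implemented on slices in core; only the Vec itself
// comes from alloc.
fn sorted_descending(mut values: Vec<u32>) -> Vec<u32> {
    values.sort_unstable_by_key(|&v| Reverse(v));
    values
}

fn main() {
    // Asking what time it is requires an OS, so Instant lives in std.
    let started = Instant::now();

    let values = Vec::from([2, 9, 4]);
    println!("{:?}", sorted_descending(values));
    println!("took {:?}", started.elapsed());
}
```

A `#![no_std]` project keeps the `core::` import, may keep the `alloc::` one if it supplies an allocator, and loses the `std::` one.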
> If their pointer subtraction uses similar semantics, there might be issues if they want to compare pointers from different allocation objects, are worried about 32-bit or 16-bit platforms, or maybe even consider the performance concerns too worrisome.
Any of these types of issue would equally invalidate using unsafe{} to cast the pointer to a reference, which is what non_null! does.
My wishlist would include gradually refactoring core in Rust and formal verification à la seL4 to prove correctness. There's no point in churning low-entropy core code from one language religion to another without improvements in assurance that it's also provably bug-free, race-free, and secure while being as fast or possibly faster.
So if we only count code and not comments, it is only 9489 LoC of Rust, which would be about 0.03%. If we take all lines and not only LoC, it would be around 0.05%.
At least according to GitHub's language breakdown for https://github.com/Rust-for-Linux/linux, C is still 98.3% of the repository, and Rust is in the 0.1% of "others".
LFS does not need to do all of that. LFS already uses the host computer's C compiler, so it seems just as reasonable to also use the host computer's Rust compiler.
Doesn't part of the Rust compilation chain end up using `cc` anyway eventually for like, linking or something?
That might not apply at a "system's" level but I'm guessing in the massive Linux compilation job with module support you're making a bunch of object files with exported symbols?
Yes, the GHC story is also terrible (and just a tad worse than the Rust story). The GHC problem is worse largely because the origins of GHC are murky. While it's still possible to get copies of the early GHC versions through the Internet Archive, the code lives firmly in the 1990s and assumes that you have access to long lost Haskell compilers. Turns out that all these Haskell compilers (with the exception of Hugs) have the same kind of problems that GHC has --- only worse because they are even older, depend on binaries of unreleased previous versions, and are really difficult to build with tools from the last two decades.