Note that one of these steps, replacing jemalloc with the system allocator, is quite likely to become the default behavior sometime after the global allocator API stabilizes (which should happen this year). It's a bit of a shame, because jemalloc really is faster than the system allocator, often by a significant margin, but in practice jemalloc is a very large burden to package in-tree, it contributes significantly to binary bloat as seen here, and tools like valgrind don't play nicely with anything other than the system allocator. Of course, we won't make the switch until we have an easy way to get jemalloc back via crates.io, for our users who really want jemalloc's performance benefits.
I'm under the impression that what also makes jemalloc fast is that it has a richer API than glibc malloc (in a way that isn't drop-in compatible, so libc-compatible malloc can't ever provide it), thereby giving it more info that it can use for optimization (but also stymying tools like valgrind).
Hey all, I was able to attend the Rust All Hands this year, and this is a write up of a cool project I worked on there. Let me know if you have any questions!
The article mentions the flash storage size as 8 MB. What was the RAM size of the target?
UPX can be a bit of a mixed bag. It doesn't necessarily cause load delays: if decompression is faster than loading the data from storage, it can actually speed things up. You can't mmap a packed binary though, which could be an issue depending on RAM. And if the underlying filesystem is already compressed, UPX might be providing on-paper-only gains.
It would be interesting to see breakdowns of the size of the text and data section at each stage of the size optimization.
I find the number for xargo, without stripping or UPX, to be by far the most interesting. It gets you almost all the way there (1/6th of the default release size) while not breaking anything.
Stripping removes useful things from your binary, and UPX tends to be a no-go for many things.
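For reference, the non-xargo size levers usually stacked under this are plain Cargo profile settings; something like the following is typical (a sketch of common size-focused settings, not taken from the article's exact project):

```toml
[profile.release]
opt-level = "z"   # optimize aggressively for size rather than speed
lto = true        # cross-crate inlining and dead-code elimination
panic = "abort"   # drop the unwinding machinery
```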
We've been talking about "unforking xargo" and treating libstd etc. more like any other crate, which would mean you could do this more easily. It's gonna take some time though.
Details are still up in the air. All we know for sure (decided at the Rust All-Hands last week) is that we definitely want to extend cargo in a way that obviates xargo (a plan which xargo's maintainer has enthusiastically endorsed).
This comparison is impossible without knowing the relative sizes of Jester vs. Rocket, and how much space each takes up of its respective binary.[1] As I mentioned below, Rocket isn't really size-optimized yet; maybe it would be significantly smaller if it were. Maybe Jester is just far more lean-and-mean than Rocket is.
You can't draw whole conclusions between two languages with just one tiny command; even the OP, comparing a language to itself, ran a whole lot more in-depth analysis to figure out what costs were being paid where.
I agree to a certain extent, but it's still a useful look at how Nim's leading web framework compares to Rust's leading web framework.
I'm not saying we should draw conclusions based on just this comparison. But it is an additional nugget of information and an invitation for others to also give Nim a try when targeting embedded systems :)
> it's still a useful look at how Nim's leading web framework compares to Rust's leading web framework.
I'm not sure Rocket is Rust's leading framework. It certainly looks nicer than some, but it still requires Rust's nightly build, IIRC. I'm sure that means a lot of people opt for something like actix-web, or just fall back on hyper. My impression is that Rust's users are spread pretty widely across the frameworks that exist, so it may not be that you picked the wrong framework, but that there isn't really a "leading" framework yet.
Someone might jump in with better info on the usage rates than me and correct me though, since this is mostly observational on my part. :)
I'd say that would be splitting hairs. But arewewebyet has download badges for the frameworks. If we go by those numbers, Iron has a significant lead, with Rocket coming second.
Also, just for fun, I did try out Iron, Actix-web, and Gotham, and they all put out pretty similar figures (Iron and Gotham a bit smaller, actix-web a bit bigger). I do feel like Rocket is a good representative example here.
That being said, I wouldn't be surprised if Rocket were a bit more feature-packed than Jester.
Fair enough. I just know that the main complaint I see against Rocket here is that it requires nightly (and supposedly enough nightly features that it may be a while before it doesn't). As something that requires a developmental build of the compiler, it didn't seem like the best representative sample to me, but if it's about par with other frameworks, then it works well enough.
You're right, it wasn't really fair on that specific comment. I just noticed this trend over the past few months, and this time I reacted.
Edit: if you look at the commenter's profile, you'll see that he does that really often, on threads talking about Rust, Go or C. I think the comparison to the Rust Evangelism Strikeforce holds.
I think there is a difference between a core developer of Nim (me) trying to raise some awareness, by showing how easy it is to achieve what the article achieves in Nim, vs. a random developer proclaiming "Why haven't you written this in Rust?".
Isn't that what the Rust Evangelism Strikeforce is known for?
It's what people think the "RESF" does, but as is consistently mentioned whenever it is brought up, there seem to be more people replying to any comment about Rust with complaints of "RESF" (even if the original comment is just "trying to raise some awareness", etc.) than people actually proposing rewriting anything/everything in Rust.
In any case, I think being a core developer on something means a much higher bar for accuracy/relevance/general public behaviour when discussing it and its competitors: everything one says about the project is a semi-official representation of it. Certainly the Rust core developers try to be careful about conveying an accurate picture of costs as well as benefits when answering questions or correcting misconceptions[1]. That is certainly something I found weighed on my mind a lot when I was on the Rust teams: anything I said about it could become part of "the Rust project" (as an idea people have) itself.
[1]: One extra point is that, IME, this is often how Rust team members interact with social media: they won't be the first person to bring up Rust in a thread. Of course, it's fair to say that Rust has a bigger mindshare and more people outside the project team members will bring it up/compare to it than Nim, but one possible alternative approach to awareness is more whole threads (i.e. submitted articles) about Nim rather than just comments within threads about other technologies.
(To be clear, this is just a reply to the parent comment, I'm not trying to say the original Nim comment was "NESF"-ish or "RiiN".)
I didn't say it didn't happen. I acknowledge it, and I've even personally written many, many comments calling out/correcting comments that are too enthusiastic in their promotion of Rust.
In any case, for that specific comment, see Steve's analysis in response to your ping:
> But a brand new account with three trolly comments is different than an actual Rust advocate. Just like I'd ignore our local anti-rust trolls, I'd also ignore any pro-Rust trolls.
This gets confused right off the bat between file size and mapped size. The whole "strip" nonsense could have been skipped if the author had looked at the .text/.data/.rodata sizes instead (and .bss too -- often the resource you're trying to conserve is RAM and not storage!).
Likewise the playing with upx isn't really relevant to a rust article per se. That's a generic executable compressor that will work with anything.
But some stuff here was really interesting. I honestly had no idea that rust was statically linking its own heap implementation into every binary it created! Come on, guys, jemalloc isn't even in rust and doesn't benefit from static linkage. If you're going to distribute your own runtime (and if you're using your own heap, you're distributing your own runtime!) at least distribute it in a shared-by-default way.
> Come on, guys, jemalloc isn't even in rust and doesn't benefit from static linkage. If you're going to distribute your own runtime (and if you're using your own heap, you're distributing your own runtime!) at least distribute it in a shared-by-default way.
Our users consistently indicate that they prefer the convenience of being able to copy all-in-one binaries from system to system to the space savings of dynamic linking. Rust used to dynamically link by default, but based on popular demand this was changed prior to 1.0.
Look at what people say they like about Go, for example. Static linking for easy deployment is high on the list.
One of the things that I learned from Google is how much breaks on a machine that has somehow corrupted memory in a dynamically linked library. Every application that you deploy using that library is broken. By contrast if you deploy a statically linked application, it probably works.
If you deploy to enough machines in the cloud, you'll eventually hit this use case.
Large distributed jobs at Google can literally take several times the expected lifetime of an average server to run. When you run such a job, you must expect that it will wind up running on computers of all stages of repair, and some of the computers that it runs on will come out of commission during your job and will have to be replaced by others.
This scale requires defense in depth. Health checks are necessary, but not sufficient. You also need to minimize the problems if the machine you're on is not yet known to be bad. And deal with double checking results because at that scale you can't trust the hardware. And so on.
I don't agree with this. Dynamic linking is perfectly reasonable for OS libraries like kernel32.dll, for instance. Dynamic vs. static linking is a tradeoff like any other.
It was popular demand because the tooling was not mature. Now we want to install programs written in Rust using the system package manager, so we want dynamic linking.
Moreover, when part of the program is static and part is dynamically linked to a C library, it does not matter any more.
For embedded systems, dynamic linking is better when more than one running application is written in Rust, to save memory and processor cache.
Since the parent mentioned installing via a package manager, could it be the default in that case, as well as in the `cargo install` case where cargo essentially controls the binary?
If not default, could it be more easily chosen with a shorter option?
Is a rust heap/runtime packaged for distros like Debian which have packaged binaries of some crates?
Note, I'm not in favor of making dynamic default for `cargo build`.
I originally included the output of `size` for each step, but I thought it made the text output too large. However in this case, the actual size of binary on "disk" is important, as we were trying to fit this binary on a small embedded linux device, so if we had shipped all of that debuginfo, it would have taken up useful space on the device (which has a total storage size of 8MB).
Regarding UPX, yup, not Rust specific at all, but still relevant when talking about producing tiny binaries.
Re: the allocator, steveklabnik wrote a good response, though I would add that just because something isn't written in Rust, doesn't mean it can't still be well tested, performant code. Why rewrite it in Rust if you don't have to? :)
Anyway thanks for reading and thanks for the feedback!
> I would add that just because something isn't written in Rust, doesn't mean it can't still be well tested, performant code.
I know. My point was that unlike rust binaries (which can benefit from significant static optimization) jemalloc is just a blob and doesn't need to be included in every binary.
I think the point here is that if you are building a few binaries for a system that has severe storage constraints, shared libraries might be a way to reduce disk usage.
If the library is stable and installed system-wide, it shouldn't be much harder to copy binaries around, and it makes each of them smaller.
Of course, there is also the dual, busybox way of embedding everything into a single binary. This should lead to even more gains, but moves complexity elsewhere: calling the binaries, making sure there are no symbol conflicts, a more complex compilation chain, etc. Even better gains can be achieved if every tool is developed simultaneously and reuses utility functions from other parts of the project (which is actually what shared libraries do, to a lesser extent).
We are interested in switching the default back to the system allocator, but since swapping in your own isn't stable yet, it would be a regression to not be able to include jemalloc instead. That work is underway, and when it's stable, we're likely to change the default, for exactly this kind of reason.
Of course, which of those end up in the final binary is an open question. It would be kinda cool to see a dependency tree with size annotations, but I suppose that might be a bit tricky to produce.
But I did find the "cargo-bloat"[1] tool, which gives the following output:
Meanwhile, the stripped binary on my system is 979K, so cargo-bloat manages to account for about half of the binary size. Of what is accounted for, I suppose yansi (an ANSI terminal lib) is the most superfluous, followed by toml parsing. Other than those, most of the stuff seems pretty relevant for a web framework.
I'm now a bit curious what the 979 - 502.6 = 476.4 remaining kilobytes are... and totally nerdsniped. Running `size` gets me the following:
So almost 290k in .rodata; that seems suspicious. A quick look at it shows about 17k of HTML, as steveklabnik mentioned, but other than that I can't really identify any major blocks. I suppose this is the end of the dive, unless someone knows some tools to get more insight into .rodata. I suppose in theory it should be possible to track down where in the code each bit of .rodata is accessed from, but that seems like a bit of a stretch.
> I suppose this is the end of the dive, unless someone knows some tools to get more insight into .rodata. I suppose in theory it should be possible to track down where in the code each bit of .rodata is accessed from, but that seems like a bit of a stretch.
My tool Bloaty (https://github.com/google/bloaty) attempts to do exactly this. It even disassembles the binary looking for instructions that reference other sections like .rodata.
It doesn't currently know anything about Rust's name mangling scheme. I'd be happy to add this, though I suppose Rust's mangling is probably written in Rust and Bloaty is written in C++.
Cool. So I ran bloaty (with -d sections,segments,rawsymbols) on tinyrocket and used rustfilt[1] to demangle the symbols, and we have numbers for .rodata:
Lots of unicode and idna stuff there. I think the binary size would be reduced significantly if we could drop those somehow.
Most likely those embedded HTML (error) pages are accounted in "rocket::config::RocketConfig::override_from_env", which sort of makes sense.
> It doesn't currently know anything about Rust's name mangling scheme. I'd be happy to add this, though I suppose Rust's mangling is probably written in Rust and Bloaty is written in C++.
I suppose you don't want to have an (optional) dependency on Rust code? It should be pretty easy to provide a C interface to rustc-demangle[2], which would be usable from Bloaty.
Cool! If the dependency is optional I think this would be great and I'd love to see a PR for it. It could be configured at CMake time.
I think I'd prefer to just make this part of shortsymbols/fullsymbols instead of making a separate "rustsymbols". I assume that Rust symbols won't successfully demangle as C++ (and vice-versa), so we can just try both demanglers and use whatever works. That seems like it will be more graceful for mixed C++/Rust binaries.
I created https://github.com/google/bloaty/issues/110 to track this. I'll try to find time to clean up the patch, but no promises. Let's continue the discussion in the GH issue.
I think if somebody writes enough Go, they will naturally realize that Go is not for everything (I mean, it has limitations).
For example, the GC: I've been struggling to reduce the memory consumption of my tiny TCP keep-alive server, knowing that if I made it in C, it would only use a few hundred KB tops. But in Go, it's 10MB just for holding goroutines and replying to pings.
So I reimplemented it in Rust, and voilà! 500KB RES. Now I can put it on my router :D
It does compared to the all-inclusive .NET Core! A 30 MB unzipped Hello World for Linux or Windows targets! But boy, that Hello World has some huge shoes to grow into...
I know it is about Rust, but I just don't see the point. If you want small binaries for whatever reason, Rust is just wrong; take C (or asm). The closer you go to processor instructions, the smaller the binary is, and Rust is completely the wrong language for that.
Actually I think Go has beaten Rust at everything relevant... (I am a C/C++ fan so I have no preference for either. Rust has become Hacker News click bait.)
The Rust team doesn't share this opinion, FWIW. We care about binary size and being able to do things like this to make your code as small as possible.
> ...take C (or asm). The closer you go to processor instructions...
Modern C isn't particularly close to processor instructions. C from 20-40 years ago was reasonably close to that benchmark -- or, more precisely, C compilers from 20-40 years ago were.
My experience with using C was that, while it isn't necessarily closer to processor instructions, the instructions it does generate are far more predictable than the ones generated by Rust. And when it does surprise you, it's usually either a pleasant occurrence, or a compiler defect.
This is a combination of many things, the big ones being:
1) My temporal experience with C dwarfs my temporal experience with Rust by at least an order of magnitude
2) The Rust ABI is [still - last I checked] opaque.
3) Every operating system environment I have interacted with has been deeply C-oriented, so the platform utilities for exploring the C to ELF / PE transition are much more accessible.
I would love to be able to have the same familiarity, predictability, and indigenous feeling with Rust that I have with C, but each time I attempt to interact with the ecosystem, I come away feeling alienated and like I've just interacted with a 'the compiler knows best' cult.
- "the instructions it does generate are far more predictable than the ones generated by Rust."
I am wondering how that might work? Isn't this extremely dependent on the compiler and flags? To be honest, I would assume that a modern compiler with many optimizations would generate a very different list of instructions than some simple compiler. C has a large list of available compilers.
- "I would love to be able to have the same familiarity, predictability, and indigenous feeling with Rust that I have with C,"
To be honest, all of that seems to be mostly time and experience. Surely you won't get the same familiarity with Rust as with C in a fraction of the time. It just sounds like you feel comfortable enough with C and see no reason to change, which is good for you in a way :)
The memory layout of a Thread object is highly predictable: +0 state, +4 threadID, +8 memoryMap, +(12/16) lockTable
I can predict the symbol name for a function:
Thread *
createThread(void *stackArea)
I know that stackArea is going to be in RDI, and the resulting Thread * is going to be in RAX (or their well-specified equivalents on another platform)
switch (thread->state) {
case UNINITIALIZED:
case STARTING:
    error("Not ready yet");
    break;
case STARTED:
    sleep(thread);
    break;
case SLEEPING:
    error("Already asleep");
    break;
case STOPPING:
case TERMINATED:
    error("Thread is terminating");
    break;
default:
    critical_error("The world is on fire");
    break;
}
Given a compiled set of instructions, I can link the compiled instructions corresponding to the switch statement (like a jump table) back to the C code.
I've probably screwed that up, and I don't claim that it's doing anything useful here, but assuming it was doing something useful, I would be able to identify the instruction associated with it, figure out if there was some deficiency present (like say, a conditional) and have a decent idea of how to fix it.
Rust, on the other hand, does a bunch of things that are (so far) opaque to me.
One example is passing a borrowed slice of something, and somehow, not just a pointer to the thing itself is getting passed, but also seemingly, potentially one or more boundary indices, as well as a lifetime (maybe?)
Another example is a match on the type of an object, clearly [on second thought, perhaps not] there's some run-time state being kept around along with the explicitly declared fields inside that object, what does that object look like in a memory dump?
As a final example: generics. How and when do they get expanded into discrete functions? What do the symbols get named? How do I know if a generic expansion is likely going to be overly prolific? Inlining and templates can do wondrous things for performance, but they can also drown it.
Anyway, hopefully that wasn't too overwhelming, or ranty. You asked nicely twice; I felt compelled to give you a real answer.
The only problem is that, for some, Rust is a silver bullet that is going to bridge a lack of systems-dev experience by learning another language. Sorry, it does not work this way. C is just fine; we don't need Rust, and it could probably be implemented as another preprocessor layer on C. And quite frankly, I am failing to see what benefits it provides once I throw C++ into the equation.
Rust is a systems language for people who prefer strongly typed languages. It has some relatively unique features that make learning and using the language a little quirky, but for many who end up liking it, C and C++ often no longer seem "just fine." Rust ends up being a genuine improvement for a lot of people.
Also, unless you mean "write an essentially new language that happens to compile to C rather than machine code" -- you're delusional if you think you can recreate some of the nicer features of Rust as a new C preprocessor.
And then try the same with Rust. Open your favorite decompiler and decompile both. C has shown again and again that less is more, and it is going to stay that way.
(Btw, it is really sad that you get downvoted based on your opinion.)
Rust has become some PR stunt here, the new hipster topic, but it has a long way to go to reach the proficiency a developer is able to pull out of C. But knowledge of programming is not enough; the power of C is due to the knowledge the developer has, and no tool is going to solve that (or, heh... garbage collector, pointing a finger in Java's direction ;) )
> it is really sad that you get downvoted based on your opinion
That's not really what is happening. I'm guessing you are getting downvoted for a few specific things:
* "If you want [...] small binaries, rust is just wrong"
* "[Rust is not close to processor instructions]"
* "I think go has beaten rust at anything relevant"
* "Rust has become hacker news click bait"
The general tone of your comment is a bit inflammatory, you make some claims with no reasoning, and you aren't contributing much to the discussion (even if some of these are just "opinions", not all opinions are relevant or useful in the context of any given discussion).
I'm also a little surprised that you insult the folks who are reading this thread (folks probably interested in Rust, since it is a Rust thread) and then wonder why they aren't interested in seeing your comment.
My opinion is right on the spot. The amount of posting about Rust here is disproportionate to the amount of posts every other language gets, while it seriously doesn't offer something revolutionary to developers. All its fans have C++ available to do the same things or more, AFTER they have figured out their limits. The C/C++ problem is not type safety but rather that it allows people to do anything. But this is also a huge bonus.
(Btw, I don't care about downvoting, but it was somehow strange to see it)
You are probably also getting downvotes because, as this comment of yours demonstrates, you don't know enough about Rust to hold an opinion worth discussing.
Do you not see how incredibly Blubby the view you're espousing here sounds? You seem surprised by the downvotes, but what you're saying seems to boil down to "Blub is fine." (Not a Rust developer, BTW, just somebody who already knows C and thinks it sounds interesting.)
> But please, I am more then happy if you show me, I am wrong, what is so cool in Rust, that you can't pull with c++?
Anything that is possible in one Turing-complete language is possible in another Turing-complete language, so this question isn't framed in a way that's readily answerable.
A better question is what the language actively enables and what it prevents rather than just what you can and can't do in it. Rust makes guarantees that C++ doesn't. Unexpected nulls are possible by default in C++; they are impossible by default in Rust. Use-after-free bugs are possible by default in C++; they are impossible by default in Rust. Buffer overflows are possible by default in C++; they are impossible by default in Rust. Data races are possible by default in C++; they are impossible by default in Rust.
Is it possible to write C++ code that doesn't have these problems? Sure, lots of examples exist. But C++ leaves the door open for these things in a way that Rust doesn't. You have to actively avoid them.
> And if you can't give a reasonable answer, why inventing a new language? Rather write it as a new conding standard. Or create new stl with your new paradigm.
A coding standard is just a set of suggestions, and lacks the guarantees and sane defaults a language can offer. A new STL is a just a library. Basically, you're comparing compiler guarantees with human effort. Surely you can see why guarantees have value.
What prevents us from coding like this in C++? As far as I can tell, human fallibility. C++ has been around for decades and we still keep running into the same bugs over and over. That is my point — "What can I do in this language?" is the wrong question. Anything that is possible in C++ is also possible in C, or even in assembly. But you can see why you'd use C++ over those.
That's the Blub paradox — you can see how Blub's facilities are an improvement over less powerful languages even though they can write functionally equivalent programs, but when you look at more powerful languages, you find yourself thinking, "Why couldn't you just write that in Blub?"
(Also, I'm not sure what #define NULL CNull does, but I'm pretty sure it doesn't enable static checking of pointer nullability.)
> (again downvoted by 4 :D :D What a coincidence :D :D guys, you are seriously having a problem :D :D :D)
Minus four is the minimum points; it doesn't indicate exactly how many downvotes a comment has. My guess is it's to discourage people from attempting to get the "most negative" number of points for a post.
Ergonomic sum types, ergonomic pattern matching, hygienic macros, the ability to track the lifetime of every object, generics that have nicer (though they can still get messy) error messages than templates, immutability-by-default, an insanely good package manager, built-in unit and integration testing, iterators that aren't terrible, easier cross-compilation, a smaller standard library surface so everyone on your team isn't using different corners of the language, and a wonderful zero-cost implementation of linear types are just what I can think of off the top of my head. Oh, and you have complete memory safety if you don't use unsafe, and much stronger guarantees of memory safety than C++ even if you do use unsafe. Adding all this to C++ is just adding noise to an immensely bloated system, and it wouldn't be half as good as a new language, for all the simplicity and compat-breaking reasons mentioned above.
Edit: And move and copy semantics that are unarguably better than those of C++.
> Naaah, I just don't tend to be politically correct
Which combined with your actual statements sounds to me like you're saying "I don't tend to care if what I'm saying strongly is wrong because I'm comfortable espousing my opinions as overly broad assertions of fact regardless of accuracy or counterarguments." That is, to put it bluntly, just a nice way of saying "I'm an asshole, and I'm okay with that." It's not a particularly useful way to interact with people if you expect to have a productive conversation. If you aren't hoping for a productive conversation, it makes a lot more sense though.
> As I said, the amount of Rust posts here is disaproppriate to the amount of languages that are available
Which implies all languages should get equal representation. Why would that be the case?
> Somehow it reminds me to the Cambridge Analytica case, a small amount of people (bots) manipulating the public opinion. :)
Or perhaps your opinion doesn't reflect the public norm much, or at least the HN norm, or even just the selection bias you see from the exclusion of people that immediately decided an article about Rust wasn't interesting to them, so they didn't visit. Good allusion to current scary topical event though.
> But please, I am more then happy if you show me, I am wrong, what is so cool in Rust, that you can't pull with c++?
Oh, now people are supposed to take on faith you have an open mind about this, and as long as you're shown something as you consider objectively "cool", you're happy to change your opinion? I'm not sure I'm convinced.
> And if you can't give a reasonable answer, why inventing a new language? Rather write it as a new conding standard. Or create new stl with your new paradigm. Or completely reinvent the types using a new library/headers. Why bother wasting time and reinventing what already works (profit? ;) ), and it works great (you can use it as an former VBA developer or as a guru, same language, different skill set, different features used).
All covered many times before on HN. Here's the simple answer: Every one of those changes means you have to learn a mostly new language while also being mostly incompatible with existing hardware, so why not just go farther and get some real guarantees out of the changes instead of just slightly better usability? Guess where that ends up...
Err, I meant incompatible with existing libraries (at least to the degree your interop is the same as any other language that needs interop and supports similar types, like Rust). Not sure where I pulled hardware from...