Why is a Rust executable large? (2016) (lifthrasiir.github.io)
195 points by new4thaccount on May 16, 2019 | 128 comments



Wow, I wrote this 3 years ago and I feel nostalgic. Nowadays I consider it obsolete, as the official FAQ has the exact same question and many things (especially jemalloc) have changed in the Rust world. I personally found the HN discussion at the time [1] greatly fruitful.

[1] https://news.ycombinator.com/item?id=11823949


Unfortunately that link to the Rust FAQ is broken.


They removed the FAQ as it didn't fit the newer site.

You can find it at the old site version: https://prev.rust-lang.org/en-US/faq.html


I'm confused as to why they redesigned the website but removed prior functionality.

A few weeks ago people pointed out that the internationalization also suffered due to this. Edit: by suffered, I mean "was removed entirely". I just checked and couldn't find a way to change the locale, nor could I manually set it e.g. by appending /de-DE, etc.

If the new design requires a link to the old design to access removed functionality, is it really ready to launch?

---

edit: these were the languages offered: Deutsch, English, Español, Français, Bahasa Indonesia, Italiano, 日本語, 한국어, Polski, Português, Русский, Svenska, Tiếng việt, 简体中文

Now if I want to refer to someone about Rust in Korean I have to link to prev.rust-lang.org, which is weird.


It's the "redesign culture" in a nutshell. Change things (expending effort) for no discernible reason, break familiarity, arrive at subjectively worse appearance, and at objectively worse functionality. It's hair-pullingly frustrating.


It's a cargo cult. For any technology, there is a period of time when it improves more or less monotonically, and so most new things are cool. Some people over-extrapolate this and conclude that new things are always cool, and so is born the cult-of-the-latest-thing, which persists long after actual improvements stop.


Holy shit the new site is awful on desktop. The font size is huge on a desktop monitor so you have to scroll a lot to see anything. (hint: what do programmers usually access the web with).


The site was redesigned using Tachyons CSS and it seems it's probably not finalized yet.

https://tachyons.io


That's a very dumb reason to ditch i18n. Waiting until it was finalized wasn't an option?


As far as I can make out, the new website was considered a key part of the "Rust 2018" project, and so the maintainers chose to launch it before it was really finished (and without the planned period for user feedback), at the same time as the Rust 2018 release.

Evidently it wasn't quite such a key part of the project that they could consider delaying the new Rust edition until the website was ready.

In hindsight tying the website update to the new edition was quite possibly a mistake. The Rust core team has asked people to hold off on discussing the process or proposing major changes until they've published a retrospective on what happened (that was five months ago, and the retrospective is now "mostly finished").


Thank you for explaining it. I now finally understand why it was deployed with removed functionality.

That said, that is a completely idiotic reason to remove something like i18n. Unbelievable.

I wish I could say I don't understand how this got approved by multiple people, but the sad reality is that nobody makes internationalization a priority. It's completely normal for the other languages I use not to be supported, but I guess it stings to actively have it removed.


Instead of yelling at people who weren't involved with the decisions and are only providing context, maybe go write a calm series of bug tickets explaining your concern, or better, offer to help with i18n support and follow up on it?


So raising a point on a random internet post is now considered "yelling"? Interesting. Also, I called "that reason" dumb. Perhaps you're personally involved with the project, but please try to read things as they are instead of taking offense unnecessarily.

Counterargument: How about one doesn't remove stuff like i18n in updates? Or what about simply delaying the deployment until the i18n and other features are done? Or is removing support for other languages considered acceptable if the new, English only, website looks very pretty? [tone is of light sarcasm]

Counterargument #2: just because something is open source doesn't mean you can go "well, just open a ticket or do it yourself". No, how about people in general update things properly (or at the very least, don't remove translations)? I'm perfectly allowed to criticize removing important functionality like i18n. It was there, now it isn't, and I didn't make the choice to remove it. It's not my job to do other people's jobs for them just because I exercised my ability to criticize something. Otherwise I would be working on GUIs/UX for open source linux projects until I die. [tone is of exasperation in open source software development]


You might have misunderstood the sentence's tone.


Does the new design require a link to the old design? In your opinion, sure, but they don't think so, do they?

Which is to say, if you own a site and decide to cut a feature, isn't that okay? It may or may not be a feature users want... but that's another story entirely, no?


The new site has a link to the old one at the bottom. My opinion isn't factoring into it.

However, losing i18n is pretty bad. These are the old languages that were offered:

> Deutsch, English, Español, Français, Bahasa Indonesia, Italiano, 日本語, 한국어, Polski, Português, Русский, Svenska, Tiếng việt, 简体中文

However, as someone who regularly studies and works with a non-English language I can tell you i18n is pretty darn important. More important than a very pretty site that people can't read.

Per my own opinion, a FAQ is also a great idea. Especially for a new language that's trying to gain more adoption. Entirely removing it instead of reworking it into the new design before deploying mystifies me.


Nobody was happy with losing i18n, and it's on its way back, there were just some delays.


That's not really the truth though, is it? If nobody was happy, then the new site wouldn't have been deployed. Or by "nobody was happy", did you mean "nobody really cared, because English is their primary language, thus they aren't affected at all"?

The new website is very pretty. I don't understand why the deployment could not have waited until the translations were completed. Even just translating 4-5 widely used languages would've sufficed, although still not ideal compared to the old website.

Even if people wanted to visit the old website, it's buried at the bottom of the new one, practically invisible.


> If nobody was happy, then the new site wouldn't have been deployed.

This is an incredibly simplistic view of how projects work. On any large project, there are a number of objectives, with different priorities. And there are a number of factors, some public, some private, as to why projects end up the way that they do.

I am the person who implemented the original i18n support. It took me a year of effort to get it shipped. I do care about this. That's not incompatible with what's occurred.

> "nobody really cared, because English is their primary language, thus they aren't affected at all"?

Even if English is a primary language, that doesn't mean we aren't affected. For example, not shipping it means that I have to be embarrassed and apologize when people on the internet point out this shortcoming. Not shipping it can limit growth, as you point out. There are tons of ways.

> I don't understand why the deployment could not have waited until the translations were completed.

The original way of doing i18n was completely untenable. Doing it a better way takes time and effort. That's before the translations are actually made. That work has been ongoing since December of last year. It's getting pretty close now, with a lot of movement recently.


> but few people who are actually willing to step up and do work. That's the limiting factor on getting stuff done.

That's why waiting is also an option. Feel free to excuse it however, but removing i18n here was a pretty big mistake on the rust team's part. For what? Rust 2018? I didn't contribute to the website, but I also didn't force anyone to change it either.

> There are a lot of people willing to get extremely mad about the website online

I'm not mad, I'm just disappointed. It's rare to see actual i18n for open source projects, and it's absurd to see it get removed arbitrarily when it does exist. I believe the Rust team is capable of delivering much better quality updates, hence my writing here. I'm also sick and tired of i18n being considered an afterthought by people who do have the resources to accomplish it to at least a basic degree.

Also, just because something is open source doesn't mean "well why don't you just go do it then" is a valid argument. It isn't.


> I'm not mad, I'm just disappointed.

Regardless of you, it's true in general. There has been a lot of heat, and it's damaged the ability to actually improve the site.

> Also, just because something is open source doesn't mean "well why don't you just go do it then" is a valid argument. It isn't.

It's not an argument. It's a description of reality.


Nobody liked the new site and there were a lot of complaints about the design. Why was it deployed before feature parity?


[flagged]


Would you please not argue in the flamewar style on HN? It's not what this site is for, and it damages the curiosity that it is for.

https://news.ycombinator.com/newsguidelines.html


> So aside from you, nobody on the team cared about i18n enough to actually get it implemented before deployment–it wasn't a priority. Got it.

It's not a matter of care; it's a matter of time and stuff to do. There are a lot of people willing to get extremely mad about the website online, but few people who are actually willing to step up and do work. That's the limiting factor on getting stuff done.


> That's a lie. If nobody was happy, then the new site wouldn't have been deployed.

You're losing your cool just because you have different ideas of the perfect trade-offs to make while having zero, and I mean zero, skin in the game.

It's not as impressive nor noble of a position as you seem to think it is.


Criticizing a software update is "losing my cool"? First I was "yelling" and now this...

My "different idea" is keeping the ability of non-English speakers to use the website and learn about Rust. I (and most people, I think) would not prefer a prettier site to one that has support for 10+ languages. Do you not agree?

My "position" is one of someone that heavily uses a language other than English. I very much enjoyed the old Rust site, because it meant I could actually show it to people around me, because their native language was supported. This isn't a tradeoff so much as a dealbreaker. What tradeoff was there for pushing the new website deployment too early?

I think it's perfectly reasonable to criticize an update that made the website prettier but removed i18n for no reason. Why? Was there a fiscal incentive to immediately replace the website before any translations could be done?

So instead of making personal attacks for no reason, how about we criticize bad, unnecessary software updates together?


> Criticizing a software update is "losing my cool"?

Making an assertion that someone is telling a lie comes across as that, if you're given the benefit of the doubt and assumed not to be someone who calls people out as liars with little evidence unless you get carried away.

But perhaps calling people liars is a normal mode of communication for you, and you didn't lose your cool at all. That's definitely one way to interpret the comments so far, given how you equate "criticizing" with how you've presented your position so far.

I think you had good points initially. I also think that when presented with facts about this specific situation, you've allowed your argument to devolve into a stubborn stand based on technicalities while belittling others.

> So instead of making personal attacks for no reason, how about we criticize bad, unnecessary software updates together?

Indeed.


I simply think we need to be honest about the language that we use. The truth is, instead of "nobody was happy", "nobody made i18n a priority" is far more accurate. Calling it a "lie" is perhaps too strong, but that's essentially what it is, intentional or not.

If nobody was happy about it, why was it deployed? That brings us to even more questions. Nothing presented thus far portrayed the decision to update in a better light, in fact it's only made my opinion of it worse.

I am used to websites having little to no i18n, but it's a complete mistake to actively remove existing i18n. For what? Rust 2018? A better-looking landing page? Honestly, now I'm just skeptical about the organizational structures that led to this decision being approved.


> If nobody was happy about it, why was it deployed?

I think you've confused "nobody was happy" with "nobody thought it needed to be done". The world is rife with actions that nobody is happy with, but that most agree are the best of the bad options available. Just because you can't imagine a situation in which it was better for them to do what they did than the alternative doesn't mean that's the case. There's already been an admission that the situation was bad, that the result was a mistake. Whether that mistake was made at the point of choosing to drop internationalization, or at a point possibly many months earlier which hemmed them in and left no good choice beyond "don't update and deal with the major problems that causes, or update and deal with the major problems the lack of internationalization causes", is unknown. Assuming one over the other and calling people liars based on your assumption isn't very responsible.

> Calling it a "lie" is perhaps too strong, but that's essentially what it is, intentional or not.

I don't think "perhaps" is even in question. You didn't have the information to make that assertion definitively, as there are plenty of very possible and likely scenarios where it's not a lie. To do so against a person who does know the specifics, as they were involved, and without first digging deeper into those specifics, is something I view as irresponsible.


I don't know what wasn't fucked up by the redesign, so I don't have such confusion. I just assume it was done to make the site intentionally less useful for the glory of satan, of course, so it seems fitting.


Who cares about i18n? Both the language itself and all* libraries are in English anyway. You won't get far relying only on translated material.

(English is not my native language. I started out learning from Swedish resources, and generally consider that to have been a big mistake.)


The code itself and libraries being English is not a new concept.

Materials and knowledge about the language leads to more people potentially discovering it. I can far more easily persuade someone to use Rust, if I share the website with them in their native language.

Consider the reality that not every nation is advanced in English like Sweden or other European nations.

Anyway, the old site did have i18n, meaning that they did care about it. Not only did they care about i18n, they also had more niche languages, which is quite rare for an open source project. The alternative is basically saying "just learn English in order to read about Rust and why you may or may not want to use it". I find this unacceptable.

So it's very puzzling to me that the new website entirely ditched i18n–it's incongruous with what I perceive the Rust team is capable of.


> Materials and knowledge about the language leads to more people potentially discovering it. I can far more easily persuade someone to use Rust, if I share the website with them in their native language.

That works for more 'end-user'-facing projects (GIMP, Mastodon, ...), but not so much for programming languages. As someone who speaks 3 languages (including English), let me assure you that (most) non-English programming materials use a very tortured vocabulary in the target language, which actually makes it considerably harder to learn.

Also, before I learned English, I always found it extremely disappointing when the main website of a project was in my language, but then you'd click on any important link (like a guide) and it was English only. It was a bigger letdown than if a site was English only from the get-go and I knew every link was going to be English. I actually learned English because of this.

As for general info about a new language etc. there are usually dedicated "IT/programmer" community portals with news and some basic tutorials for the new hotness in the target language, which is usually how non-English speakers learn about new tech, not really from the project website, at least in my experience.


I miss the old site :( It's so simple and beautiful..


Design choices shouldn't be a reason to not include the FAQ


Design had nothing to do with not including the FAQ.

The FAQ was often out of date, hard to maintain. It was super long. It's not clear how often it was actually used.


Was there a reason why the runnable code example (plus a link to more examples "Rust by Example") that was front & center of the old homepage was removed?

I found it difficult to find any code examples on the new site.


There's been a years-long discussion about how to get a good code sample, and we've never found anything satisfactory. Short examples don't show anything interesting, and interesting examples are too long. Nobody was happy with the old code sample, and nobody was happy with all of the suggested alternatives.


Every time a new language is posted on HN I spend ~15 seconds looking for code and, if I cannot find it, I close the tab.

I can't imagine that isn't the case for many others.

I think Go nails this with the multiple code samples.

edit: In retrospect, I'm sure this is beating a dead horse and you're all well aware of this. I wonder, instead, what the approach is to fixing these problems when they reach a standstill? It sounds like no one could agree, so progress halted.


Everyone's experiences are different. Many languages don't even have a centralized web presence, and many don't have code snippets on the home page. Just as you'd absolutely expect it, I would equally never expect it.

You can see some code in less than fifteen seconds by clicking the big yellow "get started" button.


I think your parent commenter gave you very valuable marketing feedback and you should take it seriously, not deflect it with the diplomatic yet hollow "everyone's experiences are different".

Did you agree with their feedback? Disagree? State it plainly, no one is going to sue you for it. :)

Many people who frequent HN are quite busy professionals. I too quickly close web pages that show no code, no motivation for creating the language, and no brief install instructions. To me it means that the creators can't present well, and that leads me to assume that their programming language is bad at expressing intent as well (and thus verbose or confusing). Is this a wrong assumption? Very likely, but the first 10-30 seconds of perceiving something new are not rational. That's a widely accepted psychological premise in marketing.

You might disagree and that's okay. Marketing however isn't at all about what you like but what your average site visitor likes.

So, finally, what was wrong with the code samples? They might have been too simple to be useful but they still did send the right signal to your busy programmer visitor -- IMO.


Feedback is valuable, and I did take it seriously. It's not new feedback.

> Marketing however isn't at all about what you like but what your average site visitor likes.

I agree 100%. But anecdotes, mine, yours or anyone else's, aren't data.


Ohh that’s where it is. That page with install instructions and examples is perfect.

I wasn't looking to "Get Started" but for why I might want to use it. Install instructions plus a basic example is all 90% of people want from a programming-language website.


You are coming from the perspective of somebody with deep knowledge of the language. For somebody who has never seen it, a short code sample says a lot.

I know this because the first Rust code I ever saw was that short sample. But then, I completely understand if you want to focus the site on people with Rust experience.


I may have experience, but it wasn't me who decided this. The decision also came from talking to a lot of new people and seeing their reactions to various code samples.

> I completely understand if you desire to focus the site on people with experience on Rust.

It's not, actually! It's the opposite. However, the audience is not just pure programmers either, it's also people like CTOs, etc.


Golang's multiple options (via dropdown) always seemed the best to me: https://golang.org/

Nim's line length one is okay https://nim-lang.org/

Although I'm sure you guys have debated it to death already, I still find it unusual to not be able to find any example of actual code within a few clicks. Committee-driven design is always difficult.


Go's page is for programmers; Rust's new page is for non-programmers. I think it's a mistake that you have to click "Get Started", then read through installation instructions, to finally see what Rust code looks like. It's all backwards. A short snippet (or two) should be front and center.

Crystal lang is another very good example: https://crystal-lang.org/.


Something similar to how Crystal approaches it could perhaps work, giving a whole bunch of different small examples (including compilation errors) with corresponding entrances to documentation.


Their new website is more marketing oriented. They tried to look for a "good" example and didn't find it. So basically they saw that the language looked ugly and didn't want to show it, which is in my opinion very dishonest. I remember voicing my dissatisfaction on this change. A snippet code should show what the language looks like, not mislead the reader.


Seems like there should be two sites; one for people they want to "market" to, and one for developers.


Nonetheless, it feels like there should still be an FAQ for actual FAQs. It's harder to maintain than other documentation?


We've empirically found it harder. It's completely disconnected from everything else, and covers a wide range of things, and so it's very easy for it to get lost in the shuffle.


So assign the job to people who can focus on just that; community volunteers can often do a better job than the core team on an FAQ, because they are more in touch with what people new to the language are confused about than core developers with deep knowledge of everything. It will be dynamic and questions/answers will become obsolete, so that has to be handled.

(I was the G++ FAQ maintainer back in the 1990s; yes, I'm old).


> So assign the job to people who can focus on just that

This requires having a person who can. We never had one.

> community volunteers can often do a better job than the core team for an FAQ

Absolutely! The original version was written by the community. After such a heroic effort, they couldn't maintain it. I believe it was a lack of time, not desire, but regardless, that's just how it goes.


> It's not clear how often it was actually used.

"Frequently"


Ha!


It is if you are standardizing documentation. I don't expect FAQs on the main website.

https://doc.rust-lang.org/1.0.0/complement-lang-faq.html


Given that a lot of these questions seem relevant for people considering trying Rust for the first time, including some sort of link on the main website might be helpful.


A redirect would be helpful so the old links still work.


We generally tried to have redirects, but we are humans and therefore fallible. Please file bugs!


I would, but then the linked FAQ is outdated (in fact, the 1.20 version[0] indicates it moved to the website), and I don't think having a redirect that points to the prev website would work. According to [1], the presence of an FAQ on the website is an unresolved question.

[0]: https://doc.rust-lang.org/1.20.0/complement-lang-faq.html

[1]: https://github.com/rust-lang/www.rust-lang.org/issues/445


Ah, I thought you could redirect to prev. Sorry about that!


Heh. "Currently, Rust is still pre-1.0" doesn't seem to agree with the URL. ;-)


Just another demonstration of how easy it is to let this kind of thing get out of date.


Your nickname perfectly fits my opinion on the newer site, BTW.


That... Seems bad. Sorry to jab but is optics also why they decided on confusing non-standard async syntax?

I always thought that for open source software, the more terrible the website the better it was. A site too slick may ruin Rust's reputation


People that appreciate Rust's features can and do also appreciate a nice site - they're not mutually exclusive.

There is no such thing as "standard" async syntax unfortunately. The Rust language team is trying to find the optimal blend of ergonomics and consistency when there isn't a clear winner. It feels like pointless bike shedding to many, but a decision like how to handle strings has bifurcated the Python community for years.


This implies the old site wasn't nice. Which is not true. The old site isn't as pretty as the new one for sure, but it wasn't a bad site. It was straightforward.

Personally, I don't like that the new website doesn't even display Rust's syntax (what it looks like) on the homepage anymore.


> The old site isn't as pretty

Wasn't it? It was clear, clean, had nice typography, and didn't cause epileptic seizures when you scrolled it. That's way more than I can say about the new one.


Haha. I've been playing with Rust lately, which is what led me to your blog post. I also added a comment immediately after which linked to the up to date Rust FAQ, but I believe it is buried further down.

Even if it is dated, I found the gist of it very helpful, so thanks!


This is a good write-up from the point of view of a C/C++ programmer, since it gives a fair breakdown of where that extra space is going and rightly points out that statically linked C/C++ executables are going to be large as well. It's also a fun tour of how to access the syscall interface directly from Rust and how to perform optimizations that most C and C++ programmers would never perform, except perhaps in the most space-constrained embedded projects.

I take minor issue with the handwaving away (not just in this article but in others as well) of glibc and the suggestion that it can just be replaced with musl or another libc. There's a reason that glibc doesn't officially support static linking, and that is NSS and PAM.

If all you need is a static executable for your Docker container or whatever that reads user/group information and authenticates out of flat files in /etc, then go for it. But in the "enterprise" space things like LDAP, Active Directory, 2FA etc. are real, and if your application needs to support those, then you're going to need glibc and its dependency on dlopen() and friends.

And this goes for every language which has a dependency on your chosen libc as well (which let's face it, is a large majority at least in the Linux world), if you want to use NSS and PAM modules.
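To make the NSS point concrete, here's a minimal sketch in C (the helper name `uid_of` is mine, not from the article). Any lookup that goes through NSS is exactly what a statically linked glibc can't do reliably, because at runtime glibc still wants to dlopen() the modules configured in /etc/nsswitch.conf:

```c
#include <pwd.h>
#include <sys/types.h>

/* getpwnam() routes through NSS: with dynamic glibc it consults
   /etc/nsswitch.conf and dlopen()s the configured backends
   (files, ldap, sss, ...). Statically linked against glibc, this
   is the kind of call that breaks or silently falls back to flat
   files. */
uid_t uid_of(const char *name) {
    struct passwd *pw = getpwnam(name);
    return pw ? pw->pw_uid : (uid_t)-1;
}
```

Against musl the same code links statically without drama, but then you only get musl's built-in flat-file lookups, not pluggable NSS backends — which is the whole trade-off being discussed.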


OK, first, I'm not defending C or C++, just trying to fix incorrect information:

> ...rightly points out that statically linked C/C++ executables are going to be large as well.

This is false, and comes down to both how a library is structured and how linkers work: if you statically link parts of musl, expect only a tiny size increase.

> There's a reason that glibc doesn't officially support static linking, and that is NSS, and PAM.

That, and the code isn't structured for static linking: there are dependency chains pulling in many symbols.

> ...most C and C++ programmers wouldn't ever perform perhaps outside of the most space constrained embedded projects.

This is because syscalls are OS-specific and, depending on the OS, unstable. Also, Linux has an interesting way of encoding errno into the return value (maybe others do too?), not to mention vDSO 'syscalls'.
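A quick illustration of that return-value convention (a Linux-specific sketch; `raw_write` is my name, not a standard function): the kernel returns -errno directly, and the libc `syscall()` shim translates any value in [-4095, -1] into -1 plus errno.

```c
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke write(2) through the libc syscall() shim. For a bad fd the
   kernel's raw return value is -EBADF; the shim maps it to -1 and
   sets errno, which is what C code normally sees. Linux-only. */
long raw_write(int fd, const void *buf, unsigned long len) {
    return syscall(SYS_write, fd, buf, len);
}
```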

> ...glibc and its dependency on dlopen() and friends.

If $code doesn't work on musl, 99.9% of the time it's due to $code. Also, musl has dlopen and a dynamic linker, otherwise Alpine wouldn't work.


Thank you for the technical clarifications. In the boring, corporate, enterprise software bubble in which I have mostly worked for 25 years, musl/busybox/Alpine are barely a blip on the radar. I have never seen an Alpine Linux install in 25 years, unless it is the basis of various busybox-based appliances and I've not noticed. Certainly never seen it used for running "mission critical" bloated Java enterprise apps etc.

So within my comment I thought it was implicitly clear I was referring to glibc based distributions, for example Redh^H^H IBM.

You CAN statically link binaries against glibc on these distributions, and in many cases it will work. However, NSS and PAM will not work, in my experience. Is it generally a good idea to statically link against glibc? No.

> If $code doesn't work on musl 99.9% of the time its due to $code, also musl has dlopen and a dyn linker otherwise alpine wouldn't work.

Is sssd officially supported on Alpine Linux? Does it just work out of the box? It seems to be in "testing" branch from what I can see with over 600 open bugs.

There is a reason that shareholders of large organizations want to pay a large, established Linux vendor for support, regardless of whether or not their engineers will ever use the support or not. They want to pay for stability/security updates. They are paying for a Linux distribution that 3rd party application providers have certified their application for.

They want LDAP/AD and other pluggable PAM modules to work out of the box without too much tinkering. Alpine Linux may fit those criteria for all I know. Doesn't matter. My employers wouldn't use it whether I wanted to or not.

Not saying that I like how things have turned out for Enterprise Linux necessarily, but it is what it is. Redhat and clones absolutely DOMINATE this particular space, at least in the USA, UK and Australia.


> They want LDAP/AD and other pluggable PAM modules to work out of the box without too much tinkering.

PAM works fine with musl, even though the main implementation of PAM is horrible for security. Also, PAM isn't tied to a libc, so I don't understand why it's mentioned. The glibc implementation of NSS is tied to libc, but there are other implementations of NSS.


> If all you need is a static executable for your Docker container or whatever that reads user/group information and authenticates out of flat files in /etc, then go for it.

If you're using a Docker image, then that's your statically-linked binary, for all intents and purposes. It won't change sizes much if you link libc statically or dynamically (esp. if you're running in an Alpine container with musl).


Substitute Docker with another container technology, or even "IoT device", in my comment if that helps. The point I'm making is not about the size of the binary, or the size or type of container it runs in (if at all), but about the features it can/will support if you rip out glibc and replace it in a fit of rage with musl or similar, which seems to be somewhat of a fashion at the moment.


> There's a reason that glibc doesn't officially support static linking, and that is NSS, and PAM.

Thanks for pointing this out; I'd hit this problem in the past and it's annoying to see people keep promoting static linking as a silver bullet when there is a genuine purpose to dynamically linking the name services.


For those writing embedded firmware: one of the subpoints of a talk I did recently (about a Rust dev kit for the Nintendo 64) was that you want to avoid the standard C/C++ model where most of your code lives in a static library and the executable is built as a second step. Instead, you really want all of your code, including your asm, in Rust source files so it's as painless as possible to generate an LTO'd binary. I had binary sizes go from ~1MB to ~70KB. Rust really depends on LTO if you care about binary size.
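As a rough sketch, the usual size-oriented Cargo knobs for this look like the following (these are standard Cargo release-profile settings, not necessarily the exact ones used for the talk):

```toml
[profile.release]
lto = "fat"          # whole-program LTO across every crate in the graph
opt-level = "s"      # optimize for size rather than speed
codegen-units = 1    # single codegen unit so LTO can see everything
panic = "abort"      # drop the unwinding machinery
```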


Was this talk recorded anywhere? it sounds really interesting.


No, unfortunately. I really just need to take the content and stick it in a blog, but life gets in the way...


So kick life's ass, tell it to back off, and blog that info somewhere!

It would be good info for the rest of us, and it would be great if you could chronicle that stuff.

(Don't tell Life I said to kick its ass, please. It will crush me.)


Hi! I think I was at your talk. In Denver? I asked about emulation.


Hey! Yeah, that was me. Sorry for being so rambley; I hope that you still enjoyed it.

https://github.com/n64dev/cen64 is the emulator that I use


Both executables have grown on my machine since 2016. On Linux, no optimization or stripping

* C Hello World - 19K

* Rust Hello World - 1.6M (!)

The C gain is attributable to a change to ld: since version 2.30 it puts read-only data and executable code in separate segments by default as a security-hardening measure. I don't see why this shouldn't produce three segments (R, RE, RW), but here I get four (R, RE, R, RW). (Anyone know why four?) If I pass -z noseparate-code to disable this, the C binary goes back to the 8K shown in the article. (No effect on the Rust.)

The Rust gain appears to be mostly debug info. Stripped, I get 187K, which is similar to the 121K the article gives after stripping and removing jemalloc (which isn't included by default anymore).

Does anyone know why Rust has grown so much from 2016 to 2019?


Rust on Linux has a bug where it puts libstd's debug info into the executable, even in release mode.

You need to run `strip` on the executable :(


But even the stripped executable has grown by ~50%.


Did you compile with the --release flag? Without it, cargo will make a debug build and put all the debug info in there (which, mainly due to println, is a lot).


  $ cargo --version && cargo init hello && cd hello && cat ./src/main.rs && cargo build --release && du -h ./target/release/hello
  cargo 1.34.0 (6789d8a0a 2019-04-01)
       Created binary (application) package
  fn main() {
      println!("Hello, world!");
  }
     Compiling hello v0.1.0 (/tmp/foo/hello)
      Finished release [optimized] target(s) in 0.31s
  1,6M ./target/release/hello

Edit:

  $ strip target/release/hello && du -h ./target/release/hello
  192K ./target/release/hello


Note that, while it doesn't matter much in this case, with `du` you'll get ~±4 KiB out of nowhere because it measures in filesystem blocks. You need `--apparent-size` for the actual file size, which doesn't exist in the BSD version (i.e. the macOS version).


164K after fat lto, Os, panic=abort and stripping on nightly 2019-05-12


It doesn't make any difference for something as small as hello world (you can try it). A release build might not have debug info in the sense of -C debuginfo=0, but it still has plenty of stuff that can be stripped.

You also get the same size if you bypass the println machinery with io::stdout().write_all(b"Hello, world!").


My C++ 'hello world' program on Borland C++ Builder back on Windows 3.1 was about a megabyte, far larger than an equivalent program today. Executable size is a characteristic of the compiler, not the language. Rust is still a very young language and I'd imagine it's currently focusing on other things and will get round to optimizing output size.


Huh, 271K on macOS using debug (268K using release).

Edit: 1.34.2


The larger size on Linux is because the binary itself contains the debug symbols. On Mac, they're in a separate .dSYM file from what I understand. See https://github.com/rust-lang/rust/issues/46034.

According to that thread, the size jump of the unstripped binary is probably from libstd being compiled with -C debuginfo=1 now.


Ah thanks, that makes sense!


The real answer is that they're not. It's just that C has a head start of having a 10MB libc already installed on your system, and Rust doesn't.

If you compile C with libc statically linked in, it'll make executables as large as Rust's. If you compile Rust for dynamic linking, it'll give you hello world as small as you get from C.


"If you compile C with libc statically linked in, it'll make executables as large as Rust's."

    $ cat test.c && musl-gcc -static -O2 test.c && echo && du -h a.out 
    #include <stdio.h>

    int main(void) {
        printf("Hello World\n");
    }

    20K a.out


With an "old" gcc I use at work:

    echo $(gcc --version) && cat test.c && gcc -static -O2 test.c && echo && du -h a.out
    gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516 ...
    #include <stdio.h>

    int main(void) {
        printf("Hello World\n");
    }

    796K a.out


I stand corrected. Congrats to MUSL!


I would also imagine that the compiler is performing a couple of optimizations that would significantly reduce the resulting binary’s size: the call to printf would likely resolve to puts, which might be further optimized to a raw write call by a smart enough compiler.


And possible dead code elimination in the linker.


> If you compile C with libc statically linked in, it'll make executables as large as Rust's.

Not in my experience.


That might be the case with glibc, but other libraries, compilers and OSes do not follow that pattern. A minimal statically compiled C program in OpenWatcom under Windows is around 50KB, and a statically compiled C program in MSVC is around 100KB (I do not count GCC/Clang because, last time I checked, those generate code that relies on msvcrt.dll, which actually is a C library that comes with Windows).

Other languages follow a similar pattern too. Free Pascal has its own runtime library which is linked statically into the executable, does not link against libc unless you want to use C library functions (or libraries that themselves may want to link against C), has a much richer library than libc, and yet it can create executables around the same size as C (a simple hello world is a 40KB .exe file in FPC).

So, no, the reason C has small executables is not that C has any head start (besides, with the exception of GCC and Clang, no other C compiler on Windows relies on a C library that comes with Windows, although some, like MSVC and C++ Builder, can use it as a dynamic library which may or may not be installed system-wide).


Does Rust have the same level of support for dynamic linking as C? Genuine question, I don't know the answer. Good support for dynamic linking is crucial for apps that get shipped in Linux distributions.


Yes, in both directions (using a .so, and building a .so in Rust), but IIRC linking to a .so is a bit hairy -- requiring a build.rs (build script) that passes special options to Cargo. Generating a .so is just a different target type to Cargo (cdylib).


Theoretically you could link to a .so with just `#[link(name="soname", kind="dylib")]` in the code.

The hairy part is finding the library on the system, and that's a mess that Rust merely inherited. The build scripts are there to run pkg-config, search system lib dirs, and compile a fallback if necessary. (Next, everyone suggests just having Cargo figure it out automatically, until they see how deep the rabbit hole goes for snowflake libraries like openssl and llvm.)


I guess the critical question is whether it's built up to the point that anyone wants to use it yet. It's one thing to have theoretical support for dynamic linking, but if in practice everyone just ships giant static binaries (which seems to be the emerging default for Go, node, etc), it doesn't help you much.


> I didn’t mention this because it doesn’t improve such a small program, but you can also try UPX and other executable compressors if you are working with a much larger application.

I don't recommend this as it will increase the memory used by your program which is usually more important than how much disk space it takes up.

OSes only page in the bits of the executable that is actually run, however when upx decompresses the binary it has to load it all into RAM. That RAM can be swapped out but for a normal executable backed by a file the kernel would just drop pages from the file so no write IO needed.

In my tests with rclone, using upx made the binary 31% of the size, but made it use 42% more RAM. YMMV!


While you are technically correct, nowadays the executable file size is minuscule compared to the available memory. Since upx uses the UCL algorithm, which AFAIK requires no additional memory besides the target buffer, the memory increase can only be attributed to the original file size; I think rclone weighs about 20 MB, so it is probably the case that rclone unusually consumes less memory! :p

The real problem with upx is the invocation latency, which is a big blocker when your program starts a lot (common for CLIs). Executable compression has been much more common on Windows, because download size matters there and programs tend to last much longer.


I've found that UPX triggers AntiVirus programs more often than uncompressed binaries on Windows.


I have routinely created firmware written in Rust that is just a handful of KB. The issues people are having are related to system integration (such as static linkage), can be mitigated in the cases anyone really cares about, and will disappear over time.


Also, the author mentions this is in the Rust FAQ (https://prev.rust-lang.org/en-US/faq.html) now under the following question:

Why do Rust programs have larger binary sizes than C programs?



One of the points has since become obsolete: jemalloc was removed and the default allocator switched to the system one in 1.32 (January 2019)[0].

[0] https://github.com/rust-lang/rust/blob/master/RELEASES.md#co...


I find it interesting, though, that the Rust code that generates a larger binary is about as complex and lengthy as the C code, while the Rust code that generates a smaller binary is clearly clunky and nobody would ever use it.

Of course it is a toy example and I imagine in most real-world examples the size of the binary is a moot point.


I believe that rust now defaults to the system's malloc implementation instead of bundling jemalloc, so that bit is no longer necessary:

    #![feature(alloc_system)]
    extern crate alloc_system;

As for the part that removes libstd, that's how you'd generally do it for a bare-metal program (i.e. a bootloader or OS), since you can't really use libstd without an OS unless you manually reimplement all sorts of primitives (memory allocation being the most obvious one).

Now if you do use libstd, it's true that it's significantly larger than libc, but it's also vastly more powerful. You don't have anything like String, Vec or iterators in libc, just a bunch of relatively low-level functions and wrappers around syscalls. It's also a fixed cost, so obviously for a simple Hello World there's a massive overhead, but for a more complex application it should be less noticeable.

I mean when you think about it even the C Hello World is ridiculously bloated. If you were to write the equivalent program in assembly without any dependency on the libc you could probably get a binary that would be significantly smaller (and most of its size would really be the ELF metadata).


Well, if you're going to go that route, you can in fact make an ELF executable that's smaller than the size of the ELF header (under Linux anyway):

https://www.muppetlabs.com/~breadbox/software/tiny/teensy.ht...


The smaller, C-like code uses println, which is a macro and expands out at compile time into something bigger than the clunky hand-written code.


While comparing the sizes of "hello world" programs is cute, how do real world small utilities compare?


That's an interesting breakdown. The size of Rust binaries is probably the largest reason why Rust is not terribly compelling for me.

Seeing that it's possible to make that more reasonable is enlightening!


The author doesn't even understand what's wrong with large binaries and writes nonsense about Internet connection speed. The main problem is that statically built libraries are not shareable. For example, if you have two Electron applications, each with its own copy of Chrome, then every Chrome will take about 100 MB just for the code section (not counting data and stack, which will typically be much larger), and together they take 200 MB. But if those applications didn't ship their own copy of Chrome and used the same shared libraries, the code would be shared and would take only 100 MB of RAM. That is, 100 million bytes of RAM saved.

The more RAM such applications take, the higher the probability that the system will run out of RAM and start swapping. And in that case the system will become so slow that it won't matter what kind of Internet connection you have.

The problem of non-shareable code also affects interpreted languages like JavaScript, Python or Java. When running such applications, their code and the byte-code generated from it are often not shareable. So if you run two copies of a desktop JS application that uses some JS library, you can get two copies of that library in memory. As I remember, Android uses some clever trick that allows sharing compiled Java code. Most other interpreted languages cannot do it.

Also, a fun fact: if you are using Gnome-shell on Wayland and the system starts swapping, computer can stop responding even to mouse movement, and you won't be able even to switch to other VT.


But part of the point of Electron is a fixed base that you control. You only have to validate against one specific version of Electron. Forcing people to use the same Electron is directly against part of the value proposition of using it in the first place.

Like, I say this as someone who hates the electron-ification of the desktop.

Additionally, there are some pretty good arguments that when you're not in a dual interpreted/compiled world and can fully LTO, as a pure Rust app can, the benefits of sharing everything between processes are outweighed by stripping out everything that isn't necessary.


The REAL answer is still the same as in 2016: the language is not mature yet. It will be fixed, in time, after other things that matter more.

Binary size has less priority than important goals such as feature completeness, compiler speed, optimization performance, target platform coverage, performance tooling support, library coverage, and myriad others. Memory has become cheap enough that excess binary size is not typically the limiting factor preventing deployment, where that happens.

As the language matures, numerous impediments to industrial deployment will fall, one at a time. The language and its ecosystem are maturing with impressive, even stunning rapidity, but it will still be ten years before the language is an obviously safe choice for any random project -- if it gets there at all. Historically, odds are against that, but the only real risks for Rust are whether its adoption rate can be grown fast enough to retain relevance, and whether its development jumps some unforeseen shark not easily recovered from.


> it somehow aborted. Probably a libbacktrace issue, I don’t know, but that doesn’t harm much anyway.

Are you talking about the SIGILL? That’s probably a ud2 inside of panic, which presumably tells LLVM it will not return and causes this instruction to be placed there.


It always amazes me when new languages start out and their implementations include a huge pile of crap by default.

Virgil is different, it only includes the stuff which is reachable from main(). In fact the entire compiler is organized around only compiling reachable code into the binary.

Even including the entire runtime and garbage collector (about 8 KiB), there isn't much smaller you can get:

HelloWorld.v3:

    def main(a: Array<string>) {
        System.puts("Hello World!\n");
    }

    -rwxr-xr-x 1 titzer wheel   80 May 16 13:59 HelloWorld-jar
    -rwxr-xr-x 1 titzer wheel 9088 May 16 13:59 HelloWorld-x86-darwin
    -rwxr-xr-x 1 titzer wheel 5744 May 16 13:58 HelloWorld-x86-darwin-nogc
    -rwxr-xr-x 1 titzer wheel 4132 May 16 13:58 HelloWorld-x86-darwin-nort
    -rwxr-xr-x 1 titzer wheel 4692 May 16 13:59 HelloWorld-x86-linux-nogc
    -rwxr-xr-x 1 titzer wheel  340 May 16 13:59 HelloWorld-x86-linux-nort
    -rw-r--r-- 1 titzer wheel 6309 May 16 13:59 HelloWorld.jar
    -rw-r--r-- 1 titzer wheel   63 May 16 13:57 HelloWorld.v3
    -rw-r--r-- 1 titzer wheel 3486 May 16 14:04 HelloWorld.wasm
    -rw-r--r-- 1 titzer wheel  256 May 16 14:46 HelloWorld-nogc.wasm

(the executable files are, well, the executables). The Linux executable without runtime or GC is literally 340 bytes, the wasm executable without GC is 256 bytes. The x86-darwin-nort binary is large because apparently Mach-O executables don't work right unless they are at least one 4KiB page in size.


I wanted to check that out, but it seems the homepage at http://compilers.cs.ucla.edu/virgil/ is broken: the web server doesn't process the server-side includes (e.g. linkbar.inc, footer.inc).



Why was this downvoted?



