
A linker typically only includes the parts of the library it needs for each binary, so some parts will definitely have many copies of the same code when you statically link, but it will not make complete copies.

But I wouldn't consider this bloat. To me it is just a better separation of concerns. To me, bloat would be having a system that has to keep track of all library dependencies instead, both from a packaging perspective and at runtime. I think it depends on where you are coming from. To me static linking is just cleaner. I don't care much about the extra memory it might use.




Dynamic linking served us when OS upgrades came infrequently, user software was almost never upgraded short of mailing out new disks, and vendors had long lead times to incorporate security fixes.

In the days of fast networks, embedded OSs, ephemeral containers, and big hard drives, a portable static binary is way less complex and only somewhat less secure (unless you're regularly rebuilding your containers/executables, in which case it's break-even security-wise, or possibly more secure, simply because each executable may not include the vulnerable code).


> In the days of fast networks, embedded OSs, ephemeral containers, and big hard drives, a portable static binary is way less complex and only somewhat less secure

If what you're trying to do is run a single program on a server somewhere, then yes, absolutely, a static binary is the way to go. There are lots of cases, especially end-user desktops, where this doesn't really apply though.

In my opinion the debate over static vs dynamic linking is resolved by understanding that they are different tools for different jobs.


  understanding that they are different tools for different jobs
Right, but this goes against the dogma on both sides and the fact that much of Linux userspace is the wild west. Ideally, there should be a set of core system libraries (e.g. glibc, OpenSSL, Xlib, etc.) that have extremely stable API/ABI semantics and are rarely updated.

Then one dynamically links the core libraries and statically links everything else. This ensures that a bug/exploit found in something like OpenSSL doesn't require the entire system to be recompiled and updated, while allowing libraries that are unstable, used by few packages, etc., to be statically linked into their users. Then, when lib_coolnew_pos has a bug, it only requires rebuilding the two apps linked to it, and perhaps not even that if those applications don't expose the bug.
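
To make that concrete, a rough sketch of what this looks like at link time with GNU ld (lib_coolnew_pos is the hypothetical library from above; exact flags depend on your toolchain):

  # Link the core libraries dynamically and the unstable one statically.
  # -Wl,-Bstatic / -Wl,-Bdynamic toggle how the following -l options are resolved;
  # ending with -Bdynamic keeps the implicit libc/libgcc link dynamic.
  $> gcc -o myapp main.o \
       -Wl,-Bstatic -lcoolnew_pos \
       -Wl,-Bdynamic -lssl -lcrypto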


> Then one dynamically links the core libraries and statically links everything else.

Agreed, and that is already totally possible.

- If you split your project in libraries (there are reasons to do that), then by all means link them statically.

- If you depend on a third party library that is so unstable that nobody maintains a package for it, then the first question should be: do you really want to depend on it? If yes, you have to understand that you are now the maintainer of that library. Link it dynamically or statically, whichever you want, but you are responsible for its updates in any case.

The fashion that goes towards statically linking everything shows, to me, that people generally don't know how to handle dependencies. "It's simpler" to copy-paste the library code in your project, build it as part of it, and call that "statically linking". And then probably never update it, or try to update it and give up after 10min the first time the update fails ("well, the old version works for now, I don't have time for an update").

I am fine with people who know how to do both and choose to statically link. I don't like the arguments coming from those who statically link because they don't know better, but still try to justify themselves.


> Agreed, and that is already totally possible

How? Take for instance OpenSSL mentioned above. I have software to distribute for multiple Debian versions, starting from Bullseye, which uses OpenSSL 1.x and libicu67. Bookworm, the most recent, has icu72 and OpenSSL 3.x, which are binary-incompatible. My requirement is that I do only one build, not one per distro, as I do not have the manpower or CI availability for this. What's your recommendation?


> How?

Well you build OpenSSL as a static library, and you use that...

> Take for instance OpenSSL mentioned above.

However, for something like OpenSSL on a distro like Debian, I really don't get why one would want it: it is most definitely distributed by Debian in the core repo. But yeah, I do link OpenSSL statically for Android and iOS (where the system does not provide it anyway). That's fairly straightforward, I just need to build OpenSSL myself.
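
For reference, a rough sketch of that build (an outline, not a recipe: the prefix is arbitrary and the lib/ vs lib64/ layout varies by platform):

  $> ./config no-shared --prefix=$HOME/thirdparty/openssl
  $> make -j$(nproc) && make install_sw
  # then link the archives directly instead of the system's -lssl -lcrypto:
  $> gcc -o myapp main.o \
       $HOME/thirdparty/openssl/lib/libssl.a \
       $HOME/thirdparty/openssl/lib/libcrypto.a -lpthread -ldl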

> My requirement is that I do only one build

You want to make only one build that works with both OpenSSL 1 and OpenSSL 3? I am not sure I understand... the whole point of the major update is that they are not compatible. I think there is fundamentally no way (and that's by definition) to support two explicitly incompatible versions in the same build...


> Well you build OpenSSL as a static library, and you use that...

I mean, yes, that's what I do, but see my comment: I was asking specifically about the dynamic linking mentioned by the parent (OpenSSL is definitely a "core library").

> I think there is fundamentally no way (and that's by definition) to support two explicitly incompatible versions in the same build.

Yes, that's my point - in the end, static linking is the only thing that will work reliably when you have to ship across an array of distros, even for core libraries... The only exceptions in my mind are libGL & other drivers.


I strongly believe that developers should not ship across an array of distros. First, because you probably don't test on them all.

Really, that's the job of the distro/package maintainers. As a developer, you provide the sources of your project. If people want to use it on their respective distro, they write and maintain a package for it, or ask their distro maintainers to do it. That is the whole point of the distro!


Well, I completely disagree. I have a fair number of users on a wide array of distros who are non-technical - just users; they wouldn't know how to compile something, let alone write a distro package. They still deserve to be able to use the software they want without having to change OS.

> or ask their distro maintainers to do it.

This only works if you're using a rolling-release distro. You can't get new packages in the repos of Ubuntu 20.04, Suse Leap, Fedora 30 or Debian Bullseye.


Static linking does not imply copying the code into the project.


Of course not. My point was that people are in the wrong when they say "static linking is better" merely because the only thing they know how to do (copying the code into their project) results in something that looks like static linking.


> Right, but this goes against the dogma on both sides and the fact that much of Linux userspace is the wild west. Ideally, there should be a set of core system libraries (e.g. glibc, OpenSSL, Xlib, etc.) that have extremely stable API/ABI semantics and are rarely updated.

This is largely true and how most proprietary software is deployed on Linux.

glibc is pretty good about backwards compatibility. It gets shit for not being forwards compatible (i.e. you can't take a binary linked against glibc 2.34 and run it on a glibc 2.17 system). It's not fully bug for bug compatible. Sometimes they'll patch it, sometimes not. On Windows a lot of applications still link and ship their own libc, for example.
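
A quick way to see what that means for a given binary (./myapp is a placeholder, output illustrative): the highest GLIBC_x.y symbol version referenced is roughly the oldest glibc the binary will run on.

  $> objdump -T ./myapp | grep -o 'GLIBC_[0-9.]*' | sort -Vu | tail -3
  GLIBC_2.17
  GLIBC_2.28
  GLIBC_2.34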

Xlib et al. don't break in practice. Programs bring their own GUI framework linking against them, and it'll work. Some are adventurous and link against the system gtk2 or gtk3. Even that generally works.

OpenSSL does have a few popular SONAMEs around, but it has had some particularly nasty API breaks in the past. Many distros offer two or more versions of OpenSSL for this reason. However, most applications ship their own.

If you only need to talk to some servers, you can link against system libcurl though (ABI compatible for like twenty years). This would IMHO be much better than what most applications do today (shipping their own crypto + protocol stack which invariably ends up with holes). While Microsoft ships curl.exe nowadays, they don't include libcurl with their OS. Otherwise that would be pretty close to a universally compatible protocol client API and ABI and you really wouldn't have any good reason any more to patch the same tired X.509 and HTTP parser vulnerabilities in each and every app.


It applies very much to end user desktops as well, with snap, flatpak, etc. working towards it. Lots of software requires dependencies that aren't compatible with each other and result in absolute dependency hell or even a broken install when you dare to have more than one version of something. Because who would ever need that, right? Especially not in a dev desktop environment...

Windows is basically all self-contained executables, and the few times it isn't, it's a complete mess of installing VC++ redistributables or the correct Java runtime or whatever, which clueless users inevitably mess up.

We have the disk space, we have the memory, we have the broadband to download it all. Even more so on desktop than on some cheap VPS.


> Windows is basically all self-contained executables

With the caveat that the "standard library" they depend on is multiple GBs and provides more features than the whole of GNOME.

Also, MS has always worked on tech to avoid library duplication, such as WinSxS; now MSIX even auto-dedupes at download time.


> when you dare to have more than one version of something. Because who would ever need that, right?

If done properly, you can have multiple major versions of something and that's fine. If one app depends on libA.so.1.0.3, the other on libA.so.1.1.4, and they can't both live with 1.1.4, it means that `libA` did something wrong.

One pretty clear solution to me is that the dev of libA should learn good practice.
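
For what it's worth, the "good practice" here is mostly just the usual SONAME convention; a minimal sketch with the hypothetical libA from above:

  # bump the SONAME only on ABI breaks; keep it stable across compatible releases
  $> gcc -shared -fPIC -Wl,-soname,libA.so.1 -o libA.so.1.1.4 a.c
  $> readelf -d libA.so.1.1.4 | grep SONAME
   0x000000000000000e (SONAME)             Library soname: [libA.so.1]
  # binaries record only "libA.so.1", so 1.1.4 can replace 1.0.3 in place,
  # while an incompatible libA.so.2.x can be installed alongside it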


Yep, the dev(s) of libA should learn good practice. But they didn't and app1 and app2 still have the problem. Static linking solves it for them more reliably than trying to get the dev of libA to "git gud". Much of the desire to statically link binaries comes from this specific scenario playing out over and over and over.

Heck for a long time upgrading glibc by a minor version was almost guaranteed to break your app and that was often intentional.


> Yep, the dev(s) of libA should learn good practice. But they didn't and app1 and app2 still have the problem.

Sure :-). I just find it sad that app1 and app2 then use the bad libA. Of course that is more productive, but I believe this is exactly the kind of philosophy that makes the software industry produce worse software every year :(.


I used to think the same. But after nearly 30 years of doing this, I no longer think that people will meet the standard you propose. You can either work around it, or you can abandon mainstream software entirely and make everything you use bespoke. There are basically no other choices.


Yeah I try really hard to not use "bad" dependencies. When I really can't, well... I can't.

But still I like to make it clear that the software industry goes in that direction because of quality issues, and not because the modern ways are superior (on the contrary, quite often) :-).


Wishing that all people will be smart and always do the correct thing is setting yourself up for madness. The dependency system needs to be robust enough to endure a considerable amount of dumbfuckery. Because there will be a lot of it.


Just because I have to live with "malpractice" doesn't mean I shouldn't call it that, IMHO.

I can accept that someone needs to make a hack, but I really want them to realize (and acknowledge) that it is a hack.


It should be noted though that flatpaks and related solutions are NOT equivalent to static linking. They do a lot more and serve a wildly different audience than something like Oasis. They are really much too extreme for non-GUI applications, and I would question the competence of anybody found running ordinary programs packaged in that manner.

I recognize that you probably weren't confused on this; I'm just clarifying for others, since the whole ecosystem can be a bit confusing.


Windows makes up the lion's share of desktop computing, and seems to be doing fine without actually sharing libraries. Lots of dynamic linking going on, but since about the XP days the entire Windows ecosystem has given up on different software linking the same library file, except for OS interfaces and C runtimes. Instead, everyone just ships their own version of everything they use, and dynamic linking is mostly used to solve licensing, for developer convenience, or for plugin systems. The end result isn't that different from everything being statically linked.


As far as I can see, it would be unwise to roll back 30 years of (Linux) systems building with dynamic linking in favor of static linking. It mostly works very well, saves some memory and disk space, and has nice security properties. Both approaches have significant pros and cons.

I've been thinking (not a Linux expert by any means) that the ideal solution would be better dependency management: say, if binaries themselves carried dependency information. That way you get the benefits of both dynamic and static linking just by distributing binaries with embedded library requirements. Also, I think there should be a change of culture in library development to clearly mark compatibility breaks (I think something like semantic versioning works like that?).

That way, your software could support any newer version up to a compatibility break -- which should be extremely rare. And if you must break compatibility, there should be an effort to keep old versions available, secure, and bug-free (or at least the old versions should be flagged as insecure in some widely accessible database).

Moreover, executing old/historical software should become significantly easier if library information was kept in the executable itself (you'd just have to find the old libraries, which could be kept available in repositories).

I think something like that could finally enable portable Linux software? (Flatpak and AppImage notwithstanding)


Everything you describe already exists. Executables do list their dependencies, and we have well-defined conventions for indicating ABI breaks. It is entirely normal to have multiple major versions of a library installed for ABI compatibility reasons, and it is also entirely normal to expect that you can upgrade the dependencies out from under a binary as long as the library hasn't had an ABI break.
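
Concretely, both the dependency list and the ABI-break convention are visible in the ELF dynamic section (output illustrative, taken from a typical ls):

  $> readelf -d $(command -v ls) | grep NEEDED
   0x0000000000000001 (NEEDED)             Shared library: [libcap.so.2]
   0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
  # the .2 / .6 are SONAME major versions, which get bumped on ABI breaks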

The bigger dependency management problem is that every distro has their own package manager and package repository and it's tough for one application developer to build and test every kind of package. But if they just ship a binary, then it's up to the poor user to figure out what packages to install. Often the library you need may not even be available on some distros or the version may be too old.


That's why distros ask you to provide just the sources; we'll do the packaging work for you. The upstream developers shouldn't need to provide packages for every distro. (Of course, you can help us downstream packagers by not having insane build requirements, using semantic versioning, not breaking stuff randomly, etc.)


This is only realistic for established applications with large userbases. For new or very niche apps, distros are understandably not going to be very interested in doing this work. In that case the developer needs to find a way to distribute the app that they can reasonably maintain directly, and that's where containers or statically-linked binaries are really convenient.


I agree with everything you said up to this. We're talking about a software library, for which the user is a software developer. IMO a software developer should be able to package a library for their own distro (then they can share that package with their community and become this package's maintainer).

As the developer of an open source library, I don't think that you should distribute it for systems that you don't use; someone else who uses it should maintain the package. It doesn't have to be a "distro maintainer". Anyone can maintain a single package. I am not on a very mainstream distro, and I still haven't found a single package that I use and is not already maintained by someone in the community (though I wish I did, I would like to maintain a package). My point is that it really works well :-).

I disagree with the idea that we should build a lot of tooling to "lower the bar" such that devs who don't know how to handle a library don't have to learn how to do it. They should learn, it's their job.

For proprietary software, it's admittedly a bit harder (I guess? I don't have much experience there).


This isn't really true; Fedora, Debian, and Arch have huge numbers of packages, many of them very niche. You might well need to make the distro aware that the new program exists, but there are established routes for doing that.


Arch particularly has the user repository where anyone can submit a package and vote on the ones they use most often to be adopted into the community repository, yes.

It’s a great way to start contributing to the distribution at large while scratching an itch and providing a service to individual projects.


This is not grounded in reality. Look at popcon or something like it. It is a nearly perfect "long tail" distribution. Most software is niche, and it's packaged anyway. It's helped by the fact that the vast majority of software follows a model where it is really easy to build. There are a lot more decisions to take with something like Chromium, which perhaps ironically is also the type of software which tends to package its own dependencies.


> Executables do list their dependencies

They list paths to libraries, but not the exact version that the executable depends on. It is a common occurrence for executables to load versions of libraries they were not designed to be used with.


If you're talking about ELF for desktop Linux, they for the most part don't contain file paths, and may specify the version but usually just have the major version (to allow for security updates). You can use ldd to read the list of deps and also do a dry run of fulfilling them from the search path, for example:

  $> ldd $(command -v ls)
    linux-vdso.so.1 (0x00007ffd5b3a0000)
    libcap.so.2 => /usr/lib/libcap.so.2 (0x00007f6bd398c000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f6bd3780000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f6bd39e5000)


Libraries can cause bugs even if they have the exact same version, as they may be compiled in a way that is not expected by the program. Ideally, the list of libraries should include some form of hash of each library, to ensure the program is loading exactly what it expects.


Yes, if someone actually did dependency management in Linux properly then I agree - dynamic linking would be fine. It works pretty well in NixOS as I understand it. But it's called dependency hell for a reason. And the reason is that almost no operating systems handle C dependencies well. There are always weird, complex, distribution-specific systems involving 18 different versions of every library. Do you want llvm18 or llvm18-dev or llvm-18-full-dev or something else entirely? Oh, you're on Gentoo? Better enable some USE flags. Red Hat? It's different again.

If Linux dependency management worked well, there would be no need or appetite for docker. But it works badly. So people just use docker and flatpak and whatnot instead, while my hard drive gently weeps. I don’t know about you, but I’m happy to declare bankruptcy on this project. I’d take a 2mb statically linked binary over a 300mb Linux docker image any day of the week.


> If Linux dependency management worked well, there would be no need or appetite for docker.

I kindly disagree here. Linux dependency management does work well. The problem is the bad libraries that don't do semver properly, and the users who still decide to use bad libraries.

If people stopped using libraries that break ABI compatibility, then the authors of those libraries would have to do it properly, and it would work. The reason it doesn't work is really just malpractice.


If Linux dependency management works well in theory but not in practice, then it doesn't work. It works in Nix because it can literally use multiple minor versions of a library when it needs to, with no problem. Most distros can't or won't do that.

You can call it malpractice but it's not going to stop so in practice you need a way to deal with it.


Well, by calling it "malpractice", I say that it works for "true professionals". Then we could say that "it doesn't work in practice if people who don't know what they are doing cannot use it", of course.

The question then is where we want to put the bar. I feel like it is too low, and most software is too bad. And I don't want to participate in making tooling that helps lower the bar even more.


And by the way it does work really well for good software. Actually most Linux distros use a system package manager and have been doing it for decades.

So I think it would be more accurate to say that "it doesn't work for lower quality software". And I agree with that.


Semver only controls API compatibility, not ABI compatibility. You can make an ABI break in a Semver minor (or patch) version update. Semver is nice, but it's not enough for ensuring compatibility when dynamic linking.


SONAME is here for ABI compatibility, right?


This isn't really an "operating system" problem. Particularly in the open-source world, there are a number of fairly core libraries that refuse to provide any kind of API compatibility.

Then, when there are a couple dozen applications/etc. that depend on that library, it's almost an impossible problem, because each of those applications then needs to be updated in lockstep with the library version. There is nothing "clean" about how to handle this situation, short of having loads of distro maintainers showing up in the upstream packages to fix them to support newer versions of the library. Of course, then all the distros need to agree on what those versions are going to be...

Hence containers, which don't fix the problem at all. Instead they just move the responsibility away from the distro, which should never really have been packaging applications to begin with.


> away from the distro, which should never really have been packaging applications to begin with.

I disagree here: the whole point of a "software distribution" is to "distribute" software. And it does so by packaging it. There is a ton of benefit in having distro/package maintainers, and we tend to forget it.



I should have been a bit more balanced or nuanced: I also don't think static linking should be forbidden or completely shunned. As Linus himself says, a combination of both may be ideal. For basic system libraries like GUI libraries, the current approach works well. But you should be free to statically link if you want, especially if there are serious issues when you don't. Maybe dynamic linking should be focused on a smaller number of well-curated libraries and the rest should be left to static linking. Library archeology seems like a potentially serious problem years from now.

I still think better dependency listing (perhaps with the option to pin an exact version?) would be helpful, as well as better usage of something like semver. Someone mentioned that binaries include paths to their dependencies, but as far as I know, there is no standard tool or interface to automatically resolve those dependencies; maybe some more tooling in this area would help.

Another nice point about how it currently works is that I think it takes work off programmers. The policy of "Don't worry about distribution (just tell us it exists)" from distros seems like one less headache for the creator (and you can provide statically linked binaries too if you want).

As most things in life, the ideal is somewhere in the middle...


> Dynamic linking served us when OS upgrades came infrequently, user software was almost never upgraded

Even today, dynamic linking is not only a security feature but also serves convenience. A security fix in OpenSSL or libwebp can be applied to everything that uses them by just updating those libraries instead of having to rebuild userland, with Firefox, Emacs, and so on.


Then why does every Steam game need to install a different version of the Visual C++ redistributable?


Because they are not packaged by the distros so they are not guaranteed to have the libraries present that they were linked against? I am just guessing, I haven’t used Steam.


Does this happen on Windows too? The reason it happens on Linux is because every game ran via Proton/WINE gets its own virtual C: drive.


Yeah, I'd rather we just use another gigabyte of storage than add so much complexity. Even with what is a modest SSD capacity today, I have a hard time imagining how I'd fill my storage. I'm reminded of my old workstation from 8 years ago. It had a 500GB hard drive and a 32GB SSD for caching. I immediately reconfigured it to just use the SSD for everything by default. It ended up being plenty.


Apple has been pushing dynamic libraries for a while, but has now realized that they really like static linking better. The result is that they found a way to convert dynamic libraries into static ones for release builds, while keeping them dynamic for debug builds: https://developer.apple.com/documentation/xcode/configuring-...


Very interesting, as of Xcode 15? I wonder if anyone has explored doing this on Linux, and hope this gets a little more attention.


Yes, announced last June, Xcode 15


I'm not versed in this, so apologies for the stupid question, but wouldn't static linking be more secure, if anything? Or at least have potentially better security?

I always thought the better security practice is statically linked Go binary in a docker container for namespace isolation.


If there is a mechanism to monitor the dependency chain. Otherwise, you may be blissfully unaware that some vulnerability in libwhatever is in some binary you're using.

Golang tooling provides some reasonable mechanisms to keep dependencies up to date. Any given C program might or might not.
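
For the Go side, the mechanisms alluded to are roughly these (govulncheck is a separate official tool, installed via `go install golang.org/x/vuln/cmd/govulncheck@latest`):

  $> go list -m -u all     # list module dependencies and available updates
  $> go mod tidy           # prune and refresh go.mod / go.sum
  $> govulncheck ./...     # report known vulnerabilities reachable from your code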


> If there is a mechanism to monitor the dependency chain.

So that would not be less secure, but it would also not make it more secure than dynamic linking with a good mechanism, right?


Personally, I think any inherent security advantage (assuming it has great dependency management) would be very small. This "Oasis" project doesn't seem to call it out at all, even though they are making a fair amount of effort to track dependencies per binary.

They cite the main benefits being this: "Compared to dynamic linking, this is a simpler mechanism which eliminates problems with upgrading libraries, and results in completely self-contained binaries that can easily be copied to other systems".

Even that "easily be copied to other systems" sort of cites one of the security downsides. Is the system you're copying it to going to make any effort to keep the transitively statically linked stuff in it up to date?


> A linker typically only includes the parts of the library it needs for each binary, so some parts will definitely have many copies of the same code when you statically link, but it will not make complete copies.

Just to add to what you said: in the old days the linker would include only the .o files in the .a library that were referenced. Really common libraries like libc should be made to have only a single function per .o for this reason.

But modern compilers have link time optimization, which changes everything. The compiler will automatically leave out any items not referenced without regard to .o file boundaries. But more importantly, it can perform more optimizations. Perhaps for a given program a libc function is always called with a constant for a certain argument. The compiler could use this fact to simplify the function.
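
A minimal sketch of that with GCC (file names are made up; Clang's -flto behaves similarly):

  $> gcc -O2 -flto -c mylib.c     # object file keeps the compiler's IR alongside the code
  $> gcc -O2 -flto -c main.c
  $> gcc -O2 -flto -o app main.o mylib.o
  # at this final link, unreferenced functions are dropped regardless of .o
  # boundaries, and a function always called with a constant argument can be
  # specialised or inlined across the file boundary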

I'm thinking that you might be giving up quite a lot of performance by using shared libraries, unless you are willing to run the compiler during actual loading.

Even without LTO, you can get the same results in C++ by having your library in the form of a template, so the library lives entirely in a header under /usr/include, with nothing in /usr/lib.


> Just to add to what you said: in the old days the linker would include only the .o files in the .a library that were referenced.

It was not exactly like that. Yes, the .o file granularity was there but the unused code from that .o file would also get linked in.

The original UNIX linker had a very simple and unsophisticated design (compared to its contemporaries) and would not attempt to optimise the final product being linked. Consider a scenario where the binary being linked references A from an «abcde.o» file, and the «abcde.o» file has A, B, C, D and E defined in it, so the original «ld» would link the entire «abcde.o» into the final product. Advanced optimisations came along much later on.
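
A small illustration of that .o-granularity behaviour, which still holds for a plain static link today (hypothetical files: abcde.c defines A through E, main.c calls only A):

  $> gcc -c abcde.c && ar rcs libabcde.a abcde.o
  $> gcc -o app main.c -L. -labcde
  $> nm app | grep -c ' T [ABCDE]$'
  5
  # the whole abcde.o member is pulled in; trimming below .o granularity takes
  # -ffunction-sections plus -Wl,--gc-sections, or LTO as discussed above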


> A linker typically only includes the parts of the library it needs for each binary […]

It is exactly the same with dynamic linking, due to the demand paging available in all modern UNIX systems: the dynamic library is not loaded into memory in its entirety; it is mapped into the process's virtual address space.

Initially, no code from the dynamic library is loaded into memory; when the process attempts to execute the first instruction of the required code, a page fault occurs and the virtual memory management system loads the required page(s) into the process's memory. A dynamic library can be 10 GB in size and appear as 10 GB in the process's memory map while only one page is physically present in memory. Moreover, under heavy memory pressure the kernel can evict the page(s) again (using LRU or a more advanced page tracking technique), so the process (especially a background or idling one) may end up with no resident pages of the dynamic library's code at all.
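
You can see this on a live process: with pmap, the mapped size (Kbytes) of libc's text segment is much larger than what is resident (RSS), and the resident pages are shared with other processes (numbers below are illustrative):

  $> pmap -x $$ | grep 'r-x.*libc'
  00007f2e4ae28000    1948     696       0 r-x-- libc.so.6
  # ~1.9 MB of libc text mapped into this shell, ~0.7 MB actually resident;
  # the rest is faulted in on demand, or never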

Fundamentally, dynamic linking is deferred static linking, where the linking work is delegated to the dynamic library loader. Dynamic libraries incur a [relatively] small overhead of slower (compared to statically linked binaries) process startup times, due to the dynamic linker having to load the symbol table and the global offset table from the dynamic library and perform the symbol fixups according to the process's own virtual memory layout. It is a one-off step, though. For large, very large and frequently used dynamic libraries, caching can be employed to reduce such overhead.

Mapping a dynamic library into the virtual address space != loading the dynamic library into memory; they are two disjoint things. It almost never happens that the entire dynamic library is loaded into memory, as 100% code coverage is exceedingly rare.


> It is a one-off step, though.

Yes, but often a one-off step that sets all your calls to go through a pointer, so each call site in a dynamic executable is slower due to an extra indirection.

> For large, very large and frequently used dynamic libraries, caching can be employed to reduce such overhead.

The cache is not unlimited nor laid out obviously in userspace, and if you have a bunch of calls into a library that end up spread all over the mapped virtual memory space, sparse or not, you may evict cache lines more than you otherwise would if the functions were statically linked and sequential in memory.

> as 100% code coverage is exceedingly rare.

So you suffer more page faults than you otherwise have to in order to load one function in a page and ignore the rest.


> Yes, but often a one-off step that sets all your calls to go through a pointer, so each call site in a dynamic executable is slower due to an extra indirection.

That is true; however, in tight loops or hot code paths it is unwise to incur a jump anyway (even into a subroutine in close locality). If the overhead of invoking a function in performance-sensitive or critical code is considered too high, the code has to be rewritten to do away with it, and that is called micro-optimisation. This will also be true in the case of static linking.

Dynamic libraries do not cater for micro-optimisations (which are rare) anyway. They offer greater convenience, with a trade-off against maximum code performance gains.

> The cache is not unlimited nor laid out obviously in userspace […]

I should have made myself clearer. I was referring to the pre-linked shared library cache, not the CPU cache. The pre-linked shared library cache reduces the process startup time and offers a better user experience. The cache has nothing to do with performance.

> So you suffer more page faults than you otherwise have to in order to load one function in a page and ignore the rest.

I will experience significantly fewer page faults if my «strlen» code comes from a single address in a single memory page shared by 10k processes invoking it (the dynamic library case), as opposed to 10k copies of the same «strlen» sprawled across 10k distinct memory pages at 10k distinct memory addresses (the static linking case).


You should be keeping track of those library dependencies anyway if you want to know what you have to recompile when, say, zlib or openssl has a security problem.


Well, you have to do that anyways


Can't file systems dedupe this now?



