Sta.li: Static Linux (sta.li)
150 points by Xyzodiac on Feb 19, 2014 | 98 comments



Ah kids, they crack me up!

Lots of fun reading that. I looked around briefly but couldn't find my email archive from Sun, but I was in the kernel group when folks got the idea that "Gee, if you shared the text segment of libraries, that would give you more memory for your buffer cache or user pages!" One of the guys in my group re-wrote the linker to scan through and realign text segments so that the maximum number of read-only (and thus shareable) pages could be allocated. And of course there was code with static buffers and whatnot to deal with (go look at the 4.1BSD code, they were everywhere). It made my Sun 3/75, which had, wait for it, 32 megabytes of RAM (yes, less than the L2 cache in some machines these days), run quite a bit faster. Took a long time to get right too.

Shared libraries gave you three distinct advantages: one, you cut down on the working set size; two, you cut down on file load time; and three, it became possible to "interpose" on the library and run two versions of the library at the same time for backwards compatibility.

Building a static system might be fun, but for a 64-bit system, building one where libraries are fixed in the address space, in flash or something, might actually be even better.


I remember, but these days the parallel to working set size (depending on the app) is fitting things in cache, or at least cache-awareness -- which some people aim for, but which is largely an ignored issue. Page thrashing versus cache thrashing.

A potential issue, both back then and today, is that dynamic linkage often ends up using an indirection table, unlike static linkage with its simple fixups, which adds at least the cost of an indirection to each function call.

Sometimes that overhead is swamped by other factors, sometimes it's not, but that would be one reason why static linkage can be faster at times.

Dlopen() is certainly a lifesaver sometimes.
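
If anyone wants to see that indirection for themselves, here is a minimal sketch (the file names are invented; only standard gcc flags and objdump are assumed):

    /* plt_demo.c -- the only external call is to a libc function. With the
     * default dynamic linking, the call to puts() goes through the PLT/GOT
     * indirection; with -static it is resolved to a direct call at link time.
     *
     * Build both ways and compare the disassembly of main():
     *   gcc -O2 plt_demo.c -o demo-dyn
     *   gcc -O2 -static plt_demo.c -o demo-static
     *   objdump -d demo-dyn      (look for a call to puts@plt)
     *   objdump -d demo-static   (the call goes straight to puts)
     */
    #include <stdio.h>

    int main(void)
    {
        int i;
        for (i = 0; i < 10; i++)
            puts("hello, possibly via the PLT");
        return 0;
    }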


Plus you can inline stuff with static linking.


Whole program optimization is really the killer feature for static linking today. It's possible with dynamic linking, but a static compiler can really go to town shedding weight since lots of libraries have common code structures (see every sufficiently complex C library and its own implementations of various data structures).
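
As a rough sketch of what that weight-shedding looks like with today's toolchains (file names are made up; -flto is the standard gcc flag, and clang has the same option):

    /* util.c -- stand-in for a statically linked library */
    int add(int a, int b) { return a + b; }
    int never_called(int x) { return x * 42; }

    /* main.c */
    extern int add(int a, int b);
    int main(void) { return add(2, 3); }

    /* Build with link-time optimization:
     *   gcc -O2 -flto -c util.c
     *   gcc -O2 -flto -c main.c
     *   gcc -O2 -flto util.o main.o -o prog
     * Because the optimizer sees the whole program at link time, add() can be
     * inlined into main() and never_called() can typically be dropped, neither
     * of which is possible when add() lives in a shared library resolved at
     * run time.
     */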


that one paragraph... if i were you i'd be wishing i could edit that comment! you most certainly do not "cut down on file load time." the linking process is a complex task, especially if it's to be performed with any efficiency. i remember when starting a large dynamically linked executable was a glacial process on linux, far slower than the time necessary to read the entire executable from disk. no sensible system loads the entire executable when it's statically linked.

your claim of running two library versions at the same time is downright hilarious! this is far harder with dynamic linking, where you need to be sure of loading the right library on every launch of the affected programs, than it is with static linking, where the libraries are just built into the executables.


To be clear, loading data from disk takes milliseconds and re-computing addresses takes nanoseconds. Yes, in less than one tenth of the time it takes to read from disk the parts of the libraries you have linked into your binary, you can locate and fix up any link references.

This part I don't get: "no sensible system loads the entire executable when it's statically linked." If you're a statically linked executable, by definition the entire file is headed into memory; if there was something in the library you didn't use, it got edited out in the link step. Now, you may mmap the file and fault it in as you go along, but you are going to have the whole thing read.


> To be clear, loading data from disk takes milliseconds and re-computing addresses takes nanoseconds.

I think the literature surrounding prelink pretty strongly contradicts this assertion. See:

http://people.redhat.com/jakub/prelink/prelink.pdf

http://lwn.net/Articles/341305/

http://lwn.net/Articles/341309/


> you most certainly do not "cut down on file load time."

Surely you must be joking, eekee.

I know SSDs have made us all forget, but disks (you know, those spinning piles of rust that most of us still have in our computers to permanently store our bits on) are incredibly, painfully slow. Average times for any action on a disk are in milliseconds. That's millions of computational cycles.

I'm sure someone could invent a system where the dynamic linking process is slower than loading a static executable from disk, but I've yet to find it. I'm also certain that our assumption that dynamic linking is always the way to go will be more and more challenged by the speed of SSDs, which are getting closer to RAM speeds every day. But for today... no way.


>>the linking process is a complex task, especially if it's to be performed with any efficiency.

I'm no kernel hacker, but doesn't that make the GP argument better?

It would be almost like static linking?

If code is compiled against a shared lib that will always be at the same address in virtual memory, the linking setup could be cached. (And redone if there is a new version of the library, of course.)

(I realize that caching this symbol table won't be a totally trivial change.)


>> I'm no kernel hacker, but doesn't that make the GP argument better?

Nope. Dynamic linking is done each time the program is loaded: the kernel calls out to the dynamic linker to open shared libraries, resolve symbols, create jump tables and the like. Static linking is done once (at compile/link time). When you execute a statically linked binary, the kernel just loads the text, data and bss into memory and more or less starts executing main(). Much, much simpler, although you lose the ability to do things like ASLR.


BugBrother, however, correctly interpreted my suggestion, which is that given a large address space, one could define an address where shared libraries would always appear. Thus your linking information can be fixed in the executable, and the only dynamic part is a check to see whether the library is loaded or not. You get all the link-speed benefits of static linking and the load-speed benefits of dynamic linking.


My point was that with dynamic libraries at fixed memory addresses, the dynamic linking information can be cached (as long as no binaries are updated). That would imply efficiency similar to static linking.

Sorry if I wasn't clear.


Isn't that what prelinking[1] does?

[1]: http://en.wikipedia.org/wiki/Prelink


Thanks. Heh, a bit after I stopped doing "real" work and went scripting. :-) :-(

Sometimes I regret that, but then I think of the repeated reading of the Effective C++ books. Not to mention spending a week writing C for something I could throw together in hours in Lisp or Perl.


Furthermore, the most dramatic issues arise with C++ vtables, although even that has been addressed with prelinking, and a large address space helps a lot.


I think the real advantage to static linking is that it forces developers to fully acknowledge the resources they are using.

Programs using shared libraries allow a degree of blame avoidance. It's difficult to judge the real impact of things; it rests on assumptions about how widely the library is shared, while putting pressure on the library to serve more masters and become more generalized, increasing its overall size.

It's not a simple equation. It would at least be interesting to get data on all-static systems, as well as to compare statically linked memory usage vs. shared libraries on a per-program basis.


You still have the same issues about blame with libraries, static linking them won't solve it.


Yeah, but the dev will know at compile time what those problems are.


Plan 9 is statically linked. Suckless has a lot of Plan 9 fans, which is probably where this project comes from.


In case it's not obvious: if you statically link to a library, say libpng, and a vuln hits, every binary that linked to libpng potentially needs to be rebuilt and redistributed.

If the OS has rigid dependency tracking (maybe source distros like Gentoo, or a cryptographically tracked binary distribution like freebsd-update), maybe you can live with that.

So there's some trade-off of "DLL hell" for binary hell, and perhaps some other security advantages to dynamic libs. IMHO shared libraries are pretty well understood nowadays, and static linking should be avoided unless you have a very good reason.


That vuln problem works the other way round as well. When a new libpng vulnerability is introduced, all executables using the shared library are affected, while static-lib users with an older version are fine.


But in general, all binaries in a distribution are compiled against the same version of a library, namely the one that is distributed with it. I don't see that changing in a distribution that was fully statically linked.

Even in the unlikely case where binaries are statically linked against different versions of a library, you'd still have to check which version each binary was compiled against.

Of course, you also gain in security, since all kinds of library-preloading attacks are no longer possible.


Many applications started shipping with all the necessary libraries bundled; that's like taking the disadvantages of static and dynamic linking combined. At least they can adhere to the GPL...


There are already ways to mitigate that. One example would be to use mandatory access control to 'host' the library in a separate process which is stripped of all unnecessary rights/privileges.


From the FAQ:

"Also a security issue with dynamically linked libraries are executables with the suid flag. A user can easily run dynamic library code using LD_PRELOAD in conjunction with some trivial program like ping. Using a static executable with the suid flag eliminates this problem completely."

Have the authors actually tried this? Using LD_PRELOAD with suid programs won't work.


Right - at least with glibc, ld.so unsets most LD_* variables and more for both setuid and setgid programs. Grep for UNSECURE_ENVVARS in glibc source to get the whole list and see how it's used. I'd be very surprised if any other libc implementation didn't do the same.
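
For anyone who wants to check from inside a program, the secure-execution flag is visible through the auxiliary vector; a small sketch (getauxval() and AT_SECURE are real glibc/Linux interfaces, the file name and the setuid experiment are just illustrative):

    /* secure_check.c -- prints whether the process runs in "secure-execution"
     * mode (setuid/setgid or elevated capabilities), which is exactly the
     * condition under which ld.so scrubs LD_PRELOAD and friends.
     *
     *   gcc secure_check.c -o secure_check
     *   ./secure_check                    (normally prints AT_SECURE = 0)
     *   sudo chown root secure_check && sudo chmod u+s secure_check
     *   ./secure_check                    (run as a regular user it now
     *                                      prints 1, and LD_PRELOAD is ignored)
     */
    #include <stdio.h>
    #include <sys/auxv.h>

    int main(void)
    {
        printf("AT_SECURE = %lu\n", getauxval(AT_SECURE));
        return 0;
    }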


I'm puzzled by the idea of a system being leaner/faster with n copies of a library in physical RAM rather than 1 copy mapped via VMM into whatever process wants it. IIRC this was the main point of shared libraries, not pluggability or changing code during runtime. Am I missing something?


Static linking doesn't necessarily link the entire library, unless the entire thing compiles to a single .o file. Linkers are smart enough to only link in the object files needed by the program. So assuming you are only linking in well-designed libraries, I guess it's possible that statically-linked software will be smaller since it will leave out the stuff you aren't using.
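
A quick way to convince yourself of that (invented file names; ar, nm and gcc behave as shown for ordinary static archives):

    /* used.c */
    int used_function(void) { return 1; }

    /* unused.c */
    int unused_function(void) { return 2; }

    /* main.c */
    extern int used_function(void);
    int main(void) { return used_function(); }

    /* Build a two-member static library and link against it:
     *   gcc -c used.c unused.c main.c
     *   ar rcs libdemo.a used.o unused.o
     *   gcc main.o -L. -ldemo -o prog
     * The linker only pulls in archive members that satisfy an undefined
     * symbol, so used.o ends up in prog and unused.o does not
     * (check with: nm prog | grep function).
     */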


I recently played around with this, taking a rather small project (around 15,000 lines of code), putting it all into a single file and compiling. It did produce a smaller executable, but the real gain was in making every function static (since it's all in a single file). Doing that, a total of 41 functions were eliminated (either inlined or not used at all).

Was it worth the effort? Eh. But it was instructive and I'd like to attempt (when I get some time) to try a larger project.


Some popular software, like SQLite, combines all of its sources into one big source file, called the Amalgamation, and then compiles that. Their benchmarks show a modest but not negligible performance gain.

There's a lot of work going on in link-time optimization at the moment, both in LLVM and GCC. It's not quite ready for prime time; it still takes more than a small change in your Makefile to deploy it (e.g. dealing with linkers etc.).

With the LLVM toolchain you can compile C code (or other high-level code) into LLVM IR, link the IR files together and run the result through the optimizer.

You will notice that modern optimizers want to inline everything if possible, and a lot of functions will be missing from the resulting binary. Object-file boundaries are perhaps the biggest obstacle to optimization today.
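
For the curious, that IR-level workflow looks roughly like this (a sketch with made-up file names; clang, llvm-link and opt are the relevant tools):

    /* a.c and b.c are ordinary C source files.
     *
     *   clang -O2 -emit-llvm -c a.c -o a.bc     (compile to LLVM bitcode)
     *   clang -O2 -emit-llvm -c b.c -o b.bc
     *   llvm-link a.bc b.bc -o whole.bc         (link at the IR level)
     *   opt -O2 whole.bc -o whole.opt.bc        (optimize across files)
     *   clang whole.opt.bc -o prog              (generate the final binary)
     *
     * Functions from b.c can now be inlined into a.c and vice versa, since
     * the optimizer no longer sees an object-file boundary between them.
     */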


You essentially did Whole Program Optimisation by hand.


We did that (as a developer option) with KDE as well, since KDE 2 or thereabouts?

With the automake-based build system you'd pass "--enable-final" and the build system would cat all the source files together and compile the whole damn thing at once (and really stress-test the kernel and gcc).

With KDE 4 I believe it is -DKDE4_ENABLE_FINAL=TRUE passed to cmake.

It was never quite 100%... sometimes you'd run into things like different source files in a module declaring the same class name, insufficiently namespaced header include guards, etc. But it was definitely interesting.
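
The same "unity build" trick is easy to reproduce by hand on a small C project; a minimal sketch (the included file names are invented):

    /* unity.c -- compile this one file instead of the individual sources.
     * Every function the headers don't need to export can then be made
     * static, which lets the compiler inline or drop it.
     */
    #include "parser.c"
    #include "eval.c"
    #include "main.c"

    /*   gcc -O2 unity.c -o prog
     * The usual failure modes are exactly the ones described above: two files
     * defining a symbol or class with the same name, or header include guards
     * that collide once everything shares one translation unit.
     */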


That approach is common practice when developing for game consoles.


> So assuming you are only linking in well-designed libraries

This is a very big assumption.

Dynamic linkers are also clever enough to only mmap the required parts of a dynamic library.


It's been a while since I've mucked about with these sorts of things, but I'm glad that my thought about that is confirmed: if static linking only brings in used functions, why wouldn't dynamic loading do the same (it does, apparently)? Much of this railing against dynamic loading wasting resources seems like complaining about the wrong things: either bad dynamic linkers or bad libraries, neither of which will be fixed by static linking.

Don't get me wrong, there are places I think that static linking is ideal. I wish more distributors of binary only software would statically link, or at least include standalone required dynamic libraries, rather than rely on system dynamic libraries.

I wish them luck in their experiment and hope they can improve static linking, but I suspect they will learn more about why dynamic loading "wastes" so many resources the more they come into contact with real-world libraries.


AFAIK static linking doesn't bring in "used functions".

It brings in used libraries, all at once. E.g. if you used sincos() from math.a, and math.a contained 47 other math functions, then you'd get all 48 math functions in your static binary just from using sincos().

Someone correct me if I'm wrong but I believe it's only with good whole program optimization at link time that it's possible to truly prove that a function is unneeded and exclude it (and then re-link if needed to re-resolve symbols to their new address in virtual memory).


You are wrong... sort of. It brings in used objects. A library can be made up of many objects; unused ones will be discarded.


Ah, good point, thanks for the correction.


I don't know that well how the Linux dynamic loader works; my comment was based on what is possible in other operating systems in general, and what has been done in operating-systems research.


The whole library gets mmapped; there is no point doing otherwise, since the mapping itself is cheap.

You are probably talking about demand paging (which happens with statically linked binaries as well).


You're probably talking about Linux; there are other types of dynamic loaders out there.


Ah, interesting, do you have some links?


Combined with modern compilers and link-time optimization, they can even leave out code when it's just one .o file. Dead-code elimination ends up working great there.


On top of what others have said, it takes some time to dynamically load a library into an address space. There are tables that may need to be walked and updated with correct pointers. For large libraries, this can be quite measurable. A statically linked executable will be memory-mapped and then brought in lazily as the program runs. And if it is executed again, everything is already mapped and loaded, so there is essentially zero delay. Compare that to dynamic loading, which will require table updates again (especially with ASLR).

This is why I like to rebuild my shell and associated tools as static binaries: shell scripts run 10-20% faster (lots of small programs running repeatedly).


> it takes some time to dynamically load a library into an address space. There are tables that may need to be walked and updated with correct pointers

True, but I'd expect that to be dwarfed by the I/O time required to load even a single 4k page from disk, vs. keeping one copy of a big dynamic library like glibc loaded for the whole system, with fixups done per-process.

Good points about ASLR and static-linking frequently-exec'd-and-exited processes like the shell; and certainly for embedded and HPC it makes sense. I guess the moral, as always, is to measure.


A lot of those pages will already be in the file cache (assuming his use case of small utilities running frequently). Anyway, it should be easy to test: since glibc will always be in memory, any difference in timing between the static and dynamic versions should be those alleged loading costs.


How many minutes are saved by 10-20% faster? A few seconds don't matter for a human.


We have some scripts at work that take 10+ hours to execute. Granted most of that time is in large child processes.


They get a speedup from eliminating the indirection used for calls across dylibs. From the sound of it they're also eliminating position-independent code, which itself can be a reasonable speedup, especially on 32-bit x86.


Maybe if we didn't have gigabytes of RAM these days. I quite like the idea of static linking. It makes software packaging and distribution very very easy and it has some security benefits. Go's build system is a good example of this. This project seems to be dead.


I have mixed feelings about the security gains.

On one hand, you eliminate one attack vector, since you take the dynamic linker out of the equation. On the other hand, you depend on packagers who distribute their programs to rebuild and relink them every time a security issue crops up in a library they link with. I'm not sure I like that, and I don't have the free time I had in high school, when compiling everything by hand seemed really fucking cool.


It's the distribution and packaging parts that I love about it. It might help with cross-platform distribution as well; while dynamic linking does have advantages, I get really frustrated when an old application won't run on a newer system because it requires old libraries that simply can't be installed. Static linking fixes that, and it's why anything I try to write is statically linked for the most part!


Ever looked at Nix and NixOS?


i don't know about leaner, but static linking is faster. symbol resolution is a heck of a task to be run on every program load, especially for larger programs. i remember before linux optimized its dynamic linking, starting a gnome or kde program was a glacial process.

as for static linking being leaner, i have been told an entire shared library needs to be loaded if so much as 1 program needs a part of it, but i doubt it. i don't see why shared libraries can't be demand loaded just like executables are. then again, demand loading a shared library would be a more complex task, and i have reservations about complexity just like the suckless community does.


In generic distributions it would cause bloat, but on a single-purpose embedded device, I found it was a significant size reduction.


If each copy uses less than 1/n of the library and the rest can be stripped away at link time, you need less memory.


On binary sizes:

> Linking a stripped hello world program with glibc results in 600kb. Linking it with uclibc in about 7kb.

That's nice for uclibc, but we're typically linking dynamically. The comparison should be between dynamically and statically linked binaries. A stripped and dynamically linked hello world results in a 6kb program on my machine (glibc).

There's also a lot of handwaving on memory usage in the FAQ.
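
For anyone who wants to reproduce the comparison, a sketch of the measurement (numbers will vary by machine and libc; only gcc, strip, ls and size are assumed):

    /* hello.c */
    #include <stdio.h>
    int main(void) { puts("hello, world"); return 0; }

    /*   gcc -O2 hello.c -o hello-dyn && strip hello-dyn
     *   gcc -O2 -static hello.c -o hello-static && strip hello-static
     *   ls -l hello-dyn hello-static    (on-disk size)
     *   size hello-dyn hello-static     (text/data/bss breakdown)
     * The dynamic binary stays in the single-digit-kilobyte range but still
     * needs libc.so at run time; the static one carries its share of libc
     * with it, so its size depends heavily on whether you link against
     * glibc, uclibc or musl.
     */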


Yeah - so your stripped and dynamically linked executable is just about as large as a statically built one... and yours still has to link against glibc.

I can build busybox (a multi-call all-in-one executable: use symlinks to refer to the binary by the name of a tool and it acts like that tool), with init, a Bourne shell and a minimal set of command-line tools (coreutils and util-linux remakes), into a 600KiB executable statically linked against uclibc. Combined with a Linux kernel, I can boot with it. Meanwhile, my glibc is 2MiB.

Modern computers are really amazing. And it's also amazing that the understandable trend of letting software get bigger and slower as long as it doesn't really cause problems on current hardware has resulted in such astounding (though mostly harmless) waste.

All that said, these stali project pages have existed for years, and there's nothing interesting to show for it. Not that many people really buy into this thing (including me).


Busybox is great, and it's specifically built to be tiny. I don't really know what the stali guys are going for. Are they building something to compete with busybox, or something more general? If it's the latter, then a user will probably have a lot of programs that use way more of the standard library than "hello world" does.

As it stands, the FAQ entry is comparing apples and oranges. Comparing full-featured and dynamically linked programs to statically linked, but feature-limited ones is only interesting if you can get by with the feature-limited version. I suspect we're in agreement.


> A stripped and dynamically linked hello world results in a 6kb program on my machine (glibc).

Is that the proportional size (binary size + glibc size / number of things using glibc), or just the size of the binary?


It's the size of the binary. I don't understand what you mean by proportional size.

If we add more functionality to the hello world program, the dynamically linked version should increase in size more slowly than the statically linked one.


The point of proportional size is to account for the size of glibc itself, which has to be on disk and in memory and so incurs a cost that isn't captured by just looking at the size of the binaries that link against it: you divide glibc's size by the number of applications that use it.

When examining memory usage on Linux systems, one of the common measurement metrics is the Proportional Set Size (PSS), which essentially is what I was asking about (but for on-disk size, rather than just in-memory): http://lwn.net/Articles/230975/
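
To make that concrete with round, made-up numbers: if glibc is about 2 MiB and, say, 100 installed programs link against it dynamically, the amortized (proportional) cost is roughly 2 MiB / 100 = ~20 KiB per program, on top of each binary's own few kilobytes; statically linked equivalents instead pay their full share of library code in every single binary, which is the cost the proportional measure is meant to expose.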


> There's also a lot of handwaving on memory usage in the FAQ.

Totally. Reading the FAQ reminded me of https://xkcd.com/386/


"The reason why dynamic linking has been invented was not to decrease the general executable sizes or to save memory consumption, or to speed up the exec() -- but to allow changing code during runtime -- and that's the real purpose of dynamic linking, we shouldn't forget that."

Not sure about "changing code during runtime", but one of the great benefits of dynamic-link libraries is writing plugins. And I don't think it would take much time for an app to look up its own plugin folder.
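
A minimal sketch of that plugin pattern (the plugin file name and its entry-point symbol are invented; dlopen/dlsym/dlclose are the standard POSIX calls):

    /* plugin.c -- build with:  gcc -shared -fPIC plugin.c -o plugin.so */
    #include <stdio.h>
    void plugin_init(void) { puts("plugin loaded"); }

    /* host.c -- build with:  gcc host.c -o host -ldl */
    #include <stdio.h>
    #include <dlfcn.h>

    int main(void)
    {
        void *handle = dlopen("./plugin.so", RTLD_NOW);
        if (!handle) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return 1;
        }

        /* plugin_init is simply the symbol this sketch assumes plugins export */
        void (*init)(void) = (void (*)(void))dlsym(handle, "plugin_init");
        if (init)
            init();

        dlclose(handle);
        return 0;
    }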


It's not meant to be just another Linux distribution. It will be a whole new system with the Linux kernel at the core, more in line with the BSDs. That's why static linking will make sense. Your updates are just an rsync away.

As I see it, the suckless.org community prefers 'linking' in the form of shell scripts or communication via pipes, ideally through the system's VFS.

You will get a lean and minimalistic base system. If something does not share the same ideals, it will not be in the base system (like glib[c]?, bash, Firefox). It is possible that dynamic linking will be allowed in an /emul chroot, as stated on the 'Filesystem' page.

I see sta.li as a rock-solid, minimalistic base system, which you can use on its own (which is how I believe many suckless.org folks will use it) or as a base that you can build upon. Even more than that, you will be able to use sta.li components in generic distributions with ease, because of static linking.

Go is, in my opinion, the next step for the Bell Labs folks in lean development for the masses. Don't take this too literally ;) Plan 9 [1] was the first step, but people did not want a whole new system. Inferno [2] was the next one, but its VM was too much as well. Go lets people use some of Plan 9's features in digestible form, without the requirement to install a specific system or VM to run your programs. That, to some extent, is what makes sta.li's ideals similar to Go's.

Development of sta.li is moving at a slow pace, but many experiments are under way [3]. Some of them will probably be included in sta.li. Most probably sta.li will include X11, but we are also seeing some developments with Wayland [4].

[1] - http://en.wikipedia.org/wiki/Plan_9_from_Bell_Labs
[2] - http://en.wikipedia.org/wiki/Inferno_%28operating_system%29
[3] - https://github.com/henrysher/sinit - http://galos.no-ip.org/sdhcp - http://git.suckless.org/dmc/
[4] - https://github.com/michaelforney/swc


Building Gentoo with USE=static might be a good way to experiment without having to build an entire new distribution. Try one install with the flag and one without, do the same operations in each, and watch the memory usage and time to completion.


It's not that easy. Many libraries can't be statically linked, and some programs will break if they are statically linked. It is essentially impossible to link statically against glibc, and even if it were possible it would be pointless; you would need another, leaner libc, like musl. Then go look at the list of bugs found by musl: http://wiki.musl-libc.org/wiki/Bugs_found_by_musl
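
If someone does want to experiment without fighting glibc, musl ships a convenient gcc wrapper; a quick sketch (musl-gcc is the wrapper musl installs, the rest is illustrative):

    /* hello.c */
    #include <stdio.h>
    int main(void) { puts("hello from a static musl binary"); return 0; }

    /*   musl-gcc -O2 -static hello.c -o hello && strip hello
     *   ldd hello      (should report "not a dynamic executable")
     *   ls -l hello    (typically well under 100 KiB)
     * That covers simple programs; the breakage mentioned above tends to show
     * up in software that assumes glibc extensions or relies on dlopen().
     */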


There is an experimental musl Gentoo stage now, I think.


Took the words right out of my mouth. Though I think to do this well you'd need to go old school and do a fully bootstrapped (formerly "stage 1") install.


The developer, Anselm Garbe, gave a talk about Stali (among other things) at last year's Suckless conference...

http://www.youtube.com/watch?v=Zu9Qm9bNMUU

I'm not qualified to comment on Stali itself but, more generally, I can't recommend Suckless software highly enough.

Since I switched to Linux a few years back I've found myself using more and more of their programs--DWM, Dmenu, ST, Tabbed, Slock and Surf.

Before, I'd hop from one window manager, terminal or browser to another but, for me, Suckless programs just tend to stick because of the minimal philosophy.


It's an admirable initiative, but I'm pretty sure that at this point sta.li has been in the "design phase" for years. It's vaporware. I'd love to be proven wrong, though.


© 2006-2013 for something still not released certainly isn't promising. Glancing at the git logs (http://git.suckless.org/?s=idle), there is some ongoing work, so the project isn't totally dead at least.


Can you compile the world with static linking on Gentoo Linux or any of the BSDs? If so, that could serve as a substitute.


Can't help you with that; the latest download is from 2009: http://dl.suckless.org/stali/


the maintainer was busy with work for quite a while, so it might pick up now he's not. i don't care too much myself though, linux has become little more than a web browser platform for me, and i'm sure sta.li would struggle to provide that.


One useful set of gcc flags to consider when building a statically linkable library or executable and aiming to reduce executable size is -Os -ffunction-sections -fdata-sections -Wl,--gc-sections.

This causes gcc to put each function in a separate section in the resulting object file, and the --gc-sections option makes ld strip the sections that are not reachable by calls from main (basically a tree shaker).
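
A tiny example of those flags in action (invented file names; the flags themselves are standard gcc/binutils options):

    /* dead.c */
    int live(void) { return 1; }
    int dead(void) { return 2; }   /* never referenced */

    /* main.c */
    extern int live(void);
    int main(void) { return live(); }

    /*   gcc -Os -ffunction-sections -fdata-sections -c dead.c main.c
     *   gcc -Wl,--gc-sections dead.o main.o -o prog
     * Each function gets its own .text.<name> section, and --gc-sections lets
     * the linker drop .text.dead because nothing reachable from the entry
     * point refers to it (verify with: nm prog | grep dead).
     */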


I enjoyed this from their description of their dwm window manager, which took me back to the general state of Linux circa 1999:

"Because dwm is customized through editing its source code, it’s pointless to make binary packages of it. This keeps its userbase small and elitist. No novices asking stupid questions. There are some distributions that provide binary packages though."


I don't miss the general state of linux circa 1999 one bit.


I, for one, miss spending 10 hours fully compiling GNOME 0.99.8 and Enlightenment 0.15 from CVS every week on my Pentium 133 running Linux Mandrake.

Over the 17 or so years I've been using Linux full-time, I've experienced 3 or 4 "sweet spots": times when using Linux was superior to everything else on the market. GNOME 1.x with Enlightenment 0.15 (vs Win98) was the second one (the first, I'm told, was E DR0.13 with CmdrTaco's task-managing app and the Hand of God theme vs Win95). I believe we are currently at the end of another sweet spot, with KDE 4.x being put out to pasture, as it completely destroys the UX of Windows 7/8 and Mountain Lion.

But don't knock the state of Linux circa 1999. Sure, you had to make sure your sound cards had OSS drivers, and Winmodems sucked, but it was a superior experience to Win9x even back then.


I used to run dwm back when I had a machine with just 256MB of RAM. The source code is very clean, and if you just want to edit some keybindings or the colour scheme, it's so well structured that it's pretty much like editing a config file. It also compiles in no time at all.


I wonder if this was how people felt about C in the early days: ASM was the low-level frailty, and C the oh-so-clean, one-make-away world in which to create, extend and modify your system.


Lisp machines existed in the early days of C, so no, probably not.


Weren't they pretty much a niche compared to Unix systems?


They were a niche at the time, but remember, this was the early '70s; so was Unix.


They missed a chance to call a project 'Stalin'.



Perhaps because they had the opportunity to use the sta.li domain, and/or wanted to avoid confusion with the Stalin Scheme compiler?


Stali currently does not exist, but there are other statically linked distros out there, e.g. http://starchlinux.org/


I like what the guys are doing and would love to try it out, but frankly the page hasn't changed much in the last two years, and all the discussion of strategy isn't worth much without a working distribution you can experiment with.

Furthermore, I suspect the scope of stali will be so narrow that I will never be able to run, say, a CL implementation on it. Pretty much the same as Plan 9: I love the design, but it's practically useless for me. :(


For this kind of thing Stal.in is a better domain name.


The group behind Sta.li also makes a great terminal emulator. It is notable for supporting font antialiasing without depending on the large GTK or Qt libraries; it uses Xft directly. And it is much smaller than even xterm or rxvt.

http://st.suckless.org/


I was dreaming of it, and knew that no matter how much it itched, I could never scratch it. Thanks.


Sounds good. What about Go-based tools (considering that Go generates static binaries by default)?


Ever since uriel died, the suckless project doesn't really appreciate Go anymore.


many people in uriel's cat-v community don't either. part of the problem is go's community and its web focus. the other part is cat-v is now focused on a plan 9 fork, and go solves problems which don't come up much on plan 9, or which are already largely solved.


Static linking is also relevant in a discussion about coupling and OS dependencies: http://unix.stackexchange.com/a/38914/17683


With the use of link-time code generation too, this could perhaps be better than expected?




