As someone who has actually built binaries to run across multiple distros, I can say this doesn't address even half the issues. Very few people are concerned about binaries that run on more than one architecture; distributing binaries that run on more than one distro is the harder problem.
This solution doesn't solve the hard problems, but solves an easy one in an uninspired fashion. Rejecting these patches wasn't a case of holding back innovation, but merely holding back solutions that more experienced developers felt were not appropriate for the platform.
There was a good reason for that, and playing armchair critic after reading a sympathetic story from a developer who is understandably hurt after getting his feature rejected just doesn't help anyone.
The kernel is better managed than most people give it credit for, and it's exactly because of this that hacky incremental solutions get rejected, even when they might legitimately produce a small benefit. Eventually the overall solution to the problem FatELF is trying to solve is going to have to be something different.
I feel sorry for the developer; no one likes getting a feature rejected. But it was the wrong solution and the technical merits must trump other concerns.
But it was the wrong solution and the technical merits must trump other concerns.
Your post is very long, but I'm curious: why are universal binaries the wrong solution?
On the Mac, it's the cornerstone of their transparent multi-architecture support.
Apple (and previously NeXT) inarguably demonstrated its value and ease-of-use over the past 20 years. Producing x86-32, x86-64, armv6, armv7, and PPC binaries all from the same source code -- and having them work the first time -- is incredibly simple on Mac OS X.
NeXTSTEP (the OS) ran on 68k, x86, PA-RISC, and Sparc processors, and it was equally easy to build software that ran on those machines.
I re-read your post and wanted to clarify what you meant by:
Apple (and previously NeXT) inarguably demonstrated its value and ease-of-use over the past 20 years. Producing x86-32, x86-64, armv6, armv7, and PPC binaries all from the same source code -- and having them work the first time -- is incredibly simple on Mac OS X.
Presumably you aren't under the impression that FatELF has anything to do with making source code easily compile to more targets. FatELF merely lets you take the binaries that result from compiling a source base for one arch and then another, and glue them together into one file.
The wording on your post wasn't entirely clear, though I'm guessing we're on the same page.
OS X has a unified set of libraries to work with; Linux does not. This makes the problem much more complex. Creating a binary that simply runs on different distributions, all on the same architecture, is already a fundamentally hard problem. Stacking together binaries for multiple architectures doesn't result in a universal binary. At worst you have to stack together different binaries for different combinations of libraries and architectures. If you assume even just the latest versions of the top 5 distributions running on the top 5 supported architectures for Linux, you may well end up with 25 different binaries for FatELF to glue together, making every program roughly 25 times larger. This is unacceptable in terms of space, and unacceptable in that the result still isn't universal: not all arches are supported, and you only get the latest versions of the popular distros.
This is why I claim FatELF does not solve the hard problems with universal binaries. The only thing it might solve is letting someone distribute a binary for a particular distro that runs on multiple architectures of that distro. But they can already do that with shell scripts, it's an uncommon case, and most people would rather just download and store the one appropriate binary.
(This last is easy with a functional package management system. You should keep in mind that OS X does not have a functional, complete package management scheme that allows users to discover, acquire, and manage 3rd-party packages. Not all Linux distros do either, but those that do are less in need of universal binaries. This is really a separate discussion, though, that gets to the heart of how software ought to be distributed. There are many blind spots in OS X's software strategy, some of which Sparkle is making a half-baked (not Sparkle's fault) attempt to fix.)[1]
At the end of the day you can't simply assume that the solution for OS X is the right one for Linux, and you'd make an even bigger mistake assuming that the developers who rejected these patches weren't entirely aware of OS X's excellent use of this technique, or that they were doing this for anything but sound technical reasons. (I.e., it's not just NIH; they would happily adopt this if it were a reasonable solution to Linux's unique[2] challenges with binary compatibility across "Linux".)
[1] There are many things wrong with this and most package management systems do not incorporate third party repos/packages as well as they should, which IMO is the larger problem here. Not that corporations should have to manage repos for every distro, but that becomes a separate mess.
[2] Perhaps read as special.
Aside #432: As you noticed my replies can often be anything but concise. I apologize if that makes things hard to follow.
He could write the patch, try it out, and try to persuade the mainline kernel (and glibc, and gcc, and...) developers to include it. If the feature is really that useful, he can even maintain it separately, or some Linux distribution can. All of this would be impossible with closed software. Innovation the open-source way, sadly, means that there are lots of ideas, and most of them get thrown out.
Hostile reaction from kernel developers? Well, maybe it's a little sad, and they could have trashed the patch more politely, but eventually they had to. FatELF adds stuff to the kernel, glibc, and a few other parts of the system without really solving much. He could try something with binfmt_misc, done completely in userspace, instead of adding junk to the kernel and glibc itself. (By junk I mean something unnecessary that can be done in userspace, or not at all.)
On a German Linux news site (prolinux.de), a commercial Linux game developer (the one behind Ankh 2, IIRC) once commented that they could not get their game running on about 2-3% of their customers' Linux computers, despite their best efforts.
I dunno whether FatELF would have solved problems like this in the long run, but there are problems. You may not be affected, but that doesn't mean others aren't.
No, I don't think it will solve this kind of problem. It does only one thing: pack two different binaries into one file, which is also a binary, with some extra logic in the kernel to choose between them, so it is 'transparent'. While this makes some things easier (if you really don't know whether you want to download that strange 'i386' file or the even stranger 'x86_64' file, you download one larger 'fatelf' containing both), it doesn't solve the real problems. Distributions have spent quite a lot of effort to make both 32- and 64-bit binaries run on a single system. This is just a hack, nothing more. And it can make the mess even worse. Is there an x86_64 binary in this? Is the correct ELF packed somewhere? How do I know?
Well, it is the cornerstone of Apple's highly successful and entirely transparent transition from PPC to x86, including drag-and-drop application installation/uninstallation. Couple it with Apple's compiler drivers and well-designed multi-architecture SDKs, and it becomes dead-simple to support multiple architectures and OS releases. Just build your binaries with -arch i386 -arch ppc -isysroot /Developer/SDKs/MacOSX10.4u.sdk
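For concreteness, the whole workflow the parent describes is roughly the following; the SDK path and flags come from the comment above, while the file names (main.c, myapp) are purely illustrative:

    # Build both architectures in a single invocation.
    gcc -arch i386 -arch ppc \
        -isysroot /Developer/SDKs/MacOSX10.4u.sdk \
        -o myapp main.c

    # Inspect the result, or thin it down to one architecture if needed.
    lipo -info myapp          # Architectures in the fat file: myapp are: i386 ppc
    lipo myapp -thin ppc -output myapp-ppc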
It also supports Apple's transparent selection of armv7/armv6 binaries on the iPhone 3GS.
I'd say it's pretty useful, and not really a hack at all.
The only arguments I've heard against it involve package managers and shell scripts, which demonstrates such a remarkable lack of understanding of ease of use and user behavior that I don't even know where to start, other than to say: This is why popular adoption of Linux on the desktop is not going to happen any time soon.
Apple, on the other hand, doesn't have anything even close to a package manager, for example. Also, "transition". There is no transition for Linux. Nobody is really interested in completely phasing out an old platform and porting everything to a completely different new one, which is what Apple did. Apple's Mac OS has rather limited hardware support; in fact you shouldn't even try to run it on anything that doesn't have an Apple sticker. Linux is expected to run everywhere, on everything.
I can't see any possible difference between the FatELF solution and the package-manager-and-scripts solution for an end user. It's not about ease of use, because there is no difference.
Also, "transition". There is no transition for Linux.
ARM netbooks? That said, part of why there's no 'transition' is that providing easy-to-use (for consumers) third-party application binaries is this side of impossible given the lack of stable APIs across distributions and releases.
I can't see any possible difference between the FatELF solution and the package-manager-and-scripts solution for an end user. It's not about ease of use, because there is no difference.
I can drag-install any third-party application I want to download, and I can expect it to work on any machine. The author doesn't have to wait to get it packaged by a distribution, set up a package repository, etc. I don't have to wait 3-12 months for the latest version.
I can drag it over to my PPC iMac, and it'll work there, too.
An ARM netbook with its limited SSD space is really a great use case for something that makes binaries two (three, or even more) times larger, when I only need the ARM binary. (Yeah, it's not that much on a >100 GB HDD; still, it can be quite a lot on a 4 or 8 GB SSD, especially when the extra data is completely useless on that machine.)
It's not different, because FatELF doesn't help with this. You can still make some tiny wrapper and have exactly the same thing. The vendor still has to make a package fit for your system: the set of libraries you have, the correct versions of them, and so on. Yes, Mac OS X has two or three recent versions, mostly mutually compatible. If you were trying to make a package for only one or two mostly compatible Linux distributions and one or two of their versions, it would be just as easy. But you aren't, and FatELF doesn't make any difference.
I'm not saying this is a hack, but just because it has proved a useful strategy in Apple's situation doesn't mean it is the right strategy for Linux.
Apple succeeded by going against the conventional wisdom; now, in many ways, Apple is the conventional wisdom. In the end, I think the most important thing is for people to do what feels right for them, not blindly copying what has worked for others.
Well, it would sure make running software on a mixed cluster a lot easier.
But in any case - the proliferation of package "managers" is a bigger problem. I really hope a few Linux distributions die so that we can at least have standardization in an evolutionary manner.
Exactly! And that sounds like it would have been the smartest move from the start. Something like this doesn't need to be in the kernel. It doesn't need to be in glibc, and doesn't need to be in ld. There are probably performance advantages that you miss out on by using binfmt_misc, but that's the perfect way to get your solution working without any external dependencies. You create two userspace tools: 1) an app that stitches together multiple binaries into a FatELF file, and 2) a launcher program that, when given the path to a FatELF file, picks the correct embedded binary and runs it. #2 might need some fancy tricks with the dynamic loader, but it's certainly doable.
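As a sketch only, and assuming the "FatELF file" is allowed to be a plain tar archive rather than a real ELF container, tool #1 could be as small as this (the name fatpack and the member-per-architecture layout are invented here):

    #!/bin/sh
    # fatpack (hypothetical tool #1): bundle per-arch builds into one file.
    # Usage: fatpack out.fat x86_64:build64/myapp i686:build32/myapp ...
    out=$1; shift
    stage=$(mktemp -d) || exit 1
    trap 'rm -rf "$stage"' EXIT
    for spec in "$@"; do
        arch=${spec%%:*}                 # member name is the architecture
        cp "${spec#*:}" "$stage/$arch"
    done
    tar -c -f "$out" -C "$stage" .
    echo "wrote $out"

Tool #2 would then extract the member matching uname -m to a temporary directory and exec it.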
Assuming people do like it and use it, only then do you try to get it integrated into the system proper. Why try to push something into three pieces of software very core to Linux and Linux development when there's no evidence that there will be any developer, distro, or user adoption?
I don't think this is specific to open source; anyone who's worked at a closed-source company can certainly tell the same story about stuff being quashed by egos higher-up. At least in the open source community, if you feel strongly enough, you can fork it and go off without the egos.
This is sort of an interesting counterpoint to the whole "open source fosters innovation" meme open source types love to throw out.
I may be missing something, but it seems to be fostering innovation in this example as well. He has managed to create something that at least a few people consider innovative, thanks in part to the open licenses of the sources this work is built on, and he is free to distribute it however he likes.
Or are you arguing that it's not innovation unless some other group can be forced to take up work on your behalf, by merging and/or maintaining code they don't want or agree with?
I think it has more to do with one open source project in particular, the kernel, which has a lot of history. The one time I tried to get a patch in, it was very difficult and I ultimately let it be, despite people regularly saying they thought it was a sensible idea. Indeed, there was never any real technical opposition, just a sort of inertia to ignore stuff.
Imagine a binary-only distribution of some application that runs on all hardware platforms (because we get that a lot in OS X and we used to get it on NeXTSTEP as well).
Oh, so having future commercial packages that many people depend on, like Adobe Photoshop or 3D Studio Max ... packaged and distributed as a universal binary for every major distribution and every architecture ... is of no importance whatsoever?
This is a major roadblock for commercial third-parties that would like to distribute ports of their software for Linux.
You can do this without FatELF. Put everything you have into a single package and add "shell scripts and flakey logic to pick the right binary and libraries to load". (He fails to mention that FatELF is nothing more than this, only hardwired into the kernel.)
What you are demonstrating is the userland toolchain. What was suggested above is an improvement by not requiring meddling with the kernel. Nothing prevents a userland toolchain for Linux from supporting what you are describing above without kernel modifications.
What you are demonstrating is the userland toolchain.
No, what I'm demonstrating is an end-to-end architecture-transparent platform, which includes a complete userspace toolchain, a universal binary format, and necessary kernel and dynamic linker support.
What was suggested above is an improvement by not requiring meddling with the kernel
I'm not sure I understand how "meddling" in the kernel is a bad thing when it provides for user-transparent execution of multi-architecture binaries, including transparent emulation of binaries that lack support for the host architecture.
It's not as if it's complex or dangerous to parse the Mach-O or FatELF formats, and if you take a page from Mac OS X or qemu, you can even do transparent emulation in userspace.
Nothing prevents a userland toolchain for Linux from supporting what you are describing above without kernel modifications.
How is it useful to build an easy-to-use multi-architecture binary if the kernel can't actually execute it?
Why are you afraid of doing simple parsing[1] of the binary? The kernel already does this -- how do you think ELF loading and shell script execution works?
I can imagine this conversation 30 years ago[2] -- "Why should we add shell script shebang parsing/execution support to the kernel? Why not just glue together loader executables that load the shell script"
It shouldn't be hard to make a binfmt_misc handler that rips the correct ELF out of the FatELF thing (or rather a simple tar) and feeds it to the kernel, so you can just type ./fatelf-something. You can also make it a shell script with the binaries embedded in it that chooses the correct one and executes it. Both of these solutions work without a single line of kernel code (really), and I can't see anything that FatELF has and these don't.
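For what it's worth, registering such a userspace handler is a one-liner once the handler itself exists; the "FatE" magic bytes and the /usr/local/bin/fatrun interpreter below are hypothetical, not FatELF's actual on-disk format:

    # As root: mount the binfmt_misc filesystem and register a handler that
    # hands any file starting with the made-up magic "FatE" to a userspace
    # interpreter, which picks and runs the binary matching `uname -m`.
    mount -t binfmt_misc none /proc/sys/fs/binfmt_misc 2>/dev/null
    echo ':fatelf:M::FatE::/usr/local/bin/fatrun:' \
        > /proc/sys/fs/binfmt_misc/register

    # After that, a "fat" bundle runs like any other executable:
    chmod +x ./fatelf-something
    ./fatelf-something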
If a proprietary vendor is willing to do all the work to cross-compile, package, and test on a variety of architectures and distributions, I can't believe they would be deterred by the need to link a binary for each package. With FatELF they would have to do that anyway, and then glue all those binaries together before building all their packages.
Yes, and NeXT could have asked its ISVs to do the same thing back when they ran on 040, x86, SPARC, and PA-RISC.
The motivation, though, was not just to make things easier for the ISVs, but also to make things easy for the ISVs' customers. Simplifying the process of buying and installing software was the real thrust of the endeavor. [1] There is real economic value in making things easy for customers.
Why, for example, would web sites offer 1-Click ordering, when entering a credit card number is just a few more keystrokes? Because simplicity makes money flow, and is good for a market.
The Year of the Linux Desktop will never come until these issues become important to the community. Which is to say that it will never happen, because the community, as a whole, has no economic incentive to lower the bar for customers like this.
Ryan Gordan probably could have figured that out sooner, but I'm glad he was an optimist for a little while, at least.
[1] There was also a secondary benefit for system administrators. You could install a single copy of a 4-way fat binary on an NFS share, and a NeXTSTEP workstation of any architecture could launch it over the network.
Except Linux users don't use four platforms on the desktop. They use (nearly entirely) one. Those who use x86-64 are almost entirely clued-in enough to grab the correct architecture. And even if they weren't, you could get around this simply by making your installer a shell script that chooses the right .deb based on the output of uname -m. Hardly overly complex.
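Something along these lines (package names invented) is the entire amount of installer logic being described:

    #!/bin/sh
    # install.sh: pick the .deb that matches this machine and install it.
    case "$(uname -m)" in
        x86_64)  pkg=myapp_1.0_amd64.deb ;;
        i?86)    pkg=myapp_1.0_i386.deb ;;
        *)       echo "sorry, no build for $(uname -m)" >&2; exit 1 ;;
    esac
    exec dpkg -i "$pkg"     # typically run via sudo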
I guess that's fine, as long as nobody has ambitions for Linux adoption to grow beyond the relatively small already-clued-in demographic. There's nothing wrong with wanting to stay small.
FWIW, I used Debian/PPC at home for years. But, then, I was also among the small set of people who actually ran NeXTSTEP on a "gecko" PA-RISC machine and a SPARC laptop made by Tadpole. That's the story behind my perspective, anyway.
This may be the Achilles' heel of the freenix world: since almost anything can be worked around with some scripting, people actively argue against putting facilities like this at the right place in the system's architecture. Add 1 to the Cathedral's score.
... Except why exactly are the kernel and libc the right place to put this?
Remember that stuff in the kernel runs outside memory protection. Moving complex stuff out of the kernel is pretty much always a win when there are no pressing performance reasons to do otherwise. This code would only run once each time a program is started, so performance is certainly not a good reason to put it in the kernel.
What I really want to reiterate here is that using FatELF binaries would in absolutely no way make shipping stuff to multiple different Linux platforms easier. The reason it is hard is that when someone says Linux, pretty much the only guarantee you have about the system is that it runs a Linux kernel, and even that can potentially be so old or so strange that you can't trust anything about it. All FatELF gives us is stuffing multiple binaries into a single file, and we can do this already without much fuss. It does not in any way give us true "Universal Binaries", because to do so we would have to either agree on a common subset of libraries a Linux system should always ship (and agree on indefinite binary compatibility for those libraries), or ship a meaningful portion of the entire platform in every binary.
If you want what FatELF gives, you can get it with a 20-line shell script you concatenate onto a bunch of binaries, and it could be argued that that's the cleaner and better place to put it, considering it's only an ugly hack anyway. It's just that people look at FatELF and see something that isn't there.
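Roughly, that 20-line script could look like the following, assuming the per-architecture binaries are appended to it as a tar archive whose members are named after `uname -m` (the whole layout is invented for illustration):

    #!/bin/sh
    # Self-selecting launcher: everything after the __PAYLOAD__ line is a tar
    # archive containing one binary per architecture (x86_64, i686, ...).
    arch=$(uname -m)
    tmp=$(mktemp -d) || exit 1
    trap 'rm -rf "$tmp"' EXIT
    skip=$(awk '/^__PAYLOAD__$/ { print NR + 1; exit }' "$0")
    tail -n +"$skip" "$0" | tar -x -f - -C "$tmp"
    if [ ! -x "$tmp/$arch" ]; then
        echo "no binary for $arch in this bundle" >&2
        exit 1
    fi
    "$tmp/$arch" "$@"
    exit $?
    __PAYLOAD__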
The architectural location that I was alluding to is some fat binary format itself, because this has the greatest impact on customer and user experience. How you parcel out the supporting code between kernel and userland would flow from that invariant...if I were Linus for a day.
Or, perhaps even better, adopt a small, architecture-neutral IR like LLVM-BC, so that a workstation of any architecture could choose to compile it just-in-time at launch, or at install time.
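A hedged sketch of what that could look like with today's LLVM tools (file names are illustrative, and in practice bitcode is not fully target-independent -- ABI and pointer-size details get baked in -- which is part of why this is harder than it sounds):

    # Vendor side: ship LLVM bitcode instead of machine code.
    clang -O2 -emit-llvm -c main.c -o myapp.bc

    # Install time (or first launch) on the user's machine: lower the
    # bitcode to native code for whatever CPU this happens to be.
    clang -O2 myapp.bc -o myapp
    ./myapp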
Either way, the greatest tragedy here is small thinking. I would love to see Linux rise to the level where users don't need to know what an instruction-set architecture is.
It really is just a bad idea. Or at least one that is working against ideas central to the way Linux is currently used.
Universal/fat binaries made sense on the Macintosh because there is no concept of program installation on that system. While I think that eschewing installation is generally a better design, one drawback is that if you want to support multiple architectures in one application, you have to do the architecture check when the program is loaded.
Central to Linux and Windows is the idea of program installation, either through packages or installer programs. No one is interested in making it so that you can drag and drop items in Program Files or /usr/bin between systems and expect them to run, which is the only thing that using fat binaries really gets you over other solutions.
Nearly all of the commercial binary-only software I have seen in Linux (and other Unixes) uses an installer program, just like windows. There is no technical reason why such an installer couldn’t determine the architecture to install.
Not quite. Current Linux packaging formats encourage the developer to build one package per architecture. This means presenting several download choices to the user, which can be confusing. The user doesn't always know what his architecture is.
The problem can be solved in two ways:
1) Distributing through a repository and having the package manager auto-select the architecture. However, this is highly distribution-specific. If you want to build a single tar.gz that works on the majority of Linux distros then you're out of luck.
2) Compiling binaries for multiple architectures, bundling everything into the same package, and having a shell script or whatever select the correct binary.
While (2) is entirely doable and does not confuse the end user, it does make the developer's job harder. He has to compile binaries multiple times and spend a lot of effort on packaging. Having support for universal binaries throughout the entire system, including the developer toolchain, solves not only confusion for the end user but also hassle for the developer. On OS X I can type "gcc -arch i386 -arch ppc" and it'll generate a universal binary with two architectures. I don't have to spend time setting up a PPC virtual machine with a development environment in it, or set up a cross compiler, just to compile binaries for PPC; everything is right there.
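For contrast, a rough sketch of the Linux-side equivalent (exact target triplets and cross-toolchain packages vary by distribution, so treat these as placeholders):

    # One build per target, each wanting its own compiler and sysroot:
    gcc                      -O2 -o out/amd64/myapp main.c
    i686-linux-gnu-gcc       -O2 -o out/i386/myapp  main.c   # separate cross toolchain
    arm-linux-gnueabihf-gcc  -O2 -o out/armhf/myapp main.c   # another one
    # ...and then each result still has to be packaged and shipped separately.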
I think the ultimate point is not to make impossible things possible, but to make already possible things easier, for both end users and app developers.
Somebody has to test and debug your app on actual PPC hardware.
Our Xcode-supported unit tests transparently run three times -- once for x86_32, once for x86_64, and once for PPC. The PPC run occurs within Rosetta (ie, emulated).
If the tests pass, we can be reasonably sure everything is A-OK. In addition, we can do local regression testing under Rosetta (but it's rarely necessary -- usually everything just works).
The only native PPC testing we do is integration testing once we reach the end of the development cycle.
I doubt anyone would have a problem with toolchain support being added. But you don't need a kernel patch to fix the user end of the equation. It doesn't add anything that can't be provided just as conveniently (or more so - the shell script approach doesn't require any changes on the user's side) without it.
To your point one, I agree. I also fail to see how universal binaries help. The problem isn't with supporting multiple architectures, it's with supporting multiple distributions.
Regarding point two, the developer is stuck with the packaging hassle regardless. The binary goes one place, config files and man pages in others; maybe you want a launcher in the GNOME and KDE menus... You are stuck with writing an install script anyway.
Yes, universal binaries do not help when it comes to supporting multiple distributions. However I have a problem with the fact that Linux people downright reject the entire idea as being "useless". This same attitude is the reason why inter-distro binary compatibility issues still aren't solved. Whenever someone comes up with a solution for making inter-distro compatible packages or binaries, the same knee-jerk reaction happens.
And yes, the developer must take care of packaging anyway. But that doesn't mean packaging can't be made easier. If I can skip the step "set up a cross compiler for x86_64", then all the better.
I think this guy is trying to solve a problem that doesn't exist on Linux. On Linux, every piece of software is compiled for the architecture you want, so there is no need for a fat binary. Mac OS X needed this because Mac users mostly run commercial software, so it is better to have a single binary for all platforms.
Enthusiasm is not a substitute for pragmatics; all the other contributors are just as enthusiastic, but they know their patches are not entitled to merger with the main tree willy nilly, without there being a very good reason.
I don't see what's stopping him from forking Linux and glibc. If this is really better, everyone will just use his fork. (It worked for my fork of cperl-mode, anyway...)
The point is to get patches accepted without forking. Linux and glibc aren't exactly small. Do you honestly see a single man maintaining millions of lines of forked code as well as handling packaging and stuff?
It's not as bad as it sounds; a great many of the http://git.kernel.org/ repos are actual forks of the kernel, regularly pulling changes from vanilla, resolving the occasional conflict, and maintaining/developing their own patches.
Actually it is as bad as it sounds. Although git makes it easier to merge with mainline, in the end it still requires an active maintainer who manually merges and tests stuff once in a while. Maintaining a fork requires one's constant attention and diverts one from doing other - maybe more useful - things.