Something that has been bugging me for a while: Why is cross compiling so hard? After all, compilers are just C programs (or C++/Go/Rust/… programs). It's not like a target implementation is using some magic instructions from the host to generate the assembly, or that it's somehow harder to calculate 32-bit offsets for the compiled program when the compiler is running on a 64-bit host environment.
Or is it about the helper programs / scripts around the core compilation process that are too often hard coded to read out the configuration from the host system? So basically there is no technical hurdle, just the social norm that target arch == host arch, and thus make procedures aren't sufficiently tested for cross compilation from the get go?
Different toolchains support it differently. So for example, if you want to cross-compile with gcc and ld, you need to first compile gcc and ld for the target architecture. This is because you pick the host and target at compile time. So you get aarch64-elf-gcc instead of just plain gcc. This means that you either hope that someone else has already done this for you, or you have to build it yourself. It's not impossible but it is a pain. Contrast this with llvm; clang and lldb support multiple hosts and targets in one binary. You pass --target (or whatever) and you're good to go. This removes that setup step.
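Roughly like this (the target triple and sysroot path are just examples, and assume the cross toolchain / target sysroot are already installed):

    # GNU: a separate toolchain build per target, chosen when gcc itself was built
    aarch64-linux-gnu-gcc -o hello hello.c

    # LLVM: one clang binary; the target is a runtime flag
    clang --target=aarch64-linux-gnu --sysroot=/opt/aarch64-sysroot -o hello hello.c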
The outside world. If you want to target an OS that's not your OS... you need the API for that OS. On Linuxes, that's the actual syscall interface, but on many other OSes, for example, Windows, the syscall interface is not stable. You must use the library the OS provides. This means that, if you're on Linux and you want to cross-compile to Windows, you need to get a copy of all the stuff that's in C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.10.24728\lib or whatever, because your program is going to need it.
Is your project all in one language? If not, you'll need to do the above in multiple ways, maybe. For example, a pure Ruby program is easy to "cross compile": you just cross-compile the interpreter. (I don't know how hard that actually is these days, but it's an analogy, work with me here.) But if you use C extensions, you also need to have them cross-compiled. Use both Rust and C? You'll need the toolchains for both, and to coordinate them. (We've tried to generally make this very easy for Rust, but there are still details where it's not.)
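For the Rust-plus-C case, the coordination today looks roughly like this (the env var names follow the cc crate and cargo conventions; the toolchain names are just examples):

    rustup target add aarch64-unknown-linux-gnu
    # tell the cc crate and cargo which cross C compiler / linker to use
    CC_aarch64_unknown_linux_gnu=aarch64-linux-gnu-gcc \
    CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER=aarch64-linux-gnu-gcc \
        cargo build --target aarch64-unknown-linux-gnu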
Then, you need to make sure that the software understands that host != target. So for example, in Rust, we now have compile-time function evaluation. The way that this works is, we include a full interpreter in the compiler, and it runs CTFE stuff with the interpreter set to the properties of the target, not the host. Not all languages do this kind of thing. There are tons of other ways this can possibly go wrong too, because it's not a case many people think about. For example, maybe the software uses Linux-specific stuff. You can't just automatically cross-compile it then. But it's not like anything tells you you're using OS-specific stuff, so if you never attempt to cross-compile, you may not even realize.
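A tiny illustration of the host != target point (the 32-bit target is just an example and assumes a suitable linker is available):

    # const WORD: usize = core::mem::size_of::<usize>();
    # the constant above is evaluated for the *target*, not the machine doing the compiling:
    cargo build --target x86_64-unknown-linux-gnu   # WORD == 8
    cargo build --target i686-unknown-linux-gnu     # WORD == 4, even on a 64-bit host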
> Contrast this with llvm; clang and lldb support multiple hosts and targets in one binary.
I wonder, could the GNU toolchain devs be convinced to add support for multiple hosts/targets in one binary? Do they have a philosophical objection to it? Or is it simply that it is a lot of work and nobody has done it? It would make cross-compiling with the GNU toolchain a lot easier.
Simply compiling with gcc on a single architecture (no cross compile) requires more than one binary. In my experience (20 years of embedded work), cross compiling is hard not because of the compiling, but because of the linking. The linker has to know where all your target architecture libraries are so it can link your executable.
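In other words, the compile step is rarely the problem; the link is, because the target's libraries have to come from somewhere (paths here are illustrative):

    aarch64-linux-gnu-gcc -c main.c -o main.o                              # compiling: easy
    aarch64-linux-gnu-gcc --sysroot=/opt/aarch64-rootfs main.o -o app -lm  # linking: needs the target's libc, libm, crt*.o, ...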
That said, this is getting easier and easier to do. I think this is because Debian packages have to be architecture-specific even just for Intel architectures now, to handle both x86 (32-bit) and x86_64 ISAs. Once you have the environment, creating and using a different compiler executable is the easy part.
It's one reason to favor Go over Rust for a particular project: it's much easier to set GOOS and GOARCH and run go than to download and install a complete target with rustup, assuming the target is actually available to install.
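Concretely, the two workflows look something like this (triples/values are examples):

    # Go: cross compilation is built in (for pure-Go code, with cgo disabled)
    GOOS=linux GOARCH=arm64 go build ./...

    # Rust: fetch the target's standard library first, then build
    rustup target add aarch64-unknown-linux-gnu
    cargo build --target aarch64-unknown-linux-gnu   # still needs a linker for that target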
Could Cranelift or similar allow Rust to support multiple targets more easily than LLVM does?
Cross compiling isn't hard, not really. But software building is hard. There are a lot of libraries and linkage magic needed to turn source files into something executable. And for a host system, all that work has been done for you.
So if you want to have a cross toolchain for, say, RISC-V on x86, it's not enough to have the compiler and assembler: you need a RISC-V linker script (for your particular RISC-V target), a RISC-V build of the C library (again target-specific), the libgcc helper library, and headers for all the relevant platform code (they seem simple, but in my experience headers are routinely the biggest mess).
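For a bare-metal RISC-V target, all of those pieces end up on one command line, something like this (the linker script and file names are made up):

    riscv64-unknown-elf-gcc -march=rv64imac -mabi=lp64 \
        -T my_board.ld -nostartfiles \
        start.S main.c -o firmware.elf
    # -T: the target's linker script; the C library, libgcc, and headers
    # all have to come from the riscv64-unknown-elf toolchain's sysroot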
And unlike the host toolchain where there is an unambiguous "correct" answer for all that stuff, a cross environment might be asked to support all kinds of crazy alternatives. Your ARM gcc might be used to produce a binary to run on Fedora, or for a busybox/uclibc build on an ancient kernel, or to target Zephyr (my world)...
It gets messy very fast. But it's not "hard". Putting it together for one system isn't any harder than it is for any other.
I cross-compile an entire Linux distribution (even if you're compiling on Linux/x86_64 FOR Linux/x86_64, it's still going to get cross-compiled since it's a different platform). This is around 300 external/upstream packages and 100 local packages. The vast majority of software that uses GNU autoconf has little to no problems. Some software that uses CMake has some small problems.
Building the cross-compiling toolchain is relatively easy compared to dealing with the problems in the software itself.
I ran into a bug in bash [0], for example, that you would really only hit if you were cross-compiling it (there are other ways to hit it, since it wasn't caused by the cross-compiling itself, just by features that get disabled automatically when cross-compiling). It took a while to convince people that this bug existed because it wasn't obvious that it was triggered by being cross-compiled.
The real problems come from "modern" package managers as well as scripting languages in general.
Scripting languages and their extension mechanisms are, in general, absolutely ignorant of cross-compiling, and the idea that you might be building an extension to the language for a different platform than the running interpreter is completely foreign to most. Ruby fails especially hard here -- it looks ONLY at the running interpreter to determine how to build the extension. Python is a bit better, but still rigid about versions. Perl's cross-compilation story requires having access to a system to SSH into (though there are alternative autoconf-based build systems that make this sane -- I don't know if they work for extensions; I just abandoned the idea of including Perl based on how poor their build system was). Tcl was the best, since its extension system (TEA) is autoconf-based.
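For instance, Ruby's mkmf derives everything from the interpreter that is running it, so there's nowhere to say "build for a different machine". A quick way to see what it would use:

    ruby -rrbconfig -e 'puts RbConfig::CONFIG.values_at("CC", "target_cpu", "arch").inspect'
    # => always the *host* interpreter's compiler and architecture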
LuaJIT can't be cross-compiled from a 32-bit system for a 64-bit system [1], since its build runs host-side helper tools that have to match the target's pointer size, and that fails on a 32-bit host.
Erm, GCC is a huge problem for cross-compiling, because its compilation target is determined at compile time -- I mean at GCC's own compile time. You have to recompile GCC to target a different machine.
Clang isn't broken like that, and for Clang you are right that the issue is normally dependencies. Again, GNU's libc is kind of broken here.
Cross compiling with Clang and Musl is not too bad.
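Something like this, assuming you have a musl-based sysroot for the target unpacked somewhere (the path and triple are examples):

    clang --target=aarch64-linux-musl --sysroot=/opt/aarch64-linux-musl \
          -fuse-ld=lld -static -o app app.c
    # static linking against musl sidesteps most of the runtime-library mess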
This IMHO is a huge problem for embedded development. I want to be able to just select what target I want and recompile a project. Even in small embedded systems you still need target-specific headers and startup/.bss code, but that's more of an IDE problem on top of the compiler not supporting all architectures out of the box. A C library that uses no hardware-specific features should recompile trivially for any supported target, for example.
If my chip vendor decides to replace the ARM core on a SoC with a RISC-V core but leave all peripherals and memory mappings the same, I want to simply change the target and rebuild.
> If my chip vendor decides to replace the ARM core on a SoC with a RISC-V core but leave all peripherals and memory mappings the same, I want to simply change the target and rebuild.
I mean, I don't know what distro you're working on (and whether they package these toolchains conveniently), but on Arch you just change your configured target in your conf/build system, and make sure you have the other one installed. If you're using autotools or cmake, you may not even need to change the configure scripts.
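With autotools or CMake the target usually comes in from the outside anyway, e.g. (the toolchain-file name here is hypothetical):

    ./configure --host=riscv64-linux-gnu --prefix=/usr
    # or, for CMake:
    cmake -DCMAKE_TOOLCHAIN_FILE=riscv64-toolchain.cmake ..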
If you're doing embedded with memory mapped peripherals, surely you're building all your deps statically anyway.
If you are memory-mapping stuff like that, you most likely already have your own linker script (or something from the vendor of arch1), and you can, with not much pain, adapt the linker script for arch2 to use the same base addresses and magic numbers. I mean, I haven't personally done it, but I worked on projects where we were doing that and the headers for both archs were almost the same.
If you don't have hardcoded addresses and are fully abstracted by a modern HAL and VM like on Linux then it's mostly a problem of the Linux distro or the OS vendor.
Does such hardware exist?
Usually peripherals get an update whenever major changes are done anyway, because that's where the value-add is for most SoCs.
With all due respect: clang hasn't been asked to cover even a tiny fraction of the breadth you get with gcc. Cross compiling with clang generally just means building ARM Linux binaries on x86 Linux hosts. Where's the clang/llvm equivalent to buildroot, for example?
The non-native languages like to keep their build systems in the form of a big mess. Python is probably still not cross-compilable, npm depends on an old SSL library, things like that.
As opposed to C/C++ compilers being built in C/C++?
EDIT: Try to build NetBSD for some esoteric platform from the comfort of your desktop machine. Remember to pick up your jaw from the floor when you're done. ;-) My point is, it does not have to be hard.
In my understanding, Go re-implements the system's interface, rather than using libc (or the equivalent), even on platforms where the syscall interface isn't considered stable. This means cross-compiling is really easy, but it can also mean breakage when the unstable interface changes. A tradeoff like anything else.
This is true for Rust; however, given that LLVM is the project doing the codegen, it's a little bit special: cross-compiling it to the platforms it supports is a bit easier than cross-compiling a random C++ library, since it's the one with the support in the first place.
Rust projects make much more use of C libraries than Go projects do, though, so for things other than the Rust compiler, this is still a good point.
Go uses a common assembly language with ways to convert it to the specific target architecture, which makes porting the toolchain to new targets somewhat trivial. (https://www.youtube.com/watch?v=KINIAgRpkDA)
I wrote a short paper for HPEC that included some power and performance benchmarks and analysis of the HiFive Unleashed U540 SoC [0]. The SoC isn't open source, contrary to what some suggest, although I believe the core is based on the open-source Rocket Chip Generator [1]. It seems the greatest weakness was the slow memory interface. The details of the memory controller and its configuration were proprietary when I tried to find out why it wasn't performing well.
> The STREAM benchmark [6] was also compiled and executed, confirming that DRAM performance is limited to less than 1.6 GB/s on this platform. It's unclear if this is a problem with the cache hierarchy, memory controller, or the configuration of the DDR Controller Control Registers.
Wow, that's unspeakably terrible. That's about 10-20% of the bandwidth one should get from DDR4-2400 memory (depending on how many channels the memory controller uses).
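Back-of-the-envelope, assuming a single 64-bit DDR4-2400 channel:

    2400 MT/s × 8 bytes ≈ 19.2 GB/s theoretical peak per channel
    1.6 GB/s / 19.2 GB/s ≈ 8% of peak (≈10% of a realistic STREAM ceiling)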
I hope this is some simple misconfiguration that can be fixed in firmware or in the kernel.
The Freedom platform is open, and so is a lot of the TileLink interconnect. The core is based on Rocket, but they have some internal changes that are not open source yet.
The generated RTL is open but that is of course limited.
It's an in-order 64-bit, 4-core processor (actually 5 cores, because there's a small 32-bit core which normally does nothing). Think Cortex A53. It's very usable for development (I have two of them), and we even have people using them for desktop running GNOME. But it's not a Xeon. The main issue is the cost and lack of SATA on the development board (there's a daughter board providing that, at extra cost). Everything is open source.
This chip is free of speculation bugs because it doesn't do speculation. However the architecture itself is not in any way better or worse than others, and a RISC-V core with a different microarchitecture might well be vulnerable, although since these attacks are now well-known steps should be taken by designers to avoid them.
Memory architecture is very simple, it has small L1 and L2 caches and main memory. There's also a chiplink connection which takes the memory bus off-chip. You can read more here: https://www.sifive.com/chip-designer#fu540
It takes the Chiplink connection and routes it to a PCIe bridge. There's also a large FPGA on there, but I'm not quite clear how it's all connected up.
I don't know much about what people are using this for. Also the whole setup costs like $4000 so I guess it's not quite ready for casual home users just yet. If you want to go the FPGA route, then it might be better looking at a Virtex-7 and putting the RISC-V rocketchip core on there, along with whatever local customizations you want to try out.
Do you know if it is possible to actually purchase the daughter board? As far as I can tell they only made a limited supply for the crowdfunding program.
Thanks for the post and nice work, Drew! I look forward to Linux/BSD-capable non-x86 boards becoming easier to find and cheaper to buy over time. This one is not the cheapest board but since I'm indefinitely in line for a Pine64 (and can't find much else) I may take a look at this.
>"I’m working on making this hardware available to builds.sr.ht users in the next few months, where I intend to use it to automate the remainder of the Alpine Linux port and make it available to any other operating systems (including non-Linux) and userspace software which are interested in working on a RISC-V port."
Is this author also the designer of the board? Or do they mean they're working on making the software available?
This looks really exciting, looking forward to following the progress.
The author means making compute time on the hardware available to run CI builds on their service. If you care about porting your software to RISC-V enough to want to test that software there, hardware is probably a more accurate harness than qemu.
I'm not sure what this means, but we have 3 of these SiFive boards doing builds[1], as well as dozens of qemu instances running on Intel hardware supplied by Facebook. We use the SiFive boards for the largest builds however, things like GCC and the kernel, because they're still much better than qemu.
[1] Or we were up to yesterday when we had a hardware failure (on an Intel machine) so our build system is currently down. Four days before Christmas, so the worst possible time ...
It's also surprisingly nice. 8 GB of DDR4 ECC RAM, 4 surprisingly fast CPU cores, and a gigabit Ethernet port - it all adds up to a really pleasant board to work with.
That's because it's brand-new tech. I assume that the Chinese makers who came up with Arduino boards that cost USD 1.99 will also create knock-offs of these, and prices will fall dramatically.
Well, it's also the first RISC-V CPU available to the general public. I expect prices to come down in the future, but I was definitely surprised by the quality of the "first" CPU.
That CPU has one of the worst instruction sets ever: the mnemonics are even worse than Intel's, which was heretofore considered the stupidest. That's an achievement. They will have to be hidden by compilers because they're otherwise complicated and a pain in the ass to program.
As far as I can tell, this hardware doesn't bring anything revolutionary in terms of capability. The hardware should stand on its own merit, like for example the Commodore Amiga did, or the SGI hardware did, not on the merit of some arbitrary ideology.
If you use Alpine releases, you'll get older packages than if you use edge. In general edge packages are pretty up-to-date.
Though, they do have issues with alpine-aports patches (alpine-aports is their repository of package build scripts). They don't have a large enough team, so a patch generally needs a few months to get in.
I think it's kind of misleading, since it's only done for compatibility of the produced binaries. CentOS 5 was, until recently, the actively maintained distro with the oldest kernel, libc, etc. versions. So if it runs on CentOS 5, it is guaranteed to run on anything. This is different from companies still using CentOS 5 and running active services on it because they simply refused to upgrade. Python, or rather PyPI, does this too. They aren't using CentOS 5 despite its age, but because of it.
Is this to produce binaries on the oldest possible distro so they'll run anywhere (symbol versioning etc.)?
That being said, CentOS 5 hasn't received security updates for a year or so (or two?), so maybe there's a security risk in continuing to rely on it? I guess CentOS 6 would be the oldest still-supported distro.
The Alpine 3.8 (and 3.7) branches have gcc 6.4, which was released in July 2017. For a C compiler, that is absolutely sufficient. You'll find many other packages in the 3.8 release branch are more up-to-date; this is a bit of an exception.
This isn't Rust. GCC 6.4 will compile literally any C or C++ project you find (that works with any recent version of GCC), including stuff requiring C++14 and _most_ of C++17. Hell, GCC 4.9 (from 2014) is sufficient to compile the latest Linux kernel, the latest sqlite3, the latest nginx, the latest of any really significant package.
That was actually a special case, if I understood correctly: gcc 6 dropped support for something needed by something else, which prevented it from being upgraded.
RHEL version numbers don't tell you much. Nobody would be surprised to find some Linux kernel features announced for 4.20 in a Red Hat 2.6.32-something kernel.
By updating the packages you need that are outdated and sending patches upstream. The package manifests are pretty straightforward and easy to work with.
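A package manifest (APKBUILD) is basically a small shell script; a minimal sketch (name, version, and URLs are placeholders, and checksums are omitted):

    pkgname=foo
    pkgver=1.2.3
    pkgrel=0
    pkgdesc="Example package"
    url="https://example.org/foo"
    arch="all"
    license="MIT"
    source="https://example.org/foo-$pkgver.tar.gz"

    build() {
        ./configure --prefix=/usr
        make
    }

    package() {
        make DESTDIR="$pkgdir" install
    }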