Hacker News
Building a 64-bit aarch64 kernel and userspace for the Raspberry Pi 4 (esotericnonsense.com)
116 points by esotericn on Nov 16, 2019 | 41 comments



I understand the logic behind the Raspberry Pi Foundation’s decision to continue releasing a single Raspbian version in 32-bit mode for compatibility reasons, but... on the other hand, it does severely handicap their newer hardware.


I think a big part of the reason is also the fact that several parts of the userland don’t compile for aarch64.

See e.g. https://github.com/raspberrypi/userland/issues/314

One of the first comments there also says this about the RPi 3:

> The GPU is a 32 bit processor. I haven't checked, but I'm expecting that there's a heck of a lot more work to do to get Khronos or other multimedia extension stuff up and running against a 64bit kernel than just getting userland to build.

Don’t know if that’s the case for the RPi 4 as well.

And another comment too:

> I'm sure there are certain applications where having a 64-bit kernel (let alone userland) may be beneficial, but I suspect the hoped-for performance improvements didn't materialise, otherwise people would be waving benchmark results at us demanding an RPi-supported aarch64 kernel.

I’m missing this too, and am planning on eventually doing some benchmarking of my own to see if there is any advantage of running the aarch64 kernel for my applications.


The GPU has nothing to do with what the OS on the CPU is doing - it's an entirely separate architecture anyway (VideoCore 6 on RPi4).

The reason their /opt/vc stuff doesn't support 64bit is because there are proprietary binaries directly from Broadcom. Also, they use interfaces provided only by the downstream RPi kernel.

There is a blog post from the RPi foundation about support for VC6 in Mesa.


What exactly do you mean by "/opt/vc" stuff? OpenGL?

Because I am decoding, encoding and rendering 1080p H264 with the VC on a 64-bit kernel.


Things like vcgencmd that allow you to read and write values to the GPU OS that actually controls the physical hardware.


For what it's worth, Fedora supports armv7/aarch64 images for the RPi 3: https://fedoraproject.org/wiki/Architectures/ARM/Raspberry_P...

The Raspberry Pi 4 apparently isn't supported. I'm not exactly clear on why, how this links to the above issue, or whether the problems raised in their bug tracker are also unsolved in Fedora or are simply things the Raspberry Pi people are unwilling to fix in their setup.


The fedora/etc problem has to do with the lack of uboot/edk2 and upstream kernel support. Once support lands in the appropriate repos you can expect fedora will enable it.


There's a full 64-bit Gentoo image available. It seems to work fine for all my use-cases. I'm not sure which parts of your comments really apply at this point in time.


Hardware accelerated OpenGL. As far as I have been able to tell, that will still be missing if you run an aarch64 kernel.

And I think using the GPU to get hardware accelerated video decoding and encoding will not be available either.

Edit: But if I understand https://github.com/popcornmix/omxplayer/issues/714 correctly then you could do hardware accelerated decoding of HEVC on the CPU. I don’t know how the performance of that compares to the kind of video decoding that the GPU can do. That’s one of the things I’d like to see someone benchmark, or benchmark myself.


That was the case, but the VC4 instruction set is open/documented and there is a Mesa driver being worked on to REPLACE the closed binary driver, which is only available for 32-bit. When you use the Mesa VC4 driver you can also compile the whole user space for 64-bit, including OpenGL ES support in an even newer version than the closed-source one. (Supporting user kernels etc.)

You can find some info about that here: https://wiki.gentoo.org/wiki/Raspberry_Pi_VC4

Some side notes: Mesa is quite hungry in performance and memory when compiling shaders (to VC4 instructions), way more than the closed driver requires. That's why older versions of the Pi with less powerful ARM cores couldn't really use this approach.

Source: I used to write custom user shaders in VC4 assembly to run on all Pis because the closed binary didn't offer OpenCL.


The 64bit kernel is able to run a 32bit userland, so it should be possible with the correct libraries etc to run vcgencmd and friends.

I've personally had no issues - the system is overall faster than on armv7h - ioquake3 runs full speed, I can watch 1080p videos in mpv, etc.

I can't seem to get the 'kitty' terminal to work which does require the use of OpenGL, but that doesn't work on a 32bit kernel for me either.
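To check whether a given binary (vcgencmd, say) is a 32-bit or 64-bit build before trying it under a 64-bit kernel, you can read the first few bytes of its ELF header. A minimal sketch: the function names are mine, and the path in the usage comment is only an assumption about where Raspbian installs it.

```python
import struct

ELF_MAGIC = b"\x7fELF"
EI_CLASS = {1: "32-bit", 2: "64-bit"}    # e_ident[4]
E_MACHINE = {40: "ARM", 183: "AArch64"}  # e_machine values from the ELF spec


def elf_summary(header: bytes) -> str:
    """Describe the word size and architecture from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    bits = EI_CLASS.get(header[4], "unknown class")
    # e_ident[5] is the byte order: 1 = little endian, 2 = big endian.
    endian = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, right after e_ident and e_type.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    name = E_MACHINE.get(machine, "machine %d" % machine)
    return bits + " " + name


def elf_summary_of(path: str) -> str:
    """Hypothetical helper: summarise the ELF file at `path`."""
    with open(path, "rb") as f:
        return elf_summary(f.read(20))


# Usage (path is an assumption, adjust for your image):
# print(elf_summary_of("/opt/vc/bin/vcgencmd"))
```

A "32-bit ARM" result means the binary needs the 32-bit loader and libraries present, even though the kernel itself is aarch64.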


Raspbian contains a 64-bit kernel nowadays, add

arm_64bit=1

to your /boot/config.txt

You now have a 64-bit kernel with 32-bit userland.
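A quick way to confirm that mix after rebooting is to compare what the kernel reports with the pointer width of a userland process. This is only a sketch: `describe_bitness` is a made-up helper, and it assumes the kernel reports itself as "aarch64" (a process running under a 32-bit personality may see a different string).

```python
import platform
import struct


def describe_bitness(machine: str, pointer_bits: int) -> str:
    """Combine the kernel-reported machine string with the userland pointer width."""
    kernel = "64-bit" if machine in ("aarch64", "arm64") else "32-bit or unknown"
    return "kernel reports %s (%s), this process uses %d-bit pointers" % (
        machine, kernel, pointer_bits)


if __name__ == "__main__":
    # platform.machine() comes from uname; struct.calcsize("P") is sizeof(void *)
    # for the Python interpreter itself, i.e. for the userland it was built for.
    print(describe_bitness(platform.machine(), struct.calcsize("P") * 8))
```

On a 64-bit kernel with a 32-bit userland you would expect the first part to say aarch64 and the second to say 32-bit pointers.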

It is unclear to me what kind of code gcc now generates with -march=native. If somebody could clear that up, it would be very much appreciated (i.e., does it use 31 GPRs?).


If it's a 32bit userland, then you don't have access to the other GPRs. The GPR field in AArch32 instructions is a max of 4 bits, so you can only encode r0-r15.
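The arithmetic behind that, as a small sketch (the helper name is just for illustration): a 4-bit register field can name 16 registers, while the 5-bit field used by AArch64 gives 32 encodings, one of which is reserved for the zero register or stack pointer, leaving x0-x30.

```python
def encodable_registers(field_bits: int) -> int:
    """Number of distinct register numbers an n-bit instruction field can name."""
    return 1 << field_bits


aarch32_regs = encodable_registers(4)      # r0-r15 (r13-r15 double as sp, lr, pc)
aarch64_gprs = encodable_registers(5) - 1  # x0-x30; encoding 31 means xzr or sp

print(aarch32_regs, aarch64_gprs)  # prints "16 31"
```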


Thank you. Have been trying to get a 64-bit gcc running, but haven't succeeded yet.


Does it? What does the 64-bit mode provide that the 32-bit one doesn't? On x86 things were different because the 64-bit ABI also provided access to more registers and other benefits, but as far as I know, the 32-bit mode has the same features as 64-bit on ARM.

This reminds me of when the UltraSPARC CPUs came out. I was working at Sun at the time, and I remember asking why Solaris 2.6 was released without 64-bit support. The reason was the same, there was really no benefit to it (except for support for more than 4 GB of RAM in a single process, which wasn't really needed back then).

Even as later versions of Solaris came out with native 64-bit support, the entire userspace was still 32-bit because it worked on both architectures, and the binaries were both smaller and also faster.


> On x86 things were different because the 64-bit ABI also provided access to more registers and other benefits, but as far as I know, the 32-bit mode has the same features as 64-bit on ARM.

It doesn't. 64-bit ARM has 31 GPRs. It also has much cleaner decode, and some low-end CPUs that can run both 64-bit and 32-bit code execute the 64-bit code faster than they do 32-bit code.


Thank you for the clarification. I didn't know that.


But you worked at Sun?


I did. But Sun never used ARM. They used SPARC, and later x86.


>it does severely handicap their newer hardware.

Does it? I thought part of the lag is due to there being no obvious benefit on a 1-4 GB device.


For openSUSE there's an aarch64 image as well: https://en.opensuse.org/HCL:Raspberry_Pi4

It uses mainline Linux, so no USB 3 (only USB 2 OTG/Gadget/Host on USB-C), but Ethernet is supposed to be working meanwhile.


What's the performance difference between 32 and 64 bit?



Btw, for those who are using an RPi3 and want to try an aarch64 kernel: Arch Linux ARM has one, and I've used it in my experiments on getting a smoother desktop experience on the Raspberry Pi[1].

[1]:https://abishekmuthian.com/getting-smoother-desktop-experien...


Slightly off-topic: is there anything that is as or more powerful than RPi4 but has more/expandable memory (say 16G or more) and still works with mainline kernel/distributions? In other words, something that I could setup as a build machine for CI?


Realistically you probably want to just use QEMU for builds until you find yourself needing to scale to something like https://www.packet.com/cloud/servers/c1-large-arm/

If you're doing it just for the novelty, or really want to do it on site natively at a reasonable cost, check out the Jetson TX2 dev kit. The TX2 has been mainlined for a while now, and this kit offers 8 GB RAM with a fast CPU and the ability to connect fast SATA/NVMe storage. While most distros would run on it, you would probably need to make your own install package (or just run container images for builds if you don't NEED to have an exact kernel).


Well, any PC? What are your conditions? Price? Size?



This looks promising (even at $750 for the motherboard and CPU) but it seems to be a work in progress, specifically at the bottom of the page it says:

    Mainline Kernel – work in progress
    UBoot – work in progress
    UEFI – work in progress
But if anyone has any experience using this board, please share.


Why not just build a CI machine? At a $750 price point, you could very easily build a fantastic one with half an hour's worth of work.


Ubuntu also has a nice aarch64 image for raspberry pi 4.


The Ubuntu image currently doesn't support USB on the 4GB model unless you configure the kernel to only use 3 of the 4GB.


Thanks for mentioning that! You probably just saved me a few wasted hours.



They're using 5.3.0 which isn't supported by the RPi Foundation yet. The 4.19 line had an issue with RAM too but it's been fixed now so if you build from git it's fine.

I'm running on 4gb.


Hope they fix that soon!


Is there something like x32 for ARM?


Yes, aarch64-ilp32.

It's used in production for Apple Watch Series 4 onwards among other things.
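For anyone unfamiliar with the naming: x32 and aarch64-ilp32 both pair the 64-bit instruction set with an ILP32 data model (32-bit int, long and pointers), whereas a regular 64-bit Linux userland is LP64. A rough sketch that classifies whatever the current interpreter was built for; `data_model` is a hypothetical helper, not part of any library.

```python
import ctypes


def data_model(int_bits: int, long_bits: int, pointer_bits: int) -> str:
    """Name the C data model for the given int/long/pointer widths."""
    models = {
        (32, 32, 32): "ILP32",  # 32-bit userlands, x32, aarch64-ilp32
        (32, 64, 64): "LP64",   # typical 64-bit Linux userland
        (32, 32, 64): "LLP64",  # 64-bit Windows
    }
    return models.get((int_bits, long_bits, pointer_bits), "other")


# Widths as seen by the running Python interpreter.
print(data_model(
    ctypes.sizeof(ctypes.c_int) * 8,
    ctypes.sizeof(ctypes.c_long) * 8,
    ctypes.sizeof(ctypes.c_void_p) * 8,
))
```

Note that the data model alone does not tell you which instruction set is in use: a plain 32-bit ARM userland and an aarch64-ilp32 one would both report ILP32.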


Is the RPi4 mainlined in the kernel, or does the board require porting new kernel versions?


It doesn't seem to have a dts file in 5.4 at all.


I've found buildroot much easier to use.



