Great article -- just the kind of meaty technical content I love seeing here. Also good to know Rust is living up to its goal of being a fully-fledged systems programming language.
I'd go crazy for something along the lines of xv6 [1]. xv6 is a modified Unix V6 used to teach the principles of operating systems. Its source code is small and well written. A Rust analogue would be awesome.
In high school, I wrote a tiny toy kernel, running solely in 16-bit real mode. Quite fun, but I wanted to go further and my knowledge was lacking, although the project did help me learn a lot of ASM.
The OP has shown that I could do all sorts of cool stuff going straight into x64 without all the incantations! That's actually super exciting, and makes me hate EFI a little less ;)
I'm going to go play, I think! I wonder what Lisp would be the best candidate here...
I don't see much, if any, of Oberon's spirit in any of those languages. Maybe Go, though I don't like it much. Part of the point of Oberon was the minimalism: each of Wirth's languages got simpler and smaller, and the system aimed to remain tiny too.
In terms of the language, the Oberon-2 language report is 17 pages, of which 1.5 pages is the full EBNF for the grammar.
Then again, looking at the new stuff from the people keeping Oberon alive, it seems much less interesting to me as well.
The spirit of using strongly typed, safe languages for systems-level programming instead of unsafe languages like C and C++.
I did like Oberon a lot, as a GC-enabled, safe systems programming language with module support, and I liked the ideas of Native Oberon exploring Smalltalk-style dynamism in a statically compiled language.
As for Wirth's minimalism, especially the latest Oberon-07, I don't like it that much.
The main problem is lack of support: it is a very big effort to create an OS, and it is very easy for attention to turn to other things.
Linux and the BSD distributions wouldn't have reached the state they enjoy nowadays without the help of the existing UNIX toolset and the investment of companies interested in seeing the projects succeed.
Responding to pjmlp, a cool project I saw recently that's related to the ones you mentioned is Unikernels [1]. They noticed that a challenge faced by a lot of those projects is hardware support--each OS has to ship with a huge set of drivers to be widely usable. They sidestep this by targeting a VM, which exposes a single hardware interface no matter what it's running on.
EFI even talks FAT, and licensing exists such that you can use FAT for EFI purposes without any patent worries. I'm not saying that's a bad decision, just pointing out it is another thing that happened.
I heard that fact in a linux.conf.au talk by the guy who did UEFI work for Linux [0]. If you're sad to hear about the Microsoft influence, the dude isn't too hot on the implementation of this new standard in general. Interesting talk, though; digging up the link to post it here means I'm going to give it another watch.
I fail to see the major issue here: the standard needed to specify those points, otherwise one EFI implementation would expect UTF-8, another ASCII, and yet another UTF-16. The calling conventions, the binary format and UTF-16 are widely supported across platforms and known. The Windows calling conventions are probably more widely implemented than any other (since the popular Linux compilers can cross-compile to Windows but not the other way round). Just because Rust had no support for the Windows calling conventions doesn't imply that C, for example, has the same problem. UTF-16 is used in different places as well, notably in Java. No problem here either. It also has the major advantage of being a fixed width encoding over UTF8. That might be fairly useful when your code doesn't have full OS and library support while running.
All in all: It's probable that Microsoft influenced the standard here, but there's also sufficient technical reasons for those choices. There are other places where Microsoft's influence on the standard is a much bigger problem.
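As an aside on the calling-convention point: current Rust can request the x86-64 Windows/UEFI calling convention explicitly, independent of the host's default System V ABI. A minimal sketch (the function name is made up; `extern "efiapi"` also exists in recent Rust and resolves to `win64` on x86_64):

```rust
// Declaring a function with the Windows x86-64 calling convention.
// Rust emits the matching call sequence (RCX/RDX/R8/R9 argument registers,
// 32-byte shadow space) even when the host ABI is System V.
extern "win64" fn add(a: u64, b: u64) -> u64 {
    a + b
}

fn main() {
    // Calling across the ABI boundary works transparently from Rust code.
    assert_eq!(add(2, 3), 5);
    println!("win64 call ok: {}", add(2, 3));
}
```

This is what lets a Rust UEFI application call firmware services without hand-written assembly thunks.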
> The calling conventions, the binary format and UTF-16 are widely supported across platforms and known.
The calling conventions are W64's, which are by far the least common across platforms (especially since the AMD64 ABI defines its own calling conventions).
> UTF-16
Is a shit encoding, and there is no justification for such a low-level technical API to use it[0].
> UTF-16 is used in different places as well, notably in Java.
Not a justification (unless you also assert that most UEFI applications will be written in Java); it's explained by Java (and Windows NT) having standardized their Unicode implementations between Unicode 1.0 and Unicode 2.0, 20 years ago. UEFI was not created 20 years ago; that it bears the trace of that event is a disgrace.
Plus, technically, Java does not use UTF-16, it's UCS2. Support for astral-plane codepoints and APIs taking surrogate pairs into account was only introduced in JDK 1.5, separately from the old UCS2 APIs (including char, which remains a code unit).
> It also has the major advantage of being a fixed width encoding over UTF8.
You're high as a kite[2]: UTF-16 is no more fixed-width than UTF-8, and UTF-16 implementations are far more commonly broken than UTF-8 ones, as 1. astral-plane support is rarely exercised and 2. they have to contend with the UCS2/UTF-16 duality.
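To make the variable-width point concrete, a quick Rust sketch: BMP codepoints take one 16-bit code unit, but astral-plane codepoints need a surrogate pair (two units), so "one unit = one character" breaks exactly as described above.

```rust
fn main() {
    let bmp = 'é';    // U+00E9, inside the BMP: one UTF-16 code unit
    let astral = '𝄞'; // U+1D11E, MUSICAL SYMBOL G CLEF, astral plane

    assert_eq!(bmp.len_utf16(), 1);
    assert_eq!(astral.len_utf16(), 2); // encoded as a surrogate pair

    // A UCS2-minded implementation counting code units miscounts characters:
    let s = "𝄞";
    assert_eq!(s.encode_utf16().count(), 2); // two 16-bit units...
    assert_eq!(s.chars().count(), 1);        // ...but one codepoint
    println!("{} UTF-16 units for {} codepoint", s.encode_utf16().count(), s.chars().count());
}
```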
> All in all: It's probable that Microsoft influenced the standard here, but there's also sufficient technical reasons for those choices.
I assert that there is no technical justification whatsoever for the choices of calling conventions and encoding beyond "it's how Windows works and we couldn't be arsed to do things correctly".
[0] The only potentially acceptable justification for UTF-16 is low-markup Asian script[1], with the vast majority of codepoints living in the upper segments of the BMP (U+0800 to U+FFFF) and encoded as a single UTF-16 code unit but as 3 UTF-8 code units. This is pretty much irrelevant to UEFI.
[1] Mostly; some African and American scripts also live there.
[2] No, seriously: personal attacks may be bad, but when you assert UTF-16 is a fixed-width encoding (even ignoring that Unicode makes the concept laughably useless) you're in a separate plane of reality from just about all of humanity.
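Footnote [0]'s size argument can be checked directly in Rust: a codepoint in U+0800..U+FFFF takes 3 bytes in UTF-8 but a single 16-bit unit (2 bytes) in UTF-16, while for ASCII the trade-off reverses.

```rust
fn main() {
    // U+4E2D, a common CJK character in the upper BMP segment.
    let cjk = '中';
    assert_eq!(cjk.len_utf8(), 3);      // 3 bytes in UTF-8
    assert_eq!(cjk.len_utf16() * 2, 2); // one 16-bit unit = 2 bytes in UTF-16

    // ASCII, which dominates UEFI strings (paths, variable names):
    let ascii = 'A';
    assert_eq!(ascii.len_utf8(), 1);      // 1 byte in UTF-8
    assert_eq!(ascii.len_utf16() * 2, 2); // still 2 bytes in UTF-16
}
```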
... I've come to assume that anybody who says they're doing UTF-16 actually means UCS2 with the wrong endianness and just wanted the easiest way to support Chinese characters.
Other commenters beat me to the first two examples, though there's more.
GoFY (which, hilariously, I'm a contributor on) was probably the closest analogue; it booted on x86. Russ Cox's did as well, but it was removed from the Go tree years ago when it bitrotted.
Really it boils down to 1) custom linker scripts/targets and 2) a custom "OS" target that emulates syscalls, etc.
I've done it a couple of times (on ARM-based embedded devices), and I end up throwing it out every time. I should give it a proper writeup, I suppose.
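For the curious, the "custom target" half of that looks roughly like the JSON target specification below for a bare-metal x86_64 kernel. This is a sketch only: the field names follow rustc's unstable target-spec format and can differ between compiler versions.

```json
{
  "llvm-target": "x86_64-unknown-none",
  "arch": "x86_64",
  "target-endian": "little",
  "target-pointer-width": "64",
  "target-c-int-width": "32",
  "os": "none",
  "executables": true,
  "linker-flavor": "ld.lld",
  "linker": "rust-lld",
  "panic-strategy": "abort",
  "disable-redzone": true,
  "features": "-mmx,-sse,+soft-float",
  "data-layout": "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
}
```

The notable bits for kernel code are `"os": "none"` (no syscall layer), disabling the red zone (unsafe with interrupts), and turning off SIMD so the kernel needn't save those registers.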
So I'm a bit confused: the article says the code is run in 64-bit mode, but before virtual memory is turned on. How is that possible?
I remember from the OSDev wiki that the only way to get a processor into 64-bit mode was by first turning on virtual memory. I remember because I was disappointed by that; it seemed unnecessary.
Doesn't UEFI play by these rules, or am I misunderstanding?
I think "turning on the MMU" is something a bit different from "turning on virtual memory." The former just means that all addresses are translated virtual->physical, while the latter implies more abstraction: virtual memory can be overcommitted, and pages might not be resident.
This is just a guess, but I'd wager that UEFI has the MMU enabled, but with an identity mapping that maps all virtual addresses to the same physical address.
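As a sketch of what that identity mapping means at the page-table level: with 1 GiB huge pages, page-directory-pointer-table entry n simply maps virtual GiB n to physical GiB n. The flag constants follow the x86-64 paging format; the helper name is made up.

```rust
// x86-64 page-table entry flags (low bits of a PDPT entry).
const PRESENT: u64 = 1 << 0;
const WRITABLE: u64 = 1 << 1;
const HUGE: u64 = 1 << 7; // PS bit: this PDPT entry maps a 1 GiB page

/// Identity mapping: the physical base equals the virtual base,
/// so entry `index` just encodes `index * 1 GiB` plus the flags.
fn identity_pdpt_entry(index: u64) -> u64 {
    (index << 30) | PRESENT | WRITABLE | HUGE
}

fn main() {
    // Entry 0 maps VA 0..1 GiB to PA 0..1 GiB, entry 1 the next GiB, etc.
    assert_eq!(identity_pdpt_entry(0), 0x83);
    assert_eq!(identity_pdpt_entry(1), 0x4000_0083);
    println!("entries: {:#x} {:#x}", identity_pdpt_entry(0), identity_pdpt_entry(1));
}
```

Under such a mapping, translation is enabled (satisfying long mode's requirement that paging be on) but every address still "means" its physical location, so firmware and loader code can run as if there were no MMU at all.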
"64-bit UEFI understands long mode, which allows applications in the pre-boot execution environment to have direct access to all of the memory using 64-bit addressing."
I believe that "long mode" is just the name for the native 64-bit mode of these processors; i.e., what you'd normally run 64-bit Linux under, with the MMU enabled.