Great article -- just the kind of meaty technical content I love seeing here. Also good to know Rust is living up to its goal of being a fully-fledged systems programming language.
I'd go crazy for something along the lines of xv6 [1]. xv6 is a modified Unix V6 used to teach the principles of operating systems. Its source code is small and well written. A Rust analogue would be awesome.
In high school, I wrote a tiny toy kernel, running solely in 16-bit real mode. Quite fun, but I wanted to go further and my knowledge was lacking, although the project did help me learn a lot of ASM.
The OP has shown that I could do all sorts of cool stuff going straight into x64 without all the incantations! That's actually super exciting, and makes me hate EFI a little less ;)
I'm going to go play, I think! I wonder what Lisp would be the best candidate here...
I don't see much, if any, of Oberon's spirit in any of those languages. Maybe Go, though I don't like it much. Part of the point of Oberon was the minimalism: each of Wirth's languages got simpler and smaller, and the system aimed to remain tiny too.
In terms of the language, the Oberon-2 language report is 17 pages, of which 1.5 pages is the full EBNF for the grammar.
Then again, looking at the new stuff from the people keeping Oberon alive, it seems much less interesting to me as well.
The spirit of using strongly typed, safe languages for systems-level programming instead of unsafe languages like C and C++.
I did like Oberon a lot, as a GC-enabled, safe systems programming language with module support, and I liked the ideas of Native Oberon exploring Smalltalk-style dynamism in a statically compiled language.
As for Wirth's minimalism, especially the latest Oberon-07, I don't like it that much.
The main problem is lack of support: it is a very big effort to create an OS, and it is very easy for attention to turn to other things.
Linux and the BSD distributions wouldn't have reached the state they enjoy nowadays without the help of the existing UNIX toolset and the investment of companies interested in seeing the projects succeed.
Responding to pjmlp, a cool project I saw recently that's related to the ones you mentioned is Unikernels [1]. They noticed that a challenge faced by a lot of those projects is hardware support--each OS has to ship with a huge set of drivers to be widely usable. They sidestep this by targeting a VM, which exposes a single hardware interface no matter what it's running on.
EFI even talks FAT, and licensing exists such that you can use FAT for EFI purposes without any patent worries. I'm not saying that's a bad decision, just pointing out it is another thing that happened.
I heard that fact in a linux.conf.au talk by the guy who did UEFI work for Linux [0]. If you're sad to hear about the Microsoft influence, the dude isn't too hot on the implementation of this new standard in general. Interesting talk, though; digging up the link to post it here means I'm going to give it another watch.
I fail to see the major issue here: the standard needed to specify those points, otherwise one EFI implementation would expect UTF-8, another ASCII, and yet another UTF-16. The calling conventions, the binary format and UTF-16 are widely supported across platforms and known. The Windows calling conventions are probably more widely implemented than any other (since the popular Linux compilers can cross-compile to Windows but not the other way round). Just because Rust had no support for the Windows calling conventions doesn't imply that C, for example, has the same problem. UTF-16 is used in different places as well, notably in Java. No problem here either. It also has the major advantage of being a fixed width encoding over UTF8. That might be fairly useful when your code doesn't have full OS and library support while running.
All in all: It's probable that Microsoft influenced the standard here, but there's also sufficient technical reasons for those choices. There are other places where Microsoft's influence on the standard is a much bigger problem.
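As an aside on the calling-convention point: current Rust can request the x86-64 Windows/UEFI calling convention explicitly, independent of the host's default System V ABI. A minimal sketch (the function name is made up; `extern "efiapi"` also exists in recent Rust and resolves to `win64` on x86_64):

```rust
// Declaring a function with the Windows x86-64 calling convention.
// Rust emits the matching call sequence (RCX/RDX/R8/R9 argument registers,
// 32-byte shadow space) even when the host ABI is System V.
extern "win64" fn add(a: u64, b: u64) -> u64 {
    a + b
}

fn main() {
    // Calling across the ABI boundary works transparently from Rust code.
    assert_eq!(add(2, 3), 5);
    println!("win64 call ok: {}", add(2, 3));
}
```

This is what lets a Rust UEFI application call firmware services without hand-written assembly thunks.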
> The calling conventions, the binary format and UTF-16 are widely supported across platforms and known.
The calling conventions are W64's, which are by far the least common across platforms (especially since the AMD64 ABI defines its own calling conventions).
> UTF-16
Is a shit encoding, and there is no justification for such a low-level technical API to use it[0].
> UTF-16 is used in different places as well, notably in Java.
Not a justification (unless you also assert that most UEFI applications will be written in Java); it's explained by Java (and Windows NT) having standardized their Unicode implementations between Unicode 1.0 and Unicode 2.0, 20 years ago. UEFI was not created 20 years ago; that it bears the trace of that event is a disgrace.
Plus, technically, Java does not use UTF-16, it's UCS2. Support for astral-plane codepoints and APIs taking surrogate pairs into account was only introduced in JDK 1.5, separately from the old UCS2 APIs (including char, which remains a code unit).
> It also has the major advantage of being a fixed width encoding over UTF8.
You're high as a kite[2]: UTF-16 is no more fixed-width than UTF-8, and UTF-16 implementations are far more commonly broken than UTF-8 ones, as 1. astral-plane support is rarely exercised and 2. they have to contend with the UCS2/UTF-16 duality.
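To make the variable-width point concrete, a quick Rust sketch: BMP codepoints take one 16-bit code unit, but astral-plane codepoints need a surrogate pair (two units), so "one unit = one character" breaks exactly as described above.

```rust
fn main() {
    let bmp = 'é';    // U+00E9, inside the BMP: one UTF-16 code unit
    let astral = '𝄞'; // U+1D11E, MUSICAL SYMBOL G CLEF, astral plane

    assert_eq!(bmp.len_utf16(), 1);
    assert_eq!(astral.len_utf16(), 2); // encoded as a surrogate pair

    // A UCS2-minded implementation counting code units miscounts characters:
    let s = "𝄞";
    assert_eq!(s.encode_utf16().count(), 2); // two 16-bit units...
    assert_eq!(s.chars().count(), 1);        // ...but one codepoint
    println!("{} UTF-16 units for {} codepoint", s.encode_utf16().count(), s.chars().count());
}
```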
> All in all: It's probable that Microsoft influenced the standard here, but there's also sufficient technical reasons for those choices.
I assert that there is no technical justification whatsoever for the choices of calling conventions and encoding beyond "it's how Windows works and we couldn't be arsed to do things correctly".
[0] The only potentially acceptable justification for UTF-16 is low-markup Asian script[1], with the vast majority of codepoints living in the upper segments of the BMP (U+0800 to U+FFFF) and encoded as a single UTF-16 code unit but as 3 UTF-8 code units. This is pretty much irrelevant to UEFI.
[1] Mostly; some African and American scripts also live there.
[2] No, seriously: personal attacks may be bad, but when you assert UTF-16 is a fixed-width encoding (even ignoring that Unicode makes the concept laughably useless) you're in a separate plane of reality from just about all of humanity.
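Footnote [0]'s size argument can be checked directly in Rust: a codepoint in U+0800..U+FFFF takes 3 bytes in UTF-8 but a single 16-bit unit (2 bytes) in UTF-16, while for ASCII the trade-off reverses.

```rust
fn main() {
    // U+4E2D, a common CJK character in the upper BMP segment.
    let cjk = '中';
    assert_eq!(cjk.len_utf8(), 3);      // 3 bytes in UTF-8
    assert_eq!(cjk.len_utf16() * 2, 2); // one 16-bit unit = 2 bytes in UTF-16

    // ASCII, which dominates UEFI strings (paths, variable names):
    let ascii = 'A';
    assert_eq!(ascii.len_utf8(), 1);      // 1 byte in UTF-8
    assert_eq!(ascii.len_utf16() * 2, 2); // still 2 bytes in UTF-16
}
```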
... I've come to assume that anybody who says they're doing UTF-16 actually means UCS2 with the wrong endianness and just wanted the easiest way to support Chinese characters.
Other commenters beat me to the first two examples, though there's more.
GoFY (which, hilariously, I'm a contributor on) was probably the closest analogue; it booted on x86. Russ Cox's did as well, but it was removed from the Go tree years ago when it bitrotted.
Really it boils down to 1) custom linker scripts/targets and 2) a custom "OS" target that emulates syscalls, etc.
I've done it a couple of times (on ARM-based embedded devices), and I end up throwing it out every time. I should give it a proper writeup, I suppose.
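For the curious, the "custom target" half of that looks roughly like the JSON target specification below for a bare-metal x86_64 kernel. This is a sketch only: the field names follow rustc's unstable target-spec format and can differ between compiler versions.

```json
{
  "llvm-target": "x86_64-unknown-none",
  "arch": "x86_64",
  "target-endian": "little",
  "target-pointer-width": "64",
  "target-c-int-width": "32",
  "os": "none",
  "executables": true,
  "linker-flavor": "ld.lld",
  "linker": "rust-lld",
  "panic-strategy": "abort",
  "disable-redzone": true,
  "features": "-mmx,-sse,+soft-float",
  "data-layout": "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
}
```

The notable bits for kernel code are `"os": "none"` (no syscall layer), disabling the red zone (unsafe with interrupts), and turning off SIMD so the kernel needn't save those registers.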
So I'm a bit confused: the article says the code is run in 64-bit mode, but before virtual memory is turned on. How is that possible?
I remember from the OSDev wiki that the only way to get a processor into 64-bit mode was by first turning on virtual memory. I remember because I was disappointed by that; it seemed unnecessary.
Doesn't UEFI play by these rules, or am I misunderstanding?
I think "turning on the MMU" is something a bit different from "turning on virtual memory." The former just means that all addresses are translated virtual->physical, while the latter implies more abstraction: virtual memory can be overcommitted, and pages might not be resident.
This is just a guess, but I'd wager that UEFI has the MMU enabled, but with an identity mapping that maps all virtual addresses to the same physical address.
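As a sketch of what that identity mapping means at the page-table level: with 1 GiB huge pages, page-directory-pointer-table entry n simply maps virtual GiB n to physical GiB n. The flag constants follow the x86-64 paging format; the helper name is made up.

```rust
// x86-64 page-table entry flags (low bits of a PDPT entry).
const PRESENT: u64 = 1 << 0;
const WRITABLE: u64 = 1 << 1;
const HUGE: u64 = 1 << 7; // PS bit: this PDPT entry maps a 1 GiB page

/// Identity mapping: the physical base equals the virtual base,
/// so entry `index` just encodes `index * 1 GiB` plus the flags.
fn identity_pdpt_entry(index: u64) -> u64 {
    (index << 30) | PRESENT | WRITABLE | HUGE
}

fn main() {
    // Entry 0 maps VA 0..1 GiB to PA 0..1 GiB, entry 1 the next GiB, etc.
    assert_eq!(identity_pdpt_entry(0), 0x83);
    assert_eq!(identity_pdpt_entry(1), 0x4000_0083);
    println!("entries: {:#x} {:#x}", identity_pdpt_entry(0), identity_pdpt_entry(1));
}
```

Under such a mapping, translation is enabled (satisfying long mode's requirement that paging be on) but every address still "means" its physical location, so firmware and loader code can run as if there were no MMU at all.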
"64-bit UEFI understands long mode, which allows applications in the pre-boot execution environment to have direct access to all of the memory using 64-bit addressing."
I believe that "long mode" is just the name for the native 64-bit mode of these processors; i.e., what you'd normally run 64-bit Linux under, with the MMU enabled.