MINIX 3: a Modular, Self-Healing POSIX-compatible Operating System (youtube.com)
51 points by axman6 on July 7, 2010 | 28 comments



This 6k vs 6M LoC comparison is pretty dubious. If your disk driver has a bug and overwrites some data you're just as screwed, whether it runs in userspace or kernel space. The argument that these bugs are less powerful because they are now in userspace and "can't do very much" is at best limited. Drivers can screw up hardware just as easily, filesystems can screw up your data just as easily, etc. The memory grant stuff seems interesting, though, and I can see how it protects you from some forms of memory corruption.

Running all these processes in userspace seems to gain you the ability to respawn/reset drivers more easily. Live upgrade seems exciting. I wonder if you are paying for that in added complexity that is hard to debug. The tests they did basically simulate hardware errors, and the system does seem to be quite resilient to those. In practice I fear the software bugs more. The bugs I actually see in Linux tend to be oopses generated by bad code that simply disable the driver. The bad ones screw up the hardware they are controlling and require a reboot. It is unclear in these situations that the driver can be correctly reset without resetting the hardware too.

I wonder how much of this respawn/update functionality you can apply easily to a Linux kernel right now. Most drivers and filesystems can already be built as modules and loaded/unloaded in a live kernel. I wonder how much that can be extended.


This 6k vs 6M LoC comparison is pretty dubious. If your disk driver has a bug and overwrites some data you're just as screwed, whether it runs in userspace or kernel space. The argument that these bugs are less powerful because they are now in userspace and "can't do very much" is at best limited.

They are less powerful in user space. He states this about 25 minutes into the talk: "moving bugs to user-space will do less damage", roughly. This is true: instead of getting full ring0 access to anything, exploiting a bug in the driver only lets me do what the driver is allowed to do.

Running all these processes in userspace seems to gain you the ability to respawn/reset drivers more easily. Live upgrade seems exciting. I wonder if you are paying for that in added complexity that is hard to debug.

Why would there be added complexity? Linux has an API as well, but it is less well defined than simple IPC, and even more complex. Hard to debug? You do understand that having parts of the kernel in userspace makes it easier to debug.

The bugs I actually see in Linux tend to be oopses generated by bad code that simply disable the driver.

Read LWN; there's roughly a root exploit every two weeks.

An ideal operating system would be a sort of exokernel, but proven correct, with a 'relaxed' API that would allow distributed computing. Unverified applications would run under a VM: proven parts would be compiled, heuristically verified parts compiled as well, and the other needed parts JITted.


>They are less powerful in user space. He states this about 25 minutes into the talk: "moving bugs to user-space will do less damage", roughly. This is true: instead of getting full ring0 access to anything, exploiting a bug in the driver only lets me do what the driver is allowed to do.

I did listen to the talk, and to that justification. That's why I said it was pretty limited. That may be true for security bugs in drivers; for a network driver it may even be very important. In practice, the actual bugs I care about in Linux drivers are code bugs that disable the device, or, in a filesystem, cause disk corruption. None of those are solved by a microkernel. Microkernels give you a bunch of provable advantages in areas where monolithic kernels don't seem to do too badly.

>Why would there be added complexity? Linux has an API as well, but it is less well defined than simple IPC, and even more complex.

This is anything but simple IPC. You're sending async messages around and wanting to handle restart of whole pieces and reissuing of commands. It is much more complex, with many more edge cases, than the equivalent Linux call stack.

>Hard to debug? You do understand that having parts of the kernel in userspace makes it easier to debug.

Because now you're trying to restart a driver for a device that is in an unknown state, and then restarting the operation of the filesystem accessing the driver, which now has to make sure its operations are idempotent, otherwise it will screw up. The number of new edge cases is immense. It could get hairy really fast. That is even touched upon in the presentation with the async messaging and deadlock avoidance. That's why it's harder to debug: you're adding a bunch of complex code in error-handling paths that get executed once in a blue moon.
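To make that concrete, here is a minimal sketch in C (hypothetical names, not the actual MINIX code) of the kind of recovery path being described: a filesystem server retries a pending request against a restarted disk driver, but only when the request is known to be idempotent.

    #include <stdbool.h>

    struct request {
        int  op;            /* e.g. READ or WRITE */
        long block;         /* block number */
        bool idempotent;    /* safe to reissue after a driver restart? */
    };

    /* Stand-in for the driver call: returns false if the driver died mid-request. */
    static bool driver_do(struct request *req) { (void)req; return false; }

    /* Stand-in for the restart machinery (e.g. a reincarnation server). */
    static bool driver_restart(void) { return true; }

    int issue(struct request *req)
    {
        if (driver_do(req))
            return 0;                    /* completed normally */
        if (!driver_restart())
            return -1;                   /* driver could not be revived */
        if (!req->idempotent)
            return -1;                   /* unsafe to retry: the hairy case */
        return driver_do(req) ? 0 : -1;  /* one retry against the new driver */
    }

The non-idempotent branch is exactly where the rarely-exercised error-handling code piles up.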

>Read LWN; there's roughly a root exploit every two weeks.

I read LWN every week; there are local root exploits once in a while. The memory protection stuff could be good for that, and you could implement it in Linux if you wanted. I specifically stated that part is interesting for this. My point is that the non-security bugs I care about wouldn't be prevented by this technique.


Exactly. Linux is also useful enough to run the laptop Tanenbaum's presentation is running on, while no version of Minix is. (He's running Windows, but it could have been Linux.)

Minix drivers would still have to exist, and to be usable, they still need access to hardware. There's no presented reason why Minix drivers should have fewer bugs per LoC than Linux, except that individual drivers can't take down other "kernel" processes.

That might get you minor improvements in reliability, but it won't fix this issue of bugs existing in hardware drivers in the first place.

IMO, the way forward in reliability is a kernel written in a more correct, more expressive (Haskell-ish) programming language.


> There's no presented reason why Minix drivers should have fewer bugs per LoC than Linux

On a monolithic kernel, each and every driver is aware (or at least can be) of each and every other driver linked into the kernel. That can create added complexity that should be absent, or abstracted away, in microkernel-based OSs.

Microkernels are not a silver bullet and buggy drivers will be able to put the system - buses, coprocessors, whatever - in weird states that can hang or crash the machine.

I like the idea of parameter validation on every function. That and extensive unit tests built into the kernel. Tests could be invoked on boot with a specific switch and the kernel would perform a self-test (with crash detection) before bringing up your system. This could help find hardware incompatibilities and other weird defects that could affect reliability of the whole box.
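A hedged sketch in C of both ideas, with made-up names: validate every parameter at the function boundary, and expose a self-test that boot code could invoke behind a switch.

    #include <stddef.h>
    #include <errno.h>

    #define MAX_BLOCK  65536L
    #define BLOCK_SIZE 512

    int disk_read(long block, void *buf, size_t len)
    {
        /* Reject bad arguments before touching any hardware. */
        if (buf == NULL)                     return -EINVAL;
        if (block < 0 || block >= MAX_BLOCK) return -EINVAL;
        if (len == 0 || len > BLOCK_SIZE)    return -EINVAL;
        /* ... actual transfer would go here ... */
        return 0;
    }

    /* Run at boot when the (hypothetical) self-test switch is set. */
    int disk_self_test(void)
    {
        char buf[BLOCK_SIZE];
        if (disk_read(-1, buf, sizeof buf) != -EINVAL) return -1;  /* must reject */
        if (disk_read(0, NULL, sizeof buf) != -EINVAL) return -1;  /* must reject */
        return disk_read(0, buf, sizeof buf);                      /* must succeed */
    }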

I don't think we would need a Haskell-like language. You can do very Haskell-ish things in C. All it takes is lots of discipline (just as much as writing Haskell).
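One reading of "Haskell-ish C" (my interpretation, not the poster's code): make functions total by returning an explicit maybe-style result instead of failing part-way through via NULL, errno, or asserts.

    #include <stdbool.h>
    #include <limits.h>

    struct maybe_int {
        bool has_value;
        int  value;
    };

    /* Total: defined for every input, including divide-by-zero and the
     * INT_MIN / -1 overflow case. */
    struct maybe_int safe_div(int dividend, int divisor)
    {
        struct maybe_int r = { false, 0 };
        if (divisor != 0 && !(dividend == INT_MIN && divisor == -1)) {
            r.has_value = true;
            r.value = dividend / divisor;
        }
        return r;
    }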

Maybe forcing kernel developers to learn Haskell would help ;-)


I've never found that writing Haskell demanded much discipline at all, just follow a few simple rules (like making total functions etc.), and your code is well on the way to doing exactly what you think it should.


What I'm hearing is "It helps make things more stable, yes, but doesn't solve everything, so why bother?". The advantages Minix has to offer don't solve every problem, least of all the wetware problem that there will always be bugs. But if you're after a high-reliability system, say a plane, you don't want a bug in the audio driver making the navigation systems crash. Who cares if the warning signals sound a little weird for a second, if the system is still running correctly apart from the audio?

The less that is tied into the kernel, the less can crash the system easily. The disk driver could certainly overwrite the kernel on disk, and that could be... pretty bad, but no system that doesn't in some way verify that the code is correct can ever protect against this sort of programmer caused bug.

It should also be noted that Minix isn't exactly aimed at being a replacement for Linux, and I'm not one hundred percent sure such a comparison is that useful.


I find it fascinating that the actor model seems to be the model for creating reliable software. You can see it in Scala or Erlang at the user level, and get it in MINIX 3 at the kernel level. I think if you combined the two you could potentially create systems which stay up for years.
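A bare-bones illustration in C of the shape of the idea (not Erlang or Scala, and certainly not MINIX internals): an "actor" owns its state and only changes it in response to messages pulled from its own mailbox, so a fault stays contained to that actor.

    #include <stdio.h>

    enum msg_type { INCREMENT, PRINT, STOP };
    struct msg { enum msg_type type; };

    static void counter_actor(const struct msg *mailbox, int n)
    {
        int count = 0;                               /* state private to this actor */
        for (int i = 0; i < n; i++) {
            switch (mailbox[i].type) {
            case INCREMENT: count++;                      break;
            case PRINT:     printf("count=%d\n", count);  break;
            case STOP:      return;
            }
        }
    }

    int main(void)
    {
        struct msg mbox[] = { {INCREMENT}, {INCREMENT}, {PRINT}, {STOP} };
        counter_actor(mbox, (int)(sizeof mbox / sizeof mbox[0]));
        return 0;
    }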


I have met a couple of QNX-based, also microkernel-based, systems that had uptime measured in years.

Lovely OS. I learned C on it.

MINIX reminds me of GNU/HURD. Since they are all Unix-ish from the application's point-of-view, I have hope they someday can more or less compete on equal footing with more traditional monolithic (it would be fair to call OSX "duolithic") OSs like Linux, BSD, Solaris.

Very usable desktops like the Gnome I am using to write this don't care much about what is under the libraries they link against. They would be fine on top of any OS as long as libraries provided the environment they expect. Oddly enough, the first time I ran Gnome was on top of IRIX.


The self-healing thing in Minix 3 really works.

Some years ago I installed Minix 3 on a very old Thinkpad that had an esoteric video hardware quirk that would crash most OS's. It was "designed for Windows 95" and by God they meant it because you couldn't run anything else on it. Neither Linux nor Windows 98 would run for very long before the hardware would cause a driver crash and a kernel panic/bluescreen (even in text mode). But with Minix 3 the video driver would just transparently restart after each crash, so smoothly that I wouldn't notice it unless I was watching the logs.


That's awesome, and I'm sure it's the sort of story the developers would love to hear about.


MINIX is a fantastic kernel to practice kernel hacking on and see how operating systems function without the real world cruft of performance and portability hacks.


Personally, I found the code relatively painful to follow. I'm not even talking about how the parts communicate - the code was difficult for me to trace through in the small, even when compared to Linux or the BSDs, let alone Plan 9.

K&R prototypes, overly decorative comments that just echo the name of the function, macros like 'PUBLIC' and FUNCTION_PROTOTYPE(name, args) which are effectively no-ops, and a host of other little annoyances make the code - at least in my opinion - rather unidiomatic and painful to read as C.
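For readers who haven't seen the source, here is an illustration of the style being complained about (macro names approximate, reconstructed from memory rather than copied from the MINIX tree): a no-op PUBLIC qualifier, a prototype macro kept around for pre-ANSI compilers, K&R parameter declarations, and a banner comment that only restates the function name.

    /* Simplified stand-ins for the real macros. */
    #define PUBLIC                          /* expands to nothing */
    #define _PROTOTYPE(fn, args) fn args    /* pass-through on ANSI compilers */

    _PROTOTYPE( int do_read, (int fd, char *buf, int nbytes) );

    /*=========================================================================*
     *                               do_read                                   *
     *=========================================================================*/
    PUBLIC int do_read(fd, buf, nbytes)
    int fd;                     /* file descriptor to read from */
    char *buf;                  /* buffer to fill */
    int nbytes;                 /* how many bytes */
    {
      /* ... */
      return 0;
    }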


Not knowing much about OS design, is there any inherent reason why operating systems built on microkernels seem to be so unsuccessful? They always seem really elegant, but are never widely adopted (with the possible exception of Mach -> OS X).


I tend to think that Linus was right and microkernels are in reality a bad design.

Case in point: Minix 1.5 (which I hacked on before Linux came along).

Minix 1.5 has two "daemons" called mm and fs, which run the memory manager and the filesystem respectively. Now consider process creation and loading (fork and exec). Creating and loading a process intimately involves both mm and fs, so in Minix 1.5 the program sends a message to mm [IIRC], which sends a message to fs, and both daemons have to coordinate with each other. This makes it a lot more complex than if there were just one daemon (i.e. a monolithic kernel).
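A rough sketch of the coordination being described (the message fields and server numbers are illustrative, not from the Minix 1.5 sources): a fork() turns into a blocking message to mm, which in turn has to message fs so both servers update their copy of the new process's state before the fork can complete.

    typedef struct {
        int m_type;     /* request type, e.g. FORK */
        int m_pid;      /* parent pid */
        int m_child;    /* child slot chosen by mm */
    } message;

    #define MM   0
    #define FS   1
    #define FORK 1

    /* Stub standing in for the kernel's blocking send-and-receive primitive
     * (in the spirit of Minix's sendrec()). */
    static int sendrec(int dest, message *m) { (void)dest; (void)m; return 0; }

    /* User side: fork() is just a message to mm. */
    int user_fork(int pid)
    {
        message m = { FORK, pid, 0 };
        sendrec(MM, &m);
        return m.m_child;
    }

    /* mm side: it cannot finish the fork alone; fs must duplicate the
     * child's open-file state in its own tables, so mm messages fs too. */
    int mm_do_fork(message *req)
    {
        message to_fs = *req;
        to_fs.m_child = 42;     /* child slot allocated in mm's tables */
        sendrec(FS, &to_fs);
        return to_fs.m_child;
    }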

Another example is that if mm or fs die, your OS dies. You can't restart one or the other because there's so much process state spread across the two daemons. So the claim that microkernels are more resilient because you can restart daemons seems to be nonsense (but I should say that QNX can apparently restart some(?) components transparently).

Nevertheless it's not all roses for monolithic kernels either. There's no process protection and they're usually written in deeply unsafe languages like C. Exokernels might be the answer to this because they have monolithic qualities (fast calls and shared state) but keep virtually everything running in userspace so you can use sane, safe programming techniques.


This shows nothing more than a badly implemented API. POSIX is a bad API for anything modern (distributed).


Since the '80s, operating systems have been a commodity, and in a commodity market any elegant or premium product ends up marginal.

People want their operating systems to manage disks, processes, CPUs, network, and peripheral hardware: unless the operating system totally fails at these basic tasks, nobody will pay any attention to how the kernel was implemented. There are some folks who are interested in performance, and some who are interested in stability, but virtually nobody who is, as a user, interested in elegance.

If you just buy a car to get from point A to point B, do you care whether the engine has a carburetor and a purely mechanical ignition system, or whether it comes with an engine control unit that electronically keeps its high-pressure fuel injection and computer-controlled ignition in sync, holding the engine at its optimal parameters at all times and avoiding knocking?

Given the nature of operating systems, and of markets in general, it's really great that people like Linus Torvalds and his fellow gurus keep making their kernel better and better. For most people, it sounds like really gritty, mundane work to do.


It's because Richard Gabriel was right.

Monolithic kernels are good enough, perform well enough and are available right now with a large enough body of software.

Being Unix-ish is a great thing as it allows you to be creative with the implementation while presenting a familiar API to the applications. Netscape ran on about 30 different platforms (27 of them more or less identical under the hood because they were Unix ports).

I see a bright future for microkernels and other architectures that can provide a Unix-like appearance to programs, but only after we get rid of Windows.


The OS X version of Mach has subsumed so much back into the kernel that it can barely be considered a microkernel any more.

The thing is that for most systems, microkernels are a performance liability (it's possible to make them perform well, but it's very hard to do while keeping memory protection), and with lots of hardware they tend not to be a big safety win - if you can wedge the hardware with bad commands, it doesn't matter whether those commands originated in userspace or kernelspace.


This happened to the Windows NT microkernel too - they just kept shoving more and more stuff back into kernelspace for performance reasons, starting with graphics drivers in NT4 and going from there. I think they may have yo-yo'd on graphics drivers later for stability reasons, but I've taken my eye off Windows since I stopped using it.


Unsuccessful on the desktop, that is; QNX, for instance, is a microkernel and is quite successful. On the desktop there's a lot of inertia, and you need apps (which is a chicken-and-egg thing). As wmf said, you see a lot of poor Unix layered on top of the microkernel, so there isn't even a compelling reason to use it.


I think the main reason is poor marketing. Teams built microkernels, but of course a microkernel is not useful on its own so they had to build a complete OS. But being already tired (or near graduation) from building the kernel itself, they simply ported Unix to run on top of their microkernels, which led to a slower OS with no new features. There are benefits to microkernels, but all these benefits were hidden or wasted by Unix. By this time, patience (and thus funding) for microkernels had run out and researchers had to find something else to work on.


In my undergraduate operating systems class we hacked away at an early version of Minix. The level of detailed code documentation always just blew me away.


Absolutely fascinating lecture. Very much worth watching.


His book, "Modern Operating Systems", is a fascinating read for any geek. He discusses MINIX a fair bit (unsurprisingly!)


Or, just get his "Operating Systems: Design and Implementation", which is a great read, and contains the full MINIX3 source code (printed and on CD).


Wow... I remember having read that in '88 or '89... It's a bit sad that MINIX 1 would still be considered pretty modern.


After watching the lecture, just download Minix, set it up in VMware/VirtualBox/Bochs and hack something in it. Really, do it. It's an amazingly cool OS and fun to play with.



