First we run everything as processes on top of an OS kernel. On UNIX, we have all of these high-level concepts of letting processes interact with each other. Files, pipes, streaming/datagram UNIX sockets, etc.
Later on certain people started to see that these high-level concepts are bad. Operating systems are insecure. Context switching has overhead. Page table handling is expensive. So unikernels are invented.
People later discover that unikernels are somewhat hard to work with, as the only way they can talk to the outside world is through virtio-like devices:
- Unikernels aren't capable of just writing something to a simple file. No, they must write to a raw block device. Hey, that's annoying. Let's bring back support for native file access using virtio-fs! https://www.qemu.org/2020/11/04/osv-virtio-fs/
- Unikernels are also not capable of simply streaming data to another application. No, they must include their own Ethernet/IP/TCP stack, and on the host system you must set up a network bridge (hopefully with a firewall) just to let a couple of unikernels talk to each other. So let's solve that using AF_VSOCK!
At what point do unikernels become indistinguishable from ordinary processes running on top of an OS kernel in terms of features/behaviour, but reinvented poorly? I have the feeling that we're coming full circle at this point.
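For concreteness, here is a minimal Python sketch of the process-side baseline the bullets above compare against: two endpoints streaming bytes over a connected UNIX socket pair, with no network stack, bridge, or firewall involved. The function name is made up for illustration, and the payload is assumed to be small enough to fit in the socket buffer, since both ends run in one thread. Python on Linux also exposes socket.AF_VSOCK (addresses are (CID, port) pairs), but that needs a hypervisor-backed guest, so it is not shown here.

```python
import socket


def stream_between_endpoints(payload: bytes) -> bytes:
    """Send payload over a connected UNIX stream socket pair and return what arrives.

    Hypothetical helper for illustration; assumes payload fits in the socket buffer.
    """
    # On UNIX, socketpair() defaults to AF_UNIX + SOCK_STREAM.
    left, right = socket.socketpair()
    try:
        left.sendall(payload)
        left.shutdown(socket.SHUT_WR)  # tell the reader the stream has ended

        chunks = []
        while True:
            chunk = right.recv(4096)
            if not chunk:  # empty read means end-of-stream
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        left.close()
        right.close()


if __name__ == "__main__":
    print(stream_between_endpoints(b"hello from one endpoint"))
```

In a real program the two ends would usually live in separate processes (for example after a fork), but the kernel-provided plumbing is the same.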
The general thrust of your argument is correct, but you missed a key part of the evolution.
People see that the operating system is insecure, so they run the operating system as a virtual machine on top of a hypervisor. The hypervisor operates the system as a bundle of virtual machines and provides shared services to them. People realize the hypervisor is just an operating system, so they target the hypervisor API instead of the kernel API, and thus unikernels are invented.
The question then becomes: Why a unikernel on a hypervisor instead of a process on an operating system?
The standard answer is that hypervisors are more "secure". However, this is total nonsense. Any technique that made a hypervisor "secure" could be directly applied back to a normal operating system as they are solving the same fundamental problem of multiplexing a single machine to run multiple programs.
To quote Theo de Raadt: "You are absolutely deluded, if not stupid, if you think that a worldwide collection of software engineers who can't write operating systems or applications without security holes, can then turn around and suddenly write virtualization layers without security holes."
The actual answer is that everybody sucks at multiplexing hardware, so everybody knows the old stuff does not work. However, they do not yet know that the new stuff does not work either, so they bet it all on the new stuff. Unikernels on a hypervisor are a strictly inferior solution to a process on an operating system, but people think hypervisors are better than operating systems, so they develop unikernels. The truth of the matter is that you just need a better operating system so you can develop processes, which is a strictly better model.
I'm not necessarily saying virtual machines are more secure, but here's where I think the difference might be: the API.
Linux has so many system calls, and with each new one you get a whole new attack surface (e.g. most recently with io_uring).
But the attack surface with hypervisors is pretty static, and perhaps can be better reasoned about.
Again, not strongly arguing which is safer, but just pointing out that there is a pretty fundamental difference between the two that could make for a difference in security.
Yes, but the API is not inherent to the hypervisor/kernel split. You can have an operating system that exposes a smaller API similar to a hypervisor's and voila, you have a better model. The only reason to prefer a hypervisor over a similar operating system is if you need to run a binary targeting a different operating system. At that point you actually do need a virtual machine and you actually do need the operating system in it. However, this is contrary to the point of a unikernel, which is to get rid of the operating system within the virtual machine.
Processes and OS services are a strictly better model than virtual machines and virtual devices when developing a program. They fulfill the same purposes, but the operating system model is less clunky, as you are "virtualizing" an ideal machine that requires no setup and ideal devices that require no setup. In contrast, virtual machines and virtual devices expose an API that includes the nitty-gritty details of actual hardware, which is just a distraction when developing a regular old application.
You actually see a funny convergence where optimized hypervisors implement specialized virtual devices that do not exist in reality and that offer streamlined interfaces. These are basically a reinvention of operating system interfaces, as the virtual machine guests (processes) consume an abstract interface that the hypervisor (operating system) maps onto the actual hardware. They also support specialized virtual devices for communicating between guests (IPC calls). Reinvention for everyone, but worse, yay.
The model was originally: OS + processes. Due to security reasons (which are wrong) it changes to: hypervisor + OSs + processes. However, the hypervisor and the OS really do the same thing; they multiplex multiple programs onto a single piece of hardware. The unikernel people recognized that and decided to collapse the OS + processes so you now just have: hypervisor + (hypercall library/OS stub + process)s. The better model is just making a better OS so you can drop the hypervisor layer and return to OS + processes. This model has better performance, easier development, more observability, and is basically just better in every way.
Unfortunately, nobody wants to do this because everybody wrongly believes that hypervisors are magically more secure than operating systems for the reasons I stated in my previous comment. This is really the only thing propping up the hypervisor and, by extension, the unikernel concept. As soon as you no longer believe they are magically more secure there are hardly any reasons to prefer a hypervisor over a properly designed operating system beyond the one use case I mentioned in a sibling comment which is running a binary-only program targeting a different operating system.
The correct place to solve these problems is at the operating system level, but that's a difficult space to break into, and so people are trying to solve adjacent "containment" problems with different mechanisms like virtualization and emulation.
The most important thing he says in the whole talk is that "Helios can run another instance of itself within itself, without virtualization".
We don't need any of this extra complexity, emulating entire machines, to be secure. We just need to build applications on top of APIs which allow for security. An application that needs a few seL4 endpoints to do its job is much easier to properly integrate into a secure system, than an application which expects the whole POSIX circus.
That's why I like the modular approach by Unikraft where the value is you can select which high level abstractions and libraries you want baked in the OS (including your application) [1].
This compares well with other unikernel designs, where the OS layer is minimal but mostly fixed.
There is a basic assumption that these applications are being deployed to a cloud such as AWS, GCP, or Azure, so your app is already using all of this regardless. You just don't manage it; the cloud vendors do.
Unikernels ask the very pointed question: if we're already being deployed as virtual machines and using these abstractions, why not get rid of everything else and pick up the security, performance, and other benefits?
It's questionable why any web service needs to write to a file anyway. Writing data to a filesystem on each web server node seems like a good way to lose data. Also, one of my pet peeves with language runtimes is that you can't easily embed them because they are full of hardcoded filesystem calls; ruby (not mruby) checks all over the Unix filesystem looking for config files and gems and stuff... ugh
Your idea of the motivation behind unikernels is wildly different from the one stated in the original paper [1] that introduced the concept. The original motivation wasn't a Unix bad, unikernels good thing. That's a different topic. Rather, it was a reaction to cloud computing.
The authors behind the paper observed that people were deploying single-purpose VMs on top of hypervisors. Those VMs included full-blown general-purpose OSes. Many of the features provided by those OSes were unnecessary for this use case. That bloat not only led to large image sizes, but also to worse security caused by an increased attack surface.
As a solution, the authors proposed a new class of OS which they named unikernels. Simply put, they're library OSes targeting hypervisors. By using a library OS, the compiler would ensure that only features that are necessary would be included in the application image. And by targeting the hypervisor, the OS would be able to rely on high level primitives provided by hypervisors such as multitasking, isolation, and device handling.
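To make the "only features that are necessary" point concrete, here is a toy Python sketch that treats image building as a reachability problem: start from what the application asks for and pull in only the components those transitively depend on. The component names and dependency table are hypothetical, and real library OSes do this at compile/link time rather than with a lookup table like this one.

```python
from collections import deque

# HYPOTHETICAL library-OS components and their direct dependencies.
COMPONENT_DEPS = {
    "http_server": ["tcp_stack", "allocator"],
    "tcp_stack": ["virtio_net", "timers"],
    "virtio_net": ["allocator"],
    "timers": [],
    "allocator": [],
    "filesystem": ["virtio_blk", "allocator"],
    "virtio_blk": ["allocator"],
    "usb_stack": [],
}


def components_to_link(roots, deps=COMPONENT_DEPS):
    """Return the set of components reachable from what the application requires."""
    needed = set()
    queue = deque(roots)
    while queue:
        name = queue.popleft()
        if name in needed:
            continue  # already scheduled for inclusion
        needed.add(name)
        queue.extend(deps[name])  # follow direct dependencies
    return needed


if __name__ == "__main__":
    # An app that only serves HTTP never pulls in the filesystem or USB code.
    print(sorted(components_to_link(["http_server"])))
```

Anything not reachable from the application (here the filesystem and USB components) simply never makes it into the image, which is where the smaller size and attack surface come from.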
So taking this into account, let's go through your claims:
> Later on certain people started to see that these high-level concepts are bad. ... So unikernels are invented.
This isn't what unikernels are about. Nothing about the concept precludes implementing high-level primitives in the OS. They just drop features that aren't required by the application, on an individual basis. It's not about throwing every Unix feature out of the window.
> Unikernels aren't capable of just writing something to a simple file. No, they must write to a raw block device.
That's not a description of unikernels. That's a description of the hypervisor's interface with guest OSes. Traditional OSes would have to deal with block devices too. Nothing about unikernels prevents them from implementing file system drivers like a traditional Unix system would.
> Unikernels are also not capable of simply streaming data to another application. No, they must include their own Ethernet/IP/TCP stack, and on the host system you must set up a network bridge (hopefully with a firewall) just to let a couple of unikernels talk to each other. So let's solve that using AF_VSOCK!
Again, you're describing hypervisors. Unikernels can implement communication mechanisms as they see fit. Unix OSes very much need to implement their own IPC and networking stack too, if they want their applications to be able to communicate. The difference is that unikernels only have to deal with virtual devices, a much narrower target than general purpose OSes. That can lead to efficiency and security benefits.
> I have the feeling that we're coming full circle at this point.
I've grown increasingly dubious of unikernels; cutting out a layer of indirection by running the target process in kernel space seemed like a good idea -- as long as you didn't need, you know, multiple processes -- but it looks like the performance gains just aren't there.
Unikernels can boot incredibly fast. Whereas a normal virtual machine can take seconds to boot, unikernels can boot in 50 ms (a figure quoted from a white paper on the nanovms website).
Considering that unikernels, by definition, don't have any userland, you really need to compare the unikernel boot time to the "time to reach init" boot time on a traditional OS -- which is under 100 ms with a Linux kernel, and down to 20 ms with FreeBSD now. A unikernel boot time of 50 ms really isn't anything special.
There are definitely other projects that are <10ms but I'm not convinced that a fast boot time is something worth spending a ton of time on. Many languages and frameworks take seconds to boot regardless (jvm, python ml frameworks, rails, etc.)
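A quick back-of-the-envelope sketch of that last point, in Python. The boot figures are the ones quoted in this thread; the 2000 ms runtime startup is purely an assumed stand-in for "takes seconds", not a measurement of any particular framework.

```python
def boot_share(boot_ms: float, runtime_start_ms: float) -> float:
    """Fraction of total cold-start time spent before the application runtime starts."""
    return boot_ms / (boot_ms + runtime_start_ms)


# Boot figures quoted in this thread (milliseconds).
BOOT_MS = {"unikernel": 50, "Linux to init": 100, "FreeBSD to init": 20}

# ASSUMPTION: 2000 ms stands in for "takes seconds"; it is not a measurement.
ASSUMED_RUNTIME_START_MS = 2000

if __name__ == "__main__":
    for name, boot_ms in BOOT_MS.items():
        share = boot_share(boot_ms, ASSUMED_RUNTIME_START_MS)
        print(f"{name}: {share:.1%} of cold start")
```

Under that assumption, all three boot times are a few percent of the total cold start, so shaving tens of milliseconds off boot barely changes what the user sees.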
I'll also say that if you are setting up your own bridge and using something like DHCP, nanos sometimes boots faster than the response can come in, so there's a flag you have to set. I think the 'fast boot time' use cases are a relatively small set.
Having said that there are many many other performance considerations besides fast boot time such as throughput, or request/second or random io read/writes/etc.
Many web languages are single-process/single thread (eg: all interpreted ones) and since that is the vast majority of our users it isn't a problem. Threads are great if you have access to them and if not you scale horizontally (just like the interpreters do today). Multiple processes as an architectural design outside of scaling (and numerous issues when it actually does come to scaling) is something the industry really needs to reckon with.
Unikernels are the emperor's new clothes: they lack the stack of services and capabilities delivered to normal system images necessary for real production use: firewalls, security, monitoring, performance measurement, backup, auditing, and fs ACLs. To skip them is to put blinders on and walk around without clothes on.
* firewalls - nanos.org is a go unikernel running on GCP and we punch holes for it to talk on https
* security - nanos supports much of the same security you'll see on linux but provides more: aslr, no stack/heap exec, rodata no exec, text no write, no null mapping; virtio-rng (for some clouds where it is supported - not everywhere it is), pledge, unveil, honestly - there's a lot here
* monitoring - plenty of apm vendors work out of the box but also things like cloudwatch, and our own custom service as well
* performance measurement - we have things like ftrace and many other tools
* backup - pretty easy to clone vms
* auditing - glad you pointed this out, as this becomes much, much easier; we actually analyzed a bunch of STIGs and measured the reduction for each STIG as compared to nanos - scroll down the page here: https://nanovms.com/security - essentially, if you are in a regulated industry like finance, health or defense, this is a major benefit
* fs ACLs - we have unveil support and many other nanos specific things
They are great for scale to zero use cases (cloud hosting serverless firecrackers). With KVM rapid booting you want to remove as much overhead as possible, it keeps things cheap. The ultimate endgame would be to have a cold-start time almost equivalent to a prewarmed in-memory image using current technology. Of course, this is just one contender among many similar tech. If WebAssembly truly takes off, then unikernels would become irrelevant.
Wasm and unikernels are more complementary than competitive. You could actually run wasm in something like wasmer or wasmtime wrapped inside Nanos wrapped inside Firecracker if you wanted to.
I think as you point out where wasm can shine are the function-as-a-service platforms, although the memory safety issues would probably need to be addressed at some point (writing to 0x0, no ro mem, etc.).
The thing is Wasm already provides sandboxing, so you don't really need any additional KVM sandboxing, unless your hostile code threat model goes beyond even what typical cloud providers face. The Linux host for most wasm environments is only a single instance, often already running on bare metal, so there is no significant benefit to stripping it down to a unikernel.
"Not all Unikernels". UKL is Linux and has a userland if you want to use it. https://github.com/unikernelLinux/ukl And it supports vsock already. I don't need to check because it's just Linux so it supports everything Linux supports.