Yeah - if you run things in the kernel you don't need to do TLB flushes or context switches - which can be very expensive. But that's all offset by the fact that wasm code inherently runs slower, because it needs bounds checks everywhere.
I suspect that programs which make a lot of kernel calls would run faster in the kernel in wasm, and programs that are more compute / memory bound will run faster as native processes. It'd be fun to play around and find where the tradeoffs lie for different applications.
Sorry for being daft, but doesn't this imply that the TLB is now shared between all processes? Which means the cost of flushing the TLB isn't eliminated so much as amortized across context switches: processes are gonna evict TLB entries from other processes at unpredictable intervals.
If all programs run in ring 0, you can just run them all in the same virtual memory space. Then you don’t need TLB flushes at all. Of course, one bad program could bring the whole system down but that’s the price you pay.
But if programs were written in wasm, that problem goes away. Wasm prevents bad memory accesses by doing memory protection in software. (Every array or pointer access is bounds checked).