
No. The providers that did so soundly used virtualization to accomplish this, and a big part of the appeal of K8s is having a much more lightweight unit of scheduling than full virtualization. gVisor is a middle ground between full virtualization and shared-kernel multitenancy (which has an abysmal security track record).



Virtualization, LXC, containers (and K8s), etc. were solutions to "secure shared environments". And they have an order of magnitude lower performance hit than gVisor does (Google 'cloudrun python startup times' if you're curious about the real impact of this stuff).

Have we proven they're not secure and safe? Have we broken out of containers yet? Heroku was running LXC for years before Docker; did they run into major security woes (actually curious)?

If "secured shared environments" is a more specific term meaning "multi user unix environment", I didn't intend to say that.

You already mentioned that my whole thread is a bit off topic for this post (and I sorta agree), but then baited me with this comment afterward. I'm happy to drop it and wait for a gVisor container runtime thread.


Containers are not compute environments, their runtimes are, and gVisor (runsc) is one implementation of that. Docker engine (~runc) is another. It has similar performance characteristics to gVisor afaict looking online (the minimum cold start times I'm seeing are 500ms, which I've beaten in gVisor), yet implements fewer security features.

If by virtualization you mean VMs, gvisor can be more performant than those based on my experience. For example, AWS claims a p0 coldstart time of ~500ms using Firecracker but I know firsthand that applications sandboxed by gvisor can be made to cold start in significantly less time (like less than half): https://catalog.workshops.aws/java-on-aws-lambda/en-US/03-sn..., and you should be able to confirm this yourself by using products that leverage Gvisor under the hood or with your own testing. I actually worked on this (using gvisor, but working on adjacent tech) for years...

> Have we broken out of containers yet?

Sure, how about https://scout.docker.com/vulnerabilities/id/CVE-2024-21626 where runc (Docker) exposed the host filesystem to containerized applications? Precisely the kind of exploit gvisor is designed to prevent.

I'll note that a lot of people are thinking about how to reduce sandbox overhead in multitenant PaaS and it's one of the things I want to eventually address in my own startup. But I think blindly hating on gvisor because of a nebulous dislike of overhead really is misplaced without considering its alternatives.


The charts you linked in the performance guide show a 30x syscall overhead in runsc vs runc (careful, quite a few of the charts are logarithmic). That's insane! They also go on to claim a 20% TensorFlow workload difference.
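If you want to check the syscall number yourself rather than trust the charts, here is a minimal sketch, assuming Python 3 and assuming `os.getppid()` enters the kernel on every call. The function name `measure_syscall_ns` is made up for illustration, and the figure it reports includes interpreter overhead, so only the ratio between two runtimes is meaningful, not the absolute value.

```python
import os
import time


def measure_syscall_ns(iterations: int = 200_000) -> float:
    """Return the mean wall-clock cost, in nanoseconds, of one os.getppid() call."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        os.getppid()  # assumption: this is a real syscall each time, not a cached value
    elapsed = time.perf_counter_ns() - start
    return elapsed / iterations


if __name__ == "__main__":
    # Includes Python loop overhead; compare the same script across runtimes.
    print(f"{measure_syscall_ns():.0f} ns per getppid() call")
```

Run the same script in a container under runc and again under runsc and compare the two numbers. A trivial syscall like this is close to the worst case for a sandbox that intercepts syscalls, so it says little about workloads that spend most of their time in userspace.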

> I want to eventually address in my own startup.

You worked on CloudRun and their performance is dogshit. Seriously, google it; there's like 100 Stack Overflow questions on the subject. It's a common enough query that Google even suggests follow-up questions like: "Why is cloud run so slow?".

Now your answer might be "avoid syscalls", "don't do anything on the file system (oh by the way your file system is memory mapped hehe)", "interpreters can be slow to load their code, sorry", "look at these charts it's not as bad as you say", "tcp overhead is only 30%", etc. but your next set of customers won't have the same vendor lock-in you enjoyed at Google.

Then do the same query for "Digital Ocean Apps slow", also gvisor. And bam you'll have a long list of customers ready to use your better version! Perhaps Google and Digital Ocean will enlist your expertise (again).


Yes, we have proven that shared-kernel multitenancy is unsafe. The best example (though there are many) is the `waitid` LPE; nobody's container lockdown configuration was blocking `waitid`, which is what you'd have had to do to prevent container code from compromising the kernel. The list of Linux LPEs is long, and the list of syzkaller crashes longer still.


https://news.ycombinator.com/item?id=40591147

So the PaaS providers mentioned in that comment should be assumed to be compromised?


If they are using multitenant Docker / containerd containers with no additional sandboxing, then yes, it's only a matter of time and attacker interest before a cross-tenant compromise occurs.


There isn't realistic sandboxing you can do with shared-kernel multitenant general-workload runtimes. You can do shared-kernel with a language runtime, like V8 isolates. You can do it with WASM. But you can't do native binary Unix execution and count on sandboxing to fix the security issues, because there's a track record of local LPEs in benign system calls.



