It's not something I've practiced much myself, but writing a test case that reproduces the bug and then fixing the bug seems like a more reasonable form of TDD to me.
I saw it once when reporting a bug to another team I worked alongside: I was stress testing a new feature and found said bug. Instead of having to rerun the stress test, they wrote a unit test that reproduced it at a much smaller scale, which gave them a much tighter feedback loop to confirm it worked after they fixed it.
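A minimal sketch of what that pattern looks like (every name here is hypothetical, just to illustrate the workflow): write a small test that fails the same way the report does, then fix the code until it passes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical function from the bug report: when `src` exactly fills the
 * buffer, strncpy leaves it without a NUL terminator. */
static void copy_name(char *dst, size_t n, const char *src) {
    strncpy(dst, src, n);
    /* the eventual fix would go here: dst[n - 1] = '\0'; */
}

/* Regression test that reproduces the report at a tiny scale; it fails
 * before the fix and keeps passing in CI after it. */
static void test_copy_name_terminates_on_exact_fit(void) {
    char buf[4];
    copy_name(buf, sizeof buf, "abcd");
    assert(buf[3] == '\0');
}

int main(void) {
    test_copy_name_terminates_on_exact_fit();
    return 0;
}
```

The nice part is that the repro sticks around as a regression test, so the bug can't quietly come back.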
Further grist for the mill about the effectiveness of seccomp-style filtering for multitenant Docker, since it's unlikely anyone was filtering out `io_uring_setup`.
People can do whatever they want with seccomp-bpf obviously, but is it really that uncommon to use it for whitelisting? As for kernel vulnerabilities being a weakness of sandboxing in general, if anyone still doesn’t understand that by now it must be willful and I don’t know if they can be helped.
The point I'm making is that no matter how you mask off kernel attack surface, you're not super likely to have disabled io_uring. It's easy to find recent threads here with people sticking up for shared-kernel multitenant isolation.
(Be forewarned that I'm talking my book a bit here, since we have a commercial thingy built on multitenant VMM isolation).
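To make "masking it off" concrete: explicitly denying io_uring means something like the libseccomp sketch below (a hypothetical default-allow filter, shown for brevity; the opposite of the whitelisting approach discussed above). As noted upthread, it's unlikely anyone was actually shipping rules like these.

```c
/* Sketch, not a recommendation: a default-allow seccomp filter that returns
 * ENOSYS for the io_uring syscalls, using libseccomp. Compile with -lseccomp. */
#include <errno.h>
#include <seccomp.h>
#include <stdlib.h>

int main(void) {
    scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_ALLOW);  /* default: allow everything */
    if (!ctx)
        return EXIT_FAILURE;

    /* Pretend io_uring doesn't exist for this process and its children. */
    seccomp_rule_add(ctx, SCMP_ACT_ERRNO(ENOSYS), SCMP_SYS(io_uring_setup), 0);
    seccomp_rule_add(ctx, SCMP_ACT_ERRNO(ENOSYS), SCMP_SYS(io_uring_enter), 0);
    seccomp_rule_add(ctx, SCMP_ACT_ERRNO(ENOSYS), SCMP_SYS(io_uring_register), 0);

    if (seccomp_load(ctx) < 0) {
        seccomp_release(ctx);
        return EXIT_FAILURE;
    }
    seccomp_release(ctx);

    /* ... drop into the sandboxed workload here ... */
    return EXIT_SUCCESS;
}
```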
BTW, while we're on the topic: what do you think about a heavy host kernel with a guest VMM attached to the network, i.e. a hardened Firecracker with a dedicated network interface? Would you consider that 'better' than a shared kernel/OS plus namespaces? Or is it 'smallest hardened root hypervisor or no go'? Not sure I'm making sense...
The heavyweight host (which is the normal state of affairs) is problematic attack surface; moving the workload into a hardened VMM on top of it improves security regardless.
For a moment, I thought 'escalati' (in the title of the submission) was some kind of professional term that had so far evaded me. It sounds pretty elegant. But of course, the title was just cut off. Almost disappointing.
Escalati: the secretive guild of hereditary escalator engineers who maintain the escalators in the Illuminati's secret volcano lair (escalator reliability engineering is a major concern when world leaders are frequently escalating over giant cauldrons of molten lava)
What a puzzling fragment of American culture you just unearthed for us! It says s03e06, so it survived a surprisingly long while. Was this popular among the HN crowd? Was it all so absurd?
This is really a minor part of the show "Community". While Community definitely deserves a watch, it is a weird comedy about a community college; it is not focused on technology in any way.
Give the show a try, it is good! But don't expect a tech focus.
As I read it: it's a kernel UAF; memory corruption, in the context of the kernel. There's a secondary attack vector related to the refcount mishandling, where you can obtain control of file table entries after an `execve`, even if you exec a SUID binary, which is also bad.
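Not the actual io_uring code, but a toy userspace illustration of the refcount-mishandling class: an extra put drops the count to zero while another holder still has a pointer, and the later access is a use-after-free.

```c
/* Toy illustration of the bug class (not the io_uring code): a duplicated
 * "put" frees the object while another reference is still live. */
#include <stdio.h>
#include <stdlib.h>

struct obj {
    int refcount;
    int data;
};

static struct obj *obj_get(struct obj *o) { o->refcount++; return o; }

static void obj_put(struct obj *o) {
    if (--o->refcount == 0)
        free(o);
}

int main(void) {
    struct obj *o = calloc(1, sizeof *o);
    o->refcount = 1;                /* creator's reference */
    struct obj *user = obj_get(o);  /* second holder: refcount == 2 */

    obj_put(o);                     /* creator drops its reference */
    obj_put(o);                     /* BUG: duplicated put, count hits 0, object freed */

    printf("%d\n", user->data);     /* use-after-free: `user` points at freed memory */
    obj_put(user);                  /* and a second free on top of that */
    return 0;
}
```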
> The affected code was not introduced into any kernel versions shipped with Red Hat Enterprise Linux, making this vulnerability not applicable to these platforms.
It would be nice if the title mentioned what was affected, perhaps something like "CVE-2021-20226: io_uring privilege escalation via reference counting bug".
I think these articles are more aimed at the postmortem aspect: reading about a bug that happened so you can try to avoid it when you're designing a similar system. So it doesn't really matter that it affects Linux, or what io_uring is, etc. The lesson is relevant even if you use Foobian GNU/OpenBSD emulated under Windows 11 on an M2 Mac.
If you are just looking for notifications that you should patch your system, you probably want a method other than HN for that -- you will miss a lot of critical patches.
Man. I've been on the io_uring train since basically the beginning.
Since I started following it, I've seen reason after reason never to use it. Between the skewed performance tests, the dubious funding (coming from Facebook), and the several security risks (including this one), I just don't see it taking off.
Skewed performance tests? Dubious funding? I'm not sure what you mean. Can't Facebook fund any technical work?
I see some reasons not to use it (it's a Vulkan-like, low-level-only API; there are portability issues; some APIs are missing for my use cases), but 'it was funded by Facebook' and 'it has security risks'? I mean, I'm not sure there's an area of the kernel without corporate funding/support, or without past security bugs. We keep finding vulnerabilities in IPv6, SCTP... Yes, the Linux kernel dev process is lacking, but likely with different symptoms and causes?
The project has a long history of skewing performance benchmarks and making wild claims like "60% faster than epoll", and the creator gets very defensive when anyone questions it.
There are claims about SQPOLL that simply cannot be reproduced; there are several ongoing threads about it on the liburing tracker. (A minimal SQPOLL setup is sketched below, for anyone unfamiliar with it.)
There have been privilege escalation problems from the start that seem like they aren't being addressed.
It feels as though Facebook wants to have something revolutionary at the cost of quality, and they're using the Linux kernel to do it, though I realize that's my own hot take.
Again, I've been following this (and writing code for it) since I could get my hands on the dev branches. It's only really good for filesystem I/O in its current form as that's the biggest focus they have for it. They (Facebook) care less about other resource types (e.g. sockets).
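For anyone who hasn't touched it, the SQPOLL and filesystem-I/O bits above refer to usage roughly like this liburing sketch (minimal, error handling mostly elided); whether the kernel-side polling thread actually delivers the advertised wins is exactly the part I can't reproduce.

```c
/* Minimal liburing sketch: SQPOLL-mode ring plus one file read.
 * Error handling mostly elided; link with -luring. */
#include <fcntl.h>
#include <liburing.h>
#include <string.h>

int main(void) {
    struct io_uring ring;
    struct io_uring_params p;
    memset(&p, 0, sizeof p);
    p.flags = IORING_SETUP_SQPOLL;   /* kernel thread polls the submission queue... */
    p.sq_thread_idle = 2000;         /* ...and idles after 2s without submissions */
    if (io_uring_queue_init_params(8, &ring, &p) < 0)
        return 1;                    /* SQPOLL needs extra privileges on older kernels */

    /* Note: pre-5.11 kernels also require registered ("fixed") files in SQPOLL mode. */
    int fd = open("/etc/hostname", O_RDONLY);
    char buf[256];

    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
    io_uring_prep_read(sqe, fd, buf, sizeof buf, 0);
    io_uring_submit(&ring);          /* with SQPOLL this may not even enter the kernel */

    struct io_uring_cqe *cqe;
    io_uring_wait_cqe(&ring, &cqe);  /* result (bytes read or -errno) lands in cqe->res */
    io_uring_cqe_seen(&ring, cqe);

    io_uring_queue_exit(&ring);
    return 0;
}
```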
Thanks for taking the time to expand a bit more on your experience with the code and the community around it. I'll have a look.
Right now I'm more interested in the chaining aspect than in raw performance, but I know some of my high-throughput network workloads show far better latency with the complete syscall removal on recv and send, though keeping up with the completion queue is hard-ish. I still prefer DPDK for this kind of network work right now, but only because my use case is perfectly suited to it (no fragmentation, no complex protocol, constant data stream...), and DPDK ain't no party either.
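The chaining I mean is SQE linking; a rough liburing sketch (error handling omitted, file names hypothetical) of a read linked to a write, where the write only runs if the read succeeds:

```c
/* Rough sketch of SQE chaining with liburing: the write only runs if the
 * linked read completes successfully. Link with -luring. */
#include <fcntl.h>
#include <liburing.h>
#include <unistd.h>

int main(void) {
    struct io_uring ring;
    io_uring_queue_init(8, &ring, 0);

    int in = open("in.txt", O_RDONLY);
    char buf[4096];

    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
    io_uring_prep_read(sqe, in, buf, sizeof buf, 0);
    io_uring_sqe_set_flags(sqe, IOSQE_IO_LINK);      /* chain to the next SQE */

    sqe = io_uring_get_sqe(&ring);
    io_uring_prep_write(sqe, STDOUT_FILENO, buf, sizeof buf, 0);

    io_uring_submit(&ring);                          /* one submission, two linked ops */

    /* Reap both completions; if the read fails, the linked write is
     * cancelled and completes with -ECANCELED. */
    for (int i = 0; i < 2; i++) {
        struct io_uring_cqe *cqe;
        io_uring_wait_cqe(&ring, &cqe);
        io_uring_cqe_seen(&ring, cqe);
    }

    io_uring_queue_exit(&ring);
    return 0;
}
```

The known caveat is that the linked write can't see how many bytes the read actually returned, which limits what a pure chain can express without userspace stepping in between.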
Because there isn't any requirement for testing, these functions are allowed to become super complex, which makes it harder to see where errors could occur.