Where does your hypothesis come from? Any architecture diagram shows that XDP runs before every other network module. If XDP is not offloaded, the driver always runs the XDP hook before the rest of the network stack is called.
In your assumed concept, how else would offloading to a NIC that does not run a kernel work?
XDP receives packets before the network stack does, but not before the kernel; in almost all cases, it's just a hook to process packets off the DMA buffer. None of this matters, though: the kernel controls XDP, and there's nothing an XDP program can do without rendezvousing back through the kernel.
> XDP receives packets before the network stack does, but not before the kernel; in almost all cases, it's just a hook to process packets off the DMA buffer.
If the kernel really processes and parses the data packet _before_ eBPF and XDP can, then you could exploit the kernel with a single data packet. That's still the context of the discussion: the hypothetical scenario in which you have found a programming error in the kernel code that parses network packets.
Note: Parsing is not the same as copying, and I used the word parsing specifically on purpose here.
If the kernel does not process or parse the network packet other than sending the pointer to the previously copied buffer to an eBPF program - then that means a malicious packet can be blocked before anything else in the network stack is affected, right?
So, what do you think happens when I decide to write an eBPF/XDP program that blocks e.g. all TCP packets?
A) The network stack receives the packet
B) The network stack does not receive the packet
If your answer is A, we have a different definition of what you describe as the term "network stack".
To me, the network stack is everything that comes _after_ XDP passthrough. And that's outside the influence of my userspace/kernelspace program that tries to protect the system.
Also, XDP is the earliest position in the kernel architecture where I can detect/validate/block malicious network packets. Because let's be real: I am never gonna be able to get anything merged in the kernel driver code of my network cards.
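For readers following along, the "block all TCP packets" program from the question above boils down to a small verdict function. The following is a minimal userspace sketch of that logic, not a loadable XDP object: `drop_tcp_verdict` and `make_ipv4_frame` are hypothetical names, `data`/`data_end` stand in for `ctx->data` and `ctx->data_end`, and the sketch assumes untagged Ethernet carrying IPv4 only (no VLAN tags, no IPv6). The verdict constants are defined locally with the same values as `enum xdp_action`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the verdict values from enum xdp_action. */
enum { XDP_DROP = 1, XDP_PASS = 2 };

#define ETH_HLEN      14      /* Ethernet header length */
#define IP_MIN_HLEN   20      /* IPv4 header without options */
#define ETHERTYPE_IP  0x0800  /* IPv4 EtherType */
#define PROTO_TCP     6       /* IPPROTO_TCP */

/* Hypothetical stand-in for the body of an XDP program.
 * Returns XDP_DROP for IPv4/TCP frames and XDP_PASS for everything else. */
static int drop_tcp_verdict(const uint8_t *data, const uint8_t *data_end)
{
    size_t len = (size_t)(data_end - data);

    /* Check bounds before any read, the same discipline the verifier enforces. */
    if (len < ETH_HLEN + IP_MIN_HLEN)
        return XDP_PASS;

    uint16_t ethertype = (uint16_t)((data[12] << 8) | data[13]);
    if (ethertype != ETHERTYPE_IP)
        return XDP_PASS;  /* assumption: only IPv4 is inspected */

    const uint8_t *ip = data + ETH_HLEN;
    if ((ip[0] >> 4) != 4)
        return XDP_PASS;

    /* Byte 9 of the IPv4 header is the protocol field. */
    return ip[9] == PROTO_TCP ? XDP_DROP : XDP_PASS;
}

/* Builds a minimal Ethernet + IPv4 frame for trying the function out. */
static size_t make_ipv4_frame(uint8_t *buf, uint8_t proto)
{
    memset(buf, 0, ETH_HLEN + IP_MIN_HLEN);
    buf[12] = 0x08;                  /* EtherType high byte */
    buf[13] = 0x00;                  /* EtherType low byte  */
    buf[ETH_HLEN] = 0x45;            /* version 4, IHL 5    */
    buf[ETH_HLEN + 9] = proto;       /* protocol field      */
    return ETH_HLEN + IP_MIN_HLEN;
}
```

A frame that gets `XDP_DROP` here is one the rest of the stack never sees, which is answer B in the question above; a frame that gets `XDP_PASS` continues into the stack untouched.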
I feel pretty comfortable with how XDP works, since I built a CDN forwarding path on it, on multiple different drivers, which was its own special fun. No, I don't think you're right that XDP gives you a fighting chance against CPL0 implants.
Have you written any eBPF code? People who haven't might tend to think it's possible to do things solely in eBPF that are not in reality possible. Have you written an XDP program before? XDP is even more limited than eBPF generally (it has almost no helpers exposed). No, you're not going to use XDP to detect or defend against a CPL0 exploit.
That's before we get to the more fundamental issue with the strategy, which is "what network packets would you even be looking for". The ones that say "CPL0 exploit"?
(Fun fact: literally looking for a packet that says "CPL0 exploit"? Super annoying to do in eBPF. No loops!)
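To make the annoyance concrete, here is a minimal sketch of what a literal signature match has to look like when every loop needs a fixed bound. It is plain userspace C rather than eBPF; `contains_sig` and `SCAN_MAX` are hypothetical, and the cap on scanned offsets is an assumption chosen for illustration. Anything past the cap is simply never examined.

```c
#include <stddef.h>
#include <stdint.h>

#define SIG      "CPL0 exploit"
#define SIG_LEN  (sizeof(SIG) - 1)   /* length without the trailing NUL */
#define SCAN_MAX 64                  /* hypothetical cap on start offsets */

/* Returns 1 if SIG starts at one of the first SCAN_MAX offsets, else 0.
 * Both loops have compile-time bounds, so they can be fully unrolled. */
static int contains_sig(const uint8_t *data, const uint8_t *data_end)
{
    size_t len = (size_t)(data_end - data);

    for (size_t off = 0; off < SCAN_MAX; off++) {
        /* Bounds check before touching the window at this offset. */
        if (off + SIG_LEN > len)
            return 0;

        size_t i;
        for (i = 0; i < SIG_LEN; i++) {
            if (data[off + i] != (uint8_t)SIG[i])
                break;
        }
        if (i == SIG_LEN)
            return 1;
    }
    return 0;  /* not found within the scanned window */
}
```

The cost is visible in the bounds: the work is SCAN_MAX times SIG_LEN comparisons in the worst case, and a sender only has to push the string past the window to avoid the match.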
Yes. [1] I also understand its limitations, e.g. not being able to do DNS compression due to its linearity and the bpf verifier only allowing statically inlined helper functions etc.
I think in general there is a misconception about what I was talking about. Maybe I was too unclear, dunno. I am aware that kernel self-checks cannot be implemented in the kernel itself. That is what I wanted to point out in my previous comment.
I was always talking about whether or not it's possible to protect the kernel from receiving known malicious network packets that could cause an RCE. And I think it is possible.
It's not just "you can't do DNS compression", or "you probably can't do general-case string comparisons". It's much more fundamentally that anything you "detect" in eBPF code, even in the extremely rare cases where it's offloaded into the NIC chipset, has to get plugged right back into the kernel to do anything with that data. You can't write a general-purpose eBPF program; eBPF is just a telemetry and packet processing offload.
That eBPF firewall is a perfect example of what I'm talking about. It relies not just on the kernel but on a cooperating userland process to do all the "interesting" bits.