GCC eBPF for Linux port has landed

haberman · on Sept 9, 2019

I'd love to see an eBPF vs. WASM compare/contrast. These seem to be two emerging technologies for sandboxing code in non-GC-oriented runtimes. Both have compiler backends now. It would be very informative to see how their choices are similar and different. I'm also curious if there are differences in the requirements.

kbumsik · on Sept 9, 2019

They don't have any similarities at all; they have different features and different scopes. For example, you cannot even write a for loop in eBPF.

monocasa · on Sept 9, 2019

You can, it just won't be validated as the CFG read by the kernel has to be a DAG. But the compiler has no problems emitting code with arbitrary loops. cBPF is different; branch offsets are unsigned there.

I'm working on an exokernel built around an eBPF VM that'll let you use loops in certain circumstances (i.e. more or less regular threads that happen to be in kernel space and are preemptible).
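
To make the "CFG has to be a DAG" point concrete, here is a minimal sketch of a load-time check that rejects any program whose control-flow graph contains a back-edge. The instruction encoding (`Insn`, the opcode names) is a hypothetical toy, not real eBPF bytecode, and the real kernel verifier does far more than this; the only detail borrowed from eBPF is that jump offsets are relative to the following instruction.

```python
from typing import List, Tuple

# Toy instruction: (opcode, jump_offset). Hypothetical encoding for illustration.
# "ja" jumps unconditionally, "jeq" jumps conditionally, "exit" returns,
# and every other opcode simply falls through to the next instruction.
Insn = Tuple[str, int]


def successors(prog: List[Insn], pc: int) -> List[int]:
    """Return the program counters reachable in one step from pc."""
    op, off = prog[pc]
    if op == "exit":
        return []
    if op == "ja":
        return [pc + 1 + off]
    if op == "jeq":
        return [pc + 1, pc + 1 + off]
    return [pc + 1]


def cfg_is_dag(prog: List[Insn]) -> bool:
    """Iterative DFS from instruction 0; False on a back-edge or a bad target."""
    if not prog:
        return False
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    state = [WHITE] * len(prog)
    state[0] = GREY
    stack = [(0, iter(successors(prog, 0)))]
    while stack:
        pc, it = stack[-1]
        nxt = next(it, None)
        if nxt is None:
            state[pc] = BLACK
            stack.pop()
            continue
        if not 0 <= nxt < len(prog):
            return False  # falls off the end or jumps out of range
        if state[nxt] == GREY:
            return False  # back-edge: the CFG has a cycle, i.e. a loop
        if state[nxt] == WHITE:
            state[nxt] = GREY
            stack.append((nxt, iter(successors(prog, nxt))))
    return True


forward_only = [("mov", 0), ("jeq", 1), ("mov", 0), ("exit", 0)]
with_loop = [("mov", 0), ("add", 0), ("jeq", -2), ("exit", 0)]
print(cfg_is_dag(forward_only), cfg_is_dag(with_loop))
```

A compiler can happily emit `with_loop`; it is only a check like this at load time that turns it away.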

star-trek-fleet · on Sept 9, 2019

> You can, it just won't be validated as the CFG read by the kernel has to be a DAG.

Are you saying the kernel bpf verifier can be disabled when loading a probe? AFAIK, there is no such option.

TBH, I cannot understand this statement. I've written a few thousand lines of BCC C code as eBPF and have never encountered any reference to actually having loops in the code.

comex · on Sept 9, 2019

No, but you can write your own eBPF VM that doesn’t have the same restrictions.

star-trek-fleet · on Sept 9, 2019

Do you have a link to the document?

I've never heard that one can write a new eBPF VM.

monocasa · on Sept 9, 2019

What do you mean? The source is all open, and the bytecode format is well documented. There's nothing stopping anyone from writing their own with different semantics.

There's no doc on my work, as it's a private, jerk-offy personal project that I work on when I get tired of Jira tasks and process.

Edit: here's an example of one someone did in Rust: https://github.com/qmonnet/rbpf

star-trek-fleet · on Sept 9, 2019

I was talking about an eBPF VM running inside the kernel.

A userspace VM that does not have the same capabilities as the kernel one is not useful when tracing kernel internals.

monocasa · on Sept 9, 2019

You can patch your kernel, or write your own VM. BPF is ultimately just a kernel module under Linux. My VM runs in kernel space just fine as well (but is built on something that looks more like seL4 than Linux on the inside).

If you provided enough of std, or ported that VM to no_std, that Rust VM would work just fine in the kernel too.

star-trek-fleet · on Sept 9, 2019

This is an interesting option.

Although in our case, our eBPF runs in external customers' environments, and we cannot ask them to patch their kernels with our code.

monocasa · on Sept 10, 2019

I feel like we're talking on two different levels here.

It's sort of like how Oak was this neat virtual machine for running on an early-90s PDA prototype. Then the writers of that VM realized that they had written a really general-purpose VM, cleaned it up, and released the first Java.

A VM this general (talking about eBPF now) hasn't been a first-class citizen of a mainstream kernel before. The devs are taking a very cautious approach (as they should), but ultimately eBPF is way bigger than a tracing tool. I wouldn't be surprised to see nearly everything you currently do with a kernel module ultimately being allowed by eBPF too. Maybe even things like emulating other OSes' kernels as easily as you'd start another container.

star-trek-fleet · on Sept 10, 2019

Also a different time scale.

I'm focused on existing stuff that can be readily used. You are probably estimating the future, I guess?

haberman · on Sept 9, 2019

One similarity seems to be the way they both restrict user access to the stack to prevent stack corruption or overflow. The article talks about how eBPF doesn't have an explicit stack pointer and doesn't give a function access to caller or callee stack frames. I have read similar things about how WASM protects its stack, though I don't know much of the details.

kbumsik · on Sept 9, 2019

I mean, their runtime models might share similarities but eBPF is not meant to be a general purpose machine, it is kinda a kernel add-on for programmable tracing.

cthalupa · on Sept 9, 2019

>it is kinda a kernel add-on for programmable tracing.

Well, that's the thing I personally mostly use it for, via bpftrace and bcc, but that's not the only thing. It's being used for a lot of networking-related things too: XDP, Cloudflare uses it for a lot of their DDoS mitigation, etc.

strmpnk · on Sept 9, 2019

There is ongoing work to add support for counted looping (including a working patchset that is under review), but in general it's true that an eBPF program does not allow a given segment of code to have multiple entry points (a generalization of the prior rule that jumps may only jump forward in the instruction stream).

aisio · on Sept 9, 2019

For loops are now allowed with eBPF in the latest kernels

saagarjha · on Sept 10, 2019

Provably bounded loops, I’d assume?

dsww2 · on Sept 10, 2019

Loops are supported in newer kernels as long as they have an upper bound, which can be a constant or an unknown but bounded value.

devwastaken · on Sept 9, 2019

Wasm is not sandboxed. Wasm is simply another standard way of writing instructions that can be platform independent. Wasm has no standard library software for sandboxing. Sandboxing is entirely dependent upon the software implementing execution of wasm instructions. Very few do this, there are fewer from reputable sources that do it without a JavaScript engine, and none that put sandboxing first.

nynx · on Sept 9, 2019

Wasm is sandboxed. It's designed to be entirely safe to run.

mhh__ · on Sept 10, 2019

WASM implementations are sandboxed, though the sandbox can be defeated in theory: given that there were proof-of-concept Spectre attacks on JavaScript VMs, it must be possible to do the same against what is (practically) a WebAssembly frontend to the same backend. eBPF (or, to be more specific, its verifier) is instead designed to conservatively accept only programs that it can guarantee have certain semantics.

devwastaken · on Sept 9, 2019

That's entirely dependent upon the runtime. Wasm is advertised as many things it's not. The only thing it is, is a set of instructions. Here, go through the list. https://github.com/appcypher/awesome-wasm-runtimes

the_duke · on Sept 9, 2019

And all of those basic instructions only operate in a protected memory space. There is no stack to manipulate, no way to do syscalls or interact with the host environment in any way that is not mediated by the runtime.

Wasm instructions are not native instructions.

The spec [1] clearly states that WebAssembly is sandboxed.

A non-sandboxing implementation would be either non-compliant or have bugs (which admittedly they most likely do at this point, considering they are still new).

[1] https://webassembly.github.io/spec/core/intro/introduction.h...

devwastaken · on Sept 9, 2019

Which is all completely dependent upon the implementation of the executing software.

Let's take it from the beginning:

1. wasm is a set of instructions.

2. those instructions have to be turned into instructions the hardware understands to execute them.

Nowhere in here is there a requirement of sandboxing. With a sandbox, a 'protected memory space' is dependent upon the implementor. There's no such thing as a magic software sandbox that you just drop into software and congrats, secure. You implement it.

the_duke · on Sept 9, 2019

> Nowhere in here is there a requirement of sandboxing

Let me quote the WebAssembly spec [1]:

> WebAssembly provides no ambient access to the computing environment in which code is executed. Any interaction with the environment, such as I/O, access to resources, or operating system calls, can only be performed by invoking functions provided by the embedder and imported into a WebAssembly module.

[1] https://webassembly.github.io/spec/core/intro/introduction.h...
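
The quoted rule (no ambient access, everything goes through imports supplied by the embedder) can be sketched with a toy stack machine. Everything here is hypothetical: the opcodes, the `run` function, and the `log` import are made up for illustration and are not WebAssembly's actual instruction set or any real runtime's API.

```python
from typing import Callable, Dict, List, Tuple

Instr = Tuple[str, object]  # (opcode, argument); toy encoding, not real Wasm


def run(code: List[Instr], imports: Dict[str, Callable[[int], int]]) -> List[int]:
    """Execute code on a value stack; the only door to the host is `imports`."""
    stack: List[int] = []
    for op, arg in code:
        if op == "push":
            stack.append(arg)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "call_import":
            # The guest can only name an import; the embedder decides
            # whether that name exists and what it is allowed to do.
            if arg not in imports:
                raise LookupError(f"import {arg!r} was not provided by the embedder")
            stack.append(imports[arg](stack.pop()))
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack


logged: List[int] = []


def host_log(value: int) -> int:
    """Host function the embedder chooses to expose."""
    logged.append(value)
    return value


guest = [("push", 2), ("push", 3), ("add", None), ("call_import", "log")]
print(run(guest, {"log": host_log}), logged)
```

Running the same `guest` with an empty import table fails with `LookupError`, which is the sense in which the degree of access is decided by the embedder rather than by the instructions.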

wahern · on Sept 10, 2019

Those limitations are true of most virtual machines. Lua bytecode has no opcodes for doing I/O, invoking syscalls, or anything else that interacts outside the VM environment. AFAICT, Python is similar with the exception of some opcodes for printing to stdout. To interact with the outside environment you must load and invoke code from modules, so access is intrinsically limited by whatever modules are permitted to be loaded. In Lua a new VM state context has no modules loaded at all--not even for the string module; nor any ability to load modules--the C application needs to explicitly load the package module to register "require" in the environment.

devwastaken · on Sept 9, 2019

There is a separation between spec and implementation. That quote doesn't at all disagree with me. That's how wasm is meant to be, but it doesn't give any detail as to how that's achieved programmatically. You can implement all of that and still have a vulnerable runtime because of how you implemented it. You can also just put in system access as functions, as some runtimes do because they don't care about sandboxing.

Dylan16807 · on Sept 9, 2019

> You can implement all of that and still have a vulnerable runtime because of how you implemented it.

This applies to every single sandboxed language in the world.

The instructions are designed so that sandboxing the actual core is trivial, and the spec says that you have to sandbox outside of deliberate pass-throughs. That's about the best you can possibly do.

If languages can qualify as sandboxed, it sounds like WASM qualifies. (And if they can't, then we're using a broken definition of "sandboxed".)

devwastaken · on Sept 9, 2019

Languages cannot qualify as sandboxed unless that language comes with a runtime that has a sandbox. Wasm does not. There is no such thing as 'trivial to sandbox'. Wasm doesn't introduce anything new in terms of its instructions; it's bound by all the same mistakes and errors developers will make in sandboxing as there have been in the past.

Dylan16807 · on Sept 10, 2019

So even if all possible runtimes have to be sandboxed or they're not actually implementing the language, it's not possible for a language to qualify as "sandboxed"?

Then I stand by what I said before. Your definition is broken, and you're making a semantic argument rather than actually discussing eBPF and WASM.

When you see someone say "sandboxed language" read it as "language where conforming implementations are by definition sandboxed". WASM meets that definition, as far as I can tell.

When you see someone say "WASM is sandboxed" read it as "any runtime that implements the WASM spec is sandboxed".

devwastaken · on Sept 10, 2019

If wasm had a standard runtime everyone used, sure. It doesn't, runtimes are significantly fragmented. Therefore 'wasm is sandboxed' is not true, and in many cases those sandboxes are not at all being audited. It is very dangerous to make broad and demonstrably untrue statements about software security. Wasm is not a sandbox, wasm is a set of instructions. Your quality of sandbox, if at all, is up to what runtime you use. No amount of word play will change that.

Dylan16807 · on Sept 10, 2019

The defined semantics for those instructions include sandboxing. If there is no sandbox, it's not WASM. It wouldn't be implementing the instructions as described in the spec.

You can argue that a sandbox might be low quality. That's fine. But it doesn't make it non-sandboxed.

devwastaken · on Sept 10, 2019

"if there is no sandbox then it's not wasm".

You don't get to decide how an instruction set is used, and a 'low quality sandbox' is not a sandbox.

You've denied facts and continue to make both inexperienced and naive claims that are dangerous. Not entertaining it further.

Dylan16807 · on Sept 10, 2019

If someone guesses what all the instructions are supposed to do and implements the wrong semantics, they didn't actually implement the same instruction set!

When all your security issues are violations of the spec, then it is not the language in the spec that is insecure.

thayne · on Sept 9, 2019

> That's how wasm is meant to be, but doesn't give any detail as to how that's achieved programmatically

True, but a conforming implementation has at least some sandboxing, since it prevents arbitrary memory access. But the degree to which it is sandboxed depends on the functions that are exposed by the runtime.

simias · on Sept 9, 2019

I don't see where you're going with that. Doesn't that definition apply to eBPF as well? I feel like you're trying to win a purely semantic argument which doesn't really advance the discussion.

Spivak · on Sept 9, 2019

Wasm is no more or less sandboxed than any other VM.

It’s entirely up to the implementation how much access it gets to the outside world.

saagarjha · on Sept 9, 2019

Apart from the fact that it can potentially run forever, which can be unacceptable in certain contexts.

AgentME · on Sept 9, 2019

That can be addressed by pre-processing WASM like in https://github.com/ewasm/wasm-metering so it aborts if it runs for too many steps.
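
A rough sketch of the pre-processing idea, on a toy structured language rather than real WASM: a `meter` pass injects a gas charge at the start of every block, and the interpreter aborts once the budget goes negative. The opcodes, `meter`, `run`, and `OutOfGas` are all hypothetical names for this illustration and are not taken from the linked project.

```python
from typing import Dict, List, Tuple


class OutOfGas(Exception):
    """Raised when the metered program exceeds its step budget."""


class _Break(Exception):
    """Internal signal used to leave the innermost loop."""


Block = List[Tuple]  # ("add", n) | ("break_if_ge", n) | ("loop", Block) | ("gas", n)


def meter(block: Block) -> Block:
    """Return a copy of block with a ("gas", cost) charge prepended to each block."""
    cost = sum(1 for ins in block if ins[0] != "loop")
    out: Block = [("gas", cost)]
    for ins in block:
        if ins[0] == "loop":
            out.append(("loop", meter(ins[1])))  # loop bodies are charged per iteration
        else:
            out.append(ins)
    return out


def run(block: Block, state: Dict[str, int]) -> None:
    """Interpret block, mutating state["acc"] and state["gas"]."""
    for ins in block:
        op = ins[0]
        if op == "gas":
            state["gas"] -= ins[1]
            if state["gas"] < 0:
                raise OutOfGas("step budget exhausted")
        elif op == "add":
            state["acc"] += ins[1]
        elif op == "break_if_ge":
            if state["acc"] >= ins[1]:
                raise _Break
        elif op == "loop":
            try:
                while True:
                    run(ins[1], state)
            except _Break:
                pass
        else:
            raise ValueError(f"unknown opcode {op!r}")


forever = meter([("loop", [("add", 1)])])
state = {"acc": 0, "gas": 10}
try:
    run(forever, state)
except OutOfGas:
    print("aborted with acc =", state["acc"])
```

The unmetered `forever` program would never return; after the rewrite it stops once the 10-unit budget is spent, without the interpreter itself needing to know anything about time limits.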

devwastaken · on Sept 9, 2019

Instruction modification is a hack. The proper way is to control as a VM.

the_duke · on Sept 9, 2019

I'm curious where you got this information from, but you are quite misinformed.

WebAssembly is designed to be sandboxed and safe to run. Interaction with the host environment is only possible via the runtime.

Mic92 · on Sept 9, 2019

eBPF programs are supposed to finish in finite time, which is verified at load time. The kernel also provides some data structures like maps/arrays, which are of constant size as far as I can recall. Hence, unlike WASM, it is not suitable for general-purpose programs.
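
To illustrate what "constant size" means for a program's data structures, here is a small sketch of a map whose capacity is fixed when it is created. `BoundedMap` and its methods are hypothetical and only mimic the idea of a `max_entries` limit; this is not the kernel's map API.

```python
from typing import Dict, Hashable, Optional


class BoundedMap:
    """Toy key/value map with a capacity fixed at creation time."""

    def __init__(self, max_entries: int) -> None:
        if max_entries <= 0:
            raise ValueError("max_entries must be positive")
        self.max_entries = max_entries
        self._data: Dict[Hashable, int] = {}

    def update(self, key: Hashable, value: int) -> bool:
        """Insert or overwrite; refuse new keys once the map is full."""
        if key not in self._data and len(self._data) >= self.max_entries:
            return False
        self._data[key] = value
        return True

    def lookup(self, key: Hashable) -> Optional[int]:
        """Return the stored value, or None if the key is absent."""
        return self._data.get(key)

    def delete(self, key: Hashable) -> bool:
        """Remove a key; report whether it was present."""
        if key in self._data:
            del self._data[key]
            return True
        return False


counts = BoundedMap(max_entries=2)
print(counts.update("open", 1), counts.update("read", 1), counts.update("write", 1))
```

Because the capacity never grows, the memory a program can consume through the map is known up front, which is the property that makes such structures easy to reason about at load time.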

throwaway2048 · on Sept 9, 2019

eBPF is not for general-purpose computation. It mostly functions as a sophisticated query language for extracting data; it doesn't even have loops. It is not comparable to WASM.

Spivak · on Sept 9, 2019

eBPF does have loops and it is absolutely for general purpose computation.

The verifier that exists in the Linux kernel works very hard to make sure that you are not allowed to load programs with unbounded loops but you most certainly can when those restrictions are lifted.

throwaway2048 · on Sept 9, 2019

This is trivial. When people talk about eBPF they are talking about the implementation; it's like saying you could put Lisp code in C, if only you modified the C compiler.

Spivak · on Sept 11, 2019

The implementation that exists in Linux is perfectly capable of running unbounded loops. But it runs your program through a function that rejects your program if it can't prove that it terminates.

The point being that you could rip the eBPF implementation out of the kernel, remove the verify check and have a very usable VM.

Here's an implementation that exists because of the GPL: https://github.com/iovisor/ubpf.

loeg · on Sept 10, 2019

Why do you think the Linux kernel is the only eBPF implementation?

akhilcacharya · on Sept 9, 2019

eBPF is by nature not Turing complete (user kernel code has to be guaranteed to halt) but WASM doesn't need those restrictions.

dsww2 · on Sept 11, 2019

Not true, the instruction set of eBPF itself /is/ Turing complete. Linux is one implementation of it, and the verifier currently imposes restrictions in order to not destabilize the kernel (e.g. with infinite loops). But that doesn't mean it's not Turing complete. E.g. in future there could potentially also be a mode that allows running unverified eBPF programs. Think of it like kernel modules, which are also not verified for safety but can be loaded with the right permissions like CAP_SYS_MODULE. In future I can imagine a similar mode/option for eBPF as well, as long as the user has the right permissions to do so.

jblwps · on Sept 9, 2019

Not-a-kernel-dev here. Is my understanding correct, that this is a GCC port that can target eBPF? If so, is there any particular purpose in that, beyond making eBPF an easier place for people to write userspace software that would otherwise be kernel modules?

Pretty wild and cool stuff, seems like.

jabl · on Sept 9, 2019

> Is my understanding correct, that this is a GCC port that can target eBPF?

Yes. An LLVM-based eBPF target was already available, though, so this adds the option to use GCC instead. So in principle you could use GFortran to write kernel code; ... profit!

> If so, is there any particular purpose in that, beyond making eBPF an easier place for people to write userspace software that would otherwise be kernel modules?

eBPF is an in-kernel virtual machine, with JIT for popular architectures like x86-64 (maybe arm64 and ppc64le too, not sure?). So you use GCC (or LLVM) to compile code into an eBPF compatible object format, load it into the kernel (with a special syscall IIRC, or maybe it was something netlink-based?), then an in-kernel verifier checks that it doesn't do anything that isn't allowed before it's enabled.

So what can you do with it? Quite a lot, it seems (disclaimer: I haven't used it personally). The big use cases at the moment seem to be

- network filtering

- seccomp filtering (that is, check syscall arguments)

- tracing (see bcc/bpftrace) for performance analysis

simcop2387 · on Sept 9, 2019

> - seccomp filtering (that is, check syscall arguments)

Small correction there: seccomp still uses classic BPF (cBPF), not eBPF. That leaves a lot of restrictions on what it can do, and I believe the bytecode is incompatible too.

anaphor · on Sept 9, 2019

Presumably it means you can write your eBPF source code and have it get compiled through GCC (meaning you get all of its optimizations, architecture targets, etc).

sigjuice · on Sept 9, 2019

Isn’t eBPF the architecture target in this case, i.e. gcc will translate C (and/or others?) to eBPF?

anaphor · on Sept 10, 2019

eBPF is its own language as well that is (more or less) a subset of C, is my understanding, so presumably they can apply some of the same optimization / analysis to it as they would C code, right?

nrclark · on Sept 9, 2019

This is very exciting! Nice work to the team that's doing this.

I've been waiting to dive into eBPF until the tools mature a bit, so it's great to see eBPF support landing in GCC.

sevagh · on Sept 9, 2019

Good timing, as I recently began working with BPF + XDP. Here's my project: https://github.com/sevagh/ape, for those curious about what itches XDP can scratch.

joelthelion · on Sept 9, 2019

What are the applications, besides network filtering?

nrclark · on Sept 9, 2019

If your kernel is configured for it, you can use BPF to write pretty advanced seccomp filters for process sandboxing. One example could be something like: "restrict open() to only work on files in a given directory, for this process and all child processes", "only allow two write() calls ever", "disable all filesystem access after the first network packet gets received", or basically any arbitrary thing you want to do for syscall control.

cthalupa · on Sept 9, 2019

XDP and the performance tracing use cases are some of the big ones.

https://www.iovisor.org/technology/xdp

https://github.com/iovisor/bcc

https://github.com/iovisor/bpftrace