It seems worrying to me to put something so complicated in the kernel. Perhaps, ...

lgierth · on Feb 5, 2017

I'm less worried about that because it's so small. WireGuard is ~4k LOC -- thus measurably less complicated than OpenVPN (~100k LOC + OpenSSL), StrongSwan (~410k), or SoftEther (~330k).

Numbers taken from slide 5 of https://www.wireguard.io/talks/codeblue2016-slides-en.pdf

Moving to kernelspace brings it up to par with plain IP networking, minus a bit of overhead for the cryptographic operations. Userspace networking has overhead which is hard to overcome: context switching and CPU cache invalidation, copying packets between kernelspace and userspace, etc.

sargun · on Feb 5, 2017

Yeah, but BPF can operate in kernel space directly on SKBs. If you look at the XDP work, there's a lot of promise. In fact I've implemented ECC in BPF -- other than the state and negotiation components, I don't see why this can't be adapted to BPF.

tptacek · on Feb 5, 2017

You implemented an elliptic curve scalar multiplication in BPF bytecode? Why? Which curve?

sargun · on Feb 7, 2017

So, for one of the implementations, yes, we implemented Curve25519 inside of the Kernel. Basically, every node was pushed into a map with the association of IP/Port to key data.

At the reception of a packet, if we didn't have a key, we'd punt the packet to userspace, asking it to do an exchange.

For some reason that I forget, we ended up putting the entire key into each packet. I think that when the "exchange" happened in userspace, we didn't actually store the entire box and key; instead we just generated an association-specific key, and we wanted to be able to rotate this quickly. I think we concluded that the derivation was cheap enough that it'd be better to do it on every message.

What would then happen is that if we didn't have a box cached, we would calculate it, and this code was implemented as a helper function outside of the BPF itself. Then we did Poly1305 + XSalsa20 with a random nonce.

Why? Because IPSec is fucking complicated, and because I can?
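
The receive path described in this comment (look up the peer in a map, punt to userspace when no key is known, otherwise reuse or compute a cached box) can be sketched roughly as below. This is only a guess at the shape of the flow, not the original code, which was never published. The names `peer_keys`, `box_cache`, `punted`, `derive_box_key`, and `on_receive` are all hypothetical, and the derivation is a hash used as a placeholder so the sketch runs; it is not Curve25519 and not real cryptography.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

Peer = Tuple[str, int]  # (source IP, source port)

# Hypothetical stand-ins for the BPF map and the cache of derived boxes.
peer_keys: Dict[Peer, bytes] = {}
box_cache: Dict[Peer, bytes] = {}
# Packets handed to userspace because no key was known for the sender.
punted: List[Tuple[Peer, bytes]] = []


def derive_box_key(key_data: bytes) -> bytes:
    """Placeholder for the expensive per-association derivation.

    ASSUMPTION: a plain hash stands in for the Curve25519 step purely so
    this sketch executes. Do not use this as cryptography.
    """
    return hashlib.blake2b(key_data, digest_size=32).digest()


def on_receive(peer: Peer, packet: bytes) -> Optional[bytes]:
    """Return the box key to decrypt with, or None if the packet was punted."""
    key_data = peer_keys.get(peer)
    if key_data is None:
        # No key for this IP/port: ask userspace to perform an exchange.
        punted.append((peer, packet))
        return None

    box = box_cache.get(peer)
    if box is None:
        # Cache miss: do the derivation once and remember the result.
        box = derive_box_key(key_data)
        box_cache[peer] = box
    return box
```

In this sketch the first packet from an unknown peer lands in `punted`; once userspace has filled in `peer_keys`, later packets from that peer hit the cached box without repeating the derivation.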

wolf550e · on Feb 7, 2017

Has a cryptographer ever looked at the protocol you invented? Maybe doing the asymmetric crypto on every packet is just a harmless performance problem, but maybe you invented something insecure. Or is the protocol strictly from the spec (NaCl?) and just the implementation strange?

sargun · on Feb 7, 2017

No, the project never got that far. I did discuss with a security researcher though, and it was secure, but it lacked features like PFS, and the handshake protocol could leak information.

In addition, using "pure" random nonces has all sorts of interesting problems.
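
The comment does not say which problems it has in mind. One commonly discussed concern with random nonces is accidental reuse under the same key, which can be estimated with the birthday bound. The sketch below assumes the 24-byte (192-bit) XSalsa20 nonce and a single long-lived key; the function name `collision_bound` is made up for illustration. Note that a random nonce also carries no ordering, so it cannot double as a replay counter, and this calculation says nothing about that.

```python
from fractions import Fraction

NONCE_BITS = 192  # XSalsa20 uses a 24-byte nonce


def collision_bound(messages: int, nonce_bits: int = NONCE_BITS) -> Fraction:
    """Upper bound on the chance that any two random nonces collide.

    Uses the birthday (union) bound: n * (n - 1) / 2 pairs, each colliding
    with probability 1 / 2**nonce_bits.
    """
    pairs = messages * (messages - 1)
    return Fraction(pairs, 2 * 2**nonce_bits)


# Compare a 192-bit nonce with a 96-bit nonce at the same message count.
for bits in (192, 96):
    bound = collision_bound(2**32, bits)
    print(f"{bits}-bit nonce, 2**32 messages: bound = {float(bound):.3e}")
```

With a 192-bit nonce the bound stays negligible even at very large message counts, so collisions are unlikely to be the main issue for this construction; the picture is different for shorter nonces.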

zx2c4 · on Feb 5, 2017

Sidechannel free? High performance? Multi core? I'd love to see this. Sounds really interesting. What curve? Got source?

sargun · on Feb 7, 2017

Sidechannel free - Curve25519 is meant to be resistant to side channel attacks. We didn't spend much time seeing if the actual cryptographic code was secure.

Multicore - Yeah, we basically got that for free with BPF/XDP. It's kinda neat. You just load programs, and arrays get processed in parallel. We didn't have much state because we used random nonces, so there was no need for synchronization.

Unfortunately, the code will not be public because the project was closed source, it never shipped, and I'm no longer with the organization that paid for the work.

cakeface · on Feb 6, 2017

This comment seems interesting! Unfortunately I have no idea what you are saying. Can you unpack the acronyms for me?

sargun · on Feb 7, 2017

Sorry! Once you get deep enough in this tech, you seem to have more in common with an alien language than actual common English.

* BPF - Berkeley Packet Filter - BPF is a safe, in-kernel virtual machine that's meant to be able to introduce next generation functionality into the Kernel: https://lwn.net/Articles/603983/

* ECC - Elliptic Curve Cryptography - Enough said

* XDP - eXpress Data Path - This is a project atop BPF that's being merged into the Kernel to implement programmability features in the kernel, without requiring that people write crazypants code or unsafe code. It's backed and used by Facebook. If you're familiar with Intel's DPDK, it's similar.

* SKB - Socket Buffers - These are the components that make up the data you send over the wire in the Kernel. When they come out of the kernel, they have to be flattened and copied, which is expensive. This is one of the biggest problems with usermode networking.

xorcist · on Feb 5, 2017

That sounds crazy! Was it just for fun or is there a use case? What's performance like?

sargun · on Feb 7, 2017

This was actually around the time that WireGuard was initially announced. We had a product that we ship on-prem that needs end-to-end security guarantees. Unfortunately, IPSec has a bunch of issues around IKE and maintaining sanity at a medium node count.

Our implementation was not... performant, which is why we abandoned it. Our thought was that we could make it work, but instead we ended up building our own key negotiation / management code atop some other stuff we had in order to set up the IPSec SAs.

foobiekr · on Feb 5, 2017

Trireme?

sargun · on Feb 7, 2017

What?