I added some example code to the post, because, again, I kind of can't get over how easy this turns out to be. And if you follow the link into Jason's `wireguard-go` code, until you hit gVisor itself, it's not much more complicated under the hood.
Having complete control of TCP/IP in userland like this, with so little code, is so valuable I feel like there needs to be some special name for the technique.
The whole thing is kind of a vindication for Go's standard library network interface, which I have always hated.
> Having complete control of TCP/IP in userland like this, with so little code, is so valuable I feel like there needs to be some special name for the technique.
Yes! Userspace TCP/IP is how we implement a firewall for Android devices (which don't expose iptables on non-root devices but let you set up TUN interfaces via the VPN APIs). Right now, we rely on LwIP (wrapped in Go), and it has worked wonderfully well, especially since it is lightweight and single-threaded, with no locking overhead, which bodes well for battery-powered devices.
> The whole thing is kind of a vindication for Go's standard library network interface, which I have always hated.
The Fuchsia team at Google is re-implementing netstack in Rust as netstack3 (and hence you're probably right to call it "gVisor netstack"), for what I presume are performance and efficiency reasons (which are of interest to us because we develop for smartphones). Of course, flyctl doesn't need that, but since you wrote about pulling in heavy dependencies, I am interested in your take on it.
Don't want to go OT but I'm super curious what your experience developing a network application for non-root Android devices has been?
As a non-Android developer, I've been working on a project the last few months that involves running an HTTP server on the device and tunneling out so it can receive requests from the outside world, and the platform feels nerfed at every level from filesystem access to keeping your server from being battery-killed.
Android development is a bit tedious compared to iOS, since you have to support multiple API levels and account for subtleties across OEM implementations, but things have drastically improved in the last few years, especially after Oreo (Android 8).
Process reaping is also, I believe, a problem on iOS? One way to keep a process out of OutOfMemory/LowMemoryKiller's reach is to make it a foreground service (what stuff like Music Players do) and generally be very stringent with resource use. It is easy to profile for resource usage thanks to Android Studio's built-in profiler and tools like https://perfetto.dev/
Oh iOS is way worse from what I can tell. I don't consider it a viable computing platform so haven't bothered trying to make my software run there.
But Android seems to be working hard to "catch up" to iOS.
I'm mostly comparing to native Linux development. Obviously you may need to make some changes for security, but I feel like they've gone way overboard with things like forcing the storage access framework/media storage APIs, killing even foreground services (doze mode etc), and so on.
At the end of the day, if you're using software to purposefully limit what hardware is capable of, I think that's wrong. Even if you're worried about security, add a simple escape hatch for power users.
This is awesome! In the post you mention "For a couple hundred lines of code (not counting the entire user-mode Linux you’ll be pulling in from gVisor, HEY! Dependencies! What are you gonna do!) ..."
I'll note that while all of gVisor's user-mode Linux is in the same Go module, we've actually gone to decent lengths to keep the network stack logically separate from the rest of the user-mode Linux code.
To my amusement, my screen reader pronounced the project name as "deep aware", which I thought was appropriate, as in, it makes you deeply aware of your real dependencies.
I'm just a 1990s BSD sockets, write-my-own-select-loop kind of programmer; the idea of an abstract `Dial` interface always seemed like just a performative Plan-9-ism (I assume?).
> Having complete control of TCP/IP in userland like this, with so little code, is so valuable I feel like there needs to be some special name for the technique.
Many years ago, when we could take always-on desktop PCs more or less for granted, I developed a product that let the user connect back to their home PC from another PC, to stream music from home or grab a file (this was also pre-Dropbox). NAT was already ubiquitous by this point, and Windows XP SP2 (the first version with Windows Firewall) came out that year, so I knew the client couldn't just make a direct TCP connection to the user's home PC. So I did a stupid relay implementation, where both the client and the home server (that's what we actually called the tray applet on the home PC) made outgoing TCP connections to our central server, which would relay packets back and forth. If I'd had access to a TCP-in-userspace thing like the gVisor network stack, I could have run TCP end-to-end, the way it's meant to be used. It almost makes me want to reimplement that old system using Go and WireGuard, even though the functionality is basically irrelevant in today's world.
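For readers who haven't built one, the relay approach described above can be sketched in a few lines of Go. This is not the original product's code, just a rough illustration under simple assumptions: `relayPair` and `relayRoundTrip` are hypothetical names, the relay pairs the first two connections it accepts, and there is no authentication or framing. Both peers dial out to the relay, which copies bytes between them.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net"
)

// relayPair accepts exactly two inbound connections on ln and copies bytes
// between them until one direction ends. (Hypothetical helper.)
func relayPair(ln net.Listener) error {
	a, err := ln.Accept()
	if err != nil {
		return err
	}
	defer a.Close()

	b, err := ln.Accept()
	if err != nil {
		return err
	}
	defer b.Close()

	done := make(chan struct{}, 2)
	go func() { io.Copy(a, b); done <- struct{}{} }()
	go func() { io.Copy(b, a); done <- struct{}{} }()

	// Wait for the first direction to finish; the deferred Close calls
	// then unblock the copy running in the other direction.
	<-done
	return nil
}

// relayRoundTrip starts a relay on loopback, connects a "home server" and a
// "client" to it with outgoing connections, and returns the home server's
// reply as seen by the client. (Hypothetical demo function.)
func relayRoundTrip() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()
	go relayPair(ln)

	home, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer home.Close()

	client, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer client.Close()

	// The home server reads one 4-byte message and replies in upper case.
	go func() {
		buf := make([]byte, 4)
		if _, err := io.ReadFull(home, buf); err == nil {
			home.Write(bytes.ToUpper(buf))
		}
	}()

	if _, err := client.Write([]byte("ping")); err != nil {
		return "", err
	}
	reply := make([]byte, 4)
	if _, err := io.ReadFull(client, reply); err != nil {
		return "", err
	}
	return string(reply), nil
}

func main() {
	reply, err := relayRoundTrip()
	if err != nil {
		panic(err)
	}
	fmt.Println(reply)
}
```

Note that the relay terminates two separate TCP connections and splices them together, so neither endpoint's TCP sees the other. That is the limitation a userspace stack running over a tunnel would avoid.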
Because then there would be some service exposed to the Internet (not over WireGuard; if you have WireGuard, you don't need a jump box) whose job it would be to hop 6PN networks. The only thing we have in our infra now that controls access to 6PN is eBPF code; we keep the system simple so we can reason about it.
We pipe logs from our instances to users (all logs, including your app's); you can see them in `flyctl`. (Certificate issuance is also logged in our API, and these certs are very short-lived.)