Our User-Mode WireGuard Year

anonymousisme · on Feb 9, 2022

Back in the day (nearly 30 years ago) people would run a user-mode stack to obtain Internet connectivity via a (dial-up) Unix shell account. The program was "slirp" which was named after SLIP/CSLIP, but then upgraded to support PPP once that became a thing.

https://en.wikipedia.org/wiki/Slirp

jacob019 · on Feb 9, 2022

I was using SLIRP a few years ago to tunnel traffic through a cheap OpenVZ VPS (for downloading Linux ISOs). I had to manually patch the TCP window size and recompile the binaries to get decent speed, as the original code doesn't support TCP Window Size Scaling. I tried to upstream the patch to Debian but the maintainer wouldn't reply. It worked well enough but now I use wireguard.

Modified SLIRP code is also found in VirtualBox, Qemu, UML and other virtualization software, for sharing the host connection in NAT mode.

chrissnell · on Feb 10, 2022

SLIP was a problem for us when I worked at an ISP in 1995. Cheapskates would buy our barebones "shell access only" package and then run SLIP on our shell server. We eventually wrote a cron job to kill their processes when we found them. Sneaky ones just renamed the binary.

anonymousisme · on Feb 10, 2022

Heh. I used Netcom and they did not outright ban slirp, but they did have a script running that would renice (lower the priority) of user processes once they reached a certain total cpu time (two minutes). I discovered this feature shortly after they implemented it because my slirp network performance went down. Note that I am referring to cpu time and not run time. Most processes spend most of their time waiting for something, so accumulating two minutes of cpu time under slirp usually took several hours.

My solution was to write a program (I called it checkcpu) that would spawn a process (slirp) and periodically check its total cpu usage. When it hit the threshold (110 seconds), it would spawn a child and suspend the parent (seamlessly passing the current run state to the child). It worked great and they either never noticed what I was doing, or they did not care. Over time, the number of suspended parent processes would rise, but it never became a problem.
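
For illustration, here is a minimal sketch of one way such a hand-off could work on a Unix system. It is not the original checkcpu: it assumes the fork happens inside the working process itself, and the names `cpu_seconds_used`, `should_hand_off` and `hand_off` are hypothetical. It relies on the fact that a forked child starts with its CPU time counters at zero while keeping a copy of the parent's memory and open descriptors.

```python
import os
import resource
import signal

CPU_LIMIT_SECONDS = 110.0  # threshold quoted in the comment above


def cpu_seconds_used() -> float:
    """Total user + system CPU time charged to this process so far."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_utime + usage.ru_stime


def should_hand_off(cpu_seconds: float, limit: float = CPU_LIMIT_SECONDS) -> bool:
    """True once the accumulated CPU time has reached the limit."""
    return cpu_seconds >= limit


def hand_off() -> None:
    """Continue in a forked child and park the parent (assumed approach).

    The child inherits memory and file descriptors, but its CPU accounting
    restarts from zero, so a watchdog keyed on CPU time sees a fresh process.
    """
    pid = os.fork()
    if pid != 0:
        # Parent: suspend itself, keeping the accumulated CPU time with it.
        os.kill(os.getpid(), signal.SIGSTOP)
        # Only reached if something later sends SIGCONT; exit without cleanup
        # so the parent never duplicates the child's work.
        os._exit(0)
    # Child: simply returns to the caller and carries on.
```

A main loop would call `should_hand_off(cpu_seconds_used())` every so often and invoke `hand_off()` when it returns true, which also explains why suspended parents pile up over time.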

eigenrick · on Feb 10, 2022

I was that guy. I didn't even know ISPs didn't like it.

I thought that's just how you got into the interwebs without running that obnoxious, buggy faux winsock client.

croon · on Feb 10, 2022

Man I miss the old internet when things were more playful and that was the extent of harm being done. Not that I condone cheapskates stealing your bandwidth.

Maybe it's just rosy nostalgia.

chrissnell · on Feb 11, 2022

Oh, believe me, that wasn't the worst. The worst was an undergrad in a university computer lab who harnessed many dozens of workstations to DDoS us off the net when he got into an IRC argument with someone.

croon · on Feb 11, 2022

Haha, right, my nostalgia blocked off that part of it. That was definitely a thing. IRC BNC:s became mandatory to avoid the risk.

efitz · on Feb 10, 2022

Remember when security was pretty much just scanning floppies for viruses before you ran the program?

rhn_mk1 · on Feb 9, 2022

History likes to repeat itself:

https://github.com/rootless-containers/slirp4netns

tonyarkles · on Feb 10, 2022

Heh, as a middle time between the dial-up days and the WireGuard days, I used to use PPP as a poor-man's VPN. SSH to a host inside the work network and run pppd on both sides. Tada, suddenly my home computer is on the work network, NATted from the jump box.

xanaxagoras · on Feb 10, 2022

What a trip back in time. I used this and TIA [1] as a youth.

[1] https://en.wikipedia.org/wiki/The_Internet_Adapter

apitman · on Feb 9, 2022

Usermode WireGuard would be a big deal. I maintain a list[0] of tunneling solutions, and one of the only limitations of systems built on WireGuard is the requirement for admin privileges. Even with the performance hit from running outside the kernel, UDP-based tunnels have a lot of advantages for multiplexing channels. Pretty much your only mainstream options today are QUIC and WireGuard, and only QUIC is intended to run in userspace.

I'd have to dig into the details more, but something like this might allow you to implement a simple tunneling system based on WireGuard that runs in the kernel if you have the privileges, otherwise falls back to usermode and is no worse than QUIC in terms of performance. That would be awesome.

[0]: https://github.com/anderspitman/awesome-tunneling

ffk · on Feb 9, 2022

Very cool!

Found a gap, Linux Foundation's FD.io's VPP (a high performance network virtual switch) has native wireguard support as well, all in userspace. Support here means you can do full kernel bypass from the app all the way down to the NIC card (e.g. via DPDK).

https://docs.fd.io/vpp/20.09/d5/d54/wireguard_plugin_doc.htm...

I'll open a PR on this later.

wahern · on Feb 9, 2022

> Pretty much your only mainstream options today are QUIC and WireGuard, and only QUIC is intended to run in userspace.

Not sure it fits your "mainstream" qualification, but many projects ago I used Airhook to help create a userspace, application-layer('ish), multi-channel virtual network: http://airhook.ofb.net/ (https://github.com/egnor/airhook)

Airhook is a relatively low-level library that handles framing and flow control; it's not a functional solution on its own. But that seems to be what you're getting at--something you can deeply integrate into your application, not a separate service. Though, I guess containers have sort of muddied that distinction.

remram · on Feb 9, 2022

It sounds like we could have a generic userspace tool that proxies any connection to a WireGuard server. Similar to ssh -L, it would listen on a TCP/UDP port locally (or talk the SOCKS protocol) and convert that to IP packets over the WireGuard connection (using a userspace TCP or UDP implementation for that side).

It looks like Fly.io has all the bits, they just need to be packaged as a stand-alone tool rather than built into flyctl and only talk SSH.

mrkurt · on Feb 9, 2022

Tailscale will do this!

    tailscaled --tun=userspace-networking --socks5-server=localhost:1081

sa1 · on Feb 9, 2022

Sadly, on the only machine that I would have wanted this on, where I didn't have root access, this has never worked for me.

I should try to recreate the logs and issue for the tailscale folks.

anurag · on Feb 10, 2022

Our customers at Render run Tailscale in user mode every day. Here's the repo they use: https://github.com/render-examples/tailscale.

An example of using Tailscale to access VSCode in the cloud: https://render.com/blog/host-a-dev-environment-on-render-wit...

godtoldmetodoit · on Feb 9, 2022

I had issues with it as well, was following the Tailscale guide for getting userspace running in Azure App Service and I could not get it to work.

remram · on Feb 10, 2022

With any WireGuard server or a custom tailscale server/service?

rkeene2 · on Feb 10, 2022

SSH can do this without any WireGuard:

https://rkeene.org/viewer/tmp/ssh-ip-tunnel.txt.htm

remram · on Feb 10, 2022

There's nothing userspace about this, you are using a tun device with the actual kernel IP stack.

apitman · on Feb 11, 2022

One thing to watch out for with that setup is using SSH for TUN devices can suffer from TCP-over-TCP performance issues, aka "TCP meltdown", when there's packet loss. You can avoid this by using normal SSH tunnels, a la ssh -L or -R, which unpack the individual TCP streams and multiplex them over a single connection. Or if you need a more traditional VPN setup use WireGuard.

pmarreck · on Feb 9, 2022

Is https://tailscale.com/ not "usermode WireGuard"? I've been playing with it for a while now (it has a fairly generous free tier) and am quite impressed. I can access any of my LAN machines (my servers, my NAS, etc.) from anywhere that is also connected to the same network, and the names work for DNS as well.

tptacek · on Feb 9, 2022

1. Tailscale is amazing. I hate them so much. (We use Tailscale and are very happy with it.)

2. Tailscale is user-mode WireGuard.

3. "User-mode WireGuard" in the sense this post uses the term is a misnomer and refers to the fact that we run TCP/IP itself in userland (Tailscale normally runs through a tunnel device and uses your native TCP/IP stack).

4. But Tailscale also has code to do user-mode TCP/IP (they've got it running in a browser with wasm).

apitman · on Feb 9, 2022

Last I heard[0] they were experimenting but hadn't shipped it. AFAIK their client still requires root, no?

Running on wasm sounds awesome. This[1] looks like it. Do you know how they're doing the actual networking? WebRTC tunnel?

[0]: https://news.ycombinator.com/item?id=24483173

[1]: https://twitter.com/bradfitz/status/1451423386777751561?lang...

bradfitz · on Feb 9, 2022

> Last I heard[0] they were experimenting but hadn't shipped it. AFAIK their client still requires root, no?

Tailscale's gvisor/netstack-based userspace networking mode has been supported and in wide use for quite some time. It's the default on Synology DSM7, for instance.

You don't need root when you run tailscaled with `--tun=userspace-networking`.

Peers can still connect inbound to the non-root tailscaled, but to connect _out_ to other peers, you need to use tailscaled's HTTP or SOCKS5 proxy, which are also flags to tailscaled, to specify what port they listen on.
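
As a rough illustration of what a client has to send to such a SOCKS5 proxy, here is a sketch of the two messages defined by RFC 1928 for an unauthenticated CONNECT to a hostname. The function names are hypothetical, and whether tailscaled's proxy requires anything beyond the "no authentication" method is an assumption here.

```python
import struct


def socks5_greeting() -> bytes:
    """Client hello: version 5, one method offered, method 0x00 (no auth)."""
    return b"\x05\x01\x00"


def socks5_connect_request(host: str, port: int) -> bytes:
    """CONNECT request addressed by domain name (address type 0x03)."""
    name = host.encode("ascii")
    if not 0 < len(name) <= 255:
        raise ValueError("hostname must be 1..255 bytes")
    if not 0 < port < 65536:
        raise ValueError("port must be 1..65535")
    # VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3, then length-prefixed name
    # and the port in network byte order.
    header = b"\x05\x01\x00\x03"
    return header + bytes([len(name)]) + name + struct.pack("!H", port)
```

A client would write `socks5_greeting()` to the proxy socket, expect the two-byte reply `b"\x05\x00"`, then write the connect request and read the proxy's reply before using the socket as a plain TCP stream to the peer.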

apitman · on Feb 9, 2022

Thanks for the update!

Do you have any links that talk more about how the wasm stuff works? I'd love to read more about that.

tptacek · on Feb 9, 2022

Yeah, their client is always going to require privileges, because it needs to enable every other program on the system to interact directly with remote hosts transparently. User-mode TCP/IP works for us because we own the client-side program that our users run to talk to stuff on Fly.io.

amscanne · on Feb 9, 2022

I think Tailscale uses user-mode TCP/IP (also gVisor netstack) for some client devices, like iOS? But could be wrong here.

bradfitz · on Feb 9, 2022

We use it on all platforms _except_ iOS, for binary size/memory reasons.

(iOS 15 bumped the Network Extension memory limit to 50 MB, but we still need to be super trim for iOS 14's 15 MB limit)

amscanne · on Feb 10, 2022

LOL, I was precisely wrong.

Is there actually a preference for user-mode networking? I assume that’s primarily about control and flexibility?

Either way, I hope that the PacketBuffer changes can help reduce footprint after issues are shaken out.

pmarreck · on Feb 9, 2022

Fascinating!

apitman · on Feb 9, 2022

It depends on what sort of tunneling you're doing. If you just want a general-purpose private VPN, Tailscale is amazing. That list is more focused on the use case where you want to host a public server on a machine that isn't accessible to the internet (NAT, corporate firewall, etc). Think a shared Jellyfin server for your friends and family.

You can use Tailscale here but you'll need to separately run a reverse-proxy on a public machine. There are more moving pieces but if you're using Tailscale already then it's a good option.

pmarreck · on Feb 9, 2022

I wish I could run two separate Tailscale networks on a single device, one for business and one for personal (for example). Would make it tremendously more useful.

LilBytes · on Feb 9, 2022

There's an existing GitHub ask for this to be implemented. It's not terrible jumping between work and personal environments, but it would be nice if I didn't have to.

vineyardmike · on Feb 11, 2022

Is this different from sharing one device with other accounts?

chaxor · on Feb 10, 2022

Headscale seems even better! They've taken what tailscale has done and improved it even more by allowing it to be a completely self hosted and private solution.

mobilio · on Feb 10, 2022

TunSafe also runs on userspace: https://github.com/TunSafe/TunSafe

tptacek · on Feb 10, 2022

It looks like from the source code that TunSafe opens up a tunnel device, in which case it's doing TCP/IP in the kernel, not in userland.

mobilio · on Feb 10, 2022

It opens tun/tap device just to get/push packets from/to virtual interface. Using tuntap is very common for all VPNs like OpenVpn2.

There isn't kernel driver like in real WireGuard.

aszs · on Feb 10, 2022

I feel like the lede is buried here... If I'm reading correctly I can use flyctl to set up an SSH encrypted tunnel over user-space TCP/IP over wireguard encrypted IP packets over UDP over WebSockets over HTTP over TLS encrypted TCP/IP? And not even just for fun, but to solve real problems in production. I can see why you might have mixed feelings about that!

withinboredom · on Feb 10, 2022

What’s even funnier is that if I read it right, it’s ipv6 only, and WSL in Windows doesn’t support IPV6 routing over WireGuard. The kernel is missing some important CONFIG flags when it was compiled to route the packets properly.

mst · on Feb 10, 2022

It's running its own TCP stack in userspace though so the hosting kernel's routing abilities aren't an issue.

I am both impressed and horrified, and I mean that in the best possible way.

dolmen · on Feb 10, 2022

> I am both impressed and horrified, and I mean that in the best possible way.

This sentence style sounded familiar in my head. Looking at the username. Oh, of course, that's him!

mst · on Feb 10, 2022

mstcat approves of this :D

(for those who don't know, http://trout.me.uk/mstcat.jpg )

thundergolfer · on Feb 9, 2022

This post pairs nicely with Julia Evans’ post on why most people use the Linux kernel’s TCP/IP stack and why a few others would bother with a userland stack.

In her post she doesn’t mention fly.io’s motivation for doing userland TCP-IP: a nicer end user client experience.

If I read this correctly then fly.io did all this work to make their CLI user experience markedly better. That’s pretty cool. Twist yourself in interesting knots to make your user’s lives better, maybe even in ways they won’t notice!

ulzeraj · on Feb 9, 2022

I was using wireguard-go in a FreeBSD jail running on top of an APU2C2 board. Torrenting from my laptop caused wireguard-go to spike to high load, at 30-50% CPU usage. Loading wireguard-kmod on the host machine plus some devfs rules dropped the CPU load to 0s.

Not sure what happened there. The processor seems to score less than an RPi4 on Geekbench.

bityard · on Feb 9, 2022

I use one of these as a firewall (running OPNSense) and they're very nice but the CPU is indeed _slow_. It's plenty good enough for everything the firewall does but booting it up takes minutes and that's saying something for FreeBSD.

hedora · on Feb 10, 2022

Odd. I run openbsd on a similar one, and booting is reasonably fast. I even have a linux vm running on it in vmd, and haven’t noticed performance issues with that either.

ulzeraj · on Feb 10, 2022

Pure FreeBSD boots in less than a minute from the SDCard. Have you tried plugging in a serial cable to check what service is slowing down the boot?

MisterTea · on Feb 10, 2022

> The processor seems to score less than an RPi4 on Geekbench.

The apu2 is an embedded AMD quad core 1GHz SoC consuming 5W. It is not a powerful system by any means, and I'm not surprised it's rivaled by a >1GHz quad core Arm.

mastax · on Feb 10, 2022

The AMD Jaguar cores came out in 2013. On the other hand, the Cortex A53 came out in 2012 so it's still a bit embarrassing for AMD. That was before the AMD renaissance, though.

coder543 · on Feb 10, 2022

The Raspberry Pi 4 uses a quad core cluster of Cortex A72 cores, not A53. A72 was released in 2016, but the Pi 4 also has them clocked to run at 1.5GHz. Either way, I don't think it's embarrassing for AMD's 2013 Jaguar to be beaten by cores that are 3 years newer and running at a 50% higher clock speed. I thought Jaguar was pretty cool at the time that it came out, but technology has continued to move swiftly since then.

marcus_cemes · on Feb 9, 2022

Fly.io's blog posts are incredible; they really seem to enjoy what they do and want to share what they've made with everyone else. I love them for that.

I wish that more companies could be like this and skip the corporate BS, it shows that they really have something outstanding to offer.

mbesto · on Feb 9, 2022

> I wish that more companies could be like this and skip the corporate BS, it shows that they really have something outstanding to offer.

The nature of the blog typically caters to the intended audience.

The CIO of Disney doesn't give a sh*t if the protocol is called WireGuard or OpenVPN or that if it uses AES-256 encryption - he/she wants someone to tell them that their developers are securely accessing their infrastructure. Full stop. If/when Fly gets to that level (let's say $500M in revenue) their blog tone will likely change - their audience is almost primarily developers and startup CTOs...for now.

mrkurt · on Feb 9, 2022

For better or worse, I can guarantee you that we won't ever write articles for the Disney CIO. Unless I get fired.

Whitepapers. They want whitepapers and magic quadrants.

xarope · on Feb 10, 2022

White papers are to CxOs what TED talks are for the uninformed, an easy way to get someone up to speed on a highly complicated subject, and more fool them (meaning the uninformed) if they think it means they are now an expert, which sadly a lot do (think they are now an expert).

I kind of miss the IBM ITSO Redbooks. No idea if they still maintain the same quality today, but in the 90s and pre-internet/google/wiki etc, they were fonts of deep knowledge.

mbesto · on Feb 9, 2022

> we won't ever write articles

> Whitepapers.

I probably should have said "content" instead of blog because I agree 100% with this. Point still stands.

What the OP was referring to will likely become part of an engineering blog. a la : https://codeascraft.com/

dangerboysteve · on Feb 10, 2022

Correction, magic kingdom quadrants.

tptacek · on Feb 9, 2022

So Eden sank to grief, so dawn goes down to day.

samwillis · on Feb 9, 2022

I think their super power here is employing a renowned security expert who is an incredibly good communicator!

(And happens to be a top HN contributor)

philosopher1234 · on Feb 9, 2022

Nit: THE top HN contributor

purplerabbit · on Feb 9, 2022

my goodness, you're right... https://news.ycombinator.com/leaders (edit: by a factor of ~2, no less!)

rcoveson · on Feb 10, 2022

I like how he decided he should only copyright his comments from years 2010 + Fn.

myth_drannon · on Feb 9, 2022

And they also employ Phoenix framework creator.

Karrot_Kream · on Feb 9, 2022

I'm pretty sure their experience running an ISP helps too heh.

tptacek · on Feb 9, 2022

It's definitely all my talent that keeps this place running. I'm definitely not just a noisy message board guy who got hired after most of this infrastructure was built and deployed and then just proceeded to make a bunch of message board noise about it.

swyx · on Feb 9, 2022

see, this is why the people love reading what you write. keep giving credit but also having fun!

dan-robertson · on Feb 9, 2022

A lot of (small–medium sized, tech) companies just don’t have a process to get things out on their blog like this. It might be that only a few senior people have the ability to write posts and they are not interested or busy with other things, or it might be that there is a slow review process for posts that makes writing them unpleasant, or it might be that they don’t want to reveal IP or have an opinion and so have little to talk about. Another company that does a good job of writing blog posts, often timely posts about current (and relevant) events, is Cloudflare, though their posts have a quite different energy to Fly.io’s.

ignoramous · on Feb 10, 2022

Yep. Read also: https://danluu.com/corp-eng-blogs/

Bayart · on Feb 9, 2022

I think it's the only corporate blog I know of that's on my much-read list (i.e. every post goes to the top of the pile).

jcul · on Feb 9, 2022

They really are. Feels like working there would be really fun.

tptacek · on Feb 9, 2022

Because 'sho_hn brought this up, here's a stab at a pro/con list of building TCP/IP directly into our API the way `flyctl` does:

Pro:

+ Can just run "native" SSH directly over it (or, in our case, use x/crypto/ssh, without modification).

+ Lets `flyctl` offer a `flyctl proxy` command to users, so they can plug their own programs into whatever application they need to use, without asking us to change some proxy we run in our infrastructure.

+ Offers a single security and access control model (IPv6 private networking), rather than something we have to think about on a per-app basis.

+ In theory, we get all this right and never have to think about another network protocol in our infrastructure.

+ Allows existing network management tools and libraries to function directly with Fly.io infrastructure.

+ With the WebSockets gateway, we can do all of this stuff directly in browsers as well; that is, we can present TCP/IP as an API to browser Javascript to do UI stuff (and we're doing more and more UI stuff these days, in Elixir.)

+ Puts more IPv6 in the world.

+ Get to talk to Jason Donenfeld more.

+ Get to write blog posts like this.

Con:

- Spends one (maybe multiple) innovation tokens or whatever you want to call them.

- Way more things can go wrong; relies on state synchronization and on a clear network path between our users and our gateways. Right now, we have to care whether you can speak 51820/udp.

- User-mode TCP/IP via Netstack is probably significantly slower than a simple TCP proxy would be.

- Required `flyctl` to run a background agent process to manage multiple connections through WireGuard.

- The agent process adds to the list of things that can go wrong (hopefully we've ironed most of them out now).

I can probably come up with more cons.

...

It's buried in the middle of the post but I want to say it again because I think it's important: this sort of started out as a stunt; it's what I put together to allow people to SSH into their instances without having to install WireGuard locally, and that's all it was. I don't have to write a soul-searching pro/con list on stunts I use to give people SSH access, because lots of providers have super janky "pop a private terminal" setups. But all this stuff took on greater importance when we used it to run Docker for our remote builders.

I like the approach we're taking a lot! I don't... regret it? I don't think? I think I'm happy with it. But it's complicated.

ericpauley · on Feb 9, 2022

We recently moved our entire app deployment over to Fly and are mostly loving it, but one of the mildly janky features is hallpass. For instance, (1) connections often fail if you have X forwarding enabled (even if you did no special config on the machine), and (2) port forwarding doesn't work. While these aren't really a big deal since (1) you can just disable X forwarding in ssh_config and (2) port forwarding is unnecessary if you can tunnel in via Wireguard, it makes me wonder why a native SSH server isn't used with a small script to manage the required config changes.

While on the topic, we're also eagerly awaiting improved autoscaling (e.g., more responsive, using additional metrics, and scaling down properly). I'd be really curious if you could leverage the more detailed access to instance-level metrics to implement some cool new queue-theoretic modeling: You know roughly how long it takes for an app to launch, you know the current request rate, and you know the time to service requests. You could apply a lightweight Markov model to predict the probability of a given queueing delay in each region within the average launch time and, if so, preemptively launch a new instance before queueing delays even occur. This could be configured to balance a client's tolerance for queueing delays with over-provisioning budget.
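
To make the idea concrete, here is a sketch using the textbook M/M/c queue (Erlang C), which is one simple Markov model and not anything Fly.io is known to use. It assumes Poisson arrivals and exponentially distributed service times, looks only at steady state, and ignores launch time; all function and parameter names are hypothetical.

```python
import math


def erlang_c(arrival_rate: float, service_rate: float, servers: int) -> float:
    """Probability that an arriving request has to queue in an M/M/c system."""
    offered = arrival_rate / service_rate  # offered load in erlangs
    rho = offered / servers                # per-instance utilization
    if rho >= 1.0:
        return 1.0                         # overloaded: everything queues
    top = offered ** servers / math.factorial(servers) / (1.0 - rho)
    bottom = sum(offered ** k / math.factorial(k) for k in range(servers)) + top
    return top / bottom


def prob_delay_exceeds(arrival_rate: float, service_rate: float,
                       servers: int, delay: float) -> float:
    """P(queueing delay > delay) for the same M/M/c model."""
    if arrival_rate >= servers * service_rate:
        return 1.0
    p_wait = erlang_c(arrival_rate, service_rate, servers)
    return p_wait * math.exp(-(servers * service_rate - arrival_rate) * delay)


def instances_needed(arrival_rate: float, service_rate: float, delay: float,
                     tolerance: float, max_servers: int = 1000) -> int:
    """Smallest instance count keeping P(delay exceeded) within tolerance."""
    for servers in range(1, max_servers + 1):
        if prob_delay_exceeds(arrival_rate, service_rate, servers, delay) <= tolerance:
            return servers
    raise ValueError("no instance count up to max_servers meets the tolerance")
```

Feeding in a forecast request rate for the moment a new instance would finish launching, rather than the current rate, is one way to approximate the "preemptive" part; `tolerance` is the knob that trades queueing delay against over-provisioning.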

michaeldwan · on Feb 10, 2022

As for autoscaling, our hands are tied as long as we're running on Nomad. Right now our autoscaler is nothing more than some ruby that loops over data from prometheus and changes counts in Nomad. It's slow and buggy, but worse we don't have control over where Nomad places VMs or which ones it stops when scaling down.

We're working on a replacement for Nomad (called flyd) that gives us full control over VMs. Once apps are running on that we can do a lot of cool things. Better autoscaling is one, but I'm really excited about suspending idle VMs that our proxy wakes up on demand. That'll cover most use cases without forcing customers to worry about counts or blowing through a budget.

atonse · on Feb 10, 2022

I’d love to hear more about this move away from Nomad.

We haven’t had too good a time with nomad, but not sure if it’s just our limited understanding. It doesn’t help that there are very few people out there that know it.

michaeldwan · on Feb 10, 2022

We'll write about it when the time comes. To be fair, Nomad and Consul have served us well. Most of our troubles stem from abusing them in ways they weren't designed to handle.

atonse · on Feb 10, 2022

That doesn’t surprise me. I think our biggest issue is not knowing it well enough and not being able to find people that know it.

At that point it doesn’t matter how good the tech is.

tptacek · on Feb 9, 2022

Hallpass is a truly trivial piece of code --- it might be less than 400 lines all in. All it really does is run certificate authentication off of a root cert we store in `_orgcert.internal` in DNS.

If you like, you can roll a Dockerfile that runs OpenSSH directly on your internal network address (bind it to `fly-local-6pn`), and then use native WireGuard to talk to it.

I've got a branch on hallpass that does port forwarding, but I never merge it, because you're right: using port forwarding on Fly.io is weird, because we already provide you direct access to any port you're exposing, and you can't talk to any of this stuff without WireGuard already. I think it would just confuse people more if I made port forwarding work.

alilleybrinker · on Feb 9, 2022

Thanks for the pro-con list!

The experience of...

1) build a thing because it's immediately useful for a specific use-case, 2) someone reuses it for another use-case because it's already there and saves some work, 3) time passes, 4) oops, really important stuff now relies on this thing in ways that weren't originally intended

... seems like a common pattern (see: JWT succeeding as the format for interoperable tokens by dint of just being around).

In this case, it seems like the pros are basically user-centered pros (`flyctl proxy`, existing tool interop, etc.) and the cons are basically fly-centered cons (state synchronization, maintaining the agent and making it work right).

The cons that do affect users (slowness, maybe they can't speak 51820/udp) seem _annoying_ but not deal-breaking for a lot of use cases. If the slowness persists over a long time it will be interesting to see how users opt to route around it (architect applications / processes to not rely on this channel).

apitman · on Feb 9, 2022

Question: ultimately all the packets are actually being sent via a Golang net.UDPConn right? ie you're simulating raw network packets by wrapping them in UDP packets, then running TCP in Golang over those wrapped packets?

tptacek · on Feb 9, 2022

That is effectively what we're doing, yeah.
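
A toy sketch of that layering, for readers who want to see the shape of it: an inner IPv6 packet is built by hand and carried as the payload of an ordinary UDP datagram over loopback. This leaves out everything WireGuard actually adds (handshake, encryption, its own message framing), the `fdaa::` addresses are placeholders, and the function names are hypothetical.

```python
import ipaddress
import socket
import struct

# Fixed IPv6 header: version/class/flow word, payload length, next header,
# hop limit, source address, destination address (40 bytes in total).
IPV6_HEADER = struct.Struct("!IHBB16s16s")


def build_ipv6_packet(src: str, dst: str, next_header: int,
                      payload: bytes, hop_limit: int = 64) -> bytes:
    """Prefix payload with a minimal IPv6 header (no extension headers)."""
    first_word = 6 << 28  # version 6, traffic class 0, flow label 0
    header = IPV6_HEADER.pack(first_word, len(payload), next_header, hop_limit,
                              ipaddress.IPv6Address(src).packed,
                              ipaddress.IPv6Address(dst).packed)
    return header + payload


def parse_ipv6_packet(packet: bytes):
    """Return (src, dst, next_header, payload) from a packet built above."""
    first_word, length, next_header, _hop, src, dst = IPV6_HEADER.unpack_from(packet)
    if first_word >> 28 != 6:
        raise ValueError("not an IPv6 packet")
    payload = packet[IPV6_HEADER.size:IPV6_HEADER.size + length]
    return (str(ipaddress.IPv6Address(src)), str(ipaddress.IPv6Address(dst)),
            next_header, payload)


def send_through_udp(packet: bytes) -> bytes:
    """Carry one inner packet across loopback inside a UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))   # let the OS pick a free port
        receiver.settimeout(2.0)
        sender.sendto(packet, receiver.getsockname())
        data, _addr = receiver.recvfrom(65535)
        return data
```

In the real setup the inner payload would be a TCP segment produced by a userspace stack (next header 6), and the receiving side would hand the unwrapped packet to its own stack instead of just parsing it.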

apitman · on Feb 9, 2022

Cool, thanks!

dolmen · on Feb 15, 2022

This is an answer to this question by sho_hn: https://news.ycombinator.com/item?id=30276877

tedunangst · on Feb 9, 2022

How complete is the ssh implementation? I'm thinking I probably want to at least run git/hg push, and maybe even do port forwarding.

tptacek · on Feb 9, 2022

Not very. You can scp and rsync over it. You can run with or without a pty. That's pretty much it. It should work with git!

You probably shouldn't do port forwarding on Fly.io; if you're running into an actual need for that, we should talk about extending our network access control model.

tedunangst · on Feb 9, 2022

Perhaps atypical, but about 50% of my ssh use is port forwarding to construct impoverished man's VPNs. Like I send mail by forwarding localhost:25 to localhost:25 on the mail server.

If I were running PoE (Postgres on Edge) I'd probably want to connect a local client for poking around, but without the bother of meshing my laptop into the cloud.

mrkurt · on Feb 9, 2022

Most port forwarding you need to connect to Fly apps is baked in. Here's how to get at a remote postgres:

    $ flyctl proxy 15432:5432 -s -a fizz-db
    ? Select instance:  [Use arrows to move, type to filter]
    > gru.fizz-db.internal
      iad.fizz-db.internal
      lax.fizz-db.internal
      lhr.fizz-db.internal
      ord (fdaa:0:446b:a7b:20db:0:77a5:2)
      ord (fdaa:0:446b:a7b:20dc:0:784c:2)
      yyz.fizz-db.internal

That forwards whichever you select to local port 15432.
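
For comparison, the general shape of a local port forwarder can be sketched in a few lines. This is only an illustration using the operating system's ordinary sockets; as described elsewhere in the thread, flyctl dials the remote side through its userspace TCP/IP stack over WireGuard, which this sketch does not attempt. The names `pipe` and `forward` are hypothetical.

```python
import asyncio


async def pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy bytes one way until EOF, then close the destination."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    except ConnectionError:
        pass  # peer went away mid-copy; fall through to close
    finally:
        writer.close()


async def forward(local_port: int, remote_host: str, remote_port: int):
    """Listen on localhost:local_port and relay each connection to the remote."""

    async def handle(client_reader, client_writer):
        try:
            remote_reader, remote_writer = await asyncio.open_connection(
                remote_host, remote_port)
        except OSError:
            client_writer.close()  # remote unreachable: drop the client
            return
        # Relay both directions concurrently until each side reaches EOF.
        await asyncio.gather(pipe(client_reader, remote_writer),
                             pipe(remote_reader, client_writer))

    return await asyncio.start_server(handle, "127.0.0.1", local_port)
```

Calling `await forward(15432, host, 5432)` inside a running event loop returns the listening server; a local client such as psql would then connect to port 15432 as in the example above.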

Karrot_Kream · on Feb 10, 2022

Awesome, this is a killer promo for Fly btw :D

tedunangst · on Feb 10, 2022

sho_hn · on Feb 19, 2022

Thanks!

mwcampbell · on Feb 9, 2022

> being able to pop a shell on a running app was table-stakes for the platform.

Tangent: that's debatable IMO. In my company's current AWS infrastructure, there's no shell access to either the production containers or the host machines. I did write a script to create an ephemeral container that lets me (and future staff) run a shell inside the production network. And the thing I usually do in that shell is run psql; I suppose that's not ideal for auditability. But still, I can't poke around in the live containers or the host machines; in theory they could be distroless, with no shell at all. I'm trying to take immutable infrastructure to the max here; this seems like a good thing for security. It does mean that for debugging production problems, I can only go by what I find in logs and the database. But that has been acceptable so far.

Edit: Given tptacek's security background, I was surprised that he considered production shell access essential for a new app platform.

tptacek · on Feb 9, 2022

It's funny you bring this up, because I had the same thought when I was first implementing SSH (the initial implementation relied on native WireGuard, and just plugged a client certificate into your running SSH agent). I thought people might not want to enable SSH access, and so I made the provisioning of the root SSH certificate optional: you have to run `flyctl ssh establish` to tell us to set up a root cert for your organization.

It's turning out to have been a misfeature that confuses people more than it helps anyone, and we may get to a place soon where we just automatically provision a root cert for new organizations.

mwcampbell · on Feb 9, 2022

So do you think I'm being too extreme on this? Or did you just implement SSH because customers want it?

tptacek · on Feb 9, 2022

I think it's sensible to run an application fleet without SSH access, but it's tough for a hosting provider that has to support lots of different application fleets to not offer a way to get a shell. Our authorization systems are about to get sharply more interesting as we roll out Macaroon-style tokens this quarter, so I'm optimistic we'll get to a place where we make both styles of application owners happy.

I was more on your side when I wrote the feature, and I'm less on your side now. Also, I SSH into instances to debug things all the time. :)

ignoramous · on Feb 10, 2022

Might be noob question (ahead of time, even), why/when would one prefer authz/acl with Macaroons over Zanzibar? All I could gather from summaries is that, for Cloud, Macaroons are better suited since 'decentralised'; whereas, Zanzibar appears to favour consistency with centralised architecture capable of expressing relationship graphs and permission inheritance.

10000truths · on Feb 9, 2022

I think Linode’s approach to this is best: they offer a “virtual console” that basically consists of an SSH gateway that pipes to your VPS’s virtual serial port.

remram · on Feb 9, 2022

It's useful when developing. Not being able to shell into the production system is fine if you can shell into a staging/development system.

If you are running Docker containers and you can shell into local containers, that is usually "close enough" that you can do useful troubleshooting. But fly.io (and Cloudflare Workers, etc.) are different enough from off-the-shelf containers that it is very important to be able to poke at containers when they break, even if they are not the actual production containers.

FujiApple · on Feb 10, 2022

> In my company's current AWS infrastructure, there's no shell access to either the production containers or the host machines. I did write a script to create an ephemeral container that lets me (and future staff) run a shell inside the production network.

I'm a fan of this approach; it is hard to improve upon the security of a server that doesn't exist.

I recently set up an AWS serverless (mainly ECS Fargate) stack for a project and took this approach of spinning up an ephemeral EC2 server as part of a "breakglass" runbook for the rare cases where such access is needed.

I found this to work well when combined with Tailscale userspace networking [0] (so the "breakglass" servers can run on a private subnet and are never exposed to the internet), Pulumi [1] (for managing the lifecycle of the "breakglass" instances) and YubiKey-based MFA for short-lived credentials [2] (required to spin up the server via Pulumi).

This approach is also useful for ensuring that whenever a "breakglass" server is started it is using the latest AMI version (Pulumi's `aws.ec2.get_ami_output()` is useful for this) and runs the usual security updates on startup. The ssh keypair for the server can also be created (and later destroyed) on the fly so there is no need to manage any long term ssh credentials.

[0] https://tailscale.com/kb/1113/aws-lambda/

[1] https://www.pulumi.com/

[2] https://aws.amazon.com/blogs/security/enhance-programmatic-a...

lyeager · on Feb 9, 2022

> The Consul cluster would hold an Entmoot

Hah! That one got me. They know their audience :)

jonathanoliver · on Feb 9, 2022

This one made me laugh out loud and I had to come to HN to post this. You beat me to it.

fuzzybear3965 · on Feb 10, 2022

Help me out. This threw me for a loop and Google wasn't helpful.

zellyn · on Feb 10, 2022

Weird. For me, every Google result for [entmoot] is appropriate and basically gives the answer to your question I would have given. It's a Lord of the Rings reference.

fuzzybear3965 · on Feb 10, 2022

Right - I understood that it was the walking tree from Lord of the Rings. But, what does it mean for Consul to hold a walking tree?

Is it something like: "Consul bears a heavy weight (has a lot of responsibility and so responds slowly)"?

zellyn · on Feb 11, 2022

A walking tree is an Ent. A meeting of Ents, with ponderous deliberation — ponderous on _Ent_ timescales, that is — is an Entmoot.

fuzzybear3965 · on Feb 12, 2022

Ohhhh. Their clocks run slowly. A lot of time passes between events. Totally makes sense. Thanks!

mayli · on Feb 10, 2022

I always love the style in which fly.io's blog is written, and I'm a big fan of their freemium product. I see a group of enthusiastic hackers behind the product who keep improving it in a reasonable way, or a cool way that doesn't sound boring.

dudus · on Feb 10, 2022

I like it too but I feel like they overdo it just a tad.

tptacek · on Feb 10, 2022

We have the same concern! That's the whole thrust of the post. :)

dudus · on Feb 10, 2022

Keep it up. Even when you overdo a little it's still miles ahead of BS corp-speak

tarasglek · on Feb 10, 2022

First off, you guys rule for doing this. I've been dreaming about a general-purpose userspace/unprivileged WireGuard wrapper, and this gets us closer to that.

1. Could you include more info on which userspace tcp/ip stack you use and why? I presume doing userspace UDP is relatively trivial/fast compared to what slirp had to do with tcp.

2. How does flyctl hijack syscalls across Linux and Windows? Is there some abstraction to do that? Wasn't even aware this was a pattern on Windows.

I realize I could read the code, but would appreciate some direction.

Kudos for the websocket proxy too! Would be really cool if this and the unprivileged wireguard became standard parts of wireguard toolset.

tptacek · on Feb 10, 2022

We use netstack. It's great. https://pkg.go.dev/inet.af/netstack

I could try to give you reasons we use it but the truth is that I mentioned something about wanting a user-mode TCP and Jason Donenfeld said netstack was there, and then got it to work.

We don't do any system call hijacking! Don't have to. Go abstracts Dialers and Listeners, and the WireGuard part itself is just a vanilla UDP protocol spoken over a PacketConn.
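To make the Dialer point concrete, here is a minimal sketch using only the Go standard library. The `Dialer` interface, `pipeDialer`, `readBanner` and the address `db.internal:5432` are hypothetical names invented for this illustration, and `pipeDialer` is only an in-memory stand-in for a user-mode stack such as netstack; none of this is flyctl's actual code.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
)

// Dialer is a hypothetical interface for this sketch. The standard
// *net.Dialer already satisfies it, and a user-mode TCP/IP stack can too.
type Dialer interface {
	DialContext(ctx context.Context, network, addr string) (net.Conn, error)
}

// pipeDialer is a stand-in for a user-mode stack. It hands back one end of
// an in-memory pipe and serves the other end itself, so no kernel socket
// is ever opened.
type pipeDialer struct {
	banner string
}

func (p pipeDialer) DialContext(ctx context.Context, network, addr string) (net.Conn, error) {
	client, server := net.Pipe()
	go func() {
		defer server.Close()
		fmt.Fprintf(server, "%s via %s %s", p.banner, network, addr)
	}()
	return client, nil
}

// readBanner only knows about the Dialer interface, so it cannot tell
// whether the bytes travel through the kernel or an in-process stack.
func readBanner(d Dialer, addr string) (string, error) {
	conn, err := d.DialContext(context.Background(), "tcp", addr)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	data, err := io.ReadAll(conn)
	return string(data), err
}

func main() {
	// Compile-time check that the kernel-backed dialer fits the interface.
	var _ Dialer = &net.Dialer{}

	banner, err := readBanner(pipeDialer{banner: "hello"}, "db.internal:5432")
	if err != nil {
		panic(err)
	}
	fmt.Println(banner)
}
```

The calling code never changes when the transport does, which is why no system call interception is needed: the program picks its own network stack at the library level.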

pier25 · on Feb 10, 2022

Not even 30 mins ago I set up WireGuard to connect to a PG instance on Fly.

I expected this to be a headache but it took less than 5 mins to download WG, generate the conf with the fly CLI and paste it into WG. Done.

tptacek · on Feb 10, 2022

You shouldn't have to do this unless you want to (for instance, to make a permanent WireGuard connection) --- you can just run `flyctl proxy` to set up a connection to 5432/tcp.

pier25 · on Feb 10, 2022

Really?

I followed the instructions in their docs and that's what was recommended.

https://fly.io/docs/reference/postgres/#connecting-to-postgr...

Also, wouldn't I need to install Wireguard anyway to use fly proxy?

tptacek · on Feb 10, 2022

Nope! That's what this whole post is about. :)

paradox242 · on Feb 12, 2022

As I read this, the tone and some of the topics started to ring a bell. Fly? Wireguard? I scrolled back up to the top to see who the author was and sure enough it was Thomas from the Security, Cryptography, Whatever podcast that I've been listening to for the last few months. For anyone that enjoyed this kind of content, you should also definitely check out the podcast he is a part of; the other hosts are great as well.

dolmen · on Feb 15, 2022

Thanks for the pointer to that podcast.

https://securitycryptographywhatever.buzzsprout.com/

drunner · on Feb 9, 2022

How do they manage their mesh?

I've just been doing research on setting up my own wireguard mesh (currently using a spoke/hub setup with pi-hole/pivpn).

I found https://github.com/HarvsG/WireGuardMeshes today which is awesome, but I'm curious what fly.io / other readers here may be using.

ohyeshedid · on Feb 9, 2022

I usually build my own solutions, but I've played with Netmaker and it seems solid.

https://github.com/gravitl/netmaker

ignoramous · on Feb 10, 2022

Netmaker is SSPLd. Careful using it with anything at all (or, casually recommending it!). If your personal project connects to your proprietary server, then you're in a tough spot in terms of license compliance.

symlinkk · on Feb 10, 2022

I thought wireguard-go requires the kernel tun device. See for example this Docker image for wireguard-go which maps the tun device and requires elevated permissions - https://github.com/masipcat/wireguard-go-docker

efitz · on Feb 10, 2022

TFA is surprisingly unapologetic about the Rube Goldberg SSH solution.

On the other hand fly.io looks really interesting, I want to try it out. The infrastructure described (other than the SSH hack) feels like how a modern cloud platform should be built.

renewiltord · on Feb 9, 2022

Incredible blog post. Thank you very much for sharing. Signing up now.

Also, this post makes me update my prior on top HN contributors being unproductive (i.e. that they spend their time on this board all the time instead of working).

sandGorgon · on Feb 10, 2022

Is it possible to force fly.io to use a particular region? India has these strict data residency laws for financial/healthcare companies (and I know that Singapore has them too).

So we need assured guarantees that data is not traversing outside of the region. Most cloud vendors have specific data residency compliance for India - https://aws.amazon.com/compliance/india-data-protection/

michaeldwan · on Feb 10, 2022

You control which region your app and its volumes are in. Metrics, logs, and volume snapshots end up on servers in the USA. We haven’t addressed data residency for those platform services yet, but we might someday if there’s enough interest.

the_duke · on Feb 10, 2022

I really recommend tackling this.

Many EU companies are increasingly concerned about data location, with some going into full panic mode.

> Metrics, logs, and volume snapshots end up on servers in the USA

This will unfortunately prevent me from using/recommending Fly for EU customers at the moment. It should also be pretty prominently stated in the docs.

api · on Feb 9, 2022

WireGuard is just a transport protocol, so of course you could use it in place of SSL/TLS if you wanted. Interesting though, and I prefer it to SSL/TLS because X509 certs suck.

tptacek · on Feb 9, 2022

WireGuard isn't really the interesting bit here, it's running TCP/IP over it in userland. You cannot straightforwardly do that with SSL/TLS, but it is in fact the API that WireGuard provides.

sho_hn · on Feb 9, 2022

This is where the article lost me a little bit. I (think I) technically got the part of running a TCP/IP stack in an unprivileged user process, so you don't have to elevate privilege for adding a network interface and using the host OS TCP/IP stack. And maybe that's already very cool. But:

- What other benefits does it give you?

- This isn't a new problem and presumably has prior best practices for mitigation. What is this replacing, what was the landscape like before this? What was the most similar already?

- Have people been looking for a better solution like this for some time?

- What's the over/under on the maintenance cost? You added in another TCP/IP stack to look after. You maybe save on static configuration and can make your system more dynamic. Pros, cons ... let's list them.

The article talks a bit about the problems they were trying to address, but not in a way that clearly answered the above for me. It's of course valid to write a piece with a more informed audience in mind, but in something that aims to spread the virtues of an idea I think it could do more.

tptacek · on Feb 9, 2022

This is a good question and part of the reason you didn't get a clear answer from the article is that I'm not sure if I have a clear answer.

What I think user-mode TCP/IP gives us is the ability to build arbitrary services --- Postgres, Redis, SSH, network management, whatever --- without having to make infrastructure changes. We don't have to have some weird API or application proxy that knows what's running and who's allowed to run what. Instead, that's simply baked into the network, and flyctl, by dint of netstack, can just use it. If somebody comes up with a cool network service to plug flyctl (or any other tool someone wants to write) into, it will just work.

But things like the maintenance cost, well, yeah, that's most of what the post is about. The maintenance cost was not especially low.†

There's a natural inclination to read any post like this as a kind of brag, but I'm really just experimenting with trying to show the good with the bad here. User-mode TCP/IP is a weird choice! Nobody else I know of does it! It might have been the wrong choice! Even though I love it!

† It's actually not low even right now; I'm spending the first half of the day deploying code that relays stats from Netlink on our gateways through our GraphQL API, so that flyctl can check WireGuard gateway health. That is not a thing we would be spending time on if we had just written an explicit proxy for Postgres or whatever, rather than providing a generic network transport.

sho_hn · on Feb 9, 2022

That's cool, and I appreciate you sharing these experiments.

I'm in automotive/embedded at the moment, and our daily battle is making decisions on how static vs. dynamic we want our system to be - static (e.g. resource allocations or baked-in scheduling decisions) makes it easier to reason about the system and provide guarantees, but generally lowers efficiency at runtime. Dynamic can make the system much better at serving a wide range of usage scenarios, but makes it harder to eliminate the risk of pathological cases. It's hard not to see things through that lens. The way to construct and run services you've described here to me is an interesting option on that type of axis, in the sense of where the costs/friction goes.

tptacek · on Feb 9, 2022

I wish I'd thought of writing a simple pro/con list when I wrote this post. I'll think about that!

ignoramous · on Feb 10, 2022

> If somebody comes up with a cool network service to plug flyctl (or any other tool someone wants to write) into, it will just work.

Sure, just how many distinct WireGuard configurations would the gateways be comfortable with per-Fly app? 1K+? 100K+? 1M+?

dolmen · on Feb 15, 2022

The answer by tptacek with pros/cons is here: https://news.ycombinator.com/item?id=30277278

sho_hn · on Feb 19, 2022

Thanks to both of you!

zaphar · on Feb 9, 2022

The problem isn't new, but the previous best practices involved giving the tool superuser privileges, usually through an installation process. See most other VPNs. User-mode TCP/IP stacks aren't very common in practice. This is why what fly.io did is interesting.

touisteur · on Feb 9, 2022

Or network namespaces? User-space tcp might also make it easier to do tcp checkpoint restore and container/app migration. Interesting write-up.

I keep thinking of the nightmare of keeping up with the world of Internet middleboxens, broken net layer implementations and icmp hacks that the Linux kernel supports and makes 'just work'. The jump to usermode tcp seems interesting if you're not worried about that (and I've been watching the formal-proven ip/tcp stack space like a hawk for years), but I've been burned so many times with non standard stacks and 'oh you need to connect to that non-updated lynxos system and huh' or 'hah could you enable ecn or this obscure tcp option because... Legacy?'... And sometimes I need tc/netem and netlink and I don't know...

ignoramous · on Feb 10, 2022

We embed LwIP in an Android app, and it doesn't even implement most TCP/IP features, let alone handle most of the protocol's quirks. But it mostly works as expected, because of Postel's law on the other side (for example, https://apenwarr.ca/log/20090222)

derekzhouzhen · on Feb 9, 2022

It is not just that. WireGuard tunnels IP packets, and so TCP, through UDP. A TCP connection tunneled through another layer of TCP will suck badly performance-wise. The sliding window algorithms in both layers will fight each other.

qbasic_forever · on Feb 9, 2022

Replacing SSL/TLS with wireguard is cool but aren't you just going to run into the same headaches of rotating certificates/keys? No one is really going to rely on using the same wireguard key indefinitely, right?

tptacek · on Feb 9, 2022

It's pretty easy for us to rotate keys now, since new WireGuard peers are extremely cheap to bring up (part of the point of the post is that for most of the last year, that was the opposite of the case, and a new peer was a very painful thing to ask for). But rotating WireGuard keys with Fly.io makes about as much sense as rotating the OAuth2 API token `flyctl` uses (the token is strictly more powerful than the WireGuard key), and people generally don't do that.
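For readers wondering what the client side of a rotation involves: a WireGuard key pair is a Curve25519 key pair, written as base64 in configuration files. A minimal sketch of generating a replacement pair with the Go standard library (Go 1.20 or later for `crypto/ecdh`) follows. The `keyPair` type and `newKeyPair` function are hypothetical names for this illustration, and the step of registering the new public key with a gateway is deliberately left out, since nothing here assumes how Fly.io's API does that.

```go
package main

import (
	"crypto/ecdh"
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// keyPair holds a WireGuard-style key pair as base64 text, the format
// that appears in WireGuard configuration files.
type keyPair struct {
	Private string
	Public  string
}

// newKeyPair generates a fresh Curve25519 (X25519) key pair.
func newKeyPair() (keyPair, error) {
	priv, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		return keyPair{}, err
	}
	return keyPair{
		Private: base64.StdEncoding.EncodeToString(priv.Bytes()),
		Public:  base64.StdEncoding.EncodeToString(priv.PublicKey().Bytes()),
	}, nil
}

func main() {
	next, err := newKeyPair()
	if err != nil {
		panic(err)
	}
	// Only the public half ever needs to leave the machine.
	fmt.Println("new public key:", next.Public)
}
```

Generating the key is the cheap part; the cost being discussed is on the gateway side, where the new peer has to be installed.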

ignoramous · on Feb 10, 2022

Please write a bit about the secrets-infrastructure at fly.io! The cert store, the token store, the trade-offs, the protections around it (though, I'm sure we will judge you for it, especially if it isn't "secure enough" for any made up definition of "secure").

pxeger1 · on Feb 10, 2022

> It looks like all of Fly.io isn’t working, when in fact the only part of Fly.io that isn’t working is the part that allows you to use it.

What's wrong with that? ;)

born-jre · on Feb 9, 2022

I had something similar in mind, but using libp2p [0]: use it as a library providing a universal mesh network, but without a central control server, with better p2p/NAT traversal, and no need to mess with keys*.

[0]: https://github.com/libp2p/

tedunangst · on Feb 9, 2022

Man, where are all those virtualization fanboys who said usermode linux was a dead end? :)

vbitz · on Feb 10, 2022

Well gVisor uses mostly the same method of system call emulation (PTRACE_SYSEMU).

It's also one of the three major projects that use it besides User-mode Linux and rr.

est · on Feb 10, 2022

Been using TunSafe for years. It's solid, WireGuard-compatible, works in user space, and has extra features like TCP handshake + UDP data, HTTPS obfuscation, etc.

Author is @strigeus of uTorrent/Spotify fame.

the_biot · on Feb 10, 2022

It's a little surreal to see all this cheering and adoration of fly.io, but they can't even keep their blog webserver up. It's not a great look for a cloud-related company.

mrkurt · on Feb 10, 2022

Our blog webserver is fine. What kind of error did you see?

the_biot · on Feb 10, 2022

Times out, rarely gets the page out.

mrkurt · on Feb 10, 2022

Does debug.fly.dev work? Can you run `curl -v https://fly.io/blog/ -sS -o /dev/null -D-` and post what you see?

the_biot · on Feb 10, 2022

debug times out as well. curl output:

   flake: curl -v https://fly.io/blog/ -sS -o /dev/null -D-
   *   Trying 2a09:8280:1::a:791...
   * TCP_NODELAY set
   * Connected to fly.io (2a09:8280:1::a:791) port 443 (#0)
   * ALPN, offering h2
   * ALPN, offering http/1.1
   * successfully set certificate verify locations:
   *   CAfile: /etc/ssl/certs/ca-certificates.crt
     CApath: /etc/ssl/certs
   } [5 bytes data]
   * TLSv1.3 (OUT), TLS handshake, Client hello (1):
   } [512 bytes data]
   * OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to fly.io:443 
   * stopped the pause stream!
   * Closing connection 0
   curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to fly.io:443

Seems to work when forced to IPv4 though.

mrkurt · on Feb 10, 2022

Ok last thing, can you try `curl https://debug.fly.dev --ipv4` and tell me what the `Fly-Region` header says?

Thank you for helping with this!

the_biot · on Feb 10, 2022

Fly-Region: fra

willstrafach · on Feb 11, 2022

Different poster here but just curious: Are you a Deutsche Telekom user, by chance?

the_biot · on Feb 11, 2022

Nope. Still only working very sporadically for me.

iqanq · on Feb 9, 2022

Can someone explain to me why wireguard is implemented as a kernel module? Yes I get it, more performance. But isn't it completely and absolutely insane to run a complicated piece of software that is open to outside connections with kernel privileges?

tptacek · on Feb 9, 2022

The WireGuard protocol is deliberately designed to be straightforward to run in the kernel. In steady state, it doesn't even require dynamic memory allocation. It uses timers in lieu of extra statekeeping. It has a simplified networking model ("cryptokey routing") that defers to the host TCP/IP stack a bunch of stuff that other VPN protocols take upon themselves to build. It has just one keying mechanism and an API to build more interesting authentication features (like SSO integration) on top of it, rather than having it invade the core design.

It helps that it was designed and implemented by a kernel exploit author.
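For anyone unfamiliar with "cryptokey routing", the idea is that each peer's public key is associated with a list of allowed IP prefixes. Outbound packets are sent to the peer whose prefix best matches the destination, and inbound packets are accepted only if their source address maps back to the peer that sent them. A rough sketch of that lookup in Go follows, assuming hypothetical type and function names (`peer`, `table`, `PeerFor`, `Accept`) and placeholder key strings; the real implementation uses a trie and lives in the kernel, so this only illustrates the rule.

```go
package main

import (
	"fmt"
	"net/netip"
)

// peer pairs a public key with the prefixes that peer may use.
type peer struct {
	publicKey  string
	allowedIPs []netip.Prefix
}

// table is the whole routing and access-control state in this sketch.
type table struct {
	peers []peer
}

// owner returns the key of the peer whose longest allowed prefix
// contains ip, or false if no peer claims the address.
func (t table) owner(ip string) (string, bool) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return "", false
	}
	bestBits, bestKey := -1, ""
	for _, p := range t.peers {
		for _, prefix := range p.allowedIPs {
			if prefix.Contains(addr) && prefix.Bits() > bestBits {
				bestBits, bestKey = prefix.Bits(), p.publicKey
			}
		}
	}
	return bestKey, bestBits >= 0
}

// PeerFor picks the peer an outbound packet should be encrypted to.
func (t table) PeerFor(dst string) (string, bool) {
	return t.owner(dst)
}

// Accept reports whether a decrypted inbound packet from fromKey is
// allowed to carry the source address src.
func (t table) Accept(fromKey, src string) bool {
	key, ok := t.owner(src)
	return ok && key == fromKey
}

// exampleTable builds a two-peer table with placeholder keys.
func exampleTable() table {
	return table{peers: []peer{
		{publicKey: "key-A", allowedIPs: []netip.Prefix{netip.MustParsePrefix("10.0.0.0/24")}},
		{publicKey: "key-B", allowedIPs: []netip.Prefix{netip.MustParsePrefix("10.0.0.7/32")}},
	}}
}

func main() {
	rt := exampleTable()
	key, ok := rt.PeerFor("10.0.0.7")
	fmt.Println(key, ok)
	fmt.Println(rt.Accept("key-A", "10.0.0.7"))
}
```

One table serves as both the routing table and the access-control list, which is a large part of why the kernel-side code can stay small.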

miloignis · on Feb 9, 2022

Running complex software open to outside connections in the kernel is pretty standard - the TCP/IP stack is in the kernel too!

remram · on Feb 9, 2022

Performance.

Hendrikto · on Feb 9, 2022

It also helps with availability. If you've got a recent kernel, it’s already there.

robertlagrant · on Feb 9, 2022

Isn't most networking in the kernel? It's pretty complicated already.