Overlay networks based on WebRTC (github.com/pojntfx)
186 points by keepamovin 4 months ago | 55 comments



WebRTC is widely misunderstood. It is not a p2p-enabling technology. It requires a mechanism for passing messages between nodes prior to establishing a session (this is called "signaling" in the literature). So it's another turtle layer: it's a scheme for communicating between nodes provided they can already communicate.

So why does it exist? Answer: to reduce the cost of providing a service where bulk data flows are p2p. It does this by trying to get those flows over UDP, with NAT traversal, directly between the host networks of the communicating nodes. The canonical example is internet telephony. But it does this on the assumption that there will be a long tail of node pairs for which a hairpin route through a server will be needed. It doesn't remove the need for servers; it only reduces the resources those servers require.

Everyone who says they have built a true p2p system with WebRTC is confused[1].

[1] You can build such a thing with WebRTC, but only if the nodes are not browser hosted and have public IPs. If you think that's useful then you're also confused.


What protocols/networks do you consider P2P? Every P2P technology I have used requires some bootstrapping.

I have done WebRTC with zero signaling, though with a browser on only one side[0]. I wish the W3C/IETF would care, but that's the issue with corporate/profit-driven standards!

[0] https://github.com/pion/offline-browser-communication


I think it's worth remembering that the original intent of the internet was entirely p2p and this bootstrapping issue wasn't a problem. In fact, there was a whole set of protocols developed around global multicast that allowed for discovery to build exactly these structures.

So you can't blame the IETF for not thinking of this problem - you can blame them for not being able to respond well when the market went and mucked up their architecture by putting NAT everywhere.


Bootstrapping is an interesting problem. If there's a large P2P network (think Gnutella), there are papers proposing port scanning of residential IP ranges to find peers that belong to the network. It's an interesting approach because there are no centralized bootstrapping servers. But port scanning is inefficient, and such an approach may only be viable if a network is large (I may be wrong here.) Still, I think it's cool how little work has been done on these problems. I am still really into p2p networks.

For my own project I observed that all of the main networks still depend, in some way, on trusted introduction points - just many of them, combined. So I think building on infrastructure that's federated and open is a good design (in my case, my own software is based on public MQTT servers.)


You can do offline signaling. Quoting MDN:

> It's any sort of channel of communication to exchange information before setting up a connection, whether by email, postcard, or a carrier pigeon. It's up to you.

All you need are SDPs and ICE Candidates. People choose a centralized signaling server because it's the easiest but WebRTC does not dictate it be that way.
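To make that concrete, here's a minimal sketch of manual signaling in the browser: generate an offer, carry the SDP across by whatever channel you like, and paste the answer back. (The STUN server here is just an example; waiting for ICE gathering to complete avoids trickle ICE, so the SDP travels as one self-contained blob.)

    // Manual "carrier pigeon" signaling sketch: no signaling server.
    const pc = new RTCPeerConnection({
      iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
    });
    const channel = pc.createDataChannel("chat");
    channel.onopen = () => channel.send("hello");

    // Side A: create an offer and wait for ICE gathering to finish,
    // so the SDP carries all candidates and can travel in one blob.
    async function makeOffer(): Promise<string> {
      await pc.setLocalDescription(await pc.createOffer());
      await new Promise<void>((resolve) => {
        if (pc.iceGatheringState === "complete") return resolve();
        pc.onicegatheringstatechange = () => {
          if (pc.iceGatheringState === "complete") resolve();
        };
      });
      return pc.localDescription!.sdp; // hand this to the other side somehow
    }

    // Side A: paste in the answer SDP that came back out of band.
    async function acceptAnswer(sdp: string): Promise<void> {
      await pc.setRemoteDescription({ type: "answer", sdp });
    }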

Echoing the other commenter, there really is no other way to establish a connection without some knowledge of the other peer, be it gained online or offline.


I experimented with implementing this manual signaling where the delay between back and forth could be dozens of seconds or even minutes.

I found that you need to call restartIce, which is not typically called in SimplePeer and is not normally exposed.

So as it currently stands, the regular SimplePeer build you get from the package manager will not support signaling stretched over many minutes. It just won't work.

But it can be made to work if you just call restartIce, which keeps everything alive.

I discovered this through my own research and experiments.

You can see the code below; you can also fork the repo and follow the instructions to get a working version.

The original idea was basically that you can run this repo from your shell and chat with people who visit your GitHub repository. The signaling is done over comments, and there's a comment bot that facilitates the exchange to make it easy.

https://github.com/o0101/janus/

restartIce: https://github.com/o0101/janus/blob/9b092218b7623ca198c3caef...
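In sketch form, the workaround is something like this - note it reaches into simple-peer's internal _pc field, which is an implementation detail, not public API:

    import SimplePeer from "simple-peer";

    const peer = new SimplePeer({ initiator: true, trickle: false });

    // simple-peer doesn't expose restartIce(); grab the underlying
    // RTCPeerConnection via the internal _pc field.
    const pc: RTCPeerConnection = (peer as any)._pc;

    // Restart ICE periodically so candidates stay fresh while the
    // offer/answer round-trips over a very slow channel.
    const keepAlive = setInterval(() => {
      if (pc.connectionState !== "connected") pc.restartIce();
    }, 30_000);

    peer.on("connect", () => clearInterval(keepAlive));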


Exactly. This is trivial to carry out and can be done "by hand" between two browsers. I don't know what the OP is talking about. They are speaking as if the commonly chosen mode of bootstrapping a connection is fundamental to the entire communication cycle. It's flat out not true. When the connection has been established, peers are connected directly, but by their definition, this doesn't matter because a trusted peer was used to form the initial connection. Ridiculous. By that logic, it wouldn't be a P2P connection if IPs were initially exchanged by snail mail or carrier pigeon, or if DNS were used.


And once you do connect with a peer, peers can maintain DHTs of other peers and you can have a true mesh network where you signal over WebRTC itself.
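A rough sketch of what signaling over WebRTC itself can look like - here `relay` is assumed to be an open RTCDataChannel to an already-connected peer that forwards blobs to the peer we want to reach:

    // Bootstrap a connection to a new peer by relaying the usual
    // offer/answer/candidate blobs over an existing data channel.
    const newPc = new RTCPeerConnection();
    const chat = newPc.createDataChannel("chat");

    relay.onmessage = async (e: MessageEvent) => {
      const msg = JSON.parse(e.data);
      if (msg.kind === "answer") {
        await newPc.setRemoteDescription(msg.description);
      } else if (msg.kind === "candidate") {
        await newPc.addIceCandidate(msg.candidate);
      }
    };

    newPc.onicecandidate = (e) => {
      if (e.candidate) {
        relay.send(JSON.stringify({ kind: "candidate", candidate: e.candidate }));
      }
    };

    const offer = await newPc.createOffer();
    await newPc.setLocalDescription(offer);
    relay.send(JSON.stringify({ kind: "offer", description: offer }));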


It's okay to make a distinction between transient and steady-state communications. If the steady state is p2p then that's good enough for me. BTW I've never built one, but I've read that WebRTC can connect browsers on the same LAN if they can see each other's IP address. (This is actually an experiment I'd one day like to do.)


Communicating between a group of computers on a LAN using IP doesn't have any limitations vs doing the same on the internet, so there's no reason it wouldn't work. (But yes I've tried it and it worked)


> So why does it exist? Answer: to reduce the cost to provide a service where bulk data flows are p2p.

Also to lower latency.


FWIW, lower latency is not guaranteed when going direct vs through a relay. There are way too many variables in routing. You've got to try it all ways and see.

Ideally, you should try to measure the one-way delay on each path and choose the 'best' path in each direction, but WebRTC tries to converge on the 'same' path for both directions (either relay both ways or direct both ways).
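You can't read one-way delay from the browser, but you can at least compare the RTT of whatever candidate pair ICE settled on across runs - a sketch using standard getStats() fields:

    // Read the RTT of the ICE candidate pair currently in use.
    // (Round-trip time only; WebRTC exposes no one-way delay.)
    async function selectedPairRtt(pc: RTCPeerConnection): Promise<number | undefined> {
      const stats = await pc.getStats();
      let rtt: number | undefined;
      stats.forEach((report) => {
        if (
          report.type === "candidate-pair" &&
          report.nominated &&
          report.state === "succeeded"
        ) {
          rtt = report.currentRoundTripTime; // seconds
        }
      });
      return rtt;
    }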


Sounds like you understand WebRTC yet take issue with describing it as a p2p-enabling technology. It is literally the technology which enabled peer-to-peer connectivity in the browser. Without it, browsers would not be able to connect directly to one another over the Internet. These kinds of definition-focused posts on HN are always so tiresome.


Yes, nodes behind NAT must be able to already communicate through some kind of intermediary — at least initially (for STUN), but maybe the whole time (for TURN.)

But this intermediary has very low requirements. It doesn't need to be mutually trusted by both parties, and it doesn't need any compute of its own for handling encryption/etc, just the ability to re-wrap opaque IP-packet payloads back and forth between ports. Any random low-power network switch could implement full-speed TURN. And anyone could also run a STUN or TURN server on a $2/mo VPS.

And, beyond that, STUN/TURN servers also don't need to be protocol-specific; you can reuse a STUN/TURN server someone put up to help some other protocol. They're a global resource — even STUN/TURN servers established for non-WebRTC routing can be reused for WebRTC routing, and vice-versa.
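Concretely, pointing a connection at whatever STUN/TURN infrastructure happens to exist is a one-line config choice (hostnames and credentials below are placeholders):

    // None of these servers need to know anything about your protocol.
    const pc = new RTCPeerConnection({
      iceServers: [
        { urls: "stun:stun.l.google.com:19302" },  // someone else's STUN
        {
          urls: "turn:turn.example.org:3478",      // your $2/mo VPS
          username: "user",
          credential: "pass",
        },
      ],
    });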

This means that in-browser WebRTC protocols that rely on ICE are both:

- affordable to "enable" connectivity on (because any network or protocol likely has some backing foundation, which can easily afford enough ultra-cheap STUN/TURN nodes to ensure the network's operation; but even if it doesn't, the network can be designed to use existing STUN/TURN nodes established by p2p advocates like the EFF, acting as a free rider on those nodes; and even if you don't do that, either party, with at least $2 of motivation to communicate, can get a VPS [in a non-CGNATed country] and run their own STUN/TURN nodes on it.)

- uncensorable in the weak sense (in that there's no central entity within the p2p network which can choose to deny you access to connectivity on the network.)

(Mind you, your ISP might drop all your STUN/TURN-looking traffic; or you might not be able to acquire a non-CGNATed VPS IP address anywhere in your whole country... but this is not a problem for "p2p protocols" to solve; as, at that point, you're not on "the Internet" per se, but on a weird LAN that is willing to route IP packets to some whitelisted subset of the Internet. Solving for network-level censorship isn't the domain of "p2p protocols", but rather the domain of anti-censorship technologies, like Tor's hidden bridge nodes.)

---

But I assume you don't really care about the practicality here, and are speaking more in the context of being an IPv6 public-routability maximalist. (Which, hey, I'm one of those too. Non-globally-routable addresses are silly.) Thus this:

> You can build such a thing with WebRTC, but only if the nodes are not browser hosted and have public IPs. If you think that's useful then you're also confused.

I do get why you say this — the naive counterargument is that phones on cellular networks are non-browser-hosted and have public IPs, and that therefore native mobile apps can act as both WebRTC and STUN/TURN servers.

But of course, the cellular networks that aren't CGNATed are all IPv6 networks. So this is no help to anyone whose ISP has only assigned them an IPv4 address behind a NAT.

But there's an (IMHO much better) rebuttal, in the form of an emerging technology-trend, which you may not have considered: Desktop-as-a-Service (DaaS) nodes.

If you run a browser on a cloud VM, then you get a browser with a publicly-routable IPv4 address, and that browser can then participate in a WebRTC network as a "supernode."

Sure, if you're doing this with the goal of enabling STUN/TURN, then this is silly: you may as well just get the $2 VPS and run just the STUN/TURN servers on it.

But my point is that many corporate employees just connect to some DaaS using thin-client hardware by default, and do everything on that DaaS (incl. using a browser); and so, as long as your protocol is likely to be used by any corporate workers, then your protocol does get its necessary supernodes — in the form of the browsers running on these workers' VMs — "for free."


So, I mostly agree with you--and I certainly think WebRTC is "peer to peer", as I am unsure what the definition of "peer to peer" would even have to be if it weren't--but there is a big caveat that I think undermines the beauty here: a browser can't act as a supernode, as it can't listen for incoming connections.

To me, what WebRTC is missing is the ability to just connect to another peer who DOES have a public IP. And frankly, the tech stack seems so close to being able to pull this off... Sean even has that demo repository he linked on this thread somewhere of just hardcoding the SDP, but it isn't clear to me -- and it didn't seem clear to him either (based on a comment in his README) -- that this would actually be secure enough as a bootstrap. It certainly could be with only a minor modification, as I should only have to know the pinned certificate of the DTLS server, but I think the browser API of WebRTC requires the server to know the client's certificate as well, which then implies the client's private key has to be hardcoded and shared, and it feels non-obvious if the security model of DTLS still works in that case (I should really sit down and try to pencil this out).

Without this, your supernodes can't be browsers: they have to be native programs with full socket access. Again: I think it is still OK to call the technology peer-to-peer, and yet it is still missing something that feels really core to the idea of it being usable as a platform technology for P2P deployment, and it will likely never get the rest of the way now that people are wasting their time on the definitely-not-even-trying-to-be-peer-to-peer WebTransport :/.


> I think the browser API of WebRTC requires the server to know the client's certificate as well

Yes, as it stands, the browser WebRTC API is designed around the idea that a p2p connection is being bootstrapped by connecting two of the leaves of a star topology to each-other. For this use-case, it's a pure optimization: the server starts off intermediating between peers; then at some point, the peers each figure out that the other one can do WebRTC, so the server introduces the peers to one-another (gives each peer the other peer's authoritative DTLS certificate hash), and the peers stop talking to the server and start talking to one-another instead.

I don't think this is any kind of equilibrium state, though. This is the way things are because WebRTC evolved purely as an optimization for companies like Google and Facebook to avoid needing to deal with massive streaming-media traffic fan-in. In their cases, they already had multimedia chat platforms that used authoritative central backend servers intermediating media streams; and they just wanted to introduce a way for a browser to "upgrade" from talking to the backend, to talking to another browser. Like upgrading an HTTP flow into a Websocket flow.

But a lot of the work since then, on Data Channels, DTLS and so forth, has been at the behest of other companies (e.g. Ericsson) whose motivation isn't to decrease load on their massive video-chat backends, but rather to do other things, like enabling browsers to talk directly to e.g. softphones or IP cameras, with both sides throwing packets directly at one-another without some kind of gateway in the way.

I fully expect that while you might not get browser STUN/TURN nodes in the near future, we will at least see something like browsers offering the option of mutual authentication of DTLS certificates through a specified X.509 CA set (where both peers have been issued client certificates signed by one of the CAs trusted in the set; and where the CA could very well be a self-signed private one unique to the service.)


Do we even need "mutual authentication of DTLS certificates", though? I would think it should be sufficient for either the server's certificate to be CA-signed or -- even easier and even better as far as I am concerned (as to me the CA system is itself bowing to an authoritative system I would rather avoid) -- for the client to just already know the certificate hash of the server (as part of its address).

The only problem I am seeing is that I want to be able to leave the peer certificate hash null and just authenticate with anyone on the other side. Like, I actively don't want "mutual authentication of DTLS certificates" for when a user first connects. For later re-connections it might be useful for the client to leave the network a certificate hash, but for that initial connection I want one side to not have a certificate, same as using TLS to connect to any known HTTPS server.


> Do we even need "mutual authentication of DTLS certificates", though? I would think it should be sufficient for either the server's certificate to be CA-signed

I mean, we're talking about something where the core use-case is for equal peers to be connecting to one-another, where who is dialing whom is entirely arbitrary. While your app on top of WebRTC might have nodes that do server-like things and other nodes that don't, on the WebRTC level, there is no "server"; for any given WebRTC session, either one of the peers might end up doing the dialing or the listening, perhaps with this even changing for the same pair of peers on a connection-by-connection basis.

In such a context, mutual auth is the only kind of auth that makes sense to me. Both sides establish a stable identity (beyond their relatively-unstable current IP address), so that each side's User Agent has at least one piece of reliable information it can use to whitelist or blacklist other nodes' connection attempts by.

Consider: we're struggling right now to get the phone network to adopt protocols that allow each side of a phone call to non-repudiably identify the entity on the other end of the call (i.e. for it to be impossible for either side to fake its phone number.)

So why would we want to introduce a new protocol in the year 2024, supposedly for p2p connectivity, where at least one side of the connection is allowed to be the equivalent of the phone network's "Unknown Caller"?

Or, to put that another way: do you really want to enable your browser to be DDoSed? By other browsers, who've landed on some malicious webpage that runs in a background tab, opening never-ending WebRTC connections to target hosts according to the websocket-delivered instructions of some booter service?

Mutual auth at least ensures that connection attempts like those will be dropped at DTLS establishment time. To talk to a WebRTC session established by a page from origin X running in your browser, those tabs in those other people's browsers will need access to DTLS certs that are signed by a party trusted by the code in your browser (i.e. the code written by origin X.) And those DTLS certs would in turn presumably only be made available to... pages from origin X. (Or its logistical data partners.)


But, I don't want "origin X", I want "peer-to-peer". I don't want some server somewhere providing the authorization to enter the network by gating the CA: I want a mutually interoperating set of web applications written by different developers all able to run code (maybe in a library) that bootstrap themselves into a permissionless peer-to-peer network. As for your concern about having to deal with a ton of incoming connections, such is the life of a supernode: if you aren't up for that, don't sign up to be a supernode. But like, without this, why do we care about "supernodes" anyway? If you have some centralized server you have to talk to to get certificates, just have it do signaling as well! :(


Good work! I just want to point out that WebRTC requires servers for STUN and TURN.

If I may, we also developed an open source solution for WebRTC videoconferencing, livestreaming and even peer-to-peer broadcasting to unlimited numbers of people! Feel free to use it:

https://community.qbix.com/t/teleconferencing-and-live-broad...


It does not require either of those.


NAT traversal is the first point on the feature list of the linked repo. So in context, it's important to point out.

While in theory it doesn't require them, going without reduces the possible connections quite a bit. STUN servers are plentiful on the internet and free to use, though. There's even a service providing TURN (US geo though), but I have no idea who runs it or what their business model is, so I wouldn't rely on it.


Having used WebRTC in offline environments (host device broadcasting a local wifi network), I can say it's a royal pain to use without ICE/STUN/TURN servers. I had to manually mangle the SDP packets to make it work, because browsers would randomly decide they needed IP privacy (from the device they had connected to by IP...) and translate IPs to some_hash.local and go looking for a TURN server. I tried hosting a coturn server too, but browsers would sometimes ignore it since it was a local address. Mangling the SDP handshake is reliable but feels pretty jank.
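For anyone hitting the same wall: the mangling amounts to swapping the obfuscated mDNS hostnames in the candidate lines for the real LAN IP before applying the description. A sketch (details vary by browser):

    // Replace obfuscated mDNS (.local) host candidates with the
    // device's real LAN IP before setRemoteDescription().
    function deMdnsSdp(sdp: string, realIp: string): string {
      return sdp
        .split("\r\n")
        .map((line) =>
          line.startsWith("a=candidate") && line.includes(".local")
            ? line.replace(/[A-Za-z0-9-]+\.local/, realIp)
            : line
        )
        .join("\r\n");
    }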


I can't tell for sure without actually checking the repo out but this looks to be GCM with random nonces using a command line flag string as the fixed key for the network.
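For reference, the pattern being described looks roughly like this (a sketch, not the repo's actual code). The hazard is that with one fixed key for the whole network, random 96-bit GCM nonces start risking collision - and nonce reuse under GCM is catastrophic - once message counts get large:

    // "Fixed key from a CLI flag, random nonce per message" sketch.
    async function encrypt(flagKey: string, plaintext: Uint8Array) {
      const keyBytes = await crypto.subtle.digest(
        "SHA-256",
        new TextEncoder().encode(flagKey) // fixed, network-wide key material
      );
      const key = await crypto.subtle.importKey(
        "raw", keyBytes, "AES-GCM", false, ["encrypt"]
      );
      const iv = crypto.getRandomValues(new Uint8Array(12)); // random nonce
      const ciphertext = await crypto.subtle.encrypt(
        { name: "AES-GCM", iv }, key, plaintext
      );
      return { iv, ciphertext };
    }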

If you're trying to build a modern encrypted multipoint overlay network, MLS is a good place to start.


Did you mean MPLS? Or could you link to MLS?



WebTorrent[0] is also a good WebRTC P2P project.

[0] https://en.wikipedia.org/wiki/WebTorrent


I recently started using Trystero to send messages and video around via WebRTC.

https://oxism.com/trystero/


I am actually working on something like this at https://borg.games (also see https://github.com/BorgGames/borggames.github.io)

Basically the idea is that the only endpoint available over regular HTTPS is the signaling server, and the other services are accessible via WebRTC by requesting peers from the signaling server using each service's public key.

There are multiple things I am planning to layer over the p2p connections: a CDN, virtual LANs to play games "locally", gameserver hosting, etc.


[...] using STUN.

I'm convinced no one actually knows what STUN is. I always see these references to STUN in P2P contexts. STUN is just a shitty protocol that shows you what your external IP and port are. That's literally it. You can't use it by itself to do any kind of NAT traversal. There is an RFC that lists how to use STUN lookups to enumerate a NAT (but it's far from definitive), and anything useful requires looking at other papers about NATs to apply the results. This is what you would need to do before you did hole punching.
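To make that concrete: an entire STUN "lookup" is one small UDP packet and a reply carrying your reflexive address. A rough Node sketch of the RFC 5389 framing (happy path only):

    import dgram from "node:dgram";
    import crypto from "node:crypto";

    // STUN Binding Request: all STUN tells you is the IP:port your
    // packet appeared to come from, after the NAT.
    const MAGIC = 0x2112a442;
    const req = Buffer.alloc(20);
    req.writeUInt16BE(0x0001, 0);        // Binding Request
    req.writeUInt16BE(0x0000, 2);        // zero attribute bytes
    req.writeUInt32BE(MAGIC, 4);         // magic cookie
    crypto.randomBytes(12).copy(req, 8); // transaction ID

    const sock = dgram.createSocket("udp4");
    sock.on("message", (msg) => {
      let off = 20; // scan attributes for XOR-MAPPED-ADDRESS (0x0020)
      while (off + 4 <= msg.length) {
        const type = msg.readUInt16BE(off);
        const len = msg.readUInt16BE(off + 2);
        if (type === 0x0020) {
          const port = msg.readUInt16BE(off + 6) ^ (MAGIC >>> 16);
          const ip = [
            msg[off + 8] ^ 0x21, msg[off + 9] ^ 0x12,
            msg[off + 10] ^ 0xa4, msg[off + 11] ^ 0x42,
          ].join(".");
          console.log(`reflexive address: ${ip}:${port}`);
          break;
        }
        off += 4 + len + ((4 - (len % 4)) % 4); // attributes are 32-bit aligned
      }
      sock.close();
    });
    sock.send(req, 19302, "stun.l.google.com");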

[...] webrtc.

See, the thing about webrtc is: it's a vague, low-quality spec. It uses over-simplified models of peers, NATs, and hole-punching algorithms instead of a design that would provide better reachability. The approach webrtc takes is to just fall back to TURN for edge cases (TURN is a proxy protocol). The problem with that is that TURN will likely be used significantly for things like mobile devices, due to symmetric NATs. This means maintaining infrastructure for your 'p2p' system when you shouldn't have to.

Overall, I am a fan of the fact that webrtc is in the browser and that it's 'standardized', but I think the actual implementation of what it's trying to achieve doesn't work well on the real Internet.


Onto the list[0] it goes.

I always have the feeling that WebRTC had so much more potential than has been realized. I wonder how things might have turned out differently if it had better server and browser support ~5 years ago. I remember trying to use it for a game at the time and had trouble finding a good server implementation, and had all sorts of issues with Safari.

In the near future I'm not sure there will be much reason to use it over WebTransport, unless you're doing p2p. I actually prefer WebRTC's data channel API, since you can get separate datagram streams, whereas with WebTransport you have to multiplex them yourself[1].

[0]: https://github.com/anderspitman/awesome-tunneling

[1]: https://datatracker.ietf.org/doc/html/rfc9221#name-multiplex...
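For reference, the per-channel reliability knobs in WebRTC are just constructor options - this is the part you'd currently have to hand-roll on top of WebTransport datagrams:

    const pc = new RTCPeerConnection();

    // Reliable and ordered: TCP-like semantics (the default).
    const control = pc.createDataChannel("control");

    // Unreliable and unordered: UDP-like, for realtime state.
    const state = pc.createDataChannel("state", {
      ordered: false,
      maxRetransmits: 0,
    });

    // Partially reliable: stop retransmitting a message after 150 ms.
    const voice = pc.createDataChannel("voice", {
      ordered: false,
      maxPacketLifeTime: 150,
    });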


> had all sorts of issues with Safari

Tangential rant, but this appears to be intentional. As far as I can tell, Apple is intentionally lagging behind supporting various web technologies, to make the web as unattractive as possible (within their ecosystem), and by extension, to make native app development as attractive as possible, so they can take a cut of the profits.

If their reasoning is correct, then they would actually lose money by investing in their own browser engine, which is somewhat amusing, because the worse a job they do, the more money they make.

(I find this idea particularly painful in the context of "we at Apple love web technology so much, we're going to use our love of web as our main excuse for killing Flash"...)


Not a great strategy. Can't compare apples and oranges. Long term, weakening their web offering will only strengthen their competition and likely have compounding effects against their app store. Maybe it's a short-term thing for them. Seems a better way would be to embrace the web and create synergies with their existing offerings that boost both... but I think this highlights the issue.

It's not so much about a binary choice between app store vs web. I think it's more just, the web is not very "Apple". It's too open, wild. If they go "all in" on the web, they'd be afraid to risk "losing themselves" or fucking something up and creating a backlash.


This strategy of trying to snuff out web apps by locking all iOS users into a browser that lags a decade behind Chrome and only gets a trickle of new features (features that consistently carry UX friction that exists in no other modern browser) is working in North America.


True, it works to a degree. But think of what could be done if they could get beyond their, albeit understandable, web aversion.

Though I disagree that Safari is as defunct as you make out. They have a bunch of new features. Check out: https://webkit.org/blog/15243/release-notes-for-safari-techn...


To be fair to them though, they’re the only browser where you can transfer RTCDataChannels to workers and get that off the main thread. All others are still lagging behind there.


Just seems trivially obviously not the case to me, but seems like a popular meme regardless.



Regarding issues with Safari: they recently made a change so that to get WebRTC data channels you no longer have to request getUserMedia permissions. Previously you did, and it's still the case on Safari Mobile right now. There, you can just request mic access, establish the connection, and once it's established, drop the mic access. In my case that typically takes about one to two seconds. So there are indeed quirks on Safari.
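The dance looks roughly like this (a sketch of the workaround described above):

    // Safari Mobile workaround sketch: hold mic permission just long
    // enough for the data channel to establish, then release it.
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });

    const pc = new RTCPeerConnection();
    const channel = pc.createDataChannel("data");

    channel.onopen = () => {
      // Connected (typically one to two seconds in); drop the mic.
      stream.getTracks().forEach((track) => track.stop());
    };

    // ... offer/answer signaling proceeds as usual ...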


i've long felt the same. i came to the conclusion that the browser vendors decided it was kind of a mistake. it enables all sorts of local first and p2p things that cut out the service providers.

hopefully it will get some attention again. i have done a bunch of really interesting experiments with data channels. being able to choose the reliability is nice for sure.


> it enables all sorts of local first and p2p things that cut out the service providers.

And this is another reason why the web will always be a shaky foundation for local first and p2p software.


WebRTC will live on. I think it will continue to grow from grassroots efforts.

WebRTC has P2P, and it also doesn't require the user to have any knowledge of video/networking. The alternatives to it are more geared at 'video developers'. I don't know which way the future will go.


Around 5 years ago, peer negotiation got massively simplified with new additions to the spec (see Perfect Negotiation). If you’re on a browser released since then, your life will be far easier.
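Condensed from the shape documented on MDN - here `signaler` is whatever message channel you use, and `polite` decides who yields when both sides offer at once:

    const pc = new RTCPeerConnection();
    let makingOffer = false;

    pc.onnegotiationneeded = async () => {
      makingOffer = true;
      await pc.setLocalDescription();   // parameterless: implicit offer
      signaler.send({ description: pc.localDescription });
      makingOffer = false;
    };

    pc.onicecandidate = ({ candidate }) => signaler.send({ candidate });

    signaler.onmessage = async ({ description, candidate }) => {
      if (description) {
        const collision =
          description.type === "offer" &&
          (makingOffer || pc.signalingState !== "stable");
        if (collision && !polite) return; // impolite peer ignores the offer
        await pc.setRemoteDescription(description);
        if (description.type === "offer") {
          await pc.setLocalDescription(); // implicit answer
          signaler.send({ description: pc.localDescription });
        }
      } else if (candidate) {
        await pc.addIceCandidate(candidate);
      }
    };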


Hyperswarm has been my go-to for stuff like this, but you bring up a good point: I don't think it's encrypted.

https://github.com/holepunchto/hyperswarm


Quite a solid project and approach. Both founders seem smart and to know the problem well. I can see their new framework (Pear runtime) getting huge in the future.


Can hyperswarm be used to set up a generic IP network?


ADGT.js [0] was another P2P overlay network based on WebRTC.

[0] https://doi.org/10.1002/cpe.4254


Felicitas is one of those devs where every repository on her profile is a gem to learn from, and well documented!


Awesome, it's AGPL3 licensed.


Is this like web-onion?


That's the first thing I disable in all enterprise/office PCs' browsers: WebRTC.

Disclosure: I am a cybersecurity architect of IDS/NDS/XNS.
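For reference, the usual knobs (support varies by browser and version, so verify against current policy docs):

    # Firefox (about:config or enterprise policy): disables RTCPeerConnection
    media.peerconnection.enabled = false

    # Chrome/Edge enterprise policy: no full kill switch, but this confines
    # WebRTC to proxied TCP, which defeats most holepunching
    { "WebRtcIPHandling": "disable_non_proxied_udp" }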


I'm curious as to why? To keep ICE holepunching from opening access to the device?


I don't know why the OP disables it, but chrome used to block my desktop from sleeping because 'WebRTC has active peer connections'.

I don't appreciate web pages or applications or whatever blocking my system from sleeping, so I've just turned WebRTC off since then.

Edit: I was not doing anything related to video, audio, p2p or anything else implying long running connections explicitly. Some "experts" were doing it behind my back on their pages. Don't know and don't care why.


Perhaps holepunching by some offsite JS-based app in your browser was the problem there. InfoSec knows this all too well.

Things like JS-based portscanning, file-less malware in JS datastores, …


Meh, WebRTC ... it's not for everyone.



