Libtorrent adds support for the WebTorrent protocol (feross.org)
270 points by hauxir on July 13, 2020 | 87 comments



A while back I built out a proof-of-concept distributed web based on webtorrents - https://github.com/tom-james-watson/wtp-ext. Once this change makes its way into popular clients, it makes the idea a whole lot more viable.

Unfortunately it looks like Mozilla's interest in the distributed web died when they made the main stakeholders in the project redundant last year - https://github.com/mozilla/libdweb/issues/109.


It's really fucking sad that Mozilla is struggling so much financially. They didn't participate on the iOS platform for a long time, then they burned a huge pile of cash on FirefoxOS, and now they have to somehow match the engineering resources of Google and Microsoft.

I personally work on a decentralized web project ... but even I can't fault them for shutting down side projects. Every project either needs to provide alternative revenue streams or feed into getting Firefox on par with Chrome's speed, security, and feature set. Hopefully Chrome will follow through with neutering their ad-blocking and people will switch back to Firefox.


When I was last looking at the libdweb stuff they didn't have WebRTC support working in the service workers to let you use it from a protocol handler (and they somehow didn't even consider that important as they insisted no one would even want to use WebRTC, which made no sense to me); did they eventually fix that so you could do WebTorrent?


You can implement your own protocol + handler with the functionality exposed by the library as far as I understood.

It doesn't really matter anyway since you have access to anything Firefox can do with WebExtensions experiments, which was required since libdweb isn't a part of Firefox (and won't be anytime soon because AFAIK most of the people who were working on it have been laid off), and to run experiments you need Firefox Nightly.


Pretty much this. The code runs as an extension - it registers the protocol handler in the extension's background script. That means the code for webrtc has no need to run in a service worker.


OK, so as I don't actually do much in-browser development, and because this was also two years ago, "of course" I managed to get the terminology wrong: the issue was that WebRTC wasn't supported in a "WebExtension" (not a "service worker", which happened to be the subject of a massive thread I was reading a few weeks ago on WebRTC not working there either).

Here's the bug I had referred the libdweb person to at the time, in November of 2018 (because of comment #2):

> RTCPeerConnection.createOffer does nothing in WebExtension background script

https://bugzilla.mozilla.org/show_bug.cgi?id=1398083

This bug had already been resolved in September of 2018 as a duplicate of this bug:

> WebExtension cannot access WebRTC mic from popup.js nor background.js

https://bugzilla.mozilla.org/show_bug.cgi?id=1278100

And I now see that bug ended up getting resolved in September of 2019 as a duplicate of this bug:

> Improve getUserMedia permission model for web extensions a bit.

https://bugzilla.mozilla.org/show_bug.cgi?id=1579489

Which was supposedly fixed? I'm honestly a bit concerned still, though, as this bug slowly got dragged from being about createOffer to being about permissions dialogs for audio/video, and it kind of sounds like there's some weird thing where a "foreground page" has to vouch for the extension, which shouldn't be required. Everyone who works on WebRTC stuff in browsers conflates it all with getUserMedia, but all of us distributed web people just want peer-to-peer data channels :/. I take it this is actually fixed then, and you can now use WebRTC in this context?


The Internet Archive is very active in the distributed web space. Have you gotten in touch with them?


For some fun confusion, there's this libtorrent library, commonly called libtorrent-rasterbar by distros. And libTorrent by rakshasa, which is used by rTorrent.

I've considered something like this for a web music player I have, but it doesn't seem to have an easy setup for downloading individual files from a torrent. And the latency would add gaps between songs.


You can use prioritize_files() combined with the sequential_download flag to "stream" a track (and prioritize_pieces() if you need the last chunk for an m4a file).

Schedule your fetches intelligently and it shouldn't be hard to keep a music player fed. 320kbps is only 40KB/s.
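
A rough sketch of how that might look with the libtorrent Python bindings (1.2+), purely as an illustration of the prioritize_files / sequential-download idea; the torrent path, file index, and deadline values are made up:

    import time
    import libtorrent as lt

    ses = lt.session({'listen_interfaces': '0.0.0.0:6881'})

    atp = lt.add_torrent_params()
    atp.ti = lt.torrent_info('album.torrent')          # hypothetical torrent
    atp.save_path = '.'
    atp.flags |= lt.torrent_flags.sequential_download  # fetch pieces in order
    handle = ses.add_torrent(atp)

    # Download only the track we want to play next (file index 3 here);
    # every other file stays at priority 0 (skipped).
    priorities = [0] * atp.ti.num_files()
    priorities[3] = 7                                  # highest priority
    handle.prioritize_files(priorities)

    # For formats like m4a whose metadata sits at the end of the file,
    # also request the last piece early.
    handle.set_piece_deadline(atp.ti.num_pieces() - 1, 1000)

    while not handle.status().is_seeding:
        s = handle.status()
        print('%.1f kB/s, %.1f%%' % (s.download_rate / 1000, s.progress * 100))
        time.sleep(1)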


>And the latency would add gaps between songs.

You can't preload the next song ahead of time?


Agreed, I'd ask that moderators change the title to Libtorrent-rasterbar for reasons of clarity.


I've experimented with using youtube-dl as a backend for my own music player before. The latency to start is only about 2-3 seconds, which is slower than "webtorrent --mpv <torrent>" is for video.

You might want to mess around with the piece picker; there are options specifically for streaming. http://libtorrent.org/reference-Core.html#set_piece_deadline
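
For the deadline-based streaming mode specifically, a minimal sketch (Python bindings again; the window size and millisecond values are arbitrary, and the handle is assumed to come from an existing session):

    import libtorrent as lt

    def schedule_window(handle: lt.torrent_handle, playback_piece: int, window: int = 8) -> None:
        """Ask the piece picker for the next few pieces after the playback
        cursor, with deadlines that grow the further away the piece is."""
        num_pieces = handle.torrent_file().num_pieces()
        for offset in range(window):
            piece = playback_piece + offset
            if piece >= num_pieces:
                break
            # e.g. the piece needed right now within 500 ms,
            # the next one within 1500 ms, and so on.
            handle.set_piece_deadline(piece, 500 + offset * 1000)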


If you have some centralized server for hosting the application anyhow, you're probably better off just using WebRTC directly, or for that matter, just serving the music files from there.


Question: why did no one extend the torrent protocol to understand http[s] and speak to servers that support resume?

edit: I’m just now reading https://www.bittorrent.org/beps/bep_0019.html and that’s exactly what they did. My bad


It is supported, it’s called web seeding.


although from how I understood it, web seeding really assumes you can control the folder structure on the servers.

so you can't just curate a data collection from publicly accessible sources and freely map torrent contents to files on http for bootstrapping/reliability.

too bad, really. it seems like a cool thing to just make a torrent from e.g. public datasets that are hosted on flimsy university servers. make the torrent, distribute it, the swarm keeps the files available and reduces load on the university servers, and the server keeps the torrent from dying even if the dataset is niche.


There are two BEPs for web seeding. One (BEP 19) is defined in terms of a base URL and relative paths, the other (BEP 17) in terms of pieces and offsets into them. Few clients implement both web seeding protocol extensions.
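
For reference, the base-URL flavor (BEP 19, the more widely implemented one) is just an extra field in the torrent metadata. A sketch of creating such a torrent with the libtorrent Python bindings; the tracker, HTTP mirror, and local path are placeholders:

    import libtorrent as lt

    fs = lt.file_storage()
    lt.add_files(fs, 'dataset/')                  # local copy of the files
    t = lt.create_torrent(fs)
    t.add_tracker('udp://tracker.example.org:6969/announce')

    # BEP 19 ("url-list") web seed: clients fall back to plain HTTP range
    # requests against this base URL plus the relative paths in the torrent.
    t.add_url_seed('https://data.example.edu/dataset/')

    lt.set_piece_hashes(t, '.')                   # hash pieces, paths relative to cwd
    with open('dataset.torrent', 'wb') as f:
        f.write(lt.bencode(t.generate()))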



Kind of a tangent, but how large or small is the risk of nefarious sites running torrent nodes without the user's knowledge or consent?

We've seen something similar with using JS to mine bitcoin. That gets you money, so that probably has a stronger incentive, but someone might want to gain access to the resources this would allow.

The WebTorrent FAQ (https://webtorrent.io/faq) says this uses WebRTC and further says:

> WebTorrent clients running on one domain can connect to clients on any other domain. No silos! The same-origin policy does not apply to WebRTC connections since they are not client-to-server.

Does the browser do anything to protect the user against something like this? Is there a permission for network access, and is it allowed or denied by default?


> Is there a permission for network access, and is it allowed or denied by default?

WebRTC data channels are allowed by default.

> Does the browser do anything to protect the user against something like this?

A nefarious website can already use the fetch() API to upload or download massive amounts of data if their goal is to waste bandwidth. WebRTC doesn't introduce any new vulnerability here.


I'm not up to date on WebTorrent, but if it catches on, does that mean a good chunk of the internet could be freed from CDNs?


unlikely. Torrents are slower, in most scenarios, than straight HTTP. With small files, they generate more traffic for chatting/handshaking than the file is actually worth.

It could however lower costs for people who distribute large files, where the final user can tolerate some download latency. I'm thinking browser games that can download a big archive via torrent on first launch, similarly to what Steam does now in their client.


A large majority of web clients must be operating on battery now, too. Willingness to burn battery participating in Torrent and other distributed protocols rather than HTTP will be low among those, let alone willingness to leave such a thing operating as a daemon in the background. It’s why I don’t see much future for these things (distributed file-sharing protocols) except in infrastructure or in applications where many users may be expected to be on desktop.


Conversely, every home router or NAS can act as a node in the mesh.


With data caps I'm not sure if home users have an appetite for that either.


Someone needs to create an app so awesome that the only way to join it is to contribute to it. Hopefully that will happen in my lifetime.


But they are highly resource constrained and if there is no direct benefit to the consumer, why throw that in?


It would help lessen the cost of bandwidth for servers and smaller ISPs, as traffic is very expensive for anyone that isn't Comcast: saturating a 1 Gigabit line can cost an ISP ~$1K/month!


An interesting question, but I worry about the latency overhead of a peer-to-peer system. I'd be happy to be completely wrong about this, though.


I expect enterprise IT will soon want a way to block this sort of thing, if it becomes popular. It risks killing WebRTC for serious applications...


One could also forbid the Internet, or computers altogether, because one can do a lot of harm with them...


Enterprise blocking of IP ports has caused quite a mess already. Let's not allow them to make it an even bigger mess.


> Enterprise blocking of IP ports has caused quite a mess already.

Which mess did it cause?


The mess that requires legitimate services to pierce through firewalls.

https://www.tldp.org/HOWTO/Firewall-Piercing/x58.html


It’s also part of why so many things that’d be better on TCP sockets end up being implemented on top of HTTP... and then actually see adoption in that form. A combo of JS limitations and firewalls aggressively blocking everything that’s not HTTP.


The mess that has made any new protocol look like HTTP for the last 15-20 years.


Considering SSH is forbidden in most enterprise networks I've seen, your attitude is clearly in the minority.

Enterprise-IT people tend to shut down first and ask questions later, to cover their ass (and in their shoes I'd probably do the same). If it becomes trivial to torrent via mainstream browsers, I expect they will lock down whichever feature is responsible or apply massive pressure on vendors to remove the feature.


> Considering SSH is forbidden in most enterprise networks I've seen, your attitude is clearly in the minority.

You can't really forbid SSH (only on paper) since you can create SSH tunnels via virtually any port.


Of course, but they still make an effort (and a policy that will be applied to you if found out). They have to, it's their responsibility. If suddenly half the network is saturated by people using browsers to torrent the latest movies, you can bet that the answer won't be "we need a bigger pipe"...


The enterprise networks I’ve been on don’t give endpoints direct internet access. You can only access the web through an authenticated proxy. This forbids SSH, unless you set up additional infrastructure to tunnel it in HTTP.


Removing the feature seems pretty unlikely, since most web-based video conferencing products rely on WebRTC. In today's remote work world, that would be quite the throwing of the baby out with the bathwater.


Enterprise IT people here. We cover our ass in true enterprise style by consuming website filter lists from a provider and MitM TLS. Doesn't matter if they use WebRTC or HTTP, as long as the website is in the correct categories.

Worst case we could disallow WebRTC via DPI and whitelist it for category conferencing.


The only problem I see here would be created by themselves. The price they pay for postponing questions is losing WebRTC and those "serious applications" in the meantime.


I get the argument about blaming tools, but there's precedent here that suggests a ban could happen. Apple bans torrent-related applications from their app store despite the protocol ostensibly having nothing to do with piracy.

I think that's less likely to happen with WebRTC. But for workplaces that don't otherwise use it? Maybe.


What are these "serious" applications?


Zoom, Slack and Teams all use WebRTC for running their respective clients in the browser.


And in their desktop apps (at least for slack and zoom).


Loads of IM / webconf applications, at the moment.


My point is that that’s not less serious than a library supporting the BitTorrent protocol.


I'm not sure I follow. Bittorrent is absolutely NOT a "serious" application from an enterprise perspective. Most companies don't use torrents and actively block the protocol (since the most common usages are unsavoury or related to entertainment, not business).

Whereas IMs are an accepted part of business processes (i.e. "serious").


> Whereas IMs are an accepted part of business processes

There was a very long period when IMs were not acceptable and actively blocked, especially at financial institutions that have a legal obligation to log and retain all internal communication for X years. I saw this at many different firms for many years.

That changed. Perhaps torrents will, too. Nah, probably not :)


can they actually block it if you use a non-standard port and force encryption?


Everything can be blocked.


Everything can be blocked if you're willing to block everything else. Blocking something while leaving other services unaffected isn't trivial, and gets harder the smaller the data is. E.g. if you want to block some bad guy from exfiltrating a 256-bit key, it's almost impossible, because there are a million ways to smuggle it out with steganography.


You also have games and "remote control" apps that use WebRTC (controlling a computer or a robot).


Enterprise IT already regularly installs their own certs on their boxes to MITM all TLS. They have the means to DPI and shut this down, though they'll need a firmware update for their firewall box.


Yes, but it's much easier to just disable webrtc in browsers at installation time once (images, msi, etc) and be done with it.


Meanwhile, during covid, disabling webrtc might be tantamount to disabling your business.


If you have enterprise IT who can disable WebRTC in your browser settings, you have a helpdesk who can whitelist approved video conferencing tools.


They're talking about blocking it entirely at the router level.


yeah, I'm thinking long-term here.


Long term, they'll just add it to their DPI boxes.

WebRTC isn't going anywhere. If you have a sales department, you can't block it.


Doesn't Chrome come with hard-coded certificates in its executable? Sounds hard to MITM.

https://sites.google.com/a/chromium.org/dev/Home/chromium-se...


It comes with its own, but it is trivial to add your own via system policies.


I recently wrote a fun thing using libtorrent-rb: https://rwmj.wordpress.com/2020/06/25/nbdkit-with-bittorrent... It's a way to download, install and upload Linux distros all at the same time.


I follow your blog and saw that post a few days ago. I found it an approachable intro to libtorrent-rasterbar — thanks!


This is great news and will help lead to wider adoption. I hope that some progress will be made on handling large files, since currently they are stored in memory: https://github.com/webtorrent/webtorrent/issues/86


I've been using WebTorrent Desktop for a while now and really enjoy using it, but there are some glaring issues around random freezes on fast connections; there are plenty of open issues with other users reporting the same experience: https://github.com/webtorrent/webtorrent-desktop/issues

I've personally been working around this using a custom fork that messes around with the simultaneous connections settings, and it seems to help quite a bit, but I still get freezing from time to time when I download a large number of torrents simultaneously.

I wonder if an alternative implementation in something like Rust on top of WASM might make those kinds of issues easier to avoid by making more efficient use of resources.


Once WebTorrent is ubiquitous among desktop and headless clients, the networks will become a lot stronger. Especially for applications like BitChute and for anyone who copies their approach. It solves a major problem with the cost of baseline hosting vs. hosting at scale.


The problem is that many people will want to run bittorrent over a VPN connection, while they'd like to run their normal browser data over their normal (low-latency) internet connection.


You're assuming torrents are intended for copyright infringement, which is unfortunately a big part of their usage right now. However, if torrents become as seamless as HTTP (i.e. not requiring a separate client), I can see them becoming widely used for totally legitimate purposes.

Torrents can be a great, standardized way of delivering updates, for example, and network administrators could easily deploy a platform-agnostic way to deliver and cache updates on their network by just running a torrent client somewhere; their fleet would automatically discover it (using local peer discovery) instead of having to deploy multiple proprietary, platform-specific solutions.
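
As a sketch of what that could look like on the fleet side (libtorrent Python bindings; the torrent name and cache path are made up), local peer discovery is just a session setting:

    import libtorrent as lt

    # An "update fetcher" that keeps local peer discovery on, so machines on
    # the same LAN (and any caching box the admin runs) exchange pieces with
    # each other instead of all pulling from the vendor's servers.
    ses = lt.session({
        'listen_interfaces': '0.0.0.0:6881',
        'enable_lsd': True,   # multicast local service discovery (BEP 14)
        'enable_dht': True,   # still reachable via the wider swarm if needed
    })

    atp = lt.add_torrent_params()
    atp.ti = lt.torrent_info('update-1.2.3.torrent')  # hypothetical update bundle
    atp.save_path = '/var/cache/updates'
    handle = ses.add_torrent(atp)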


> You're assuming torrents are intended for copyright infringement

I think they're just assuming that you don't want everyone to know exactly what you're doing online at all times by exposing your IP to everyone and their brother that uploads a resource.


This. It's a privacy nightmare. It essentially lets arbitrary other internet users find out what resources you've accessed. That's… pretty dire.


More flexibility in proxy configuration will make this easier to overcome, and there are already some plugins for Firefox which allow specifying specific proxies for specific sites.


When opening the JavaScript codepen demo linked in the article, the video plays immediately, but I noticed HTTP GET requests to https://webtorrent.io/torrents/Sintel/Sintel.mp4

Is this normal? I know nothing about WebTorrent, but isn't the file supposed to be streamed from peers, and not requested from a central server over HTTP?


One of the Webseeds[1] specified in the Sintel torrent is that URL. That HTTP address basically acts as another peer in the swarm.

1. https://www.bittorrent.org/beps/bep_0019.html


And it leaks your IP address and makes it possible to pirate content by simply visiting a website. You could target someone with all kinds of illegal stuff.


It's already possible for a website to get your browser to download illegal content simply by visiting the page, and to expose your IP in the process to whoever they want. You don't even need JavaScript to do the downloading (just an image or video tag, for example), and the uploading can be done with XMLHttpRequest/fetch.


I'm torn on this one.

On the one hand it means the web moves (or can move) closer to decentralized modes of operation. Bittorrent is tried and tested and Works (TM).

On the other hand, it also means yet one more thing that moves into the freaking browser. We have enough things nowadays that are shoehorned into HTML/JS when really they would have been more usable even if written in tcl/tk.

You win some, you lose some I guess. ¯\_(ツ)_/¯


Alternatively: every bittorrent client I have used (at least on macOS) is really, really bad. I used to use Deluge, which was buggy and slow to begin with. Then they stopped distributing an installer for it, so you had to install it with pip (!) and run it from the command line. God forbid I `brew upgrade python` or my bittorrent client will probably break.

Transmission crashes for me, and has little activity. Sometimes it just can't open torrent files.

The web doesn't break. You can load a URL, get some decent software with an easily-improvable UI—and assuming you're on a modern browser—expect it to work with no installation, in seconds. And if a site pushes an update, I hit refresh and get the update without needing to run a 45-minute `pip install` command.


fwiw, I've used qBittorrent on osx and found it to be usable. I'll admit that was a few years back, so maybe they (read: apple) broke that one in the meantime.

Oh, also, coming out of an hour long intercom session trying to explain to a customer rep how "no, I didn't change anything in my firefox, your site just suddenly doesn't log me in any longer" I'll posit that the web breaks all the time and all over the place, in many small, annoying ways.


Alternatively, it means BitTorrent is now supported by any number of esoteric operating systems, rather than just the ones the developer using Tcl/Tk chose to target :)


I'll let that count, but only because of the nice snideness of it. :P

Seriously though, you'll be hard-pressed to find a platform where you can't run at least rtorrent or such. and if you're running some esoteric unix from way back when, I expect you to be able to compile it for that yourself. :P


The amount of work feross has put into P2P systems over WebRTC is astonishing.

Thanks feross for your incredibly great work!


Is there a good book that explains all the BT technology? I’d like to use or build a robust client/server but most learning materials seem to be trivial or daunting.


Not a book, but I gave a talk at JSConf Asia in 2014 where I explain BitTorrent (and WebTorrent) from first principles: https://www.youtube.com/watch?v=kxHRATfvnlw

I also recommend just reading the BitTorrent spec. It's quite short and manageable as far as specs go.


Here's a slightly less trivial small guide: https://twitter.com/fiatjaf/status/1282108860405297153

After that, just read the BEPs directly: http://bittorrent.org/beps/bep_0000.html. They're small, easy to understand and straight to the point.

Then read the code for WebTorrent or another implementation in a language you're familiar with. You'll immediately recognize the BEPs there.
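
If it helps, the structure BEP 3 describes maps directly onto what the libtorrent Python bindings hand you, so you can follow along with a real file (any .torrent you have lying around; 'example.torrent' is a placeholder):

    import libtorrent as lt

    info = lt.torrent_info('example.torrent')

    print('name:        ', info.name())
    print('piece length:', info.piece_length())   # the "piece length" key in BEP 3
    print('pieces:      ', info.num_pieces())     # one SHA-1 hash per piece
    print('trackers:    ', [t.url for t in info.trackers()])

    fs = info.files()                              # the file list from the info dict
    for i in range(fs.num_files()):
        print('file:', fs.file_path(i), fs.file_size(i))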



