Neat use of web torrent. Now if only we could do it with live data instead of persistent data, have built in encryption, and anonymize the peers we'd have the perfect decentralized internet :-)
Where is the anonymity? In my opinion it is unreasonable to use Tor here: with only a few exit nodes it's not truly P2P anymore (and can therefore be slow). Tribler tried something similar. Someone needs to build an onion router of peers without defined relays or exit nodes (think of it as everyone-is-a-relay Tor; it's one of my future ideas [0]). This would speed things up substantially and can definitely work when you don't have Tor's constraint of supporting all TCP connections (e.g. HTTP only). Maybe they could make every ZeroNet node a hidden service, I dunno.
Also, I have researched a lot about plausible deniability for publicly available yet distributed data. I have never seen a system that stores public data without the node runners being able to find out whether they are storing a piece of something they don't like. I started a thread about it on the Maidsafe forum recently [1].
From what you've described, it seems like i2p would satisfy the requirements of your first paragraph and Freenet would satisfy the requirements of the second. Both of these networks have been around for over a decade, so they're not as in vogue as newer projects.
I have only slightly looked at i2p, but sadly it appears the only implementation requires a JVM. I have researched Freenet and it does not appear to provide plausible deniability for public data (sorry, on mobile and can't link easily).
It does provide plausible deniability: you have no control over what files go to your node (files spread as people request them), and they are encrypted, so you can't easily know what they contain.
Both [0] and [1] seem to suggest that if you make the data "public" then someone can know they have a piece.
"It is hard, but not impossible, to determine which files that are stored in your local Freenet Datastore"
"Of course, the decryption keys, which are contained in links to the files, may be publically posted on some other site - they have to be if the site creator wants people to visit their site. But if you've never had knowledge of that link, which is very plausible if there are thousands of Freenet sites, you can't be expected to know what is contained in the encrypted files in your Freenet node."
If you feel brave, you could try the C++ implementation of the i2p router, purplei2p[1] (aka i2pd[2]). Last time I tried, there were a few rough edges, but it is now over two years old, so it has probably improved (or you may even improve it yourself!).
Feross is awesome, I've met him several times and WebTorrent is driving the future forward. I have a complementary Open Source project ( https://github.com/amark/gun ) for live data sync - it is already decentralized and anonymous by default (other than a session-ID).
Adding encryption is pretty easy, now with WebCrypto! The future is looking exciting, between WebTorrent, IPFS, and other projects!
None in particular, since you still have to connect to traditional HTTP or WebSocket servers (although WebRTC is coming soon). So if you connect to a malicious peer, your IP can get leaked. But the messaging algorithm is very ad-hoc, mesh-network-y and UDP-ish. At its core, messages contain only a message ID and a body; these are then daisy-chained throughout the servers and clients (they're all peers). Meaning that just because a peer sent you something does not mean they are the originator. However, nothing stops peers from broadcasting their session ID, IP, or X-Forwarded-For header. But the converse is also true: the messaging system still works even without that information, which, if there are enough peers, preserves anonymity. I should of course add the usual disclaimers: small peer groups can be attacked, and on larger networks, with enough intelligently placed peers, you can probably triangulate origins and so on. When you get down to the actual "physics" of it, there are lots of tricks/hacks to break any network based on timing and patterns alone.
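A minimal sketch of that daisy-chain relay (names and message shape are made up for illustration, not gun's actual wire format): every peer forwards messages it hasn't seen, so receiving a message from a peer says nothing about who originated it.

```javascript
// Each peer deduplicates by message ID and relays to all neighbors.
class Peer {
  constructor(name) {
    this.name = name;
    this.neighbors = [];
    this.seen = new Set(); // message IDs already relayed
    this.inbox = [];
  }
  connect(other) {
    this.neighbors.push(other);
    other.neighbors.push(this);
  }
  receive(msg) {
    if (this.seen.has(msg.id)) return; // drop duplicates, stop loops
    this.seen.add(msg.id);
    this.inbox.push(msg.body);
    for (const n of this.neighbors) n.receive(msg); // relay onward
  }
}

const [a, b, c] = ['a', 'b', 'c'].map(n => new Peer(n));
a.connect(b);
b.connect(c);
a.receive({ id: 'm1', body: 'hello' }); // c gets it, but only ever talked to b
console.log(c.inbox); // [ 'hello' ]
```

Note that `c` only ever heard from `b`, yet the message originated at `a`: that's the "sender is not necessarily the originator" property, and also why timing analysis is the remaining attack surface.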
Unfortunately it looks like WebRTC support and DHT support isn't planned for this current release of js-ipfs according to their roadmap [1], both of which could be a giant leap forward in enabling truly decentralized apps.
Bittorrent DHT doesn't actually run in the browser yet either [2], so WebTorrent in the browser currently only has access to the tracker infrastructure, and thus isn't as decentralized as one might think.
We've been working hard to get js-ipfs to a working level and just last week announced it's ready. While it's still early days and we have plenty to do to make it even more robust, we have it fully working with WebRTC support as of today.
We've developed some apps and demos that essentially enable you to create dynamic content and apps on IPFS, purely in the browser. We currently have to rely on a centralized pubsub (in a similar fashion to torrent trackers), but we're working on providing a decentralized pubsub mechanism as part of IPFS.
Regarding persistence: the js-ipfs and main IPFS (native go-ipfs) networks are not yet connected, but we're very close to having that. Once they all communicate on the same network, we can provide much better persistence, as users won't have to keep tabs open in the browser.
Take a look at some examples of what can be done today with js-ipfs, fully in the browser:
IPFS today supports WebRTC, TCP, uTP and WebSockets, thanks to the multi-transport approach libp2p [0] offers. In fact, that is how Orbit works: a chat app built completely in JS, using IPFS, in the browser without any plugins. Check:
Something I've never understood regarding the "Bittorrent DHT" ... is it a singular hash table that's utilized for all BT activity, or are there many Bittorrent DHT's, and when somebody says "Bittorrent DHT" they're just referring to the concept?
The "Bittorrent DHT" is one big Distributed Hash Table [1] in the sky. As you participate, you become a node, and you'll receive a subset of the data in the entire node network. Here's a StackOverflow answer explaining the mechanics: https://stackoverflow.com/questions/1332107
There is actually another one called "Azureus DHT" which came first [2]. It's another big hashtable in the sky, and the Azureus/Vuze clients can connect to either DHT. All other clients only use the second one, the "Mainline DHT", which was an official extension to the protocol [3].
I didn't know about it. I chose webtorrent because I potentially also wanted to have content created from the app to be long term on the torrent network if people are seeding the content.
I suggested that one could maybe bridge Bittorrent and IPFS, then someone suggested that IPFS would be ideal for being the 'canonical' source of the file while the other protocols (HTTP, BT, WT) are merely ways to access the file.
The Internet Archive has seeded torrents for all of our items since 2012. IPFS is a more straightforward way for us to be sure we've got independent seeds of all of our content. Got 25 petabytes of disk?
I've always wanted to write a distributed ArchiveTeam torrent server that, instead of distributing crawlers for ingestion, equally distributed IA torrents to distributed end-user storage (in a VM most likely). If seeding for a torrent dropped below a threshold, that torrent would be advertised to other client nodes to download and then serve themselves. Client VMs could come and go, and the announcer would always ensure remote storage was handled in an orderly fashion.
I'm just a third-party, but if you were to get in touch with the IPFS team to host a limited amount of IA content as a proof-of-concept, I'm sure it would be a great publicity for both of your organizations.
> then someone suggested that IPFS would be ideal for being the 'canonical' source of the file while the other protocols (HTTP, BT, WT) are merely ways to access the file.
Hey! That was me! I still believe it's a great idea ;)
Very cool. Imagine how cool it would have been to do this back when BitTorrent first started, if we'd had something like WebTorrent.
Two problems:
1) WebTorrent isn't very persistent. It requires people to keep the tab open, and I don't think the current federation between WebTorrent and existing torrent clients really solves anything interesting. Unless you've got high volumes of traffic, it doesn't help much.
2) Bandwidth and storage have gotten so cheap. S3 et al. is really pricey, but you can get insane deals on colo and bare metal boxes these days. OK, you're not going to get S3's ease of use and reliability, but WebTorrent is going to be way worse on both of those points.
In fact, there are several viable configurations for bridging WebTorrent, BitTorrent, and S3. For example, you can have S3 serve directly over BitTorrent and use Amazon as a tracker, or you can use the HTTP URL to S3 as a BEP 19 web seed, which recent WebTorrent now supports.
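The web-seed bridge can be sketched as a magnet URI whose `ws` parameter (BEP 19 web seeding) points at the S3 HTTP URL, so peers fall back to plain HTTP when no other peer has a piece. The hash and bucket below are placeholders:

```javascript
// Build a magnet URI carrying a web seed URL alongside the info_hash.
function magnetWithWebseed(infoHashHex, name, webseedUrl) {
  return (
    `magnet:?xt=urn:btih:${infoHashHex}` +
    `&dn=${encodeURIComponent(name)}` +
    `&ws=${encodeURIComponent(webseedUrl)}`
  );
}

const uri = magnetWithWebseed(
  'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', // placeholder info_hash
  'demo.mp4',
  'https://example-bucket.s3.amazonaws.com/demo.mp4' // placeholder bucket
);
console.log(uri);
```

The nice property is asymmetric degradation: with zero peers online you pay full S3 bandwidth (no worse than today), and every browser peer that joins shaves traffic off the bucket.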
It might make sense in situations where you can keep a quorum, i.e., when the assets are common to most users (unlike the recorder example given) and users are likely to keep the tab open for some time. A good example might be distributing assets for a browser game.
Colo and bare-metal bandwidth might be cheap, but they require knowledge of setting up a server and keeping it running, which many programmers can't do reliably.
So I only kind of understand magnet links. But it seems like I could create a Chrome plugin that took every page I visited, turned the assets into magnet links, and started requesting/seeding them on page load. Is that analogous to what they're doing?
You need access to the Blob data to seed it. You could get that by making HTTP requests to each of the resources on the page and then seeding that (however, as soon as you navigate away from the page there are no more seeders, so the data is no longer available).
The app is quite interesting and I was amazed until I realized the play button doesn't display the circular progress indicator on time. It has an ugly lag. This is why web apps are not quite there yet. And yet it is amazing they can record, store and sync audio like that. Almost like real apps.
Maybe this a bit tangential, but I was just thinking lately how it would be nice to use some decentralized communication backend to word processing software to enable shared collaborative editing of documents. Sort of like Google Docs, but without a centralized server.
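One common building block for that kind of serverless collaboration is a CRDT, where every replica merges updates deterministically and converges without a coordinator; a last-writer-wins register is about the simplest sketch (this is an illustration of the idea, not what Google Docs or any particular editor ships):

```javascript
// LWW register merge: higher timestamp wins; peer ID breaks ties so
// every replica picks the same winner regardless of arrival order.
function lwwMerge(a, b) {
  if (a.ts !== b.ts) return a.ts > b.ts ? a : b;
  return a.peer > b.peer ? a : b; // deterministic tie-break
}

const fromAlice = { value: 'Hello world', ts: 1002, peer: 'alice' };
const fromBob = { value: 'Hello there', ts: 1005, peer: 'bob' };

// Both peers converge no matter which update arrives first:
console.log(lwwMerge(fromAlice, fromBob).value); // 'Hello there'
console.log(lwwMerge(fromBob, fromAlice).value); // 'Hello there'
```

Real collaborative text editing needs per-character structures (sequence CRDTs or operational transforms), but the core trick is the same: merge is commutative, so a decentralized transport like WebTorrent or gun only has to deliver updates eventually, in any order.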
If there weren't a 'cloud' in the picture of how seeding is done, I'd even believe it's serverless. Sorry for being a cynic, but what if there is no tracker available?
It's a very fair point; I chose the word serverless to imply no traditional back-end. One extension I am looking forward to seeing is DHT support in WebTorrent, which (at least in my head) could eliminate the need for trackers.
The important point in my head is that I can distribute audio files to anyone who is seeding client-to-client rather than pass through a traditional server that I would host.
I was surprised to find out that the torrent-fetching part of the prototype didn't happen in a service worker, which is obviously the coolest place for it.
This sounds really cool, but it also sounds like a really easy way for malware (delivered, say, in a drive-by malicious ad) to exfiltrate data from a browser.
In this case, the content is only stored locally to the domain. Sure, someone could make the synced file some sort of malware, but they could also just fetch it over HTTP...
The theory at least is that the sandbox of the browser will stop it from getting out from there and keep the user safe.
Well, one way I did it was by creating fake torrents with an unlikely info_hash (if you saw the all-0xAA hash on BitTorrent crashing your client a few weeks ago, that was me).
This way, I can use public WebTorrent signalling servers to establish the WebRTC connections between my (non-BitTorrent) clients.
You can distinguish between clients and servers (if you don't want a P2P model) by having clients claim to need this unlikely info_hash, and servers seed it. In this way, WebTorrent's websockets server won't link clients to other clients.
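The rendezvous trick can be sketched with an announce payload built around an info_hash nobody would ever use for a real torrent. `info_hash`, `peer_id`, and `left` are real tracker-announce fields, but the role split below is this poster's convention, not part of the protocol:

```javascript
// A reserved rendezvous hash: 20 bytes of 0xAA, hex-encoded.
const RENDEZVOUS_HASH = 'aa'.repeat(20);

// Build the announce payload for either role. In a real client this
// would be sent to a wss:// tracker; here we just construct it.
function announce(peerId, role) {
  return {
    info_hash: RENDEZVOUS_HASH,
    peer_id: peerId,
    left: role === 'server' ? 0 : 1, // "servers" claim to be seeders
  };
}

console.log(announce('client-1', 'client')); // left: 1 (a leecher)
console.log(announce('server-1', 'server')); // left: 0 (a seeder)
```

Since trackers match leechers with seeders rather than leechers with each other, clients announcing `left: 1` only ever get introduced to the "servers", giving you a client-server topology for free on P2P signalling infrastructure.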
(This is a good question, but as asked it's so broad that it's akin to asking "What kinds of things does P2P work for?" and we're still discovering more answers to it every year.)
I think these services would complement a torrent solution; e.g. on a video hosting site, the web API gives you the metadata (and perhaps the first five buffered seconds of the video), and the torrent gives you the video.