I’ve been working on that. It’s a hard problem, but eMule works quite well. I propose something based on that: when you start the program, you advertise the SHA-256 hashes of all the files you share. Whenever you need a resource, you send a request to the DHT for its SHA-256. Anyone who has the file will opportunistically try to send it to you.
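To make the "advertise the hashes of everything you share" step concrete, here is a minimal sketch of building that index at startup. The names `sha256_file` and `build_advertisement` and the idea of a single share directory are my own assumptions for illustration. How the index actually gets published to the DHT is left out.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files never sit fully in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_advertisement(share_dir):
    """Map hex digest -> path for every regular file under share_dir.

    Hypothetical helper: the returned dict is what a client would
    announce to the DHT when it starts up.
    """
    return {
        sha256_file(p): p
        for p in sorted(Path(share_dir).rglob("*"))
        if p.is_file()
    }
```

Identical files collapse to a single entry, which is what you want when the hash is the identity.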
Since it’s a DHT, there’s no central server to take down. And since it operates on SHA-256 hashes of the data, it’s implicitly secure: you can check whatever you receive against the hash you asked for. And it uses Tor for the rendezvous, so it’s not possible to track who is requesting what, except by unique ID (which can just be a Bitcoin wallet address you control).
The hard part is, what do you do about abuse? What if someone spams the network with bots that try to fulfill every request with bad data?
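One partial answer falls out of the hashing itself: the requester can discard any response whose digest doesn't match what it asked for. A rough sketch follows. The `(peer_id, data)` response shape and the function names are assumptions of mine, not part of any existing protocol. Note that this only detects bad data after it has been downloaded, so it doesn't stop bots from wasting your bandwidth.

```python
import hashlib


def matches_request(requested_hex, data):
    """True if the received bytes hash to the digest that was requested."""
    return hashlib.sha256(data).hexdigest() == requested_hex.lower()


def first_valid(requested_hex, responses):
    """Return ((peer_id, data), bad_peers) for the first valid response.

    `responses` is an iterable of (peer_id, data) pairs (hypothetical shape).
    `bad_peers` lists the peers whose data failed verification before a
    valid response was found; a client could use it to deprioritize them.
    Returns (None, bad_peers) if nothing verified.
    """
    bad_peers = []
    for peer_id, data in responses:
        if matches_request(requested_hex, data):
            return (peer_id, data), bad_peers
        bad_peers.append(peer_id)
    return None, bad_peers
```

Keeping track of `bad_peers` is the hook for whatever reputation or rate-limiting scheme ends up on top, which is the part that still needs research.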
If anyone has research refs in this direction, I’d be grateful to read them. I’d also like to avoid a blockchain if possible, since it seems unnecessary for simple federated distributed data.
I think a cool layer on top of this would be to create a P2P network where you can 'friend' certain peers and produce a reddit-like aggregation of new content from various sources.
- Friend another user by adding their key and IP to a trust group.
- When your client boots up, it connects to friend nodes and downloads their recent DHT entries.
- Entries from all friends are tallied so the hashes can be sorted by descending frequency, perhaps with a time decay factor.
- Display reddit-like UI of content to User: see which files are recently most popular among the set of all your friends.
Possible improvements:
- User could customize time vs vote weighting, script custom sort rules
- Give certain peers greater vote weighting ('best friend' vs 'acquaintance')
- Allow peers to 'tag' hashes to create subreddit-like collections of things.
- Allow attachment of metadata to hashes - title, description, etc - figuring out how to handle discrepancies between sources here could be tricky.
- Make client automatically re-host content in your DHT: hosting IS the upvote
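The tallying and weighting steps above could be sketched roughly like this. Everything here is an assumption for illustration: the feed shape (`{friend_id: [(content_hash, timestamp), ...]}`), the exponential half-life decay, and the per-friend weight table are just one way to do it.

```python
from collections import defaultdict


def rank_hashes(friend_feeds, weights=None, now=0.0, half_life=86400.0):
    """Rank content hashes by weighted, time-decayed frequency.

    friend_feeds: {friend_id: [(content_hash, timestamp_seconds), ...]}
    weights:      optional {friend_id: weight}; missing friends count as 1.0
    now:          current time in the same units as the timestamps
    half_life:    seconds after which one 'vote' counts half as much
    """
    weights = weights or {}
    scores = defaultdict(float)
    for friend, entries in friend_feeds.items():
        w = weights.get(friend, 1.0)
        for content_hash, ts in entries:
            age = max(0.0, now - ts)  # clamp entries stamped in the future
            scores[content_hash] += w * 0.5 ** (age / half_life)
    # Highest score first; ties broken by hash so the order is stable.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

For example, `rank_hashes(feeds, weights={"best_friend": 3.0}, now=time.time())` would let one peer's hosting count three times as much as an acquaintance's. Since hosting is the upvote, each friend contributes at most one entry per hash they host.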
You and I must be of like mind. I have even created an anon DHT PoC [0], conceptualized how the messaging might work [1], made a lib that makes Tor easier to use, since it's the best anon NAT buster these days [2], begun toying with a superset of reddit/slack/forums/etc (some gRPC files at [3] and some impl in that same repo), and done a bunch of other small things in order to arrive at this final destination.
Check out Ethereum Swarm [1]. It's an optionally private (hosting and accessing), decentralized, P2P data-hosting protocol with baked-in payments if you want to pay the network to host your data for you.
It seemed too simple of an idea not to already be done. Hopefully it conceals your IP.
Arg. Nope. It does not. It leaves it up to the client to decide how best to request the content, and by default that almost certainly means no privacy.
It was not intended to provide privacy, just the distributed data store. If you want privacy you probably just want to have a private swarm, which is perfectly fine to set up. Just implement a client with a whitelist or another mechanism for the privacy you want.
If there is no privacy, why not host the data on S3 or Dropbox? Data storage is dirt cheap. The privacy concern seems like the prime reason to build such a service.