"Cross-origin resource sharing (CORS) is a policy that prevents a webpage from connecting to a server, unless that server has given that webpage permission to connect."
CORS only applies to requests for restricted resources like fonts or XHR requests that aren't simple GETs and POSTs.
"So, while this is typically not possible: foo.com/index.html ---GET---> bar.com/pic.jpg"
Typically this is possible. Images, stylesheets, scripts, iframes, and videos aren't subject to CORS.
"You can solve it by routing the request through your host server: foo.com/index.html ---GET---> foo.com ---GET---> bar.com/pic.jpg
Not necessary; the client's browser can get bar.com/pic.jpg just fine all by itself.
"Pinning tools like Hashbase and Homebase help keep dat:// sites online"
If you publish a dat archive, how do you notify Hashbase to pin it? Can you do it through dat?
To keep my dat alive with Hashbase, do I have to set up an account, provide and confirm an email address, link a credit card, etc.?
Are centralized servers, financial institutions and surveillance all required components of the anonymous, decentralized, peer-to-peer web?
> Typically this is possible. Images aren't subject to CORS.
You're right, that was a misleading example. I changed it to data.json to be more clear.
> If you publish a dat archive, how do you notify Hashbase to pin it? Can you do it through dat? Does it require setting up an account on Hashbase, providing an email address, linking a credit card, etc.?
The "Pinning" system is very similar to Git remotes. You can pin using any endpoint that complies with https://www.datprotocol.com/deps/0003-http-pinning-service-a.... So, similar to a git remote, you do need some kind of authentication with the pinning service - unless somebody writes one that's open for anybody to push to.
The UX flow will be similar to a git remote as well: you use an HTTPS request to tell the server to sync the dat. So it's an explicit user action.
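To make that concrete, here's a rough sketch of what "pushing to a pin" could look like over HTTPS. The endpoint path, auth scheme, and archive key below are placeholders for illustration, not the actual DEP-0003 API; check the spec linked above for the real routes.

```js
// Hypothetical sketch of asking a pinning service to sync a dat.
// Endpoint paths and auth are placeholders -- consult DEP-0003 for the real API.
async function pinDat(serviceOrigin, sessionToken, datUrl) {
  const res = await fetch(`${serviceOrigin}/v1/archives/add`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${sessionToken}`, // auth with the service, like a git remote
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ url: datUrl })        // tell the service which archive to sync
  })
  if (!res.ok) throw new Error(`Pinning failed: ${res.status}`)
  return res.json()
}

// Usage: an explicit user action, e.g. behind a "Publish" button.
// pinDat('https://hashbase.io', token, 'dat://<archive-key>')
```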
>> Typically this is possible. Images aren't subject to CORS.
> You're right, that was a misleading example. I changed it to data.json to be more clear.
CORS doesn't preflight GET, POST or HEAD methods unless they have custom headers or content-type other than application/x-www-form-urlencoded, multipart/form-data or text/plain.
So a simple GET bar.com/data.json works just fine in today's browsers.
In the absence of CORS header configuration on the target site, you can't use XHR or Fetch to get anything from a different domain. It doesn't matter if it's JSON, an image, or plain text.
To some extent, you're conflating some of the requirements to avoid preflighting with cross-origin requests simply being allowed, and they're not the same.
I'd be happy to be corrected on this, but here's what I understand:
While fetch doesn't preflight for GET, it does require an Access-Control-Allow-Origin header. You can specify `no-cors` as the mode to circumvent this, but then you can't access the response body (https://developer.mozilla.org/en-US/docs/Web/API/Request/mod...)
> or content-type other than application/x-www-form-urlencoded, multipart/form-data or text/plain.
application/json is a content-type other than application/x-www-form-urlencoded, multipart/form-data or text/plain, and thus a request for it will fail unless the required CORS headers are present
There are a couple of issues with your comment: first, the restriction you're talking about relates to preflighting, not to whether requests are allowed at all.
Additionally, you're thinking about the "wrong" Content-type header: the limitation you're mentioning about urlencoded and so on is a limitation on request headers, not response headers.
The CORS headers are required for the GP's described request to succeed, but not for the reasons you give.
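To make the distinction concrete, here's a small sketch (the domains are placeholders): the request below is "simple" and goes out with no preflight, but the browser still withholds the response from the page unless bar.com answers with a permitting Access-Control-Allow-Origin header.

```js
// Running on a page served from https://foo.com.
// This GET is a "simple" request, so no preflight OPTIONS is sent --
// but the response is only exposed to the page if bar.com replies with
// an Access-Control-Allow-Origin header covering foo.com.
fetch('https://bar.com/data.json')
  .then(res => res.json())
  .then(data => console.log('allowed by CORS:', data))
  .catch(err => console.error('blocked: no CORS header on the response', err))
```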
> Are centralized servers, financial institutions and surveillance all required components of the anonymous, decentralized, peer-to-peer web?
Servers and financial institutions will never go away, no matter how hard we try. I'm hoping that we can make surveillance go away by sharing much less data with organisations who rely on surveillance to survive.
Hashbase is not "centralized" in the sense that you are always free to choose a different provider of its hosting services. You can host your own Hashbase: https://github.com/beakerbrowser/hashbase
You can even choose multiple providers at once, providing you with resilience in case one of your chosen providers violates your trust, eg. by losing your data or using it to spy on you.
> CORS only applies to requests for restricted resources like fonts or XHR requests that aren't simple GETs and POSTs.
CORS applies to XHR. Including GETs and POSTs, and including fetching images over XHR.
There's some other sibling comments here discussing preflight requests, which it sounds like you might be referring to, but CORS is not limited to just preflight requests.
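For example (domains are placeholders), fetching an image through XHR from a page on foo.com goes through the same CORS check as any other XHR, even though a plain <img> tag would display the same file without it:

```js
// Running on a page served from https://foo.com.
// Reading the image bytes via XHR requires bar.com to send
// Access-Control-Allow-Origin; a plain <img src="..."> would not.
const xhr = new XMLHttpRequest()
xhr.open('GET', 'https://bar.com/pic.jpg')
xhr.responseType = 'blob'
xhr.onload = () => console.log('got image blob:', xhr.response)
xhr.onerror = () => console.error('blocked by CORS: no Access-Control-Allow-Origin on the response')
xhr.send()
```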
Sure you can: you can make any request you want without CORS; the content is just opaque to you (for example, you can display an image but not manipulate its pixels).
You do this for example with `fetch('./any-resource', {mode: 'no-cors'})`.
You can then for example do a `.then(x => x.blob())` and then use the resulting blob as an image.
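A related way to see the "display but can't read" behavior, without going through fetch at all, is a cross-origin <img> drawn onto a canvas (the URL is a placeholder): the image renders fine, but the canvas becomes tainted, so reading pixels throws.

```js
// The browser happily displays a cross-origin image...
const img = new Image()
img.src = 'https://bar.com/pic.jpg'   // placeholder URL, no CORS headers assumed
img.onload = () => {
  document.body.appendChild(img)       // display works fine

  // ...but drawing it to a canvas taints the canvas, so pixel access fails.
  const canvas = document.createElement('canvas')
  canvas.width = img.width
  canvas.height = img.height
  const ctx = canvas.getContext('2d')
  ctx.drawImage(img, 0, 0)
  try {
    ctx.getImageData(0, 0, 1, 1)       // throws a SecurityError on a tainted canvas
  } catch (e) {
    console.error('cannot read pixels from a cross-origin image:', e)
  }
}
```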
> If you publish a dat archive, how do you notify Hashbase to pin it?
This is provided by Hashbase in its UI: sign up for an account and add your dat. It's email-based; no credit card or address unless you go over the data cap.
This is a little backwards: the goal of CORS isn't just to protect the _user_, it is also to protect the _third-party website_.
All it takes for a website to opt in to this is adding a single header; it's possible for bar.com to allow the request from foo.com by opting into it.
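As a sketch of that opt-in (assuming a plain Node server on bar.com; the origin and payload are illustrative):

```js
// Minimal sketch of bar.com opting in: one response header is enough
// for simple requests. Uses Node's built-in http module; values illustrative.
const http = require('http')

http.createServer((req, res) => {
  // Allow pages on https://foo.com to read this response.
  res.setHeader('Access-Control-Allow-Origin', 'https://foo.com')
  res.setHeader('Content-Type', 'application/json')
  res.end(JSON.stringify({ ok: true }))
}).listen(8080)
```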
> For each new origin that the site contacts, a permission prompt will be presented
I don't think this is an adequate approach to security. When the browser presents me with a prompt to load data from a third party site, I don't know what data is being loaded, what it's being used for, or whether this prompt is expected (as part of the regular functioning of the application) or unexpected (indicating that the application has been compromised, and I should navigate away from it).
In general, I've noticed that users react in one of two ways to these sorts of prompts. Naive users will blanket allow -- allowing all sites to access all of the capabilities of their browsers, regardless of the reason or necessity of that access. More sophisticated users will blanket deny. If it's not immediately apparent why a site needs the permission that it requests, that request will get denied, even if it's a valid requirement. Very very few users will think about why a site is requesting the permissions that it is requesting and consider those requests on a case by case basis.
I like the ideas of dat and IPFS, but I can't quite understand the difference.
What I do understand is that they use new protocols, and that's an issue on the Web today. I think the only way they can succeed is through new laws or mistakes by big corps that drive customers away from them.
What I also liked was remoteStorage [0]. It's a bit like localStorage, but the data is managed independently from the application itself.
There are a lot of similarities, as they are both peer-to-peer and decentralized.
I've mostly done Dat. I want to do a bit more IPFS.
Dat feels a bit more like git for files - you can create a local file archive using the command line tools, and it's a separate step to sync it to the peer-to-peer network. There's a global discovery service for advertising archive keys, but it doesn't work at the level of single files. It's very lightweight.
IPFS supports many of the same operations, but you're mostly interacting with a local gateway server which is continuously connected to the network. I believe IPFS tries to content hash every single file so they are de-duplicated globally.
Hashes and append-only logs are not very good for realtime data because of the extra overhead that has to be calculated, but CRDTs are.
CRDTs naturally fit with P2P/decentralized topologies, and we've generalized them in https://github.com/amark/gun which is the most popular (8K+ stars) open source (MIT/Zlib/Apache2) realtime decentralized database.
It is running in production on P2P alternatives to Reddit and other apps that have pushed over half a terabyte in a day.
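For a feel of the API, here's a minimal sketch (the relay URL is a placeholder):

```js
// Minimal gun sketch: writes merge conflict-free (CRDT-style) instead of
// replaying an append-only log. The relay URL below is a placeholder.
const Gun = require('gun')
const gun = Gun({ peers: ['https://your-relay.example.com/gun'] })

const doc = gun.get('example/doc')
doc.put({ title: 'hello', updatedAt: Date.now() })   // mutate in place

// Every subscribed peer sees the merged state in realtime.
doc.on(data => console.log('current state:', data))
```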
It's not incorrect to say that hashes and signatures add some overhead, but the question is whether the overhead is significant enough to matter for the use case. Probably not.
Dat is realtime. You are notified about updates as soon as they are distributed. In Beaker, the files-archive API has a `watch()` method to do this. If you're accessing a dat files-archive, you're participating in the syncing swarm and so you'll receive those updates automatically.
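In Beaker that looks roughly like this; the event name and exact shape of the API are assumptions from memory, so check the DatArchive docs for the current interface:

```js
// Rough sketch of watching a dat for updates in Beaker.
// The 'changed' event name and API shape are assumptions -- see the
// DatArchive documentation for the real interface.
const archive = new DatArchive('dat://<archive-key>')  // placeholder key

const events = archive.watch('/data.json')
events.addEventListener('changed', async ({ path }) => {
  // The update arrives as soon as it syncs through the swarm.
  const contents = await archive.readFile(path)
  console.log('updated:', path, contents)
})
```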
You'll want to use a UDP socket if you're streaming a high volume of data with low latency requirements, for instance for a multiplayer FPS. But Dat has been used to stream live video, so it's probably real-time enough for most Web use-cases.
Small aside: Comparing Dat to CRDTs is apples to oranges. It's like comparing Javascript to B-Trees; they're not quite the same kind of technology. In fact, the new version of Dat uses CRDTs in order to allow multiple users to write to a files-archive.
When we met just a month or so ago you didn't tell me you were adding CRDTs!!! This is very exciting news. Dominic had mentioned a specific type of CRDT he had added (but wasn't generalized).
Append-only logs have overhead that makes them a poor choice for most realtime applications: GPS tracking, yes FPS games, Google Docs, website creators, and many more. Basically any use case where data mutates.
In 2010 I built and used my own custom event-sourcing system; I was the hugest proponent of this / append-only logs. It was so futuristic. But 4 years in I hit all sorts of scaling problems and had to redesign everything from scratch, and that is when I found CRDTs. On all accounts they are superior, in a mathematical or logical manner, because they are a superset of DAGs, immutable/append-only logs, and many other popular data structures. Not apples and oranges.
I built a little multiuser wiki side project that's actually using two levels of CRDTs: hyperdb (underneath hyperdrive) and then automerge on top of that. Sort of hard to explain the full design in a short entry, but you can play with it here:
It's been around for a year or so. I was wondering when it'd make it to HN. It's very nicely done but unfortunately the only implementations are in javascript.
I'm not sure Rust is an especially great language for interop purposes or that the productivity will be high enough to keep up with the JavaScript implementation.
If you want to make a library with a C-compatible API, I'd be tempted to explore SubstrateVM. It can export C symbols in the generated standalone .so / .dll files; maybe you can even reuse some of the JS code, or, failing that, a Java/Kotlin implementation could be compiled down to native code or be usable from other scripting languages like Ruby.
I know the people involved and they were definitely working on it (with funding) a long time ago. I remember talking with them about it around the time of the io.js fork which was in 2015. The project was up and running at that point.
Really neat to see Beaker browser support native P2P with progressive enhancements like github.com/beakerbrowser/hashbase providing the cute URLs we all like.
I'm sure hashbase.io will be blocked really quickly, so it's important that the core P2P address system stays at the forefront. Transports also need to find many ways to communicate over https, shadowsocks, tor, DNS, and others.
Doesn't answer your original question, but I moved to syncthing from dat. It just works out of the box, with no manual setup, and has a very active community around it. It's also open source, has been in production for years with rave reviews, and is being used in plenty of large-scale production projects.
I never understood how Beaker browser can act as a server listening on a port. It sounds like you always need relays on the internet because your router and ISP are going to block all ports unless requested not to.
UDP hole punching (using the uTP protocol) and the discovery network work a lot of the time.
Many of the people publishing public content for access by Beaker are using hashbase.io to "pin" the content and to act as a public peer, and those ports aren't behind a firewall, so the data can be directly replicated easily.
I still wish these guys all the best, but if you start doing 3rd-party stuff like this then everything will eventually devolve like it did for the normal web. We do need a new web and a new way of moving information, but once you start directly connecting to servers, they need to know who you are.
I don’t hold this binary view that a “decentralized Web” has to avoid certain technologies. I believe we should aim to use peer-to-peer systems where it’s impactful and practical, and in the rare case you do need a third-party server, there are things you can do to limit your dependence and make them easy to reconfigure.
That’s our approach with Beaker. We use peer-to-peer systems as much as possible, and then plug in servers as minimally as possible.
I upvoted you and wish you all the best, but the core problem I see with this is that it makes it hard to make policy around.
If I'm making a new system or setting policy for a government or other high-security-minded client (like a political campaign, military contractor, activist group, or private intelligence corp), I need off-the-shelf stuff with zero known attack surface OR I need to individually vet every single offering within that protocol suite. This is why you can email members who work for the Government of Ontario, but they won't click on links to non-whitelisted places. The attack surface when clicking a link is fucking huuuuuge (pdf 0days anyone?), while the attack surface for loading an email is much smaller.
There are a ton of interesting web-replacements that hackers are playing around with right now, but the one that wins for the next web is the one that lets stupid people do whatever they want without worrying. In my opinion, 3rd party means worrying, and in an ideal world it would go away.
The irony of this whole thing is that I'm actively arguing against my own long-term interests. A structural change of the kind I advocate for would dramatically reduce the profitability of being in either data science or cybersecurity; both fields I have a foot in. But I don't care.
Securing the flow of information between people is too important to humanity's long term survival.
Yeah that's an interesting perspective. There are a lot of security issues that come into play when we start toying with how the Web platform works, and I'm somewhat curious whether all Websites should have a sort of "uninstalled" versus "installed" mode, where the uninstalled mode is basically able to do nothing. Then users have to go through an "install" flow to enable the riskier APIs.
I think one other area that the Web hasn't tapped into enough is using protocol/scheme identifiers to introduce strong guarantees to URLs. You can compose schemes with a '+', so I think if you wanted an "on click guarantee" that a site is going to have certain security properties, you might try something like:
http+safe://.../
dat+safe://.../
And then the site would load in a "safe mode" which, like the "uninstalled" mode, is extremely limited in what it can do.
Firefox is in the early phase of implementing support for non-HTTP protocols, starting with Tor. My understanding (from reading some Mozilla comments here maybe?) is that their goal is to make this pluggable so that eventually Firefox can support all kinds of non-standard transports, including p2p ones like dat://
We’re not there yet, but Mozilla is making a very positive step in that direction, and hopefully other browser vendors will follow. If I were creating a new protocol, I would be doing it now, to “skate where the puck is going” as they say.
I think the really interesting things here are more than just "support new p2p transports/protocols". The things Beaker is doing around changing the way we author the web are the real important pieces, and finding the "middle ground" or transition path is one of the keys to success. So, to answer the OP as well, supporting loading assets from traditional servers is pretty much a requirement for authors I think.
"Everyone is a server" is pretty much the opposite of "serverless". The article is just illustrating the migration path for traditional web assets from centralized servers and how you can use them in the distributed web.
A real life use case would be something like:
You want to build a financial tracking app on the distributed web, but your bank is the central source for your data. With the method shown in the article, you could request the API endpoint from bank.com inside your p2p/Beaker app, then load that data into local storage, a dat archive, or memory and use it to track your data. Think Mint, but without giving your bank credentials to a 3rd party.
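Sketched out, that flow might look like the snippet below. The bank endpoint and response shape are invented for illustration, and the bank would still need to allow the request (CORS headers, or an origin permission prompt in Beaker).

```js
// Hypothetical flow for a p2p finance tracker running in Beaker.
// The bank endpoint is made up for illustration.
async function syncTransactions() {
  const res = await fetch('https://bank.example.com/api/transactions', {
    credentials: 'include'   // your own session with the bank; no 3rd party in the middle
  })
  const transactions = await res.json()

  // Keep the data local: in memory, localStorage, or your own dat archive.
  localStorage.setItem('transactions', JSON.stringify(transactions))
  return transactions
}
```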
The name 'serverless' really doesn't explain the concept. Basically it means you don't have to worry about managing servers, and you only do low-effort work yourself by outsourcing most of your logic (authentication, db, etc.) to 3rd-party APIs and vendors.