Using the signed exchange mechanism means you allow anyone to serve your content. You will no longer know when it has been served, or by whom. Instead, Google will know more about what your users are consuming on your website than you do, despite HTTPS!
Also, there is no mechanism to limit who is allowed to serve your content for you.
I see no technical reason why the content has to be prefetched from Google instead of your own server.
It's also confusing for users and administrators. Want to block access to a website in your network? Guess what: your block will not be effective, because Google will proxy the data unbeknownst to the firewall.
The reason it has to be prefetched from someone other than you is to protect the user's privacy. Until they click a link, it is not considered acceptable to leak their search to the potential destination. So links have to be prefetched from a third party that the search engine trusts not to share the data; at the moment that is only Google, but the list will hopefully expand.
Google already can do this by preloading a cached page from its own domain. So this specification is unnecessary.
I think the real reason is that Google wants to build a walled garden, but doesn't want the walls to be noticeable. Even with AMP, they display a header that looks like a browser's address bar [1].
Also, on that page Google admits that it uses AMP Viewer to collect information about users:
> Data collection by Google is governed by Google’s privacy policy.
Which is probably their real motivation for creating AMP.
> Google already can do this by preloading a cached page from its own domain.
That's what AMP already did. This spec is better because it ensures publishers retain control over their own content, and doesn't confuse users by showing "www.google.com" in the URL bar for content that didn't originate from Google.
Publishers might want to display their URL in the address bar. But as a user, I want to see the actual URL, not what Google or the publisher wants to show me. I don't want to see "example.com" in the address bar while I am actually connected to Google over a TLS connection authenticated by Google's certificate, with my IP address being collected according to Google's privacy policy.
What confuses users is Google displaying a fake address bar [1] or the browser displaying the wrong URL.
The URL you see _is_ the actual URL. It doesn't matter where the content was initially loaded from because the page is signed by the publisher's private key (the publisher has full control over the page contents, Google can't alter it).
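To make that concrete, here's a rough sketch in Python of the property being relied on (just the idea, assuming the "cryptography" package; the real SXG format is different: CBOR-encoded HTTP exchanges signed with a certificate carrying the CanSignHttpExchanges extension):

    # Sketch only: the publisher signs its response once; any cache can
    # then distribute the bytes, but cannot alter them without breaking
    # the signature the browser checks.
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    publisher_key = Ed25519PrivateKey.generate()
    response = b"HTTP/1.1 200 OK\r\n\r\n<html>example.com content</html>"
    signature = publisher_key.sign(response)

    # A cache (Google, Cloudflare, ...) hands out (response, signature).
    # The browser verifies against the publisher's key before showing
    # the publisher's URL in the address bar.
    public_key = publisher_key.public_key()
    public_key.verify(signature, response)  # fine: content untouched

    tampered = response.replace(b"content", b"malware")
    try:
        public_key.verify(signature, tampered)
    except InvalidSignature:
        print("altered content is rejected")  # the cache can't modify it

That's the whole trust model: distribution is separated from authorship, and the URL bar reflects who signed the content, not who delivered it.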
The content is served from Google's servers according to Google's (not the publisher's) privacy policy. While Google cannot alter the content, it sees the unencrypted HTTP request. I want neither Google nor the publisher to control the contents of my address bar.
Google already knows the unencrypted contents of the page, and they know you clicked on a link to it (from their search results page). The signed exchanges system doesn't reveal any information to Google _or_ the publisher that they don't already know.
Your browser controls the contents of the URL bar, not Google or the publisher.
I copy your post, but make it available further up the thread. Even though I sync your comment's edits to mine several times a day, I also control Hacker News, so I get them to display your username in place of mine, so as not to confuse readers.
Page and DNS prefetching exist, HTML exists, so why not just link to the page on the original domain?
> I think the real reason is that Google wants to build a walled garden
Exactly, this is the real reason this abomination came into existence. All of it is masked as work for the greater good, all for those poor kids with limited network speed.
In the end everyone will suffer: users will never leave the Google ecosystem, remaining on the search page without even knowing it, and creators will lose control over their own content.
Exactly: Google built a walled garden and is now replacing the fence with glass, because people don't like the view from inside their cage. The worst thing is that it'll get away with it.
Why should the user's privacy be protected from the content provider rather than from the search provider? The search provider already knows more about me.
I think that product is still cached by Google. Cloudflare is just providing the cryptography for Web Packaging so that the browser will show the URL of the original page instead of the Google cache.
A bit more on this... there are a LOT of secondary bots out there, either searching for security holes to exploit or otherwise slurping content for reasons other than search.
JS-based analytics (Google or otherwise) is generally a better option for detecting actual usage. Yeah, you lose maybe 2% of actual users, but you also lose 99% of the various bots. You still have to filter Google's and Bing's bots that execute JS, though.
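As a rough sketch of that last filtering step (the event format here is hypothetical; the "Googlebot" and "bingbot" tokens are what those crawlers actually put in their User-Agent headers):

    # Sketch: drop the JS-executing crawlers from analytics events by
    # User-Agent token. A real setup should also verify reverse DNS,
    # since anyone can claim to be Googlebot.
    JS_CAPABLE_BOTS = ("googlebot", "bingbot")

    def is_probable_bot(user_agent: str) -> bool:
        ua = user_agent.lower()
        return any(token in ua for token in JS_CAPABLE_BOTS)

    events = [
        {"ua": "Mozilla/5.0 (X11; Linux x86_64) Chrome/74.0", "path": "/post"},
        {"ua": "Mozilla/5.0 (compatible; Googlebot/2.1; "
               "+http://www.google.com/bot.html)", "path": "/post"},
    ]
    human_events = [e for e in events if not is_probable_bot(e["ua"])]
    print(len(human_events))  # 1: only the real visitor remains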