I can’t believe there is no way to opt out of AMP as the end user. The UX is so terrible. Oftentimes I will search for something and have a Reddit result come back. When I tap the link, I get the AMP page which:
* does not show all comments, often ones I am actually looking for
* does not let me collapse comment sections
* uses the default white background theme which burns my retinas if I am looking at my phone in a dark environment
* shows overlay ads for the Reddit app that cover about 40% of the screen for no goddamn reason
* requires 2-3 separate actions to get to the original page
Yet I cannot find a browser extension or setting to tell AMP to fuck off. Honestly AMP might be what finally gets me to switch search engines after many years of using Google.
Using DuckDuckGo with a backup of !g (send search to google), I don't think I've ever hit an AMP page in search results in my life. Maybe because I only use !g for really technical searches.
AMP was what made me abandon Google Search in January 2018 for DuckDuckGo.
Surprising how little I've noticed the change, after using Google Search for over 15 years. I try queries on google.com maybe once or twice a week if I don't find what I'm looking for on DDG. If it's anything media or product related, I feel like I'm on an old, crowded MySpace page. DDG feels more like the old Google.
You can even tweak your DuckDuckGo settings (dark theme, etc.) and save them in DuckDuckGo using a passphrase of your choice, which you can then use to restore them and keep them in sync across your devices.
I pick on Reddit because it has the most glaring issues. But there are others. Certain tech sites like Gizmodo come to mind. So do some news sites. It's especially weird when the result is a page that contains a video, and the video is what I actually want but because of AMP it doesn't load, and it's not immediately apparent what's going on.
AMP is straight up broken technology. Imagine if you subscribed to a print version of the NYT but instead of getting the Sunday edition you got a ransom note looking summary of some of the articles from Clipper Magazine. Would you be OK with that?
Use something like Searx [1], either self-hosted or hosted by someone or some organisation you trust. You can still get Google search results if you feel the need, either by explicitly asking for them (!go for search, !goi for images, !gon for news, !gos for scholar, !gov for video) or by enabling the Google engines in your config. In the latter case you get Google results mixed up with the other enabled engines. The results are presented as normal links without redirection through the originating search engine.
Searx can be extended so it would be possible to create a plugin which rewrites AMP links into non-AMP equivalents, where available. It can already do things like Open Access DOI rewrite (Avoid paywalls by redirecting to open-access versions of publications when available) so the ground work has been done. I'm currently working on improving (and fixing, where necessary) the image search engines and will probably start on such a plugin if nobody else beats me to it.
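As a rough illustration of the rewriting step such a plugin would need, here is a minimal standalone sketch. It is not Searx plugin code and does not use the Searx API. The function name `deamp`, the list of cache path prefixes, and the assumption that a leading `s/` marks an https origin are my own reading of how Google AMP viewer and cache URLs tend to look, so treat it as a heuristic.

```python
from urllib.parse import urlsplit

# Path prefixes seen on cdn.ampproject.org cache URLs (assumed, not exhaustive).
CACHE_PREFIXES = ("c", "v", "i", "wp")


def deamp(url: str) -> str:
    """Best-effort rewrite of a Google AMP viewer/cache URL to the publisher URL.

    Returns the input unchanged when it does not match a known pattern.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    path = parts.path

    if host in ("www.google.com", "google.com") and path.startswith("/amp/"):
        rest = path[len("/amp/"):]
    elif host.endswith(".cdn.ampproject.org"):
        segments = path.lstrip("/").split("/", 1)
        if len(segments) != 2 or segments[0] not in CACHE_PREFIXES:
            return url
        rest = segments[1]
    else:
        return url

    # A leading "s/" is assumed to mean the origin is served over https.
    if rest.startswith("s/"):
        scheme, rest = "https", rest[2:]
    else:
        scheme = "http"
    if not rest:
        return url

    query = f"?{parts.query}" if parts.query else ""
    return f"{scheme}://{rest}{query}"


print(deamp("https://www.google.com/amp/s/example.com/story"))
print(deamp("https://example-com.cdn.ampproject.org/c/s/example.com/story"))
```

Note that this only gets you back to the publisher's own domain, which may still be the publisher's AMP version of the page. Reaching the regular page would additionally require fetching it and following its `<link rel="canonical">`.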
It's not really google's fault Reddit's amp sucks. AMP sucks, IK, I'm working on my company's AMP pages right now and they are a PAIN. But with enough tweaking, they can be gotten right. So I wouldn't blame Google for Reddit's devs.
Using the signed exchange mechanism means you allow anyone to serve your content. You will no longer know when it has been served and by whom. Instead, Google will know more about what your users are consuming on your website than you - despite HTTPS!
Also, there is no mechanism to limit who is allowed to serve your content for you.
I see no technical reason why the content has to be prefetched from Google instead of your own server.
It's also confusing for users and administrators.
Want to block access to a website in your network?
Guess what: Your block will not be effective because Google will proxy the data unbeknownst to the firewall.
The reason it has to be prefetched from not-you is to protect the user's privacy. Until they click a link it is not considered acceptable to leak their search to the potential destination. Links have to be fetched from a third party whom the search engine trusts not to share the data; at the moment that is Google, but it will hopefully expand.
Google already can do this by preloading a cached page from its own domain. So this specification is unnecessary.
I think the real reason is that Google wants to build a walled garden, but doesn't want the walls to be noticeable. Even with AMP, they display a header that looks like a browser's address bar [1]
Also, on that page Google admits that it uses AMP Viewer to collect information about users:
> Data collection by Google is governed by Google’s privacy policy.
Which is probably their real motivation for creating AMP.
> Google already can do this by preloading a cached page from its own domain.
That's what AMP already did. This spec is better because it ensures publishers retain control over their own content, and doesn't confuse users by showing "www.google.com" in the URL bar for content that didn't originate from Google.
Publisher might want to display their URL in the address bar. But as a user I want to see the actual URL, not what Google or publisher want to show me. I don't want to see "example.com" in the address bar while I am actually connected to Google via a TLS connection signed by Google's key and my IP address is collected according to Google's privacy policy.
What confuses users is Google displaying a fake address bar [1] or browser displaying the wrong URL.
The URL you see _is_ the actual URL. It doesn't matter where the content was initially loaded from because the page is signed by the publisher's private key (the publisher has full control over the page contents, Google can't alter it).
The content is served from Google's servers according to Google's (not the publisher's) privacy policy. While Google cannot alter the content, it sees the unencrypted HTTP request. I want neither Google nor the publisher to control the contents of my address bar.
Google already knows the unencrypted contents of the page, and they know you clicked on a link to it (from their search results page). The signed exchanges system doesn't reveal any information to Google _or_ the publisher that they don't already know.
Your browser controls the contents of the URL bar, not Google or the publisher.
I copy your post, but make it available further up the thread. Even though I sync your comment’s edits to mine several times a day, I also control Hacker News so I get them to display your username in place of mine, as to not confuse readers.
Page and DNS prefetching exists, HTML exists, why not just link to the page on the original domain?
> I think the real reason is that Google wants to build a walled garden
Exactly, this is the real reason why this abomination came into existence - all of this is masked as work for greater good all for those poor kids with limited network speed.
The end effect is that everyone will suffer: the user will never leave the Google ecosystem (he will remain on the search page without even knowing it), and the creator will lose control over his own content.
Exactly: Google built a walled garden and now replaces the fence by glass, because people don't like the view from inside their cage. The worst thing is, that it'll get away with it.
Why should the user's privacy be protected toward the content provider instead of the search provider? The search provider already knows more about me.
I think that product is still cached by Google. Cloudflare is just providing the cryptography for Web Packaging so that the browser will show the url from the original page instead of the Google cache.
A bit more on this... there are a LOT of secondary bots out there, either searching for security holes to exploit or otherwise slurping content for reasons other than search.
JS based analytics (google or otherwise) is generally a better option for detecting actual usage. Yeah, you lose maybe 2% of actual users. You also lose 99% of the various bots. You still have to filter google's and bing's bots that execute JS though.
This is such a strange reaction from HN. The AMP cache URLs have been a top 3 complaint about AMP here. "I can't copy-paste URLs, it's hard for users to understand which site they are on, it looks like the content is provided by Google rather than the real provider", etc.
Now there's a solution that preserves the preloading and validation benefits of AMP caches but maintains the original URLs, in a way that's cryptographically sound, in the process of being standardized, and controlled by the publisher. This gets launched much faster than one would have expected. And suddenly everyone pretends that the AMP cache URLs were never a problem and this is some kind of a power-grab.
I think that most people are worried about Google using a controversial[0], draft web "standard" (Signed HTTP Exchanges), that introduces a major change in how the web works, in mass production, without trying to first resolve the problems raised with the proposal.
[0] For instance, Mozilla considers the current specification to be harmful[1].
Currently, it's difficult to implement. However, unlike rewriting a page in AMP, signing the page is a purely mechanical operation. All that's required is to improve the tooling; it is theoretically possible for this to be a one-click change for any website out there. Initially, adding gzip support to a web server was difficult and out of reach for many webmasters; now it's basically universal.
There was an attempt to address Mozilla's concerns[1], but Mozilla never responded, unfortunately. If the Mozilla community chooses not to respond, that might cause people to consider whether or not their position should be given much weight.
What do you mean they never responded? They say they are working on a response[0]. Taking time to respond and informing the other party that it will take a while is not "never responded".
3 months is a long time.... how long is someone supposed to wait for a response before you just move ahead? If the answer is "forever", it becomes trivial to perform a denial of service attack on a standard. Mozilla specifically said, "this is not high priority for us". If it's not high priority for them to respond, that's fine, but waiting forever doesn't seem like a reasonable thing to require.
I am against this standard. First, I want to see the real URL in the address bar. Second, I don't want Mozilla to spend resources to implement specification that is made by Google for its own purposes.
I simply don't trust Google to not change the rules later.
What will stop Google from down-grading 2nd class URLs (ie, not hosted with google) to page 2 results?
It's effectively the same thing as having no AMP at all, yet they cleverly got everyone on board with this tactic.
Edit: I just skimmed through this... this looks _WORSE_ than having Google show their domain. This is some of the sneakiest most deceitful garbage I could have ever imagined.
Just no way. Need convincing? Look at the animated gif half way down:
On their page about AMP Viewer Google admits that they are collecting user's data when they view AMP pages [1]:
> The Google AMP Viewer is a hybrid environment where you can collect data about the user. Data collection by Google is governed by Google’s privacy policy.
With replaced URL it will be more difficult to spot.
Because literally the entire private reason for AMP is a power grab.
Not the public reason, but absolutely the private reason.
If Google, Apple, Amazon, Microsoft, or whatever publicly traded company makes a move, it's for money and power, and preferably power, since that yields even more money.
AMP on web and email is the perfection of embrace, extend and extinguish
They also push a browser based on Chromium. I wonder how many intentional incompatibility issues there will be with other web browsers that are not developed by G$$gle.
Well, I wouldn't call this a solution just yet. If you read through the documentation, you'll find that this won't work on shared hosts and requires a TLS certificate "that supports the CanSignHttpExchanges flag. As of April 2019, only DigiCert provides this extension." [1] Plus, as if the lift of transforming HTML into AMP HTML wasn't already big enough for your average web site owner, implementing signed exchanges will be over the head of 99% of the folks building web pages on the Web.
IMO, while the URL problem was a big issue, the bigger issue is that AMP's restrictions and limitations give your users a neutered user experience in the end. As others have pointed out, if it wasn't for Google's implicit requirement to implement AMP (e.g. to get into their carousel and other locations), AMP would have been DOA.
> as if the lift of transforming HTML into AMP HTML wasn't already big enough for your average web site owner, implementing signed exchanges will be over the head of 99% of the folks building web pages on the Web
Converting web pages into AMP isn't something you can automate, but supporting signed exchanges is. You need certificate authorities to support the flag and web servers to support the protocol, but if this catches on then the only thing you'll need from the site owner is the decision on whether to allow it.
Well, it's disappointing DigiCert didn't tell Google to fuck off. I hope this never comes to something like Let's Encrypt, so the vast majority of developers can never use this.
Sometimes, Google needs a gentle nudge from users saying "we don't like this" and hope they reconsider (I doubt it).
> I hope this never comes to something like Let's Encrypt, so the vast majority of developers can never use this.
Let's Encrypt's response:
> I think it’s likely too early in this draft’s development for Let’s Encrypt to prioritize implementation. It looks like it has a ways to go within the IETF before it would be an internet standard.
Honestly, if comcast and friends started blocking this crap by default (with an opt in for people that want to be spied on by google) I’d take back at least half the mean things I’ve said about Pai.
AMP has a lot of things all together, some of which I like:
* I like that when AMP is used for ads then the ads are fully declarative. Advertisers getting to run custom javascript, even in a cross-domain iframe, isn't great.
* I like that AMP allows sites (currently primarily search engines) to trigger preloading in a way that doesn't leak information to the site that is being preloaded.
* I like the way things like "sorry AMP only allows us to use 50k of CSS" can give developers leverage to push back against bad site designs.
* I like that it centralizes some measurements: instead of every ad provider using their own custom polling system to determine if the ad is on screen they can all subscribe to events triggered by a single well written system. This doesn't affect the amount of tracking (there's lots either way) but it makes it hurt the user experience less.
* A lot of people that don't want to implement AMP are doing it because then they get more search traffic. I understand how there isn't currently a non-AMP way of doing preloading in a way that doesn't leak information to the site (see above) but I think Web Packaging should be extended to support this in the general case and allow publishers to use AMP only if they want to.
* The interaction between AMP and content blockers isn't great. If you have a content blocker set to allow some JS but not all (for example, no third party JS) then it's not going to run the AMP JS or the contents of the <noscript> block, and AMP pages will render with 8s of white screen before the CSS times out. This is a pain, but I'm not sure what the right way to fix it would be. (I wish content blockers were smart enough to figure out which <noscript> tags to run, but that's probably asking too much.)
If you wanted to expand on how AMP seems like an attempt at a walled garden I would be interested in reading it; I haven't previously read any explanations that made sense.
> Do you trust google to treat non-AMP pages the same as AMP pages?
Google clearly doesn't treat AMP and non-AMP pages the same way: only AMP pages are eligible for the carousel in Google search, and there's a little icon.
Once there's a way for non-AMP pages to be safely preloaded I would be very surprised if Google search didn't start doing that, though. (Speaking only for myself, not the company.)
Fair enough but there is now zero need to load them from the AMP cache at all - this security model could allow News Carousel to load them from the originating site and still have access to the pre-rendering instant load magic/lies that AMP provides.
It feels a little dodgy to me this standard and a bit embrace extend but I'll see how it plays out and reserve judgement until we see this happening in the wild and how well it works. Personally I'd like to be informed in the browser chrome that it was being served via this mechanism rather than me visiting the original site.
Can you maybe see that people feel the browser is now lying to them about where the content is coming from?
If you're loading the content from the originating site, surely there's no benefit at all to signing. If you're loading the content directly from the site, the browser just needs TLS to verify the integrity of the content.
And you're also back to the situation where you can't preload the content in a controlled manner or privacy-preserving manner, nor have the page-speed guarantees since the version being served to the user is not the version that Google crawled.
It's kind of the opposite. The cache is where the actual benefits come from. That's not the part you want to get rid of. The AMP spec was just a vehicle for making the caching possible in a secure manner.
This model would theoretically allow the validation, caching and prefetching to be done for all (signed, so opt-in by the publisher) HTML pages. Which is another one of the historical top complaints about AMP: why can't light, fast-loading, mobile-friendly HTML get the same treatment in search results.
> Can you maybe see that people feel the browser is now lying to them about where the content is coming from?
I can see that they are feeling like that, I just don't understand how they arrived there.
How is this different from, e.g., company X's website being behind Cloudflare? The browser didn't contact the actual server that company X hosted the content on. Instead the browser contacted a server run by Cloudflare that could prove cryptographically (via TLS) that it was authorized to serve content on behalf of the actual site.
> And you're also back to the situation where you can't preload the content in a controlled manner or privacy-preserving manner...
A few people have pointed out the privacy-preserving aspect of AMP. I'm not sure I get how that's the case. Is this referring to the fact that the page is not being pre-loaded from the content owner's own webserver? The main privacy violators on the internet are Google and Facebook. How is loading something from Google cache protecting my privacy?
Worse still, if someone posts an amp link on Twitter or a chat client Google now gets to know when I access a specific website even though they are an unrelated third party[1].
Edit: [1] In practice this was probably already the case since Google Analytics is so popular. But still.
If you make a search query, but have not clicked on any results, you have a privacy expectation that the web servers of the search results you have not clicked on will not know you performed this query, your ip address, cookie, etc. For example, if you search for [headache] and then close the window, mayoclinic.com knowing that you made this query would probably be a surprising result.
With naive preloading, you would preload a search result from that origin. Your browser would make an HTTP request to that site, sending your IP address, the URL you are preloading, and any cookies you may have set on that origin. So, this approach would violate your expectation of privacy.
Instead, if the page is delivered from Google's own cache, the HTTP request goes to Google instead of the publisher. Google already knows that you have made this query, and are going to preload it (the search results page instructed your browser to do so in the first place). The request will not have any cookies in it except for Google's origin cookies, which Google already knows as well. Therefore this type of preload does not reveal anything new about you to any party, even Google.
AMP has been doing this for a long time in order to preload results before you click them. However, until Signed Exchanges the only way to do this was that on click the page would need to be from a Google owned cache URL (google.com/amp/...). With Signed Exchanges, that can be fixed. The network events are essentially the same.
Note that once the page has been clicked on, the expectation of privacy from the publisher is no longer there. The page itself can then load resources directly from the publishers origin, etc.
To your last point, if someone posts a link on twitter to an AMP page on a publisher domain, and then you click it, your browser will make a network request to the publisher's origin. Google will not be involved in this transaction in any way. If someone explicitly posts a link to an Google AMP Cache Signed Exchange, then yes this will trigger a request to Google but this will be far less likely going forward as these URLs will never be shown in a browser. For example, try loading https://amppackageexample-com.cdn.ampproject.org/wp/s/amppac... using Chrome 73 or later. This is a signed exchange from one domain being delivered from another. You'll never see that URL in the URL bar for more than a moment, so it's unlikely to ever be shared, like I'm doing now.
Thanks, this was very informative. I'm not a fan of AMP at all, but this helps me understand the reasoning a little bit better and why Google hosting the AMP cache is necessary for preserving privacy.
At its root, I think my objections to AMP boil down to a few things:
On a technical level:
1. It's buggy and weird on iOS.
2. I'm not convinced I care about a few seconds of loading time enough to justify the added complexity of making this kind of prefetching possible. Additionally, this seems like a stop-gap that will be rendered unnecessary by increasingly wide pipes for data.
On a philosophical level:
3. It gives Google way too much power over content.
4. I want the option to turn it off completely because of points [1] and [3], and because I fundamentally want to feel in control of my internet experience.
Edit: The point about SXG making AMP URLs less likely to get copy/pasted to other mediums is a key benefit I hadn't considered and will likely make avoiding AMP outside of Google search easier.
2. How many URLs do you load in a day? My browsing history over the last 10 years averages to 417 pages per day. 2 seconds per URL is 35 days of my life...
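For anyone who wants to check the arithmetic, a quick sketch using only the figures quoted above (417 pages per day, 10 years, 2 seconds per page) and assuming 365-day years:

```python
PAGES_PER_DAY = 417
YEARS = 10
SECONDS_PER_PAGE = 2
SECONDS_PER_DAY = 86_400

total_pages = PAGES_PER_DAY * 365 * YEARS
total_seconds = total_pages * SECONDS_PER_PAGE
total_days = total_seconds / SECONDS_PER_DAY

print(f"{total_pages:,} pages, {total_days:.1f} days")  # 1,522,050 pages, 35.2 days
```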
Bandwidth increases do not fix latency. If a document has to round trip from the other side of the planet, that adds about 200 milliseconds until we break the speed of light. If that same document must make several round trips to be able to initially load (very common!) this adds up rather quickly. The only solutions are localized caching and prefetching.
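The latency point can be sketched with a toy model. The constants here are assumptions for illustration: signals in fiber travel at roughly two thirds of the speed of light in vacuum (about 200,000 km/s), the far side of the planet is half the equatorial circumference away, and the page size, round-trip count and link speeds are made up.

```python
EARTH_CIRCUMFERENCE_KM = 40_075      # equatorial, approximate
LIGHT_IN_FIBER_KM_PER_S = 200_000    # roughly 2/3 of c in vacuum (assumed)


def round_trip_ms(distance_km: float) -> float:
    """Minimum time for one request/response pair over the given distance."""
    return 2 * distance_km / LIGHT_IN_FIBER_KM_PER_S * 1000


def load_time_ms(distance_km: float, round_trips: int,
                 size_bytes: int, bandwidth_bits_per_s: float) -> float:
    """Toy model: latency cost of the round trips plus raw transfer time."""
    latency = round_trips * round_trip_ms(distance_km)
    transfer = size_bytes * 8 / bandwidth_bits_per_s * 1000
    return latency + transfer


antipode_km = EARTH_CIRCUMFERENCE_KM / 2
print(f"one round trip: {round_trip_ms(antipode_km):.0f} ms")

# Hypothetical 100 kB page that needs 4 round trips before it can render.
slow = load_time_ms(antipode_km, 4, 100_000, 10e6)   # 10 Mbit/s link
fast = load_time_ms(antipode_km, 4, 100_000, 1e9)    # 1 Gbit/s link
print(f"10 Mbit/s: {slow:.0f} ms, 1 Gbit/s: {fast:.0f} ms")
```

In this model a 100x wider pipe only removes the transfer term, while the round-trip term stays the same, which is why caching closer to the user or prefetching is what actually moves the number.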
Yes, that is exactly why people think this is creepy. I also expect you not to use my battery rendering shit in the background that I haven't asked you to, or data that, again, you don't have permission to use. Just because the majority of users don't care doesn't mean you, gregable, are not corrupting the foundation of a free web. I still feel you're making the web super creepy and grabbing extra data, and the whole goal of the project could be accomplished without this embrace and extend: deranking slow pages more aggressively doesn't lead to a two-tier web and doesn't tie everyone further into Google's brainwashing algorithms. With this "solution", at least for now, Chrome doesn't visit Google for these new-style links from elsewhere, so that is some improvement. The fact that this whole project should not exist, adds zero value, and can't be opted out of is a massive problem for me.
If the browser were to prefetch search results, it would leak information to all the result pages about the user having done that search. (I once had a blog post accidentally rank on the first page for "XXX". I really don't want to know who is searching for that particular term.)
Google has to know what you're searching for to compute and show the results. So there are few additional privacy implications from the preload.
And your last case is exactly what will no longer happen. People will now copy-paste the original URL rather than the cache URL. Click on the link, and you're taken to the original site.
> If you're loading the content from the originating site, surely there's no benefit at all to signing. If you're loading the content directly from the site, the browser just needs TLS to verify the integrity of the content.
The browser security model stops them from doing this, but presumably in this new world they could allow this to work and not host the content in the carousel themselves.
I think the argument about content suddenly becoming "slow" and no longer AMP validated if it's not served from the AMP cache is a poor one.
Finally, I'm willing to postpone judgement, but I did just explain why people feel that Google is embracing and extending the web. If you can't understand why people are worried about this, that's not something I can help you with ;-)
Cloudflare does not have the same scope, power, monopoly or scale that Google have - I can change CDN provider if they start doing weird stuff, no problem, but I can never really get away from Google.
My biggest quarrel with this is that it's just another way for Google to take control over the internet. Does any search provider other than Google use AMP? Does any browser other than Google's own support this?
How busy are you? You can't wait 0.5 seconds for an HTTP request? And you think it's worth feeding Google more precise data about your movements online than they already have? And, as a business integrating AMP, losing control over your own content and platform? Why?
my counter argument to this would be: we don’t need more corporate control of the internet and standards, we need less... bing throwing their weight in isn’t any better imo
I don't have links to hand but everything I've seen shows real dropoffs in users as you increase the time. Once you're looking at low numbers of seconds you're looking at significant numbers of users simply abandoning the site. Half a second extra is not insignificant, and the user experience changes a lot between things that feel instant and things that have a noticeable wait.
Yes but you don't need AMP to have a fast loading website or even one that applies the same principles as AMP when it comes to having inline CSS, loading scripts async etc. The biggest problem in all of this is usually ads and analytics anyways.
Also Google controlling AMP specifications means Google can decide what widgets (from what companies) can be there on the page, what ad networks and analytics systems can be used.
This is actually the opposite: users are deceived because they think they connect to publisher's site but in fact they are still inside Google's walled garden. Their data are collected according to Google's privacy policy but it is difficult to spot looking at the address bar.
Also, Google controlling AMP means that Google decides what analytics systems and ad networks are allowed on the AMP page. With Google having its own ads and analytics business, doesn't this tempt them to make life a little easier for its own products and a little more difficult for competitors'?
As bad as the URLs were, at least you could edit them to get back to the non-AMP version if you were technically literate enough. Now there'll be no distinction, you could get sent to an AMP link from Google which is a lesser experience than the 'real' site and have no way of getting out.
I believe if you refresh the page it triggers a request to the original site, which will probably then choose to give you the non-AMP version of the site.
It only works in Chrome. The Web has now been split in two, and you now have to use Google Chrome to be on the faster version. Google is shamefully abusing its power in several places here.
Google is unethically abusing their power against non-Chromium browsers like Firefox. Speed matters in the eyes of users, even if we individually block AMP. See the link below for a general pattern.
Google just gives users what they want. I've checked the link you provided and the website is total wreck in terms of user experience (subscription popup, large obtrusive ad banners and so on).
I have push notifications disabled, but it wouldn't surprise me if they ask you to subscribe to push notifications on the first page view.
The current era of content websites is a disaster, except for a few cases like Medium and maybe Reddit with a discount.
AMP is the only solution for general users who just want to google a cooking recipe or the latest news in their town.
First, everyone goes out of their way to break REST[1] caching by eliminating proxies from SSL (for some good, some bad reasons).
And now we're trying to shoehorn it back in?
It used to be that a local caching squid proxy was a great way to make load times of various "front pages of the Internet" bearable on a shared low bandwidth uplink (local/national news sites etc typically being served from the cache/lan).
New SSL/TLS kinda-sorta breaks that (there's no middle ground: either install an intercepting cert that catches everything, or abandon caching on everything. Either cache CNN.com and medical records, email (webmail) and Facebook messages, or neither).
AMP might be a bridge too far - but some kind of (semi) public "signed, not encrypted" would still be a good fit for hypertext applications/documents - because of the caching benefits.
the preloading and caching seems like a marginal speed boost. the main win from AMP is just the stripped-down format, which does not require the cache.
Different groups of people are commenting at different times.
Also, Mozilla members rally around a ton of stuff here on HN. That's why you see so many posts about Rust despite the fact that it's not really that popular. That's also why the top comments on stories about MS Edge switching to Chromium were lamenting the fact that they didn't choose Firefox, despite the fact that hardly anybody uses Firefox.
What's wrong with it exactly... beside being weird. I'm not a fan of manipulating the URL the way they do with this change, but couldn't you just opt to not use AMP if you don't like it?
Ideally people would develop fast sites on their own, but apparently they need the help of Google.
If you don't use AMP your search engine placement suffers. Often dramatically, as all the pages in Google's top-most carousel are all AMP pages.
And AMP is a pain in the ass. It's sold as being "just HTML" but it isn't, really. You can't even use an <img> tag, it has to be <amp-img>. So you have to generate two versions of every page. Achievable for large companies but if you don't have a lot of resources that's a big overhead. As is so often the case, it helps concentrate all web traffic to a smaller and smaller number of sites/publishers and shutting the rest out. That's not good.
The issue is that you can't, or you risk your site being basically blacklisted from Google. Especially if you're a news site.
Users have no control outside of not using Google. If Google were to provide a setting for the user to never see AMP, I would have less of an issue with this. But they don't.
Instead, they basically force publishers to use this because if they don't the news carousel will not show their article.
It just gives Google more control over the web for benefits that are minimal at best.
Nah I see AMP staying around long enough to capture a significant portion of web share that they mine data from. Essentially it's just a way to insert themselves as "the internet".
We need to decouple two things that are mashed together in this post:
Web packaging and Signed exchanges seem benign and beneficial, you can sign a particular page inside a package (let's say a zipped folder of some kind) and now anyone can cache that data and show it, while both the browser and the user know that it's safe to display it. Since the AMP format is similar, it seems quite beneficial to now have all your AMP content support this feature. And anyone who made some of their pages AMP can use that same process to support other Signed Exchanges (such as p2p networks or CDNs). This is great since it makes distributed caching much easier.
The bad part is that google search uses this signed exchange format not to show the actual URL but rather put it in an iframe inside chrome (and only chrome). The real question is whether we will be able to use this functionality outside search, if I have my own site and show a large iframe with signed exchange page, will I also be able to change the browser url bar? mmph, probably not.
There is a little confusion here, understandable. Google search will not show these signed exchanges in an iframe, the pages are full frame.
Try it for yourself. Using Chrome 73 or later (you probably already have this), and a mobile browser (either a phone or mobile emulation), try the query [amp dev success stories].
It will only use signed exchanges in Chrome because currently only Chrome supports signed exchanges. The search engine explicitly looks for the browser to state that it supports signed exchanges in an Accept header, like any other new technology.
Yes, any page can use this. So, for example if you went and fetched a signed exchange from https://amppackageexample.com/ (or any other site that supports one, this is just an example), you could then serve that from your own server, more or less just like any other file (the less is that you need to set the right Content-Type header, but it otherwise works just like serving an image or a zip file).
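To make the Accept-header negotiation described above a bit more concrete, here is a minimal Python sketch of the kind of check a server could do before handing out a signed exchange. It assumes the browser advertises the `application/signed-exchange` media type (with a version parameter such as `v=b3`); the function name and the sample header strings are made up for illustration and are not taken from any real server.

```python
def accepts_signed_exchange(accept_header: str) -> bool:
    """Return True if the Accept header lists the signed-exchange media type."""
    for item in accept_header.split(","):
        # Drop parameters such as ";v=b3" or ";q=0.9" and compare the bare type.
        media_type = item.split(";", 1)[0].strip().lower()
        if media_type == "application/signed-exchange":
            return True
    return False


# Hypothetical header values, only for demonstration.
chrome_like = "text/html,application/xhtml+xml,application/signed-exchange;v=b3;q=0.9,*/*;q=0.8"
other_browser = "text/html,application/xhtml+xml,*/*;q=0.8"

print(accepts_signed_exchange(chrome_like))    # True
print(accepts_signed_exchange(other_browser))  # False
```

A browser that does not send the type simply gets the regular page, which is why nothing breaks for browsers without support.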
Forgive my lack of know-how, but does this theoretically mean I could download this _signed package_ to my computer along with the signature and use it later to prove that the information was provided by the source according to the signature?
I'm not quite following which parts were or weren't needed for what's been enabled in the post here. For the use case of delivering a single offline package that can be opened like a website, is there something that works yet? Or a repo I should be following other than the spec?
Once I can create webpackages and deliver them to clients a lot of things I want to do become hugely easier and nicer.
Oh interesting! I'll see what I can find there, thanks!
I also had a look in the blog and the "progressive web apps" might be the right thing to look at. There's probably something subtle that's different but I think I can use these to solve the actual problem I have.
edit - damn, I don't think this is right at all. Frustrating as it seems pretty perfect but I have to serve from my own domain for 30s before a user can install it :( I just want a single file way of delivering web content! It seems like all the features are basically there, just with restrictions to focus on different use cases.
You could prove the document was signed using the source's private key. That does prove the document was signed by the source if you can prove that only the source had access to the key.
Good question. The publisher signs an expiration timestamp in the Signed HTTP Exchange. The publisher can choose this timestamp and the browser will not respect signatures with expirations in the past. Note also that the specification requires, and browsers enforce, that the expiration cannot be more than 7 days in the future.
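The two rules in that comment (no expired signatures, and no lifetime longer than 7 days) can be sketched in a few lines of Python. This is only an illustration of the logic as described here, not the browser's actual implementation; the parameter names `signed_at` and `expires`, and the choice to measure the 7 days from the signing time, are assumptions.

```python
from datetime import datetime, timedelta, timezone

MAX_LIFETIME = timedelta(days=7)  # the 7-day cap mentioned in the comment


def signature_is_acceptable(signed_at: datetime, expires: datetime, now: datetime) -> bool:
    """Check the two conditions: not expired, and lifetime within the cap."""
    if expires <= now:
        return False  # the expiration timestamp is already in the past
    if expires - signed_at > MAX_LIFETIME:
        return False  # the publisher asked for a longer lifetime than allowed
    return True


now = datetime(2019, 4, 17, tzinfo=timezone.utc)
print(signature_is_acceptable(now - timedelta(days=1), now + timedelta(days=5), now))  # True
print(signature_is_acceptable(now - timedelta(days=3), now - timedelta(days=1), now))  # False
print(signature_is_acceptable(now, now + timedelta(days=8), now))                      # False
```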
Wouldn't it be better to borrow from HTTP and allow a head request to the original source - with a reply of a current signature?
Isn't this whole exercise really just adapting public key signatures on top of old school caching?
With an HTTP proxy you ask for a URL, the proxy fetches or serves on behalf of the owner. This adds some circumvention around the way tls/ssl breaks that type of caching. But it should still be able to do a head-like request for a current signature - with no need to download the content again if it is unchanged?
Doing this on every page load breaks either user privacy (by making the origin fetch before the user clicks) or the preload performance gain itself (by blocking load while waiting for this round trip).
But if the signature is expired, preload would fail anyway, which would trigger a regular load "on click" - but that click should maybe result in a head request for possibly just getting an updated signature?
It's ridiculous. Google wants to keep users at its domain so much that it invents a whole technology to substitute address bar contents. This shows how harmful it is when a company has a significant market share in several different areas (browsers and search engines).
I hope at least Mozilla doesn't adopt this technology and will show the true URL.
This technology is complicated. Browser vendors have to implement all of this only to please Google.
Last week I blocked Google from my domains (blog: lucb1e.com/!130), hopefully others will follow suit and degrade the search quality until people get better results (at least for some more obscure content) elsewhere, or perhaps until Google notices we are really not okay with their behaviour.
Are you blocking GoogleBot by IP range or User-Agent match? Why aren't you using your robots.txt file to block GoogleBot instead of or in addition to your server-side logic?
Robots.txt was my first thought as well, but that is said to not actually block your site from appearing in the results. They'll gather from other sites what the page is about (think <a href=mysite/somepage>how to knit a sweater</a>) and show that as title without page summary. Maybe if it looks like the site is down, they won't bother.
Blocking is based on user agent, they seem to set that reliably and the IP addresses change. You can do some reverse lookup magic but this was way easier than looking up every single IP that visits my site.
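For anyone curious what a user-agent based block looks like in practice, here is a minimal Python sketch. It is not the commenter's actual setup; the token list and function name are hypothetical, and it only does a substring match on the header, so it skips the reverse-lookup verification mentioned above (anyone can fake a User-Agent).

```python
# Hypothetical list of crawler tokens to refuse; matched case-insensitively.
BLOCKED_AGENT_TOKENS = ("googlebot",)


def should_block(user_agent: str) -> bool:
    """Return True if the User-Agent header contains a blocked crawler token."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_AGENT_TOKENS)


crawler = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
visitor = "Mozilla/5.0 (X11; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0"

print(should_block(crawler))  # True
print(should_block(visitor))  # False
```

A server would call something like this per request and answer with an error status when it returns True.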
This is the exact opposite of "keeping users at its domain". That was the situation _before_ they implemented this standard. Now users will get sent to the publisher's domain instead (via a prefetched page load).
No they don't. The page contents are controlled by the publisher and cryptographically signed so Google can't alter it. Another improvement over the previous situation.
Remember the talk about how the Chrome team was going to "rethink" the navbar, and what domain and site identity really mean? And people were a little worried about this?
Turns out people were right to be suspicious. This is hot garbage. You can no longer ask a user "What URL does your navbar say you're at?". It is no longer a source of truth. They will actively be lied to.
But what does it mean that you are on a particular URL?
For a long time already it hasn't meant being connected to a particular physical server. Now it's the next step - to be completely decoupled from the server and just mean content instead.
This is meant to offload tracking from just Google Analytics and SERP clicks, which are used to track user behavior (but can be blocked) into services that cannot be blocked beyond Google domains.
If Google hosts the website and is masking the resulting url, they're able to have more visibility than Google analytics. They'll likely give this AMP some SEO boost temporarily and that will get web admins to adopt the technology.
It's just like reCaptcha, which is used to track users across the web (requires google.com + gstatic.com urls to load, which drops its own cookies or scans existing ones), blocking recaptcha will break core web functionality... and recaptcha v3 is even worse.
Web publishers don't necessarily want their content decoupled from their own servers, but they don't have a choice now if they depend on traffic from Google.
You are not decoupled from the server. Google still sees the HTTP requests you make in plaintext and collects your data according to their privacy policy. It just won't be obvious because the publisher's URL is in the address bar.
there was no need to be suspicious. google wasn't being sneaky about it, they have been actively talking about, promoting, and openly developing this feature for at least a year.
This sounds terrible. Does it mean that browsers will begin lying to users and say that the users are visiting the website's server when they are really visiting a restricted version of the website that is hosted in Google's cache? I don't want my content restricted or hosted in Google's cache.
AMP doesn't load in a privacy sensitive way. It's on Google's servers and it takes many seconds to load if you have JavaScript disabled.
Also, the feature only works on Google Chrome and possibly Edge, which gives another point to the article below.
The browser displays the URL from the origin that digitally signed the unmodified content.
A browser already doesn't show you what server delivered the content. That would be your wifi AP, cell phone tower, or ISP node. The internet has already long established that we can trust content without trusting intermediaries.
There are two elements that are important: integrity and privacy. The content integrity is protected via a digital signature, the "signed" part of "signed http exchanges". The signature proves that the document hasn't been tampered with.
Regarding privacy: The intermediary (a search engine in this case) already has the content being delivered as a result of crawling it. It also knows the user clicked on a link to get that content, and knows the user's ip address. Even without AMP or Signed Exchanges, the privacy situation is the same. Once the page is loaded, all further interactions with the origin are normal https traffic, so later requests are not different in privacy either.
What this enables, for search results, is the ability to load the bytes of the content before the user clicks a search result. If the browser prefetched those bytes with the origin's awareness, then the user's privacy with respect to the search query would be violated, making prefetch problematic. With this setup, documents can be prefetched while preserving user privacy and after the user clicks all browser behavior continues as normal from that point forward.
AMP allows Google to see exactly how you interact with every page on the internet.
Just from the text of the pages you visit they can build a profile around you. What your interests are, how much of an article you're likely to finish, whether you're the type of person to highlight text as you read, etc.
Unless you live on an island with a poor satellite connection AMP is useless as anything more than a corporate user data collection tool.
AMP documents don't share user data with Google, which can be trivially seen by inspecting the network events that the page generates.
If the publisher chooses, they can send logging to Google Analytics, but this is not part of AMP.
The typical argument otherwise is that the AMP javascript is loaded from Google's cache, however these javascript resources allow for a very long cache lifetime (1yr if the page came from the Google Cache), so relatively few page loads will actually end up fetching them from the network for most users.
Edit: These resources are also on cookieless domains.
> The typical argument otherwise is that the AMP javascript is loaded from Google's cache, however these javascript resources allow for a very long cache lifetime (1yr if the page came from the Google Cache), so relatively few page loads will actually end up fetching them from the network for most users.
No, if Google can change the way the web works from day one they can change anything they want. Don't forget Google is killing imap and dns already. Why not http too?
Also, Google explicitly states that it is collecting data in AMP Viewer [1]:
> The Google AMP Viewer is a hybrid environment where you can collect data about the user. Data collection by Google is governed by Google’s privacy policy.
I assume they collect information from the HTTP request the browser sends when requesting an AMP page.
We are talking about their AMP cache. If you don't use Google's services, you'll never get there, unless you like to prepend their AMP cache URL to your links.
Their AMP cache happens only on their search service. They already know which links you click... having an AMP cache on top doesn't give them MORE information than they already get. The use of that cache also makes sure the website doesn't get more information because it's preloaded.
If (or when) the share of those privacy-conscious users rises, Google might motivate webmasters to compile GA scripts into the main JS script, and considering pretty much any website nowadays just doesn't show content with no Javascript enabled, it would be much harder to avoid.
I browse mostly without javascript on and that's not true; easily more than half of websites work just fine without it, and that number goes far up if you accept some lack of features. Though there are some that indeed don't work at all.
Although your point is well taken that there could be ways to sneakily track users eventually despite the aforementioned measures, and potentially even without javascript being required (though I doubt that share of privacy-conscious users will ever rise significantly - most people simply don't care).
Google can't tell if a link has been clicked if JavaScript is off and the `ping` attribute is removed, so AMP removes privacy there.
By forcing web publishers to host their content on a Google cache, they lose their server-side logging and the ability to determine how they set up the way they serve their own sites.
Also, why do you artificially slow page loads on AMP pages to 8 seconds when JavaScript is disabled? That is a privacy issue.
The linker (google in this case) could rewrite the link to use a redirector if they choose. If Javascript is off, AMP and thus Signed Exchanges are disabled on Google search results anyway.
You misunderstand the 8 second CSS animation in the AMP boilerplate. Here's the code (simplified):
<style>
body { animation:-amp-start 8s steps(1,end) 0s 1 normal both}
@keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}
</style>
<noscript>
<style amp-boilerplate>
body{animation:none}
</style>
</noscript>
See the noscript section: if javascript is disabled, the CSS displays the body immediately. If Javascript is enabled, but for some reason the AMP javascript fails to load, after 8 seconds, the page is displayed anyway. The page is probably somewhat broken without the javascript loading, but the 8s is a fallback, not code to slow down non-javascript browsers.
There are legitimate (privacy/speed) reasons to not load AMP's JavaScript while still not turning off JavaScript entirely. Google does have the capability to know when you're on an AMP page, because the JS loads from ampproject.org, which is registered by Google.
An 8-second delay seems like an intentional "bug" to coerce users to turn on JavaScript (and advertising).
The javascript is heavily cached, so will not give a request on every page load.
That is not the intention. If javascript is disabled entirely, Google Search won't even load AMP pages. The scenario you describe of a user loading an AMP page directly without javascript enabled is somewhat rare.
Many people use tools to block third party JS from loading. AMP can't be called privacy-friendly while making it extremely difficult to use when tracking (AMP Analytics) is blocked. The 8-second delay happens to me every time I accidentally click an AMP URL in my browser.
I don't use Google Search, and I frequently get sent to Google's AMP cache via other link sources (e.g. HN).
I don't have javascript blocked, but I do have Google's tracking blocked via standard tracking protection (which is now a built-in feature in most non-Google browsers), which means <noscript> tags are not triggered, and I get the 8 second delay due to non-loading JS resources.
I don't think my setup is as rare as you make out.
It is clear that the current developments on the web are worrisome and we need real privacy. We need to be able to find a website and visit it completely anonymously, unless we actively submit information to said website or a court order is issued.
A cell phone tower or ISP node is ideally just infrastructure, "plumbing". Google seems to be trying to advance their strategic position in that direction. Rather than just being one search engine among several, they are trying to become part of the infrastructure. This could prevent future privacy solutions (and even prevent competition between search engines).
The real reason to make this spec is not to improve integrity or privacy or something else but to make users stay on Google's domain instead of going to other sites. Google wants to build its little walled garden, and this spec is needed to make users think that the walls aren't there.
> With this setup, documents can be prefetched while preserving user privacy and after the user clicks all browser behavior continues as normal from that point forward.
But Google can already preload and show cached version of the page without this spec. The only difference would be that address bar shows "google.com" instead of publisher's domain. There is no need for this specification.
I think you're missing the point of the GP. It says that you don't know and you don't care which particular server returns your content - is it a self hosted machine, is it a cloud machine, is it a CDN? No way of knowing unless you inspect the deeper stack. What you see very visibly is which BRAND (i.e. URL) returned your content.
So this AMP exchange technology changes nothing in this regard. It's like Google provides its own Free CDN, it is just not done in a traditional manner.
> It says that you don't know and you don't care which particular server returns your content
Which is plain wrong. I care.
When the URL-bar says I’m looking at company.com, I expect my browser to have used my OS’s DNS-resolver to look that name up, connect to the IP given and nothing else.
I certainly don’t expect it to send traffic to certainly-not-the-nsa.com which are MITMing my traffic and tracking/monitoring it.
If I can’t trust my browser’s URL-bar to exclusively and accurately reflect what is actually requested, it is effectively lying to me, the user, its owner.
And then suddenly all URLs are phishing URLs because Google made URLs no longer matter or mean anything.
My point is that even if you look at the URL bar currently and it says company.com, you don't know what you're connecting to. Probably you're connecting to CloudFlare/CloudFront/Akamai/Fastly/any other CDN which is set up with good-enough certs to impersonate the domain. Therefore you're not trusting a particular server, you're trusting a relationship that the domain owner built with her service providers.
The proposed scheme is just another way to extend this kind of relationship that the publisher builds, a new mechanism if you will. There is nothing in there that requires more or less trust from your part than before.
You're complaining that you need URLs to reflect what is requested - in fact, I argue that you want the URL to tell you what is being served. But this is not what's currently happening.
URLs are already lying to you.
I doubt that you WHOIS-lookup all DNS resolved-IPs to verify that the IP presenting a cert is assigned to the organisational entity that you want to connect to, and have a whitelist of those entities that you actually allow your browser to connect to. Because that's what's currently required to make sure you don't go through CDNs and other intermediaries between you and the publisher.
Using a CDN currently means the company uses trusted mechanisms like DNS to delegate certain traffic to other providers (like with Cloudflare). And it does so for everyone.
In which case the URL serves what was requested.
What AMP does is provide google.com content, lie to the user and say it comes from company.com.
Which isn’t true, and it only does so for users coming from google.com. Where I’m sure google will be happy for the additional tracking data.
This is NOT the url the user was led to believe he requested. This is not what everyone else is served.
> I don't want my content restricted or hosted in Google's cache.
how is this different than using your own domain, but pointing it to a github.io page? Or using medium, but with your own domain (but still being served from medium's servers)?
Is it just google you're averse to, or the entire idea of someone else hosting your content?
1) I want full control over my servers and to not be penalized in search engines for not hosting my sites on Google. Where are the server-side logs?
2) I want full control over how I publish my sites with real web standards. AMP is not a web standard, it's a Google format that they are strong-arming people into using.
3) Mozilla considers Signed HTTP Exchanges harmful. This technology is as bad as what Microsoft was doing with IE in the old days.
4) I don't publish on Github pages, but if I did, I would still have a choice over which servers I put the sites on.
5) There shouldn't be a single company (or few companies) that dictates how we publish online.
6) Shame on the people who are splitting the web with this fake-opensource technology. There's even a Google engineer over here referring to the Web like it's a Google product. https://news.ycombinator.com/item?id=19631136
As per point 6, I wouldn’t take what was said there as a statement from Google, or potentially even an employee of Google. They did it as a throwaway .. anybody wishing to kick the hornets nest could have posted that, employee or not.
It's not written like someone trying to kick a hornet's nest. It's written like someone who has been conditioned inside of a culture that has begun to view the Web as a Google product on some level.
However the last question is a fair point - nobody complains about CloudFlare's caching of your web page as you designed it.
The critique of AMP is that it receives privileged placement in search results, and that content authors are being pressured into adopting this de-facto Google-controlled spec, where they host your content and control its presentation. Anything that furthers AMP helps Google in this effort.
That's a good point! Domain owners can host their websites wherever they like, and yes that includes Google's cloud.
If they go through a content network like Cloudflare, you can't even tell who's hosting the site by looking at the IP address.
It drives home the point that websites are abstractions that have no necessary relationship to any particular physical hardware. Network tools may or may not tell you a bit more about the source, depending on if there are any leaks in the abstraction.
There is a difference between the web publisher controlling that abstraction and a web publisher that has been strong armed into one abstraction or another.
I didn't even know about "HTTP Exchanges", and I'm more interested than ~98% of the population about this kind of stuff.
Showing the name of the "signer" in the address bar, instead of the server where the content is actually hosted goes against decades of browser UI design.
> Showing the name of the "signer" in the address bar, instead of the server where the content is actually hosted goes against decades of browser UI design
Does it though? If you use Cloudflare or Akamai or Cloudfront or Netlify or etc. etc. then what shows up in the URL bar is not the server where the content is actually hosted. Well, it is the server where it is hosted, it's just one of the many domains hosted by that server.
That has never been different. Cloudflare & co are reverse proxies, for all intents and purposes from a user agent view, they are where the content is coming from. They are the ones pointed to in DNS, and they have valid SSL certs.
And how is this all that much different? In fact I would say it's more secure. DNS can be spoofed pretty easily. This is a cryptographically signed package. If anything, I'd have more faith in this changing my URL than a proxy via DNS.
Just because Google invented it doesn't make it bad.
> In fact I would say it's more secure. DNS can be spoofed pretty easily. This is a cryptographically signed package
How is it more secure? If, as you say, DNS can be spoofed easily - I can easily get a certificate issued with the required extension and make a "cryptographically signed package".
> If, as you say, DNS can be spoofed easily - I can easily get a certificate issued with the required extension and make a "cryptographically signed package".
Spoofing DNS to clients is much easier than spoofing DNS to certificate authorities. Otherwise domain-validated HTTPS certs wouldn't mean much.
But when there is a CDN there, "who I'm talking to" is really just an intermediary who pretends to be you, and may have in fact modified the content. With this, it is still an intermediary pretending to be you, but at least now the package is signed and can be verified.
The CDN is you, for all intents and purposes. It's your agent in the back and forth, as much as your hosting provider would be. A third-party cache isn't.
I don't mind that you can sign and verify content, that's fine and useful. I'm just not a fan of changing the address bar's meaning.
But what I'm saying is that the meaning that you ascribe to the address bar is incorrect -- it already only tells you who published the content, not who you are actually connected to.
What I'm saying is that this does not change the meaning of what's in the URL bar. It's the same as before. It tells you who published the content originally.
> it already only tells you who published the content
No, it tells you the origin of the document. If you are the creator, and you choose to put your content on server X it will tell you "I've got this from server X". Whether that server is a reverse proxy or a shared webhost or a dedicated server in a DC or a raspberry pi running on your desk doesn't matter - it's the designated original that you, the owner of example.org chose.
That's what it always meant, and it changes when you do a redirect, and it shows you the current URL even if there is a canonical header of http-equiv. I can put a reverse proxy on my host and proxy example.com to example.org - the address bar tells you that you're reading example.com, not example.org, as it should, because you're connected to me, not to example.org.
Do a traceroute on any domain and you'll see that the server isn't the one that gives you the answer, but some intermediary. Sure, in that case when you did the request, the content is fresh and the server answered RIGHT NOW, but that cache still gets the content from the server, it's just a bit older.
I've been using DDG for at least a year now. On some occasions I can't find what I need and end up checking Google, but in those cases, Google usually can't find what I need either.
Signed HTTP exchanges may be harmful, but Google is beginning to get enough dominance so they implement it and browsers with a minor market share must follow or are left behind.
The behavior for browsers without support is to show the google.com/amp URL as before, along with a small html-based bar with additional information about the original domain and share intents.
Does that mean that the Google+ button is coming back? Seriously? Why not just serve the content and leave it at that? Is the tiny bit of extra data you get from a unique "share on Facebook" URL worth it?
I didn't know the Web Share API existed, but based on the RFC, it looks like yet another Google-driven "standard." I still don't see why it needs to be added to the page.
> AMP doesn't load in a privacy sensitive way. It's on Google's servers
Only if you load the page from a Google SERP, in which case, Google would already know if you visit the page. If it's loaded from a Bing SERP, it's served from a Bing server, and the same for Baidu and other AMP caches. This is far more privacy preserving than preloading a page from some third party web server that the user might never visit.
It's solely the search engine boost you get from AMP that bothers me. Because of the money involved many sites have no choice but to implement AMP and stay competitive. If it weren't for that, it would just be another technology and the fact that it only works in few circumstances would probably make many sites not bother with it. The UX downsides would probably see many sites actively avoid it.
The fact that it has seen such adoption is testament to Google's ability to influence with its rankings alone.
Have you noticed ranking improvements? We've done AMP on some sites and not others and we saw no difference in ranking.
Sure, having a very quickly opened page is nice, but on the other hand, features are limited. That might or might not work well, depending on what kind of content you have, what engagement you're looking for.
On iOS it continues to be very broken, although the difference in scrolling "inertia" was resolved by Apple.
AMP introduces a very non-Appley top bar within the browser, adds new swipe semantics that can be confusing, breaks "tap status bar to scroll to top" behaviour, breaks reader mode (although this is inconsistent), and generally looks out of place. The best way to describe it is like a GTK or KDE app running in macOS. It's clearly not a "native" experience and doesn't really look or act like any other webpage in mobile Safari.
Most people do not like to be forced to do things a certain way. Especially if they have a working site already and now have to remake it from the ground up just because some other actor decided it isn't good enough to get visitors.
Just you wait until you notice you can't go to town in your car any more. Only Teslas are allowed into the city.
That's a bad example for me personally since I think all cars should be banned (except maybe electric cars but I haven't done the necessary research to see the actual environmental impact). I haven't used my driver's license in years and always take the train (I've also stopped flying).
But anyway, that's off topic. I understand that it's a pain for developers but for users like me who are often on a bad connection it's a life saver.
Yeah, I think this feature and the signed exchanges standard both sound great. It allows CDN-like servers to host content without having to be trusted to not modify the content. That sounds like an improvement over the current CDN situation.
Also, sites that link to other sites can preload the linked site's content into the user's browser, without leaking the user's IP to the linked site, so if the user doesn't follow the link, nothing about the user is revealed to the linked site. That sounds like a performance and privacy improvement wrapped up into one. I'm finding the rest of this discussion thread extremely disappointing as it seems like most of the posts here are just "amp=bad and amp people like this so it's also bad".
Signed Exchanges could also remove some of the concerns around JavaScript crypto, because there is a (potentially offline) key that signs the web app you're running, so you're not vulnerable to hacks of the hosting environment itself.
What's really needed is a way for the browser to lock a given web app/package to a specific version (and hash), so that even if the signing key becomes compromised, the app can't auto-update to a newer version containing malicious code.
Combining this with something like Certificate/Binary Transparency would allow browsers to check that they are not being uniquely targeted with a specially altered version, and you could set a policy saying "Only auto-update to a newer version of this web app if its hash has been published in a log for more than a month (and/or endorsed by signatures from N out of M other organisations I trust)".
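A rough sketch of what such an update policy could look like, purely as illustration. Every type and function name here is hypothetical; no browser exposes an API like this today, and verifying the endorsers' signatures is assumed to have happened elsewhere. The `requireBoth` flag covers the "and/or" in the policy above.

```typescript
// Hypothetical sketch: none of these types or names come from a real browser API.
const MONTH_MS = 30 * 24 * 60 * 60 * 1000;

interface LogEntry {
  hash: string;          // hash of a published app version
  publishedAtMs: number; // when that hash first appeared in the transparency log
}

interface UpdatePolicy {
  minLogAgeMs: number;        // how long a hash must have been public
  trustedEndorsers: string[]; // the "M" organisations the user trusts
  minEndorsements: number;    // the "N" in "N out of M"
  requireBoth: boolean;       // true = "and", false = "or"
}

interface CandidateUpdate {
  hash: string;
  endorsedBy: string[]; // endorsers whose signatures are assumed already verified
}

function countTrustedEndorsements(candidate: CandidateUpdate, policy: UpdatePolicy): number {
  const seen: string[] = [];
  for (const name of candidate.endorsedBy) {
    // Count each trusted endorser at most once.
    if (policy.trustedEndorsers.indexOf(name) !== -1 && seen.indexOf(name) === -1) {
      seen.push(name);
    }
  }
  return seen.length;
}

function mayAutoUpdate(
  candidate: CandidateUpdate,
  log: LogEntry[],
  policy: UpdatePolicy,
  nowMs: number
): boolean {
  // Uses the first log entry matching the candidate hash, if any.
  const entry = log.filter((e) => e.hash === candidate.hash)[0];
  const agedInLog = entry !== undefined && nowMs - entry.publishedAtMs >= policy.minLogAgeMs;
  const endorsed = countTrustedEndorsements(candidate, policy) >= policy.minEndorsements;
  return policy.requireBoth ? agedInLog && endorsed : agedInLog || endorsed;
}
```

With `minLogAgeMs` set to `MONTH_MS`, a freshly published hash is rejected unless enough trusted endorsers have signed it, which is the property that makes a stolen signing key less immediately useful.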
Semi-related, I think Web Packages and Signed Exchanges could have some usefulness outside of Google's caches. One of their spec examples was for verifiable web page archives.
Another idea: it could be used for a wifi "drop box" (drop station?) where there's no internet connection around. That isn't uncommon at some popular spots upriver in the woods in the US.
The idea is that as people enter the area, they can update the drop station automatically for things like news or public posts with whatever they've cached recently.
I'm pretty sure I read about this idea before the spec was drafted but I couldn't find or remember the site, something like vehicle-transported data.
Thanks. IIRC the site I saw was from a few years ago, before the spec was drafted. (I updated my post to be more clear). Pretty sure there were a few photographs on the page out in the flat grasslands.
In general, this sounds like an interesting use case.
One thing to note is that the specification currently limits the lifetime of a signed exchange to 7 days. It's possible that by exploring some of these use cases, especially offline, the spec could be improved with respect to some of these constraints.
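To make the 7-day constraint concrete, here is a minimal sketch of the kind of validity-window check a client might run. The field names mirror the `date` and `expires` signature parameters; the type names, the status strings, and the exact boundary handling are my own assumptions, not text from the spec.

```typescript
// Assumption-laden sketch of a signed exchange validity-window check.
const MAX_SXG_LIFETIME_S = 7 * 24 * 60 * 60; // 7 days, in seconds

interface SignatureWindow {
  date: number;    // Unix seconds when the signature becomes valid
  expires: number; // Unix seconds when the signature stops being valid
}

type WindowStatus = "ok" | "not-yet-valid" | "expired" | "lifetime-too-long";

function checkWindow(sig: SignatureWindow, nowS: number): WindowStatus {
  // A signature claiming more than 7 days of validity is rejected outright.
  if (sig.expires - sig.date > MAX_SXG_LIFETIME_S) return "lifetime-too-long";
  if (nowS < sig.date) return "not-yet-valid";
  if (nowS >= sig.expires) return "expired";
  return "ok";
}
```

Under this check, a package carried into the woods and read a month later comes back as `"expired"`, which is why the offline use cases would need the constraint relaxed or a different trust model.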
Unfortunately, signed packages won't work for archival or any significant offline use. The signed exchanges are forced to be short lived (in days) to limit the damage that can be done when someone steals a TLS private key.
It's a very narrow spec designed just for AMP, basically.
AMP pages take forever to load. I hate seeing a white screen for more than a second, and with AMP that's all I see on mobile: a white screen burning my eyeballs, seemingly forever. And then I snap out of it and hit back to escape from AMP's empty void.
I read in probably another HN thread that using an adblocker will actually slow down an AMP page - there's a hardcoded 3 second CSS delay which is cancelled if (via JS) it's detected the page is loaded.
I would _pay_ google to be able to disable AMP permanently on mobile web results. The experience is the absolute worst. I'm fine with them wanting to ruin mobile web (that's their choice), but PLEASE let the users be able to disable this terrible "feature."
After reading all of the comments here, this seems like a good thing.
This fixes the main UI issues with how AMP is currently used in google search - mainly the url not properly showing where the content is.
If signed exchanges are treated the same as AMP pages in google search, i.e. SXG content is preloaded whether or not it's AMP, that would get rid of the second complaint about AMP: that google's preloading of the content creates an unfair playing field and that the only reason it's fast is because it's preloaded.
If SXG is treated the same as AMP in the carousel then that would fix the last and most serious complaint about AMP.
As far as I can tell, google does seem to be moving in that direction, so this should be applauded, not derided (the original fiasco that is AMP notwithstanding).
The headline is an outright lie: these AMP pages are loaded from Google and not your domain.
The new feature is that Google's browser displays your domain, obscuring the fact that Google is doing the serving. The change is what is displayed, not the server.
When I had a website with embedded videos from other sites, users contacted me whenever those other sites had problems. They couldn't tell the difference between megavideo/youtube/dailymotion content and my site, so they came to me and blamed me.
So what this means is that not only does Google bully you into putting your traffic under their control, but now any problem on their part will be blamed on you by the user.
> So what this means is that not only does Google bully you into putting your traffic under their control, but now any problem on their part will be blamed on you by the user.
I hadn't even considered that. Add to this Google's notoriously absent customer support department and you have a recipe for a lot of frustration.
next month they'll also style it like your browser's native address bar for a better user experience and introduce a w3c standard API for hiding the real address bar. /s?
They have been trying to make it so that users can't tell if they are on real webpages or AMP pages, and it looks like they finally implemented it. AMP is about Google, tracking, and ads, not page speed, even if they have convinced many of their engineers that it's about page speed.
So, when will Google roll out signed exchanges for plain HTML content? That's much more interesting, and if combined with e.g. a restriction such as a Lighthouse speed score of > 60, it'd be in all measurable ways better than AMP.
Faster than AMP, more open than AMP, and all the benefits of AMP.
Does nothing for publishers' needs for deeper control and analytics. Just a "feel good" gesture that results in additional complexity for everyone involved. Google is not the only company in the world that knows how to load a page efficiently.
The publisher's cookie-based analytics will operate on the origin in the URL bar in this case. The document (though not the delivery server) will have access to publisher origin cookies.
Conceptually, you can think of a signed exchange as a 301 redirect to a new URL which has already been cached by the browser (so there is no 2nd network event). The cache was populated by the contents of the signed exchange, assuming the signature validates.
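The "redirect into a pre-populated cache" idea can be modelled in a few lines. This is only a toy model of the description above, not how Chrome implements it: the type names are invented, signature validation is reduced to a boolean flag, and the fallback behaviour for an invalid signature (fetch the publisher URL normally) is my assumption.

```typescript
// Toy model of "301 redirect to a URL that is already cached". Hypothetical names throughout.
interface SignedExchange {
  requestUrl: string;      // publisher URL the exchange claims to represent
  body: string;            // the signed response content
  signatureValid: boolean; // stand-in for real signature and certificate checks
}

interface NavigationResult {
  urlBarShows: string;        // what the user sees in the address bar
  body: string | null;        // null means the content still has to be fetched
  needsNetworkFetch: boolean; // whether a second network request is required
}

function navigateViaExchange(
  sxg: SignedExchange,
  cache: Record<string, string>
): NavigationResult {
  // Only a validated exchange is allowed to populate the cache entry
  // for the publisher's URL.
  if (sxg.signatureValid) {
    cache[sxg.requestUrl] = sxg.body;
  }
  // The "redirect": navigation continues at the publisher URL, served
  // from cache if an entry exists.
  const hit = Object.prototype.hasOwnProperty.call(cache, sxg.requestUrl);
  return {
    urlBarShows: sxg.requestUrl,
    body: hit ? cache[sxg.requestUrl] : null,
    needsNetworkFetch: !hit,
  };
}
```

The point of the model is that the delivering server never appears in the result: the document is treated as belonging to `requestUrl`, which is why publisher-origin cookies are available to it.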
"Your website has been banned for illegal activity and all content has been deleted. The suspension is immediate and indefinite. Please consider using another Internet"
Also, their algorithms mistook a live stream of the Notre Dame fire for the 9/11 incident. How can a live stream be a past incident?
You are confusing the term “live stream” to mean something actually happening at that moment in the real world. All a “live” stream actually is to Youtube or FB or any other streaming service is just incoming RTMP packets. Youtube matches incoming streams against their ContentID database just as they do for normal uploads. I would bet that the same thing would have happened if the same footage were uploaded normally after the fact. People will use unlisted streams to broadcast pirated content otherwise. I faced a similarly false claim once where me playing Super Mario World on a real life SNES was falsely matched to some major label song, and since it was my third strike (all false) my account got banned from being able to use Youtube Live entirely. In the end I’m kind of glad that happened because it led to me developing my own minimal self-hosted stream site for small friends-only streams, including things Youtube would give legit claims for. My tiny VPS wouldn’t stand up to thousands or probably even dozens of simultaneous viewers like Youtube or the other big names can, but that doesn’t matter at all for me.
"YouTube Live is an easy way to reach your audience in real time. Whether you're streaming a video game, hosting a live Q&A, or teaching a class, our tools will help you manage your stream and interact with viewers in real time."
Youtube live is supposed to be streaming of live events. My point is that their algorithms incorrectly decided a live stream was a past event. Even google/youtube acknowledged it was incorrect to tag a live event.
The experience on iOS remains profoundly buggy. The URL bar doesn't hide properly, scroll-to-top doesn't work, rotation is busted, text selection is wonky, reader mode is disabled. How I wish I could disable this monstrosity.
I partly agree with this. If you run a WordPress blog, then EasyEngine [1] and OpenLiteSpeed [2] can really boost your site performance.
The performance will be greatly affected if you run some cancerous theme with endless JavaScript calls. But both of the mentioned "engines" have changed the way I see blogging with WordPress.
Best of all, this is accessible to your average user as well. DigitalOcean can spin you up an OLS instance in a minute or so...
No it isn't. It's designed to make the user experience better on sites that frequently host Google ads (and also often contain a ton of bloat, 3rd party js, poorly constructed DOMs, awful CSS, etc).
The only way Google could proactively "solve" this problem was by creating a "standard", and then also offering to absorb end user traffic for sites that adopted the standard. FWIW, AMP is an open standard not solely owned or contributed to by Google.
7. Looking at the last few merged PRs nearly everyone involved is a Google employee. I realize this could be a coincidence but I'm not going to analyze the whole repo.
8. The TSC is 3/7 Google employees.
Regardless, until Google issues a legally binding release of the project to an independent organization it is owned by Google. The TSC and AC could be removed at Google's whim.
Why is that the only way? Seems like they could easily have achieved the same result by significantly penalizing sites based on load time and number of external requests.
A fast load time when a page is indexed does not guarantee a fast load time when it is served up to the actual viewer. Serving the page from cache is the only way to guarantee that the page will still be fast when the user wants to view it.
Because users want relevant search results much more than fast websites. Google already factors in a website's performance in their rankings, but weighing it too much over content relevance will make search results worse.
If they actually cared that much about making the results "relevant", they wouldn't mix a bunch of irrelevant suggestions into the results page, each marked with "missing: <query_term>" pointing out exactly how they ignored part of the user's request.
By that logic then, what's the point of AMP if Google is saying page load speeds aren't really that big of a factor? Why go through the trouble of deriving a whole new subset of HTML?
> Because users want relevant search results much more than fast websites. Google already factors in a website's performance in their rankings, but weighing it too much over content relevance will make search results worse.
What's the difference between influencing positions and visibility based on AMP support vs overall page performance?
If visibility is influenced by AMP then Google benefits, users using Google services likely benefit, web developers suffer, users not using Google services to view the content continue to suffer (because companies will continue to maintain two versions of the website: a bloated version with 100 external tracking requests that will be shared on twitter/reddit/facebook/hn/etc, and an AMP version that will only appear on Google's services), and the internet as a whole suffers. Whereas if visibility is influenced by page speed+external requests then everyone would benefit.
- AMP is a transparent and unambiguous standard that leaves no uncertainty as to whether you are somehow "performant enough" to qualify for the simple but limited visibility boost (referring to the news carousel)
- AMP prevents important usability problems beyond performance, like page content jumping
- AMP can enable advanced/extreme performance optimizations by default that are somewhat rare in practice (eg. only loading images above the fold), or aren't really possible to do safely/properly without a spec like AMP (eg. preloading content before the user clicks the link without unpredictably disrupting the website's servers), or are sometimes avoided due to cost (eg. fast global caching with Google's impressive CDN). Important for users in the developing world.
Addressing your other points:
- Users who don't use Google services don't suffer. AMP is not Google-exclusive, all the major search engines (like Bing, Yahoo, Yandex) are stakeholders in the AMP standard and are free to support AMP. AFAIK there is nothing in the AMP standard that favors Google over other search engines or any other platform that might support AMP.
- Not sure how web developers suffer more from AMP. I'd think web developers would suffer more from trying to wrangle their bloated website performance independently rather than use a standard toolkit that enforces best practices and enables difficult/expensive optimizations out of the box.
- It's not clear to me how the internet as a whole will suffer, but I suspect this is just general hyperbole and not a specific point.
as long as google is using their grip on the web to drag crappy websites kicking and screaming into having acceptable load times, i'm okay with it.
yeah, there's a few sites out there that are faster than amp. but most of them are not, and before amp the trend was certainly not to make anything lighter or faster.
AMP pages loaded through Google search with hot cache load slower than some of the websites I've developed when loaded with cold cache.
It's absurdly slow, uses tons of unnecessary JS, and it is a privacy nightmare because now I can't just use server-side GDPR and ePrivacy guideline compliant analytics anymore, but either have to give up analytics entirely, or have to use privacy-obliterating Google Analytics.
And if a user ever loads the page with JS disabled (which all my sites are designed to support), AMP breaks and just shows nothing at all for over 8 seconds.
> AMP pages loaded through Google search with hot cache load slower than some of the websites I've developed when loaded with cold cache.
Basically this. AMP sets a hard upper bound for how fast your webpage can be. Have a purely static HTML+CSS blog but want to get the page rank boost from AMP? Just add reams of unnecessary Google Javascript to what should be a very simple site.
> AMP pages loaded through Google search with hot cache load slower than some of the websites I've developed when loaded with cold cache.
On a mobile device in India? Nonsense. Your page load time is dominated by latency, which the AMP user doesn't see because it is preloaded from near caches.
> uses tons of unnecessary JS,
Which of the JS is unnecessary? The JS to load images allows AMP not to preload images below the fold, which is absolutely necessary for speed and for being friendly to data plans.
> now I can't just use server-side GDPR and ePrivacy guideline compliant analytics anymore
Explain. You still get first party tracking that gets fired when the user clicks to your page and can get user consent via data-consent-notification-id.
> And if a user ever loads the page with JS disabled
In that case, it's the SERP's fault for showing the AMP page instead of the non-AMP page. In the normal JavaScript-enabled scenario, the SERP would be stupid to show your non-AMP page.
> On a mobile device in India? Nonsense. Your page load time is dominated by latency, which the AMP user doesn't see because it is preloaded from near caches.
My test device is a Huawei Ideos X3 on a 56kbit/s throttled 3G connection. The same effect also applies with a Pixel 1 on the same connection, or either of the devices on a modern 3.9G LTE connection. (Tested on the O2 network in Germany; it works reliably better than AMP even, and especially, while on a train. If you've ever tried using O2 on the intercity train between Hamburg and Münster, you know that every third world country has better internet than that; I've seen 8kbps with 13 seconds latency there.)
> Which of the JS is unnecessary? The JS to load images allows AMP not to preload images below the fold, which is absolutely necessary for speed and for being friendly to data plans.
AMP uses megabytes of JS for that purpose, I do the same in under 1kiB (even including an intersection observer polyfill). And my CSS is much much smaller as well. Part of why I get a 100/100 in all pagespeed and lighthouse tests, including when simulating mobile connections, while AMP pages get only 60/100.
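For readers wondering how small "only load images near the fold" can be, here is a minimal sketch of the core check. It is not the commenter's code. In a real page you would normally let `IntersectionObserver` drive this; the sketch uses plain objects as stand-ins for `<img>` elements (the `LazyImage` shape and `loadNearViewport` name are made up) so that it runs outside a browser.

```typescript
// Plain-object stand-in for a DOM image; positions are relative to the viewport top,
// in the spirit of getBoundingClientRect().
interface LazyImage {
  top: number;        // px from the top of the viewport
  bottom: number;     // px from the top of the viewport
  dataSrc: string;    // real image URL, held back until needed
  src: string | null; // null until loading has been triggered
}

// Starts loading every not-yet-loaded image within `marginPx` of the viewport.
// Returns how many loads were started by this call.
function loadNearViewport(images: LazyImage[], viewportHeight: number, marginPx: number): number {
  let started = 0;
  for (const img of images) {
    const near = img.bottom >= -marginPx && img.top <= viewportHeight + marginPx;
    if (img.src === null && near) {
      img.src = img.dataSrc; // in a browser, assigning src is what triggers the download
      started++;
    }
  }
  return started;
}
```

Calling this once on page load and again on scroll (with updated positions) gives the basic below-the-fold behaviour both sides of this argument are talking about; the disagreement is about how much JavaScript it should take to ship it.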
> Explain. You still get first party tracking that gets fired when the user clicks to your page and can get user consent via data-consent-notification-id.
I want JS-free analytics that do not require tracking or any consent (GDPR allows collecting some information without consent, same with the yet unreleased ePrivacy directive with which AMP is not compliant anyway).
What? Where are you pulling these numbers? Also, what do you mean by hot cache? I'm starting to suspect that you don't even understand that the AMP page (the JavaScript for sure, and often the entire HTML and above-the-fold images as well) is already on the user's device, while your page is not.
If the AMP version takes longer to load than the time between the search results loading and the user clicking on it, then the AMP version will still have a visible load time.
Obviously, this part is affected by the AMP js being in cache or not.
Still, often my own page can load faster than just this user-visible part of loading the AMP version.
AMP works best when the user visits almost only AMP pages (so the resources stay in cache), and the user has a high-latency high-bandwidth connection.
But that's almost nowhere in the world true, in reality most people have relatively low latency with low bandwidth.
Your claims disagree with the facts on the ground, where latency is the main factor affecting page load time. This is the driving force behind CDNs, HTTP2, QUIC, and pretty much every speed optimization that people have been working on in the past few years. https://www.afasterweb.com/2015/05/17/the-latency-effect/
Your claim that your page loads faster also reeks of wishful thinking. Pretty much every AMP page I have loaded from a SERP loads instantly, not just fast. For someone on a worse connection, the page will have started loading before the user clicks on the link from near caches versus have not started loading at all from a far server. In the rare case where the AMP JS is not in the browser cache, it will be after loading the first result.
As mentioned, I've done testing on actual devices on actual high-latency low-bandwidth connections, hundreds of times. That's the "facts on the ground".
If you say pretty much every AMP page you've loaded has been instant, please post the specs of the devices and network you've been using for testing.
Additionally, if the latency between the device and the nearest server is over two seconds, the latency to a far server as well as the click latency don't even come into play anymore at all, instead the number of connections needed becomes much more important, and bandwidth also becomes a much larger factor.
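Since both sides are arguing about latency versus bandwidth, a back-of-the-envelope model helps show when each term dominates. This is a deliberately crude sketch with invented names: it assumes connections are set up one after another, ignores packet loss, TCP slow start, and parallelism, and treats the round-trip counts as inputs you have to supply yourself.

```typescript
// Crude load-time model: connection setup + request round trips + transfer time.
interface LinkProfile {
  latencyS: number;          // one round trip, in seconds
  bandwidthBitsPerS: number; // usable throughput
}

interface PageProfile {
  newConnections: number;          // connections that must be opened
  roundTripsPerConnection: number; // handshake round trips per connection (assumed)
  requestRoundTrips: number;       // sequential request/response round trips
  bytes: number;                   // total bytes transferred
}

function estimateLoadSeconds(link: LinkProfile, page: PageProfile): number {
  const setup = page.newConnections * page.roundTripsPerConnection * link.latencyS;
  const requests = page.requestRoundTrips * link.latencyS;
  const transfer = (page.bytes * 8) / link.bandwidthBitsPerS;
  return setup + requests + transfer;
}
```

Plugging in the figures mentioned in this thread (a 56 kbit/s link, 2 s of latency) shows that each extra connection and each extra few kilobytes costs whole seconds, so neither term can be waved away on a connection like that.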
Your claim that HTTP/2 would have worked towards better latency on lower connections is also false, on bad mobile connections HTTP/2 actually increases latency, which was a major reason for QUIC aka HTTP/3 in the first place.
> As mentioned, I've done testing on actual devices on actual high-latency low-bandwidth connections, hundreds of times.
And as I've mentioned, you've been testing the wrong thing by not understanding the whole point of AMP (safe preloading).
> instead the number of connections needed becomes much more important
A page preloaded from an AMP cache needs at most one TCP connection, usually zero if it uses QUIC.
> and bandwidth also becomes a much larger factor.
Which also works in AMP's favor because the device doesn't need to load your custom JavaScript or potentially unoptimized images, just the tiny HTML and optimized images above the fold. The weight of this (and the associated gain) is tiny, which is why bandwidth is a relatively unimportant factor.
> On bad mobile connections HTTP/2 actually increases latency
You're mixing up dropped packets with high latency. That's neither here nor there because Google's and Cloudflare's AMP caches both use QUIC — my point was that latency is the key factor that all modern web speed technology has attacked, including AMP.
With the peerweb.com platform I will be providing a free-for-non-commercial-use polyfill for Signed HTTP Exchanges which can be used for distributed p2p content offloading.
Peerweb helps sites automatically offload all resources (including streaming ugc video) to a decentralized p2p network.
This feels like it solves the biggest user complaint with AMP, which was the ugly URLs and having to click an extra time to get the "real URL" for sharing.
It also at least helps slightly address one of the complaints of publishers, which is that cookies and some analytics will work now.
But it still doesn't address the biggest complaints of publishers.
I'm guessing Google cares a lot more about the user experience than the publisher experience, since users make up most of the traffic and all of the ad consumption, so this is certainly good for them!
This is true with TLS as well, though it also requires man-in-the-middling (MITM) the connection. MITM is usually rather easy compared to stealing a private key.
Yes but it is much harder to MITM if the users are in different parts of the world.
Someone can buy a lookalike domain name using similar-looking Unicode characters, send out a bunch of email spam with a URI that looks like the original, and once the user visits the webpage it instantly loads the AMP and suddenly the URL is authentic. There will only be a very quick URL change from the punycode URL to the original that I doubt many will notice.
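For what it's worth, spotting the lookalike half of that attack is mechanically simple. A minimal sketch, assuming you only want to flag hostname labels that are punycode-encoded or contain raw non-ASCII characters. The function name is made up, and this flags every internationalized domain, including perfectly legitimate ones, so it is a warning heuristic rather than a verdict.

```typescript
// Returns the labels of a hostname that are internationalized in some form.
function suspiciousLabels(hostname: string): string[] {
  return hostname
    .toLowerCase()
    .split(".")
    .filter(function (label) {
      // "xn--" marks a punycode-encoded (IDN) label.
      if (label.indexOf("xn--") === 0) return true;
      // Any raw non-ASCII character, e.g. a Cyrillic letter that looks Latin.
      for (let i = 0; i < label.length; i++) {
        if (label.charCodeAt(i) > 127) return true;
      }
      return false;
    });
}
```

A mail client or link preview could use something like `suspiciousLabels(host).length > 0` to show the raw punycode form before the user clicks, which is the moment the attack described above depends on nobody looking closely.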
If you're going to break the way the web works, you might as well break it hard. At least that seems to be Google's philosophy with this unasked-for inserting itself between people and publishers.
Who is going to be in charge of the signing process? Can I sign the content myself? Will Google eventually only sign content it deems appropriate? Can we trust Google to always do the right thing?
Good question. The signing is done by the publisher, using the same digital signature infrastructure that is used for TLS (https). So, the publisher alone has the signing key, and any browser can verify the signature by comparing to the public certificate, signed by a certificate authority.
I've stopped using google because of AMP. Only google can sabotage google at this point, and I think this is how it will happen if they keep going this way.
Here's the real question: how is this implemented in Chrome? Is a signed exchange website that is clicked on the Google search page, and whose URL is then switched, still running the Google search page's code?
If this is so, and it only works for Google search's page and is not some generic web strategy, this is a massive breach of browser/web-content separation, bigger than even the auto-sign-in to Google from Chrome.
Conceptually, you can think of a signed exchange as a 301 redirect to a new URL which has already been cached by the browser (so there is no 2nd network event). The cache was populated by the contents of the signed exchange, assuming the signature validates.
There is no "reaching into the document" from the previous click or anything weird like that.
I just wanted to say a quick thank you for going out of your way to answer to many questions here. Despite the obviously hostile environment (I have some reservations about AMP myself), your answers have been very clear, informative and level-headed, so thanks for that.
There is just something about AMP that reeks of soylent drinking hipsters. Everything from requiring devs to use some arbitrary CDN and tons of obfuscated scripts for the most mundane things (like forms?) to the emoji in the <html> tag... Eh, no thanks. I get that exchangeable signed website packages might be a good idea for various purposes. But no thanks, I don't measure my life in seconds.
Can any Google engineers comment on the large amount of negativity towards AMP found in these HN comments (and pretty much every other HN forum discussing AMP)? I can't imagine working on a project and seeing so many people (especially given the average education level on HN) who can't stand it. I want to hear from you guys about this!
Does AMP run on Firefox? I've been using Firefox since the Quantum launch and haven't seen any AMP lately. I had completely forgotten this nightmare once existed.
Wtf? I guess it's time to blacklist Google and their web of worthlessness. I mean, using Google Search is literally "give me advertisement in disguise" these days, and I don't understand what people expect from using it. An effective remedy also seems to be blocking all scripts; it serves as a useful filter for actual content rather than ad-infested clickbait, and the script-laden pages you can't view aren't worth your time and bandwidth anyway.