New Chrome Beta release renders web pages as you type, screens downloads

JonnieCache · on Jan 6, 2012

Blogspam.

Here's the actual announcement: http://chrome.blogspot.com/2012/01/speed-and-security.html

drats · on Jan 6, 2012

We really need a way of telling when the main link has been modified. The only reason I didn't downvote JonnieCache is because I know his name, because he is a regular contributor, and so I double checked that the url and his comment were the same. Then when I saw they were the same I trusted his comment more than the moderators actions: i.e. I suspect this story link used to go elsewhere.

HN needs tons more transparency on moderation for simple things like this, as well as some of the dodgy issues I've seen in the past. I know a lot of trolls are "rules lawyers", but frankly when I notice changes being done by a select elite for my own good with no accountability or properly outlined rules it leaves a bad taste in my mouth. Especially when people complain of opinionated moderation and no evidence remains of it for me to make a determination myself as to who was in the right (I suspect the moderators are right in most cases, but if it's not there it's not there).

I'm taking a break from HN I think.

powertower · on Jan 6, 2012

This is really going to fuck up your log analysis days...

Seeing traffic that never materialized.

Headers aren't part of any standard log format.

Every web-server configuration and every log analysis script will need to be modified, unless Chrome...

1. Adds a GET variable to each URL to signify this is a preview pull (ex: GET http://url/?ChromePreview=Background).

2. Hits the URL again in some way (ex: HEAD http://url/?ChromePreview=View) to signify that this is now a view.

(edit: adding GET vars is a bad idea as outlined in comments)

Which will solve only half the problem, you'll still need to update your analysis scripts in a non trival way.

Headers won't work here well as these are not logged, and the only thing you could do with them is block the request, or have your Apache, IIS, Node.js, etc add non-standard entries into the log file, which creates more problems.

(edit: headers are about the only way for this to work as outlined in comments)

Not to mention the extra traffic on the web could be doubled.

zootm · on Jan 6, 2012

Headers are a perfectly reasonable way to signal this, and they're sending them. Adding GET parameters changes the URL fetched even though the browser still wants the same resource -- that would be wrong.

The fact that there are servers which don't allow users to do the sort of log analysis they need to do is not the fault of Google. Headers are the correct place to put the information.

masklinn · on Jan 6, 2012

> Adding GET parameters changes the URL fetched even though the browser still wants the same resource -- that would be wrong.

Not only that, but depending on the way the framework works it could break the page itself. I believe CherryPy's mapping for instance provides GET variables as kwargs to the handler, unless the developer added catch-all handlers it's going to blow up the routing and yield either a 404 or a 500.

powertower · on Jan 6, 2012

1. Modify every Apache + IIS + Nginx + other configuration with non-standard log formats.

2. Modify every analysis software/script.

Simple and easy!

zootm · on Jan 6, 2012

The analysis software already needed to be modified. And if your Apache isn't already configured to try and find this sort of thing, you're also counting things like page prefetches as true views.

The header is the correct place to put information like this, regardless of whether the deficiencies of the server software.

powertower · on Jan 6, 2012

> And if your Apache isn't already configured to try and find this sort of thing, you're also counting things like page prefetches as true views.

Care to elaborate on how your Apache configuration is already set up for this?

This is certainly not default/standard behavior.

I have no doubt that this can be done by detecting a header for the prefetch (if there is one!) and using Apache's syntax to mod the log, AND then accounting for this in your analytics scripts ... BUT what I doubt is that this will be a simple task to push to the other 100 million server and analysis instances that I don't control.

Not to mention some of those instances will not be like the rest, and the work required to handled this will be great.

zootm · on Jan 6, 2012

I don't disagree that it's difficult for those running servers who want to filter this sort of thing out, given the state of the current tools. The fact remains that it's the right way to do this sort of thing, however.

As for me, I happily do not run any public websites other than my personal ones, and in that context I don't much care about analytics, so I'm afraid that I can't show you a good way to do this in Apache. For what it's worth, thought, I think the header for prefetch is "X-Purpose: prefetch" in Safari and Chrome, and "X-Moz: prefetch" (yes, really) for Firefox.

icebraining · on Jan 6, 2012

Can't Apache, etc be configured to rewrite the URL and add the GET var itself if the admin wants to?

Alternatively, you can just block the preview if you really need your clean logs. Just add a rule to reply with 403 to any request with their header.

Too · on Jan 6, 2012

Google did something similar themselves several years ago as an add-in to internet explorer. It was called web-accelerator or something like that. (correct me if i'm wrong, it might have been for firefox or even chrome itself).

It prefetched links you were likely to click on a website but it had to be abandoned because some GET-links also issued actions, such as removing blog-posts, on way to many web-pages causing havoc.

I fear this will have the same problem.

asm89 · on Jan 6, 2012

I believe pre-fetching of links is already more common in browsers these days, so I don't think that this feature will introduce these issues again.

Note that this new feature also _renders_ the page in the background.

taf2 · on Jan 6, 2012

Prefetching what it thinks you are going to type in the URL bar is very different from prefetching links - that's the difference this time if I understand correctly ..

conipto · on Jan 6, 2012

One thing that bothered me about instant, and will probably annoy me about this, was when using chrome to test REST api's. Hitting my local development server in debug mode with half constructed URL's used to drive me crazy. In general for surfing around - I think the feature is great, but I would love to be able to exclude a given site from pre-fetch/instant.

elisee · on Jan 6, 2012

You could setup another profile and disable it in the options (or just disable it in the options while developing / testing and enable it back afterwards)

http://support.google.com/chrome/bin/answer.py?hl=en&ans... has instructions on how to disable it.

playhard · on Jan 6, 2012

we actually launched an extension for chrome which did the same thing in october 2010 when google instant search launched. it was a one night thing. i was in college. the adoption was not good.. so i kinda ignored the project. we actually did it with the omnibox api but it was an experimental api at that time.so i released it with an button ,which you can click and type to load pages instantly..

here is the link, https://chrome.google.com/webstore/detail/nipkbmplhlokenofof...

it does not work anymore coz of an api problem. u can see the working video. never bothered to fix it. just wanted to say that i did that before Google :)

bambax · on Jan 6, 2012

I understand that Chrome will now start to fetch and display pages before you finish typing the url.

Google Instant is bad, but at least only Google supports the increased load on its servers.

Now Chrome is trying to build "Web Instant", which everyone will have to support.

zoul · on Jan 6, 2012

"If the URL auto-completes to a site you’re very likely to visit, Chrome will begin to prerender the page," so I guess the interesting question is how the "likeliness" algorithm looks like.

jfoster · on Jan 6, 2012

The redundant pageloads provide a better experience to some of your users. Doesn't that make it worth it?

dangrossman · on Jan 6, 2012

Not when it screws things up.

I have a REST API for placing automated phone calls, for example, which I regularly test using my browser. Chrome is going to start calling people when I've half way finished typing a URL because it thinks it knows what I'm going to type. That's not only not what I want, but costs real money.

Or what if I'm an administrator for some user-generated content site. I have a webpage that shows me a list of user accounts and lets me click a "delete all this user's content" link to fix someone's mistake by request. Now in the process of typing the URL to get to this admin area, Chrome thinks I'm trying to go to the "delete all" page and accesses it in the background...

This is a dangerous feature.

maggit · on Jan 6, 2012

It would appear that you actually do not have a REST API. GET requests may not cause side effects in a REST API. This is an important feature of REST.

On the other hand, there are probably many HTTP based APIs that cause harmful side effects on GET, so the danger is still lurking.

dangrossman · on Jan 6, 2012

It was an example; these APIs clearly exist in the wild, and side effects are not the only danger of prefetching API URLs. What if a URL that is meant to be accessed with a GET request that kicks off an expensive, long-running process is prefetched?

There's a front page story right now about Google AppEngine, where the linked discussion mentions mashing a link to start a MapReduce job every day that's costing $200-300 a day to run. What if Chrome autocompletes the URL to that job, costing you $300 before you finish typing something else?

icebraining · on Jan 6, 2012

What if a URL that is meant to be accessed with a GET request that kicks off an expensive, long-running process is prefetched?

Then the API owners will realize they shouldn't break the HTTP standard just because it happened to work before.

    In particular, the convention has been established that the GET and HEAD methods
    SHOULD NOT have the significance of taking an action other than retrieval. These methods
    ought to be considered "safe". This allows user agents to represent other methods,
    such as POST, PUT and DELETE, in a special way, so that the user is made aware of the fact
    that a possibly unsafe action is being requested.
    
    Naturally, it is not possible to ensure that the server does not generate side-effects
    as a result of performing a GET request; in fact, some dynamic resources consider
    that a feature. The important distinction here is that the user did not request the
    side-effects, so therefore cannot be held accountable for them.

This argument comes up every time there's a breaking change, but developers should be responsible for their actions. They broke the spec, they should be deal with the consequences.

bambax · on Jan 6, 2012

This problem used to exist for a Google optimizer of some kind, a long time ago (6-8 years ago I think).

Do you remember its name?

idspispopd · on Jan 7, 2012

also a nice way to artificially inflate chrome visitation figures.

yanw · on Jan 6, 2012

This feature differs from instant in that the pages will be pre-fetched (pre-rendered) in the background. Pages will still appear after you hit return, the difference is that they will appear quicker.

mason55 · on Jan 6, 2012

The stress on the server is the same either way; it doesn't care if the browser is displaying the page to the user or not.

patrickod · on Jan 6, 2012

Doesn't Chrome already do this to a certain extent ? I've seen server logs myself where Chrome has tried to fetch URLs that I have not finished typing.

jfoster · on Jan 6, 2012

Chrome currently has an option to start loading pages as you type (disabled by default) but this sounds a bit different. The current feature immediately displays pages as you type. It sounds as though this will instantly preload but not reveal until the query is submitted.

mike-cardwell · on Jan 6, 2012

They should send a special HTTP request header along with pre-load requests so site owners can choose to block them.

Eg web browser sends a request with the header:

X-Page-Preload: Something

I configure my webserver to 403 any requests with that header.

icebraining · on Jan 6, 2012

Apparently they do, they send

    X-Purpose: instant

(http://www.google.co.uk/chrome/intl/en-GB/webmasters-faq.htm...)

paraschopra · on Jan 6, 2012

But 3rd party JavaScript libraries (like GA or ours) cannot read such headers and will invariably get executed. How would Google Analytics recognize and discard such irrelevant requests?

icebraining · on Jan 6, 2012

I dunno, use the Page Visibility API[1]?

[1]: http://www.w3.org/TR/2011/WD-page-visibility-20110602/

mike-cardwell · on Jan 6, 2012

Will the javascript even be executed during this "pre-rendering" process? It's the people with server side analytics that are going to be affected most by it.

alexchamberlain · on Jan 6, 2012

To disable this behaviour, Detect this and return an HTTP 403 ("Forbidden") status code. Then you will blacklisted client side.

I'm not sure I like this, since it will slow down your site until the blacklist gets cleared up if the user accesses a page they are forbidden for.

icebraining · on Jan 6, 2012

Yeah, but on the other hand it's hard to find an appropriate status code. Since prefetching is a common feature in many browsers they should agree on a new code, there are still plenty of 4XX available.

mike-cardwell · on Jan 6, 2012

That's good to know. That can be used to do what I want.

alexchamberlain · on Jan 6, 2012

Why would you block them, especially for static pages?

mike-cardwell · on Jan 6, 2012

To save resources

ryandvm · on Jan 6, 2012

My guess is that the effort of blacklisting them is going to cost you far more than the false positives will. Not to mention a less swift experience for your users.

Unless of course you are the admin for http://www.faceboo.com/

alexchamberlain · on Jan 6, 2012

Bandwidth I guess, but I conject that it wouldn't starve most servers resources on static files. Coupled with good cache settings, I think this is a great move forward.

Imagine when I request Hacker News (too often), if it could download the HTML and the site returned NOT MODIFIED on everything else before I've even pressed enter, it could be displayed straight away, with little pressure on the server.

michaelmior · on Jan 6, 2012

Agreed. It seems like Google has no desire to implement this however. Presumably because they feel prerendering will give Chrome a significant speed advantage and they don't want site owners to opt out from day one.

Looks like the best you can get is a JS property once the page is loaded. http://code.google.com/chrome/whitepapers/pagevisibility.htm...

icebraining · on Jan 6, 2012

it seems like Google has no desire to implement this however

Why do you say that? They seem to send "X-Purpose: instant", at least according to their FAQ[1] and bug reports[2].

[1]: http://www.google.co.uk/chrome/intl/en-GB/webmasters-faq.htm...

[2]: https://code.google.com/p/chromium/issues/detail?id=91735

michaelmior · on Jan 6, 2012

Cool. I couldn't find any reference to this in the FAQ when I was reading :)

josscrowcroft · on Jan 6, 2012

I was pretty sure that Chrome already does this (maybe it was a flag though, actually). I switched it off after a while because it got irritating... luddite that I am.

bgarbiak · on Jan 6, 2012

I too think Chrome did that for a while - but it's gone in the current version. I definitely had it turned on for a while, used and actually liked the feature. Now it only works for search phrases and Google results.

ricg · on Jan 6, 2012

What will this do to web traffic? Sounds like this could result in a lot of additional traffic (if the the prefetched page is not what I'm interested in).

Maxious · on Jan 6, 2012

Prefetching has been around optionally since dialup days (Wikipedia says at least 2001). Around March 2005, Google instructed Mozilla browsers to prefetch the first search result URL. The sky is not falling.

jfoster · on Jan 6, 2012

Increase it. Partially due to prefetched pages not being what users are interested in, and partially due to seemingly faster & more pleasant web browsing causing an increase in overall browsing behavior.

azakai · on Jan 6, 2012

It will also increase CPU usage and decrease battery life, because the pages are also pre-rendered in the background.

It is nice to have the page appear instantly if you wanted exactly what was preloaded, but the overhead this adds in both network and CPU is kind of worrying.

jfoster · on Jan 7, 2012

Good point about battery life. Perhaps the Chrome team should consider turning this on/off depending on the system state.

It's the type of feature that I think most users would want on when they are at plugged in at home using a virtually unlimited internet connection. On the other hand, when using battery power and a 3g or tethered internet connection, it probably isn't so desirable.

nextparadigms · on Jan 6, 2012

Maybe this will be one of the benefits of having 1 Gbps connections in the future.

alexchamberlain · on Jan 6, 2012

This would work exceptionally well on SPDY connections.