Google, make Google Analytics HTTPS by default (smerity.com)
75 points by Smerity on Dec 30, 2013 | 24 comments



There is a wider debate that deserves to be had about analytics - not just Google Analytics but about web tracking in general. A debate about what data is collected in the first place and just how "anonymous" that data is. Is it possible for companies to respect user privacy and collect user data that helps inform improvements to their business or website at the same time? How can we answer this question when we don't even know what companies track and record in the first place? Google certainly aren't telling.

It's interesting that Google does not display the IP address to Google Analytics customers, citing privacy concerns, but it does of course capture that data for itself. Put another way, Google believes displaying the IP address to analytics customers would be a privacy concern, yet you could argue that Google capturing that data for itself is a greater privacy concern because, unlike the individual analytics customers, Google can aggregate all the phenomenal amounts of data it captures to build a much more detailed picture of online behaviour across multiple sites.

Google are not collecting this information for nefarious purposes, but no company should be allowed to collect such phenomenal amounts of data about users' online behaviour without scrutiny. For example, consider how little comment is made on the privacy implications of using ChromeOS where your online behaviour is not only captured by Google, it isn't even anonymous.

No other company has its digital fingerprints all over the web like Google. But Google gets an easy pass on matters of privacy and online tracking. Why?


We have a number of sites in the UK for reporting crime which are supposed to be anonymous. The best known is Crimestoppers UK [1] which tells you you can "Contact us anonymously with information about crime". All these sites use Google Analytics and some also add Facebook tracking codes too.

None of these sites point out that Google, Facebook, etc can easily ID you and while you may be anonymous to the site operator you aren't truly anonymous. I've contacted Crimestoppers and the local police to ask about this and got nowhere. The bottom line seems to be that the police and Crimestoppers have absolute unshakable faith in Google and are comfortable outsourcing crime victims' privacy to an American advertising company.

[1] https://crimestoppers-uk.org/give-information/give-informati...


I developed a very similar form for one of these sites. We were asked to implement features for safeguarding anonymity, like stripping information from web logs, and we did. The organisation insisted on including Google Analytics on the anonymous form submission page; I repeatedly made it clear that this was an awful idea, that it completely defeated anonymity, and that if usage data was required we could collect it in some other way. Unfortunately, I don't think they really care.

Security around this so-called anonymous data was thoroughly lax, and eventually someone is literally going to be murdered as a result of it.


The most concise argument a user can bring to this "debate" is to simply block all of these trackers/analytics/etc.


Analytics data is the currency of the internet industry, and it will remain so.


I don't agree about "nefarious purposes". I think the shift of data about customers away from businesses and toward google is a telling sign of intentions: more data for google to sell (pre-packaged in some ad-retargeting deal). In this case, I'd consider google's data collection to be monopolistic-style "nefarious purposes" and anti-competitive.

Your other points are bang on.


(Full disclosure: I've been working in analytics since 1999 and founded a web analytics business, and I have consulted for many Fortune 500 companies and non-US government agencies on their management and usage of data. I'm naturally pre-disposed to not trust Google. I have a technology background in systems design engineering, as well as a business and economics background.)

There's one major problem that many people don't realize: the value in that data is immense, and Google has cornered the market in both analytics (recording the data) and advertising (using the data to profit). This profit comes directly at the expense of businesses. It means lower margins, because they must pay higher ad prices to be competitive and to attract or retain (see re-targeting on DoubleClick) their existing customers, whom Google knows more about. The customers are worse off because the products they know and love get out-competed by similar knock-offs that invest less in product/service development but more in ads. Over the long run this tends to make prices higher and the benefit people get lower. It's like a Google tax which also reduces options.

Heh, and that's saying that the NSA isn't already paying Google for their https private certs and using the massive mountain of customer data already. I'm tending to think that the NSA has already thought of that one.


And that's why it feels so good to block all hits to `google-analytics.com`. There is a choice of extensions that do the job. Sites which do not work because of that blocking are probably very rare.
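For anyone curious what such a block boils down to, here is a minimal sketch of the host matching, assuming a plain list of domains. The names `BLOCKED_DOMAINS` and `isBlocked` are made up for illustration; real extensions use their own filter formats and hook into the browser's request pipeline, which this does not show.

```javascript
// Hypothetical blocklist; extend it with whatever domains you want to refuse.
const BLOCKED_DOMAINS = ['google-analytics.com'];

// Returns true when the request URL's host is a blocked domain
// or a subdomain of one (e.g. www. or ssl.).
function isBlocked(requestUrl, blockedDomains = BLOCKED_DOMAINS) {
  let host;
  try {
    host = new URL(requestUrl).hostname.toLowerCase();
  } catch (e) {
    return false; // not an absolute URL, so there is no host to match
  }
  return blockedDomains.some(
    (domain) => host === domain || host.endsWith('.' + domain)
  );
}
```

Matching on the exact domain or a dot-prefixed suffix avoids accidentally blocking an unrelated host that merely ends in the same characters.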


To add to that, I would advise blocking even more than just that domain. I would even advise blocking all third-party static assets hosted on a google/doubleclick/etc. domain. (If there are assets that are needed to prevent breaking functionality, such as web fonts, then caching them on your own server, or on your router for personal use, is recommended.) This advice might put the reason behind Google caching the images in emails into perspective. (Google's email image caching tactic wasn't about email marketing; it was about keeping more and more re-targeting data to themselves.)


Google Webfonts makes use of user-agent sniffing to determine your OS, and depending on your OS they optimize the font served to the font-renderer in your system.


Interesting. It seems like something that could and should be packaged into a simple JavaScript to reduce chatter and synchronous calls in the JavaScript pipeline rather than having a blocking server call. (So that it can be distributed and cached without upstream requirements)


Could you clarify your argument? It seems like learning about alternative services and increased competition should increase options, not reduce them.


Certainly. Discovery of alternative services is vastly different when learning through a friend who had a great experience (unpaid) as opposed to learning about a service through a carefully constructed (paid) marketing plan. Often the marketing plan has significant resources compared to the investment in the product itself. The strength of a marketing effort (including how well it can convince and how well it's targeted) determines the winner. If one learns of an alternative service but the marketing sizzle has confused customers of existing products, then the advantage goes to the stronger marketing team, not the stronger product.


It seems like this would depend on whether marketing is used for good or evil. For example, showing a lot of people how your project actually works and why you might want to use it (as Apple does sometimes) is taking the high road. A stronger marketing campaign may work well due to better user education, in which case it's not necessarily "unfair" that a somewhat weaker but better-understood product wins.

But of course there are the usual dirty tricks, too, causing user confusion as you mention.


Very true. I really want to see more companies taking the high road. It's unfortunate seeing the state of marketing. Take the food industry, for example: people are not eating healthier meals as a result.


While we wait, let's make all sites we control ourselves use HTTPS. Change this line in the script:

ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';

To this:

ga.src = 'https://ssl.google-analytics.com/ga.js';


In the article I explicitly say this won't work sadly :(

While the Google Analytics JS is loaded over HTTPS, the tracking information sent back to Google inherits HTTP/HTTPS from the website itself.

That means if you're on an HTTP site, it'll send back the results over HTTP. You'd need to actually hand roll your own GA JS to have it send back over HTTPS, and then I'm not certain if that'll cause issues with records at Google's end.
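To make the inheritance concrete, here is a rough sketch of the behaviour described above, not Google's actual code. The function name `beaconBase` is hypothetical, the hosts are the ones from the loader snippet, and the `/__utm.gif` path is an assumption for illustration.

```javascript
// Sketch only: picks the beacon endpoint from the protocol of the page
// that embeds the tracker, regardless of how ga.js itself was loaded.
// The '/__utm.gif' path is assumed for illustration.
function beaconBase(pageProtocol) {
  const secure = pageProtocol === 'https:';
  return (secure ? 'https://ssl' : 'http://www') + '.google-analytics.com/__utm.gif';
}
```

In a browser you would pass `document.location.protocol`; the point is that changing `ga.src` does not change this input.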


Add this to your tracking JS: _gaq.push(['_gat._forceSSL']);

And for the newer analytics.js: https://developers.google.com/analytics/devguides/collection...
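For the ga.js variant, a minimal sketch of where that call might sit in the usual command queue. `UA-XXXXX-X` is a placeholder property ID, and the ordering reflects my assumption that the setting should be queued before the first hit.

```javascript
// Standard ga.js command queue; commands run in the order they were pushed.
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXX-X']); // placeholder property ID
_gaq.push(['_gat._forceSSL']);            // ask ga.js to send hits over HTTPS
_gaq.push(['_trackPageview']);            // queued after _forceSSL so this hit is covered
```

The loader snippet that actually fetches ga.js follows this as usual.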


It's not just the NSA; any old coffee shop wifi operator could essentially bug your machine with Google analytics (even after you go home): http://paulgb.github.io/cachebeacon/

The HTTPS proposal might be expensive but it would prevent this. Every mainstream beacon and CDN-hosted JavaScript file would have to be on board though.


> any old coffee shop wifi operator could essentially bug your machine with Google analytics

... also whenever people use all those CDN-hosted jquery and other popular JS libraries. Which is - unfortunately - also widespread practice these days.


Interesting thing that is happening right now in South America:

Somebody hijacked some servers from Movistar Peru / Argentina, and is serving a modified file for Google ads / analytics JS.

Here are some hijacked URLs: http://www.googletagservices.com/tag/js/gpt.js http://www.google-analitycs.com/ga.js http://pagead2.googlesyndication.com/pagead/show_ads.js

The LinkBucks script they serve instead: http://pastebin.com/mYYpYDkR

The script basically hooks mousedown on the whole page, and redirects you to http://dca14d4e.megaline.co/url/ORIGINAL_URL


I just block Google Analytics entirely. For my own sites, I rely on server logs, because I know I'm not the only one blocking all this tracking garbage we're being served with almost every request we make.
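As a rough idea of what that looks like, here is a small sketch that counts page views from access log text, assuming lines in Common/Combined Log Format. The function name `countPageViews` and the sample line in the comment are made up for illustration; real logs need more care (bots, static assets, other HTTP methods).

```javascript
// Matches the request part of a Common/Combined Log Format line, e.g.
// 203.0.113.7 - - [30/Dec/2013:10:00:00 +0000] "GET /about HTTP/1.1" 200 512
const REQUEST_RE = /"(?:GET|POST|HEAD) (\S+) HTTP\/[\d.]+" (\d{3})/;

// Counts successful (2xx) requests per path, ignoring query strings.
function countPageViews(logText) {
  const counts = new Map();
  for (const line of logText.split('\n')) {
    const match = REQUEST_RE.exec(line);
    if (!match) continue; // skip lines that are not in the expected format
    const [, target, status] = match;
    if (!status.startsWith('2')) continue; // ignore redirects and errors
    const path = target.split('?')[0];
    counts.set(path, (counts.get(path) || 0) + 1);
  }
  return counts;
}
```

Feed it the contents of an access log (for example via `fs.readFileSync(path, 'utf8')` in Node) and you get a `Map` of path to hit count, with no third party involved.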


Google does not care about what you want. Especially for GA, which still does not have better date range picking after all these years.


Google: let me swap accounts on the GA page using the same dropdown on all your other apps.



