Twitter's t.co uses meta tags and JS instead of 301 Redirects to Mask Referrers (getclicky.com)
53 points by ams1 on Aug 20, 2011 | hide | past | favorite | 29 comments



It seems to depend on the User-Agent:

  < HTTP/1.1 301 Moved Permanently
  < Date: Sun, 21 Aug 2011 02:55:16 GMT
  < Server: hi
  < Location: http://dl.dropbox.com/u/81822/fans.jpg
  < Cache-Control: private,max-age=300
  < Expires: Sun, 21 Aug 2011 03:00:16 GMT
  < Content-Length: 0
  < Connection: close
  < Content-Type: text/html; charset=UTF-8
  < 
  * Closing connection #0
But with a quite common User-Agent:

  curl -v http://t.co/emmQt03 -H "User-Agent:Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.1 (KHTML, like Gecko) Ubuntu/11.04 Chromium/14.0.825.0 Chrome/14.0.825.0 Safari/535.1"
  < HTTP/1.1 200 OK
  < Date: Sun, 21 Aug 2011 02:56:05 GMT
  < Server: hi
  < Content-Type: text/html; charset=utf-8
  < Cache-Control: private,max-age=300
  < Expires: Sun, 21 Aug 2011 03:01:05 GMT
  < Content-Length: 183
  < Vary: Accept-Encoding
  < Connection: close
  < 
  * Closing connection #0
  <noscript><META http-equiv="refresh" content="0;URL=http://dl.dropbox.com/u/81822/fans.jpg"></noscript><script>location.replace("http:\/\/dl.dropbox.com\/u\/81822\/fans.jpg")</script>
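
For anyone who needs the destination out of that 200 response, here is a rough Python sketch. It assumes the real body quotes the `content` attribute and the `location.replace` argument; `extract_target` and both regexes are names I made up for illustration, not anything Twitter documents.

```python
import re

# Hypothetical patterns for a t.co-style interstitial page.
META_RE = re.compile(r'content="\d+;\s*URL=([^"]+)"', re.IGNORECASE)
JS_RE = re.compile(r'location\.replace\("([^"]+)"\)')


def extract_target(body):
    """Return the redirect target found in the page body, or None."""
    m = META_RE.search(body)
    if m:
        return m.group(1)
    m = JS_RE.search(body)
    if m:
        # The script variant escapes slashes as \/ so undo that.
        return m.group(1).replace("\\/", "/")
    return None


# Body reconstructed from the response above (assumed quoting).
body = (
    '<noscript><META http-equiv="refresh" '
    'content="0;URL=http://dl.dropbox.com/u/81822/fans.jpg"></noscript>'
    '<script>location.replace('
    '"http:\\/\\/dl.dropbox.com\\/u\\/81822\\/fans.jpg")</script>'
)

print(extract_target(body))
```

The meta tag is tried first because it carries the URL unescaped; the script pattern is only a fallback.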


If they didn't detect user agents, they would break short link expansion via repeated HEAD requests that follow the chain of redirects, which a wide range of services need to do efficiently.
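
A minimal sketch of that kind of expansion, in Python. The `expand` function, the `head` callback, and the canned responses are all hypothetical; a real `head` would issue an HTTP HEAD request (for example with `http.client`) and return the status code plus the `Location` header.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def expand(url, head, max_hops=10):
    """Follow Location headers until a non-redirect response.

    `head` is any callable taking a URL and returning (status, location).
    Returns the list of URLs visited, starting with the short link.
    """
    chain = [url]
    for _ in range(max_hops):
        status, location = head(chain[-1])
        if status not in REDIRECT_CODES or not location:
            return chain
        # Location may be relative, so resolve it against the current URL.
        chain.append(urljoin(chain[-1], location))
    raise RuntimeError("too many redirects")


# Canned responses standing in for the network (illustrative only).
FAKE = {
    "http://t.co/emmQt03": (301, "http://dl.dropbox.com/u/81822/fans.jpg"),
    "http://dl.dropbox.com/u/81822/fans.jpg": (200, None),
}


def fake_head(url):
    return FAKE[url]


print(expand("http://t.co/emmQt03", fake_head))
```

Note that this only works when the server answers with a 3xx status; a 200 page with a meta refresh would end the chain at the short link itself.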


This is even more significant in that unlike other URL shorteners such as bit.ly, t.co won't show you statistics (at least for free).

So if you get a bunch of t.co traffic and you don't want to pay Twitter for statistics, the only way that I've seen that you can understand how you are getting traffic is to search for keywords relevant to your site and hope to find a tweet that includes the t.co link that you saw in your logs. You usually can't even search for the t.co link itself (unless it appears in the text of the tweet.)

Very annoying for people who want to study their server logs without paying extra money, but a great way for Twitter to monetize. (Even if users decide to include a bit.ly link to get free statistics within a tweet, it will still be hard to track down as explained above.)


You can actually just search Twitter for that http://t.co URL, e.g.: http://twitter.com/#!/search/http%3A%2F%2Ft.co%2FISHbpUw. Twitter also says it will be releasing an API to make it easier for analytics providers to report on shared links: https://dev.twitter.com/docs/tco-link-wrapper-faq#How_do_I_p...


Thanks. My failed search was for a substring (i.e. t.co/ISHbpUw) since that's what I see in Google Analytics but that failed; searching for the full URL resolved that problem.

My previous post was also poorly worded. I don't know whether t.co statistics will be free or not, only that I don't think there is a free way to get them now; I haven't investigated whether paid options are available. The situation of tracking which tweets are giving you traffic is still poor right now, but it might improve in the future.


I noticed long ago that none of Twitter's redirects had the proper referrer listed, and it really is annoying. There's no way to parse a log and know how many people clicked on a link from Twitter unless I use another landing page just for Twitter shares which is a bit silly. I don't see why Twitter is doing that.


Read the article again, they just fixed the very problem you're having! All you have to do now is count all HTTP requests with a referrer from 't.co'.


Nope, if someone clicks on one of my links on Twitter the referer field is empty. If I click on Facebook I get a Facebook url. That is a problem for both Twitter and web sites.


Reread it again :). Especially look at the charts.

Like the author, I personally see t.co in my server logs.

It would be pretty hard for Twitter to remove the referrer altogether (and I'm not sure what the point would be; they should want people to see that they got traffic from Twitter). Just did some research on ways to remove the referrer and found this: http://coding-talk.com/f14/how-to-remove-referral-8570/. At comment #8, they suggest making links via Flash, using https, or using a redirect (which is what t.co is). Users could also do it by configuring their browser to not send along the referrer when they make new requests, but Twitter wouldn't have control over that.


Which referrers did you see before the change? Wasn't it just the twitter client that the visitor used to click the link?


Because of Twitter's hashbang URL routing, you'd often just see "http://twitter.com/" as the referrer.


Correct me if I'm wrong, but this means the referer will show up as t.co, meaning that you can just as easily track Twitter referrals by looking for t.co rather than twitter.com?

The downside is that you can't see which Twitter URL it came from, but in my experience that was rarely useful as so often it came from users' home pages. And the upside is that it will show a t.co referral for non-web Twitter clients, e.g. mobile apps.
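
Counting those hits could look roughly like the Python below. It is only a sketch: it assumes logs in the common "combined" format (referer and user agent as the last two quoted fields), and the sample lines, `LINE_RE`, and `referrer_hosts` are invented for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Combined Log Format tail: "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(r'"[^"]*" \d{3} \S+ "([^"]*)" "[^"]*"$')


def referrer_hosts(lines):
    """Count requests per referrer host; missing referrers go under '(none)'."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line.strip())
        if not m:
            continue  # skip lines that are not in the assumed format
        ref = m.group(1)
        host = urlsplit(ref).hostname if ref not in ("", "-") else None
        counts[host or "(none)"] += 1
    return counts


# Made-up sample lines, not real traffic.
SAMPLE = [
    '203.0.113.5 - - [21/Aug/2011:02:56:05 +0000] "GET /fans.jpg HTTP/1.1" 200 5120 "http://t.co/emmQt03" "Mozilla/5.0"',
    '203.0.113.6 - - [21/Aug/2011:02:57:10 +0000] "GET /fans.jpg HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [21/Aug/2011:02:58:44 +0000] "GET / HTTP/1.1" 200 1024 "http://twitter.com/" "Mozilla/5.0"',
]

counts = referrer_hosts(SAMPLE)
print(counts["t.co"], counts["twitter.com"], counts["(none)"])
```

Grouping by host rather than full URL is deliberate here, since each tweeted link gets its own t.co path.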


HTML5 includes the noreferrer link rel: "If a user agent follows a link defined by an a or area element that has the noreferrer keyword, the user agent must not include a Referer (sic) HTTP header (or equivalent for other protocols) in the request." [1]

Not sure about browser support, but it's implemented in at least WebKit [2].

[1] http://www.whatwg.org/specs/web-apps/current-work/multipage/...

[2] http://www.webkit.org/blog/907/webkit-nightlies-support-html...


I'm glad someone else noticed this. A couple days ago I noticed that my real-time Japanese photo site http://tensecondstotokyo.com started acting really wonky. Most of the images being referenced are broken as hell and back. I built this back in March after the tsunami so that I could see photos from the ground of what was going on. Currently (as in right now) re-working the backend to account for the changes.


As annoying as this is if you're writing something that depends on t.co using proper HTTP status codes, it's really fantastic for users of Twitter.

It masks the referer header, which protects my privacy without breaking sites that rely on the referer header.

Secondly, and much more importantly, it gets rid of privacy-destroying URL shorteners like bit.ly that give the posters statistics on their tweets.

It might be a tiny annoyance to some developers, but the privacy gains are fantastic.


I don't think this really benefits users all that much from a privacy perspective; it really just consolidates tweet link clicks in server logs.

The information that would normally go in the referer header would presumably be available via future t.co APIs (although there is some privacy protection there now).

Posters can still use bit.ly links (just not by default); they would actually have a t.co link to a bit.ly link to their actual link. t.co links may offer similar statistics in the future. Twitter is also a private company that can track a large portion of people's browsing history.


Very true, though I'm still not sure what I think of the balance between privacy and business-friendliness that is best - just like most countries, I suppose...

The proportion of sites with Facebook/Twitter/insert-hot-social-site buttons/widgets on nowadays means that a great deal of people's browsing gets tracked anyway, unless they take precautions against that. Pretty much every blog, startup, news site, etc etc.


Definitely noticed all the outgoing links briefly flashing over to t.co the past few weeks. Interesting. I expect someone at Twitter dev will explain this in a few days.


What's the point of hiding referrers when Twitter uses hash-bang URLs, which break referrers already?


Twitter is not equal to its web interface. APIs deliver t.co links too.


It's definitely time for browsers to stop sending referer headers.


It is a two-sided issue. When I am browsing the web, I have referers disabled in the browser to protect my privacy. But at the same time I rely heavily on seeing where people came from to find my hobbyist website. It motivates me, and without it I would not have discovered communities of like-minded people.


Yeah, let's break the web!


Why would this break the web?


It honestly wouldn't. I browse with some antivirus stripping out my referers and modifying other headers to be less identifying, and it's vanishingly rare for me to get any sort of problem whatsoever.

The flipside of this is that knowing where people have come from is very valuable even in non-invasive ways to business, and in the end that's what the internet is more or less about nowadays. The average web user probably only uses a few things which aren't solely controlled by one vendor - email being a notable one, and probably why Facebook messages will never win in the end if history teaches us anything :-)


Serving up a full HTML page seems a lot more expensive than returning a header.


But it's better for the privacy of your users. Just because you have a website that I've visited doesn't mean you're entitled to my browsing history.


I'm pretty sure only one HTTP message is needed to send both the HTTP headers and the HTML content, which makes the only drawback a slight lag while the client renders the page.




