Facebook uses bgsound to see if you have opened an email (plus.google.com)
239 points by newman314 on March 4, 2012 | 93 comments



I've personally found loads of bugs in most of the major email clients and numerous webmail clients that cause them to make outgoing requests which can be detected (even with remote images disabled). Most of these are closed now. I have an automated tester which sends an HTML email with a bunch of tests like this bgsound one to your address which displays information about any callbacks. You can access it here:

https://grepular.com/email_privacy_tester/


Ooh thanks for this :)

Got a "DNS Prefetch - Anchor" on Gmail


That one used to trigger on Thunderbird and Apple Mail. It was fixed after I submitted the relevant bug reports.


Could you link to the fix/bug report? My mozilla-ppa(deb) Thunderbird 10.0.2 leaks like the Titanic after the iceberg.

Gmail does much much better - a counter-intuitive result.


DNS pre-fetching was disabled in v3.0.2 - https://bugzilla.mozilla.org/show_bug.cgi?id=544745#c21

I am surprised that you're seeing leaks in Thunderbird 10.0.2. That is my own client of choice so I'm always keeping an eye on it after updates and I am not currently seeing any leaks... If you have remote images disabled, are you still seeing leaks? If so, which ones?


It was just a dns pre-fetch. (I never expected Thunderbird to do that.)

I disabled preFetch in the config editor and my ship's running dry again.


For me, Thunderbird had a dns-prefetch leak, but it's fixable via config.


It seems the author has already documented this.

For those interested: https://grepular.com/DNS_Prefetch_Exposure_on_Thunderbird_an...


So did I, it picked up a Google IP address. Anyone know how to close this? I have images turned off.


It seems to be the MTA (probably the spam filter), since it happens to me even without any client open, so there's probably no way to close it on your end.


If it happens without any client open, then it isn't really a problem is it?


Well, it doesn't happen if the address doesn't exist (I tested it too), so it can still be used by spammers to check for that.


If an address doesn't exist the message will simply bounce. Or am I missing something?


Yes, here is an example of an address that doesn't exist:

https://grepular.com/email_privacy_tester/lookup?code=gapz72...

If the address exists in Gmail, you get a response on "DNS Prefetch - Anchor", so it could be used to determine whether to send an email or not. I'm not a mail administrator but I can imagine how it might be useful to a spammer, e.g. to stop mass bounce replies.


Erk.

webOS email client leaks data all over the place.


yay, Gmail via Safari doesn't trigger any of the tests.


People don't expect that when they read an email, the sender knows when they read it and from what IP address.

If you're using a tracking image or any similar technology, then you are exploiting a weakness in the system to take information that you don't have permission to.

I recognise that it's standard industry practice, but it's categorically not ok. If you want to track users, ask for their permission. If you haven't done so, then you have no right.


You are right in principle, but this isn't a tricky detail about email. This is fundamentally how Facebook and Google run their businesses.


Tracking email opens is surely helpful to their businesses, but is it really fundamental?


Depends on how you look at it. To me their business models are about understanding their users and exploiting their behaviors for more revenue. Understanding their behaviors in regards to email usage is one of the ways they work. To me "understanding user behaviors" is definitely fundamental.


Even if Google did zero tracking of users, they'd still be able to display adverts relevant to the search term used when displaying search results. Tracking you across sites just increases their profit; it's not necessary for their business to exist and be profitable. And even if you argue that it is necessary for them to remain profitable, then they don't deserve to remain profitable. Not at this expense.


It is one of the classic things to track in A/B tests. And A/B testing is a pretty fundamental part of their business.


Flawed logic. A/B tests are hypothesis testing, and the specific hypotheses can still violate one's privacy. If they could use a bit of retrofitted code to surreptitiously capture your text messages and geolocation, it would certainly help their business, but would it be ethical?


Plaintext e-mail is awesome, not just because it's readable, but because it's not vulnerable to these sorts of attacks.

Incidentally, it's unfortunate that Sparrow doesn't have a 'force plain text' option. Even though I've checked 'prefer plain text', all Facebook e-mails are delivered in HTML. This might be a reason to switch back to Mail.app.


Doesn't gmail stop these attacks without needing to force plaintext only by simply disabling images by default?


Yes. There was a bug a few years back though where they would display attached SVG images. These images could actually contain javascript, which left it vulnerable to XSS.


Why is zzz90210's post dead? Everyone knows about tracking via images. I never considered something like bgsound, and probably a lot of other people had not either.

And it's the whole point of the article.


His post is dead because of this comment he made: http://news.ycombinator.com/item?id=3662065

It took his karma negative, and once that happened his account was killed. As a new member you have to be careful about controversial statements until you build up a karma cushion.


I see, an indirect cause didn't occur to me.


The highly-upvoted mail-bug testing site in comments says gmail isn't vulnerable to bgsound - https://grepular.com/email_privacy_tester/


attacks? Run for your life, is mute-sound-email-tracking attack!


This is pretty standard practice really... many e-mails will have a "tracking pixel" or similar. Facebook has this in their e-mails too: <img src="https://www.facebook.com/email_open_log_pic.php?mid= =blah" style="border:0;width:1px;height:1px;" />
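
For anyone unfamiliar with how such a pixel is put together, here is a minimal sketch. The tracker URL and the `tracking_pixel_tag` helper are hypothetical, not Facebook's actual code; only the 1x1 `<img>` with a per-message `mid` parameter comes from the example above.

```python
from urllib.parse import urlencode
import uuid


def tracking_pixel_tag(base_url: str, message_id: str) -> str:
    """Build a 1x1 <img> tag whose URL carries a per-message identifier.

    When a mail client fetches the image, the server behind base_url
    can log the identifier, the time, and the requesting IP address.
    """
    query = urlencode({"mid": message_id})
    return (
        f'<img src="{base_url}?{query}" '
        'style="border:0;width:1px;height:1px;" />'
    )


# Hypothetical tracker endpoint; each message gets its own random id.
print(tracking_pixel_tag("https://tracker.example/open.gif", uuid.uuid4().hex))
```

Because the identifier is unique per message, a single fetch is enough to tie the open back to one recipient.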


Yes, but about 100% of email clients now don't load images -- whether webbugs or actual content -- because of that. This BGSOUND tactic certainly isn't standard practice, despite being an extension of a prior technique, extended specifically to get around the existing filter.


Last time I checked, the iOS email client had loading of remote images enabled by default. It's been a while though and I don't have an iOS device to be able to check on.

There was a time when the iOS email client (and Apple Mail too) would actually load content from the html audio and video tags, even when remote images were disabled.


You don't list a source, but I'll propose a different stat. A survey in 2010 implied that only 33% of folks kept images on... but that's a wide jump from 100%.

http://www.clickz.com/clickz/column/1716214/disabled-images-...

Your point is right, that emailers are trying to get around the image blocking... But it's not that everyone turns off their images; some folks still keep them on. The question then becomes: what's the best way to block tracking pixels while still allowing consumers to experience attractive emails if they wish?


Can't you attach the images to the email, and reference them from the HTML? If that doesn't work, you can send a PDF or Word file.

Putting img tags that link to your server doesn't sound like a very good way to get attractive emails anyway. What if your server goes down? What if the user disconnects from the internet before reading their inbox? What if the user rereads your message after several years and it's suddenly not so attractive anymore?
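
A rough sketch of the attach-and-reference approach, using Python's standard `email` package. The addresses, subject, and `build_message` helper are placeholders I made up; the idea is just that the HTML points at a `cid:` URL that resolves to a part inside the message, so nothing is fetched from a server.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def build_message(png_bytes: bytes) -> EmailMessage:
    """Build an HTML email whose image travels inside the message itself."""
    msg = EmailMessage()
    msg["Subject"] = "Newsletter"           # placeholder values
    msg["From"] = "sender@example.com"
    msg["To"] = "recipient@example.com"
    msg.set_content("Plain-text fallback for clients that skip HTML.")

    # make_msgid returns '<id@domain>'; the HTML needs it without brackets.
    cid = make_msgid(domain="example.com")
    html = f'<html><body><img src="cid:{cid[1:-1]}"></body></html>'
    msg.add_alternative(html, subtype="html")

    # Attach the image to the HTML part so the cid: reference resolves locally.
    msg.get_payload()[1].add_related(png_bytes, "image", "png", cid=cid)
    return msg
```

The trade-off is message size: every recipient downloads the image whether or not they ever open the mail.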


The problem with attachments is that you increase the bandwidth on the sender. So, instead of, say, the sender spending 2k bandwidth per mail and each recipient another 60k on bandwidth to get images when opened, the sender pays for all those extra image bytes. It may shift the cost in a way you prefer, but it also slows down the send so users may not get their mails in a reasonable time, and may not appreciate the larger mails filling up their boxes.

There is also the issue with referencing the inline images, though I've been told that it's not such a big deal (I've never tried it myself).

As for the "going down", most reputable email vendors have pretty well done image servers, for this very reason. But yes, if the user disconnects, the images won't be available. But same with the web site that the email is linked to, so users couldn't necessarily click for more info either.

But commercial emails usually aren't designed to be saved and re-referenced like mails from friends. Instead, they expect to be read while online, and either reacted to quickly or discarded. I guess it's similar to the mindset with paper mail.

Too bad there isn't more effort on making better mails that don't rely on images but instead make better experiences... instead of better tracking tech.


The problem with attachments is that you increase the bandwidth on the sender. So, instead of, say, the sender spending 2k bandwidth per mail and each recipient another 60k on bandwidth to get images when opened, the sender pays for all those extra image bytes.

But wouldn't the sender have to serve the images over HTTP anyway? Unless you expect a big percentage of those emails to never be opened, but then I have to wonder if you should be sending them in the first place.


Yeah, not everyone opens the mail, so the recipient shouldn't have to deal with the larger download (and the sender doesn't have to send a larger mail), and also, the opens are more distributed over time (for the most part, depending on when you send and to whom), so you can send the smaller mails faster but have a more even distribution of bandwidth as the opens occur across a longer period.
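
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. The 2k body and 60k of images are the figures from upthread; the recipient count, the 25% open rate, and the `sender_kb` helper are assumptions chosen purely for illustration.

```python
def sender_kb(recipients, body_kb=2, images_kb=60, open_rate=1.0,
              attach_images=False):
    """Approximate kB the sender transfers for one mailing."""
    if attach_images:
        # Every message carries the images, whether or not it is opened.
        return recipients * (body_kb + images_kb)
    # Small body for everyone; images are served only for opened messages.
    return recipients * body_kb + recipients * open_rate * images_kb


attached = sender_kb(1000, attach_images=True)   # images inside each mail
remote = sender_kb(1000, open_rate=0.25)         # images fetched on open
print(attached, remote)
```

Under these assumptions the attached mailing moves several times more data, and it moves all of it at send time instead of spreading it out as messages are opened.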


What filter is this actually getting around? Since the article / post says it only works for email clients that have "show images" enabled, I'm wondering what the added value of using this technique is?


Since the article / post says it only works for email clients that have "show images" enabled?

It does? Someone mentions that they had loaded the image bug, leading to someone else making the show image comment. BGSOUND would not be filtered by that.


I just checked using a proxy -- there are no calls made to Facebook from gmail if you do not display images. (Checked using Chrome on a Mac.)


Did your proxy check DNS requests? Anyone who runs their own DNS could easily assign a unique subdomain for each email, embed links to that subdomain within the body of the email, and see if they get any DNS requests for those domains.
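
A minimal sketch of that idea, assuming the tracker controls the authoritative DNS for a zone (the zone name `t.example` and both helper functions are hypothetical). Each outgoing email gets its own random label, and any later lookup of that name maps back to the recipient.

```python
import secrets
from typing import Dict, Optional, Tuple


def tracking_hostname(zone: str) -> Tuple[str, str]:
    """Return (token, hostname) with a random label unique to one email."""
    token = secrets.token_hex(8)
    return token, f"{token}.{zone}"


def match_query(queried_name: str, issued: Dict[str, str]) -> Optional[str]:
    """Map a name seen in the DNS query log back to a recipient, if any."""
    label = queried_name.rstrip(".").split(".")[0].lower()
    return issued.get(label)


issued = {}
token, host = tracking_hostname("t.example")
issued[token] = "alice@example.com"

# The link embedded in the email body would point at this hostname.
print(f'<a href="http://{host}/">link</a>')
# A later DNS query for that name identifies the message that was rendered.
print(match_query(host + ".", issued))
```

An HTTP proxy would not see any of this, since the leak happens at name resolution before any connection is made.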


Also checked using Chrome/Firefox on Ubuntu, with the same result. iOS Mail App didn't grab anything with remote images turned off.

Considering it's an IE-specific tag, I tried with IE and Gmail, but I couldn't get it to download anything, even when allowing remote images. IE was too bugged out trying to render Gmail (not hating, stating).


Can you read the post without signing into a Google account? I click through via my iPhone, and get prompted for a Google account. Isn't this a post about getting tracked on the web?


I can on my desktop, but I can't on my (Android) tablet.

On the tablet Google actually wants me to create a Google Profile before I can read the article.

I've never run into this before today. I wonder if it's new.


It's old. Google has god-awful mobile interfaces on most of their services that always either redirect you to something different or require you to log in and then redirect you to something different.

Sorry if I sound annoyed, but it's a pet peeve.


Yeah, that's dumb. If I select "request desktop site" in the Android browser I can see the post, otherwise I'm redirected to a sign in. You only get the redirect in Android Chrome, since it doesn't have a "request desktop site" option. Annoying.


Ahh thanks for this extra bit of info. I use an iPhone browser which lets me select my user agent, so in cases like these, I tell it to pose as Safari on a Mac. With this, I was able to read the post.


Apparently, Google forces you to log in if you are using mobile devices - Android/iOS. Works on desktop without log in.


It's public, I'm pretty sure I got there before logging into Google this morning.


Given that this is a community of entrepreneurs, I'm surprised how unsympathetic a lot of people here are to this technique. Operating a business online, analytics are a very important part of understanding how your users interact with your service and improving the quality of your correspondence. Facebook is just trying to identify which emails people respond to (i.e. open) most.


Probably because most entrepreneurs have morals/ethics. Just because you can do something, or just because you want something, doesn't mean you should. And doubly so when you know that the victim would most likely not agree to being tracked.


Wow, I hope that subscribing to this mindset is not the status quo among entrepreneurs. Another "very important" thing for most online businesses is money. Doesn't mean that we should accept it when they get it through underhanded means.


Just because the information is useful, that doesn't mean it is ethical to collect it.


Would those of you who object to this tactic also object to collecting website analytics (which in my experience many normal users, including members of my own family, do not understand either.) If not, why not?

This isn't a rhetorical question; I'm sincerely curious. Thanks.


Personally, I think analytics are fine as long as you run them from the same domain as the website. Using e.g. Google Analytics allows them to track users across multiple sites, which is much more dangerous, in my opinion, and should be avoided.

I use Piwik on my own sites and will never share that data with anyone.


For email, it's the same reason I object to ordinary mail containing tracking devices beyond those needed to deliver the mail.

For websites, it's the same risk I take when leaving my house to visit any public place.


How many of their users do you think know they're being tracked like that? I mean actually know and understand, not "were notified in the fine print on page 27".

Tracking users in ways they don't actually know about and understand, well, the usual terms normal people apply to that sort of behavior are "underhanded", "antisocial", "betraying trust", etc.


The vast majority of users don't understand any tracking technology, period.


http://www.sendgrid.com/ which provides email services to a lot of clients perhaps uses this by default. I don't think people in this business look at it as an attack. Websites use various techniques to track people's activities, and this is just another one of them.


SG's "open tracking" is a great feature!


One thing that puzzles me. Pretty much every email client now will not display images in HTML emails by default.

But most email marketing tools provide some metric called "% of emails opened" or something similar.

Now assuming that the marketing emails you send out are mostly text (as they should be), it seems unlikely that many recipients will bother clicking the "Show Images" button since there is no reason for them to do so (not to mention that it often brings up a privacy warning or the function may have been entirely disabled by IT for unknown sources).

So in that case, that metric must be enormously unreliable. So much so as to be basically meaningless?

I have also heard statistics from business types who say things like "Email marketing is useless, only x% of people even open the emails" and I have also seen such things repeated on websites about internet marketing.

Something doesn't seem quite right here.


The outrage over this seems ridiculous to me.

How many people complaining about email open tracking are also Sendgrid customers? Or use an email newsletter service? Your emails may have open tracking (including who opened it) without you even knowing.


Already blocked in the Fanboy Tracking List; it was added a few weeks back.

http://hg.fanboy.co.nz/rev/68fbc20cd533


IIRC this only works in IE. If someone sniffs their network traffic they could confirm what is going on.


I wonder if Microsoft will criticize them the same way they criticized Google for similar workarounds.


Isn't bgsound IE only? That's what a quick search led me to believe...


Many desktop e-mail clients have their own rendering engines. Outlook, I believe, uses Word's HTML engine, for example.


Yes, I think it is, but don't forget that IE has quite a share of the browser market.


I figured if they were doing this they would have made it so that when you opened a notification via email it would be marked as read within the Facebook site, but it isn't. That would be a nice feature and purpose for the tracker image.


After reading the title I thought that they take over the microphone and use it to detect the email arrived sound :)


This is one of the oldest tricks in the book. The question to ask is: who doesn't do this?


This is yet another reason to filter all messages from Facebook directly to the trash folder. The last thing I need is a constant stream of attention-sapping fb emails polluting my inbox.


You could also just tell Facebook to not email you.


It's easier for me to set up a quick filter once than to navigate the ever-changing Facebook UI and privacy settings.


Yesware does something similar as well...


I have yet to work out why anyone should wish to turn on html in emails ever and I always thought that was the start of email being broken. Protocols existed before machines and should serve us as well as them.


People turn on HTML in email because there is no other widely accepted alternative for rich text email.

As for why plain text only is not good enough, here's a simple thought experiment. Imagine that we did not have any kind of email. We invented computers, and computer networks, but somehow, incredibly, overlooked inventing email.

In this hypothetical world, people still communicate by writing letters, and sending them through the post office. Of course they write the letters on computers in word processors, and then print them, and it is the print outs that they mail.

Now, imagine in this hypothetical world that someone finally comes up with the idea of email, and pitches it as an electronic equivalent of regular mail that is faster and more convenient. Is he going to make it plain text only? Of course not. It will need to be as capable as physical mail, which means it needs to support sending anything that people can print on paper. That means some kind of rich format that can handle different typefaces and fonts, colors, inline images, and attached documents.

Email started off plain text only simply because when it was invented the technology wasn't up to the challenge of handling the presentation features of real mail. The technology has improved, and rich email is the natural, inevitable outcome.


So, by the same logic, twitter should allow javascript to execute when I view some post about people microwaving small mammals.

Perhaps I should be more explicit.

My problem is not with the abilities to assign pixels to an output device. Email should not be an interactive proposition and should have no ability whatsoever to run code.


Email should not be an interactive proposition and should have no ability whatsoever to run code.

Agreed, but HTML isn't runnable code. It's a declarative document format.


I was meaning HTML + scripting, which admittedly is much less of an issue these days. I just remember having to rescue lots of people using outlook.


HTML isn't what bugs you. Unauthorized loading of remote content is what bugs you. That's a misfeature of your (and most people's) mail reader.
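
As a rough illustration of what a mail reader could check before rendering, here is a sketch that lists tags pointing at remote URLs. The attribute list and the `find_remote_refs` helper are my own assumptions and are far from exhaustive (CSS `url()` references and DNS prefetching, for instance, are not covered).

```python
from html.parser import HTMLParser

# Attributes that commonly cause a fetch; deliberately incomplete.
URL_ATTRS = {"src", "background", "poster"}
REMOTE_PREFIXES = ("http://", "https://", "//")


class RemoteRefFinder(HTMLParser):
    """Collect (tag, url) pairs for attributes that point off the message."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if (name in URL_ATTRS and value
                    and value.strip().lower().startswith(REMOTE_PREFIXES)):
                self.found.append((tag, value))


def find_remote_refs(html):
    finder = RemoteRefFinder()
    finder.feed(html)
    finder.close()
    return finder.found


sample = ('<p>Hello</p><img src="cid:logo" />'
          '<bgsound src="https://t.example/a.wav">')
print(find_remote_refs(sample))
```

Because the check keys on attributes, not on a fixed list of tag names, it catches `bgsound` the same way it catches `img`, while leaving `cid:` references to attached parts alone.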


You are right, that and any ability for scripting.


My take on this is that it is fairly common to use some type of analytics to know whether your targets are opening the mail they get from you. I am familiar with this from my knowledge of email marketing. Reading this, I am not so sure it is a direct attempt to invade privacy. I just think Facebook is following practices that are already common in email marketing. From a privacy or trust standpoint, I feel much safer with Facebook services than I do with Google.


An invasion of privacy does not cease to be an invasion of privacy just because a small group of deeply unethical people persist in it for years.


To win this fight, you need to convince users (and not just the techies on HN) to care. You aren't going to convince site operators to hamstring themselves over an issue their users don't care about. (Even if you do, unless you win over every single site, all you will do is kill the ethical operators and leave the unethical ones standing, like antibiotic-resistant bacteria. Is that really what you want?)


I think it would be pathetically easy to convince them to care, are you kidding? The problem is notifying them that it exists. Despite what many seem to think, people do care about their privacy and most people would find this extremely creepy. If your Gmail alerted you that images could be used for this purpose then many more users would care. The problem is that users have no idea that this can occur and it is not fair to them in the least.


This isn't about websites, it's about email. Ethical operators do not insert tracking bugs into email, and they definitely don't do it in a way that tries to deliberately get around long-standing restrictions in virtually all email clients explicitly designed to impede this.


These sorts of tracking pixels or tracking items are long-standing methods of analytics gathering in the email marketing and delivery space. This is not an attack on privacy. They also know every click you make in that email. And of course they track the number of times you do these things. They also track that traffic anonymously if you forward the email to a third party who interacts with it.

This is the same practice as any tracking done on a web page even going back to the pixels that AWStats used to use.

I work for an ESP and we've been doing something similar since 2003.


It is an attack on privacy. The recipients don't expect to be tracked in this way, and you don't ask their permission before doing it (1). You're exploiting a weakness in the system.

It doesn't matter how many organisations are doing it, or how long they've been doing it, or how much money they make from it, or how useful the information is, or how good the "services" you provide are. It's still an attack on people's personal privacy.

I don't expect to be able to convince you though. After all, you work for an ESP so you have to justify it to yourself somehow.

(1) hiding something away in the T&C's doesn't count.


You could say the same for any website. What's different?


I can, yes. And I do...

EDIT: Oh, are you talking about tracking someone on your website, or tracking people across websites?

EDIT2: When people visit a website, they expect that website to be able to see their IP address and track their movements across it. What people don't expect is that websites (without their permission) can track their movements across other websites.



