Disqus cracked – Security flaw reveals users’ e-mail addresses (cornubot.se)
159 points by SuperChihuahua on Dec 10, 2013 | hide | past | favorite | 87 comments



If a political organisation were revealing the identities behind anonymous speech on a Jewish forum, the world would be up in arms. If the identities on a gay board were published, Obama himself would be apologising. Now the identities of thousands of people commenting on politics in Sweden have been revealed, and it's OK because "they" are the bad guys, says the extreme left organisation Researchgruppen.

The slippery slope is


Being born into a jewish family isn't a choice. Being gay isn't a choice.

Being politically affiliated with hate speech IS a choice.

That is some straight up false equivalency bullshit.


Hate speech is just the speech that the majority has determined that they don't want to hear, and clearly, in some countries, are willing to exercise state violence upon those who commit it.

The problem with making speech legal or illegal based on popular opinion of it, is that one IS perfectly capable of engaging in hate speech without choosing to do so. What happens when what I said online two years ago becomes hate speech tomorrow? Do I need to make sure I go scrub all of my past speech before anyone finds out and sends me to jail?

Furthermore, one "chooses" their future course, and whether they want to self-identify with one group more than another. To claim otherwise is to deny any semblance of free will. I know many people who choose to identify with the gay community that aren't gay, and many who choose to NOT identify with the gay community who are gay.

False equivalency rains down wonderfully on the head of a man imprisoned for speaking his mind. You can tell him all day how he chose to be in jail.


The British probably thought those behind the movement that ultimately resulted in the United States were engaging in hate speech. Be careful not to use a tinted lens to evaluate the world. We all do it. I am not critical of your statement. I am merely pointing out that dissenting points of view throughout history have often met with push-back. Very often, years later, those responsible for the "hate speech" were recognized as being at the root of positive world-changing developments. Imagine the people who dared to engage in "hate speech" against the flat earth and geocentric dogma, voting rights, slavery, etc.

I am disagreeing with you on one point. Being Jewish is a choice, just like being Christian is a choice. I was born into a family, like most, with generations of religious belief. I, however, am an atheist. You don't have to be a Jew. You can have jewish culture in your life, respect it and enjoy it. That does not mean you have no choice but to also be religious. That part is a choice. Just as it is for anyone from any other religion.


> Being Jewish is a choice

Do you think that proponents of Neo-Nazism (etc) really care about the distinction between "born into a Jewish family" and "is a member of the Jewish religion?" I'm not saying that there isn't a distinction, but people engaged in hate usually aren't too interested in nuance.


Probably not. But that's not what I was arguing. Right?

For every example we care to provide there's probably a deviant group more than willing to discriminate against that population and even want to kill them. I think the history of genocides more than proves that point.

Please don't be offended. Jews don't really have a monopoly on being hated or being murdered en masse. Many argue --quite convincingly-- that the Jewish genocide was modeled after the Armenian genocide of 1914/15 (the Nazis copied some of the methods the Turks used on their Armenian population).

In the US, at least, in non-deviant circles, I believe you can be a person and not someone "born into an <X> family". In that context whatever you push in front of people as what defines you is up to you. Believe it or not, as an atheist I have a number of good friends who are Born Again Christians. Every single one of them is a great person. Each one of them chooses to define themselves by their religion differently. One in particular will smash you in the face with it every chance he gets. Not surprisingly he has suffered greatly with employment because, well, this isn't something you wear on your forehead and pester people with at work. The other guys are just guys who happen to privately be BAC's. He is a militant BAC. Choice.


> Being born into a jewish family isn't a choice.

The normal US point of view is that "Jewish" is a religion, not an ethnicity, hence is a choice.

That point of view is not universally shared, of course.


Can you provide your reasoning for saying this is the "normal" point of view?

A cursory search tells me nothing of much use.

Apropos of nothing, I found a Pew poll[0] that says... 21% of atheists believe in a god? Huh?

[0] - http://religions.pewforum.org/pdf/report2religious-landscape... ("Conception of God")


I don't have a citation, sorry, just a general perception of how the issue is typically addressed in conversation and the media...

And again, if you talk to people who self-identify as "Jewish", you get a rather more complex picture.


The difference is that in Europe, free speech is not as expansive as it is in the U.S. Hate speech is defined more broadly east of the pond, and is a crime in itself in many European nations. In contrast, in the U.S. only hateful acts are crimes (speech itself is not enough), though hate speech can be a tort (a private action, i.e., for emotional distress).

The crack was done in cooperation with the Bonnier Group tabloid Expressen, in order to reveal politicians commenting on Swedish hate speech-sites.

Exposing politicians is a time honored practice in all countries and across all ideologies. It's possible that the hacking is still a crime in Sweden, regardless of the motives. No one is defending it--you seem to have done that on your own, as the linked article certainly does not.


Press freedom generally trumps privacy, as it should.


Well yes, because of hate speech and potential violence...


In all our worry about NSA taps, the simple fact is that gravatar and now disqus allow anyone (the NSA, your health insurance company, groups who dislike your group, etc.) to track your blog comments, help desk comments, any comment you make around the net.

Your comments to gay rights groups, anti-gay rights groups, cancer support groups, aids support groups, abortion groups, democratic politics, tea party groups, gun rights groups, doctors offices, anyone using wordpress as a front end group.

Anyone can do this with a simple search engine they create, and apparently we don't care because gravatar was set up by a silicon valley favorite and is now owned by Wordpress, who were informed of this years ago and refused to consider it a privacy leak. And anyway we like them cutesy cartoon avatars.

Anyone can do this with a search engine that maps pages to md5 hashes and vice versa and either a rainbow table of email addresses, or even easier, a list of your customers' email addresses, because let's see if any of our customers have health problems they didn't disclose.
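
A minimal sketch of that lookup, assuming the attacker has already scraped which MD5 avatar hashes appear on which pages. The URLs and addresses are made up, and the normalization (trim and lowercase before hashing) is the Gravatar convention; other services may differ.

```python
import hashlib
from collections import defaultdict


def email_hash(email: str) -> str:
    """MD5 hex digest of the trimmed, lowercased address (Gravatar-style)."""
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()


def build_index(pages: dict[str, list[str]]) -> dict[str, set[str]]:
    """Invert a page -> hashes mapping into hash -> pages."""
    index = defaultdict(set)
    for url, hashes in pages.items():
        for h in hashes:
            index[h].add(url)
    return index


def pages_for_customers(customers, index):
    """Return {email: sorted pages} for every known address whose hash shows up."""
    hits = {}
    for email in customers:
        found = index.get(email_hash(email))
        if found:
            hits[email] = sorted(found)
    return hits


# Hypothetical scraped data: page URL -> avatar hashes seen on that page.
scraped = {
    "https://support-forum.example/thread/1": [email_hash("alice@example.com")],
    "https://politics-blog.example/post/9": [
        email_hash("alice@example.com"),
        email_hash("bob@example.org"),
    ],
}
index = build_index(scraped)
matches = pages_for_customers(["alice@example.com", "carol@example.net"], index)
```

No hash is ever reversed here: the customer list is hashed forward and looked up, which is why the cost is one MD5 per known address.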


Any comments you make under your email address are attributable to that email address. Duh. The whole point of gravatar and disqus is to make it clear that your comments on a bunch of different sites are from the same person.

If you don't want a particular comment associated with your name or email, why would you ever fill in that name or email when commenting?


If I go to comment at a wordpress site it says this:

"Email (required) (Address never made public)"

Leaking the MD5 of my email address into web pages is in fact making my address public.

Hey lmm, duh, when you make a comment under a different name but with the same email address that you think is anonymous at your local hiv testing site, you may not expect that your insurance company can track that down because wordpress has been leaking the md5 of your address all over the place.


>If I go to comment at a wordpress site it says this: "Email (required) (Address never made public)"

So wordpress - not disqus or gravatar (which I'm aware is owned by wordpress) - is lying to you. Let's put the blame in the right place.


No it doesn't. You need to know the email address up front in order to generate the hash.


But the point is that you can easily brute force that, especially if you have a list of people that you suspect may be making such comments and their email addresses.

Saying that your email is kept private by taking its MD5 sum is like expecting that an unsalted MD5 sum for a password hash in a publicly accessible password database will be secure for people with weak, brute-forcible passwords like "1234". You are providing a little bit of obfuscation, but no real security.


From the same person, but not from that email address. Bad idea, but as per the article people expect these posts not to be linkable to their e-mail address. Why? Well, obviously, because that's the way Disqus works "usually".


Really what we should be doing is explaining to people that "foo@example.com" is a valid email address and that they should use it everywhere.

I am foo@bar.com. I don't know who owns that email, but I want to preemptively apologize to them about Disqus comment responses they've received.


The fact that md5 hashes of email addresses can be brute forced is nothing new. This has been pointed out for years, concerning services like gravatar. The only thing which is important, is that Expressen/Researchgruppen believes that they have the right to exploit Disqus' service in order to "make the news".


Why don't/shouldn't they?!

There's a security problem with Disqus.

Apparently it (the general technique) is "old news"

Apparently Disqus doesn't care enough to fix it.

Demonstrating the attack may be the only way to get them to care.


Anything requiring third-party cookies, AND requesting an e-mail address not only stinks of spam-oriented advertising revenue, but also total disregard for user security. Even more telling were the options to sign in with services like Facebook Oauth.

So from the beginning, I think it was always obvious that Disqus had no interest beyond the bare minimum in casually protecting user privacy. This prompted me to avoid ever providing Disqus with any kind of serious e-mail address. Looks like my instincts served me well.


You think Disqus is selling your email address to spammers?


I don't get it, if your email address is so private then why do you share it with 3rd parties? Also, why would your email address be so private if the spam filters are so efficient nowadays, what's the harm in having a public email address? Please enlighten me.


> I don't get it, if your email address is so private then why do you share it with 3rd parties?

How would you use it otherwise? My backyard is private, but I share it with a few 3rd parties. That doesn't mean I intend to share my backyard with the entire world.

There is an element of trust with particular 3rd parties that is being violated. Why is that so hard to understand?


The address of your backyard is not private.


Analogies are terrible. "Imagine X is like Y. Ok, but what about aspect Z of Y? Oh, that doesn't apply to X."


So what is the address of my backyard? Besides being an inane argument, how do you know it is not private?

Unless you run your own mail server, someone knows your email address, so are you trying to define that as not private? Obviously, we are using fuzzy terms about private/public when there is a huge gradient of privacy.


It gets a bit awkward when you comment on racist/anti-immigration websites with your (at least semi public) email while you represent a far right party that officially has a zero tolerance on racism.

The party has its roots in the skinhead/neo-nazi organizations and they have been trying to shake that image problem for quite some time now. To be fair, putting on suits has helped them.


Yeah well, the person has himself to blame. Either be identifiable or anonymous. Don't use private email and expect to be untraceable. That's just mad.


At least some of the emails weren't even private. I saw an example of one of the politicians who got their identity 'leaked'. The email she used for the Disqus comments was listed on her official municipality contact page.


That's the point.

95% of the population has absolutely no clue about how technology works and how it will be used against them, sooner or later.


Why share it with 3rd parties? Because that's the typical method of creating an account and managing it in case you forget your password and they will typically claim that they aren't going to share it with anyone and the cost of finding an anonymous mail that they will accept is quite a bit of time. (I typically do not use my email to sign up for anything.)

Why not public? I guess it's fine if everyone knows it assuming a perfect spam filter (they aren't perfect btw) but I don't want it to be public knowledge what websites I use and what I say on those sites. Non-sinister example: I could be publicly discussing a sexual encounter and just not want the whole world to know (non-pseudonymously) that I did that.


You are absolutely right, we have public email addresses, and also the spam filters. But still, information needs to be protected. There are people who don't want their email addresses public or in some list which is being sold over and over again to marketing companies which then flood your inbox with spam. It is irritating at times even if there are spam filters.


I think the point isn't that the emails were exposed so much as the exposure of the emails allowed the identity of commentors to be revealed.


The article is a bit misleading. The so-called security flaw does not reveal the e-mail address directly, but an MD5 hash of it. Sure it can be cracked, but it doesn't mean that it will get cracked.


Actually, yeah, it will be cracked, by someone.

And e-mail addresses aren't passwords; trying a few hundred variations for each first name for each last name is perfectly feasible and should crack a nice percentage of these hashes.
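
A rough sketch of what "a few variations per name" could look like. The name lists, domains and address patterns below are illustrative assumptions, not anything taken from the article; a real attempt would use much larger dictionaries.

```python
import hashlib
from itertools import product

# Hypothetical local-part patterns; f = first name, l = last name.
PATTERNS = ["{f}.{l}", "{f}{l}", "{f}_{l}", "{f[0]}{l}", "{f}", "{l}.{f}"]


def candidates(first_names, last_names, domains):
    """Yield plausible addresses for every name/domain combination."""
    for f, l, d in product(first_names, last_names, domains):
        for pattern in PATTERNS:
            yield pattern.format(f=f, l=l) + "@" + d


def crack(target_hashes, first_names, last_names, domains):
    """Return {md5_hex: email} for every candidate whose hash is a target."""
    targets = set(target_hashes)
    found = {}
    for email in candidates(first_names, last_names, domains):
        digest = hashlib.md5(email.encode("utf-8")).hexdigest()
        if digest in targets:
            found[digest] = email
    return found


# Made-up example: one leaked hash, tiny dictionaries.
leaked = hashlib.md5(b"annasvensson@example.com").hexdigest()
result = crack(
    [leaked],
    first_names=["anna", "erik"],
    last_names=["svensson", "berg"],
    domains=["example.com", "example.org"],
)
```

The work grows as names x names x domains x patterns, which is small next to a random-string search of the same length.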


My email is firstname@companyname.co.nz (I have a few of these at different companies). I'm fairly confident this isn't going to be cracked any time soon by random MD5 hashing.

(of course, my real name can be extrapolated from my HN username)


Given 10^6 possible first names (that's really generous but, hey, I like my dictionaries to be cosmopolitan in character) and 10^6 domains (again, generous) exhaustive search takes 10^12 hashes. My laptop can do 10^7 in a second. This means you have about 10^5 seconds until your email is broken given that the MD5 hash is divulged. That's roughly 28 hours.

Your call on whether "An adversary can only defeat my security given about a day and a hardware investment of $1,600 2010 dollars" is an acceptable security bound for your users. If it isn't, don't use MD5 for crypto purposes.
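
Spelling out the arithmetic with the figures quoted above (they are the commenter's rough estimates, not measurements): 10^12 hashes at 10^7 per second is 10^5 seconds, a little over a day for the full search.

```python
first_names = 10**6          # assumed dictionary size, from the comment
domains = 10**6              # assumed number of candidate domains
hashes_per_second = 10**7    # assumed laptop MD5 rate

total_hashes = first_names * domains        # every first name at every domain
seconds = total_hashes / hashes_per_second  # worst case: the whole space
hours = seconds / 3600

print(f"{total_hashes:.0e} hashes, {seconds:.0e} s, about {hours:.1f} hours")
```

On average a match would turn up about halfway through, and the GPU rates quoted elsewhere in the thread would shrink the time by a couple of orders of magnitude.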


The rainbow table would just need to include alphanumeric letters + '@' for up to 30 letters. I think your emails are in nearly every rainbow table in existence.


Just the 1-10 character lowercase alphanumeric rainbow table from freerainbowtables.com is 297 GB. Of course, you can generate rainbow tables with various parameters and tradeoffs so it's not trivial to compare them.

Still, I don't think I've ever had a rainbow table that contained plaintexts longer than 12 characters. Are 30+ length tables common these days?


EDIT: “nearly any email rainbow table”, i.e. 1-10 characters cross joined with all domains for a given tld.

You’re correct that brute force with an entropy of 3 per bit would still be too big for rainbow table usage (like 10^15 PB too big).


Using MD5 means that CPU/GPU is cheaper/easier than a rainbow table. Or you can use both. Generate 33.1B hashes/s and start with a rainbow table.


How many *.co.nz's are there? 1,000,000? 10,000? How many first names are there? People can generate millions of MD5's per second nowadays.


They are after a specific list of politicians whose email addresses are probably already known. So there is no security. Hashing, with or without a salt, only protects the population at large, not an individual whose address the attacker already knows.


$500 graphics cards can run a dictionary against a md5 hash at the rate of 700,000,000 per second. That is 0.7 billion per second.


MD5 hashes of emails are very common practice for Gravatar etc. - although it's fairly sucky, I'm assuming this is in the API specifically for things like showing Gravatar images.

I reported a username -> plaintext email vuln to Disqus earlier this year and they were very prompt in patching it. I wouldn't criticize them for this at all, as this is a very common issue across most blog comment systems.

Would be nice to change how Gravatar works, but it's fairly fundamental. I think if you want your email to be private you should probably be registering temporary ones or using the + aliases like gmail offers to avoid these kinds of hash-cracking attacks.


Another solution would be for these services to use something with a greater work factor than MD5. When a typical user can brute force MD5s at a rate of 8.5 billion per second with AMD HD7970 graphics card then it's time to use a different hashing algorithm. Something like scrypt or bcrypt with a larger work factor would make these attacks much harder and more expensive, while leaving the fundamentals of the system the same.
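
To get a feel for what a larger work factor buys, here is a small timing sketch using Python's `hashlib.scrypt`. The fixed salt and the cost parameters (`n=2**14, r=8, p=1`) are illustrative assumptions, not a recommendation, and the ratio will differ from machine to machine.

```python
import hashlib
import time

DEMO_SALT = b"site-wide-demo-salt"  # hypothetical fixed salt so IDs stay stable


def md5_id(email: str) -> str:
    """The cheap identifier under discussion: plain MD5 of the address."""
    return hashlib.md5(email.encode("utf-8")).hexdigest()


def scrypt_id(email: str, salt: bytes = DEMO_SALT) -> str:
    """Same idea with a deliberately expensive hash; n is the work factor."""
    digest = hashlib.scrypt(
        email.encode("utf-8"), salt=salt, n=2**14, r=8, p=1, dklen=32
    )
    return digest.hex()


def seconds_per_call(fn, arg, repeats):
    """Average wall-clock seconds for one call of fn(arg)."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(arg)
    return (time.perf_counter() - start) / repeats


md5_cost = seconds_per_call(md5_id, "bill@example.com", 10_000)
scrypt_cost = seconds_per_call(scrypt_id, "bill@example.com", 5)
print(f"scrypt costs roughly {scrypt_cost / md5_cost:,.0f}x more per guess here")
```

Every guess an attacker makes pays the same cost, but so does every legitimate lookup, and a known address can still be hashed once and searched for.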

http://hashcat.net/oclhashcat/ https://www.tarsnap.com/scrypt.html https://en.wikipedia.org/wiki/Bcrypt


Y'all seem to be forgetting that this isn't random ascii string bruteforcing: you start with a list of known email addresses collected from other places or bought from spammers, then you hash all those, and you see which ones match.


Another reason to not use your real name or email address when commenting across the web.


Or you know just be a decent person and not post things you wouldn't say in person yourself.


Yeah - people have never been persecuted for who they are in person.


Common Sense, The Federalist Papers, Candide ...


Things like... I think gays shouldn't be beaten up and jailed in Russia for engaging in gay "propaganda?" Things that people who are under the threat of being beaten up and jailed for saying fear?


Get 150 million e-mail addresses from Adobe hack. Match vs Disqus users. Profit.


Even better, you can be certain a great number of users reuse their password


True. A percentage of the union of Adobe and Disqus users will use the same password for both services.


But if they haven't changed their password after the Adobe hack then they're already boned, aren't they? How does the Disqus vuln add to that?


You don't want to try 150 million Adobe logins on Disqus. You want to identify which ones to test first.


Maybe I'm being dense this morning... if I were in the Adobe 150M, some criminals would already have my email address, right? How does getting Disqus's hash of it help them out?


Not being dense at all, it's a valid point, but crossing both leaks helps them find out quickly which logins and password combos to try at disqus, and which accounts will be compromised.


By linking your Disqus comments to your e-mail. For example your comments on sexual preference blogs, political blogs etc. Mapping your life, possibly opening up for blackmail.


Well I guess risk profiles vary. I'm certainly more worried about criminals having the access they need to reset my creds with various services, than about them knowing my oddball political opinions. I still don't see how this Disqus issue makes Adobe worse, although admittedly any big list of email addresses (like Adobe) makes this worse.


Maybe you remembered to change your email password, but forgot to change your disqus password.


Surely any hashing would be susceptible?

Even a slower or more "secure" hash wouldn't help much, because I can take your starting known email address and find comments you have made. i.e. I can start with "bill@example.com", slowly hash that to 901e54d1 and then search google for 901e54d1 to find comments you've made.

Speed isn't a big deal if I'm interested in attacking specific subsets of emails. (Which could still be a "large" set in a real world sense.)

As long as the hashing algorithm is known then it would be weak to finding comments made by known authors. If the hashing algorithm is unknown then it falls under security by obscurity.

So is there any way to implement a decentralised pseudonymous but ID-based system where the ID is tied to email but cannot be generated from email (or rather is generated from email but with some added entropy that prevents going in either direction in the future.).


It's possible to prevent this. Disqus could create a service limited to their network that stores a unique id for each registered email address, either hashed together with the email or used instead of the hash. This obviously adds additional computation, latency and storage to their system, so it's far from free, but it is definitely possible to prevent this type of hash lookup.
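
A minimal sketch of the "unique id used instead of the hash" variant, assuming a simple in-memory table. `PseudonymStore` is a hypothetical name and a real service would keep the mapping in a database behind its own network boundary.

```python
import secrets


class PseudonymStore:
    """Hypothetical server-side table mapping each email to a random public ID."""

    def __init__(self):
        self._by_email = {}

    def public_id(self, email: str) -> str:
        """Return the stable random ID for this address, creating it on first use."""
        key = email.strip().lower()
        if key not in self._by_email:
            # 128 random bits: nothing about the email is encoded in the ID.
            self._by_email[key] = secrets.token_hex(16)
        return self._by_email[key]


store = PseudonymStore()
first = store.public_id("bill@example.com")
again = store.public_id(" Bill@Example.com")
```

Because the ID is random rather than derived from the address, someone holding a list of emails has nothing to hash and compare; the cost is the extra lookup and storage mentioned above.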


You could salt the hash twice. One (large) you store, the other you throw away. This way, to compromise a specific account you would have to steal its stored hash. If there was a leak of the hashes you would need to bruteforce all the hashes which were thrown away.


In general, what you're asking about is called a "salted hash". I don't understand enough about Disqus's system to say it would definitely have prevented this vulnerability.


A salted hash only slows down brute-force attacks and dictionary attacks. The salt is still stored with the hash, so you could still eventually match the email address with the hash. Instead of hashing each email address once and comparing it with all of your collected hashes, you'd have to hash each email address using every salt until you found a match.
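
A small sketch of that matching loop, assuming each leaked record carries its salt next to its hash. The record layout, the salt-then-email concatenation and the addresses are all made up for illustration.

```python
import hashlib
import os


def salted_md5(email: str, salt: bytes) -> str:
    """MD5 over salt + email; the salt is stored alongside the digest."""
    return hashlib.md5(salt + email.encode("utf-8")).hexdigest()


def make_record(email: str):
    """Build a (salt, digest) pair the way a hypothetical service might store it."""
    salt = os.urandom(8)
    return salt, salted_md5(email, salt)


def match_known(emails, records):
    """Try every known email against every (salt, digest) record."""
    found = {}
    for email in emails:
        for salt, digest in records:
            if salted_md5(email, salt) == digest:
                found[email] = digest
                break
    return found


# Hypothetical leak: salts are visible, the emails are not.
leaked_records = [make_record(e) for e in ("alice@example.com", "bob@example.org")]
hits = match_known(["bob@example.org", "nobody@example.net"], leaked_records)
```

The work becomes emails x records instead of one hash per email, which slows a bulk sweep but does little for a short list of targets.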


Sure since they're using MD5 then a salt wouldn't solve their problem, especially if the salt were also part of the url. And let's not pretend that the url itself is somehow secret: there are many ways to collect those, particularly if specific users are targeted. Usually when people are this boneheaded about hashes they're trying to save storage space, but I can't imagine that storing a separate random identifier would add significantly to Disqus's storage.


I think encryption would be the correct way, since you don't want people to be able to compute either plaintext from ciphertext or ciphertext from plaintext. You would still have to provide an encryption oracle, which would allow bruteforcing, but that could be rate limited.
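
Python's standard library has no general-purpose cipher, so this sketch swaps in a keyed hash (HMAC-SHA256) rather than encryption. It is not the same thing: it cannot be reversed even with the key. It only illustrates the property that outsiders without the server's secret cannot compute the identifier from an address. `SERVER_KEY` and `keyed_id` are hypothetical names.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; in practice it would live in a key store.
SERVER_KEY = secrets.token_bytes(32)


def keyed_id(email: str, key: bytes = SERVER_KEY) -> str:
    """Public identifier derived from the email and a secret key."""
    message = email.strip().lower().encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


public = keyed_id("bill@example.com")
# Without the key, hashing the address gives something unrelated.
guess = hashlib.sha256(b"bill@example.com").hexdigest()
```

Any public endpoint that computes `keyed_id` for arbitrary input is the oracle mentioned above and would need the rate limiting the comment describes.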


Researchgruppen seems to be in violation of the Disqus terms of service by harvesting personal information and also disclosing this in other mediums.


So what? I doubt they're bothered if Disqus bans them from using their service.


Glad I use separate e-mail addresses for signing up to sites and for personal correspondence.


So, what would suffice to maintain anonymity AND the ability to send the user email notifications?

I'm guessing Hashing (not MD5 though) + Salting + throwing away the salt and bruteforcing it every time you need the plain email (you will lose the option to mass mail your users)

Even then the whole concept of anonymity AND email bound account seems kind of silly. Even if the user uses a secondary email address just for this, he still has to trust the email provider (and if he uses a throwaway, what is the point of collecting it anyway?)

This crack is proof that services that provide a fake sense of anonymity can do a lot of harm.


Old olllddd news, almost every Wordpress blog uses Gravatar, same issue...


So why do you feel compelled to post when you know what you are saying is old news? You just add to the noise and make it off-putting for anyone else to post who actually knows about this event, including your obvious point.


Now I'm confused.. why do you feel compelled to reply? You just add to the noise.


Men = Ego.


Why MD5 the email addresses? If you just need a unique id for a user, why not use a GUID or something that isn't traceable from public information?


Nice. Glad I decided against using Disqus for my site.


Where does one get the dataset of email id's of Adobe users ?


Yawn. A rainbow table can be used to determine str in MD5(str). News at 11…


The news is that Disqus is relying on that.


So does Automattic with gravatars (on nearly 20% of the internet, no less), and pray tell how many other companies that use md5(email) as a unique ID for a reason or another. Duh! What were they thinking!


I don't know what they were thinking. It's wrong, they shouldn't violate their users' privacy like that.



