How to convert email addresses into name, age, ethnicity, sexual orientation (maxklein.posterous.com)
144 points by maxklein on Jan 3, 2010 | 64 comments



There's a little-known world of commercial services like Rapleaf that give marketers information from a list of email addresses. I wrote a test service to show you what information's available on your address:

http://petewarden.typepad.com/searchbrowser/2009/12/what-can...

The scary thing is that if they can get your full name and rough location, it's then often possible to get a full address (see whitepages.com for a consumer-facing example). That then ties into other datasets that have information on income, occupation, marital status, etc for every household in the US.


I've just released the code for this as open source:

http://petewarden.typepad.com/searchbrowser/2010/01/how-to-f...


It didn't know anything about any address I threw at it, so I feel pretty secure. And I'm as public as it probably gets, I make absolutely no effort to hide my identity or whereabouts.


The address j@ww.com you mention in your profile gives a hit for an Amazon account in Kansas:

http://web.mailana.com/labs/findbyemail/index.php?email=j%40...

Doubt that's you though, probably just an MD5 clash, since it's a long way from the Netherlands?

I only see one other check in my access log from the same IP that checked for your address about an hour ago; mishu<snip>.com. That did find a flickr account, though it appears to be private, so not much info there.
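For anyone wondering what an "MD5 clash" would even mean here: the remark above suggests the lookup keys on an MD5 digest of the address rather than the raw address. That is an assumption on my part, as is the normalization step (trim and lowercase) and the function name `email_digest`; none of it is confirmed as how the service works. A minimal sketch:

```python
import hashlib


def email_digest(address: str) -> str:
    """Return the hex MD5 digest of a normalized email address.

    Hypothetical helper: trimming and lowercasing before hashing is an
    assumed convention, not something the service above is known to do.
    """
    normalized = address.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Two spellings of the same address map to the same 32-character key.
    print(email_digest("  Someone@Example.com "))
    print(email_digest("someone@example.com"))
```

Two different addresses producing the same full 128-bit digest by accident is very unlikely, so a wrong hit could just as well come from a shortened key or stale data on the other end; that part is speculation.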


> The scary thing is ...

Sorry, I remain not scared.


What could possibly go wrong when there's an easily accessible database of detailed information on every individual? Move along, nothing to see here.


I assume you were ironic. Anyway, what could (and did) go wrong is the systematic extermination of a particular subset of people in this database. (Sorry for the Godwin point reached so fast.)


Yeah, really this is NOT anything new. Direct Marketing service companies have been providing this for years. Back in the mid 1990s we were sending one such company reels of 1/2" tape with names and addresses, and getting back reports of age/income etc. distribution.


Agreed; speaking personally anyone willing to spend the 10 minutes with a whois database service, my email address and a little digging on my sites could find out pretty much every address I've ever lived at.

(bonus points if someone does it and gets all 3 :))

But I know that data exists - and for now I am happy for people to discover it about me.

Speaking for non-techy types, though, it is something to be concerned about. Less so because of marketers than because of stalkers, fraudsters, etc. It's worth raising these issues gently to try and encourage people to understand what data is accessible about them - and ensure they're comfortable with that.


Simple strategy .. but so borderline ethical.


I've been thinking about this since I discovered it - is it not facebook being unethical here? They give full access to the personal info of a person given their email address. Their new 'privacy' terms are basically about nobody having any privacy.

Small and agile businesses like mine will capitalise quickly on this info, but sooner or later, big companies like coca-cola will also see nothing wrong with storing public information that a user has 'chosen' to make public.


If the question is "Am I being unethical?" and your response is, "But isn't the real wrong being done by someone else?", then the answer is "Yes, you are".


I don't believe Facebook made this decision lightly. I'd like to believe that if they found that by releasing this information it placed their users in jeopardy, they wouldn't have done it. Call me naive if you want.

I didn't pay much attention to the whole privacy controls thing. I assume that I am just like millions of other people who just clicked the "Close" button quickly so I could check my wall posts. With that said, I basically operate under the assumption that whatever information I put on that site, it's going to be broadcast out to the world. Just like I know that whatever comments I post here on HN can be searched on Google and read by whomever. I think most people in my generation know this.


Max, I thought you would have given it a lot of thought and so did I, that is why I called it borderline. I am with you when you say that these persons have opted to allow their information to go public and that Facebook effectively has let its users down. Anyway running a business is tough and marketing at some point in time was compared with guerilla warfare:) (now is JavaScript!)


Is it really unethical? Everybody knows that part of their facebook is viewable to the world, and they can change it and make it private if they want.

I don't think this tactic makes spamming any more or less ethical.


This is where our hacker stupidity/blindness sets in.

'Everybody' in your sentence actually means 'computer programmers/tech savvy people/criminals'.

And this is why Facebook's move is so distressing. My Mum and Dad don't know this. My 'normal' friends don't. It is a disaster waiting to happen.


> 'Everybody' in your sentence actually means 'computer programmers/tech savvy people/criminals'.

Considering the overwhelming, and increasing, importance of information technology in basically every aspect of our lives, I feel less sympathy every year for people who are not "tech savvy". Now sure, cue calls of arrogant elitism from the privileged classes, etc. But tell me, where does it end? At what point do we say that people should learn the basics about the things they use every day, and that might harm them?

We teach kids that electricity is something to be careful about. We require drivers to pass tests. And yet the internet is, or soon will be, the single most important thing on earth and yet I know people who could not tell you what HTTP or DNS stands for or where "internet" even comes from. We need to start teaching this shit in schools. It is becoming basic required knowledge to function in modern society and should form part of standardised education. Ignorance is not a defence in law, why should it be here?


> But tell me, where does it end? At what point do we say that people should learn the basics about the things they use every day, and that might harm them?

Agreed; but none of these sites tell you, in a clear, concise and understandable way what information is available and at what level. Facebook is fairly good - but you still have to go digging.


Good point, but people don't even seem to be thinking enough to look for such a document. They should be intuitively aware of the risks and possibilities associated with companies holding this kind of information.

I didn't need to read Facebook's privacy agreements to know that there was no way in hell I was going to hand them my entire social graph on a platter just in exchange for a free blog and image hosting. Just the certain knowledge that law enforcement will have easy, reliable access to a list of all my friends, with contact information no less, is a total showstopper. Maybe I'm paranoid but I don't understand why more people aren't considering these things.


> Maybe I'm paranoid but I don't understand why more people aren't considering these things.

By today's standards, you are paranoid. By reasonable standards, however, you are just informed. The internet has been mainstream for 10, maybe 15 years. This is not enough for normal people to learn about it. This is not enough for parents to teach kids.

People are learning, however. See my aunt, while refusing to even use a computer, likes to "spy" on some of her acquaintances by asking her daughter to check the relevant Facebook profile.

I am confident that people will learn. They will make the distinction between private and public. They will see how massive centralization of data (Facebook, Gmail) could be used. I think, however, that this will take a long time. Maybe another 10 or 15 years.


I wouldn't go so far as to say everyone knows that part of their Facebook profile is viewable to the world, but one should still assume that any information posted online is public, even if said person has some quaint notion of privacy.

Simple rule: if you don't want it made public, don't put it on the internet.

As for ethics, I don't find using information gained through this method unethical if it's used in aggregate. If, as the article suggests, you use it to find out that the majority of your users are gay, you might be able to fine tune your site to better serve their needs. Or it could reveal to you that your site is too niche and make you generalize it better. If you use it to cold contact random people using their personal information gained surreptitiously, then your ethics could certainly be questioned.
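To make "used in aggregate" concrete, here is a rough sketch of reporting only group counts and folding small groups together so that no individual stands out. The record layout, the `aggregate_counts` name, and the threshold of 5 are all made up for illustration; it says nothing about how the article's author or any service actually handles the data.

```python
from collections import Counter


def aggregate_counts(records, field, min_group_size=5):
    """Count values of `field`, folding groups below the threshold into 'other'.

    Hypothetical helper: `records` is assumed to be a list of dicts, and the
    threshold is an arbitrary illustrative choice.
    """
    counts = Counter(r[field] for r in records if r.get(field) is not None)
    reported = {}
    small = 0
    for value, n in counts.items():
        if n >= min_group_size:
            reported[value] = n
        else:
            small += n  # too few people to report as a separate group
    if small:
        reported["other"] = reported.get("other", 0) + small
    return reported


if __name__ == "__main__":
    sample = [{"region": "A"}] * 6 + [{"region": "B"}] * 2 + [{}]
    print(aggregate_counts(sample, "region"))
```

The point of the threshold is that a count of 1 or 2 is barely aggregate at all; records with the field missing are simply skipped.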


> Everybody knows that part of their facebook is viewable to the world

No, that was a CHANGE in Facebook's policies after at least 200 million people had signed up. Some of those things can no longer be changed to be private.


>Some of those things can no longer be changed to be private.

Which things?


You cannot make your friends list private from your friends or friends of friends anymore. If you had this turned on, it was switched off.


Someone else can check the ToS and privacy notice, but if I remember correctly sex and religion went to world-viewable by default if they are specified on the profile at all.

After edit: "The Ugly: Information That You Used to Control Is Now Treated as 'Publicly Available,' and You Can't Opt Out of The 'Sharing' of Your Information with Facebook Apps"

http://www.eff.org/deeplinks/2009/12/facebooks-new-privacy-c...


>Someone else can check the ToS and privacy notice, but if I remember correctly sex and religion went to world-viewable by default if they are specified on the profile at all.

I know some of the defaults changed, but that's not what I asked.


What's the unethical component to this? To me, it all depends on what you do with this information. If you use the demographic info simply to understand your users and improve the service for them, then why is that bad? Perhaps you realize after looking at this demographic info that your assumptions about your userbase are wrong and you need to adjust your offerings or cater to a minority that you neglected. On the other hand, if you use that info to discriminate, then ya, that would be unethical.

And if you think giving this list out to MTurkers is unethical, then I have to ask why as well. I think you have to ask yourself: by giving away this information, do you open up your users to vulnerability? Would you be enabling unethical behavior on the part of someone else, thereby making you guilty by association?

For the record, I downvoted you because with a subject this sensitive, you should explain your position...not just make a quick, semi-accusatory statement. We need more comments with substance on HN.


It is only unethical if you assume that most people don't understand Facebook and privacy controls.


Think about this from the perspective of the user though. When I go on Facebook, I expect that whatever information I put on the site is public, regardless of privacy controls (btw, I didn't change any privacy settings when they made a fuss about it because I didn't care/ didn't understand what the difference was from before). If people don't want this information broadcasted, why post it in the first place?

As an example, consider the "relationship status" feature. I don't know about you, but for most of my friends it was a running joke that you weren't "official" until you listed your status on your profile. So I think most people understand that their information is public.


which they've done a reasonable job of promoting.

A larger and larger number of my non-techy friends are locking up their Facebook accounts.

Whether this is just because the default settings have changed to be more restrictive I don't know (when they did the updates mine tried to get me to swap from "fully open" to "friends only" so I am not sure their algorithm for suggested settings is very good)


Pretty much every service indexes you by email address. If you can get someone's primary email you can track them down on all these sites; sure, Facebook does a particularly bad job of keeping certain personal data private, but it's not alone :)


With the recent FB changes, I always thought that marketers getting your details through this method was the least of your worries - I figure identity theft is the bigger worry!


Think about it rationally. There are 300 million facebook users. How many people are out there who want to steal your identity? What are the odds of those people actually stumbling upon YOUR profile? What is the damage that they will do, and how much of that damage are you going to be liable for?

Worrying about identity theft is as rational as worrying about getting your eyes gouged out by a crow - yes, it can happen - but the fact of the matter is that it's not very likely to happen.


I disagree ... that's the equivalent of say leaving your front door open and not worrying about being robbed because there are 300 million doors in the city.

Sure you are unlikely to be targeted specifically, but once someone has decided to commit identity theft, you don't want to be the easy target.

And 10 FB users or 300 million is kinda irrelevant with all the automated tools/scripts that are available these days - searching for the 1 vulnerable person out of 300 million is a relatively trivial task. They will 'stumble' upon your profile not because they randomly picked you from 300 million, but because they ran a script that identified you as an easy target.


There are companies who make money by scaring you about identity theft. But personally, I just don't think it is a valid concern. Stealing your identity is something that has always been possible, but only nowadays is there some type of hullabaloo about it. I just cannot see it happening so often and being such a big problem for me to bother about it.

But it's just my opinion, I don't actually know. Maybe one day when I'm being carted away for ordering a million grandma porn videos without paying, I will realise the error of my ways...


Since there's money to be made from stealing someone's identity, then the easier it gets to do it, the more often it'll happen IMO. Services like this are lowering the barrier of entry.

There's a reason companies are trying to scare you about identity theft. They're in this market because they see it as a growth market. They believe identity theft will increase and they're targeting it with their services and products. I agree that some will use scare tactics to get us even more wound up over it and make some extra $, maybe because they've already over saturated the market, however I definitely think it's going to become a sizable problem.


An awesome start up allowing you to do exactly this and more is http://flowtown.com. Their main feature is allowing you to convert email address into social profiles, you can really learn a lot about a person by what they post online - a great way to learn your customer base online.


Thank you for the love Matt ;-)

Businesses that care about their customers / users / clients are the ones that will win in 2010 and beyond. I want to be a part of this future and it's the basis for why @danmartell and I founded Flowtown.


Businesses like that "care" about their customers like ranchers "care" about cattle.

I don't suppose you're going to publish a list of your clients? Maybe for a fee?


I can vouch for Ethan and say he has personally emailed me a few times to let me know how things are going (with regards to some of my suggestions) and I can tell you he genuinely cares.

I mean, I don't know him outside of a few email transactions but in any case, he either truly cares or is really good at faking his authenticity, I will go with the former. This would be the reason I recommended his service and the reason I am trying to utilize it at my current place of employment. (head office is not very quick on approvals)


Do you store emails entered into your demo?


Yes, if you search an email address in Flowtown the email and the results we discover are stored in your account.


What's your opt-out policy for people you "discover" information about?


Update from a reddit comment: Use this url - http://www.facebook.com/search/?ref=ffs&q=name@domain.co... and screenscrape for even more spammy goodness.


The other service that does something like this is rapleaf.com:

http://www.rapleaf.com/downloads/sample_free_screening_repor...

I'm not very trusting of them after that scam email thing they did a while back (basically linked all your social accounts together with your email, and then emailed you and told you someone was checking up on you, and of course you were alarmed that the account for Friendster was linked to Flickr and all your other services).

I'd be loath to give rapleaf a bulk upload of data, as I wouldn't trust them not to resell the email addresses. Still, I could see how the information could be useful (e.g. for advertising buys) even if only presented in aggregate.


"So you have somehow begged, borrowed or stolen an email list of 1000 users who you believe are interested in your new service."

If you had to scrounge around to find my name and other details, or otherwise do something besides ask me, I'm not interested, just so you know.

If I ever get an email with my name from a company I've never given information to, I look to see any indication of how they got it - and then delete that email.

I'm sure the people whose eyes are lighting up at this idea don't care that some of their targets won't like this bit of cyber-stalking. I can only hope those who use it get more resistance out of it than responses.


The "evil" in this method is that you will never know the company has your email, because they will get your info off your facebook profile without having to contact you. Facebook will give the info to anyone who has your email and they won't notify you.


The disrespect and lack of ethics is not limited to Facebook by this behavior. The fact that Facebook is insecure does not make exploiting Facebook's insecurity any more ethical or proper.


The point of this service is not to contact you. If they already have the e-mails, then they won't need this service; they will e-mail/spam you anyway. It's in place because...

> Such information can help you correct course before you are too invested in a particular idea you have.


I'm quite aware. The purpose is to get information about me through exploiting a service. You can put it in a little targeted spam to me to try to get past spam filters and get a favorable reaction.

It's snooping, and it's reprehensible.


I wouldn't call it exploiting; I'd call it 'harnessing public resources', since most of the information collected through this service from Facebook is supposedly public.

Also, how can you have a 'favorable reaction' to spam given that...

> If I ever get an email with my name from a company I've never given information to, I... delete that email.

I also agree that this is snooping. However, this is just a tool, and should be used responsibly. Abuse would make it reprehensible.


Public resources are what I put on my blogs or web sites or what I mark to be public. As has been heavily discussed in this thread, Facebook has changed policies to make certain information public no matter what users choose in their privacy settings.

"how can you have a 'favorable reaction' to spam"

The intent is for people to lower their guard when they read email that's directly addressed to their name and seems to display knowledge of them.

"Abuse would make it reprehensible."

I think its abuse is being promoted, here. I mean, really, reread the blog post.


In the UK, I'm sure the Data Protection Act will kick in. It doesn't prohibit you from storing personally identifying data, but it puts restrictions on what you can do with it and how to interact with it.

IANAL, so take this comment as a simple "be careful with the law" that applies to where you operate.


It's probably illegal in every EU country which implements the 95/46/EC directive (http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:...)


That text is pretty long, can you summarize the parts relevant to this for us?


Points 28, 38, 39 may apply here. I haven't studied this directive itself, only its Spanish implementation (IANAL, but if you work with personal data you need to know how to comply). Under that law, users should be asked for explicit permission unless Facebook had stated to them beforehand, at the time the data was collected, that it could transfer their data to a third party for a particular purpose. In that case, there should be a signed contract between both parties (Facebook and you), which states the purpose and duration of the treatment of the data. After that treatment, the third party should destroy the data since it's not authorized to use it anymore.

Anyway, as everyone here knows, it's pretty hard to enforce this kind of law.


Well, there is really nothing to enforce when you agree to Facebook's TOS and Privacy Policy upon signing up. Facebook has never forced you to sign up.

Excerpt from privacy policy: Certain categories of information such as your name, profile photo, list of friends and pages you are a fan of, gender, geographic region, and networks you belong to are considered publicly available, and therefore do not have privacy settings. You can limit the ability of others to find this information on third party search engines through your search privacy settings.


As far as I can tell this is the "trick" that is the basis for Flowtown's business.

http://www.flowtown.com/


Being completely bootstrapped we've been focused on revenue since day 1 and our "trick" of turning emails into social profiles is our minimum monetizable product.

The current iteration of Flowtown is nowhere close to our final vision: helping businesses scale caring.

We launched with an MVP that had even fewer features than today's iteration (product is just 9 weeks old).


Well, ok. This trick is the basis for the current version of your product, or perhaps "current business."

Is that fair? Or are you doing something other than, say, scraping http://www.facebook.com/search/?ref=ffs&q=farmerje@uchic... when I type in my email address? (At least WRT Facebook; other networks would naturally have different tactics.)

You don't need to be defensive. Every network is bootstrapped off another network. If you can get away with it, get away with it for as long as possible, but make sure you have an escape plan.

I assume that's what you're doing. Am I wrong?

That's not meant as a challenge, BTW. If I misunderstand your current product then the record should be corrected.


I'm not sure "trick" is a fair assessment, why do you think it's a trick? We don't scrape Facebook for data, but use a variety of data providers for most of our social information.


Sorry. Your original response made it sound like that's what you did do.

I knew, e.g., Facebook lets you search by email address to find a friend if you know their email. Since the data on Flowtown more-or-less matched that data I had assumed that was the source of the data.

You'd do the same for the other social networks, plus perhaps get data from Rapleaf and other social profile companies.

If you're saying that's not the case, I can't argue with that and apologize for implying otherwise.

As to whether it's a trick or not, what difference does it make? You can call it a tactic if you'd like.


Ah, ok, totally my bad. I had the wrong context around "trick": I thought you were referring to our business and not a potential method we used to uncover social data.

On a totally unrelated note, Cassie says "Hi" - I was at her place for one of her roommates' birthdays and I was on his computer responding to this thread and she's like OMG jfarmer is that who you're talking with... such a small world.


The part that scares me the most is that he is willing to go around giving the list to anyone.



