> I spent ten years on a project to build a free, public, highly advanced search engine for social network data. I obtained special permission from Facebook to index 420 million public profiles. When Facebook reneged on the deal and tried to destroy my business, I spent years of my life and a lot of money on legal costs and eventually obtained a settlement. I have continued to operate Profile engine for several years despite it not making money because it stands between Facebook and a monopoly over social data and because it helps educate people not to trust Facebook.
It seems like his main motive is to ruin Facebook — ostensibly by making it obvious to the world the dangers of FB data. And he’s doing this by releasing the accessible data — including photos — of every user who set their profile public.
The PR harm to Facebook is obvious. I’m not so sure the millions of people naive enough to not tighten their privacy settings will be completely understanding. Even if there is no legal danger, this doesn’t seem well thought out in terms of consequences.
Edit: A Quartz article from 2014, in which the service is described as “spammy”.
> Edit: A Quartz article from 2014, in which the service is described as “spammy”.
Yeah, seems pretty sleazy. "I didn't get as rich as I wanted, so I'm going to release all this information about people who never wanted their information to be distributed this way, and I'm going to release it in a way that is uncontrollable."
Claydon writes: "I could have sold this data to some data broker for a lot of money and it would have been used by those with money for marketing or political purposes rather than freely available for the public good. Instead I donated it for free to the Internet Archive."
How is it any worse than what the Internet Archive does normally when it scrapes sites itself? Most sites aren't able to stop scrapers the way FB could.
Internet Archive would be sleazy if it started out by selling data before becoming what it is now? I don't see that the history of the company matters.
tbh I'm not crazy about some of the things the Internet Archive does, either.
What bugs me about Profile Engine is that there are probably a lot of people who realized their mistake of leaving their profiles wide open on Facebook, and tightened up the restrictions, whose details might still be wide open in this dataset without them even realizing.
The GDPR purports to apply to any EU person, no matter where they are. That's one of the many problems with it.
However, it is already the case that the EU casts a wide net. blekko got a demand for info from the EU regarding Google's dominance of the search space in relation to mobile apps. Most of the questions were about opinions, not facts. After consulting with our lawyers, they said that the small amount of advertising income we had from the EU was enough that the EU could demand that we answer, but we could ignore the first one and see if they asked again more firmly.
Doesn't this become the Internet Archive's problem, and not the creator's? All the Internet Archive would have to do is set up a way for people to remove their data from the archive.
Then how was Roman Seleznev arrested and charged under US laws even though he never visited the US, or even a country with an extradition treaty with the US?
Because small, weak countries like the Maldives will kidnap and send anyone, anywhere, if the country asking for the kidnapping victim is wealthy and/or powerful enough. A much more withering indictment could be made against New Zealand for their attempted extradition of Kim Dotcom to the USA.
Exactly. People tend to counter with "well, the US did it". That's your countries' fault, people. I'm sorry, and as a US citizen I work on these things, but please stop blaming us for your weak governments. We have our own corruption problems, and if you think I mean the outsider who just wrecked two political dynasties and our MSM, then I politely suggest you figure out who owns your news sources.
Very interesting. For those curious, this covers the period where Facebook grew from ~50m to ~500m monthly users (depending on what months are included). Some selected events from Facebook's history in this era:
2007/01 m.facebook.com launched
2007/05 Facebook Platform launched
2007/11 Facebook removes "is" from status updates
2008/06 Facebook settled with the Winklevii
2008/11 Facebook Credits launched
2009/02 The like button is added [1]
2009/09 Facebook announces they are cash flow positive
2009/09 Facebook launches @-tagging friends
2010/06 Comments now have like buttons
2010/10 Fincher's movie The Social Network is released
I remember having a lot of fun writing status updates that began with ’is’ back in the day. I can’t believe the last time was more than ten years ago...
I wonder when the "how you met" which used to generate all kinds of absurdist responses disappeared, perhaps to encourage more friending between people that actually had no clue how or if they'd met.
And no Facebook timeline should fail to mention when they removed the Top Gun quotes from the bottom of the page.
> We sued Facebook, fought hard in a David and Goliath battle and won a good settlement. One day, maybe we'll have time to tell the whole story - you'd be utterly shocked what goes on inside Facebook - what you've already heard is just the tip of the iceberg.
FWIW I haven’t seen any sign of collaboration by Internet Archive, either on their site or on their Twitter. Anyone can upload what they want to IA; it’s no indication of IA endorsement or of copyright status.
Edit: Sorry if I was unclear. Anyone can create an account and create their own file archives. If you poke around enough you can find old movies and books that are still under copyright. I assume IA has to follow the DMCA.
1. I have seen "The item is not available due to issues with the item's content" many times while poking around. It's the Internet Archive's version of SIGSEGV or "Bad command or file name." This is due to DMCA mostly.
A for-profit company makes a copy of user data, fights Facebook’s requests to delete that data, tries to monetise it, and then, when that fails (possibly because of incoming privacy legislation), freely distributes it as a data dump.
<quote>
"...
We have donated the complete Profile Engine database to the Internet Archive with the
current exclusion of the following sensitive fields:
Email address
Facebook user ID number
Facebook username
Surname
Profile Engine login password hash
...
"
</quote>
and this:
<quote>
"...
What if this data is abused?
This data has already been publicly available, first from Facebook and then on many search
engines (including Profile Engine) for up to 10 years, with the consent of the person
who entered their information on Facebook. Anyone who wanted to misuse this information
has probably already had access to it and already saved what they want
...
"
</quote>
So the main reason is probably that he realized there is no way to make money with this, as anybody who wanted to (mis)use the data already had their own copy of it.
Since at least today, the items have been removed from both the linked page and archive.org: “The item is not available due to issues with the item's content.”
Fortunately, nothing vanishes permanently in today’s web, since we have this wonderful thing called archive.org… oh, wait…
I can’t read past the paywall. But you read a newspaper article in which someone (who? A FB official? An academic?) claims FB can derive something from user data? That’s something literally anyone can try to do. Doesn’t have to involve FB officially.
"Prior to April of 2008, Plaintiffs had written the first Survey, Petition, Polling, Quizzes and IQ Test applications to appear on Facebook and ranked as one of the largest Facebook application developers in the world."
https://www.patreon.com/profileengine
https://qz.com/279940/meet-profile-engine-the-spammy-faceboo...