Hacker News
Two More Cases of Third-Party Facebook App Data Exposure (upguard.com)
157 points by jmagaro88 on April 3, 2019 | 43 comments



Unfortunately, Facebook had a fundamental misunderstanding of how privacy has to work, and their users will be paying for their error for years.

If it's earth-shatteringly bad for your users if their private data is leaked by a third party, you cannot exfiltrate that data to a third party. Full stop. No amount of policy un-leaks data, and "You cannot continue to operate as a Facebook service" is an empty threat the moment it becomes more valuable for the third party to violate the agreement than to continue to operate as a Facebook service.

The takeaway: if you are responsible for user privacy, you must do the computations on the user's data. Have partners ship you the computations they wish to do, vet them, and then ship them results compliant with your users' expectations. Don't hand third-parties a subset of the keys to the kingdom and expect an honor system to preserve user privacy.
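To make the "ship computations, not data" idea concrete, here's a minimal sketch of what such a platform-side API could look like. Everything here (the function names, the `VETTED_COMPUTATIONS` registry, the data shapes) is invented for illustration; it's not how any real Facebook API works.

```python
# Hypothetical sketch: the platform exposes only pre-vetted, named
# computations instead of raw user data. All names are illustrative.

USER_DATA = {
    "alice": {"friends": ["bob", "carol"], "birthday": "1990-05-01"},
    "bob": {"friends": ["alice"], "birthday": "1988-11-12"},
}

# Computations a partner may request, vetted by the platform ahead of time.
VETTED_COMPUTATIONS = {
    # Returns only which of the partner-supplied users are friends,
    # never the full friend list.
    "mutual_users": lambda user, candidates: sorted(
        set(USER_DATA[user]["friends"]) & set(candidates)
    ),
}

def run_partner_query(name, user, **kwargs):
    """Run a vetted computation platform-side; raw data never leaves."""
    if name not in VETTED_COMPUTATIONS:
        raise PermissionError(f"computation {name!r} has not been vetted")
    return VETTED_COMPUTATIONS[name](user, **kwargs)

# A partner app can learn "which of my users are alice's friends"...
print(run_partner_query("mutual_users", "alice", candidates=["bob", "dave"]))
# ...but has no way to request alice's full friend list or birthday.
```

The key property is that the partner only ever sees the vetted result (here, the intersection), so there is no raw friend list for them to leak.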


Have partners ship you the computations they wish to do, vet them, and then ship them results compliant with your users' expectations.

In this case, the user clicked okay on a dialog that said something like "Share my friend list with this application." It would be sane at that point to expect that the application has access to your friend list. The application typically doesn't want to do a "computation", per se, they want to do something like show you your friends that are already using the application, so that you can share things with them and so on.

There are many, many services that share data in this way. iOS and Android share your contact list in a similar way, for example. And those services have the same exact problem, that sometimes third parties leak data. There is no other, better-implemented way for a platform to share data.

In the end, this is a "scandal" because Facebook is getting bad press already for other issues, and people do not really understand the nature of data platforms so they cannot distinguish big problems from small ones.


Facebook used to be much better at locking down third-party exfiltration. But back in the early 2010s, the zeitgeist was against it; there were countless articles in the genre of "I'm a random third party developer, and Facebook is trying to stop me from exfiltrating massive dumps of my user's data! How anticompetitive!" So they decided to start being more open.


Can you provide one or two of those articles?


My favorite is this one, where a guy complained Facebook wouldn't let his extension export all your friends' email addresses.

https://www.zdnet.com/article/facebook-blocks-google-chrome-...


Look for articles like this one:

https://www.adweek.com/digital/facebook-and-zynga-battle-ove...

Many of the initial adopters of the Facebook platform were companies like Zynga that were reliant on specific details of the platform, and got very frustrated by platform changes. Many but not all of the changes that Facebook wanted to make during this period were ones that kept data more private.


Here's an article discussing it: https://gigaom.com/2011/07/06/who-owns-your-social-graph-you...

Look at how angry the commenters there are that Facebook would even consider restricting the access that third parties have to such data.


How would this work in the case of data portability? If Facebook were to be forced to provide an API that allowed users to export all of their data to a competing social network would Facebook be responsible for ensuring that the competitor was using the data responsibly?


Not at all; they'd be responsible for bundling that data in a well-defined format into a blob of some kind that the user can request be exfiltrated (after providing their credentials to authenticate the request). The third-party would then have to digest said blob. Users assume trust of the third-party regarding responsibility for data misuse when they feed the third-party the blob (same as if they'd hand-entered the data via a regular GUI). Google already offers a functional model of this via https://takeout.google.com
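As a rough sketch of the "well-defined blob" approach, here's what a Takeout-style export could look like: the user's records get bundled into a zip of JSON files that only the authenticated user can request. The function name and file layout are made up for illustration, not any real export API.

```python
# Minimal sketch of a Takeout-style export, assuming a simple
# category -> records mapping for the user's data.
import io
import json
import zipfile

def export_user_blob(user_id, records):
    """Package a user's records into an in-memory zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for category, data in records.items():
            zf.writestr(f"{user_id}/{category}.json",
                        json.dumps(data, indent=2))
    return buf.getvalue()

# The third party digests the blob the *user* hands them; the platform
# never talks to the third party directly.
blob = export_user_blob("alice", {"posts": ["hello"], "friends": ["bob"]})
with zipfile.ZipFile(io.BytesIO(blob)) as zf:
    print(zf.namelist())  # ['alice/posts.json', 'alice/friends.json']
```

Because the transfer is user-initiated and user-mediated, responsibility for what happens to the blob afterward clearly rests with the user and the recipient.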

Putting control in the hands of the user is quite different from allowing third parties to exfiltrate data on a user without their consent.

(It is worth noting that this approach is still exploitable: a third party convinces users to cough up their authentication codes, then acts as the user and requests the whole kingdom themselves. But user education on the amount of power handed over when you literally give someone your passwords is a separate issue.)


Say there's a startup that is going to revolutionize date keeping and events and scheduling and all that; for the sake of aping a common naming scheme, call them Calendr[1]. The only drawback is that their security is an afterthought, but they're not promoting that.

So a Facebook user that is friends with you on Facebook says to Calendr, "scan my contacts and generate a calendar that already has my contacts' birthdays and any events they've created on it (one would assume this list would include anything that is shared at the Friends Only and Public tiers) for me."

Three weeks later, Calendr is hacked and all of their data is accessible. A Have I Been Pwned-style service will let you read through the data, and sure enough: fixermark's super-secret event is now publicly viewable as part of this data set. You do not have an account with Calendr and you haven't even heard of it before.

How would you, as a Facebook user, prevent this from happening beyond not creating the event in Facebook? How would Facebook prevent this beyond not providing the data to the third party?

[1] edit: oh geez, there is a Calendr. This has nothing to do with the real Calendr (this is fictitious Calendr).


They may not be able to prevent it without refusing to exfiltrate that data. But then they'd maintain clear responsibility (at the cost of usability) for the user's data. Excellent example though, because it highlights a real joint-ownership problem in data on a social network (the aggrieved fixermark in this case certainly couldn't have stopped his friend from hand-entering the details of the super-secret party into FakeCalendr without consent either; to a certain extent, sharing information always implies trusting the recipient to store that information responsibly).

Unfortunately, privacy / usability is the tradeoff. Facebook had clear incentives to simplify usability at the cost of privacy. But as a result, these breaches continued to happen.

(I use the past tense here because I don't know what their app ecosystem looks like now. When I was using it, it was extremely easy to do a full friends-of-friends data exfiltration, with the only guard against it being "Don't do that and then dump it publicly for all to see").


Yep, now if you have N friends your data might be in the hands of N companies. You'd need a GDPR-like privacy framework with audits and even then the risk of mistakes is enormous.

(Note that when people talk about data portability they're really talking about federation since social networking can't work otherwise. Non-social data is a little easier.)


Facebook has a Data Abuse Bounty program where they pay for reporting third-party data leaks like these: https://www.facebook.com/data-abuse/faq/


Between 2007 and 2009 it was the wild west for Facebook apps. A gift app that you could write with about 100 lines of code could reach 10 million users in 2 days. More complex apps could do better. That was the most amazing part.

At that time Facebook's API was pretty much open and you could get everything. It was an experiment, and Mark Zuckerberg had a lot of hope in what people could do with that data to add value for users. I don't doubt that he was doing it with good intentions. But he was naive...

Unfortunately, most of the apps were abusing all the channels that Facebook was giving them to get more users and milk money with ads and micro-payments (e.g. through OfferPal Media, now Tapjoy).

During that time I was pretty surprised how much info people were giving away with a click-through. Even on the main Facebook product people were posting all kinds of stuff, including stupid things they were doing. It really seemed that people were becoming more open and it was the beginning of a new era for privacy (or lack thereof).

Facebook realized pretty quickly what apps were doing and started adding more granular permissions. Eventually Facebook limited more and more API access, until 2011/2012, when the user-generation gold mine was pretty much gone. Again, Facebook has always been working to fix the experience for their users and also to make clear that those were 3rd-party apps. But people did not really care.

There have probably been hundreds of thousands of apps that had access to "sensitive" user data. According to Facebook's Terms of Service, data could not be stored for more than a certain amount of time. But nothing technically prevented people from storing that data forever...

And here we are...


Anecdata: A couple of years ago, I was at one of the very first (possibly the only) FB Connect meetings here in Dallas.

A couple of local startups were talking about how to leverage the "login with facebook" button. It was a big thing...

Most people I talked to told me: "The very first thing I do is save all the emails of their friends," or stuff like that.

So yeah, this was years ago. I'm failing to see how this is a surprise at all.


Since there are many anecdotal reports of Facebook failing to delete the profile history data even after closing your account, is there a better way people should be scrubbing their data? Some kind of tool, perhaps, that edits all of your posts and replaces them with scrambled / gibberish text?


Facebook sure as hell won't create such a tool. If you do, open source it and I shall use it for sure.


They could easily be keeping a history of everything, so while this would affect the final version of the item, it wouldn't delete the history. The data would still be there for them to mine, and to be stolen.


This other article that got posted today might explain why this happened in more detail: https://medium.com/@six4three/deceit-by-design-zucks-dirty-s...

Seems to suggest that FB platform APIs were designed to not share any privacy metadata with devs. Maybe not the same as how apps like At The Pool stored that data, but it might explain the firehose of data that FB gave devs; and now they will point the finger and say it was the devs' fault for these leaks/breaches. Food for thought.


Developers would have to intentionally write extra code to respect privacy metadata, so it seems unlikely that would have made a difference.


Criminals seem unlikely to follow laws, so why bother having them? Of course devs would need to intentionally follow the privacy wishes of users, but without metadata even responsible developers who want to can't.
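To illustrate what "privacy metadata" could enable, here's a hypothetical sketch: each record carries its audience setting, so a responsible developer can filter before persisting anything. The field names are invented for illustration; per the article, the real platform APIs did not ship such metadata alongside the data, which is exactly the problem.

```python
# Hypothetical records with an invented "audience" metadata field.
records = [
    {"event": "birthday party", "audience": "friends_only"},
    {"event": "charity 5k", "audience": "public"},
    {"event": "surprise party", "audience": "only_me"},
]

def storable(records, allowed=("public",)):
    """Keep only records whose audience the app is allowed to persist."""
    return [r for r in records if r["audience"] in allowed]

print(storable(records))  # [{'event': 'charity 5k', 'audience': 'public'}]
```

Without the `audience` field, even this trivial filter is impossible to write, and every record looks equally safe to store.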

I guess my question for you, considering it looks like you worked with devs at FB, is this article regarding FB platform design accurate? That's the most shocking thing to me that this article conveys, that even if you wanted to ensure data privacy as a dev, you couldn't unless you built a custom tool. I'd be pretty surprised if most (or any) would.

Curious on your thoughts.


Oh jeez. So you think that medium article is accurate? It would be pretty nuts if what they are saying is true. Makes something like Cambridge Analytica and whatever happened today with At The Pool be a question of "when" and not "if" when it comes to the leaking of FB user data.


Wait, how the heck did "At the Pool" get plaintext fb passwords?


Well, the 2nd paragraph explains that: "The passwords are presumably for the “At the Pool” app rather than for the user’s Facebook account, but would put users at risk who have reused the same password across accounts."


They did not. I think those are their own passwords, if the user used the app directly without FB login.


Not to discount the possibility that the article could be incorrect about this, but it makes the claim quite unambiguously: "it contains plaintext (i.e. unprotected) Facebook passwords for 22,000 users"


They have updated the article to say

> it contains plaintext (i.e. unprotected) passwords for 22,000 users

and

> The passwords are presumably for the “At the Pool” app rather than for the user’s Facebook account, but would put users at risk who have reused the same password across accounts.


Thanks for clearing that up; it didn't say that when I read it. Indeed, it originally said "Facebook passwords".


I don’t really get it. Isn’t the opposite of this (restricting third party developers) exactly what people are furious at Twitter over, for killing Tweetbot etc.?


One hell of a clickbait headline, since it is more fallout from the previous data-handling issues and not a further screwup on Facebook's part.

Not to downplay the issue... but it's clearly written as clickbait.


We've changed the URL to the study the article is reporting on, which also provides a cromulent title.


Not clickbait in the slightest. This is not about Facebook per se but about the persistence of their shared data - my information - once it's made public. HIPAA, by comparison, has all sorts of statements about PII and business associates. But apparently FB can share with whoever has a pulse, and I can find out about it later via Shodan.

I wrote this even with the original link and version of the headline.


"Millions of Facebook Records Found on Amazon Cloud Servers" was the original headline.

With that headline, the first thing you would likely think is that Facebook was using AWS and left some data open somewhere. It 100% implied Facebook was doing something more wrong now, rather than that the companies that already had the data from the previous issues were not handling it correctly.

Yes this news is still notable. But the headline gave the wrong impression and was banking on the already bad attitude towards Facebook.


Eh, I don’t really care if it is a failure on the part of Facebook engineers or a failure of Facebook data policy that allowed other engineers to post data about me in an insecure manner. Seems like splitting hairs here.


In this case I think it is a failure on the part of a third-party application developer, rather than any failure on Facebook's part.


I still think this is fundamentally Facebook’s problem. For example, I don’t care whether it’s a failure of the payments ecosystem or of my bank if using a new payment technology opens me up to fraud that drains my bank account; I just won’t use that new technology anymore. Similarly, I don’t care whose fault it is if using Facebook leaks information about me that I didn’t realize was being shared and didn’t want used in the ways it has been. I will just not use Facebook anymore.


I swear, every HN article, you get 10% of the comments are about the article being discussed, and the other 90% are people quibbling over the headline.

Ok, that's an exaggeration.

And when the comments are good, they are really good. Makes the entire HN experience worthwhile.


So sorry? I mentioned Shodan as a way of tying it to other leaks of negligence, and why I thought this was relevant and not just a tossaway clickbait article. I think my disagreement is deeper in nature than semantics on the headline. I even dropped in HIPAA as a model for regulating shared private info, as gross as it may be to think about in a regulatory sense.


Titles R Hard.



Thanks. We've changed the URL from https://www.bloomberg.com/news/articles/2019-04-03/millions-..., which appears to be a summary of that.


Any way to access the full article anywhere? It shows to me as "more information available on the Bloomberg Terminal"


The link has since been updated to be the original UpGuard research source.



