Ask HN: Ads triggered by WhatsApp “end to end encrypted” messages?

dan-robertson · on Sept 23, 2022

I think you partly may just be biased by happening to notice ads more if they fit the topic you sent to your wife or being lenient in deciding if ads meet the categorization. And if you dwell on an ad because it seems to match, you may get more similar ads.

Here is how I think you could design a more robust (but less fun) experiment:

- Come up with a bunch of topics, write them down on slips of paper, put the paper into a hat

- Each Monday, draw three topics from the hat, send some WhatsApp messages about the first, Messenger messages about the second, and don’t discuss the third. Don’t put the topics back in the hat.

- If you see any ads relating to one of the topics, screenshot them and save the screenshots (e.g. to your computer) labelled with the topic

- Separately, record which topic went to which platform

- After doing this for a while, go through the screenshots and (each of you and your wife or ideally other people) give a rating for how well the ad matches the topic. To avoid bias, you shouldn’t know which app saw the topic.

- Now work out average ratings / the distribution across the three products (WhatsApp vs Messenger vs none) and compare
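
A minimal sketch of the last two steps, assuming each screenshot has already been given relevance scores on a 1 to 5 scale. The function names, the scale, and the sample data are all made up for illustration; the point is only that raters see the screenshots in shuffled order, and the platform labels are joined back in afterwards.

```python
import random
from collections import defaultdict
from statistics import mean


def blind_order(screenshot_ids, seed=None):
    """Shuffle the screenshots so raters cannot infer the platform from ordering."""
    rng = random.Random(seed)
    shuffled = list(screenshot_ids)
    rng.shuffle(shuffled)
    return shuffled


def mean_rating_by_condition(ratings, condition_of):
    """Average the blinded scores per condition.

    ratings:      {screenshot_id: [score, ...]} collected from the raters
    condition_of: {screenshot_id: "whatsapp" | "messenger" | "none"},
                  kept separately and only joined in at this point
    """
    by_condition = defaultdict(list)
    for shot_id, scores in ratings.items():
        by_condition[condition_of[shot_id]].extend(scores)
    return {cond: mean(scores) for cond, scores in by_condition.items()}


# Hypothetical data: four screenshots, two raters each.
ratings = {"s1": [4, 5], "s2": [2, 1], "s3": [3, 3], "s4": [1, 2]}
condition_of = {"s1": "whatsapp", "s2": "messenger", "s3": "none", "s4": "whatsapp"}

print(blind_order(ratings, seed=0))
print(mean_rating_by_condition(ratings, condition_of))
```

If the "whatsapp" average is not clearly higher than the "none" average over many weeks, the messages are probably not what is driving the ads.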

rakoo · on Sept 23, 2022

A simpler protocol to realize that the Baader-Meinhof phenomenon is probably what's happening:

- pick said topic, something you never cared about before, talk about it but don't write any messages containing it;

- for 1 month record every ad you see about it;

- send a message about the topic;

- for another month, record every ad you see about it

Comparing the number of occurrences will tell you what is happening.
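
One hedged way to do that comparison, assuming both recording periods are the same length and ads arrive independently: if the message changed nothing, each recorded ad is equally likely to fall in either month, so the "after" count follows a Binomial(n, 0.5) distribution. The function name and the example counts below are invented for illustration.

```python
from math import comb


def p_value_more_after(before, after):
    """One-sided exact test for "more ads after the message".

    Returns the probability of seeing at least `after` of the
    before + after ads in the second period, if the underlying ad
    rate were unchanged and both periods were equally long.
    """
    n = before + after
    if n == 0:
        return 1.0  # no ads at all: nothing to explain
    return sum(comb(n, k) for k in range(after, n + 1)) / 2 ** n


# Hypothetical counts: 2 ads in the first month, 9 in the second.
print(p_value_more_after(2, 9))
```

A small value suggests the increase is unlikely to be chance alone, though it still cannot say whether the message or simply noticing the topic more caused it.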

KennyBlanken · on Sept 23, 2022

> pick said topic, something you never cared about before, talk about it but don't write any messages containing it;

This does not work. How did you come about the topic? Answer: it was in your brain, because advertising, trends among your peers and social connections, online trends real or astroturfed, etc.

That's why you end up with people thinking their phone is "listening" to them.

bobkazamakis · on Sept 23, 2022

The Brain -- famously incapable of organic thought.

Maximus9000 · on Sept 23, 2022

It also has to be a topic that advertisers would pay to target you with. You can't talk about something super obscure that advertisers don't care about - like steam engines.

warkdarrior · on Sept 23, 2022

Thanks for this!! Now my email is full of offers to buy historical steam engines, steam engine parts, and engineer hats. Amazon is even advertising a subscription for coal deliveries!

saluki · on Sept 26, 2022

I've already bought two steam engines this morning. The ads were pretty convincing.

addandsubtract · on Sept 23, 2022

More likely, ads about games on steam.

NaturalPhallacy · on Sept 23, 2022

Your inbox is now full of ads for model trains and hobby stores.

johncalvinyoung · on Sept 23, 2022

There are assuredly advertisers for steam-engine-adjacent products. Memorabilia, experiences/outings, conventions, models, books, artwork, games.

Firmwarrior · on Sept 23, 2022

You're saying to record ads you see about it on TV or something? (Just to eliminate the "My computer is secretly recording me" angle)

ASalazarMX · on Sept 23, 2022

The problem is, your smart TV could be spying on you too, if it's capable of voice commands or videoconference. If you discuss sex toys near it, at least some related keywords could make their way to targeted advertising.

My wife and I routinely use ad blockers, private browser windows, browser profiles, and try to use as few ad-supported products as possible. This doesn't stop targeted advertising, I guess because most devices we use connect through the same IP. A couple of days after she starts looking up a city we want to travel to, I'll start receiving ads from airline companies or travel agencies, and even tours/cruises to said city/region. Fighting tracking and spyware is nearly a lost cause unless you become a digital Amish.

stvswn · on Sept 23, 2022

Smart TVs in general use IP addresses to try to target devices across households, which is against the privacy policies of a lot of ad tech providers because IPs are not redactable/resettable by an end-user.

The best way for small ad tech providers to compete with "big tech" has been to cross lines that the bigger companies won't cross. This is an example of why there are a lot of profitable ad tech companies in the connected TV / video ads space.

Even if you use a VPN, the TV itself likely has a unique ID for ads, so someone just needs to see one request with both the true IP and the unique device ID and then remember that for the future. It's all very shady. TVs are very far behind the level of user control that phones and browsers provide because there's less scrutiny and it's more fragmented across manufacturers (all of which want to get in on ad tech).

You can usually find some opt-out of the identifiers if you dig deep enough into the menus, because multiple laws and regulations require them.

yuletide666 · on Sept 23, 2022

The first thing I did when helping my parents set up their new LG OLED TVs at xmas was to disable all the ads and tracking. It's exhausting how much pressure they put on you to opt-in, and how many layers there are, constantly implying the TV will be nonfunctional without it.

But sure enough, it works just fine with no ads, no "free tv channels", and no voice functionality.

autoexec · on Sept 23, 2022

Have you ever checked back to see if updates had re-enabled some of those or introduced new ones? You'd hope they'd let you know if they started getting ads all the time, but the tracking stuff is much less obvious.

grepfru_it · on Sept 24, 2022

LG will send out updates that require you to accept new license agreements when you turn the tv on next. It’s very obvious about what they are tracking but very obnoxious in pointing it out. The parents that OP refers to probably just click accept all and move on.

We have an LG tv and one of my family members hit accept all after an update and now my remote listens to us. To fix this properly I would need to factory reset which loses all of our streaming settings. I actually don’t because I have a separate ISP only for our TVs so there’s a bit of separation between our streaming use and phone activity

rakoo · on Sept 23, 2022

I'm talking about every occurrence that might be pushed, so it's the TV ads, webradio ads, search suggestions, ...

TaylorAlexander · on Sept 23, 2022

If anyone wants to try this, a friend sent me one link to a device called Levo which does “herb oil infusion” aka it lets you make weed brownies easily. I clicked the one link my friend sent and now I get ads for Levo constantly in my YouTube adroll. Though I should say this is obviously on Google’s ad network specifically and I have no idea if this applies to other networks.

madeofpalk · on Sept 23, 2022

> I clicked the one link my friend sent and now I get ads for Levo constantly

Yes, this retargeting is 'expected' and is not surprising. This is completely different from what OP is describing.

TaylorAlexander · on Sept 23, 2022

No I realize the differences. I am just saying that if someone wants to see if their encrypted app is resulting in ad serves, they could try discussing this product in encrypted chat only, using the methodology described in the comment I was replying to above.

RC_ITR · on Sept 23, 2022

These kind of stories are always fun to analyze using the Socratic method.

-How did you learn about the product?

-Have you ever searched for it?

-Did a friend of yours tell you about it? Do you think they searched for it?

-Are a lot of ads for it playing on TV channels you like? Could instagram know you like those TV channels?

-Is it something your neighbors got? Do you think there has been a spike in shipments of this product to your neighbors?

Eventually people start to “get” that scanning the text of messages is way more helpful for humans than it is for computers. They’ve got other data they can use.

hbn · on Sept 23, 2022

I also have a theory that sometimes when people say "we were talking about <product> and I never even typed it into my phone or anything, and suddenly I started seeing ads for it the next day!" that the person in the story may not have looked up <product>, but someone else in the conversation might have Googled it or browsed an Amazon listing or something and they have some kind of connection in their ad profiles whether it be that they know these 2 people interact a lot, they're in the same geolocation, same wifi network/IP address, etc.

I'm just not convinced of the always on microphones in phones listening for and processing every single thing considering how much battery drain that would cause, whether the processing is done on device or they're sending all that data to a server to be processed.

autoexec · on Sept 24, 2022

> I'm just not convinced of the always on microphones in phones listening for and processing every single thing considering how much battery drain that would cause,

We know our phones commonly listen for "smart assistant" prompts and audio beacons (https://www.nanalyze.com/2017/05/audio-beacons-monitor-smart...), so they don't seem to have any trouble abusing the mic access. Honestly without a whistleblower, there's little hope of really understanding how much data a company collects and what they use it for. At least sometimes we can see it in their own marketing materials. For example, https://advertising.roku.com/resources/blog/insights-analysi... tells us:

"Roughly twice per second, a Roku TV captures video “snapshots” in 4K resolution. These snapshots are scanned through a database of content and ads, which allows the exposure to be matched to what is airing. For example, if a streamer is watching an NFL football game and sees an ad for a hard seltzer, Roku’s ACR will know that the ad has appeared on the TV being watched at that time. In this way, the content on screen is automatically recognized, as the technology’s name indicates. The data then is paired with user profile data to link the account watching with the content they’re watching."

None of the people I know who use those devices knew that was happening, but the info was out there at least. When so many people are watching everything you see and do and say who can ever know what every company is doing or what the source of any one ad is?

RC_ITR · on Sept 24, 2022

> "Roughly twice per second, a Roku TV captures video “snapshots” in 4K resolution. These snapshots are scanned through a database of content and ads, which allows the exposure to be matched to what is airing.

There were users under the impression that Roku was unaware of the content it was displaying? Like 4K snapshot or not, if I know a user is watching an NFL stream, I know that ad played.

autoexec · on Sept 24, 2022

> There were users under the impression that Roku was unaware of the content it was displaying?

Sure, they expect Roku would know if they launched Disney+ or Netflix, but not that they would know exactly what movie you were watching or what specific scenes you viewed and for how long. Same with personal videos cast to your screen via roku. It's pretty reasonable they'd know you were streaming content from your other devices, or which apps you were using, but less reasonable that they'd be watching over your shoulder taking notes.

muzani · on Sept 23, 2022

I don't think it can be observed because it's likely a bug. It freaks users out. People uninstall the app and start threads like this.

It's sort of like getting mugged once and then setting up a camera in a bunch of alleys to prove that muggers exist. You can even set up a camera of yourself running into dark alleys every night, but the odds of reproducing a mugging is still extremely low.

There's a certain kind of precision that convinces me it's real though. Precision is common. I look at a book on Amazon, and a FB ad for that book appears.

But I get rejected for a loan via WhatsApp and then used car ads appear for that model of car that I applied a loan for? That's a bit on the nose.

dahdum · on Sept 23, 2022

> But I get rejected for a loan via WhatsApp and then used car ads appear for that model of car that I applied a loan for? That's a bit on the nose.

Aside from random coincidence, I could see this happening if you provided your personal information (especially email) for the loan application. It could have been shared to multiple underlying lenders alongside a data vendor who ultimately provided interest targeting (which can include car models) to an ad network.

Getting an ad for that specific model could also have been due to other online activity, such as checking the KBB.

sangnoir · on Sept 23, 2022

> Getting an ad for that specific model could also have been due to other online activity, such as checking the KBB

I suspect op may have researched the car model and got retargeted: some ad networks keep track of specific products you've shown interest in (not generic interest-areas like Google ads) and track you via a cookie. You may be visiting a completely different site that uses the same network, and get ads on the exact product you've spent minutes reading about.

muzani · on Sept 23, 2022

Ah thanks, I didn't know that was possible. It did have email.

If it's integrated into such activities, it might actually be a good explanation for all the other similar scenarios blamed on WhatsApp.

RC_ITR · on Sept 23, 2022

This is why incognito mode has the warnings it does about “people can still see your screen”

No matter how secure the platform, if you apply for a loan through it, the loan provider will “know” you want a loan, and happily sell that data.

boltzmann-brain · on Sept 23, 2022

> the odds of reproducing a mugging is still extremely low

I did that and I got a mugging on camera. The attacker was convicted.

badrabbit · on Sept 23, 2022

This is facebook. They've been caught recording people and selling that for advertising, they deny it because technically your audio is transcribed not recorded and they can send only some keywords back so whole conversations aren't sent back to them.

badrabbit · on Sept 23, 2022

Plenty of research and news stories about this if you care to search. The speculative part of my comment is the transcription, which I'm guessing at because of their fervent denials despite the evidence; technically, the wording of their denial statements is correct.

If I had to guess, your whatsapp messages are e2e secured but keywords are sent to facebook when they match some condition. So if you message "happy birthday" to someone, they won't see that but the fact that the keyword "birthday" was found even if the word isn't included is sent to fb. That way they can say they're not snooping your messages.

thehappypm · on Sept 23, 2022

Source?

criddell · on Sept 23, 2022

We did an experiment. We talked about how hard it is to find highlighter yellow nail polish. Nobody in the house is a purchaser of nail polish nor did we do any searches for highlighter or yellow or nail or polish. A day or two later my wife got an Instagram ad for highlighter yellow nail polish. It could have been a coincidence or maybe they were listening.

Or maybe some combination of things we did previously led naturally to thinking about that yellow nail polish. I'm thinking about something like the trick where you ask somebody a bunch of addition problems that have 14 as the answer (what's 10+4? 2+12? 5+9? etc...) then ask them to name a vegetable and they will almost always say carrot.

wpietri · on Sept 23, 2022

"I remembered the time I was in my fraternity house at MIT when the idea came into my head completely out of the blue that my grandmother was dead. Right after that, there was a telephone call, just like that. It was for Pete Bernays--my grandmother wasn't dead. So I remembered that, just in case somebody told me a story that ended the other way." -- Richard Feynman, "Surely You're Joking, Mr Feynman"

nescioquid · on Sept 23, 2022

Someone did this trick with me in the '80s, but the numbers were to sum to 13. I still said "carrot", however. Wish I would have thought to use a different number than 13 when I tried it out on others.

Melatonic · on Sept 23, 2022

Do you have the setting that turns on end to end encryption? I thought by default it was off (and always off for group chats)?

codethief · on Sept 23, 2022

Are you thinking of Telegram? There is no such setting in WhatsApp.

paulrouget · on Sept 23, 2022

Am I missing something? Why carrot?

criddell · on Sept 23, 2022

I’ve always assumed it’s because of the association of 14 with 14 carat gold.

gusgus01 · on Sept 23, 2022

Interestingly, I've always seen it as any number of quick math problems, not just ones that equal 14, and it consistently works too.

criddell · on Sept 23, 2022

12 year old me was not much of a scientist. I don't think I ever tried any number other than 14...

muzani · on Sept 23, 2022

It's an association thing. I thought carrot too. Maybe it's because of the tedious nature of processing. Maybe it's because the orange bar at the top of the screen makes me think carrot.

whimsicalism · on Sept 23, 2022

Ah yes, a bunch of anecdotes in reply.

netsectoday · on Sept 23, 2022

[flagged]

thehappypm · on Sept 23, 2022

you’d think that an educated group like this would understand that anecdotes are not sufficient evidence for something like this.

netsectoday · on Sept 23, 2022

This is a message board about tech... comments aren't welcome anymore - we need evidence to participate?

I think it's interesting when a bunch of people chime in and say "Hey, yeah, I had some crazy thing happen to me, I'm in tech and understand how this stuff works, and there's a very small to zero chance this happened through some other parallel construction by the tech company, they just straight up listened to my conversation and showed me an ad".

This is what kicks off a handful of you to go packet sniffing and write up a blog post looking for this behavior. So yes, evidence is welcome but it doesn't seem like we are quite there yet.

whimsicalism · on Sept 23, 2022

In general I agree, but I think when you are being explicitly asked for a "source" in response to an allegation that it is settled that FB has been "caught recording people," I would prefer to not have anecdotes in reply.

jakelazaroff · on Sept 23, 2022

I mean… this is a conversation, not some sort of formal debate? Someone is telling you "hey, this happened to me," and your response isn't "have you considered this other explanation?" but rather "I won't discuss this further unless you do a bunch of research and present the results to me."

whimsicalism · on Sept 23, 2022

I'm happy to continue discussing it (not sure where you are getting the idea that I'm not from), but I think it is also fair to point out when someone asks for a source to a claim that something has been proven/caught and instead the replies are a bunch of personal stories where people think something is happening.

To me, that is indicative that, contra the original claim, no such thing has ever been proven.

Is it verboten to say that?

jakelazaroff · on Sept 23, 2022

It's not verboten. But, candidly, it is kind of rude. What's the difference between someone at, say, the EFF "proving" something happened by running an experiment and writing about it publicly, and someone on Hacker News doing the same?

whimsicalism · on Sept 23, 2022

I disagree that it is rude to point out something is an anecdote.

The proof has to do with the technical details, not the authority figure posting it. If someone from the EFF wrote a blog post with the same content as these HN posters, I would be similarly dismissive of this as "proof."

TigeriusKirk · on Sept 23, 2022

They aren't saying "this happened to me"

They're saying "facebook has been caught multiple times doing this", which is not a personal anecdote, but an assertion that proof exists and is available.

So where is it?

netsectoday · on Sept 23, 2022

I'd prefer to say whatever I want. Must have filed it in the wrong place.

whimsicalism · on Sept 23, 2022

You can say whatever you want, doesn't mean I won't criticize you for it or downvote you.

And I'll flag if you violate HN guidelines, which you have.

netsectoday · on Sept 23, 2022

Cool! Which ones?

whimsicalism · on Sept 23, 2022

> Edit: Holy fuck there are (paid?) Facebook shills all over this like flies on shit.

From the HN guidelines: [0]

> Please don't post insinuations about astroturfing, shilling, bots, brigading, foreign agents and the like. It degrades discussion and is usually mistaken. If you're worried about abuse, email hn@ycombinator.com and we'll look at the data.

[0]: https://news.ycombinator.com/newsguidelines.html

jakelazaroff · on Sept 23, 2022

"Data" is just the plural of "anecdote", so why would they not be?

TigeriusKirk · on Sept 23, 2022

> packet sniffing software 24/7 to catch proof

I have to say, the fact that no one has done this makes me doubt it's real.

As hated as Facebook is, there's tons of motivation for people to catch them out with undeniable proof, and yet no one has done it.

netsectoday · on Sept 23, 2022

[flagged]

procinct · on Sept 23, 2022

Don’t iPhone have an indicator when the mic is recording? Also, this feels like it would be insanely easy to test by capturing the payloads sent to FB; you could even use something like Charles Proxy to do it.

FB having access to microphone makes sense for plenty of other completely innocent reasons (for example, if you can record a video from inside the app).

If this was actually true, I can’t help but feel that someone would have proven it technically by now instead of relying on these types of self experiment and anecdotes, especially given how commonly this is touted.

netsectoday · on Sept 23, 2022

> Don’t iPhone have an indicator when the mic is recording?

No.

Just tested this out, zero indication that the mic is hot on a recent iPhone with up-to-date software when recording a voice memo.

Edit: There ya go... downvotes for saying the iPhone has no indicator when you record audio.

Hamuko · on Sept 23, 2022

Works on my iPhone.

https://litter.catbox.moe/cpvpr8.jpg

netsectoday · on Sept 23, 2022

The screen was off when the event happened...

1) Does your iPhone still record audio when the screen is off?

2) Can you see the audio indicator when the screen is off?

3) If a background app starts then stops recording audio while the screen is off, would you have an indicator that it recorded audio?

procinct · on Sept 23, 2022

> 3) If a background app starts then stops recording audio while the screen is off, would you have an indicator that it recorded audio?

Yes. iOS displays an indicator if an app has recently used the mic.

> Note: Whenever an app uses the camera (including when the camera and microphone are used together), a green indicator appears. An orange indicator appears at the top of the screen whenever an app uses the microphone without the camera. Also, a message appears at the top of Control Center to inform you when an app has recently used either.

Source: https://support.apple.com/en-nz/guide/iphone/iph168c4bbd5/io...

Hamuko · on Sept 23, 2022

Have you tried looking at the screen while using the Facebook app?

Also, I feel like the goal posts are moving quite fast in one direction.

netsectoday · on Sept 23, 2022

Goal posts? Is this a competition?

The phone was sitting between two people having a conversation, one of them "swiped it open" meaning it was off to begin with, then was immediately displayed an ad for that conversation, and upon hearing this the tech-savvy person in the house understood what happened, confirmed it with the mic access to facebook in the settings, and then disabled the behavior.

Hamuko · on Sept 23, 2022

>Goal posts? Is this a competition?

Considering the original claim was "zero indication that the mic is hot" and now it's "zero indication that the mic is hot if the screen is off", I'd say that the goal post has moved considerably.

But if you want to know if Facebook is listening to you through the iPhone microphone, you should probably look at the screen for the indicator. iOS apps can't start recording on their own in the background, there's no API for that. If they are listening to you, they'd have to start the audio session in the foreground, which would allow you to see the indicator.

https://developer.apple.com/forums/thread/65604

https://stackoverflow.com/questions/70562929/how-to-start-au...

(Unless you believe that Facebook is using some kind of a private system API for this and is passing through the App Store checks)

netsectoday · on Sept 23, 2022

Just a few things to note here...

I wrote the original "Wife swipes open the phone" comment, so that's the context you seem to be missing. Sure you can see a little dot on your phone when YOU run some experiment today and look for it, but was that indicator available in the exact situation where the targeted ad was displayed? No.

Also, this incident happened in the past and we know there have been dramatic API changes on both Apple and Facebook products. The limits of the API today don't reflect the capabilities that were available to developers in the past. I doubt Facebook is hacking the App Store process to use hidden APIs. It was probably just available in the past and my wife granted the facebook app complete access to the mic, so they took what they wanted.

I'd make sure to disable that permission today too, just in case.

One last thing is I just opened my iPhone again and hit record. I honestly didn't see the tiny orange pixel at the top of my phone until you pointed it out. I was basically looking for the green video indicator light to show. So I'm technically wrong about NO indication, you're welcome.

tonmoy · on Sept 23, 2022

The GP didn’t say they were using iPhone. The Facebook app on Android has been known to record audio even when running in the background

tadfisher · on Sept 23, 2022

That's not possible without permissions these days, same as iOS. In Android 13, background processes have no mic or camera access whatsoever.

netsectoday · on Sept 23, 2022

It was an iPhone, and there is no indication from the phone when the mic is recording.

Melatonic · on Sept 23, 2022

Android has a notification now when the mic is recording and has had the ability to deny microphone and lots of other access for a long time now. Thankfully it sounds like iOS is catching up

procinct · on Sept 23, 2022

> My wife and MIL sitting at the table talking about a unique topic with an iPhone running Facebook sitting in front of them

They explicitly said they were using an iPhone?

titaniczero · on Sept 23, 2022

And don’t forget the battery. A mic recording 24/7 would drain the battery much faster and would not go unnoticed unless specialized hardware is used like the one for “hey siri” and “ok google”.

netsectoday · on Sept 23, 2022

Doesn't the Facebook app drain your battery?

titaniczero · on Sept 23, 2022

Try any voice recording app for a few hours, now use the facebook app for the same number of hours. The impact on battery life of a mic actively recording alone is very noticeable, so noticeable that your phone has a special chip just to recognize patterns similar to “hey siri”.

Melatonic · on Sept 23, 2022

It would not need to record high quality audio and could maybe even take advantage of that same chip? Just thinking out loud here - smaller, crappier audio would also be easier to send back unnoticed (or instead of even recording audio it could be transcribing on the fly to a text file using something super basic and easy with low accuracy)

netsectoday · on Sept 23, 2022

[flagged]

titaniczero · on Sept 23, 2022

> This way too I can troll people on the internet when they suspect this is happening and I can say "bUt ThE bAtTeRy LiFe!" to defend Meta: my corporate overlord business daddy.

Please, stop with the sarcasm.

Okay, let’s say they manage to record us without a huge impact on our battery life. Now, how do you send these recordings or even the extracted keywords from a popular app, a client installed on devices controlled by the users and susceptible to reverse engineering and network traffic analysis without anyone noticing it?

It’s just too much risk and they don’t even need it, see my relevant reply here: https://news.ycombinator.com/item?id=32950204#32953216.

netsectoday · on Sept 23, 2022

Great question! I'd love a peek at their source code to figure out these answers too!

I swear that comment is sarcasm free.

thehappypm · on Sept 23, 2022

This isn’t evidence. Even if Facebook was not listening to your conversations, there would be some rate in which you would just randomly be served an ad related to a topic you were discussing. There needs to be evidence that it is happening at a rate too high to be attributable to chance.
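
A back-of-the-envelope sketch of that baseline, under the simplifying assumption that each ad independently has some small probability of matching a topic you happened to discuss recently. The function name and the numbers are purely illustrative, not measurements.

```python
def prob_at_least_one_match(p_match, ads_seen):
    """Chance that at least one of `ads_seen` ads matches by coincidence.

    Assumes every ad matches independently with probability `p_match`,
    which is a simplification: real ad feeds are correlated.
    """
    return 1 - (1 - p_match) ** ads_seen


# Hypothetical: a 1% coincidence rate per ad, 200 ads seen in a week.
print(prob_at_least_one_match(0.01, 200))
```

With those made-up inputs the result is about 87%, so a single striking match is close to the expected outcome rather than evidence of listening.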

netsectoday · on Sept 23, 2022

Sounds like a good way to engineer it... anything to improve the bottom line even if insanely-targeted ads only trickle out to users. How about limiting who sees this feature to also limit the risk of being detected? Maybe just do it once a year to everyone, or never to specific "tech-savvy" users that they have completely profiled.

nindalf · on Sept 23, 2022

So if I were walking past a place blaring the piña colada song, I’d see ads for alcohol and umbrellas? If coworkers around me are talking about activities I’m not interested in, I’d see ads for those?

They have far better information that shows I’m not interested in alcohol or extreme sports. Audio in the background is so low-signal that it isn’t worth showing ads based on it.

Even just transcribing speech accurately is not something that was possible until the last couple of years. Yet this conspiracy theory has been around for a decade or more.

ac2u · on Sept 23, 2022

It's ironic that you're asserting this by replying to a parent message which explains why this probably isn't the case.

Hamuko · on Sept 23, 2022

>Zero coincidence

Yeah, it's probably not a coincidence that your wife is talking about X and is recognised by Facebook to be in a group of people that are interested in X.

neodypsis · on Sept 23, 2022

Would they argue that the message goes first into a neural network that outputs potential product labels based on the message and that it all happens client-side? That's the only way I see it possible for them not to violate the E2EE.

vineyardmike · on Sept 23, 2022

An important thing you’re missing is the control. You should record every ad.

You need to know if you got 3 topics of ads every day and 1/3 of them are related to that secret topic, OR if you get 300 topics every day and 1/300 are related to that secret topic. If it’s the former, it’s suspicious, if it’s the latter, it’s way less suspicious.
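
As a rough illustration of that control, assuming every ad seen is logged with a topic label (the labels and the function name here are invented):

```python
from collections import Counter


def topic_share(ad_log, topic):
    """Fraction of all logged ads that were about `topic`."""
    counts = Counter(ad_log)
    total = sum(counts.values())
    return counts[topic] / total if total else 0.0


# Hypothetical log: one entry per ad seen, labelled by topic.
ad_log = ["travel", "shoes", "steam engines", "travel", "phones", "travel"]
print(topic_share(ad_log, "steam engines"))
```

The same single ad looks very different as a share of three logged ads than as a share of three hundred, which is why the full log matters.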

dan-robertson · on Sept 23, 2022

The control is the topic you pick that you don’t discuss on WhatsApp or messenger. The idea is random differences between topics should average out over many trials.

vineyardmike · on Sept 23, 2022

I still think it’s important to consider the volume of topics that show up that aren’t being explicitly looked for.

I’ve gotten Instagram ads for ketamine and I absolutely am not discussing or searching for it. I probably wouldn’t even notice a random topic if it’s not so absurd. I’m sure there’s tons of topics I don’t even realize I see.

dan-robertson · on Sept 23, 2022

The reason for the control I suggested is to try to counteract the bias people have towards noticing things they recently thought about. I think the question of what adverts people are shown in general is interesting but quite separate.

chillacy · on Sept 23, 2022

You can’t come up with the topics yourself either, because the topics you will think up are different based on your demographic / type of person you are, and ad networks basically try to guess that.

JoshuaDavid · on Sept 23, 2022

You can if you first come up with a list of topics, and then once you have that list, randomly assign each of those topics to one of the three categories.
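A minimal sketch of that random assignment, assuming the three arms from the experiment proposed upthread (WhatsApp, Messenger, not discussed). The topic list, the arm names and the seed are placeholders, not anything from a real run.

```python
import random


def assign_topics(topics, seed=None):
    """Shuffle the topics and deal them round-robin into the three arms."""
    arms = ("whatsapp", "messenger", "control")
    rng = random.Random(seed)  # seeded only so the plan can be reproduced
    shuffled = list(topics)
    rng.shuffle(shuffled)
    assignment = {arm: [] for arm in arms}
    for i, topic in enumerate(shuffled):
        assignment[arms[i % len(arms)]].append(topic)
    return assignment


# Placeholder topics; write the real list down before assigning anything.
topics = ["steam trains", "brazil nuts", "kayaking",
          "beekeeping", "fountain pens", "curling"]
plan = assign_topics(topics, seed=2022)
for arm, assigned in plan.items():
    print(arm, assigned)
```

Because the list is fixed before the shuffle, whatever bias went into choosing the topics is spread evenly over the three arms instead of favouring one of them.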

dan-robertson · on Sept 23, 2022

The idea is that you may discover the topics you don’t talk about that week still come up as much as those you do.

nindalf · on Sept 23, 2022

There will be some bias in what they choose to screenshot right? Meaning, the unrelated topic might show up in the feed but they don’t screenshot it because it doesn’t fit the narrative?

Also, what we’re interested in is whether the text changed what was shown. If I saw ads for X last week but didn’t notice them, then spoke to a friend and noticed them and took a screenshot, it would appear to confirm the theory. Even though I was always seeing ads for X.

Ultimately, I don’t think people who are convinced of this theory will change their minds so it’s a moot point.

dan-robertson · on Sept 23, 2022

Yeah, that’s the biggest flaw in the experiment I proposed, I think. This is the reason I try to have the hopefully independent grading of ad-topic-relevancy blinded to which system the topic was communicated over. It may be that one sees many vaguely related ads for the WhatsApp topic due to some selection bias but a similar number of actually related ads.

uup · on Sept 23, 2022

I call this the gaslighting explanation: “no, it definitely wasn’t the messaging product owned by an advertising behemoth. You must have searched for it somewhere else.” Obviously the OP remembers where they’ve seen the product. If they had seen the product elsewhere, they wouldn’t have started this thread!

moralestapia · on Sept 23, 2022

(wrt some comments in this thread)

Is it so hard to believe that Meta is snooping on WhatsApp conversations? Meta, a company of unprecedented size that was built over monetizing your private data? A company who's been caught in plenty of scandals (like Cambridge Analytica) about this exact sort of thing (violating their users' privacy)?

Someone from this community, which generally means educated, tech-literate and sensitive to these topics shares a perfectly plausible observation, of something that has been experienced as well by plenty of other folks, me included; and then some people come and try to make up the most convoluted explanations (candy boxes from Kazakhstan just happened to be trending that specific day, nothing to see here, move along!) for this phenomenon and try to shift the blame away from Meta. Why do you do this? Are you Meta employees? A PR agency they hired?

It's just baffling. Apparently some people DO want to be abused.

Plot twist: we all get ads about candy boxes from KZ now.

brap · on Sept 23, 2022

For anyone who has ever worked at a FAANG like company in the last decade, yes, this is actually very hard to believe.

Despite the shady image they have, these companies go to great lengths to avoid doing shady things (because ultimately it’s bad for business). Not to mention the hundreds of tech employees that would have to be involved and keep quiet in this type of “conspiracy”. It’s incredibly unlikely, I truly believe that.

dessant · on Sept 23, 2022

I can imagine you haven't been involved in anything illegal, but I'm sure you're aware of Meta's documented track record of coordinated illegal actions. Do engineering teams just fall head first into a bucket of 2FA phone numbers and start using the data for ad targeting, and nobody bats an eye from the legal department to product managers? Or are they hypnotized to build services for biometric data collection without consent? Nobody does anything nefarious, but their collective actions which benefit the company just end up being illegal, again and again?

The tech companies you work for do often engage in illegal activities, and some of your colleagues are complicit. I'm sure it is an uncomfortable thought for some of you, but this is all part of the public record.

z9znz · on Sept 23, 2022

I think there's a natural bias people have to want to NOT see the bad in the organizations that pay them $$$$.

This is certainly true in a lot of finance.

thatoneguytoo · on Sept 23, 2022

I completely agree (as another employee of FAANG). It's ridiculously hard to do anything against policy once it's set, and trust me, the policies are set. Media overplays a lot of things which just aren't there.

The sad reality is people are very predictable, even with basic data.

rini17 · on Sept 23, 2022

The employees obviously are told the functions and APIs that they are implementing have a completely legit use case. That is not hard to believe at all and was the case in Cambridge Analytica scandal, for example.

m463 · on Sept 23, 2022

"bad for business" leads to systems that do unexpected things. For instance, generating an identifier on-device for any image sent, and sending the identifier out-of-band. This helps catch child pornography.

I can imagine the same thing done for text. The text might be encrypted, but interest keywords might be generated on-device and sent out-of-band.

rendaw · on Sept 24, 2022

The PRISM "conspiracy" was very shady and involved probably hundreds of employees. And if they have hushed people punching holes for the government, it's not crazy to think some data could leak out into other parts of their pipelines too.

I'm not claiming this is real, but I agree with GP.

beowulfey · on Sept 23, 2022

Let me start by saying I have no idea if Facebook is reading my encrypted messages or whatever. However, I will say that in my experience, whether something is bad for business if it gets discovered is usually not a concern for large corporations, if the thing being done makes them more money. Because everything is just a balance sheet.

For an example from non-FAANG companies, see illegal dumping of toxic waste by chemical companies, such as DuPont and PFOAs [1]. Despite knowing what they did was illegal, the math works out -- products with PFOAs were something like $1 billion in annual profit, and even when they got caught the fines and legals were a fraction of that, spread out over many years.

So I personally believe these companies 100% would do shady shit if it increases their profit margins. And why wouldn't they? There is no room for morals in capitalism, and the drawbacks are slim.

[1]https://www.nytimes.com/2016/01/10/magazine/the-lawyer-who-b...

spaceywilly · on Sept 23, 2022

The most plausible explanation is that people are just easy to predict. Might be tough to admit, but that’s actually a much simpler explanation than Facebook having a back door into our messages, which are end-to-end encrypted.

As others above me have thoroughly explained, there are numerous ways Facebook could figure out what you’re reading about/listening to/viewing on the internet, which ultimately drives what you are chatting with your friends about. Reading your messages would actually be the most difficult and low fidelity way for them to try to mine this information. They can just see your entire browsing history and extract from there, since the majority of websites have a tracking cookie that in some way phones home to Facebook.

bhk · on Sept 23, 2022

Seriously? Facebook knows their internal thoughts well enough to guess what topics they would choose when trying to pick something they "never talk about"?

If FB could do that, then FB would realize that these topics are not actually products they are interested in, so they wouldn't be showing ads.

dan-robertson · on Sept 23, 2022

FB can show you one ad per month about some special steam train ride and maybe you’ll scroll past it without a second thought but then maybe one day you’ve been watching a film about the golden age of steam or you’ve been talking to a friend about it and then you see the ad and remember the film or conversation and you think ‘Crikey, how on earth did Facebook come up with that ad!’

Facebook show (many people) a lot of ads and they only need to get lucky a few times for you to think it’s uncanny. All the non-unique times an ad was not relevant will have blurred together and so you won’t easily remember that they were the vast majority of the ads you see. A little bit of feedback (eg if you dwell on the coincidental ad) may cause you to see more related ads.

z9znz · on Sept 23, 2022

> The most plausible explanation is that people are just easy to predict. Might be tough to admit, but that’s actually a much simpler explanation than Facebook having a back door into our messages, which are end-to-end encrypted.

I disagree with this to the extent that I would say the exact opposite is true.

Facebook (and others) have proven time and time again that they cannot correctly predict user behavior by locking out or banning users who actually did nothing wrong (because their algorithms predicted that the user was breaking terms of service or might be planning to). This happens over and over, even in cases not so complex as the "photos of my child to send to my doctor".

But on the flipside, Zuckerberg has been documented saying one thing to the public and exactly the opposite in private. Heck, Facebook has had memos and emails leaked where they talked about how they would say one thing in public (and to regulators) while doing the opposite secretly.

I believe that Facebook cheats and breaks agreements (and laws) in multiple directions all the time, often willfully. They've even been caught cheating their own ad customers by intentionally overstating the effectiveness and target accuracy of their ads.

mgraczyk · on Sept 23, 2022

It's hard to believe because I worked there and worked on this stuff (data and ML side) and know that they aren't.

moralestapia · on Sept 23, 2022

>I worked there [...] and know that they aren't

I know that, unfortunately, this is what puts bread in your mouth.

But, really? Are you suggesting that Cambridge Analytica didn't happen? Did we all hallucinate that?

You guys jumped the shark already. These attempts at damage control are laughable.

mgraczyk · on Sept 23, 2022

It's not what puts bread in my mouth though. I don't work there now and don't work on anything related to ads or messaging.

CA happened but that has nothing to do with this. The policies that allowed CA to collect data were very public, Zuck enthusiastically talked about the open knowledge graph all the time prior to CA, much to the dismay of many investors. Facebook didn't lie in that case, they misjudged the potential to misuse open data access, and the potential for negative PR as a result.

By analogy, it's like you're the landlord of an apartment building and you don't lock the front door. You put up a huge sign saying "this door is unlocked, everyone is welcome". You sell ads for your building embracing the unlocked door policy. Then somebody walks in and photographs all the tenants through their windows. Suddenly people who didn't care about the unlocked policy are now very angry, and rightly so. But this is completely different from collecting data, lying about it, and operating a massive conspiracy to conceal the data use from literally tens of thousands of employees who would normally be able to see it.

Aunche · on Sept 23, 2022

Being educated and tech-literate means that you should try to think more critically than "Facebook bad." You brought up Cambridge Analytica as your scandal of choice, which is the most newsworthy scandal, but the one where Facebook is the least guilty. Everyone had the same access to the APIs that Cambridge Analytica did and Facebook had shut down those APIs before the story broke out. Acting on instinct will only lead to regulation that won't be effective at stopping what you're trying to stop, will cause needless side effects, and will undermine your political credibility to push for changes that solve the important issues.

Closi · on Sept 23, 2022

From WhatsApp privacy policy:

We limit the information we share with Meta in important ways. For example, we will always protect your personal conversations with end-to-end encryption, so that neither WhatsApp nor Meta can see these private messages.

JumpCrisscross · on Sept 23, 2022

> WhatsApp privacy policy

Facebook has a deep culture of pathological lying. They lied to the FTC [1]. They lied to WhatsApp and to the EU [2]. They created an Oversight Board and then lied to it [3].

Each of those lies is more substantial than lying in a privacy policy.

[1] https://www.ftc.gov/system/files/documents/cases/182_3109_fa...

[2] https://euobserver.com/digital/137953

[3] https://techcrunch.com/2021/09/21/the-oversight-board-wants-...

ImPostingOnHN · on Sept 23, 2022

saying someone lied in the past is not convincing evidence that any arbitrary statement by them is affirmatively a lie

lrvick · on Sept 23, 2022

Meta controls the proprietary Whatsapp client software that decrypts your messages and they can have that decrypt and scan the messages for them and send back metrics on how often different words are used.

They can of course also have their app de-crypt and re-encrypt the messages to the key of a requesting third party like police or hired reviewers if certain keywords are used.

Authorities could also have Google or Apple ship a signed tampered Whatsapp binary to any user or group of users, like protestors, that uses a custom seeded random number generator so they can predict all encryption keys generated and no one else, including Meta, will know.

The variant of end to end encryption where third parties control the proprietary software on both ends, is called marketing.

muzani · on Sept 23, 2022

Also WhatsApp privacy policy:

As part of the Meta Companies, WhatsApp receives information from, and shares information (see here) with, the other Meta Companies. We may use the information we receive from them, and they may use the information we share with them, to help operate, provide, improve, understand, customize, support, and market our Services and their offerings, including the Meta Company Products. This includes:

- improving their services and your experiences using them, such as making suggestions for you (for example, of friends or group connections, or of interesting content), personalizing features and content, helping you complete purchases and transactions, and showing relevant offers and ads across the Meta Company Products

====

Popular theory is they can't see or store your messages, but can analyze them on the client and profile you (e.g. interested in brazil nuts)

pfortuny · on Sept 23, 2022

What does “conversation” mean in that text?

It can perfectly well mean just the audio exchanges when both parties talk.

Also: E2E does not imply necessarily that they do not know the key.

sschueller · on Sept 23, 2022

Prove it. Open source the client and open the server for 3rd party apps to use.

worker767424 · on Sept 23, 2022

It's possible that the client blindly fetches a mapping from keyword to ads, saw the keyword client-side, then requested the ad.

Drakim · on Sept 23, 2022

You'd indirectly reveal what those keywords are to Meta by which ads are being requested. If an ad for a sex toy is being requested, it's pretty obvious what the two parties are talking about.
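A toy sketch of that leak, under the assumption that the scheme works the way the parent describes it: the client downloads a keyword-to-ad table in bulk, matches locally, then requests the matching ad. Every keyword, ad ID and function name below is invented for illustration; nothing here is taken from any real client.

```python
# Hypothetical table the client downloads in bulk, without saying which
# entries it cares about. Keywords and ad IDs are invented.
KEYWORD_TO_AD = {
    "brazil nuts": "ad-101",
    "steam train": "ad-202",
    "candy box": "ad-303",
}


def client_pick_ads(message: str, mapping: dict[str, str]) -> list[str]:
    """Runs on the device: match keywords locally, return ad IDs to request."""
    text = message.lower()
    return [ad_id for keyword, ad_id in mapping.items() if keyword in text]


def server_infer_keywords(requested: list[str], mapping: dict[str, str]) -> list[str]:
    """Runs on the server: invert the table it published itself."""
    ad_to_keyword = {ad_id: keyword for keyword, ad_id in mapping.items()}
    return [ad_to_keyword[ad_id] for ad_id in requested if ad_id in ad_to_keyword]


# The message text never leaves the device, only the ad request does.
requested = client_pick_ads("Shall we book that steam train ride?", KEYWORD_TO_AD)
leaked = server_infer_keywords(requested, KEYWORD_TO_AD)
print(requested, leaked)
```

The server never sees the message, yet the ad request alone is enough for it to recover the keyword that matched.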

atoav · on Sept 23, 2022

Still, information leaks out of the encrypted channel and the trust is broken.

thescriptkiddie · on Sept 23, 2022

I hate to break it to you, but a privacy policy is not a legally binding document.

madeofpalk · on Sept 23, 2022

> Is it so hard to believe that Meta is snooping on WhatsApp conversations?

Where's the evidence? I don't know what ethos "Hacker News" is supposed to capture, but surely it's not superstition?

tedunangst · on Sept 23, 2022

Well, at least some people here are smart enough to know how to run disassemblers and packet captures. Clearly not everyone, but a few tech literate people.

KaiserPro · on Sept 23, 2022

> Is it so hard to believe that Meta is snooping on WhatsApp conversations?

for a lot of people, no

> Meta, a company of unprecedented size that was built over monetizing your private data?

one of many companies, however "meta" does have the advantage that you can opt out of them, mostly.

> A company who's been caught in plenty of scandals (like Cambridge Analytica) about this exact sort of thing (violating their users' privacy)?

CA is interesting as it started out as an academic study, which was consented fully. CA then went on to scrape people's public profiles, which often included likes, friends, etc. This combined with other opensource information allowed them to claim to have good profiles of lots of people, the PR was strong. Should FB have had such an open graph? probably not. Should they have taken the rap for everything evil on the internet since 2016? no. There are other actors who are much more predatory who we should really be questioning.

> Are you Meta employees?

I think you place far too much faith in a company that is clearly floundering. It's not like it has a master plan to invade your entire life. It's reached its peak and has not managed to find a new product, and is slowly fading.

However, as we all think we are engineers, we should really design a test! but first we need to be mindful of how people are tracked:

1) phone ID. If you are on android, your phone is riddled with markers. Apple, supposedly they are hidden, but I don't believe that they don't leak

2) account. Your account is your UUID that tracks what you like.

3) your IP. if you have IPv6, perhaps you are quite easy to track. even on V4 your home IP changes irregularly and can be combined with any of the above to work out that you are the same household.

4) your browser fingerprint. (be that cookies, or some other method)

5) your social graph

method:

1) buy two new phones.

2) do not register them with wifi

3) create all new accounts for TikTok, Gmail, Instagram etc.

4) never log into anything you've created previously, or the fresh accounts on old devices.

5) message each other about something. However you need to source your ideas from something offline, like a book from a thrift store or the like. maybe an old magazine. open a page, pick the first thing your finger lands on. this will eliminate the "I heard about x" or "i'm in the mood for y"

report back.

TacticalCoder · on Sept 23, 2022

> 5) message each other about something. However you need to source your ideas from something offline, like a book from a thrift store or the like. maybe an old magazine. open a page, pick the first thing your finger lands on. this will eliminate the "I heard about x" or "i'm in the mood for y"

Wait... If WhatsApp is really E2EE encrypted, why would any of the other steps be necessary? Dude and his wife can simply pick a page at random from a magazine in a store, never search anything online about it, start talking about it using WhatsApp as if it was something of great interest to them. If they start getting related ads, obviously something shady is going on. There's no need for new phones / new GMail accounts / etc.

KaiserPro · on Sept 24, 2022

> If WhatsApp is really E2EE encrypted, why would any of the other steps be necessary?

because you need to eliminate the chance of profiling by any other means.

Using the same phone as before means that the pre-existing profiles exist, which means that the relationship is already inferred. Because it's trivially easy to track people, you need to eliminate all other variables.

xerxesaa · on Sept 23, 2022

As someone who has actually worked on end to end encryption at Meta, I can tell you I am not aware of anything where the company reads your WhatsApp messages - either in transit or on device. The company takes fairly serious measures to ensure it cannot even accidentally infer such contents.

I don't know what is happening in this specific case. Perhaps the ads came from some other similar search queries. Perhaps they came from the keyboard intercepting what was typed. Or perhaps something else that I can't think of. But I'm nearly certain it did not come from meta intercepting the contents of your messages.

It's hard to convince people at this point because many have lost trust in Meta as a company, and I understand that. But I still find it stunning that so many people are making so many false claims without any actual knowledge to back it up.

daqhris · on Sept 23, 2022

Thanks for your explanation.

I didn't have in mind the scenario of a keyboard logging user inputs besides the normal functionality of WhatsApp. I find this theory to be very plausible. Not at all happy with Meta's privacy policy, but I agree that it is worth considering other threats.

From using a VPN that logs all incoming and outgoing traffic (NetGuard) on an Android One device, I've noticed that the default Google keyboard gets in touch way too many times with some distant servers. Whereas, an open source keyboard from F-Droid, FlorisBoard, does no snooping and gets updated solely through the app store.

AJ007 · on Sept 23, 2022

The third party keyboard apps are a big question for the OP.

Another consideration, there are companies that track and sell geolocation data. It's "anonymized" but so precise you know the street address a user resides at. It is not a stretch to consider "anonymized" retargeting from keyboard inputs.

I was dismissive of it in the past, as comments voted higher here are. However I've seen enough weird ads show up within minutes of making jokes about obscure topics that I suspect there is something going on.

The piece that might be missing here is third parties collecting signals, "anonymizing" them, and then ads get re-displayed through Facebook, Google, etc. It may not be the major ad platforms doing it directly. In theory this should be harder now with the iOS tracking restrictions.

For the skeptical, consider Avast's Jumpshot. Here millions of users thought they were protecting themselves when their raw browsing stream was being sold live to third parties. They aren't the only company that has done that. https://www.theverge.com/2020/1/30/21115326/avast-jumpshot-s...

lrvick · on Sept 23, 2022

Google, Apple, or Meta retain the power to ship a tweaked binary with a compromised RNG to a subset of users if authorities order them to, be it now or in the future after a privacy policy change.

Proprietary encryption means users cannot verify or control the keys or the code that generates or uses the keys. The app can exfiltrate the keys or do any keyword processing on behalf of Meta as well which can include well intentioned features like forwarding plaintext messages containing certain dangerous-seeming words to authorities or theoretically trusted third party review teams. Naturally they could also return -metrics- about frequency of word use back to Meta for ad targeting as well.

I too have been a champion of encryption and privacy at past companies only to have all my work undone and watch all the data become plaintext and abused for marketing by a new acquirer.

The only way end to end encryption solutions can avoid these types of abuses is when the client software is open source and can be audited, reproducibly built, and signed by any interested volunteers from the public for accountability.

Short of that it is really not that much different than TLS with promises Meta will not peek, at least not directly, today.

beiller · on Sept 23, 2022

If they modified the RNG of person A's phone app during a forced stealth update, then shouldn't person B not be able to decrypt the message? Have you ever had an app update to Whatsapp that you cannot communicate with other people until you are forced to update? The alternative is that there is a vast internal conspiracy at meta that hundreds of engineers, and hundreds of ex-engineers are somehow silent on, which would be using 2 encryption keys, one that law enforcement can read, and one that the other end of the device can read. Isn't it provable that Whatsapp the app is using the operating system level secure prng functions? If there was evidence of this, wouldn't it be great for a whistleblower to come out and make a killing shorting Meta's stock? Right now would be the perfect time to be kicked while they are down.

mhio · on Sept 24, 2022

> then shouldn't person B not be able to decrypt the message?

The RNG example is a way to create keys that make it trivial for "C in the middle" with the RNG details to extract the contents. They are still valid, just not useful as keys.

The Juniper attack and Dual EC exploit is a good real world example of compromising an RNG for passive decryption, although Dual EC was designed to be like that.

https://www.cs.utexas.edu/~hovav/dist/juniper.pdf
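For anyone who wants to see the shape of the argument, here is a deliberately crude sketch. It is not Dual EC and not how any real client generates keys: it swaps in Python's seedable `random` module as the "compromised RNG" and a toy XOR as the cipher, and the seed value and function names are all made up.

```python
import random
import secrets


def weak_keygen(seed: int) -> bytes:
    """Deliberately broken: 32 key bytes from a seedable, non-cryptographic PRNG."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(32))


def sound_keygen() -> bytes:
    """What a client should do instead: ask the OS CSPRNG for key material."""
    return secrets.token_bytes(32)


def toy_xor(key: bytes, data: bytes) -> bytes:
    """Toy stand-in for a real cipher; only handles messages up to len(key)."""
    return bytes(k ^ d for k, d in zip(key, data))


SEED_KNOWN_TO_C = 1234  # hypothetical value baked into the tampered build

# A's tampered client generates its key and encrypts as usual.
key_a = weak_keygen(SEED_KNOWN_TO_C)
ciphertext = toy_xor(key_a, b"meet at noon")

# B receives the key through the normal exchange (not modelled here),
# so decryption works and nothing looks wrong to either user.
assert toy_xor(key_a, ciphertext) == b"meet at noon"

# C never saw the key, only the ciphertext, but can rebuild it from the seed.
key_c = weak_keygen(SEED_KNOWN_TO_C)
recovered = toy_xor(key_c, ciphertext)
print(recovered)
```

The point is only that B keeps decrypting normally, so nobody on either end gets an error, while anyone holding the seed can regenerate the same key.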

Melatonic · on Sept 23, 2022

Even with end to end encryption couldn't the app at the end also be just aggregating the data (or even transcribing audio) to send over separately?

lrvick · on Sept 23, 2022

Yeah though this is more likely to be detected with basic network analysis. Selectively compromising the RNG seed would be much harder to detect without source code.

5d8767c68926 · on Sept 23, 2022

Meta has repeatedly demonstrated they will do whatever it takes to capture user data. Kid VPNs, in app browsers, etc. Is it any surprise that people are deeply suspicious of any coincidences that arise from using a supposedly private channel?

Given evidence at hand, it is hard to view Meta as anything but a bad actor.

hayst4ck · on Sept 23, 2022

> Perhaps the ads came from some other similar search queries. Perhaps they came from the keyboard intercepting what was typed. Or perhaps something else that I can't think of. But I'm nearly certain it did not come from meta intercepting the contents of your messages.

Isn't this kind of splitting hairs? Does it matter if text information came from a "side channel"?

It seems like the promise Facebook makes is that 'your communication using WhatsApp is secure,' that's certainly my interpretation of what "end to end encrypted" means. It is a promise of security. That means text is sacred and even text sent to giphy should be privileged from the ad machine.

The question being asked here is not "is it end to end encrypted?" It's "are my communications secure?" End to end encryption is just one element of that security.

thetrb · on Sept 23, 2022

The thing is if it's a 3rd-party Android keyboard or similar that logs your messages then there's nothing Meta can do about this.

hayst4ck · on Sept 23, 2022

That's still Facebook's problem, no excuses. Facebook absolutely has the power and resources to lobby google and congress. Security teams at both companies will unequivocally agree that keylogging presents an extremely grave security risk that consumers are unlikely to understand the consequences of and therefore need to be protected from.

Imagine a hapless military professional/politician downloading one.

The problem is one of alignment. Facebook wants to monetize whatsapp and wants the whatsapp data. That's why there was a mass exodus to signal in the first place. Facebook was weakening the protections of the app.

Due to the alignment problem Facebook can't advertise whatsapp as the secure and private choice because they are actively working to make it less secure and private. That's why Brian Acton quit (leaving $$$$$$$ behind) in the first place.

hellotomyrars · on Sept 23, 2022

I don’t agree that it is Facebook’s problem but I do think this is probably where a lot of data gets leaked that people don’t realize or think about.

In a perfect world sure Facebook has the power and money to do a lot of things. So do the other megacorps. They don’t do them, and you’re correct it is the misalignment of incentives to do so.

But Facebook doesn’t control what keyboard you use on your phone and if the keyboard is sending every message you type somewhere, they can’t do anything about that and they aren’t lying that they can’t read your messages.

Whether or not you believe that they do in fact harvest the message data is up to you. But certainly people using keyboards that harvest data is very plausible to me as a vector for this stuff.

hayst4ck · on Sept 23, 2022

In the other post in this thread, I link to a website that ostensibly has a method of warning for non standard keyboards. If "e2e" communication is part of a product's marketing, do you think they have a responsibility to warn when that expectation might be violated? What about warning that text sent to giphy may be used for advertisement purposes?

If I were to summarize my entire thoughts on WhatsApp, it's that it advertises security (e2e), while they only make money from violations of the security. The behavior OP expects is exactly the behavior a person would expect from this set of alignments.

If a leak is able to be monetized (even if it is google harvesting keyboard data and selling it back to FB) do you think that would be punished or rewarded?

If this very same post were for signal, I think the response we might expect is concern and investigation, not a response of defense and deflection.

There was an article several weeks ago about how a "special master" tasked with understanding what data Facebook collects on you was stonewalled because "even Facebook don't know what data Facebook collects."

https://news.ycombinator.com/item?id=32750059

"we don't want to be accountable for any data except the data that's part of the download your data":

> Facebook contended that any data not included in this set was outside the scope of the lawsuit, ignoring the vast quantities of information the company generates through inferences, outside partnerships, and other nonpublic analysis of our habits — parts of the social media site’s inner workings that are obscure to consumers. Briefly, what we think of as “Facebook” is in fact a composite of specialized programs that work together when we upload videos, share photos, or get targeted with advertising. The social network wanted to keep data storage in those nonconsumer parts of Facebook out of court.

> Facebook’s stonewalling has been revealing on its own, providing variations on the same theme: It has amassed so much data on so many billions of people and organized it so confusingly that full transparency is impossible on a technical level.

> The remarks in the hearing echo those found in an internal document leaked to Motherboard earlier this year detailing how the internal engineering dysfunction at Meta, which owns Facebook and Instagram, makes compliance with data privacy laws an impossibility.

Facebook doesn't even want to know if WhatsApp is leaking data.

thetrb · on Sept 23, 2022

That 3rd-party keyboard would also be able to log your Signal messages, so I don't get your point.

hayst4ck · on Sept 23, 2022

If the original post is true and Facebook is leaking message based data into systems that produce ads (3rd party or 1st party), they have a responsibility to diagnose and resolve the issue. Despite their responsibility to do so, they are not aligned with doing so.

Excuses like "the user did something bad" aren't productive.

A warning that the user's expectations (secure communications) do not match reality (3rd party keylogged communications) seems like the minimum level of responsibility:

https://maheshikapiumi.medium.com/allowance-of-third-party-k...

If WhatsApp derived information is being seen in advertisements, it is Facebook's responsibility. It is in Facebook's best (next quarter's profits based) interests to not be responsible.

pdntspa · on Sept 23, 2022

Are you absolutely sure that this is still the case? You say you "used" to work on it, but modus operandi for these companies is rugpulling protections like this as soon as nobody is looking

xerxesaa · on Sept 24, 2022

I feel quite confident based on first hand knowledge of code, system design, and the many, many privacy reviews we had to go through when building new features to ensure we didn't accidentally log or otherwise infer data we weren't supposed to.

WhatsApp architecture is designed with the assumption that the server could be compromised and yet such an event should not result in any message contents being revealed. Furthermore, the encryption function is designed to ratchet and rotate keys so that a leak of a key at a given point in time would not compromise past and future messages.
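The key ratcheting described above can be illustrated with a hash-based key derivation chain. This is a minimal sketch of the general technique, not WhatsApp's code: the seed value, the derivation constants, and the name `ratchet_step` are assumptions made up for illustration. It only shows why a key leaked at one point does not expose earlier messages; protecting later messages additionally requires mixing fresh key-exchange output into the chain, which is not shown here.

```python
import hashlib
import hmac
from typing import Tuple


def ratchet_step(chain_key: bytes) -> Tuple[bytes, bytes]:
    """Derive a one-time message key and the next chain key (hypothetical helper)."""
    # Two different constants give two independent outputs from the same chain key.
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return message_key, next_chain_key


# Assumed starting secret; a real protocol would get this from a key agreement.
chain_key = hashlib.sha256(b"hypothetical shared secret").digest()

message_keys = []
for _ in range(3):
    # Each message gets its own key, and the old chain key is overwritten.
    message_key, chain_key = ratchet_step(chain_key)
    message_keys.append(message_key)
```

Because HMAC-SHA256 is one-way, someone who steals `chain_key` after the third step cannot walk the chain backwards to recover the three earlier message keys.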

So yes, I have a strong sense of confidence that message contents are not exposed to Meta and, given the bar set by privacy reviews, I don't think Meta would do some backdoor workaround like scraping the contents off the device and sending an unencrypted copy. To be clear, my claims are specifically around message contents and when it comes to certain metadata (ex. the sender/receiver, the names of groups, etc) I don't recall the exact details of how they are treated.

Now, despite the fact that I've said all this and that my knowledge on the matter is fairly recent, I'm not sure I could ever say anything with absolute confidence. The code base is huge and not open source. I obviously have not seen every line of code and as you pointed out, there's always a chance some company policy changes happened without my awareness. So I would say "highly" confident but not "absolutely" confident.

bartimus · on Sept 23, 2022

What about spell-check data?

FreeHugs · on Sept 23, 2022

Just because that the messages might be sent end-to-end encrypted from Sue to Joe does not mean Meta cannot read them.

Meta has control over the app Sue uses. So they could send them to Meta unencrypted in addition to sending them to Joe in an encrypted fashion.

Or they just extract the relevant terms:

Sue->Joe: "Hello Joe, I'm so excited! We are going to have a baby! Let's call it Dingbert. You're not the father! Jim is. I hope you don't mind too much!".

Sue->Meta: "Sue will have a baby"

Insta->Sue: "Check out these cute baby clothes!"
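The scenario above can be sketched in a few lines. Nothing here is evidence that any real app does this; the keyword list, the function names, and the side-channel format are all hypothetical, and the cipher is a toy one-time pad standing in for whatever encryption a real client would use. The point is only that the message itself can stay encrypted in transit while a derived summary leaves the device separately.

```python
import json
import secrets
from typing import List, Tuple

# Hypothetical advertiser vocabulary, invented for this example.
AD_KEYWORDS = {"baby", "puzzle", "wedding"}


def extract_keywords(plaintext: str) -> List[str]:
    """Return the ad-relevant words found in a message (hypothetical helper)."""
    words = {w.strip(".,!?'\"").lower() for w in plaintext.split()}
    return sorted(words & AD_KEYWORDS)


def toy_encrypt(plaintext: str) -> Tuple[bytes, bytes]:
    """Toy one-time pad: XOR the message with a random key of equal length."""
    data = plaintext.encode("utf-8")
    key = secrets.token_bytes(len(data))
    ciphertext = bytes(a ^ b for a, b in zip(data, key))
    return key, ciphertext


def hypothetical_send(plaintext: str) -> Tuple[bytes, str]:
    """Produce the encrypted message plus a separate plaintext-derived summary."""
    # The client sees the plaintext before encrypting, so it can analyze it here.
    side_channel = json.dumps({"interests": extract_keywords(plaintext)})
    _key, ciphertext = toy_encrypt(plaintext)
    return ciphertext, side_channel


ciphertext, side_channel = hypothetical_send("Hello Joe, we are going to have a baby!")
```

In this sketch `ciphertext` is what would travel to Joe, while `side_channel` holds only the matched keyword and never the full message.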

rreyes1979 · on Sept 23, 2022

More so, my wife sent me a picture of my daughter working on a puzzle. Less than 24 hours later, her Instagram was showing ads for a store that was selling the same type of puzzle as the one my daughter was playing with. So it's not just terms but images too.

planb · on Sept 23, 2022

She probably gave Instagram access to her photo library (not unreasonable for a photo sharing app). That means the Instagram app can scan her latest pictures in the background when it's opened. I think it's more likely that the data was leaked this way.

snowwrestler · on Sept 23, 2022

In case folks don’t know this: on an iPhone you do not need to give an app access to all your photos in order to use photos in the app.

Under Privacy > Photos, you can set “Selected Photos” instead of “All Photos” on a per-app basis.

Then when you go to add a photo to the app, you first go through an iOS prompt to select the photos the app will have access to. Only then do you go through the app’s photo selection dialogue.

I have all my apps set this way (or “None”).

nailer · on Sept 23, 2022

I just did this and the UI is weird and confusing - it looks like I need to statically pick photos in the settings app, which obviously won’t work for day to day use every time I take a photo and want to publish it to instagram.

Not saying it doesn’t work like you say, just saying it doesn’t look like it does.

bombcar · on Sept 23, 2022

At least for Telegram each time you go to pick a photo to share, it offers you the chance to "add more photos visible" or you can click Manage.

I assume Instagram and friends would do the same.

I often just take the photo via Telegram instead, which automatically adds it to your photo roll and gives Telegram access to it. It works relatively well.

snowwrestler · on Sept 23, 2022

You can just hit “done” in the settings app and it will close (with no photos selected).

Then on Instagram (for example) when you go to post, you’ll get a message like “you’ve only let Instagram have partial access to your photos - Manage”. Tapping Manage will let you select photos that Instagram can access.

wil421 · on Sept 23, 2022

Glad I deleted my Meta apps and only use online FB when I need to.

The other day I noticed the yahoo mail app on iOS was reading my clipboard for no reason. I’m going to start blocking photos on most of my apps.

Melatonic · on Sept 23, 2022

Instagram is especially malicious with this - it is the only app that REQUIRES access to my microphone for me to post something. They try to do this by having a camera inside instagram (that you can record with which would obviously require mic access) but even to post stuff I have already taken (even just photos) it wants mic access. I usually temporarily give it what it wants, post, then remove again.

paavohtl · on Sept 23, 2022

Is this something that actually happens (= can anyone prove this by disassembling the app or MITMing the network traffic), or is it just unfounded paranoia?

mid-kid · on Sept 23, 2022

Considering how easy it is to implement these things without anyone noticing since it's closed source, you have to assume it is happening in any scenario where you need any decent opsec. Even in scenarios where you don't, there's been enough cases of similar things happening with well-known apps and services to be wary.

titaniczero · on Sept 23, 2022

> Considering how easy it is to implement these things without anyone noticing since it's closed source

You can reverse engineer those things and analyze your network traffic. You can’t have a client in a device controlled by the user, in this case an app, send anything to a server without anyone noticing it.

And frankly, they don’t even need it. Just with your contacts they can link you to your friends and common interests without even you having a facebook account, all you need is friends with a fb/ig account who have linked their accounts to their phones and use whatsapp.

The contacts are known to be sent to the server, and they are known to be linked to Facebook, except in the European Union, where there is a different app from WhatsApp Ireland and a different privacy policy. The version of the policy for outside the EU specifically states that it shares your contacts with Facebook. Contacts are much more valuable and much less risky than reading your messages.

mid-kid · on Sept 23, 2022

> You can reverse engineer those things and analyze your network traffic.

I frankly don't think people realize how much obfuscation of both app code and network traffic goes on under the hood. "Analyzing network traffic" isn't a sustainable option when things are encrypted and behind dozens of layers of protobuf, websockets and other fancy protocols, and get updated and change around all the time. Far from everything is introspectable http, javascript and json these days, and that applies especially to big apps like these. It's not hard to send privacy-sensitive data along with "legitimate" data like analytics at unexpected times and evade scrutiny.

Yes, there are people who dedicate themselves to reverse engineering apps like this, but they're few and far between, and most of them focus on either the easy fish or security vulns. Considering nobody's building public documentation on the protocols of these apps, I'll have to assume it's hard enough and changes often enough to not be worth the time of people without special monetary interests.

I agree with the rest of your assessment, there's way less "obviously malicious" ways to exfiltrate data about users than literally uploading users' pictures, since for example whatsapp stored unencrypted backups on google drive until very recently, among other things. I'm just trying to shed a light on the fact that apps like this have a lot of ways to accomplish this without raising too many eyebrows.

dvtkrlbs · on Sept 23, 2022

It should be easy to test, since iOS has a feature called App Privacy Report that lists network and permission access. And no, when you just open the Instagram app it does not access photos. It only does so when you open the add-to-story page or click on the new post icon.

planb · on Sept 26, 2022

Thanks for making me aware of this! You're right!

Shish2k · on Sept 23, 2022

> Considering how easy it is to implement these things without anyone noticing since it's closed source

I see you’ve never heard of Jane Manchun Wong...

campital · on Sept 23, 2022

I imagine the reputational and potential legal consequences would be fairly severe if this sort of privacy invasion were discovered (either by employee leak or reverse engineering). Seems unlikely Meta would take a risk like this.

nerdponx · on Sept 23, 2022

Back when deep learning was first hitting "mainstream" for object recognition in images, I recall reading that Facebook was using it to look for brand logos and other signs of using a particular product, in your uploaded photos.

Turns out they were also building a database of everyone's face so they could build shadow profiles...

whywhywhywhy · on Sept 23, 2022

How did she buy the puzzle to begin with?

lm28469 · on Sept 23, 2022

> my wife sent me a picture of my daughter working on a puzzle.

> her Instagram was showing ads for a store that was selling the same type of puzzle

How did she take the pic?

giarc · on Sept 23, 2022

I think that's an important question. Did the user take the photo within the app, thereby skipping the camera roll, or did they take the photo, then upload it to WhatsApp from the camera roll? If the latter, then as someone else said, it could be that Instagram had access to the camera roll and decided to serve ads based upon the puzzle.

netsharc · on Sept 23, 2022

I have a suspicion as well that this is what they're doing: before the message is encrypted and sent, the app (on your phone) does analysis and picks out keywords relevant for advertising. So they can claim and be technically correct that they are not reading your messages. Although if their algorithm is doing it on your phone, is it... reading?

Or they can say, technically it wasn't a message before it was sent. The dictionary definition[1] even mentions "send".

[1] https://www.oxfordlearnersdictionaries.com/definition/englis...

Melatonic · on Sept 23, 2022

This is definitely the most likely scenario in my opinion

jtbayly · on Sept 23, 2022

> Just because that the messages might be sent end-to-end encrypted from Sue to Joe does not mean Meta cannot read them.

No, that's precisely what End-to-End encryption means.

piva00 · on Sept 23, 2022

It means that only if the end-to-end encryption is strictly to one receiver. When it's touted as a feature without explicitly stating that "all messages are sent only e2e encrypted and only to your receiver", we can't assume only the receiver is getting the message. All traffic between people might be E2E encrypted using their own keys, and still nothing stops Meta from sending a different encrypted payload to their own servers with a key they have access to.

Facebook loves to use newspeak, wouldn't surprise me if they applied newspeak to what "end-to-end encryption" means.

rob74 · on Sept 23, 2022

So it's end-to-end encrypted, but your data is sent to some "ends" you didn't think it would be sent to? Well, if that's not a good reason to end your usage of WhatsApp, then I don't know what is...

neilalexander · on Sept 23, 2022

Meta own the proprietary code running at either end of the encrypted pipe. Of course they can.

omgomgomgomg · on Sept 23, 2022

They can decrypt if someone enables backups, so I see no reason they could not read them indeed.

Signal might be the only app unable to read, but even that, I would not trust.

marcus0x62 · on Sept 23, 2022

How would you propose Signal -- or any app for that matter that provides end to end encryption -- encrypt the messages in the first place if they don't have access to the plaintext at some point?

m0RRSIYB0Zq8MgL · on Sept 23, 2022

End-to-End means that it can't be read in the middle. It does not mean it can't be read by proprietary clients on either end.

philsnow · on Sept 23, 2022

Until there are cybernetic implants, the "ends" are the app running on your phones, which they control.

The quandary of what one allows to run on those implants sounds like a chilling sci-fi novel (chilling not because "but FAANG could read your thoughts!" but because people would absolutely still get them installed).

rr888 · on Sept 23, 2022

End-to-End is about the networking, not the end points.

https://en.wikipedia.org/wiki/End-to-end_encryption#Endpoint...

pfortuny · on Sept 23, 2022

That is the technical definition.

spoiler · on Sept 23, 2022

So you're nit-picking over the phrasing of the sentence, but should instead focus on the spirit/meaning behind it.

It's illustrated in their example below that if you say you're having a baby, Meta can send some type of distilled ad keywords to its servers (eg `[mother, baby]` if it knows the user is a woman based on their name/profile, but probably more sophisticated than that). The message you sent is still technically end-to-end encrypted, though.

jtbayly · on Sept 23, 2022

I addressed this just below:

https://news.ycombinator.com/item?id=32951417

amelius · on Sept 23, 2022

Google can in theory read what is on your screen (assuming you use Android) regardless what app with what encryption you use.

tom-thistime · on Sept 23, 2022

Oh, come on. It's called "end to end" but it isn't. Meta has to read them to provide the service. This is not a new revelation.

rreyes1979 · on Sept 23, 2022

I think they are extracting terms. Some of the messages generated ads that were related to a term but not really about the conversation.

rhn_mk1 · on Sept 23, 2022

> Just because that the messages might be sent end-to-end encrypted from Sue to Joe does not mean Meta cannot read them.

I think it does, actually: no one except them can read them. If someone else can, then by definition it's not end-to-end encryption.

From https://www.definitions.net/definition/End-To-End%20Encrypti...

> End-to-end encryption (E2EE) is a system of communication where only the communicating users can read the messages.

viraptor · on Sept 23, 2022

The conversations being e2ee do not affect the app itself from acting on contents. By definition the app needs to know the contents to display it, but it can also update your ad profile. It doesn't even need to send the whole message to meta, just the keywords triggered, or a preprocessed vector defining your interests.

E2ee means only the messages themselves can't be intercepted and read. But if anyone can actually prove fb acting on message contents, I suspect the EU banhammer would be interested.

rhn_mk1 · on Sept 23, 2022

The application processing the message for the purpose of displaying it is clear.

But if the message is copied, read, analyzed and sent further on behalf of a third party before encryption, then that puts that third party in the middle between the sender and the recipient. A man in the middle directly undermines e2ee: "no one else reads your message".

It doesn't matter if the third party made the messaging app or not. What matters is whether information in your messages is accessible to anyone besides you and the recipient.

ale42 · on Sept 23, 2022

E2EE doesn't prevent the app itself from analyzing messages locally, and sending updated interest profiles to meta... which can be a vector of weights or whatever thing they might be using to know what ads to show. If the logic is in the app, the message doesn't leave the app and E2EE is preserved.

This said, analyzing messages for the purpose of ad display is creepy, whatever the way it is done.

rhn_mk1 · on Sept 23, 2022

E2EE most certainly does exclude analyzing messages anywhere for a third party.

Notice that "ends" in "end-to-end" are users, not applications. When an application forwards things to an entity, then that entity becomes an "end" of the conversation. When it displays a message to the user, the way the user wants, then the user is the end. When it processes the message and delivers results to Facebook, the way Facebook wants it, then the application makes Facebook the "third end".

In such scenario, Facebook had intercepted the message, just chose to forward only some extracted information (which may or may not be enough to reconstruct the original). This does not match the definition of "end-to-end encryption".

viraptor · on Sept 23, 2022

> Notice that "ends" in "end-to-end" are users, not applications.

That's not right. First, it's technically impossible, since users can't do encryption themselves - it's the application that does it. That's where the e2ee boundary is.

Second, we've got e2ee communication between non-user entities as well. There are servers using, for example, ZeroTier, which communicate e2ee through other nodes. Third, applications can definitely send the data to other parties automatically. WhatsApp executing backups as configured does not make it not e2ee.

rhn_mk1 · on Sept 24, 2022

It's not a distinction between softwares, it's a distinction between agents. I.e. who the software works for.

xuki · on Sept 23, 2022

WhatsApp can't read the message on their servers but they can read it at clients, otherwise they cannot display the messages for users. Likewise, Apple/Google can read them too because they have to in order to render the texts.

jtbayly · on Sept 23, 2022

This is just redefining terms, then.

We know the app decrypts it to display it. But if the app decrypts it to send it to the parent company, then it is by definition not end to end encrypted anymore.

If the app decrypts it, analyzes it and sends information about the message to the parent company, then the same thing is happening. The parent company is reading the message, INSTEAD of E2E encrypting it. It doesn't matter whether that reading happens on device or on the company's servers. E2E means the company is not reading it.

propogandist · on Sept 23, 2022

there was a time when “Unlimited” meant without any limits, but US cell carriers have redefined the term to support their business model.

It’s possible that this data harvesting ad company has redefined what E2E means (to them) to advance their business interests.

charcircuit · on Sept 24, 2022

>then it is by definition not end to end encrypted anymore.

HTTPS is E2E between the client and the server.

kadotus · on Sept 23, 2022

But the problem arises, I think, when they say they can't read them: "WhatsApp's end-to-end encryption is used when you chat with another person using WhatsApp Messenger. End-to-end encryption ensures only you and the person you're communicating with can read or listen to what is sent, and nobody in between, not even WhatsApp." https://faq.whatsapp.com/general/security-and-privacy/end-to...