
I have family in the industry (digital ads, bidding for ads) and some of the stuff they have explained to me is pretty gnarly. One cousin was explaining an idea he had for a new business and it was essentially just about collecting as much personal data as possible. Imagine a simple DB table with 'user_id' as primary key. Then you just start adding data on top of data with new columns or joins. After a while you have hundreds (thousands) of datapoints per person, then you start selling that data.
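To make the table idea concrete, here is a rough sketch using Python's built-in sqlite3 module. The table names, column names, and values (`profile`, `interests`, `zip_code`, `device`) are made up for illustration and are not taken from any real product.

```python
import sqlite3

# In-memory database; every name and value below is hypothetical.
conn = sqlite3.connect(":memory:")

# Start with the simple table keyed by user_id.
conn.execute("CREATE TABLE profile (user_id INTEGER PRIMARY KEY, zip_code TEXT)")
conn.execute("INSERT INTO profile VALUES (1, '98101')")

# "Adding data on top of data", option 1: a new column.
conn.execute("ALTER TABLE profile ADD COLUMN device TEXT")
conn.execute("UPDATE profile SET device = 'android' WHERE user_id = 1")

# Option 2: a separate table from another source, joined on user_id.
conn.execute("CREATE TABLE interests (user_id INTEGER, interest TEXT)")
conn.executemany(
    "INSERT INTO interests VALUES (?, ?)",
    [(1, "mattresses"), (1, "concerts")],
)

rows = conn.execute(
    """
    SELECT p.user_id, p.zip_code, p.device, i.interest
    FROM profile AS p
    LEFT JOIN interests AS i ON i.user_id = p.user_id
    ORDER BY i.interest
    """
).fetchall()

for row in rows:
    print(row)
```

Each new source is just another column or another join on the same key, which is why the number of datapoints per person grows so quickly.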

So I want to help my family out b/c I have the tech skills and building some prototypes would be easy. But I'm also not sure I want to feed the advertising beast. Shit is whack right now, man: they listen to you talking to your wife through your phone about buying a mattress, going to a concert, taking a trip somewhere. Next time you open the browser you are bombarded with relevant ads. Too creepy, no thanks!




> they listen to you talking to your wife through your phone about buying a mattress, going to a concert, taking a trip somewhere. Next time you open the browser you are bombarded with relevant ads.

I've heard this from non-tech family/friends.

Is this actually happening though? If so, who is doing it and how?

Edit: I'm referring specifically to ad tech that targets ads based on overheard conversations.


Without doxing myself by giving too much info about myself away, I can say with 99% certainty that this is not happening with Facebook or Apple.

The remaining 1% is if policy changed in the past few years (read: < 3), or if there was some top-secret team that few knew about working on this.

In general, these cases are coincidences based upon one or some combination of the following:

1) Search history that became "subconscious" and the person forgot. Anecdotally, check your history and see if you remember _everything_ you've ever searched for...you may be surprised. Android especially saves more than you might think.

2) Related searches or characteristics relevant to the market. Example: during Black Friday in the 'States, many people are searching for TVs, so Amazon or Walmart will probably serve you ads for that in anticipation. Let's say with 10% odds, 1/10 people will see this ad and think "huh - I was just telling my wife we need a new TV. How did they know?" Law of large numbers and such...

3) The person installed malware that IS collecting sound samples, feeding that data to an ad-server, and actually performing this malicious behavior - but it isn't necessarily FB/Apple/etc. You tend to see this more on Android, as the Play Store has too many apps with malware like this, but it can happen on iOS too if the user isn't careful about privacy permissions.
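The arithmetic behind item 2 can be sketched in a few lines. The 10% figure comes from the example above; the audience sizes (one million viewers, a group of 50) are assumptions picked only to show the scale.

```python
def people_spooked(viewers: int, p_just_discussed: float) -> float:
    """Expected number of viewers who happened to discuss the product recently."""
    return viewers * p_just_discussed


def chance_someone_is_spooked(viewers: int, p_just_discussed: float) -> float:
    """Probability that at least one viewer sees the ad as an eerie match."""
    return 1 - (1 - p_just_discussed) ** viewers


# Assumed audience of one million people shown the same TV ad.
spooked = people_spooked(1_000_000, 0.10)

# Assumed group of 50 acquaintances: odds that at least one has a story to tell.
at_least_one = chance_someone_is_spooked(50, 0.10)

print(f"expected 'how did they know?' moments: {spooked:,.0f}")
print(f"chance at least one of 50 people is spooked: {at_least_one:.3f}")
```

No targeting is involved at all here, yet a large number of people still end up with a "they must be listening" anecdote.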

Hope that long response helps answer your question =)


Your item (3) definitely seems to match the current situation in the wild - FAANG companies are genuinely invested in maintaining their brand and reputation and so I do completely agree that they wouldn't tend to collect seemingly shady or 'underground' data themselves.

I think the definition of malware is blurring though. On some websites when I view the list of advertisers & third parties listed in consent dialogs, it's literally hundreds of them. There's no way to tell how many of those are equally motivated to protect user data and be on best behaviour.

It's also easy to forget that many users simply don't have the same understanding or reasoning about phone permissions that the audience on HN does.

I worry a great deal that we're walking into a world where older generations in particular are exploited by technology via a bombardment of settings and dialog boxes -- and that we're veering away from the real promise of technology, which is to provide clear, simple, fast and effective life improvements.


> I think the definition of malware is blurring though. On some websites when I view the list of advertisers & third parties listed in consent dialogs, it's literally hundreds of them. There's no way to tell how many of those are equally motivated to protect user data and be on best behaviour.

I worked on a side project a few years ago for price checking stuff. When I was working on the Toys R Us integration I noticed their site loaded insanely slowly. So I dug deep and found the site made dozens of hits to random URLs. After doing a bit of spelunking - whois lookups, etc - it turned out the webpage was having my browser contact pretty much every major tech company (or a subsidiary of one) you could think of: Oracle, IBM, Cisco, Facebook, etc. etc. etc.

It made me realize how utterly insane the web has gotten.


> I can say with 99% certainty that this is not happening with Facebook or Apple.

> The person installed malware that IS collecting sound samples

The Facebook app uses certificates that circumvent permissions. Any analytics package installed in the Facebook app could actually be listening and putting that on their ad server, making the Facebook app the malware that is collecting sound samples.

Facebook Inc. can accurately say "we aren't collecting" and may actually have no knowledge, and everyone continues to be misdirected.

Ironically, whether it was a package in the Facebook app, or from any other service on the phone, the ad-server is sharing the same fingerprinting across to all the other apps and companies including Facebook.


If I remember right, there was a 'scandal' last year where a journalist worked with transcribing voice data from Google Assistant recordings. They found out that it activated plenty of times without the key phrase (it is easy to see why that happens), and the recordings did occasionally contain PII.

Probably by sheer 'coincidence', when Google was being investigated, Apple, Samsung and a few others shut down similar initiatives.

And with how hostile and profitable the advertising world is, I'll stay with a 'guilty until proven innocent' mindset.


Did they also say that it was used for advertising? I don’t think it’s much of a stretch to think they would, but it doesn’t sound like a smoking gun.


You are saying, "Trust me I work with apple and facebook"

No, no trust, none at all, zero. This is not at all in any way personal and you're anon so it couldn't be.

Do you understand how that works and why? Pathological lying has taken place in the world's most successful bait and switch. Nobody agreed to this. Not a single person agreed to a surveillance state and the creation of a turnkey fascist enforcement solution. The Stasi couldn't have dreamed of having so much power.

Zero trust. Less than zero. Facebook and Apple (and others) have now been caught and are desperately trying to pretend it's all ok. It isn't. Not even close.

We now have to assume the content is lies, we don't have a choice. The fact you need to be anonymous in claiming everything is really ok is telling.


Do you not trust anything on Wikipedia? Of course you can't ascertain 100% whether a comment is true or not but going around saying everything is a lie because Ad companies lie to you doesn't seem very helpful. OP wasn't saying trust me I work at Apple/Facebook they were saying: I had pretty good visibility into internal projects and the codebase and from what I saw that type of tracking wasn't going on. Of course you can only take that type of comment at face value but to assume it's a lie seems silly.

Not believing a PR/damage control statement from an Ad company on the other hand is probably the right thing. Now at some point Ad companies may start doing huge disinformation campaigns on social media with paid commenters but that doesn't seem to be the case yet.


You trust Google anytime you plan your way to the airport with maps.

“Zero trust” sounds cool because cynicism is often confused with smartitude. But it is impossible to actually verify even a tiny slice of the information you consume and rely on every day.


Conflating reading a map with a claim of "nothing to see here, there is no crime, trust me." is utterly ludicrous. But I think you know that.


Then I guess when you say “zero trust” you actually mean “some trust, but not enough for that”.


If you wish to deliberately play silly semantic games to obfuscate the obvious and intended meaning there is no prospect of meaningful exchange of ideas. This conversation is well beyond sense in that if you have a point I have no idea at all what it could be regarding whether or not facebrick et al are criminal enterprises or you should not trust at all those who claim, anonymously, without any evidence at all one way or another. But especially when those claims are that facebrick aren't misbehaving terribly. Especially based on hard won experience and the mess we are now in. Especially because we know they are desperate not to be seen that way and spend money to that effect.

But yeah, maybe it doesn't involve trust at the level of whether the letter s is actually q. Sure.


>> You are saying, "Trust me I work with apple and facebook"

The exact turn of phrase was "Facebook or Apple". Formally speaking, you cannot conclude that they are working for either, given they phrased it as a disjunction, much less that they are working for both.

So maybe let's not jump to conclusions about other commenters? Who knows what the GP meant by "doxing myself"?

(Note: formally, "A or B" implies neither A nor B).
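For anyone who wants to check the note mechanically, here is a small brute-force truth-table sketch. The helper name `entails` is made up for this example.

```python
from itertools import product


def entails(premise, conclusion) -> bool:
    """True if the conclusion holds in every truth assignment where the premise holds."""
    return all(
        conclusion(a, b)
        for a, b in product([False, True], repeat=2)
        if premise(a, b)
    )


def a_or_b(a: bool, b: bool) -> bool:
    return a or b


or_entails_a = entails(a_or_b, lambda a, b: a)
or_entails_b = entails(a_or_b, lambda a, b: b)
or_entails_both = entails(a_or_b, lambda a, b: a and b)
a_entails_or = entails(lambda a, b: a, a_or_b)

print(or_entails_a, or_entails_b, or_entails_both, a_entails_or)
```

The first three checks fail because the assignments where only one of A and B is true act as counterexamples; only the reverse direction (A implies "A or B") holds.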


FFS

the claim is "i have inside info at apple and facebrick therefore trust me". The pedantic difference is utterly meaningless here but you know that.

The answer is you absolutely refuse to trust an anonymous person based on this claim. The end. Period. Yeah? Yes.

Interesting all the pedantic responses that have zero bearing on that, including yours.

Trust was asked for. The only sane response is to refuse, publicly and loudly.


Even if "trust no one" is good advice, it's certainly not evidence that Facebook or Apple listen to audio from your phone and use that to target advertisements.


Verify using evidence. Especially when dealing with the claims of an industry of pathological liars.

Especially verify if there are zero consequences for someone deliberately and falsely making a claim that this time there is no deception.

Deliberately false and misleading claims by facebrick, goog, apple and the entire internet advertising con-job is how we got to this point, remember. So maybe don't just decide to believe blindly when that has been tried before with the outcome we have.

The _denial_ is the thing that is not evidence. Whether it is happening or not, the denial here is not worth the electrons used to deliver it.

I make no claims at all about what /is/ happening beyond all these companies having zero credibility given their incredible bait and switch to produce an outcome that literally nobody agreed to.


I don’t really trust them more than I can throw them, however I’m inclined to believe in this case given the practicality of it. Capturing and parsing all that audio is expensive and would be low signal to noise. Additionally, many of these companies are sitting on data that is significantly more valuable and relevant.

So purely from a selfish business perspective, which is how I assume they make decisions, why would they implement this? And that is to say nothing of the PR risks.

What do you have leading you to believe it’s happening?

Or are you simply saying we should not implicitly trust the anon commenter? If so, I agree fully.


> Verify using evidence. Especially when dealing with the claims of an industry of pathological liars.

Verify what, exactly? Verify literally every claim someone else makes that Facebook denies? There's simply no smoking gun to indicate that they're listening to phone audio. It would be extremely surprising if security researchers had not discovered them secretly doing this, or that a concerned employee wouldn't have leaked it to the press.

I don't trust these companies either, but that doesn't mean I believe any and all negative claims made about them, particularly claims that don't have any evidence supporting them other than the occasional well-targeted advertisement.


So now flip it and decide whether you trust positive claims that facebrick are not doing whatever.

That is what we are discussing, here, now.

I don't trust those positive claims that facebook are not misbehaving. If you do, I can't help you.

Is that evidence of their nuclear weapons program? Obviously and clearly not. I have never here put forward evidence of their misbehaving. That continues to come regularly. Whether it is this or something else, that more will come is a reasonable bet.

Should you trust them or anyone who says without evidence "we are doing nothing wrong"? That is being addressed here.

Do you? Really?

So much pushback for taking exception to an anonymous defence with no evidence. Surely taking exception to anonymous claims with no evidence because "trust me" is what every single one of us should do. Especially when we've been burned so very, very hard.

Until there are real consequences for telling lies, trusting facebook denials is naive in the extreme.

Fool me a 56th time what am I? An owner of a facebrick account.


Re: #2, it's also possible that an ad you saw and dismissed prompted you to start thinking about replacing your TV, and then a later ad from the same campaign seems like an odd coincidence.


I worked at Facebook for four years (three of which on the iOS app) and it's implausible to me that they're using the microphone to listen into people’s conversations, at least on the flagship apps (Messenger, Facebook, Instagram).

They'd have to be using some very sophisticated techniques to hide internally that they're doing it, given that engineers can all see the entire codebase. I was pretty intimately familiar with every bit of the build process; it'd have been hard to miss some injection of secret code. Also, the benefit would be minuscule, and the threat to their reputation if caught (given that Zuckerberg has very explicitly claimed not to be doing this) would be huge.

I am almost certain that it's not happening on Android either, but I'm not as familiar with that codebase, so I won't claim to know first hand.


I recently read an article by Nassim Nicholas Taleb: https://medium.com/incerto/the-intellectual-yet-idiot-13211e...

Here is an excellent description of the Big Tech employee today: "but their main skill is capacity to pass exams written by people like them".

Yes, Facebook is like a cult. You can verify this by asking any Facebook employee their view on privacy. You will usually get a ramble which will look exactly like the other passage from that essay:

"that class of paternalistic semi-intellectual experts.... who are telling the rest of us 1) what to do, 2) what to eat, 3) how to speak, 4) how to think… and 5) who to vote for."

And I will add 6) why privacy does not matter.

My point is: I think this person is lying. Now, if only Facebook open sourced its entire codebase, I would be willing to take back my view. See the problem here? It will be back to square one.

"Trust us. We know we are right. And we must be right because we are successful.."


Sure, facebook probably doesn't listen to your conversations directly, but I can believe some random free to play game does, and through various data brokers that data can end up with people who advertise on facebook.


> the threat to their reputation if caught (given that Zuckerberg has very explicitly claimed not to be doing this) would be huge.

"We're sorry, we can do better." ad nauseam.


What does that prove though? Those very same engineers are accessory to all the crap that we do know Facebook has pulled in the past and likely is still pulling in the present. That code doesn't have to be secret at all.


There are quite a few ex-Facebook people who have come out with scathing criticisms of the company, and plenty of Google's internal political fights have spilled out into the media.

I'm fairly certain that if this has been happening for years, somebody would have leaked confirmation or publicly admitted to it by now.


>Is this actually happening though? If so, who is doing it and how?

It's possible but a lot of the reports are clearly coincidences. Take this example of someone watching 'Catch Me If You Can' 2 days ago, and then yesterday seeing a HN posting about the main character and finding it hard to accept as coincidental.[0]

0. https://news.ycombinator.com/item?id=22048337


I think, in this case, you underestimate the probability of that happening randomly. The birthday problem [0] can explain that.

Imagine you have conversations with your spouse about hundreds of things. And then you see ads about hundreds of things. Sometimes they happen to be about the same things. That's something out of ordinary for you, so you remember that. You are spooked by that, and you might even tweet about it, or tell your friends about it. You don't tweet every time you don't see an ad about the thing you talked about. But the probability of that happening several times over the course of a year is quite high.

[0] https://en.wikipedia.org/wiki/Birthday_problem
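A back-of-the-envelope version of that argument, in the same spirit as the birthday problem. All the numbers (5 topics discussed and 20 ads seen per day, 10,000 possible ad topics) are assumptions, and the model naively treats ads as independent and uniformly random, which real ads are not.

```python
def p_match_in_a_day(topics_discussed: int, ads_seen: int, topic_universe: int) -> float:
    """Chance that at least one ad seen today matches something discussed today."""
    p_single_ad_misses = 1 - topics_discussed / topic_universe
    return 1 - p_single_ad_misses ** ads_seen


def p_match_within(days: int, p_day: float) -> float:
    """Chance of at least one spooky match over a number of independent days."""
    return 1 - (1 - p_day) ** days


# Assumed daily habits; none of these figures come from real data.
p_day = p_match_in_a_day(topics_discussed=5, ads_seen=20, topic_universe=10_000)
p_year = p_match_within(365, p_day)

print(f"chance of a match on any given day: {p_day:.4f}")
print(f"chance of at least one match in a year: {p_year:.3f}")
```

Under these assumptions a match on any single day is rare, but at least one match over a year is close to certain, and that one is the only one you remember.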


I think it’s possible that some minor actor is doing it (which I suppose would mean it’s almost certain that someone does it), but it’s likely not done at scale or at a big actor such as Facebook, Samsung or Google.

Most people who are creeped out cite things like “I talked to my friend about traveling to Aruba yesterday and today on Facebook I have ads for trips to Aruba!”

First of all you could have had those ads all year but only noticed after the discussion. That’s the simplest explanation.

But also, if this was indeed a friend (or someone who was ever e.g. the recipient of the same email as you, or has friends in common with you etc) then that person perhaps shared that they were in Aruba recently. Since the two of you were in the same location when you talked, a clever advertiser could show you ads for products or services that people they think you interacted with have already bought - such as a hotel stay in Aruba.

I’d really like to hear if anyone has first hand experience with spying like this - and the fact that no one has yet come forward and confirmed it suggests it must be rare.


Surely it couldn't stay a secret in the industry. Especially as plenty of people spoke up against other questionable practices.

Maybe it's like the dieselgate thing where various engineers may not see the big picture, but I honestly do not believe this is the case. I have a strong feeling that for the vast majority of people, ads have a significant effect on purchases. Perhaps you were not thinking about traveling to Aruba, but after a couple of your friends went there and talked to you about it, and you got a bunch of ads showing beautiful Aruba, you decided to go. They don't need to listen to you to "get you".


Uber spies on passengers using the microphone, I’m willing to bet. I was in an Uber, having a conversation with the driver about wine tasting in Woodenville. When I got home, I had an email from Uber advertising wine tasting trips to Woodenville. The email was sent four minutes after my Uber ride. Coincidence?

I found a forum for Uber drivers where someone asked, “Why does the driver app ask to access the microphone?”


Do you know how expensive (and shit) voice-to-text is? Go check out AWS Transcribe prices. Try it out too, see how much nonsense it will spew out.

It's funny as I've literally just done a project on this to automatically classify phone calls. In reality voice recognition is only good at set phrases, most of Google Assistant and Alexa is smoke and mirrors.

It wouldn't even be able to recognise "wine tasting in Woodenville", let alone be economically viable to process every voice-like sound in a taxi and make money off adverts.

It's just the Baader-Meinhof phenomenon, not Uber recording you.


Did it occur to you that maybe the driver brought up Woodenville because he got the same email before the ride?


I've never heard of Woodenville, but I'd be willing to bet that it's a popular place to go wine-tasting, and that would explain both your conversation and the email advertisement. The exact timing could of course be a coincidence, and not one that seems at all unlikely.


No, they don't, but due to false allegations from passengers (and to a lesser extent drivers) they plan on starting to record the conversations.


It's not proven yet I guess? But then again, the big companies keep saying they don't record/listen to voice commands for assistants either.

But the reports of third party companies suddenly having access to this data (for quality improvement purposes) tells another story.

Also my YouTube keeps suggesting videos for any show my phone is near when it plays.


They don't need to listen to your voice for that. The show can have a signal hidden in its sound track that you can't hear, but that the apps on your phone can. Then they can report home that you are watching such-and-such show. No idea what happens after that but I guess google apps share data somehow.

There was an article about this that I read a while back, I'll try to find it if I can.

Edit: here, found something:

https://arstechnica.com/tech-policy/2015/11/beware-of-ads-th...

The ultrasonic pitches are embedded into TV commercials or are played when a user encounters an ad displayed in a computer browser. While the sound can't be heard by the human ear, nearby tablets and smartphones can detect it. When they do, browser cookies can now pair a single user to multiple devices and keep track of what TV commercials the person sees, how long the person watches the ads, and whether the person acts on the ads by doing a Web search or buying a product.
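To give a feel for why this is much cheaper than speech recognition, here is a rough sketch of detecting a single known tone with the Goertzel algorithm. The 18 kHz beacon frequency, the 44.1 kHz sample rate, and the synthetic signals are all assumptions for illustration; the article does not describe the actual detection method used.

```python
import math


def goertzel_power(samples, sample_rate, target_hz):
    """Signal power at one frequency bin, computed with the Goertzel recurrence."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)  # nearest DFT bin to the target
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def tone(freq_hz, sample_rate, n, amplitude=1.0):
    """Synthetic sine wave standing in for recorded audio."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


SAMPLE_RATE = 44_100   # assumed microphone sample rate
N = 4_410              # 100 ms of audio
BEACON_HZ = 18_000     # assumed near-ultrasonic beacon frequency

# Audible content only, then the same audio with a faint beacon mixed in.
audible = tone(440, SAMPLE_RATE, N)
with_beacon = [a + b for a, b in zip(audible, tone(BEACON_HZ, SAMPLE_RATE, N, 0.2))]

quiet = goertzel_power(audible, SAMPLE_RATE, BEACON_HZ)
loud = goertzel_power(with_beacon, SAMPLE_RATE, BEACON_HZ)
beacon_detected = loud > 1000 * max(quiet, 1e-9)

print(f"power without beacon: {quiet:.6f}")
print(f"power with beacon:    {loud:.1f}")
print(f"beacon detected: {beacon_detected}")
```

Checking one frequency bin is a handful of multiplications per sample, so an app does not need to understand anything that is being said to recognise a beacon.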

Edit again: to be fair, not even Facebook and Google have the tech to match the shows you watch with ads just by what your phone hears you say. Speech recognition still doesn't work _that_ well.


Of course they don't record and listen, that would be slow and take huge stores of data. They save transcripts and read/parse them instead. Much faster and less data to store.


> Is this actually happening though? If so, who is doing it and how?

I think there's a lot of anecdotal evidence that would seem to confirm this. It might seem like they're listening, but it might just be the real-time data collection taking place. If you have an Alexa in your house, it's always on and listening.

I have a neighbor who swears that when he and his wife are talking about their grocery list, they'll get ads on their phones for the stuff they need. They even tried on purpose to test it out and were talking about getting blinds (they didn't need new blinds) and twenty minutes later they opened Google on their phone and on their desktop PC and they were being served ads for blinds.

I had another friend who did the same thing and started talking about how he hated he had to constantly shovel all the snow in his driveway and within minutes was getting ads on his laptop for snow removal services.

People will swear Alexa doesn't do this, but the number of instances I've heard of makes me think they can't all be coincidental.


I have experienced situations in the past with Android which make me believe that either Android itself or an application on Android phones is intercepting the plain text values of notifications.

I've sent messages to friends through Signal about specific things which then started showing up in their Google ads the next day. After digging into it we found the recommendation to set Signal to no longer preview the messages via notification. Since changing that there have been no further occurrences.


I don't know if mainstream apps are doing that, but I don't see why someone couldn't create a "trojan horse" app that requires the microphone and starts monitoring and sending back recordings if it detects a conversation. It would probably be spotted eventually, though, if it got any popularity.

Or, perhaps microphones in public that try to tie overheard conversations with beacon or location data to find out who it heard.


https://www.popularmechanics.com/technology/security/a145332...

Apparently these apps don't record human speech, but the capability's there.

https://www.latimes.com/business/la-fi-cameras-grocery-store...

Not quite targeting based on conversations but still creepy.

I think the targeting based on voice conversations is maybe a bit overblown, but text conversations through Messenger or WhatsApp, I have no doubt about. There's been more than one time I've started getting targeted ads based on text conversations I've had through those apps.

But, I also get targeted ads based on my emails, things I search for on Reddit, YouTube videos I watch or search for, websites I browse, local ones based on my location, and a myriad of other things that are nearly as bad as my conversations being recorded.


It's all anecdotal and so far I don't think anyone has actually found phones etc exporting conversation keywords or anything like that. That being said it's happened a few times that I've been in the car with my girlfriend talking about something, won't search anything about it and we'll get ads about it in the next couple days... The one time that stood out was talking about babies and getting ads about childcare, baby products, etc way more over the next couple days.


No, although Google and/or Alexa will build up a profile of your interests based on things you ask/search for through them - but this is the same as with any search engine.


I partially suspect Facebook since a lot of people I know mindlessly let Facebook Messenger send / read / receive texts. If anybody has evidence to the contrary feel free to let me know. I also suspect anybody using Google Voice is giving up their secrets as well.


No, it's not.


I used to work in the space. Assembling the DB table you described in a meaningful way is surprisingly hard to do, because you would need to attach confidence intervals to everything you tag onto that user_id. Reality tends to be complex and the data is often wrong, and there are very few actually good signals. That's what makes Facebook such a powerhouse.
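A minimal sketch of what that looks like in practice. For simplicity it stores a single confidence score per tag rather than a full interval, and the class names, keys, sources, and scores are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Signal:
    value: str
    confidence: float  # 0.0 to 1.0; a simplification of a real confidence interval
    source: str


@dataclass
class Profile:
    user_id: int
    signals: dict[str, Signal] = field(default_factory=dict)

    def tag(self, key: str, signal: Signal) -> None:
        """Keep a new signal only if it beats what is already stored for that key."""
        current = self.signals.get(key)
        if current is None or signal.confidence > current.confidence:
            self.signals[key] = signal

    def trusted(self, threshold: float) -> dict[str, str]:
        """Return only the tags confident enough to act on."""
        return {k: s.value for k, s in self.signals.items() if s.confidence >= threshold}


profile = Profile(user_id=42)
profile.tag("age_band", Signal("25-34", 0.4, "inferred from browsing"))
profile.tag("age_band", Signal("35-44", 0.9, "self-reported"))
profile.tag("owns_pet", Signal("yes", 0.3, "single purchase"))

usable = profile.trusted(threshold=0.8)
print(usable)
```

Most of the hundreds of columns end up looking like the `owns_pet` guess: present in the table, but too weak to rely on.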

I eventually left that space for personal reasons, but with distance and time I've resolved that I no longer want to touch that space, no matter the startup idea. I would sooner start developing software to block or push noise into those systems. The digital ad/marketing industry is driven by a lot of people who care about anything but the health of our society or the actual consumers.


1: Your family have missed the boat - this has been done for the last 5 years. It isn't new.

2: It isn't as easy as you (or they) think. A simple DB isn't going to cope with the real-time needs of the ad industry.


I was at a company doing this in 2004.


Five years? More like the last 20+.


See also companies/products such as Anonos BigPrivacy[0] which claim to provide a software solution which somehow still allows you to perform analytics on user data while magically ticking all privacy compliance laws for you.

Pretty hard to interrogate without actually seeing the technical details / implementation, but I strongly suspect this is not really following the intent of regulation (but instead playing a shell game to extract revenue and allow players to navigate through the field for a while until it all unravels).

[0] - https://www.anonos.com/company


I have fam in adtech and the running joke is "haha, we're the ones doing all the evil stuff you wouldn't believe even if we told you, HHOS."


Help your family out, make good money from it. The worse it gets, the better. When the shit hits the fan, nobody needs it to come into the headlight with a bang, we want it to become visible with a hypernova.


The Trump approach, for when you actually need shit to hit the fan.



