Tool beeps every time data is sent to Google (twitter.com/bert_hu_bert)
338 points by MasterYoda on Aug 27, 2022 | 106 comments



Dupe: https://news.ycombinator.com/item?id=32549604

"Audible feedback on just how much your browsing feeds into Google" (290 points | 5 days ago | 205 comments)


>However, it appears the Google services set includes the cloud IP addresses, so you must check both sets before determining something is in fact a Google service and not a Google customer.

That's a pretty big caveat at the bottom of the GitHub readme

https://github.com/berthubert/googerteller#data-source


It’s badly worded as it implies the tool doesn’t do this, when it actually does:

> we need to define an ip(6)tables ipset. This will first exclude Google Cloud, and then include all the other Google IP addresses.
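
A rough Python sketch of that exclude-then-include check, for illustration only. The prefixes below are reserved documentation ranges standing in for the real published lists, and `is_google_service` is a made-up name, not something taken from the tool:

```python
import ipaddress

# Placeholder prefixes (RFC 5737 documentation ranges), NOT real Google data.
ALL_GOOGLE = [ipaddress.ip_network("198.51.100.0/24")]
GOOGLE_CLOUD = [ipaddress.ip_network("198.51.100.128/25")]


def is_google_service(ip: str) -> bool:
    """Return True if ip is in a Google range but not in a Cloud customer range."""
    addr = ipaddress.ip_address(ip)
    # Check the exclusion set first, mirroring the "first exclude Google Cloud" step.
    if any(addr in net for net in GOOGLE_CLOUD):
        return False
    return any(addr in net for net in ALL_GOOGLE)


if __name__ == "__main__":
    for ip in ("198.51.100.5", "198.51.100.200", "203.0.113.1"):
        print(ip, is_google_service(ip))
```

The order matters because the cloud ranges are a subset of the overall Google ranges, so checking only the broader set would count customer traffic as Google traffic.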


Looks like the tool does this. That said, since there are 3 Google Cloud customers in the world right now I think it's a small issue.


Having set up a Pi-Hole recently, I've had a similar realization, but for all of FAANG. I've noticed that the moment phones start getting used on my network, more than half of the requests made end up getting blocked.


Any stats by F A A N G? I'm guessing for me given I have a bunch of Apple stuff that one A is getting a lot of pings. The other A very few. I have a kindle but it's not plugged in and I haven't turned it on for months. G, well I use Chrome and many Google apps on iPhone. F, have FB and Messenger apps on iPhone but assume those are mostly isolated.


Not OP, but I've been using NextDNS [1] (a similar, Pi-hole-like setup) on my phone and in its analytics there's a GAFAM Dominance graph.

This graph [2] isn't mine but it looks similar to my results.

[1]: https://nextdns.io [2]: https://twitter.com/NextDNS/status/1159804929680257024


Unfortunately, it's not so clear-cut by looking at which domains are getting hits, but it would make a good weekend project. I'd have to map all subsidiaries back to their parents, e.g. YouTube or IG, and then there's the trick of dealing with third parties like branch.io, which appear to be very popular.


Amazon owns AWS, so lots of services that you don't think of as Amazon related (Netflix, lots of news sites, ...) would trigger this.


The beeping while he types in the URL is completely unsurprising. That’s just autocomplete. Still, a neat tool. Does anyone know what triggers the call to Google whenever he expands a menu on the site? Google Analytics?


Websites love to record interactions, however insignificant they are, in order to correlate them with conversions. I'd say Google Analytics is likely the right answer.


Frankly the rest is completely unsurprising as well. "Large" commercial websites tend to track user actions. Either via Google Analytics or other (sometimes even more intrusive) means.


Yeah probably analytics, it tracks clicks and page changes


It's common these days for sites to track not only which buttons you clicked but where exactly you clicked on the button using tech like Hotjar.


I have done something a tad different.

I am running my custom-made proxy (DNS blocking like Pi-hole is a joke; you can circumvent it as simply as https://2899908462 ) with interesting features like ASN ( https://en.wikipedia.org/wiki/Autonomous_system_(Internet) ) blocking, and for fun I have blocked all Google ASNs.

Half of the internet stopped working.

Then I have expanded this to google, microsoft, amazon, facebook ASNs.

The whole internet stopped working. From all search engines I am aware of, only yandex.ru was still operational.

What you are seeing with the beeping is just the tip of the iceberg. Google is getting far more data from its cloud.

I don't believe most advanced users are aware of how deep the rabbit hole of internet centralization has gone.

And until people figure out how important self-hosting actually is, it is only going to get worse (yeah, I understand how convenient, blah, blah... the cloud is).


Add Akamai to that list of ASNs and you cover everything. I did that experiment a couple of years ago and my Apple devices went unresponsive unless I disconnected them from the Internet entirely.

The rabbit hole goes as far as locking yourself out of many government services. If you’re a Canadian, it means not being able to travel without submitting to corporate privacy policies and ToS (ArriveCan apps/website).


Unresponsive? As in the user interface stopped working?


Yeah, there are a lot of synchronous network calls in many applications. I ran out of mobile bandwidth recently and have spent the last week in AT&T's penalty box, and it's amazing what broke. For example, Audible's app goes to complete hell in this state when you launch it; you can bypass the hang by clicking library, but it would be content to mostly just sit there.


I found a fun issue with the Apple Music app- if you're connected to wifi but NOT connected to the internet, it takes 60 seconds for the app to respond when you try to play something or change tracks.


DNS issues are the worst. Everything in your stack reports having a "connection" but everything takes 60 seconds to not even load.


  Then I have expanded this to google, microsoft, amazon, facebook ASNs.
  The whole internet stopped working. 
Neither am I delighted to have my machine talk to Cloudflare or Akamai (per https://news.ycombinator.com/item?id=32618098). Add to that a few font, framework services and ad delivery networks and why bother having any independent servers at all?

For decent people there is no privacy, but, for malicious people, it is more than enough privacy to avoid accountability. That's no surprise today, but who would have looked forward to a net like this back in the 1990's?

Just scrap everything and start again. What we have now is a failed experiment.


> Neither am I delighted to have my machine talk to Cloudflare or Akamai

What's your concern around CDNs? It's not like it's viable for every company to put in 1000s of edge nodes to ensure every area of every country has a good website/app experience.


  It's not like it's viable [...]
I agree. That's why I ended by writing "Just scrap everything and start again. What we have now is a failed experiment"


Did the internet maybe stop working because you blocked all the major cloud providers?

NextDNS (lazy man's Pi-hole) does a decent job of adblock and seems to stop stuff like Braze tracking from working.


Sure it did. But that is the data Google is getting: it is network transfer through its servers. They are far beyond "tracking the clicks". And btw, no. DNS blocking only stops domains. The primary addressing level is IP, and you can't block it by preventing access to domains. But you can do it using a proxy (or a firewall, but typically a firewall is clueless about the data layer). That is why I made it, and while it can read the Pi-hole blacklists and use them, it can do much more (like screwing with JS browser fingerprinting at the network level by injecting JS that randomizes the image creation, etc.).


Google is not getting any data when you connect to a GCP VM through TLS. Or did I miss another huge PRISM-level scandal? Same for Azure and AWS. But if you block those no wonder a huge part of the internet is gone.


They know who owns the VM, and now they know that you connected, sent & received a certain amount of data at a certain time.

[Edit: apparently this is more encrypted than I was thinking, so the next bit is probably wrong.] They could potentially look inside the VM to look at the specific data on the other side of TLS.


> They know who owns the VM, and now they know that you connected, sent & received a certain amount of data at a certain time.

Which is quite similar to the metadata that's collected on phone calls; such data is regularly the basis for governments killing people [0]

[0] https://www.justsecurity.org/10311/michael-hayden-kill-peopl...


Sure, and your ISP knows every single IP you have connected to, too. That is just how the internet works.


The ISP only knows that the traffic went to GCP. Google knows which customer it went to.


eSNI (or similar) still hasn't been rolled out at large scale. If your ISP wants to, it can know what domain the application is trying to connect to. Domain fronting may confuse them, but most services don't use that at all.


Can anyone provide proof that Google records this (for Google, not as network telemetry for cloud customers)?


Google Cloud would be negligent if they didn't collect information about ingress and egress traffic mapped to each of their tenants. Since they own the servers and network, it's on them to be able to investigate and track abuse.


Perhaps I phrased it poorly. The inference seems to be Google using this data for their gain (beyond operational needs). Is there any proof of that? Or is my inference incorrect?


Show me the GCP setup that does TLS beyond ingress. Maybe you can find one in healthcare or finance.


I know it blocks Braze because I was testing it in an app I was developing. It's possible for them to hardcode IPs, but last time I checked they were too lazy for that. Maybe static IPs still cost extra?

Some apps will do one DNS query and cache the IP they get, so maybe what you're seeing is them using IPs they cached before you rigged up this custom network situation.


> Did the internet maybe stop working because you blocked all the major cloud providers?

You're saying this like it's not the point. Six or seven companies control the internet and have root and logs of all incoming traffic on ~all servers.


There's a huge difference between Google/Microsoft/Amazon having the possibility of accessing "root and logs" just as a side effect of running the service and the OP's assertion about them actively gathering information from that source. It would be a huge scandal if any of those cloud providers were caught peering inside their customer's VMs. Have I missed that scandal?


> It would be a huge scandal if any of those cloud providers were caught peering inside their customer's VMs.

No it wouldn't.

At this point it could be revealed tomorrow by a mountain of incontrovertible evidence and most people would shrug, move on and ask "what next?".

Snowden. Shrug. Cisco backdoors. Shrug. Pegasus. Shrug. SolarWinds. Shrug...

The past decade can be described by the pattern "It would be a terrible scandal if X happened", and then precisely X happens. Then we normalise to it.

> Have I missed that scandal?

You may say it semi-sarcastically, but of course the irony is that actually you very well could have missed it. You only need to take a vacation for one week, a major shitstorm hits the front pages and fades from the news cycle. Now it's the "new normal".

The important point is, you might never know. Without homomorphic encryption you simply have to trust entities that have the means, the motive, the opportunity and the track record for screwing you over.


I'm all too aware that nobody cares. But I do and did follow all the examples you cite. So it would be very surprising to me if that was found out and it would indeed be a scandal like those others. That nobody cares after even a small amount of time is depressing but it's a different discussion.


You and I probably follow this stuff more than the average person. These days it's pretty much my job to. And yet I missed Carrier-IQ, the Android vendor malware. Eventually read about it a year after the first investigations. Also I almost missed the Apple CSAM debacle, being busy with a couple of contracts. Total time from tentative leak, through disclosure, expert-public outrage to Apple backing down was about 8 weeks, please correct me if I am wrong?

This is Blotto front exhaustion and fatigue in action. It's in the counter-terrorism literature. When you're under attack on many fronts, and adversaries regularly create new ones, and attacks are frequent but random, eventually some get through.

And I very much consider "big tech" to be adversaries in the civic cyber-security game, because they can and will do whatever would make them money, bending and breaking laws, covering up wrongdoing, silencing critics and smearing whistleblowers. They've done so reliably for years.

Perhaps at issue is what we think a "scandal" is.

Scandals used to be mainstream news events that caused widespread public discontent and led to lengthy investigations, government reports, companies being fined or shut down, careers being ruined, even suicides and jail time....

Today the word has lost its currency. Data leaks were once scandalous, but we long ago passed the point when weekly and then daily major breaches lost the interest of the media. By definition, news has to be something new. Otherwise it's "Oh-Dearism". Again, company X installing malware and spying on you is hardly raising eyebrows. People are coming to expect it.

I'm not making a point of moral outrage, or even passing much by way of judgement here. It's just what's happening. But the essential "criminality" of big-tech (if only in spirit not letter) does have profound implications for the future of digital technology, and we should not ignore it. The possibility that the main players have been silently compromising rented VMs for reasons other than mandated law-enforcement should not be lightly dismissed.

I'm curious to know what you think the mechanism/psychology is at play in the "people not caring", other than the fatigue factor I mentioned.


China did not shrug and forced the development of local equivalents like a little more than a decade ago.

And they're now still painted as the bad guy for "censoring".


Censoring and forcing local equivalents is a false equivalence.

The censoring makes the western internet quite hard to use without vpns (or last time I was there, Google Fi seemed to not have to go through the firewall and routed the traffic through Europe somehow?)


Microsoft openly reads every email and Teams message and frequently leaks the contents to Bing, and nobody cares.

Windows laptops from OEMs frequently come with backdoored HTTPS implementations, and nobody cared after the first one.

Dell business laptops come with malware that sends videos and photos to Ukraine and Israel, and nobody cares.

It won't even be a blip when they are caught doing it.


It's been a consistently effective strategy to assume any amount of surveillance or control they can implement without leaving evidence is already something they're doing.


Yeah I mean I guess that's sort of a problem? No reason why people can't set up their own hardware other than cost/sloth.

Cloud providers made themselves the best solution for a lot of services.


What's interesting to me is how split into node type the Internet is. All the servers are Google/AWS/Azure, but there are almost no legitimate clients on those ASNs.

I've also done AS blocking (preventing certain IPs from getting a free compute trial without human intervention; this was back in the crypto days), and indeed, blocking the networks that you did is great for getting rid of bots and harms 0 legitimate users. (I think the big culprit is "free for open source" CI systems; they tend to be hosted on the major cloud providers and I found those doing a lot of command-and-control. I'm surprised those are viable to keep around for free, though.)


How important is self-hosting? Because as it stands, six or seven corporations control the majority of internet traffic and it seems to be fine. There have never been more users doing more things at once than right now.

Self hosting is a huge undertaking and one really has to justify the value proposition, especially for people who want a presence on the internet but don't want their full-time job to be internet administration.


Why is self hosting a huge undertaking compared to using the cloud?

What are the major differences from a layperson view? Is spinning VMs on your hardware that much harder than doing it on other people's computers? It's an honest question because I haven't done both


Yes. Cloud hosting takes actual maintenance of the hardware out of the operator's space of concern, which for most companies that want something like 24/7 uptime is a big deal.

I worked at a small company where we planned for 24/7 uptime before clouds were ubiquitous. We planned out which of the three engineers in the team would hold the pager and the cost of gas reimbursements for them to drive one state over to deal with the machine in the secured rack facility if it physically went down. We didn't have nearly enough bandwidth capacity to our building itself to support the traffic we anticipated for our service.

Smaller projects that don't require 24/7 uptime can be self-hosted, but you still want to be aware of fabric-layer security... If you're hosting on a machine in your building and somebody roots it, what will physical access to your intranet let them get away with? Can they see source code from there? Financials? Employee database? All of this is less a concern if you're using AWS with separated instances that are no more connected to each other than Netflix is to Disney+, even if they're in the same building.

Cloud hosting lets you focus on the software and credentials and pay someone else to focus on hardware and application of credentials.


It depends on what you want to self host.

Low-hanging fruit like cloud storage, web hosting, and VPN is fairly easy to set up and maintain yourself.


Yes, it's not just clicking a "create VM" button or setting up a box in the closet. It's setting up the networking and security so that only the right people can access it and the packages are updated etc. And being the support person if anyone besides yourself is using it.


> Then I have expanded this to google, microsoft, amazon, facebook ASNs.

> The whole internet stopped working.

Not that surprising, considering these companies already had direct influence over 70%+ of internet traffic back in 2014 [0]

By now that number is probably in the 80-90% range, as it's a problem the vast majority of people are either completely unaware of or deny is a problem in the first place.

[0] https://staltz.com/the-web-began-dying-in-2014-heres-how.htm...


Maybe the next step would be to make a tool which says whether a given website is clean or not, so we can start making a whitelist?


please consider a more inclusive term such as allowlist.


Huh, what is https://2899908462 ?


It's an IPv4 address expressed as a 4-byte integer instead of four separate decimal octets. 172.217.23.110 is an IP belonging to Google.

(172<<24) + (217<<16) + (23<<8) + 110 = 2899908462
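
For anyone who wants to check the arithmetic, here is a quick Python sketch that does the conversion by hand and cross-checks it against the standard `ipaddress` module. The helper names are my own. Note that in Python (as in C) `+` binds tighter than `<<`, so each shift needs its own parentheses or a bitwise OR:

```python
import ipaddress


def ip_to_int(dotted: str) -> int:
    """Pack four decimal octets into one 32-bit integer."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def int_to_ip(value: int) -> str:
    """Turn a 32-bit integer back into dotted-decimal notation."""
    return str(ipaddress.IPv4Address(value))


if __name__ == "__main__":
    n = ip_to_int("172.217.23.110")
    print(n, int_to_ip(n))
    # The standard library agrees with the hand-rolled version.
    print(n == int(ipaddress.IPv4Address("172.217.23.110")))
```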


This is also why 1.1 === 1.0.0.1

Saves 4 whole keystrokes if you quickly want to ping something to test connectivity.


In the 90s there was a campaign about the new worldwide 10-digit phone number format.

So if you own some 31.x block, you could have a phone number which matches your IP.


Worldwide 10 digit format? News to me. Can you say more?


There was a campaign that went something like "de hele wereld bellen, even tot tien bellen"

I think I'm wrong, it was about the 10-digit Dutch numbers.

https://www.youtube.com/watch?v=NfcD3RiB1EA



Has it been implemented by any carriers?

The US phone system had such a service for a few years (area code 500 iirc). It had so little uptake that it was shut down.


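I wrote this a number of years ago to accomplish the same:

    #!/bin/ksh

    # Convert a dotted IPv4 address to a single decimal integer.

    Die() {
        echo Bad IP
        exit 1
    }

    # Exactly one argument is required.
    [ $# = 1 ] || Die

    # The argument must match the shape a.b.c.d (non-empty parts, dots between).
    [ ${1%%?*.?*.?*.?*} ] && Die

    I=$1

    # Strip one leading digit or dot per pass; any other character is an error.
    until [ ${#I} = 0 ]
    do
        J=${I#[0-9.]}
        [ ${#I} = ${#J} ] && Die
        I=$J
    done

    # Append a trailing dot so every octet is followed by one.
    I=$1.

    typeset -ui N=0

    # Peel off one octet at a time and shift it into N.
    while [ $I ]
    do
        J=${I%%.*}
        I=${I#$J.}
        [ $((J>>8)) = 0 ] || Die    # each octet must fit in 8 bits
        N=$(( (N<<8) + J ))
    done

    echo $N

    # End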



  man inet_aton
[…]

  int inet_aton(const char *cp, struct in_addr *inp)
  […]
  The address supplied in cp can have one of the following forms:
  […]
  a   The  value  a is interpreted as a 32-bit value that is stored
      directly into the binary address without any byte rearrangement.
https://manpages.debian.org/stable/manpages-dev/inet_aton.3....


A different format for writing ip addresses


Why do you need a different format for writing IP addresses to circumvent the PiHole? Shouldn't a regular IP address literal skip the DNS lookup, too?


I used that notation for fun, as most people are not aware of it, and based on the comments quite a few people learned something new :)

And it is a regular IP address, just written differently :)


Ah okay, I got it: it's the decimal representation of the IP address.

https://www.lookip.net/ip/172.217.23.110

Funny I wasn't aware of this despite 25 years of being an Internet user


> "Funny I wasn't aware of this despite 25 years of being an Internet user"

You're not alone… I been using the Internet longer'n that and I didn't ever think of converting an IP address to a decimal number either. It makes perfect sense now that it's been pointed out, but for some reason it just never crossed my mind to even try it.


To be fair, there are many different bases. I wonder if there’s anything special about base 10 that made them support it.


IPv4 addresses are written in base 10 for human users.

The "special" case is supporting network addresses to be written as 10.2932832 ("convenient" for class A's), 172.16.61031 (ditto for B's), or just one big address like 39282329, when we're used to 4 octets separated by dots.

Not every bit of host software supports these cases anymore, as basically their sole remaining use case is as a curiosity or to circumvent bad security controls.
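
A small pure-Python sketch of how those shorthand forms expand, following the rule from the inet_aton man page that the last part fills all the remaining bytes. It only handles decimal parts (the real inet_aton also accepts octal and hex), and `parse_shorthand` is a name I made up for illustration:

```python
import ipaddress


def parse_shorthand(text: str) -> int:
    """Expand a, a.b, a.b.c or a.b.c.d (decimal parts only) to a 32-bit integer."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError(f"expected 1 to 4 parts, got {len(parts)}")
    if not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError("parts must be non-empty decimal numbers")
    *head, last = (int(p) for p in parts)
    # Every leading part is a single byte; the last part fills what is left.
    tail_bytes = 4 - len(head)
    if any(p > 255 for p in head) or last >= 256 ** tail_bytes:
        raise ValueError("part out of range")
    value = 0
    for p in head:
        value = (value << 8) | p
    return (value << (8 * tail_bytes)) | last


if __name__ == "__main__":
    for text in ("1.1", "10.2932832", "172.16.61031", "2899908462"):
        print(text, "->", ipaddress.IPv4Address(parse_shorthand(text)))
```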


Bah, humans!


> only yandex.ru was still operational.

Maybe Russia not being part of the same monopoly is actually a good thing for the internet.


>you can circumvent it as simply as

except no company actually does this


Offtopic, looks like the easiest way to record a screencast with sound from the machine was to film it with another device.

This is not a jab at linux, I wouldn't know how to do that with MacOS either; the default Cmd+Shift+5 video capture tool doesn't allow recording internal sound.

I'm surprised that in 2022 this task can still be so problematic.


> I wouldn't know how to do that with MacOS either; the default Cmd+Shift+5 video capture tool doesn't allow recording internal sound.

>If you would like to screen record on your Mac with audio, you can use the QuickTime Player provided by Apple. Select it from your applications folder and then go to File > New Screen Recording from the top menu bar.

https://www.geeky-gadgets.com/screen-record-on-mac-with-soun...


If you want to record sound generated by the computer itself only (not the microphone), you can use a tool like BlackHole [1].

BlackHole adds a loopback audio device which can be used with the default video capture tool (select the loopback device in the options menu after pressing shift-command-5).

To be able to play audio from the speakers while you're recording, you need to add a multi-output audio device in the macOS "Audio MIDI Setup" app. Switch to the multi-output device by option-clicking the audio icon in the top menu bar. Now both the speakers as well as the BlackHole audio device will receive audio.

[1]: https://github.com/ExistentialAudio/BlackHole


This is also really useful if you need to mute a Zoom call if you need to listen to something else (like an unrelated video). You can just temporarily set the Zoom audio output to the BlackHole device.


Recording has been built into Gnome for quite some time (ctrl+alt+shift+r), but with Gnome 40 it's got a nice UI and pops up with the screenshot tool (PrtSc). Though it doesn't do audio either.


Mac can do it easily if you set up Soundflower.

https://rogueamoeba.com/freebies/soundflower/


Depends what you mean by easiest.

You can also use ffmpeg to create a "screencast" with sound and it's just a couple of extra command-line options. E.g., the options

  -f pulse -ac 2 -i default
work well for me.

Of course, if you don't know this or your machine's setup well, and you happen to have another device lying around, you're prone to pick it up and make the video that way instead.

Which may be harder because then you have to get the video off the other device somehow. Lol, epistemology is an interesting field.


I wouldn’t read much into it. I’ve had many people send me “screenshots” taken with their phone. The phone camera has become the go-to tool for anything visual for many people.


Because it's convenient. Everyone knows how to use their phone's camera and click on "share". A screenshot involves more steps unless you know what you're doing and are already logged into Twitter. And screenshots on Android phones are often just painful.


pressing volume and power at the same time is painful?


It's so untypically cumbersome and unintuitive for a smartphone UI that to me it doesn't feel like a feature intended for end users. The button combination isn't always consistent from one phone to the next and afaik there's no way to know the right one without googling it or trying out a few variants. If you get the combination or timing wrong you probably changed some audio setting or locked the screen. Sometimes it's easier but in general it feels more like trying to enter BIOS settings on a random computer - it's basically the same procedure as entering a smartphone's or Surface Pro's bootloader menu.


Installed OBS Studio on my Wayland+PipeWire machine and it just worked. Full screen capture with multiple audio sources simultaneously.


There are very cheap USB HDMI capture sticks on the market that I use for this. Works great on all platforms!


OBS Studio can do it on Windows.


It’s more dramatic. A screencast would feel synthetic.


The beeping could have been sharper, like the sound of a Geiger-Müller counter.


Would prefer to see a Firefox version, i.e. excluding the built-in Chrome telemetry.



oh wow. Right that is pretty bleak


The tracking is mostly present in the scripts running on those sites, not the browser itself. You have to run NoScript and other blockers to deal with those.


The tracking is also present on internal Firefox pages like the add-on one, where Mozilla intentionally blocks extensions like uBlock.


The nice thing about his code[0] is that the C application just counts the newlines on stdin and makes a beep for each. Just about anything could be fed into this app to make an ad-hoc live monitor of activities.

A simple one is to monitor any access to files inside a directory:

    inotifywait -r -m . | ./teller
You could also beep on important events in your own log files, beep for each request that hits your socket, or set up the runtime to tell you when the garbage is collected.

[0] https://github.com/berthubert/googerteller
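
To make the "one beep per line" idea concrete, here is a minimal Python sketch of the same pattern. It is not the author's C code: it uses the terminal bell character as a stand-in for real audio output, and `beep_per_line` is a name invented for this example:

```python
import sys
from typing import Iterable, TextIO


def beep_per_line(source: Iterable[str], sink: TextIO) -> int:
    """Write one BEL character to sink for every line read from source."""
    count = 0
    for _line in source:
        sink.write("\a")  # ASCII BEL; many terminals turn this into a beep
        sink.flush()      # flush so the beep happens as the line arrives
        count += 1
    return count


# Only run as a filter when something is actually piped in.
if __name__ == "__main__" and not sys.stdin.isatty():
    beep_per_line(sys.stdin, sys.stderr)
```

Saved as a script (say `beeper.py`, a hypothetical filename), it can sit at the end of any pipeline that emits one line per event, in the same position as `./teller` in the inotifywait example.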


As discussed here: https://news.ycombinator.com/item?id=32549604

(over 200 comments)


Setup at the router level with a raspberry pi or something might be fun. Beeps when you're not even at a computer.


Genius! And I'm sitting here and debugging Google Analytics tracking with developer tools console.log and networking panel. On my next project I'm going all out and integrating TTS to debug Google Analytics events.


The beeps likely combine to a high frequency outside of the human hearing range


Make a version that uses different frequencies for different major companies.


This is oddly effective at annoying me off YouTube.


Can this be made as a browser extension?


Geigle Counter



