>However, it appears the Google services set includes the cloud IP addresses, so you must check both sets before determining something is in fact a Google service and not a Google customer.
That's a pretty big caveat at the bottom of the GitHub readme
Having set up a Pi-Hole recently, I've had a similar realization, but for all of FAANG. I've noticed that the moment phones start getting used on my network, more than half of the requests made end up getting blocked.
Any stats by F A A N G? I'm guessing for me given I have a bunch of Apple stuff that one A is getting a lot of pings. The other A very few. I have a kindle but it's not plugged in and I haven't turned it on for months. G, well I use Chrome and many Google apps on iPhone. F, have FB and Messenger apps on iPhone but assume those are mostly isolated.
Unfortunately, it's not so clear-cut just from looking at which domains are getting hits, but it would make a good weekend project. I'd have to map all subsidiaries back to their parents, e.g. YouTube or IG, and then there's the trick of dealing with third parties like branch.io, which appears to be very popular.
The beeping while he types in the URL is completely unsurprising. That’s just autocomplete. Still, a neat tool. Does anyone know what triggers the call to Google whenever he expands a menu on the site? Google Analytics?
Websites love to record interactions, however insignificant they are, in order to correlate them with conversions. I'd say Google Analytics is likely the right answer.
Frankly the rest is completely unsurprising as well. "Large" commercial websites tend to track user actions. Either via Google Analytics or other (sometimes even more intrusive) means.
Then I expanded this to the Google, Microsoft, Amazon, and Facebook ASNs.
The whole internet stopped working. Of all the search engines I'm aware of, only yandex.ru was still operational.
What you are seeing with the beeping is just the tip of the iceberg. Google is getting far more data via its cloud.
I don't believe most advanced users are aware of how deep the rabbit hole of internet centralization goes.
And until people figure out how important self-hosting actually is, it's only going to get worse (yeah, I understand how convenient the cloud is, blah, blah...).
Add Akamai to that list of ASNs and you cover everything. I did that experiment a couple of years ago and my Apple devices went unresponsive unless I disconnected them from the Internet entirely.
The rabbit hole goes as far as locking yourself out of many government services. If you’re a Canadian, it means not being able to travel without submitting to corporate privacy policies and ToS (ArriveCan apps/website).
Yeah, there are a lot of synchronous network calls in many applications. I ran out of mobile bandwidth recently and have spent the last week in AT&T's penalty box, and it's amazing what broke. For example, Audible's app goes to complete hell in this state when you launch it; you can bypass the hang by clicking Library, but otherwise it would be content to mostly just sit there.
I found a fun issue with the Apple Music app- if you're connected to wifi but NOT connected to the internet, it takes 60 seconds for the app to respond when you try to play something or change tracks.
> Then I expanded this to the Google, Microsoft, Amazon, and Facebook ASNs.
> The whole internet stopped working.
Neither am I delighted to have my machine talk to Cloudflare or Akamai (per https://news.ycombinator.com/item?id=32618098). Add to that a few font and framework services and ad delivery networks, and why bother having any independent servers at all?
For decent people there is no privacy, but for malicious people there is more than enough privacy to avoid accountability. That's no surprise today, but who would have looked forward to a net like this back in the 1990s?
Just scrap everything and start again. What we have now is a failed experiment.
> Neither am I delighted to have my machine talk to Cloudflare or Akamai
What's your concern around CDNs? It's not like it's viable for every company to put in thousands of edge nodes to ensure every area of every country has a good website/app experience.
Sure it did. But that is the data Google is getting; it's the network traffic through its servers. They are far beyond "tracking the clicks". And by the way, no: DNS blocking only stops domains. The primary addressing level is IP, and you can't block that by preventing access to domains. You can do it with a proxy (or a firewall, though a firewall is typically clueless about the data layer). That's why I built this; while it can read and use the Pi-hole blacklists, it can do much more (like defeating JS browser fingerprinting at the network level by injecting JS that randomizes the image creation, etc.).
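As a sketch of the difference: DNS blocking stops name resolution, but a client with a cached or hardcoded IP sails right through. Blocking at the IP layer needs a firewall rule instead; one common way on Linux is ipset + iptables (the CIDR below is an illustrative placeholder, not a real ASN's ranges):

```shell
# DNS blocking (Pi-hole style) only stops lookups; to drop the traffic
# itself, reject it at the IP layer. Run as root; example range only.
ipset create blocked-nets hash:net
ipset add blocked-nets 198.51.100.0/24   # placeholder CIDR, not a real ASN
iptables -A OUTPUT -m set --match-set blocked-nets dst -j REJECT
```

In practice you'd populate the set from a dump of the target ASNs' announced prefixes.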
Google is not getting any data when you connect to a GCP VM through TLS. Or did I miss another huge PRISM-level scandal? Same for Azure and AWS. But if you block those no wonder a huge part of the internet is gone.
They know who owns the VM, and now they know that you connected, sent & received a certain amount of data at a certain time.
[Edit: apparently this is more encrypted than I was thinking, so the next bit is probably wrong.] They could potentially look inside the VM to look at the specific data on the other side of TLS.
eSNI (or similar) still hasn't been rolled out at large scale. If your ISP wants to, it can know what domain the application is trying to connect to. Domain fronting may confuse them, but most services don't use that at all.
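To see concretely what an on-path observer gets without eSNI/ECH, the plaintext SNI field can be read straight off the wire; a rough sketch with tshark (the interface name is a placeholder, and capture privileges are assumed):

```shell
# Illustrative only: print the plaintext SNI of outgoing TLS handshakes.
# "eth0" is a placeholder; requires packet-capture privileges.
tshark -i eth0 \
  -Y 'tls.handshake.extensions_server_name' \
  -T fields -e tls.handshake.extensions_server_name
```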
Google Cloud would be negligent if they didn't collect information about ingress and egress traffic mapped to each of their tenants. Since they own the servers and network, it's on them to be able to investigate and track abuse.
Perhaps I phrased it poorly. The inference seems to be Google using this data for their gain (beyond operational needs). Is there any proof of that? Or is my inference incorrect?
I know it blocks Braze because I was testing it in an app I was developing. It's possible for them to hardcode IPs, but last time I checked they were too lazy for that. Maybe static IPs still cost extra?
Some apps will do one DNS query and cache the IP they get, so maybe what you're seeing is them using IPs they cached before you rigged up this custom network situation.
> Did the internet maybe stop working because you blocked all the major cloud providers?
You're saying this like it's not the point. Six or seven companies control the internet and have root and logs of all incoming traffic on ~all servers.
There's a huge difference between Google/Microsoft/Amazon having the possibility of accessing "root and logs" just as a side effect of running the service and the OPs assertion about them actively gathering information from that source. It would be a huge scandal if any of those cloud providers were caught peering inside their customer's VMs. Have I missed that scandal?
> It would be a huge scandal if any of those cloud providers were caught peering inside their customer's VMs.
No, it wouldn't.
At this point it could be revealed tomorrow by a mountain of incontrovertible evidence, and most people would shrug, move on, and ask "what next?".
Snowden. Shrug. Cisco backdoors. Shrug. Pegasus. Shrug. SolarWinds. Shrug...
The past decade can be described by the pattern "It would be a terrible scandal if X happened", and then precisely X happens. Then we normalise to it.
> Have I missed that scandal?
You may say it semi-sarcastically, but of course the irony is that you very well could have missed it. You only need to take a vacation for one week; a major shitstorm hits the front pages and fades from the news cycle. Now it's the "new normal".
The important point is, you might never know. Without homomorphic encryption you simply have to trust entities that have the means, the motive, the opportunity, and the track record for screwing you over.
I'm all too aware that nobody cares. But I do and did follow all the examples you cite. So it would be very surprising to me if that was found out and it would indeed be a scandal like those others. That nobody cares after even a small amount of time is depressing but it's a different discussion.
You and I probably follow this stuff more than the average person. These days it's pretty much my job to. And yet I missed Carrier IQ, the Android vendor malware; I eventually read about it a year after the first investigations. I also almost missed the Apple CSAM debacle, being busy with a couple of contracts. Total time from tentative leak, through disclosure and expert-public outrage, to Apple backing down was about 8 weeks; please correct me if I am wrong?
This is Blotto front exhaustion and fatigue in action. It's in the counter-terrorism literature. When you're under attack on many fronts, and adversaries regularly create new ones, and attacks are frequent but random, eventually some get through.
And I very much consider "big tech" to be adversaries in the civic cyber-security game, because they can and will do whatever would make them money: bending and breaking laws, covering up wrongdoing, silencing critics, and smearing whistleblowers. They've done so reliably for years.
Perhaps at issue is what we think a "scandal" is.
Scandals used to be mainstream news events that caused widespread public discontent and led to lengthy investigations, government reports, companies being fined or shut down, careers being ruined, even suicides and jail time...
Today the word has lost its currency. Data leaks were once scandalous, but we long passed the point when weekly and then daily major breaches lost the interest of the media. By definition, news has to be something new; otherwise it's "Oh-Dearism". Again, company X installing malware and spying on you hardly raises eyebrows. People are coming to expect it.
I'm not making a point of moral outrage, or even passing much by way of judgement here. It's just what's happening. But the essential "criminality" of big tech (if only in spirit, not letter) does have profound implications for the future of digital technology, and we should not ignore it. The possibility that the main players have been silently compromising rented VMs for reasons other than mandated law enforcement should not be lightly dismissed.
I'm curious to know what you think the mechanism/psychology at play is in people not caring, other than the fatigue factor I mentioned.
Censoring and forcing local equivalents is a false equivalence.
The censoring makes the western internet quite hard to use without VPNs (or, the last time I was there, Google Fi seemed not to have to go through the firewall and somehow routed the traffic through Europe?).
It's been a consistently effective strategy to assume any amount of surveillance or control they can implement without leaving evidence is already something they're doing.
What's interesting to me is how segregated by node type the Internet is. All the servers are on Google/AWS/Azure, but there are almost no legitimate clients on those ASNs.
I've also done AS blocking (preventing certain IPs from getting a free compute trial without human intervention; this was back in the crypto days), and indeed, blocking the networks that you did is great for getting rid of bots and harms 0 legitimate users. (I think the big culprit is "free for open source" CI systems; they tend to be hosted on the major cloud providers and I found those doing a lot of command-and-control. I'm surprised those are viable to keep around for free, though.)
How important is self-hosting? Because as it stands, six or seven corporations control the majority of internet traffic and it seems to be fine. There have never been more users doing more things at once than right now.
Self hosting is a huge undertaking and one really has to justify the value proposition, especially for people who want a presence on the internet but don't want their full-time job to be internet administration.
Why is self hosting a huge undertaking compared to using the cloud?
What are the major differences from a layperson view? Is spinning VMs on your hardware that much harder than doing it on other people's computers? It's an honest question because I haven't done both
Yes. Cloud hosting takes the actual maintenance of the hardware out of the operator's space of concern, which for most companies that want something like 24/7 uptime is a big deal.
I worked at a small company where we planned for 24/7 uptime before clouds were ubiquitous. We planned out which of the three engineers in the team would hold the pager and the cost of gas reimbursements for them to drive one state over to deal with the machine in the secured rack facility if it physically went down. We didn't have nearly enough bandwidth capacity to our building itself to support the traffic we anticipated for our service.
Smaller projects that don't require 24/7 uptime can be self-hosted, but you still want to be aware of fabric-layer security... If you're hosting on a machine in your building and somebody roots it, what will physical access to your intranet let them get away with? Can they see source code from there? Financials? Employee database? All of this is less a concern if you're using AWS with separated instances that are no more connected to each other than Netflix is to Disney+, even if they're in the same building.
Cloud hosting lets you focus on the software and credentials and pay someone else to focus on hardware and application of credentials.
Yes, it's not just clicking a "create VM" button or setting up a box in the closet. It's setting up the networking and security so that only the right people can access it and the packages are updated etc. And being the support person if anyone besides yourself is using it.
> Then I expanded this to the Google, Microsoft, Amazon, and Facebook ASNs.
> The whole internet stopped working.
Not that surprising, considering these companies already had direct influence over 70%+ of internet traffic back in 2014 [0].
By now that number is probably in the 80-90% range, as it's a problem the vast majority of people are either completely unaware of or sometimes deny is even a problem in the first place.
int inet_aton(const char *cp, struct in_addr *inp)
[…]
The address supplied in cp can have one of the following forms:
[…]
a The value a is interpreted as a 32-bit value that is stored
directly into the binary address without any byte rearrangement.
> "Funny I wasn't aware of this despite 25 years of being an Internet user"
You're not alone… I been using the Internet longer'n that and I didn't ever think of converting an IP address to a decimal number either. It makes perfect sense now that it's been pointed out, but for some reason it just never crossed my mind to even try it.
IPv4 addresses are written in base 10 for human users.
The "special" cases are support for addresses written as 10.2932832 ("convenient" for class A's), 172.16.61031 (ditto for B's), or as one big number like 39282329, when we're used to 4 octets separated by dots.
Not every bit of host software supports these forms anymore, as basically their sole remaining use is as a curiosity or to circumvent bad security controls.
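The arithmetic behind those forms is just base-256 place value: each octet shifted into position in a 32-bit word. A small shell sketch (the function name is my own, for illustration):

```shell
# Convert a dotted-quad IPv4 address to its single 32-bit decimal form,
# i.e. (a << 24) | (b << 16) | (c << 8) | d.
ip_to_decimal() {
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

ip_to_decimal 192.0.2.1   # prints 3221225985
```

Most browsers and `ping` will still accept the resulting number in place of the dotted form.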
Off-topic: it looks like the easiest way to record a screencast with sound from the machine was to film it with another device.
This is not a jab at Linux; I wouldn't know how to do that on macOS either. The default Cmd+Shift+5 video capture tool doesn't allow recording internal sound.
I'm surprised that in 2022 this task can still be so problematic.
> I wouldn't know how to do that with MacOS either; the default Cmd+Shift+5 video capture tool doesn't allow recording internal sound.
>If you would like to screen record on your Mac with audio, you can use the QuickTime Player provided by Apple. Select it from your applications folder and then go to File > New Screen Recording from the top menu bar.
If you want to record sound generated by the computer itself only (not the microphone), you can use a tool like BlackHole [1].
BlackHole adds a loopback audio device which can be used with the default video capture tool (select the loopback device in the options menu after pressing shift-command-5).
To be able to play audio from the speakers while you're recording, you need to add a multi-output audio device in the macOS "Audio MIDI Setup" app. Switch to the multi-output device by option-clicking the audio icon in the top menu bar. Now both the speakers as well as the BlackHole audio device will receive audio.
This is also really useful if you need to mute a Zoom call if you need to listen to something else (like an unrelated video). You can just temporarily set the Zoom audio output to the BlackHole device.
Recording has been built into Gnome for quite some time (ctrl+alt+shift+r), but with Gnome 40 it's got a nice UI and pops up with the screenshot tool (PrtSc). Though it doesn't do audio either.
You can also use ffmpeg to create a "screencast" with sound and it's just a couple of extra command-line options. E.G. the options
-f pulse -ac 2 -i default
work well for me.
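Put together, a full invocation might look like the sketch below (assuming X11 and PulseAudio; the display, resolution, and codec choices are placeholders you'd adjust for your setup):

```shell
# Capture the X11 screen plus system audio from the default Pulse source.
# ":0.0" and "1920x1080" are placeholders for your display and resolution.
ffmpeg -f x11grab -video_size 1920x1080 -i :0.0 \
       -f pulse -ac 2 -i default \
       -c:v libx264 -preset ultrafast -c:a aac screencast.mkv
```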
Of course, if you don't know this or your machine's setup well, and you happen to have another device lying around, you're prone to pick it up and make the video that way instead.
Which may be harder because then you have to get the video off the other device somehow. Lol, epistemology is an interesting field.
I wouldn’t read much into it. I’ve had many people send me “screenshots” taken with their phone. The phone camera has become the goto tool for anything visual for many people.
Because it's convenient. Everyone knows how to use their phone's camera and click on "share". A screenshot involves more steps unless you know what you're doing and are already logged into Twitter.
And screenshots on Android phones are often just painful.
It's so atypically cumbersome and unintuitive for a smartphone UI that to me it doesn't feel like a feature intended for end users. The button combination isn't always consistent from one phone to the next, and AFAIK there's no way to know the right one without googling it or trying a few variants. If you get the combination or timing wrong, you've probably changed some audio setting or locked the screen. Sometimes it's easier, but in general it feels more like trying to enter the BIOS settings on a random computer; it's basically the same procedure as entering a smartphone's or a Surface Pro's bootloader menu.
The tracking is mostly present in the scripts running on those sites, not the browser itself. You have to run NoScript and other blockers to deal with those.
The nice thing about his code[0] is that the C application just counts the newlines on stdin and makes a beep for each. Just about anything could be fed into this app to make an ad-hoc live monitor of activities.
A simple one is to monitor any access to files inside a directory:
inotifywait -r -m . | ./teller
You could also beep on important events in your own log files, beep for each request that hits your socket, or set up the runtime to tell you when the garbage is collected.
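The beep-per-line part is simple enough to reproduce in a few lines of shell (a stand-in for illustration, not the author's actual C program):

```shell
# Emit one terminal bell (BEL, \a) per line read from stdin, so any
# line-oriented event stream becomes an audible monitor.
beeper() {
  while IFS= read -r _line; do
    printf '\a'
  done
}

# Example: beep on every filesystem event under the current directory.
# inotifywait -r -m . | beeper
```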
Genius! And I'm sitting here debugging Google Analytics tracking with the developer tools, console.log, and the network panel. On my next project I'm going all out and integrating TTS to debug Google Analytics events.
"Audible feedback on just how much your browsing feeds into Google" (290 points | 5 days ago | 205 comments)