Hacker News
PiHole-Google: Completely Block Google and Its Services (github.com/nickspaargaren)
345 points by Jerry2 on June 29, 2019 | 133 comments



This is a nice idea, but the one thing I haven't been able to divorce myself from is YouTube. I really hate how Google has allowed such a wealth of constant information that completely dwarfs alternative video hosting sites. As censorious as Google can be (now "up next" is always some video from CNN or Fox), blocking YouTube from my network would mean cutting myself off from a large portion of the world.


It’s fascinating how differently the YouTube algorithm treats people. I've not once seen a Fox or CNN video recommended, I didn't even know they had a presence on YouTube


Personalization makes it incredibly hard to "watch the watchers," because everyone is getting a slightly different view of what Google is doing. I would like to see a program where users submitted data about their recommendations to researchers so that we could uncover Google's opinions. It would have a lot of financial value to YouTubers and would make it harder for Google to abuse their role as censor.

I could imagine shadow-burning YouTubers without banning them by shrinking their recommendation audience.

Further, it would be good for Google. Every little shift in the weather is going to get blamed on them whether they deserve it or not, now that it's common knowledge that they wield this power in more than zero cases. Google is about to discover why judges write opinions. Administering justice from secret meetings leads to popular dissent more than it leads to justice.


I clear my YouTube search and watch history about once a week. Partly because of privacy, but also because a single binge of, say, metal casting videos does not mean I want them recommended ever again in the future.


One thing this opens you up to is the average recommendation, which is biased towards inflammatory or click-bait content.


I do it every week as well and the recommendations start off being seeded from my subscriptions, not from the trending tab.


They use browser fingerprinting and/or IP addresses as well. Even on a brand new browser session, doing something even slightly related to the previous session brings back its entire recommendation history.


> doing something even slightly related to the previous session brings back its entire recommendation history.

Are you sure that it does, and that it's not just a case of "hey we've never seen this person before, but they watched X, let's immediately start with recommendations Y and Z because other people who watched X were engaged with it?"


I used to think that and gave them the benefit of the doubt, but then I realised that some of the videos being recommended had nothing to do with the one currently watched other than the fact I watched similar ones previously.


Yeah, the fact that I can clear cookies, open a private browsing window with tracking protection turned on, go to YouTube, and be asked which of my two gmail accounts I want to log in with, is pretty creepy.


Yes, you are right, they are using browser fingerprinting, but with the right tool you can easily change your fingerprint. You can use Kameleo software or Multilogin etc. https://kameleo.io/


How does that work at, say, the library, or some other public internet kiosk? Or maybe they're just assuming these are edge cases today, amidst the billions of private browser instances on handhelds?


It probably doesn’t work, considering the library users would be watching totally different & unrelated videos, whereas in my case they have years of data of very specific viewing habits centred around a few topics.


The problem with this is that it then just suggests the content that hits the front page instead - so for the most part a load of crap. All I'm doing is swapping recommended videos on, say, metal casting, for YouTube's "on-brand" content creators which pump out generic content on a bi-daily basis at 10 minutes in length.


My understanding is that the only part of YouTube that is not personalized (except by region) is the "trending" tab.

It means it is the most representative of what would be a common YouTube experience.


It thinks I’m addicted to opiates because of the suboxone ads it presents, only because sometimes I listen to 90’s techno, rave music.


Not Google, but Twitter thinks I’m a 45-year-old married male with 3 kids

I’m 39, divorced, and have 2 kids

It’s almost as if the big data sales pitch is an epic joke like nuclear powered everything being pitched in the 50s


What? That is an incredibly accurate profile derived from random tweets / follows, and certainly accurate enough to serve you ads.


Fair point. Except I don’t post, follow feeds only, put nothing about my personal life in my profile, and they asked for my birthday.

So my point was aimed more at advertisers buying on Twitter: they filled in the blanks and got them all wrong. And the one they had the data for they got wrong.

That and I’m not an internet consumer really. Twitter can see I’ve blocked over 1,000 accounts that promoted tweets. Is that statistic being shared with advertisers?

Likely not. Advertisers believe so who cares.

It’s so ephemeral as to be useless.


Twitter is a mess in general. No wonder they struggle to monetize.


If your concern is privacy, I've developed this project which routes everything through Tor except for the video file: https://github.com/user234683/youtube-local

There is also this project which is similar but much better, more polished, and with more features than what I have currently, but I haven't tried it to see if it supports a similar kind of selective proxying: https://github.com/omarroth/invidious

Then there's Freetube, which supports proxying but I'm not sure of the details either. It doesn't scrape Youtube itself as far as I know; instead it consults with the main Invidious instance at invidio.us which provides an api: https://github.com/FreeTubeApp/FreeTube


My experience is if you watch videos on a given topic, they try to show you more of the same topic. So they probably decided you like American cable news.

It can get frustrating when it only recommends a single topic. I might go through a phase where I want to see videos about something specific. The recommendation algorithm will reinforce that and prevent me from moving on to something else. I found that if I make some effort to watch a lot of videos about other topics, they appear. You can also manually edit your viewing history.


The trouble is that they often take viewing a video as a sign that you’re obsessed with that topic. You click one Flat Earth video to see what the crazy sounds like, and for the next three months half of your recommendations are “Scientists don’t want you to know this!”


I would rather that it would just play the next video by the current channel in reverse chron. Maybe if a channel made multiple videos in the last 24 hours, play that and then play other stuff. Instead, it immediately moves me to cable news if I am watching anything political, even though I never watch cable news voluntarily.


Try invidio.us - it hooks directly to the video feed of youtube, which means no ads, no tracking, reddit comments, your own subscriptions with rss which don't require "hitting the bell" and I just tested it works even when youtube hostname is redirected to localhost in /etc/hosts


YouTube and Maps are among the few Google services left which are still available over Tor. You can proxy youtube-dl and retain some of your privacy this way.


I've been thinking of setting up a super-tiny (about $15/year) VPS as a youtube-dl proxy for a while now. It's the only Google service that still remains valuable to me; I enjoy channels like Bad Obsession Motorsports and various indie musician channels, and Vimeo just doesn't have enough of that type of content, sadly. I know proxying through a VPS that I pay for doesn't 100% divorce me from Google's watchful eye, but it's enough abstraction that hopefully they don't get enough info to build a profile of the real me.


Several video creators who got their start on Youtube have branched out to making their own platform, Nebula, if you are interested: https://watchnebula.com/


Unfortunately, they won't succeed. The way their homepage is designed suggests that they believe that their reputation and brands are enough to make people subscribe. Why is it not possible to click on a channel image and have a sneak preview? Their landing page is only good for people who already made the decision to join, not to attract new viewers. In other words, they rely on youtube to expand their audience.


They aren't doomed offhandedly, plenty of original content streaming platforms make money. However they won't help with people getting crushed by YouTube; I can't just create an account and start uploading to their HBO/Netflix analog.


I've got Little Snitch configured to block Youtube (and most Google services) when my browsers request them, but if there's a YouTube video that's interesting enough to warrant the extra effort, I just switch to my terminal and use youtube-dl to grab it and play it back locally.


What I do, and all the sites I visit (techie) are surprisingly not broken (aside from recaptcha spam):

1) use Firefox with multi-account containers, and disable 3rd party cookies.

2) put youtube in its own "youtube" container. do not log in to that container

3) put all other google stuff in its own "google" container

If you do that, and don't log in to google except in the "google" container, it makes it more difficult for google to know who you are on youtube or other non-google sites.

But to make it so they REALLY don't know who you are, you need to do the above plus use a VPN. In my own usage I've discovered that youtube will recommend you videos based on your IP address's recent views if you're not logged in.


You can always use YouTube through an alternate resource. Try invidio.us


Up next for me is great, always suggests videos of people I usually watch but haven't yet seen - or similar videos it thinks I might like.

It very rarely shows any news outlets.


Alternative video-hosting sites are indeed pretty difficult to use as a complete replacement for YouTube. I’ll say that PeerTube is heading in a good direction, but even if it continues to gain ground, it will surely take a long time. If your concern is mainly telemetry, you can however use something else like invidio.us, which I’ve been using alongside it since I deleted my Google account.


I set up a daily script to download new videos from channels I like using youtube-dl. It works really well, I rarely visit the actual YouTube site anymore.
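
A minimal sketch of what such a daily job might look like, assuming youtube-dl is on the PATH and that its `--download-archive` option is used to skip videos fetched on earlier runs. The function names, channel URLs, and file paths are hypothetical placeholders, not the commenter's actual setup.

```python
import subprocess
from typing import Callable, Iterable, List


def build_download_command(channel_url: str, archive_file: str, output_dir: str) -> List[str]:
    """Build a youtube-dl command that skips videos already listed in the archive file."""
    return [
        "youtube-dl",
        # youtube-dl records each downloaded video ID here and skips it next time.
        "--download-archive", archive_file,
        # Keep going if one video in the channel fails.
        "--ignore-errors",
        # Output template: one folder per uploader, files prefixed with the upload date.
        "-o", f"{output_dir}/%(uploader)s/%(upload_date)s - %(title)s.%(ext)s",
        channel_url,
    ]


def download_new_videos(
    channels: Iterable[str],
    archive_file: str,
    output_dir: str,
    runner: Callable = subprocess.run,
) -> None:
    """Run one youtube-dl invocation per channel; `runner` is injectable for testing."""
    for url in channels:
        runner(build_download_command(url, archive_file, output_dir), check=False)
```

Calling `download_new_videos([...], "archive.txt", "videos")` from a small script scheduled once a day (for example via cron) would give roughly the behaviour described; the archive file is what keeps repeat runs cheap.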


[flagged]


This is Scott Wadsworth’s latest video. It’s not narcissism, it wouldn’t be better as a blog post, it doesn’t drag on to run more ads, and it isn’t low brow or toxic.

There are billions of hours of content on YouTube. You haven’t seen a representative sample.

https://youtu.be/cv6BdwMe560


That's an excellent example of the wealth that is on youtube. As he was going through those knots, I could think of a dozen ways to use each one. They're so simple, but effective -- and with zero waste.

He's got a real wisdom in his voice and style.


I have always thought that, out of all the hype and cambrian explosion that was Web 2.0, Youtube was the only site that clearly improved the world. There is an astounding amount of great content on Youtube. Turns out that giving normal people a platform, and a way of sharing, can have results which are utterly amazing.

I'm not happy that Google basically has a monopoly on this, but honestly, I consider youtube to be a force for good. And even more honestly - to some extent Youtube, and the fantastic content creators I've discovered on it, has gone a long way towards restoring my faith in humanity. Man there are some smart, wise, knowledgeable people on this silly little blue speck we call home. And man am I glad they feel the need to share and educate the rest of us.


"but honestly, I consider youtube to be a force for good."

The comments section makes me think the internet was a mistake.


The people making those comments would exist with or without the internet. Don't you think it's better they at least have the chance to watch the video?

There's a little revolution that goes on in the hearts and minds of every geek on a site like this when they really, truly realise and admit that 50% of people have an IQ below 100. We need solutions, not complaints. You can't just kill them all.


I don't have any intention to do any construction work any time soon, but those are some very useful tips and I enjoyed the video. Thank you!


Thanks for that link. I'm a DIYer and this is decent quality content. I've subscribed.


You just made me watch some random, surprisingly interesting crafting dude, use string. On things.


I don't have audio on this VM host. And CC is way too annoying. But based on several seconds of it, this ~19 minute video could be maybe 2K words plus ten images. Then I might actually read it.


I think it's safe to say that 99.999% of people do not share your bizarre youtube-watching setup. I think it's also safe to say that the 60s-ish master craftsman featured is never even going to attempt to cut down his demonstrative, highly visual content to fit in an inferior, limited blog post.

I don't even know what your point is. How about you try not watching videos through a VM with no sound.


True. But even if I had sound, I'd pick text over video. I don't have the patience for unedited rambling.

I don't have sound because it'd be a security vulnerability. Given that my wife has a mobile on 24/7.


If you prefer text that's fine. But you must be cognisant that a very large number of people do not, and that in many cases you're losing a lot in the translation - including that the translation may never be made in the first place. Personally, I am fine with both. There is a right tool for the job.

> I don't have sound because it'd be a security vulnerability

This just sounds to me like a kind of narcissistic paranoia. Dude, the NSA isn't spying on your youtube habits, and if they were - the VM, or anything else, wouldn't help you.


Do you or did you consume all of your educational content through static text and images only? Or maybe, just maybe, a lot of things were explained and shown to you?


Lectures and rambling are distinguishable.

Some of my professors did ramble, now and then, but that was mostly OK.


Some things are better communicated through a video, such as a how-to (my example would be lockpicking; have a look at Bosnianbill's lockpick channel [1]).

[1] https://www.youtube.com/user/bosnianbill


There is so much political outrage-porn on YouTube, it's created an alternate reality.


pleroma (and i think maybe mastodon) provide media proxy that i think work for youtube. so when one person shares a video, one instance serves it to all the other users


Wouldn't that just be sharing a link to YouTube? I.e. not re-hosting the video (which would probably violate copyright).


The problem is that JS Fonts and other CDNed stuff won't load and websites will hang or work weird - particularly Stackoverflow. Bc it's all over https you can't MITM it and inject your own with OpenWRT/piholes. Decentraleyes (a Firefox browser extension) fixes some of this, but not all. If anyone has any additional suggestions, please let me know (it makes life bearable in China without a VPN)


Are there any extensions that modify external resources and point them towards a "trusted" cdn? e.g. requesting <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.4.1/jquery.mi...

Would automatically remap to

https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.m...
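
The remapping asked about here could be sketched roughly as follows. This assumes the two CDNs share the same path layout for a given library, which is what the two example URLs above suggest but is not true in general; the mapping table, function name, and example path are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping from a CDN host you want to avoid to one you prefer.
CDN_REMAP = {"ajax.googleapis.com": "cdnjs.cloudflare.com"}


def remap_cdn_url(url: str, remap: dict = CDN_REMAP) -> str:
    """Swap the host of a CDN URL, keeping scheme, path, query and fragment."""
    parts = urlsplit(url)
    target = remap.get(parts.hostname or "")
    if target is None:
        # Not a host we know how to remap: leave the URL untouched.
        return url
    return urlunsplit(parts._replace(netloc=target))


# Example with a made-up library path:
# remap_cdn_url("https://ajax.googleapis.com/ajax/libs/example/1.0.0/example.min.js")
```

A real extension would also need to confirm that the target CDN actually hosts the same file and version, ideally by comparing hashes, before trusting the swap.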


It is great that you could locally cache the top X fonts in Google Fonts and never have to redownload them from Google's CDN. It's just too bad that having fonts locally installed or not can be a signal to trackers or otherwise it would be a lot easier to recommend to everyone to just install larger font banks.


Could we bundle more fonts with Firefox? Or provide a browser opt out for that behaviour...


Bundling more fonts with browsers and operating systems by default is probably the biggest way to do that.

(The corresponding problem with that being how many people would then blame that as browser bloat and complain about the size of all the fonts and how much they "clutter" one's font system.)

The browser would have to be pretty tricksy to solve the tracking problem with local fonts, because the tracking techniques themselves are pretty tricksy.

Such as: Render text to a GL target as fast as possible and hit detect the metrics of the asked for font versus the fallback font.

You would think techniques by the browser to minimize FOUT (flash of unstyled text) mitigates against this sort of tracking, but some of the techniques involve timing between JS load and DOM Ready events.

Admittedly there are easier tests than font loading tests for deanonymization on the web, but obviously if the goal is to de-Google it is worth keeping in mind.


Decentraleyes is an extension that replaces CDN references with local copies (Chrome, Brave, Firefox, Opera).


That’s a really cool idea


You can create a self-signed certificate for Google domains and trust it on your machines. Then you can MITM. This won't work well if you want to do it at scale, with a number of 3rd party users, but if the only user is you or your family, it should do the trick.


> You can create a self-signed certificate for Google domains and trust it on your machines. Then you can MITM.

Can you point to or write up a blog post with a proof of concept?


mkcert[1] is probably the easiest way to generate a root certificate and leaf certificate(s). Then you can use a proxy like Squid to intercept the traffic[2]. You’d also need a local DNS server to point hosts like fonts.googleapis.com to your own web server.

[1] https://github.com/FiloSottile/mkcert

[2] https://turbofuture.com/internet/Intercepting-HTTPS-Traffic-...

[Edit: Now that I think of it, I’m not sure if Squid is really required...]


Won’t work for Google as their certs are pinned


Not sure about other browsers, but Chrome will ignore certificate pins if the cert provided chains to a local trust anchor.

From: http://www.chromium.org/Home/chromium-security/security-faq#...

"Chrome does not perform pin validation when the certificate chain chains up to a private trust anchor. A key result of this policy is that private trust anchors can be used to proxy (or MITM) connections, even to pinned sites. “Data loss prevention” appliances, firewalls, content filters, and malware can use this feature to defeat the protections of key pinning.

We deem this acceptable because the proxy or MITM can only be effective if the client machine has already been configured to trust the proxy’s issuing certificate — that is, the client is already under the control of the person who controls the proxy (e.g. the enterprise’s IT administrator). If the client does not trust the private trust anchor, the proxy’s attempt to mediate the connection will fail as it should."


Does anyone know how Chrome distinguishes a private trust anchor from all the other root certificates that are provided by the operating system? (Comodo, Comsign, Digicert et al)


I use a very similar setup (based on unbound): for Stackoverflow to properly work you need to whitelist ajax.googleapis.com.

> it makes life bearable in China without a VPN

If you're already a firefox user, you might try the "FoxyProxy Standard" extension to selectively bypass the GFW for the domains you need. Friends in China are reporting a varying degree of success with setting up forwarding on Apache (TLS1.3 with padding). Obvs, don't forget to set authentication. Once you're there you can add your own DoH to the mix.


Just this morning I set up a greasemonkey script to rewrite those URLs to a local webserver (things like ajax.googleapis.com serving things like jquery). Pages load faster now too. Very limited, but works in many cases:

  // ==UserScript==
  // @name     localize ajax googleapis
  // @version  1
  // @grant    none
  // @run-at   document-start
  // ==/UserScript==

  // Re-point scripts hosted on ajax.googleapis.com at a local web server.
  var scripts = document.getElementsByTagName("script");
  for (var i = 0; i < scripts.length; i++) {
    // Keep a reference: `scripts` is a live collection and shifts on insert.
    var old = scripts[i];
    // Inline scripts have no src, and new URL("") would throw.
    if (!old.src) continue;
    var url = new URL(old.src);
    if (url.host === "ajax.googleapis.com") {
      url.host = "ajax.googleapis.com.local";
      var newscript = document.createElement("script");
      newscript.type = "text/javascript";
      newscript.src = url.href;
      // Swap the original element for the rewritten one.
      old.parentElement.replaceChild(newscript, old);
      console.log("Rewrote url as " + url);
    }
  }

EDIT: I just read the other comments and installed decentraleyes. I'm sure it's way better than this grease I just posted.


I’ve added your suggestion to our list, and I will see if I can make a separate list mainly for those dedicated web services. For myself, I always block all JS fonts and CDN domains, and I only really use Decentraleyes for that (or LocalCDN as an alternative). Most of the time it's usable, but not in the few cases where no content at all is being pulled from those domains.


Speaking as someone in China on business at the moment, does anything really make life in China bearable without a VPN?


The real-time censoring of all the news channels whenever the escalation in Hong Kong comes up is a bit too much for me.


You should look at the firefox decentraleyes plugin. Would be nice if pihole implemented something like this.


I find Google Container to be an excellent plugin to segregate my Google account from the rest of my browsing. It's not an official plugin from Mozilla, but it is forked from the Facebook container plugin.


Same here - been using Google Container for 6+ months now and very happy with it. Highly recommended - you can do this yourself with just normal containers in Firefox, but this comes preconfigured with all the non-obvious domains you might not know about. No connection - just a satisfied user.

https://addons.mozilla.org/en-US/firefox/addon/google-contai...

Only problem with it is now reCAPTCHA sites are a huge pain to use since you have to answer about 15 challenges before you can get in (since you look totally unknown to Google outside of the container). It is often better to just ignore these sites now, but it is not always possible.


I just use Firefox containers. It puts every site into its own container. Then I made a "Google" container so that at least my login will hold across the various Google services.


Check into Containerise. You can set up wildcards and get the same effect with a lot more bad actors. Facebook properties are the worst in my opinion, with Google being a close second.


Another approach is whitelisting. Like a default firewall rule of "block all" and a set of specific exceptions, I find this approach can be easier to manage. Probably not going to work for everyone but works for me.

Figure out what domains I need to access for the content I am after[1] and just allow those. "Block" everything else. For example, I might need something like .googlevideo.com once in a while but I will never need something like googletagmanager.net.

1. To do this, I just go through the logs of a local authoritative nameserver that I run solely for this purpose, i.e. collecting lists of needed domains. Then I add the necessary DNS data to /etc/hosts or another local authoritative server, e.g., tinydns. I believe unbound or pdns_recursor can serve static data as well.
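
The "block all, allow a few" lookup described above can be sketched as a suffix match on DNS labels. This is only an illustration of the matching logic, not how tinydns, unbound, or /etc/hosts implement it; the function name and the sample hostname are hypothetical.

```python
from typing import Set


def is_allowed(name: str, allowlist: Set[str]) -> bool:
    """Return True if `name` or any parent domain of it is on the allowlist.

    Matching is done label by label, so "googlevideo.com" allows
    "cdn.googlevideo.com" but not "notgooglevideo.com".
    """
    labels = name.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        if ".".join(labels[i:]) in allowlist:
            return True
    # Default policy: anything not explicitly allowed is blocked.
    return False


# Example policy taken from the comment: allow googlevideo.com, nothing else from Google.
ALLOWLIST = {"googlevideo.com"}
```

With that allowlist, a lookup for a video host under googlevideo.com passes while googletagmanager.net is refused, which is the trade-off described: less to maintain, but every newly needed domain has to be added by hand.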

Does the author mention avoiding Google as a third party DNS service? In the beginning, PiHole, i.e., preconfigured dnsmasq, was pointed at some third party DNS service, maybe Google. Not sure what the default configuration is today. If it was Google, then is there any irony in that a project designed to block ads is by default having its users send their IP and ISP location to an advertising company probably hundreds if not thousands of times over in a single day of web use?


>Another approach is whitelisting. Like a default firewall rule of "block all" and a set of specific exceptions, I find this approach can be easier to manage.

I tried the whitelisting approach but quickly found out this breaks many websites with shopping cart and credit-card checkouts because they use payments api gateways. Because the url for the card processing gateway is a different company from the ecommerce site you're visiting, it has a totally different spelling so you can't predict what to put in a whitelist beforehand. In turn, if you do whitelist the payment gateway url, you might then find out it makes another api call to a fraud detection url which is another totally different url that you didn't know you had to whitelist.

Whitelisting DNS entries is workable for use inside of a single virtual machine that deliberately restricts a web browser to access a few websites like youtube.

However, I don't see how it's possible to use the whitelisting strategy on a PiHole that globally filters the entire family accessing it with multiple desktops and smartphones. It's not easy to tell if a spinning hourglass or beachball is happening because a website is slow or whether the whitelist is missing some url entries. The family members would constantly be visiting new and legitimate urls so it seems very cumbersome to try and keep up with adding new whitelist entries for everybody.


For commercial web use, I use a DNS cache just like the website creator would expect; I use a popular browser in these instances, too. Nothing out of the ordinary. For exactly the reason you mention. If something goes wrong I want to be able to say I am the "typical user", not an enlightened one.

However, I rarely use the web for commercial purposes. Almost all use is non-commercial.

I do not use a Pi-Hole. I do like dnsmasq. I prefer djbdns. I use older hardware running Net/OpenBSD as routers and newer hardware running OpenWRT.

I also do not use popular graphical browsers much. I probably would not use whitelisting if I was doing all web use via a popular graphical browser. I get reasonably consistent speed across all websites by using text-only browsers and tcp/http clients.

Cannot really speak for other users. Everyone is different. For me, whitelisting works well.


I have been running a pi-hole server at my home for almost a year. We have, at times, around thirty devices on our network, (thermostat (non-nest), several Google Home devices, numerous phones, 4 desktops, 4 laptops, 3 ipads, 1 TV, a chromecast/roku/firestick, a few smart receptacles, and a Xfinity modem) and sometimes the traffic is pretty neat to examine. It's interesting to see which devices phone home.

Whenever a necessary site is blocked it only takes a few seconds to whitelist it. I can also easily blacklist sites. The GUI is very easy to access and use. We have never had an issue with YouTube (YT premium) or anything else really, but occasionally a link will be blocked because of Google or other ad traffic. This has never happened with YT or any other streaming services.

One thing to remember is VPN traffic ignores the Pi-Hole server. Even when the router/computer/device DNS is set to use it. This has never been an issue for us, as only a handful of devices here are using VPN, but I suppose it could be under the right circumstances, but easily fixable.


>GAFAM

Never seen it listed out like that, I thought it was FAANG. Or is FAANG only used in reference to top salaries in the Bay Area?


Yeah, it's because I'm French, so I proposed including this in the readme.md as GAFAM, as other more "international" names for this group were less prominent. If you think that FAANG is a better acronym, I’ll change it.


GAFA is the french-speaking equivalent of FAANG (although they all seem to omit Amazon and Netflix).


GAFA(M) omits Netflix but not Amazon, since it stands for Google Amazon Facebook Apple (Microsoft)


Yes I understand - I was talking more about the ordering of them. English-speaking websites tend to use FA(A?)NG while French-speaking ones tend to use GAFA(M?).


If you don't mind blocking everything hosted on GCP as well:

> dig TXT +short _netblocks{,2,3}.google.com | tr ' ' '\n' | egrep "(ip4:|ip6:)"

Gives you a full list of all of Google's IP blocks. You can just blackhole those.
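
Turning that command's output into something a firewall can consume might look like the sketch below. It only parses `ip4:`/`ip6:` entries out of SPF-style TXT text; it makes no claim about how complete those records are. The function name is hypothetical and the sample uses reserved documentation ranges, not Google's real ones.

```python
import ipaddress
from typing import List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def extract_netblocks(txt_records: str) -> List[Network]:
    """Pull ip4:/ip6: networks out of SPF-style TXT record text."""
    networks: List[Network] = []
    # dig prints TXT data wrapped in double quotes; drop them before splitting.
    for token in txt_records.replace('"', " ").split():
        # Split on the first colon only, so IPv6 addresses stay intact.
        mechanism, _, value = token.partition(":")
        if mechanism in ("ip4", "ip6") and value:
            networks.append(ipaddress.ip_network(value, strict=False))
    return networks


# Made-up sample in the same shape as the dig output (documentation ranges only).
SAMPLE = '"v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.0/24 ip6:2001:db8::/32 ~all"'
```

The resulting network objects can then be written out in whatever syntax the router or firewall expects.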


This is far from a complete list - I got only 8 netblocks with that command. Try this one instead which queries Merit RADb:

    whois -h whois.radb.net '!gAS15169'


That's just their SPF record. It's only a list of IPs that google.com email might come from (or any domain that imports those records)


It's in SPF format, but it's also everything. See e.g. https://cloud.google.com/appengine/kb/

Another method is using GeoIP's ASN database, but they also run many ASNs so it would require a little effort to ensure you have them all


how come this seems to only work with ".google.com" tld?

I tried .apple.com, .yahoo.com, etc. and got nothing.


It's not a shortcut or anything, those are just Google's SPF records (which as noted above are drastically incomplete, they are only the IPs from which Google sends email).


Has anyone actually used this? Does the web become completely unusable? I suspect blocking their fonts and their CDN for jquery would be enough to make most of the web unusable.


You can use the Decentraleyes add-on to deal with jQuery on a CDN


I can, but getting my whole house to use it including the iPhones may be a bit tough (this is a Pi-hole add on so it needs to work without device changes)


I get that. Using uMatrix, it becomes really obvious how many websites are reliant on jQuery and likely don't really need it.


Well, I have, actually. What I usually do when a site is really not usable is temporarily whitelist the blocked domains, only for the specific site I’m trying to access. This obviously partially defeats the purpose of the filter list, though.


How does this deal with recaptcha? That thing is the bane of my web browsing experience, but at least with my current umatrix setup I can toggle it back on in 30 seconds if I need to pass the challenge; if I need to remote into the DNS every time I hit a challenge it is a no go for me


It says at the bottom that you might want to whitelist recaptcha


I have run a Pi-Hole server in my home for almost a year and have never had an issue with recaptcha.

Anything that is blocked is SUPER quick and easy to resolve through the web GUI (but recaptcha has never been an issue). It seems to be smart enough that I rarely have to access it to unblock anything.


This makes me think of the "Cutting the Big 5" article [0] that was on here a few months back. While I agree with a lot of the sentiment here it seems that a complete block is actually impractical. Instead I would love to see these kinds of projects not just blanket cover all of FAMG, but rather target the most nefarious ones. I definitely don't know if this is even possible. But is there a way to use services but cut out a significant portion of the tracking? Those are the curated lists I'd love to see.

[0] https://gizmodo.com/i-cut-the-big-five-tech-giants-from-my-l...


This is overkill.

A way simpler solution is to simply not have a registered account with those companies. That's where the problems start, when they tie certain browsing and telemetry data to your true identity.

For everything else a good content blocker + the typical pihole list that include telemetry domains are enough protection.

I am registered with Apple and Amazon, and there's no way for me to change that because there is simply no one else that delivers this kind of value.

Long-term I could see the possibility of leaving Amazon, but there is a security advantage to using Amazon, because otherwise I would leave all my personal data with countless small vendors who regularly get hacked, etc.


This is actually not feasible as a solution because of shadow profiles. Google et al. track you even when you are not logged in. Simply landing on a page is enough to capture your usage habits and infer browsing/purchasing patterns from them. Look at Google Purchases, which was revealed to many just a bit ago. It was retroactive for sure, just scanning our inboxes, which Google does have access to, but it can use known information to match seemingly anonymous data from referred info in the anonymous chain.

It's not really a choice to say "just don't use it", because even appearing on a site with Google tie-ins feeds mineable information.


You contradict yourself. Google Purchase history requires a google account, they can't connect it to you if you are not logged in.

The reason google pushes the log-in in their browser is exactly because they want to be able to tie this all to your account.


No, that was an example, not a requirement. Google has this history they associate whether or not you have a Google account. The account just solidifies it. You're still being tracked and identified without the account.


And that's an assumption you make that requires evidence. Your claim is that there is not only this kind of identification happening, but that it happens even if I have the common tracking blockers. Otherwise the blocklist in this thread would be completely overkill, just as I said.


It's folly to think an account is the only way for advertisers and retailers to identify and track you.

Signing up for an account is just more explicitly forking over and sharing data. But you're being tracked by every possible method, and it is possible to piece together the remaining information.


Where is your evidence?

Did you even read my comment? The typical pihole lists already include tracking domains.

What you imply is that if I visit youtube.com with tracking disabled, they still create a profile of me. Then tell me, what unique identifier do they use?


Browser fingerprint, referrers, available cookies, IP address immediately come to mind.


OK, but how do you account for changing IP-Addresses? You would need to take one identifier that never changes.

Maybe they have machine learning which combines a couple of factors and then create long-term profiles based on the likelihood of some data belonging to the same person.

That would be illegal though.


That’s what shadow profiles are: they are not necessarily illegal (depends on jurisdiction, illegal under GDPR), they act on probabilities (this particular set of data identifies a person and is significantly different from/similar to other sets of data), they look for patterns in data and behaviour (a user with browser fingerprint A from Pasadena that looks at youtube videos of category X yesterday and today is likely to be the same person even if IP changes)


I would like to simply block irrelevant YouTube ads while my toddler indulges in ‘Land before time’ episodes. Is that possible? Last time I checked they use some randomised domains to load ads...


Firefox with uBlock Origin takes care of that. I haven't seen a YouTube ad in years.

Yes, they use randomised hostnames, but there are other parts of the URL that are not.

If you don't want to use a browser extension for pattern matching the whole URL, you're gonna need a transparent proxy in your network's gateway.
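
To make the difference concrete, here is a minimal Python sketch. The hostnames and the `/ad_segment/` path marker are invented for illustration and are not YouTube's actual URL scheme; the point is only that a DNS blocker sees the hostname, while a URL-level filter (extension or proxy) can also see the path.

```python
from urllib.parse import urlsplit

# Hypothetical marker: stands in for whatever stable part of the URL a filter rule targets.
AD_PATH_MARKER = "/ad_segment/"

def dns_view(url: str) -> str:
    """What a DNS-level blocker can see: the hostname only."""
    return urlsplit(url).hostname or ""

def url_filter_blocks(url: str) -> bool:
    """What a URL-level filter can do: match on the path, whatever the hostname is."""
    return AD_PATH_MARKER in urlsplit(url).path

# Made-up URLs with randomised-looking hostnames.
urls = [
    "https://r4---sn-abc123.example.net/ad_segment/clip?id=1",
    "https://r7---sn-xyz789.example.net/ad_segment/clip?id=2",
    "https://r4---sn-abc123.example.net/video/clip?id=3",
]

for u in urls:
    print(dns_view(u), url_filter_blocks(u))
```

In this made-up data the first and third URL share a hostname, so blocking that name in DNS would take out the video along with the ad; only the path check tells them apart.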


Would that be worth €11.99/month to you? (YouTube Premium)


A router with dnsmasq: it supports wildcards, but for it to work, devices must use only your router as their DNS server.
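
As a rough sketch of what that looks like: dnsmasq's `address=/domain/ip` option answers for the domain and every subdomain under it. The helper below (a hypothetical name, with placeholder domains) just turns a list of domains into such lines for a dnsmasq config file.

```python
def dnsmasq_block_lines(domains, sink="0.0.0.0"):
    """Return dnsmasq `address=` lines; each one covers a domain and all its subdomains."""
    # Normalise case, drop trailing dots and blanks, and de-duplicate.
    unique = sorted({d.strip().lower().rstrip(".") for d in domains if d.strip()})
    return [f"address=/{d}/{sink}" for d in unique]

# Placeholder domains for illustration only.
print("\n".join(dnsmasq_block_lines(["Ads.example.com", "tracker.example.org"])))
```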


Try the YouTube Enhancer addon in Firefox


I like that this site punts on reCAPTCHA. The web is not usable without Google services, which fact is useful in talking about the impossibility of consent as a model for regulating Google.


Alternatives exist. The web might not be entirely usable without them, but it can be up to a certain point. We've separated the list into multiple categories, so that it's easier to block only some major parts of their services. And we've indicated the domains to whitelist in case you have issues with reCAPTCHA.


My issues with recaptcha don't matter. I can't use a large part of the web without enabling recaptcha.


If you block AWS, Google Cloud, and Azure... that is pretty much the whole internet.


But that isn’t the point of this blocklist tho.


Nice idea, but you're better off accomplishing it with TLD wildcards or AXFR transfers than a hosts list, since new sub-domains are always being created and rotated.
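
A small Python sketch of why: a hosts list only matches names it already contains, while a domain wildcard also catches subdomains created later. The function names and the `example.com` names are placeholders, not anything from the actual list.

```python
def hosts_list_blocks(name, blocked_hosts):
    """Hosts-file semantics: exact hostname match only."""
    return name.lower().rstrip(".") in blocked_hosts

def wildcard_blocks(name, blocked_domains):
    """Wildcard semantics: block if the name or any parent domain is listed."""
    labels = name.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in blocked_domains for i in range(len(labels)))

known_hosts = {"ads.example.com"}   # what a hosts list knew about yesterday
wild_domains = {"example.com"}      # one wildcard entry for the whole domain

new_host = "new-123.ads.example.com"  # subdomain that appeared today
print(hosts_list_blocks(new_host, known_hosts))   # False
print(wildcard_blocks(new_host, wild_domains))    # True
```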


Hosted by Microsoft, suggests blocking Microsoft, too.


I've proposed the idea of moving it elsewhere: either to an alternative service like GitLab, or even to a self-hosted Gitea instance.


> Simply go into to your blocklist settings

Whatever does that mean? Is that a browser thing or a firewall thing or something else entirely?


Presumably it's a setting on Pi-Hole: https://pi-hole.net/


Why is Apple on this list? What have they done?


"by their size, they are particularly influential on the American and European Internet both economically and politically and socially and are regularly the subject of criticism or prosecution on tax matters, abuses of dominant positions and the non-respect of Internet users' privacy." In short, that's what it is. But we can pinpoint each case one by one. This filter list only concentrates on Google for the moment tho.


This will not help much, as some Google products use IP addresses (v4 and v6) directly too.


CIDR blocks are your friend, at the firewall.


Know of a lib to manage them?


Not offhand. The Routeviews project is a useful way to turn them up, though.

http://www.routeviews.org/routeviews/

Particularly reverse DNS queries.
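
For the bookkeeping itself, Python's standard `ipaddress` module may be enough. A minimal sketch, using documentation-reserved ranges rather than any real Google allocation (the helper names are my own):

```python
import ipaddress

def collapse(cidrs):
    """Parse CIDR strings and merge adjacent/overlapping blocks, IPv4 and IPv6 separately."""
    nets = [ipaddress.ip_network(c, strict=False) for c in cidrs]
    # collapse_addresses() rejects mixed IP versions, so split them first.
    v4 = ipaddress.collapse_addresses([n for n in nets if n.version == 4])
    v6 = ipaddress.collapse_addresses([n for n in nets if n.version == 6])
    return list(v4) + list(v6)

def is_blocked(addr, networks):
    """True if the address falls inside any of the given networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip.version == n.version and ip in n for n in networks)

# Documentation-only ranges (RFC 5737 / RFC 3849), standing in for real allocations.
blocked = collapse(["192.0.2.0/25", "192.0.2.128/25", "2001:db8::/32"])
print(blocked)
print(is_blocked("192.0.2.200", blocked), is_blocked("198.51.100.1", blocked))
```

The collapsed list is what you would feed to the firewall; fewer, larger blocks keep the rule set short.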


At the moment, things like YouTube and Twitter have become culture, and 'technical solutions' to unrestrained greed and surveillance seem to miss the big picture of the cultural value of this rich fabric of human communication.

The value of these platforms is not technical; it comes entirely from the human element, and everybody should be able to participate without opening themselves to surveillance and abuse.

As with everything else needed to run a civilized society, we need laws, and it's unfortunate that this basic first principle of organizing human society still needs to be reiterated and debated in 2019 because of propaganda by the Koch brothers and their ilk for a self-serving libertarianism that is as fantastic as a Disneyland version of reality.



