Google Tag Manager, the new anti-adblock weapon (2020) (woolyss.com)
1384 points by thyrox on Feb 21, 2022 | 869 comments



I'm the author. Good to see this on HN, raising awareness of the topic.

I don't know who made the translation and when it was made, but the original article in French (https://pixeldetracking.com/fr/google-tag-manager-server-sid...) contains more information on recent GTM "improvements": mainly on how you can easily change JS library names, and detailed instructions on how to host your container in other clouds or self-host.


> I don't know who made the translation and when it was made

This page was saved with SingleFile (I'm the author of SingleFile). Therefore, I can tell you that this page was produced on Tue Dec 08 2020.


Thank you for making SingleFile, it's been an absolute lifesaver in a project I'm working on. I was having a lot of trouble trying to manually save pages with puppeteer but the singlefile CLI worked perfectly, even with added extensions. (To get extensions to work I had to add --browser-headless=false --browser-args ["--enable-features=UseOzonePlatform", "--ozone-platform=headless", "--disable-extensions-except=/path/to/extension", "--load-extension=/path/to/extension"] )
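
For anyone trying the same thing, the full invocation looked roughly like this (URL and paths are placeholders, and the exact argument syntax may differ between single-file-cli versions, so check its docs):

    # placeholders only; verify against the single-file-cli documentation for your version
    single-file "https://example.com/some-page" \
      --browser-headless=false \
      --browser-args '["--enable-features=UseOzonePlatform", "--ozone-platform=headless", "--disable-extensions-except=/path/to/extension", "--load-extension=/path/to/extension"]'

Note that the JSON array passed to --browser-args needs to be quoted so the shell hands it over as a single argument.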


Thanks for the feedback! It's very timely; I have an open issue that discusses the problem of sideloaded extensions (and profile data).


Uhm, can you pack all those options in a simple "--E" or somesuch...


Gawd I love HN you beautiful bastards.


SingleFile and SingleFileZ are great!

It's a shame that Manifest V3 will hamstring the size of pages that can be saved (to 43MB, if I remember the SingleFile Lite GitHub page correctly).


Thank you! Actually I did some tests recently with SingleFile Lite and was able to save a page weighing 120MB+, so the 43MB limit seems obsolete. There are still some annoying issues though.


Thanks for the info! Maybe it's Jerry: https://info.woolyss.com/


Hello pixeldetracking, yes, it is me ;) I translated your excellent page and made an HTML archive with the excellent SingleFile extension. Thank you very much for everything. I like to keep a copy of interesting content. https://chromium.woolyss.com/#guides

Regards


Hello pixeldetracking, Excellent article! Bravo. About the translation, yes, it is me! ;)


This kind of data collection abuse is why I think we need more addons like AdNauseam [1]. Unlike uBlock Origin, it's not available from the Chrome web store anymore, which is a good sign that Google hates these types of addons more than they hate simple blockers.

Blocking A/AAAA domains with custom URLs to prevent tracking is almost impossible, so instead let's flood the trackers with useless, incorrect data that's not worth collecting.

[1]: https://addons.mozilla.org/en-US/firefox/addon/adnauseam/


Completely agree. Stuff like uBlock Origin is just online self-defense against hostile megacorporations. Maybe it's time we started going on the offensive by poisoning their data sets with total junk data with negative value. They insist on collecting data despite our wishes? Okay, take it all.


I like the cut of your jib, and I would like to subscribe to your newsletter.


I worked for an agency a couple of years ago when, out of the blue, the tracked data contained tons of random values instead of the expected UTM parameters. It took us a while to figure out what was happening. It was some kind of obfuscating plugin that was messing up well-known tracking parameters.

What I want to say is: stuff like that could actually cause a lot of fun on the other side.


Does anyone know which addon that might've been? Seems like a good addition to adnauseam.


Yup. I've used NoScript for years, and one of the most frequently appearing sites that remain blocked is googletagmanager.

I totally second the sentiment that this is merely minimal defense against hostile 'service providers'.

This avalanche of tracking libraries is now almost as toxic as email spam in its worst-controlled days. Much of the internet is literally unusable, as pages take dozens of seconds to minutes to load - on a CAD-level laptop that can rotate 30MB models with zero lag.

In fact, does anyone have a blacklist of trackers that we can just blackhole at the hosts file or router level? Maybe it's time to set up a Pi-hole?


In my experience the trackers that show up most often in NoScript are googletagmanager and facebook, so with just two domains you can get a lot. But e.g. Bloomberg uses a full first-party proxy for the Facebook pixel with a pseudorandom base URL; it's difficult to block even by URL. I suspect they duplicate the page request to Facebook too, but this is unobservable on the client side. Hopefully this solution doesn't scale well.
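
For a hosts-file blackhole, a minimal (far from exhaustive) starting point would be something like:

    # minimal example entries; real blocklists cover thousands of hosts
    0.0.0.0 googletagmanager.com
    0.0.0.0 www.googletagmanager.com
    0.0.0.0 www.google-analytics.com
    0.0.0.0 connect.facebook.net

But as noted, entries like these do nothing once the tracking moves to a first-party subdomain of the site itself.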


This is my go-to: https://github.com/StevenBlack/hosts

It helps a lot.


Since this extension actively clicks on ads which may trigger payments, how do ad-fraud services classify endpoints running this extension? Could they consider this malware and add the client IP to blacklists?


> Could they consider this malware and add the client IP to blacklists?

Do malware developers consider the countermeasure software created to resist them to be malware as well?


If we were to split what malware does into Infection (getting into the system), Avoidance (hiding from the system or AV, or attacking the AV) and Work (sniffing, sending spam, etc.), then Avoidance would be by far the biggest and most complicated (and most interesting) category.


They absolutely do.


Good. If it is a shopping or some other service that charges money, then they lose business.

If it is some service that you have no choice but to use, but relies on network effects (like Facebook Events), then you can just send a screenshot to the interested party and they might consider not using a service that is broken for other people.


Sure, and perhaps also the accounts of users running this while logged-in. Have contingency plans if you run this and your, say, GMail account is blocked.


It is precisely why I degoogled my life.

I did not want to live under the constant threat of big G locking me out of my own life anymore.


Anyone still using gmail today for anything other than throwaway purposes is behaving foolishly.


You sound like you are living in a bubble. This is like asserting anyone who owns a car is being foolish.


I lost my Gmail account a decade ago. Since then, year after year, I've been watching people suffer the same fate with Gmail, YouTube, Google Play, etcetera. There's always someone who won't believe that Google can screw you over all of a sudden. There's always someone who will be surprised, always someone who thought it couldn't happen to them...

I don't know what else I can say. It's a shame I haven't been maintaining a list of all incidents I've come across.


What's the jellybean alternative these days?


With a bit of luck, it gets server owners banned from AdMob/MoPub/etc for fraudulent clicks.


I wish, but I haven't stopped receiving ads yet.


Can uBlock do payload inspection? It would be easy to block an upstream JSON POST that matches a certain structure.


Interesting idea, installed the addon.

I'm using MS Edge, BTW. Microsoft doesn't care about Google's advertising revenue, so the addon is available in their marketplace.


Microsoft doesn't care because they collect everything through their desktop environment. That's why you need an email to set windows up now.


> they collect everything through their desktop environment

There are many relevant questions during the install. If one actually uses the OS installation wizard GUI instead of skipping it with "next" buttons, Microsoft won't be collecting much.

Another thing: they don't have to, because their business model is honest. They're building software, and users are paying for it. Microsoft ain't an advertising company; they have little motivation to track people.

> you need an email to set windows up now

I did a clean installation of Windows 10 last week (recycled an old laptop after migrating to a new one), and the email was optional.


lol Windows 10 is essentially deprecated, dumbass


Luckily, ms provides throwaway mailboxes at outlook.com.


It's not going to be much of a throwaway once it's associated with every activity you do on your computer and the internet. In fact, it might be one of the most valuable email addresses (to Microsoft) that you ever make.


I am very interested in this, thanks for sharing.

Adding another party into my web browsing is always a tough pill for me to swallow. I am also a noob at reading trust signaling. What are some of the reasons that I should trust this dev and their processes?


You should not trust them. You can download the add-on and inspect it yourself, if you know some JS. Right-clicking yields this URL:

https://addons.cdn.mozilla.net/user-media/addons/585454/adna...

But it seems to include a lot of code, including some uBlock Origin code.
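
For what it's worth, an .xpi is just a ZIP archive, so a local look at the code can be as simple as (filename is a placeholder for wherever you saved it):

    # assumes the .xpi from the URL above was saved locally as adnauseam.xpi
    unzip adnauseam.xpi -d adnauseam-src
    less adnauseam-src/manifest.json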

Either way, this kind of sabotage might get you banned on Google. Be mindful of the risks, and have contingency plans.


You should put the same amount of trust in this dev as you should in any other. I myself trust Mozilla's store reviews enough to run the addon, but if you're more conservative with trust, you can inspect the source code and build the addon itself.

The addon comes down to a uBlock Origin fork with different behaviour. I believe most of the addon code is actually the base uBlock code base.

I haven't seen any obvious data exfiltration in my DNS logs, but then again I'm just another random on the internet. If you don't feel comfortable installing something with a privacy impact as broad as an ad blocker, you should definitely trust your instincts.


Will pihole automatically protect against A/AAAA domains if your blocked domain host file lists are updated regularly?


My experience is that Pi-hole has been getting less effective over time as more and more ads are being run through the same domains as legitimate content. When I first installed it, it killed ads on my Roku; that doesn't happen anymore.


What apps on your roku? I had to whitelist a Hulu domain cause it froze when trying to load ads during commercials for example, but when I look at the logs it’s blocking a ton of telemetry and phoning home 24/7 by Roku and Alexa devices.

Are you regularly updating your ad blocking filters? When ads start showing up on my phone I know it’s time to go hit the update button.


I replaced my Roku a while ago, and yes I keep my Pihole up to date.


I feel like the reason you initially used a strong word like "abuse" is to distract from the same behavior the blockers you mention engage in. Spamming Google event services and "flooding" them with garbage is surely in the abuse category too, at least if you're not an avid anti-ad proponent.


They simply have to stop shoving ads down my throat, if they do not want me abusing those same ads.


I used to use AdNauseam a while back, until ads suddenly started showing. So I switched to uBlock Origin, and the ads stopped showing again.

After I read your comment, I disabled UO and installed AN again, in case some update had fixed the issues. It didn't, so I'm now back to using UO again.


That's cool, but it's only going to save the 1% that knows how to bend the internet to their will. What we need is legislation, like this: https://www.theregister.com/2022/01/31/website_fine_google_f...

That would actually make a difference, not only for the HN crowd.


God damn... this is it, this is the end-game. There's no way to fight this unless you customize and maintain blocking scripts for each individual website.

Yes, websites could always have done this, but the cost of the (CDN-bypassing) REST requests and the manual maintenance of telemetry endpoints and storage were an impediment - one that Google now removes with a drop-in solution :(

I think Google is happy to eat some of the cost of the "proxy" server given the abundance of data they'll be gobbling up (not just each request's query string and the user's IP address but, since it sits on a first-party subdomain, all the first-party cookies as well). I don't have the time or energy to block JavaScript and/or manually inspect each domain's requests to figure out if they use server-side tracking or not.
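
To make the subdomain point concrete, the first-party setup typically looks something like this (hypothetical names; the exact record depends on where the container is hosted, e.g. App Engine custom domains are mapped with a CNAME):

    ; hypothetical zone entry for a server-side tagging container
    metrics.example.com.   CNAME   ghs.googlehosted.com.
    ; the page then beacons to https://metrics.example.com/..., so the request is
    ; same-site, survives third-party blocklists, and carries cookies scoped to .example.com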

I honestly don't know if there's any solution to this at all. Maybe using an archive.is-like service that renders the static page (as an image at the extreme), or a Tor-like service that randomizes one's IP address and browser fingerprint.


"I don't have the time or energy to block JavaScript and/or manually inspect each domain's requests to figure out if they use server-side tracking or not."

By default, I don't run JavaScript. I don't see blocking JS as a problem - in fact, it's a blessing as the web is blinding fast without it - and also most of the ads just simply disappear if JS is not running.

On occasions when I need JS (only about 3-5% of sites) it's just a matter of toggling it on and refreshing the page. I've been working this way for at least 15 years - that's when I first realized JS was ruining my web experience.

I'm now so spoilt by the advantages of the non-JS world that I don't think I could ever return. I'm always acutely reminded of the fact whenever I use someone else's machine.


> By default, I don't run JavaScript. I don't see blocking JS as a problem - in fact, it's a blessing as the web is blinding fast without it - and also most of the ads just simply disappear if JS is not running.

Years ago I was on the "people who block JavaScript are crazy" bandwagon, until just loading a single news article online meant waiting for a dozen ads and autoplaying videos to load. I spent more time waiting for things to finish loading than I spent browsing the actual sites, which killed my battery life. I'd get a couple of hours of battery life with JS on, and with it off, I could work all day on a single charge. It was nice.

Ever since then, I've been using NoScript without a problem. I've spent all of maybe 5 minutes, cumulative over the course of several years, clicking a single button to add domains to the whitelist. If whitelisting isn't something you want to do, you can use NoScript's blacklist mode, too.

> I'm now so spoilt by the advantages of the non-JS world that I don't think I could ever return. I'm always acutely reminded of the fact whenever I use someone else's machine.

I relate with this 100%.


> until just loading a single news article online meant waiting for a dozen ads and autoplaying videos to load.

That sounds like you not only didn’t block JS, you also didn’t block ads. Which is a very different argument. I only block 3rd-party JS by default (and that already requires a lot of whitelisting for almost every site that has any interaction) and I don’t have those issues because I also block ads.


There was a period around 2014 - 2016 where even if you used uBlock, ads would still get through. Even now, when I use computers that just have uBlock Origin installed, some ads, and especially autoplaying videos on news sites, still get through.


Tried NoScript for years and it was a pain. Too many of the sites I use need so many domains full of JS. So I think this will vary widely depending on the person and their preferred/needed sites.


It has to be said: there are people who can get by without JavaScript and those who can't. You can almost predict those who can and those who can't by their personality.

If you are heavy user of Google's services, Twitter and Facebook as well as many big news outlets and heavy-duty commercial sites then you're the 'JavaScript' type and stopping scripts is definitely not for you!

If you are like me and don't have any Facebook, Twitter or Google accounts and deliberately avoid large commercial sites like, say, Microsoft then you can happily switch off JavaScript and experience the 'better' web.

You know the type of person you are, so with this fact in mind there's no point me proselytizing the case for disabling JavaScript.


I can relate 100%. In the past I was constantly using Twitter, Gmail, et al. I was using different hacks to bend them, to the extent possible, to my will. Times changed, my personality changed, and the desire and need to use those services disappeared, so I naturally stopped using them. When people were talking about this or that service being down, I didn't notice it at all. I was also lucky enough not to rely on them in my $dayjob. I run my own mail server, host my website and run my scripts. Old-fashioned guy, let's say. It works well for me. Moreover, JS bloat is a red flag to stay away from certain services. Has served me well.


This seems like a broad generalization. JS continues to permeate every industry brought to the web. It's increasingly not optional as employers and governments mandate more and more web services. Doubtful that can be predicted by personality.


"...as employers and governments mandate more and more web services."

It's not compulsory, especially with government. I never deal with government on the web at a personal level. If they expect me to fill in forms I simply say that I do not have internet access and would they please send me a paper copy - which they're obliged to do by law - and the same goes for the census.

If the government expects me to do business with it on the internet then it will have to legislate to make it compulsory AND then provide me with the necessary dedicated hardware for said purpose.

Why would I act this way? Well, for quite some years I was the IT manager for a government department and I know how they work (or I should say don't work).

BTW, as IT manager I never used email within the department (emails sent to my office were handled perfunctorily by secretarial staff). If the CEO wanted to send me an important memorandum then he had to have it typed up on paper and personally sign it (and I would reciprocate in kind). When in government you quickly realize that atoms on paper, and especially a written signature, have real guaranteed worth - unlike ephemeral emails that can vanish without trace.

I'm forever amazed at the trust the average person has in these vulnerability-ridden flaky systems.


> If you are heavy user of Google's services, Twitter and Facebook as well as many big news outlets and heavy-duty commercial sites then you're the 'JavaScript' type and stopping scripts is definitely not for you!

I, unfortunately, use some of these services and similar ones, too, and it takes a few seconds to enable JS on them, and then the sites will work indefinitely afterwards.


I use NoScript with Firefox on Android (together with uBlock Origin). After I unblocked the websites I regularly use (and not the ad delivery domains), it doesn't get in the way that much.


Unblocking the sites you use removes the advantage of not being tracked by Google through tag manager though.


I've had Google Tag Manager blocked for years and sites have worked fine without it.


That's probably true. Part of the reason why I still also use an ad-blocker.


> Too many of the sites I use need so many domains full of JS

I hear you, but I wonder if you are being honest with yourself when you use the word need.

At this point, I view Google and Facebook as the equivalent of loan sharks. A loan shark does provide a service, but most people shouldn't use one.


Are you a web developer by any chance?


> Years ago I was on the "people who block JavaScript are crazy" bandwagon, until just loading a single news article online meant waiting for a dozen ads and autoplaying videos to load.

Seems like a clear case of "crossing the river to collect water" (as the Swedish saying goes)? This is what I use uBlock Origin (with the right blocklists) for, and it happens automagically. I did use uMatrix for quite a while, but eventually ended up ditching it because uBlock Origin worked so well.


uBlock Origin solves the problem you had too, without breaking multiple sites.


There's another, indirect benefit to blocking JavaScript.

Over time I have noticed a strong correlation between sites which don't work right without JS and low-quality content which I regret having spent time reading.

Most of the time I encounter one of these sites I now just close the tab and move on with a clear conscience.


"Over time I have noticed a strong correlation between sites which don't work right without JS and low-quality content...."

Absolutely true, I can't agree with you more. I've reached the stage where if I land on a site and its main content is blocked if JavaScript is disabled then my conditioned reflex kicks in and I'm off the site within milliseconds.

Rarely is this a problem with sites that I frequent (and I too don't have time to waste reading low quality content).


Any tips for high quality content sites? It truly is hard to find these days


Yeah, read HN!

There are stacks and stacks of them here on HN that are of excellent quality - I use HN as my 'quality' filter (and I reckon I'm not alone).

Moreover, if one doesn't run JS like me, then it's dead easy to avoid problematic sites as HN lists them (Twitter, etc.), and it doesn't take long to get to know the main offenders and thus avoid them.

:-)

BTW, I agree with you that it is hard to find good sites these days, but eventually most really good sites appear here on HN. Do what I do: when you come across them, bookmark them.


A pedantic note that follows from this particular thread: HackerNews’s search capabilities are powered by Algolia and require JavaScript to work (turn off all JS and the HN branded Algolia page will not load). The reason I bring this up is that even good websites sometimes lean on free or free-ish services to provide extra functionality (such as calendars, discussion boards, issue tracking, or search) without realizing that such functionality may be a back door to letting JS in and any tracking/privacy-erosion that could follow from it.


Right, HN does use JavaScript for certain functions, search etc. Now, if you read the second paragraph of my first post I've got such cases covered.

OK, here's the scenario: I log on to HN with JavaScript disabled and do all the things I do - read articles, submit posts - all without JS. At some point I want to search HN, so I hit the 'toggle JS' button on my browser; it then goes from red to green to tell me JS is now active. I then refresh the page and start searching HN. When I've finished I hit the JS toggle and the button goes back to red - JS is now kaput.

I really can't think of anything simpler - JS is off until I really need it and when I do it's immediately available without digging deep down into menus etc.

I'd add HN uses JS as it was originally intended and does so responsibly. I have nothing against JS per se, the problem comes from websites that abuse webpages and thus the user by sending megabytes of JS gumph and so on.

Running without JS and only turning it on when really necessary I reckon is a reasonable compromise.


It's true, there are some decent sites out there which use JS legitimately to add features. And there are some sites which require JS without really needing to, but still have good content and do not have unnecessary annoyances and performance problems.

Lucky for me, I can toggle on JavaScript for them individually and continue with my general policy.


The thing with the WWW is links - that's the web. So https://news.ycombinator.com is a good starting point. From there, yes, you could end up on twitter.com for example, but it would be worthwhile.


“…you could end up on twitter.com for example but it would be worthwhile.”

Unpopular opinion: I never click on twitter links anymore. It’s almost never worth it.

IMHO, 140/280/N character limits are a way to cheapen discourse. I think there is something to be said for the “density” of text: text that offers very little to think about (less dense) is vacuous but encouraged by a character limit; yet, text that is compressed into a character limit either packs too much info into a short space that requires more discourse to properly get a thought across or elides too much from the text, making it less accurate/meaningful/important. Or worse: people chain posts into long 1/907, 2/907, 3/907… trains that should be blog posts rather than requiring some other application to string the thread together.

Of course the other reason (more central to this discussion) never to click on a twitter link is that JS and an account login is required now to read the posts past a certain point. If that makes me an old man yelling at a cloud, so be it, but aren’t there better ways to handle online public discourse without sacrificing people’s privacy and security?


"Unpopular opinion: I never click on twitter links anymore. It’s almost never worth it."

It's not unpopular with me, I agree with you completely. I was never a Twitter fan but when they forced the use of JS that was the end of it (you'll note I used Twitter as an example in one of my earlier posts).

You're right about sacrificing people’s privacy and security, as I said in another post 'I'm forever amazed at the trust the average person has in these vulnerability-ridden flaky systems'.


Similar here. When I am searching for something and a website won't show it unless I enable JS, it usually turns out that the website's content is worth nothing and that I activated JS for naught, leaving me regretting the time spent on that website.


I used to run NoScript then at some point (maybe switched browsers?) I stopped using it. You've persuaded me to re-enable it.

Also - Firefox on mobile supports NoScript!


No, only FF on Android supports extensions.


Because Apple essentially does not allow Firefox...


Concerning noscript, is this [1] still a thing?

[1] NoScript is harmful and promotes malware - https://news.ycombinator.com/item?id=12624000


Can't find any ads on NoScript.net with uBlock running, and uniblue.com seems to have expired. However, it is hilarious that the complaint comes from Adblock Plus, whose entire business model is built around bypassing EasyList. For a generous fee they make sure that your ads are "acceptable".


What makes you think this comes from ABP? The article linked to is from 2016; it links to a history between NoScript and ABP. The article by ABP is from 2009 (!!). Back in 2009, ABP was the de facto standard. There was no uBlock. There was NoScript, but no uMatrix yet.

The developer issued an apology and reverted the change, and apart from a Ghostery one (who are also shady) no further controversies are documented at [1]. Perhaps the Wikipedia article is incomplete, given the one linked is from 2016?

[1] https://en.wikipedia.org/wiki/NoScript


Firefox has never been slow for me over the last 15 years because NoScript makes it light years better than Chrome. Conversely, I routinely have the Android assistant lock up on me from JS bloat despite the supposed performance enhancement of AMP pages.


I don't know which web you're viewing that only needs JS for 3-5% of websites


HN totally usable for basic functionality w/o JS.

profootballtalk.com works great if you don't want to vote or comment

macrumors.com great functionality

nitter.net happily takes the place of twitter.com

drudgereport.com works great and I rarely turn on JS when I go to the sites he links to, usually the text on target sites is there if not as pretty as it could be

individual subreddits (e.g. old.reddit.com/r/Portland/ ) are quite good w/o JS. But the "old." is probably important.

I admit that there are lots of sites that don't work, e.g. /r/IdiotsInCars/ doesn't work because reddit uses JS for video. For so many sites the text is there but images and videos aren't. Also need to turn off "page style" for some recalcitrant sites.

In conclusion, contrary to your JS experience, I'd say that I spend over 90% of my time browsing w/o JS and am happy with my experience. Things are lightning fast and I see few or no ads. I don't need an ad blocker since 99% of ads just don't happen w/o JS.


> In conclusion, contrary to your JS experience, I'd say that I spend over 90% of my time browsing w/o JS and am happy with my experience. Things are lightning fast and I see few or no ads. I don't need an ad blocker since 99% of ads just don't happen w/o JS.

Well, you still have lots of tracking stuff loaded probably, unless you got something extra for blocking trackers. A tracking pixel does not need JS. A font loading from CSS does not need JS. Personally I dislike those too, so I would still recommend using a blocker for those.


> Well, you still have lots of tracking stuff loaded probably, unless you got something extra for blocking trackers.

Yes I'm sure I have that stuff loaded. But I don't care because it's quite ephemeral:

I exit Firefox multiple times a day, there's really no performance cost to doing that after every group of websites. E.g. if, while reading HN, I look up something on Wikipedia, or I search with Bing or Google, everything goes away together.

In my settings: delete cookies and site data when Firefox is closed

In my settings: clear history when Firefox closes, everything goes except browsing and download history

No suggestions except for bookmarks.

So when I restart Firefox to then browse reddit it starts with a clean slate.

Comcast insisted I purchase a DOCSIS 3 modem quite a while ago. Once downloads are at 100 Mbps+, does it really matter if I repeatedly re-download a few items to cache?

The only noticeable downside is when I switch to Safari to view something that needs JS, I then see ads for clothing that my wife and daughters might be interested in. I presume this is due to fallback to tracking via IP address. Of course I always clear history and empty caches in Safari.

Obviously this doesn't work for someone who wants to or needs to keep 100 browser windows open at once, for months at a time. But that's not me. I don't think that way, never have.

Edit: just had to add that sites like Wikipedia are better w/o JS (unless you edit?). I don't see those annoying week-long pleas for money. Do they still do those?


> Obviously this doesn't work for someone who wants to or needs to keep 100 browser windows open at once, for months at a time. But that's not me. I don't think that way, never have.

Caught me. Tab hoarder here : )

> I don't see those annoying week-long pleas for money. Do they still do those?

They still do those. At least I have seen them less than a year ago.


Read my reply to paulryanrogers about whether one's a JavaScript or a non-JavaScript type person.

The 3-5% of sites I'm referring to are ones where I have to enable JS to view them. In by far the vast majority of the sites that I frequent I do not have to enable JS to view them.

Also note my reply to forgotmypw17, one doesn't need JS if one avoids low quality dross.


I will give it another shot. Unfortunately though, this does not solve the server-side GTM issue, right?

If the 3-5% of websites you use start tracking via server-side GTM on the site's own domain, you will not be able to simply use NoScript to disable tracking?


You're probably right, but then there are many factors involved - take Europe's GDPR, I'd reckon it'd be deemed unlawful under those regs but of course that doesn't help those of us outside Europe.

It remains to be seen how Google's Tag Manager actually works and I'd be surprised if data from your machine is ignored altogether. If your machine says nothing about you then Google won't know who you are - unless you have a fixed IP address and most ordinary users don't. Sure there's browser fingerprinting (but I never bother about this as I use multiple browsers on multiple machines which screws things up a bit).

When I used to worry about this more than I do now, I used to send my modem/router an automatic reboot signal during periods of inactivity, this ensured a regular change of IP address.

OK, so what info can be gotten from your machine if JavaScript is disabled? Some, but it's nothing like what happens when JS is active - in fact the difference is quite staggering (ages ago I actually listed the differences on HN).

Presumably you could search for the post but there's an easier way. Use the EFF's test your browser site https://coveryourtracks.eff.org/ and do the test with and without JS. Note specifically the parameters with the 'no JavaScript' message.

Also note the stuff a website can determine about you even when JS is disabled - with this info you can start tackling the problem such as randomizing your browser's user agent, etc.

My aim was never to kill every bit of tracking; rather it was to render tracking ineffective, and I've been very successful at doing that. The fact is I don't get ads, let alone targeted ones, just by turning off JS and having an ad blocker as backup. The only other precaution I take is to always nuke third-party cookies and to kill all standard cookies when the browser closes.

I'm not too worried about Google's Tag Manager, for even if Google tracks me it still has to deliver the ads and it cannot do so with JS disabled and an ad-blocker in place.

__

Edit: if you want to watch YouTube then Google insists you enable JavaScript. This is a bit of a pain but it's easily solved with, say, the Android app NewPipe (available via F-Droid). NewPipe also has the added advantage of bypassing the ads and having the facility to download clips as well, if that's your wont.

Of course, there are similar apps for desktops too.


I have advanced protection on my Google account that unfortunately doesn't let me install apps outside Play Store...

I think I can still load NewPipe through usb debugging but not able to have auto updates


If you've got advanced protection running then you're a dyed-in-the-wool Google user (hard-core type), so I wouldn't even try.

I'm the exact opposite. I root my Android machines and remove every trace of Google's crappy gumph, Gmail etc. (I don't even have a current Google account.)

I occasionally use the Google playstore but I log on anonymously with the Aurora Store app (not available on the playstore).

I say occasionally because that's true; instead I use F-Droid or Aurora Droid to get my guaranteed spyware-free apps. It's a different world - I'm the antithesis of the happy Google user.

Don't try to load NewPipe, in your case it's just not worth the effort (and Google will notice the fact).


This. I use the NoScript addon by default, and it's amazing how many different domains sites try to bring in. Then I hit Twitter, Imgur, Quora, etc. and I am left with nothing but a blank page with plain text telling me that I need JavaScript to view the site. It makes me wonder what kind of tracking they are pushing.


All of them. If you allow everything and have Ghostery running in "don't block anything but tell me what's there" mode, it's horrifying just how many things get loaded.

You can play with page load sizes in the debugger console with stuff blocked and without too - about half the downloaded material on any major news website is stuff that Ghostery will block. It's quite terrifying.


> and also most of the ads just simply disappear if JS is not running.

Since we are talking about the future, I'd like to point out that they can always serve ads from the origin domain without JavaScript.

I mean the anti-adblock battle will evolve until each page we visit is a single image file that we have to OCR to remove ads. Then we will need AI, and they will have captchas that ask which breakfast cereal is the best.

You can stay ahead of the curve, but it's always moving forward.


"...they can always serve ads from the origin domain without JavaScript."

But most of them don't. Yes, they can change their model and in time they likely will.

As it stands now, one doesn't have to watch ads on the internet if one doesn't want to - all it takes is a little perseverance and they're gone. If one can't rise to the occasion then one has a high tolerance for ads.

Even YouTube can be viewed without ads with packages such as NewPipe and similar.

You're right about AI, OCR etc. and I think in time it will come to that.

It seems to me people like us will always be ahead because we've the motivation to rid ourselves of ads. It reminds me of the senseless copyright debate - if I can see the image then I can copy it. No amount of hardware protection can stop me substituting a camera for my eyes. What's more, as the fidelity goes up HD, 4k etc. the better the optical transfer will be (less comparative fidelity loss).

That said, the oldest technology - standard TV - is still the hardest to remove ads from. Yes, one can record a program and race through the ads later (which most of us are very adept at doing) but it's still inconvenient.

What I want is a PVR/STB that figures out the ads and bypasses them. Say I want to watch TV from 7 to 11pm (4 hours) and there's a total of one hour of ads and other breaks in that time that I don't want to watch then I want my AI-aware PVR/STB to suggest that I start watching at 8pm instead of 7 as this will allow it to progressively remove ads on-the-fly across the evening.

The person who makes one of these devices will make a fortune. If the industry tries to ban it (as it will) then we resort to a software version and download it into the hardware. Sooner or later it's bound to happen and I'll be an early adopter.


> What I want is a PVR/STB that figures out the ads and bypasses them. Say I want to watch TV from 7 to 11pm (4 hours) and there's a total of one hour of ads and other breaks in that time that I don't want to watch then I want my AI-aware PVR/STB to suggest that I start watching at 8pm instead of 7 as this will allow it to progressively remove ads on-the-fly across the evening.

I wonder if something like SponsorBlock for YouTube (which is a must-have) could be done for TV? It's a crowdsourced effort and works flawlessly for popular channels.


Good question, I don't know. It's certainly worth thinking about.


How does blocking javascript in this case prevent tracking? It's done via the same cookies the website uses, as I understand it. Do you disable cookies too?


I used to have JavaScript turned off for a long time, but I've given up. You can't even search Hacker News without JavaScript (for some reason).


Pretending as if you can search hacker news with JS turned on...


There is some truth to this though. It is sometimes hard to find that HN topic you remember just a few words of through the Algolia search thing.


Exactly! If something doesn't work without JS, I don't use it. There are many alternatives.


Apple’s Private Relay blocks this type of cross site tracking.

Given this tracking is all server side, third party cookies across sites aren’t possible using this mechanism, and private relay cycles through your IP addresses frequently and uses common IPs across multiple users.

Regarding your other point, unless Google execs want to be thrown in jail / sued, they can’t use things like first party cookies for their benefit since that is against their terms of service.


How is private relay different from a vpn? A lot of fingerprinting scripts also can track you despite vpn.


Private Relay uses ingress and egress relays. The ingress proxy does know your IP but not which sites you are visiting and what you are doing. The egress proxy is only connected to the ingress, sees what you visit but does not know who you are. Both proxies are run by different parties.

With a VPN you would have to trust one provider, who sees all of your traffic.


Then is Private Relay equivalent to a two layer tor setup?


From my understanding yes, but with the caveat of being organised by a single entity (apple)


I wonder why Safari is required? I’d be interested in paying for this if it worked with Firefox.


Yeah that would be a useful service that Mozilla could offer and I'd actually pay for.

I don't like their VPN as it's too basic in terms of privacy protection and it's much more versatile to just sign up with Mullvad myself because then I can use it on other stuff than just the browser.


I think in the short-term the strategy is this from the article:

> Or ... block all the IP addresses of Google App Engine, at the risk of blocking many applications having nothing to do with tracking.

Anyone hosting legitimate apps in the Google ecosystem is indirectly complicit in this, and at least for my personal network, I have no concern with blocking Google App Engine holistically.

Additionally, I think it's important to hurt Google as much as possible for escalating in this way. Widespread blocking of GAE may seem extreme but it's also arguably warranted.



> I have no concern with blocking Google App Engine holistically

Unfortunately, it seems that more and more government web sites rely on Google services to function. And there's no replacement for those.


Use two browsers: one where you don't block tracking and can access government sites and make purchases on shopping sites, and one where tracking is blocked and JavaScript is turned off.


How can it be legal for a government to make increasingly core services depend on these amoral, for profit monsters?


The military-industrial complex would like to have a word.


I’m not sure if this is a serious question, but what would this imaginary law say?

The government can only do business with companies who aren’t in it for the money?


How about that government services must be built by the government?


Yes, I feel the same, at least for a lot of things. Certainly, all externally facing websites should be designed and maintained by gov't staff.

From time to time, HN features high quality UK gov't websites. In the last five years, the UK gov't has made dramatic strides on "digital gov't" initiatives that benefit regular citizens. As I understand it, most of those sites are built and maintained by gov't employees. This runs counter to the normal, all-prevailing attitude in the UK that "any gov't is too much gov't" (or "any gov't that does not directly benefit me...").


The trouble is, they're mostly Microsoft and either Azure or AWS behind the scenes. The UK government as a whole seems to love Microsoft. I just worry it will be out of the frying pan and into the fire...


Brit here. On your last point, there is no such widespread attitude in the UK towards government. We are historically conservative, but not libertarian. Don't forget two of the most famous and loved British institutions are the BBC and the NHS. I'm not saying such attitudes don't exist, because they do, but it's not "all-prevailing" by any stretch.


I think it's a typo/autocorrect and they meant US at the last instance instead of UK.


The Conservatives want to privatise the BBC and the NHS though - abolishing the BBC licensing fee is a recent move, and steps to privatise the NHS have been repeatedly popular among politicians over the last decade.


I would like that law. However, they would have to pay wages and offer working conditions that actually attract good developers, and they would have to stop outsourcing everything. Outsourcing everything is unfortunately also a problem with otherwise qualified engineers. The big-picture, long-term consequences are unpleasant.


You have to draw a line somewhere with that logic, otherwise you'd have governments running their own fabs.

I'm fully in favour of governments doing everything from hosting up (hosting, design, dev), with as much as possible open source.

For instance, the French government fares well on this front, with most government services being developed in-house, and many parts are open source; in emergencies specific services were delegated to third parties (e.g. vaccine bookings), so it isn't taken to a religious NIH level. However, hosting is delegated to commercial entities.


Realistically, Congress could in fact mandate that government website implementations must be transferable between software vendors. That’s both technically feasible and in line with past government requirements for hardware procurement.


The US government isn't shy about adding rules for its contractors. It should be trivial for them to demand (or provide) dedicated IPs for their sites. Then they won't get caught up in the IP address blocking of GCP.


The big tech companies have all built out lobbying capabilities; such a law would end up helping big tech and harming small companies because the big companies would be involved in authoring the law and would be contributing to the sponsors and committee chairs and members to get their favorable language included. And it would all be legal and business as usual.


They don't have to be laws. It's something that Biden can just add into every RFP the US government puts out.

But no, typically things like that don't hurt small companies.


> but what would this imaginary law say?

IANAL, but how about something like, "Government services offered via WWW must not contact commercial servers and must be fully usable with non-JS browsers."


Aren't browsers shifting to a per-domain cookie jar?

While you can never prevent one specific site from tracking you, this still doesn't (directly) allow your activity on Site A to be linked to activity on Site B, does it?

Of course, fingerprinting combined with IP addresses will ultimately allow something that comes very close to it, so the current state (a few hundred trackers per website, all ending up harmlessly incrementing the adblocker's counter) is better for privacy for power-users, but I'm not sure if this is the big "game over".


Google is pushing to have the browser itself track your interests and share them with whoever asks. The first attempt FloC backfired rather quickly as it was an all around privacy nightmare. The second attempt Topics promises to fix a lot of the problems FloC had but that is not a high bar and Google left itself a lot of room for future changes.


This is what I’m interested in. Article itself did not mention cross site tracking.

Every website having their own tracking subdomain makes third party cookies not work cross site even without browser changes.


They can still cross-track based on IP or any other fingerprint worthy information. I expect this is exactly what they're doing. Doing this all on a central service makes this process much easier unfortunately...


Yes, they would need to get another identifier, and that's what is done with players like Facebook.

Sorry, another of my articles in French: https://pixeldetracking.com/fr/les-signaux-resilients-de-fac..., but Facebook is making it easy to integrate their "Conversion API (CAPI)" with GTM Server-Side Tagging.


But that should only help e.g. a web store to track you from the ad you clicked, which seems reasonable.

It should not allow e.g. Facebook to link your activity on a news site to your Facebook cookie, because while you're on cnn.com, your browser is using the cnn.com-specific cookie jar for everything, including the like button?


The cross-site tracking is done by a third party. From reading the docs, the way it works is: the publisher sets a unique ID, browsers send that unique ID to the publisher's domain, and the publisher forwards it (via the Tag Manager App Engine container) to the third party.
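
Purely as an illustration of that flow (a made-up sketch, not GTM's actual implementation; the endpoint path, parameter names and vendor URL are invented):

    # Hypothetical sketch of a first-party "collect" endpoint that forwards
    # events to a third-party vendor; all names and URLs here are invented.
    # pip install flask requests
    from flask import Flask, request
    import requests

    app = Flask(__name__)

    THIRD_PARTY_URL = "https://events.vendor.invalid/ingest"  # placeholder vendor endpoint

    @app.route("/collect", methods=["GET", "POST"])
    def collect():
        # The browser only ever talks to the publisher's own (sub)domain, so the
        # request looks first-party and is invisible to third-party blocklists.
        event = {
            "client_id": request.cookies.get("uid"),          # unique id set by the publisher
            "page": request.args.get("dl"),                   # page URL reported by the tag
            "ip": request.remote_addr,                        # only visible server-side
            "user_agent": request.headers.get("User-Agent"),
        }
        # Server-to-server forwarding: nothing for a client-side blocker to see.
        requests.post(THIRD_PARTY_URL, json=event, timeout=2)
        return ("", 204)

    if __name__ == "__main__":
        app.run(port=8080)

Blocking that client-side would mean blocking /collect (or whatever path the site picks) on every individual website, which is exactly the per-site maintenance burden the article describes.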


> Maybe using an archive.is-like service that renders the static page (as an image at the extreme)

A lot of companies are starting to use "browser isolation" which is essentially what you're saying. A proxy runs between the client and the server, but it does more than just direct TLS streams - it actually builds the DOM and executes the JS. The resulting web page is sent to the actual client browser, which might send back things like mouse and touch events to the proxy, which will then update the page.

I think most companies are using this as a malware protection thing, but it does hide the actual client IP address and fingerprint, and I imagine it would make tracking very difficult.

https://en.wikipedia.org/wiki/Browser_isolation


Browser isolation isn't quite that. It's just running a browser that is heavily sandboxed from internal files and networks, or running on another machine so any exploits don't hit your machine.

It's very much like running a browser through Citrix (in particular the remote flavour which is the most common as far as I've seen). But of course any data in the browser itself is still within reach for the malicious code... Which only solves half the problem. Unless you rigidly separate internal browsing from external sites.

But it doesn't run all the JavaScript and then send you a screenshot or anything. The resulting page is still interactive.

Remote browser isolation has the ability to change the landscape of personal computing enormously by the way. Right now we equip all our laptops with at least 16GB (32 for customer care) because some web apps like Salesforce Lightning are such memory hogs.

Considering the importance of the browser in modern computing, this model would basically make the PC more like a terminal and require far fewer resources.

Of course this has already been going on with web based apps and streaming of things like games but this could be the final nail in the coffin of the PC as we know it. Not sure I'm happy with that...


Opera Mobile has been doing this for years and years


The Opera product you are thinking of is Opera Mini. Opera Mobile is a browser running mostly on your device (except for "Turbo", which optimized media through a proxy setup but did not, AFAIK, execute any of the JavaScript).

Opera Mini can be looked at as a browser running in the cloud, sending OBML (Opera Binary Markup Language, if I remember correctly) and causing the (very thin) client to draw things on the mobile screen, like text, images, etc., without having to transfer, parse, execute, flow and paint everything on the device.


Yeah, they released countless rebrands and versions and whatnot.

The equivalent on desktop would be Browsh (e.g. with terminal + Mosh), but it runs Firefox under the hood. Opera Mini is just akin to a remote browser with the result being sent to the client (as a compressed picture like in RDP/VNC, or a proprietary markup language like OBML).


> Maybe using an archive.is-like service that renders the static page (as an image at the extreme), or a Tor-like service and randomizes one's IP address and browser fingerprint.

I'm building a peer-to-peer network of web browsers [1] that doesn't trust anything by default and only allows rendering types of content incrementally, while disabling JS completely. Most of the time, you can find out what the content is with heuristics. The crappy occasional web apps that don't work without JS can be rendered temporarily in an isolated sandbox in /tmp anyway.

I think that the only way to get ahead of the adblocking game is to instead of maintaining blocklists, we need to move to a system that has allowlists for content. The user has to be able to decide whether they're expecting a website serving a video, or whether the expectation is to get text content, image content, audio content etc. News websites are the prime example of how "wrong" ads can get. Autoplayed videos, dozens of popups, flashing advertisements and I haven't even had time to read a single paragraph of the article.

And to get ahead of the "if fanboy gets hit by the bus" problem... we need to crowdsource this kind of meta information in a decentralized and distributed manner.

[1] https://github.com/tholian-network/stealth


Called it [1]. It's a cat-and-mouse game and, unfortunately, advertising is just _that_ lucrative. Privacy-minded browsing will help those that care (for now...), but that's an unsustainable option with the current monetization channels available.

If a content publisher cannot monetize you, they will think nothing of blocking you. There will be some public backlash against companies that do so and there will be some sites who will lose money because of it, but the rest of the publishers will simply follow the money while the industry shifts towards more intrusive tactics.

There needs to be a monetization channel that is 1) good for both users AND publishers and 2) pays just as much as current methods. Unfortunately none of the current systems support that.

[1] https://news.ycombinator.com/item?id=9975955


>There needs to be a monetization channel that is 1) good for both users AND publishers and 2) pays just as much as current methods.

I agree, but what party would you like that money to originate from?

Ads work well right now for consumer-to-consumer (e.g. I create a blog and you view it) because there's a rich third party that money can flow from (a company running ads --> money to me) without having to charge you, the end-user, who is more than likely significantly less well-off than a corporation.

To buck that pattern, you need the money to come from somewhere else. Subscriptions and direct payments are an obvious choice (see: the boom of SaaS over the past few years) but people are already complaining that they have so many subscriptions they lose track of them all, and spend too much money on what used to be a "free" internet.

So, I don't think there's a solution where the money comes from the end-user. However, any time you add in a third party for the money to flow from, they're going to want something in return. And unless you want that cash flowing from the site owner to that third party (...why would you?), they're gonna need to offer something else.

I don't see any solution other than "a third party pays for something users and/or the site can create for free". Is the answer to just find something free other than analytics/usage, or are there other approaches to monetize a site while still making it "free" to access?


Unfortunately I don't see a good solution either. Large direct to consumer business models like SaaS or subscriptions are really only sustainable at scale, and even then it's dicey. In a SaaS model, the big fish win and we lose the democratic nature of the current internet.

Society has driven the perceived price of content so low that the content itself is worth less than the aggregate audience. Really, in what other space does the average consumer set their price expectations at free AND balk at paying $5/mo for unlimited access to a product?

The only thing that seems to come close to moving the needle towards privacy is somehow pushing advertisers into in-market advertising (think early internet-style site banner ads) and out of programmatic/user tracked ads. There is some evidence that these programmatic ads don't really perform as well as they claim but from what I can gather, the data is still unclear.


Simpler protocols (Gemini, Gopher...), outright refusing to use what the modern web has become. I only use HN and a few select sites. You don't need an ad-blocker if there are no ads in the first place.


Using Gemini as an allowlist doesn't seem any better than allowlisting known-good domains for HTTPS sites


HN is a link aggregator for HTTP(s) links. How do you read them?


Not sure about the parent poster, but I am here mostly for the comments, and rarely visit the linked content.


Doesn't exactly this behavior create echo chambers and lead to polarization?


I usually do read the linked content but I agree with GP poster that comments are often more informative.

Yes there is sometimes an echo chamber here, but it's only for limited topics. It very much has a Silicon Valley feel to it, but @dang and I have gone around on this and he assures us that the readership and comments have broad geographic representation.[1] It's a worldwide echo chamber. :)

Fortunately the echo chamber doesn't exist for most submissions. Most of the discussion on HN is on non-polarizing topics.

[1] https://news.ycombinator.com/item?id=26869902


The time of the day is reflective of broad geography, generally.

So some UK or EU specific topics will appear, be commented upon but then disappear later in the day.

It would be interesting to see what kind of topics are commented on from different places.


Which behaviour would that be? The "reading only the comments, not the article"? I don't see how reading creates an echo chamber.

What creates an echo chamber is if all the posts are similar or otherwise in agreement with each other. Those threads make for boring reading and I tend to only scan them for less boring content (yes, that means I read the context surrounding greyed-out comments more than the rest). The threads where people discuss various aspects and experiences is what I come here for.

(full disclosure, I mostly read the comments before even opening the article. I only read the article if there's a high-quality comment thread about some details in the article, or if multiple commenters state that it's a great article. And I tend to upvote an article based on the quality of the comments, not just the article itself).


I don't think so. I'd say echo chambers are created by a lack of diversity in the user base. I think HN has a lot of actual diversity, and it's possible to see controversial topics disputed without unceremonious downvoting.


I don’t think the solution here is a technical one. This should just be solved by legislation.

Google Analytics has been recently ruled illegal in multiple European countries. And either this already is illegal under the same laws or it should be made so.


> Google Analytics has been recently ruled illegal in multiple European countries.

Just about everything hosted by a non-EU company just got ruled illegal (in the EU, that is).


It's very doable to disable google analytics for EU visitors.


Not quite - only everything US-based, since US companies fall under the purview of the CLOUD Act, which is incompatible with the GDPR (on purpose... this is an entirely self-inflicted wound by the US).


> Not quite - only everything US-based

No, anything covered by laws similar to the "cloud act" - which is the norm rather than the exception - is illegal. It's quite rare for a country to allow companies inside it to say no to their government.


It's not about companies inside it, but companies outside the country. And is it actually the norm? Since clearly even the US didn't have the Cloud Act until 2018. Was the US such a rare case until that recently?


This is interesting to me. The US is basically doing the same thing as Russia and China yet the media never talks about it...


> The US is basically doing the same thing as Russia and China

I don't really understand what you're referencing.

The whole issue here is that the USA claims global jurisdiction over US companies, forcing them to obey the US legal system even for data located in the EU. On the other hand, EU law makes it illegal for anyone, globally, to turn over EU customers' data without a court order from the EU.


I suspect this might end up being a trickier scenario, because when you get down to the details it's hard, at a technical level, to distinguish between a server log file and a tool like analytics that takes those same bits of data and mostly just organises and displays them in an intuitive way, with charts and a nice UI.


The ruling against Google Analytics in France is quite simple: Google Analytics as used by an unnamed website was not compliant with the GDPR, because it exports user data to a country whose privacy laws are not up to GDPR standards, which is not allowed. This is on the unnamed website, and they are compelled to stop this illegal export of user data by either only exporting anonymized statistics or stopping use of Google Analytics entirely.

Of course this isn’t yet a perfect banning of GA and Google might be able to work around it, but it’s something. And in fact, anonymized statistics would probably be OK (depending on the details of course).


But this actually highlights exactly what I mean. What if I simply stood up a plain old Apache server to host my website but that happened to be hosted in the US. No analytics, just a few HTML files and that’s it.

I’m still in this scenario sending PII of EU citizens in the form of IP addresses to the US which are just written to /var/log/apache

It seems obviously different, and yet, as that ruling seems to imply, it wouldn't be - unless I'm missing something here about first- versus third-party capture?
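
For reference, the stock "combined" LogFormat records the client IP as the very first field (%h). A typical Debian-style Apache config looks roughly like this - exact directives vary per install, so treat it as an illustration:

    # LogFormat/CustomLog as shipped by many distros (illustrative, may differ)
    LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined
    CustomLog ${APACHE_LOG_DIR}/access.log combined
    # %h = remote host, i.e. the client IP, written out for every single request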


The default logging configuration on most servers is illegal now under the GDPR, since it saves IP addresses.


This pops up regularly, but AFAIK it's not correct. The law is much more fine-grained than the USA PII concept. IP addresses are only personal data (PD) if you are capable of using them as an identification mechanism. If you aren't, they are not. This also means that something that is not PD for you can become PD when you give it to someone else. Or that 2 items which are not PD themselves become PD when you combine them. Or that being hacked turns non-PD into PD.

Even as PD, using IP addresses to maintain a website is fine, even without consent. Using them to track individuals is not fine. Having a log rotation policy and a sane security policy so you can demonstrate when you throw them away is a good idea.

To be short: Install debian, drop nginx on it, then let it log as it wants. This is legal. But don't you dare mine the logs for abusing PD.


Do you have a source? My observation came from multiple lawyers in the context of "to stay on the safe side".


Incorrect. In the "Breyer" ruling[0] the highest European court concluded that dynamic IP addresses are PII (not just personal data, and not just data), as there is an abstract risk that combining IP addresses with other data can lead to identification of a user. The ruling explicitly said that the mere risk of such an identification is enough, not that such an identification has to actually happen.

Subsequent rulings by many courts have found that all IP addresses are PII, for various reasons, such as "static" IP addresses bear the same risk of indirect identification, and there is no reliable way to distinguish between "dynamic" and "static" addresses anyway.

The recent German ruling that Google Fonts violates the GDPR just by transmitting an IP to google (by making the web browser fetch a resource from a google server) hammered home this point, citing the EU ruling again[0].

This is different from, e.g., a streaming provider keeping a history of songs you played. This data is personal data, but it is not personally identifiable data, as this history alone cannot be used to identify a person. However, if this history has some kind of identifier attached that links back to account information or an IP address, that identifier would be PII, as it could be used to indirectly identify a person.

[0] https://curia.europa.eu/juris/document/document.jsf;?text=&d...

[1] https://rewis.io/urteile/urteil/lhm-20-01-2022-3-o-1749320/

Die dynamische IP-Adresse stellt für einen Webseitenbetreiber ein personenbezogenes Datum dar, denn der Webseitenbetreiber verfügt abstrakt über rechtliche Mittel, die vernünftigerweise eingesetzt werden könnten, um mithilfe Dritter, und zwar der zuständigen Behörde und des Internetzugangsanbieters, die betreffende Person anhand der gespeicherten IP-Adressen bestimmen zu lassen (BGH, Urteil vom 16.05.2017 - VI ZR 135/13)[2].

Translated, best to my abilities:

The dynamic IP address is, to a web site operator, a piece of personally identifiable data, because the web site operator abstractly has legal means, which could reasonably be used, with the help of third parties, namely the responsible authority and the internet service provider, to identify the person in question with the use of the stored IP address (BGH, ruling from the 16th of May 2017, VI ZR 135/13)[2]

[2] The BGH ruling quoted is the "Breyer" ruling again, just at the German national level instead of the EU level. The Bundesgerichtshof (BGH, highest German court of ordinary law) asked the European Court of Justice to settle the question of whether dynamic IP addresses are PII, which the ECJ affirmatively settled in [0].


This is a very interesting legal document, and I'll have to take the time to read it slowly before I can judge it.

It centers around this line:

   ... not PD for you, can become PD when you give it to someone else
and claims that, as this potentiality can always be fulfilled, you should consider it PD. This would invalidate the first part of the post, but is still not enough to make a default deploy of a logging http server illegal, because of the 6.1(f) legitimate interest rule. In fact, things like 21.1(b) might make it obligatory.

Now we are in lawyer 'interesting question' territory, which costs a lot of money, and I still don't think you'll need to worry, because you're not violating the spirit of the law. Personally, I'll go on depending on 2.2(c).


It's not illegal to store such information in default logs per se, even without explicit consent, if it would fall into the "legitimate interest" category[0], e.g. you need it to operate the service and prevent abuse, and there is no less intrusive way to e.g. reasonably monitor for and prevent abuse.

However, you cannot share such logs without consent, you still have an obligation to inform users about your legitimate interest assessment and what data you store, and you still have to abide by users' other rights, such as the right to ask for a copy of the data you store about them.

[0] Art 6.1.f https://gdpr.eu/article-6-how-to-process-personal-data-legal...


Gdpr.eu is not an official EU resource. There is no official guidance saying that IP address in logs falls under "legitimate interest" and every lawyer I asked advised against it "just to be on the safe side".

One actually added: Do you really want to test our government's understanding of "legitimate interest" for your business in court?


>Gdpr.eu is not an official EU resource.

Yes, but I never claimed that they were. The text that I linked is a copy of the official GDPR text (and recitals), not an article they wrote on the topic. I used their website, because I find it more usable as they added cross-references links and recital links. But if you prefer, read the official EU version[0], which is the same in content and in words.

>There is no official guidance saying that IP address in logs falls under "legitimate interest"

I haven't said that. I said storing IPs in logs might be legal, if there is a legitimate interest and/or there is consent.

There are actually two official recitals straight up addressing that topic. Recital 47 states (in part): "[...] The processing of personal data strictly necessary for the purposes of preventing fraud also constitutes a legitimate interest of the data controller concerned. The processing of personal data for direct marketing purposes may be regarded as carried out for a legitimate interest." (This is not meant to be an exhaustive list)

Recital 49 states (in full): "The processing of personal data to the extent strictly necessary and proportionate for the purposes of ensuring network and information security, i.e. the ability of a network or an information system to resist, at a given level of confidence, accidental events or unlawful or malicious actions that compromise the availability, authenticity, integrity and confidentiality of stored or transmitted personal data, and the security of the related services offered by, or accessible via, those networks and systems, by public authorities, by computer emergency response teams (CERTs), computer security incident response teams (CSIRTs), by providers of electronic communications networks and services and by providers of security technologies and services, constitutes a legitimate interest of the data controller concerned. This could, for example, include preventing unauthorised access to electronic communications networks and malicious code distribution and stopping ‘denial of service’ attacks and damage to computer and electronic communication systems."

These recitals were specifically added to address some points that had already been litigated in the past in various European courts.

>and every lawyer I asked advised against it "just to be on the safe side".

Good for your lawyers (that you keep mentioning all across threads). I don't know your lawyers, but they seem overly cautious - even for lawyers - and maybe a little bit under-educated on the subject matter. But they still have a point. You cannot just store access logs containing IP addresses, you have to have a legitimate interest, and be able to articulate this legitimate interest, and see if law makers and courts would consider your "interest" to be "legitimate". Which is easy when it comes to fraud detection and network security/abuse (thanks to the recitals), less easy when it comes to other areas, and pretty easy when it comes to different areas that are clearly against the text or spirit of the GDPR; e.g. nobody will buy an argument of "my legitimate interest is that I want to earn money from tracking and selling user data".

[0] https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CEL...

[1] https://gdpr.eu/Recital-47-Overriding-legitimate-interest or https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CEL...

[2] https://gdpr.eu/Recital-49-Network-and-information-security-... or https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CEL...


When you use laws to ban businesses from other countries, those countries will feel entitled to use laws to ban businesses from your country as well.

It's how protectionism works, and it's generally the consumers who lose.


These laws do not ban businesses, they ban business practices. And consumers often win. E.g. laws to ban the business practice of just dumping toxic waste into rivers because it's cheaper were hugely successful - at least in places where they were enforced. On the other hand, there is a danger of regulatory capture, which has to be considered as well...

The GDPR does not ban Google, and it does not ban analytics. But, according to recent court rulings, it bans the business practice of Google Analytics to collect and transfer data to the US - which isn't considered to be a place with "adequate" privacy laws - and other places without prior user consent. Google could potentially come up with ways to make a Google Analytics that does abide by the law, but so far they choose not to. Maybe the changes that would be required would cut severely into revenues, or even make (free) GA cost-prohibitive, but this is in line with environmental protections killing off certain products/businesses that got too expensive when they had to dispose of their toxic waste properly and in a way that doesn't poison people and the environment.


Comparing tracking with "dumping toxic waste into rivers" is comparing a breeze with a hurricane.

> Google could potentially come up with ways to make a Google Analytics that does abide by the law

I personally know of no way to have legal analytics under GDPR, as advised by multiple lawyers.


>I personally know of no way to have legal analytics under GDPR, as advised by multiple lawyers.

If this is truly the case, then these businesses must consider shifting to ethical business models to stay afloat. If not, then competition will steamroll them.


The article is from 2020, and I don't think I've ever seen a site using this approach yet. It is an egregious attempt to circumvent the Same Origin security policy in browsers that developers and privacy advocates should rightly be angry at, but it doesn't seem to have caught on. That's something to be thankful for.


>I don't think I've ever seen a site using this approach yet.

What have you been looking for? It seems like this would be hard to observe.


You are optimistic: most analytics people I know are working with clients to transition to GTM server-side tagging.


While impractical, I liked the article's suggestion of blocking the proxies. I'm curious what reaction this would have: would ad-blocking users get no content, move to alternatives, and stop being users, or would the sites cave and realize that having users interact is more important than all of the data collected?


It's a fine suggestion. If it breaks the site, then I'd call that a broken website and move on. Maybe next time someone points me there, they'll have fixed their critical issue for users who block tracking proxies.

I'm okay with not being in the target audience of sites that really want to do this. I've got enough other things to do at less hostile places that my FOMO isn't triggered in the least.


How do you identify tracking proxies though? When everything is going through the same domain you don't even know if data is being sent to Google, it's all a server-side black box.


uBlock Origin actually has an experimental option for this: https://github.com/gorhill/uBlock/wiki/Static-filter-syntax#...

The only issue with blocking the proxies is that you can now decide to host the container on your own infra through Docker, and it's documented by Google: https://developers.google.com/tag-platform/tag-manager/serve...

I guess this is very interesting for many people, especially in Europe with the "Google Analytics ban".
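
For the curious, the manual setup boils down to running Google's container image with the config string from the GTM UI behind a first-party domain. The image name, port and variable below are from memory of that guide and may have changed, so verify against the linked docs before relying on them:

    # Sketch only: image name, port and env var are assumptions, check Google's guide
    docker run -d -p 8080:8080 \
      -e CONTAINER_CONFIG='<config string copied from the GTM UI>' \
      gcr.io/cloud-tagging-10302018/gtm-cloud-image:stable
    # then front it with a TLS-terminating reverse proxy on a first-party
    # subdomain, e.g. https://metrics.example.com (hypothetical name)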


By using CNAME uncloaking, which uBlock Origin can do on Firefox. It should see that the real domain is Google Tag Manager's.


I think the article mentions that Google recommends against using CNAMEs for this, and recommends using A records instead.
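
To illustrate the difference with hypothetical names: a CNAME leaves a trail that uncloaking can follow, while an A record pointing straight at the proxy's IP does not:

    ; CNAME cloaking: the chain still resolves back to a tracking vendor's host
    metrics.example.com.   CNAME   abc123.tracking-vendor.example.
    ; A-record setup: looks like any other first-party host, nothing to uncloak
    metrics.example.com.   A       203.0.113.10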


> Google recommends against using Cname for this

So use Cname? :D


Sites want the ads to get through, right? So they’re going to do the thing that makes that happen: A records.


The greatest minds of a few generations really should think about not being evil.


I think the solution will be for ad blockers to invest in neural nets to detect the graph of the code flow for known variants of the script. The software that detects plagiarism will be a good start.


That sounds like it's going to be slower than not using an ad blocker at all.


Not if the signatures are uploaded and shared.


Hashmap lookups are O(1)
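
As a sketch of the idea (not how any existing blocker works): hash the fetched script body and check it against a crowd-sourced set of known-bad hashes. The lookup is cheap; keeping the set current against per-site mutations is the hard part.

    // Sketch: check a script body against a shared set of known-bad SHA-256 hashes.
    // knownBadHashes would come from a crowd-sourced list (hypothetical).
    const knownBadHashes = new Set([
      // "3f2a..." entries omitted
    ]);

    async function isKnownTracker(scriptText) {
      const bytes = new TextEncoder().encode(scriptText);
      const digest = await crypto.subtle.digest("SHA-256", bytes);
      const hex = [...new Uint8Array(digest)]
        .map(b => b.toString(16).padStart(2, "0"))
        .join("");
      return knownBadHashes.has(hex); // the O(1) membership test
    }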


There's no way to fight this unless ... you pass legislation against it or comparable technologies, preferably at a policy level.


You can fight against it by refusing to use these websites?

If you can't do this, perhaps because a big _majority_ of users don't care enough to support this kind of ecosystem shift, what makes you think a majority of voters would support this? (And if not, why would you want to force your view on them?)

It's like legislating that people should only listen to Good Music and eat Healthy Food, as defined by some people who know better than the unwashed masses?


I rather think it's more like legislating that you can't sell people food adulterated with poisons, and you have to label the ingredients accurately. Oh, and it's like saying that you can't sell lead paint, even though it is a very pretty white.


Even without that legislation, most people would already care about avoiding poisoned food.

So a law specifically forbidding poisons is in line with what the majority already cares about.

(Slightly related: see eg some Chinese people making good money from buying baby formula overseas and shipping it back home in their luggage. China has legislation against poison, but people don't trust the enforcement enough.)


> Even without that legislation, most people would already care about avoiding poisoned food.

There is lots of evidence that people would still use harmful substances when they're nice and cheap. Then other people would be exposed to them just because it is impossible to know the chemical composition of everything around you. Lots of people care about avoiding things like toxic chemicals and harmful bacteria; the trouble is that they cannot see them.

> So a law specifically forbidding poisons is in line with what the majority already cares about.

So why not do it, then, if it is the right thing and people want it?

In the real world, people are not perfectly informed, and fraudsters are willing to lie. So law and enforcement are absolutely necessary to end harmful practices. See lead paint, but also leaded petrol, asbestos, antibiotics in farm animals, and insecticides spread willy-nilly across the countryside. These things do not just disappear on their own because some people don't like them.

Even on the topic at hand, to be honest. People know that ads and tracking are bad and annoying, even if they do not see clearly the extent of the damage. Some of us know how to avoid most of them. And yet, they keep making more and more money, and are far from disappearing. It is difficult to take your point seriously.


Part of the job of lawmakers is, intriguingly enough, deciding what’s good for voters. This would be among those things. Would voters vote for this specific law? Probably not. But they probably wouldn’t vote out the representatives who wrote it either. And arguably privacy needs to be protected for the good of society.


I'm not sure about this notion of the 'good of society'.

If you believe that the 'good of society' is not what voters want, why bother with democracy at all?

(Slightly beside the point: I actually do agree that people behave like idiots at the ballot booth and don't know what is good for them in this context.

Luckily, people tend to be much more savvy when voting with their wallets or their feet. And as a society we would be well advised to encourage these latter two.

Eg by taking subsidiarity seriously, and pushing as much decision making as possible to as local a unit as possible. Don't decide stuff at the federal level when the states can handle it. Don't let the states handle what the counties can handle. Don't let counties handle what the municipalities can handle. Don't let municipalities handle what people can do privately on their own.

See https://en.wikipedia.org/wiki/Subsidiarity

By pushing authority down the stack, you make the act of moving between states or even just cities so much more powerful and expressive.)


> Luckily, people tend to be much more savvy when voting with their wallets or their feet. And as a society we would be well advised to encourage these latter two.

The problem with voting with your dollars is that people with more dollars get more votes. The problem with voting with your feet is that only some people can afford to move.

If you want "just let the rich decide", why dress it up in fancy words?


As much as possible, people should decide what to do with their dollars.

> The problem with voting with your dollars is that people with more dollars get more votes.

Eh, the biggest and most successful companies on the planet cater to mass markets. The system seems to work fairly well for average people. (And we all suspect the most important politicians cater to tiny elites.) Also, using your dollars to vote means you lose those dollars. So rich people can vote each dollar only once, just like everyone else.


> As much as possible, people should decide what to do with their dollars.

This sounds very good until it is actually put in practice, when people realise that those who have all the dollars have all the power. Now you have an unaccountable oligarchy.

> Also, using your dollars to vote means you lose those dollars. So rich people can vote each dollar only once, just like everyone else.

That’s hilarious. As if those billionaires were not making the median yearly income in a week.


I’m not saying it’s not what voters want, I’m saying they’re not going to vote for it. There’s a difference.

The average voter has a fairly limited horizon in terms of what they see and understand about what’s good for society. And in a democracy you elect representatives because they’re supposed to have a wider horizon and more in depth knowledge, in part because they’re on average smarter than the average voter and in part because they get to dedicate all their time to that specific job.

This means that lawmakers will sometimes have to do things the voters don't yet understand they want. It's on them to explain it to the voters. And it's on the voters to vote them out if they still don't agree.

As for voting with their wallets, I would have agreed say 20 years ago. But marketing has become so all-encompassing and so much money and effort has been spent making marketing stick, that I don’t think most people can make truly independent decisions anymore about many many things.

And free stuff on the internet is definitely something that most people have trouble dealing rationally with. Just look at all the free trials that hook people into costly year long subscriptions, etc etc. Let alone when it’s free in the sense that the users never pays directly but through things as ads and privacy.

My view of this is very much influenced by my being a European and EU citizen, though. And if anything, the EU is a bit of a technocracy that likes to decide for the “good of society”. And that’s not something everyone will like every time.


Well, I was born in East Germany and grew up there. Later I decided to vote with my feet, and pay my taxes in Singapore instead. Much better value for my tax money here---both lower taxes and better government services.

Btw, I'm not saying people are perfectly rational when voting with their feet or wallet. Just that they are much, much more rational than at the ballot booth.

> Let alone when it’s free in the sense that the users never pays directly but through things as ads and privacy.

Well, can't argue about taste? Perhaps people prefer it that way?

> This means that lawmakers will sometimes have to do th8ngs the voters don’t understand they want. It’s on them to explain it to the voters. And it’s on the voters to vote them out if they still don’t agree.

I am basically agreeing with you: voting is a weak channel to transmit information. Almost no individual vote makes a difference. Neither in aggregate nor to the individual voting.

Voting with your feet or wallet does make an immediate difference to yourself, and has at least a clear marginal impact in aggregate. There are fewer weird threshold effects than in politics. A dollar more spent on iPhones is a dollar more spent on iPhones; but another vote for candidate A only makes a difference if it gives her more votes than candidate B.

(And proportional representation only helps partially: in the end it's important which coalitions can form a majority in parliament, whether one party has one seat more or less doesn't make much of a difference usually.)

I'd like to give sortition a try to fill up parliament.


I co-develop an open source firewall for Android, which most of our users use for ad-blocking purposes.

The community has known about server-side collection for quite sometime now. You could run Google Analytics on any of the serverless environments since a year or two ago (I noted this on news.yc a year back [0][1]). Tag Manager server-side is Google throwing its own solution in to the mix.

DNS-based content blocking was always DOA; there are simply too many chinks in the armour besides CNAME or HTTPS/SVCB or SRV or ALIAS record cloaking [2]. The worst I've seen reported to me by users is a tracker generating domain names on the fly (domain generation algorithms) with A/AAAA records pointing to different IP addresses each time [3].

That said, a firewall can still mitigate this offensive, while network security with just DNS was always going to be what it was: a stop-gap.

This isn't the end-game: I fully expect that IP address blocklists would crop up in no time, and will be painfully maintained by folks pouring their life in to it.

TFA points out that Google is reverse-cloaking, presumably with IP addresses, but the worst case would be if multiple domains shared IP addresses (like in a CDN), reverse-cloaked with Server Name Indication. Even firewalls would have to blanket-block IPs... and what if those IPs are shared with other Google front-ends like the AMP project / YouTube / Mail / Docs?

The firewalls would also have trouble with something like Ao1 [4]: If multiple websites were behind multiple IPs, or in the extreme, a single IP.

The firewall is bust, but that's good, now we simply de-Google / de-Cloudflare ourselves, and be luddites like they want us to be.

[0] https://news.ycombinator.com/item?id=26003654

[1] https://news.ycombinator.com/item?id=25169029

[2] https://news.ycombinator.com/item?id=26298339

[3] Ex: https://www.reddit.com/r/uBlockOrigin/comments/srza8x/changi...

[4] https://nitter.net/rethinkdns/status/1448738898998292495


I really don't know much about this space, but do you think server-side tagging could be more or less susceptible to user-resistance attacks like what AdNauseam[0] does? Can we spam them into futility?

[0] https://adnauseam.io/


AdNauseam's offensive tactics can still confuse these server-side implementations. That said, if Google et al figure out a way to defeat it, pretty sure they'd not be blogging or talking about it, at all, for us to know.


Ah, good point. Thanks for the response


> This isn't the end-game: I fully expect that IP address blocklists would crop up in no time, and will be painfully maintained by folks pouring their life in to it.

The proxy can be hosted on the same server as the site itself. In that case this simply becomes a blocklist of naughty websites. Someone still needs to do the hard work of figuring out which sites are naughty.


IP blocking still seems a thing, even with this new feature - the ads need to be served from _somewhere_. I am using pfblocker-ng on pfsense, which uses giant IP blocklists to filter out all connections to spam and ad-servers. I haven't seen ads in 5 years and there is no need for client-side solutions (e.g. adblocker). The places where ads appear are just whitespace.


The idea is that this will be served from the same IP address that the site that you're trying to visit is.


Yes, I updated the French article but not this translation (no idea who did the translation, btw). Google has a guide for hosting the container on your own infra: https://developers.google.com/tag-platform/tag-manager/serve...


Thanks for the explanation - I understood this partly from the article and it is pretty worrying for the future.


There is hope this can be blocked by adblockers inspecting the payload of requests and blocking based on some generic properties that are always present in Google Tag Manager requests to proxies. Unless this mechanism gets some dedicated Chrome-level support that would disallow inspecting or blocking these requests.


I think modifying some fingerprintable APIs to give faked/altered results could be enough, given the global fingerprint is a product of all partial fingerprints. Some extensions already implement that, e.g. https://github.com/kkapsner/CanvasBlocker/
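
A minimal sketch of the canvas part of that approach (CanvasBlocker itself is far more thorough): wrap toDataURL so every read-back gets a tiny random perturbation, which changes the canvas hash on each call:

    // Sketch: make canvas fingerprints unstable by perturbing read-back
    const realToDataURL = HTMLCanvasElement.prototype.toDataURL;
    HTMLCanvasElement.prototype.toDataURL = function (...args) {
      const ctx = this.getContext("2d");
      if (ctx && this.width && this.height) {
        const img = ctx.getImageData(0, 0, this.width, this.height);
        for (let i = 0; i < img.data.length; i += 4) {
          // flip the lowest bit of the red channel with small probability
          if (Math.random() < 0.01) img.data[i] ^= 1;
        }
        ctx.putImageData(img, 0, 0);
      }
      return realToDataURL.apply(this, args);
    };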


> Maybe using an archive.is-like service

No that has turned to shit (for me anyway). Used to be fine, now presents a captcha when JS off. Okay so I switch from Firefox to Safari (where I leave JS on) and it still presents a captcha. I'd rather use the original site with JS than solve captchas.

That has been my consistent recent experience for a multitude of those.

> or a Tor-like service

I've never used Tor, but aren't there a lot of complaints of repetitive captchas when using it?

> randomizes one's IP address and browser fingerprint

I haven't followed this closely, but didn't Apple make claims that they would soon have an opt-in service that did something like this?


> didn't Apple make claims that they would soon have an opt-in service that did something like this?

iCloud Private Relay[1]. It’s in beta.

[1]: https://support.apple.com/en-us/HT212614


I think there was never any possibility of "out-teching" tracking solutions in the first place. You simply cannot plug every hole imaginable that will be discovered and still serve your service on a network.

The only remedy is strict legislation and judicial recourse against companies that do try to cheese it.

Just like you cannot possibly implement real-world security and surveillance that makes it completely impossible to commit theft, but you can implement strong enough legal deterrence to make it a really unviable risk/reward scenario for individuals and corporations alike.


Just block Google tag manager itself. Gets two birds stoned at the same time.


How would you do that? Isn't it the server that talks to Google Tag Manager, not the browser?


Google tag manager in my experience is a script executed by the browser. Then it installs itself in the page and performs the inner payload of user script insertions. It’s a Trojan horse, really. You can block Google tag manager’s embed scripts. I wasn’t aware of a backend integration but it’s certainly possible.

Regardless, I use a DNS based ad blocker (pihole) and it takes care of all this stuff. I occasionally need to turn it off or whitelist domains (like Google tag manager) for client work, but normally I have it blocked.


> Google tag manager in my experience is a script executed by the browser.

Isn't the whole point of this new change that it runs server-side, using a proxy that you install on the website so it uses the same domain?

> Regardless, I use a DNS based ad blocker

But it's the same domain name isn't it?


A Server-Side GTM container complements a client-side container, it does not fully replace it.

Some processing happens on the server, but event data must still be sent to the server-side container first. For now, the "standard" deployment of a server-side container is that it receives hits directly from the browser, orchestrated by a traditional client-side container. So the client-side script is still there, just less bloated.

The server-side container has built-in facilities for serving up the client-side container script, meaning that domain-name blocking will not prevent this. DNS-based blocking also has some issues: Server-Side Containers run in App Engine, so blocking them basically means blocking anything running on GCP.



Current GTM, configured (via the server UI) to inject tracker X:

gtm javascript loads, pulls down the config, injects tracker X javascript into the browser

new gtm:

gtm javascript loads, pulls down config, streams events to google servers to fan out to tracker X as configured

So blocking gtm.js off tagmanager.google.com / www.googletagmanager.com / the various other domains still blocks all gtm injected tags.

The tl;dr is they've become much closer to Segment -- which does the data fanout internally within Segment. But they should still be straightforward to block.


This is not how GTM server-side works. There is not a single call to Google domains from the client when GTM server-side is set up to its fullest. The config (gtm.js) will be loaded from my subdomain and not googletagmanager.com. Also, gtm.js can be renamed.
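
From the page's point of view, that setup looks like any other first-party script (names below are hypothetical):

    <!-- hypothetical: renamed container loader served from the site's own subdomain -->
    <script async src="https://data.example.com/abc123.js?id=GTM-XXXXXXX"></script>

There is nothing Google-shaped left in the URL for a blocklist to match on.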


Per the docs here [1], that is not true. You continue to load gtag.js off the googletagmanager.com domain; subsequent events can flow to a custom domain.

[1] https://developers.google.com/tag-platform/tag-manager/serve...


Couldn't you still recognize the script by its content?


No because the script contents can change from site to site. Maintaining an index for every site would get you closer, but individual sites can trivially tweak things to break fingerprinting as often as they want. Even on every request.


Exactly, this is already done for tracking scripts, since it's common to use proxies to load tracking scripts.


Not with dynamic obfuscation.


You missed the part where they recommend changing the script's name as well; add in changing a few variable/function names in the script, and even matching the hash of the script itself would be useless. On top of that, they recommend using a subdomain with an A/AAAA record so it's first-party.


Worst-case you parse the script and block it if the AST is too similar.

There are a million ways to detect and block this sort of thing when you control the client. Yes, it's harder than just blackholing a whole domain, but it's hardly impossible.
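
For what that might look like, here's a rough sketch assuming you use an off-the-shelf JS parser such as acorn; it only compares the distribution of AST node types, and a real blocker would need something far more robust:

    // Rough sketch: AST-shape similarity between two scripts (npm: acorn)
    const acorn = require("acorn");

    function nodeTypeHistogram(source) {
      const counts = {};
      (function walk(node) {
        if (!node || typeof node.type !== "string") return;
        counts[node.type] = (counts[node.type] || 0) + 1;
        for (const key of Object.keys(node)) {
          const child = node[key];
          if (Array.isArray(child)) child.forEach(walk);
          else if (child && typeof child === "object") walk(child);
        }
      })(acorn.parse(source, { ecmaVersion: "latest" }));
      return counts;
    }

    function similarity(a, b) {
      // cosine similarity over node-type counts
      const keys = new Set([...Object.keys(a), ...Object.keys(b)]);
      let dot = 0, na = 0, nb = 0;
      for (const k of keys) {
        const x = a[k] || 0, y = b[k] || 0;
        dot += x * y; na += x * x; nb += y * y;
      }
      return dot / Math.sqrt((na * nb) || 1);
    }

Whether any threshold on that score survives deliberate obfuscation is exactly the open question.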


Yes, the French article is updated, but this English translation is quite old. Here it is: https://www.simoahava.com/analytics/custom-gtm-loader-server...


You missed the same domain part. How are you going to block a request when you don't know the url?


You check the loaded script itself to see if it matches an expected pattern.


Does there need to be a loaded script with a certain fingerprint? What if they are just passing data from the browser to some random endpoint? I'm not sure, just thoughts.


There needs to be a script because the tracking still happens client-side and there will be some logic involved. The only way to avoid being blocked by the browser is to track server-side.


The point is that DNS ad blocking is being worked around with this new system, because it looks like part of the site you're on. Also, that google is encouraging modifying the JS to prevent automated tools from blocking the javascript.


use uMatrix or uBlock and block individual domains

https://github.com/gorhill/uMatrix


Proud uMatrix user here. Sadly, just noticed that the repo is now archived and I don't know if it will be maintained. Could not find any fork either.

I'll miss this extension.


You have the features of uMatrix with uBlock Origin's static rules. You just have to write them by hand instead of the convenient table UI.

https://news.ycombinator.com/item?id=26284124

The only thing that uBO doesn't support is controlling cookie access, so I still use uM for that.
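
For example, a default-deny setup in uBO's "My filters" looks roughly like this (domains are hypothetical):

    ! block all third-party scripts and frames everywhere by default
    *$script,3p
    *$frame,3p
    ! then selectively allow known-good third parties per site
    @@||cdn.example.net^$script,domain=example.com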


> You just have to write them by hand instead of the convenient table UI.

That’s a pretty big "just", though. Very few sites work without fiddling with rules, having to do manual text entry every time would push me towards not using it.

The UI of uMatrix is generally far superior to the mobile-friendly, simplified one of uBo.


>That’s a pretty big "just", though.

It is, but for me the pros outweigh the cons. In particular, even with uM I often ended up editing the rules by hand because it was easier to copy-paste and turn on and off rules for experimenting, but uM would forcibly resort the rules on save which made that annoying.

>Very few sites work without fiddling with rules,

The only sites I fiddle with the rules of are the ones I visit regularly, which is not many. Over the 1.5 years that I've been using this method, I've only got 75 "web properties" in my list (github.com, github.io and githubusercontent.com count as one "GitHub" web property; so the number of domains is a bit higher). Going by git history, I do have to fiddle with one or more rules once a month on average.

For other sites, either they work well enough with default settings, or I give up and close them, or if I really need to know what they say I use a different browser. For this other browser I never log in to anything, and have it configured to delete all history by default on exit. (I've been pondering making this an X-forwarded browser running on a different source IP, but haven't bothered.)

>The UI of uMatrix is generally far superior to the mobile-friendly, simplified one of uBo.

To be clear, editing the rules does not use the "mobile-friendly, simplified" uBO UI. It refers to the giant text field you see in the uBO "Dashboard", specifically the "My filters" tab.

But yes, it'd be the best of all worlds if uBO gains the table UI as an alternative to the filters textfield. I imagine the problem is that static filters are technically much more powerful than what the uM-style rules do, so it'd require inventing a third kind of rule, which isn't great.


I have almost 7000 rules for a 260kb file ;)


ηMatrix is a fork maintained for Pale Moon: https://gitlab.com/vannilla/ematrix


I liked this a lot, but I don't see how someone without a computer science degree will use it successfully.

I think this is why Raymond gave up on it. I think for the masses his time is better spent on uBlock Origin.


It requires some effort to get oriented, but the granularity of control is fantastic. There is no competition.

Although the dev gave up on it, he's open to someone picking it up (if there are any brave souls on HN)

https://old.reddit.com/r/uBlockOrigin/comments/i240ds/reques...


Just block the GTM js from loading, it'll stop it easily.


The big change they are suggesting is that the gtm code is no longer accessed via a predictable Google domain, rather it is requested through a subdomain of the parent site.



uBlock already blocks stuff like Plausible analytics based on what's in the code, even if it runs on the parent site. Would this be any different?


Block the code that they suggest changing the name, domain, and function signatures of? How?


If the loops, if statements, and block scopes are similar then the graph can be fuzzily identified. They’ve had anti-plagiarism software for years.


Annoyingly that would still require downloading them, which I'd definitely prefer not to. It's bloat that serves me no purpose.


For popular sites a blocklist could be formed after the first person downloads it.


Can you point me to some anti-plagiarism software? Because this doesn't sound like it will work at a non-trivial level.


Yup, overwrite its API on the page
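
e.g. a user script or content script could swap the usual page-level entry points for no-ops before anything else runs - though, as discussed elsewhere in the thread, this only helps while the site keeps the default names:

    // Sketch: neuter the default GTM entry points before page scripts run.
    // Crude; breaks if the site renames dataLayer, which server-side GTM encourages.
    Object.defineProperty(window, "dataLayer", {
      value: { push: () => 0 },   // swallow dataLayer.push() events
      writable: false,
      configurable: false,
    });
    window.gtag = () => {};       // swallow direct gtag() calls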


"Blocking scripts for each individual website" probably isn't too bad of a burden though. There's enough people who are annoyed by this and few enough sites that you actually visit (how often do you actually visit a brand new website, or one that hasn't been visited by thousands already?) that maintained (donation supported) chrome extensions for this will pop up eventually.


>> God damn... this is it, this is the end-game

I don't understand. I tried to read the article but it doesn't make sense to me. What is the end-game? Can you explain? Not everyone uses google analytics, and even if we do it would only be on the front pages... (hooking into any API has always had the potential to expose session data if you pass it, so what's new here??)


It was clear this was going to happen for more than a decade now. I'm surprised it took them so long to really push for this. I'm just reiterating what I said back then: There's no point in wasting any time and resources into a stupid technical cat and mouse game to fix this. The only sensible way to deal with this stuff is through legislation.


If it takes maintaining blocking scripts for individual websites, I'm pretty sure services will spring up to crowd source it.


Nope. The end-game is adding the data collection into the backend frameworks so the user does not have to execute javascript at all.

But this is pretty close to it. I hope Google and anybody collaborating with them get severely punished.


Couldn't an adblocker block the largest javascript blob loaded by the page? Most likely it's GTM. Also, with a bit of machine learning it could recognise the patterns in the JS blob, no?


You still pay for the App Engine requests. This whole product is just a bash script that configures the proxy for you.


“ I honestly don't know if there's any solution to this at all.”

How about the law? Like GDPR? My data is mine.


I mean, technically there is nothing stopping me from following anybody around, documenting their actions, taking pictures. It's easy... But we have laws that prevent this because we decided together that we do not like this.


Wouldn't a script blocker like NoScript or uMatrix take care of this?


I get privacy concerns and hate for ads, but what about "free" internet? Paywalls are a massive annoyance to me personally, and if ads were legislatively blocked, would I have to pay for each website I visit that previously relied on ads for $? Perhaps we could be making micro-transactions for each website visited via crypto (?)


Solutions for sending micro transactions to websites you visit have existed for over a decade[1], no cryptocurrencies or blockchains required.

[1]: https://en.wikipedia.org/wiki/Flattr


So something like https://yalls.org?


"Endgame" is the way all web analytics was done 20 years ago.


The server-side "analytics" of 20 years ago was for aggregate reports on popular pages, number of users, their browsers and OSs and maybe their geo-location; solely for the use of the site owners to optimize and whatnot.

This abomination Google is proposing is unblockable cross-site tracking of people's activities. That site owners get to see some of that data too is insignificant; the value comes from being able to track people across the web. I'd bet Google would even offer this proxy service "for free", depending on how much data they can hoover from the site.


How does google correlate identifiers between different users?


Browser fingerprinting and IP address plus any unique identifiers if you happened to log in on that website.


How much would you pay per month for custom-per-site tracking blocking as described here?


Nothing. No one should pay for not being tracked.


In principle I agree, and I support having the GDPR in effect globally, so that these server-side data sharing solutions are illegal without opt-in consent.

Unfortunately there’s a reality gap between “GDPR everywhere” and the United States and other countries, and that gap was being filled previously by anti-tracking lists maintained essentially for free out of the goodwill of people’s hearts. Now that Google is - and has been - using server-side proxies, those tracking lists won’t scale without human caretaking. Any human versus the entire web would burn out in a day.

So the choice is either to pay humans to enforce our anti-tracking beliefs against scummy corps, or to donate to politicians that believe in GDPR so they can try to make it illegal, or to refuse to pay anything and accept the status quo of being tracked. We’ve reached the end game of the “pay nothing until it’s fixed, then continue paying nothing” ethos: Google has outplayed us, and website owners can afford to pay to track us. I don’t like this, and neither do you. I think it’s time to pay money to fight back, and you do not think it's appropriate to pay money to fight back.

If you or anyone have a good idea of how zero-cost effort can somehow solve the tracking problem, share it with others in a useful reply somewhere. You don't have to convince me that such ideas exist: you have to convince others who share your "at no cost to me" beliefs to invest their time and energy in your zero-cost idea. And, whatever else I'm uncertain about, I guarantee they're not going to see such a reply down here in this thread that started with a pricing question.


5-10 bucks. Any higher and I'll be looking at other options like not using the web so much.


Up to $6.90 - which would be (roughly) 10 local bucks in my country.


It's based on JS. There's your solution. I've had JS disabled in the browser for nearly two decades, and I can still use most of the web (HN included).

You are blind to the solution because you don't want to take responsibility for your own browsing. You and people like you won't change; you'll whine about how nothing can be done while not being prepared to understand that the problem is yourselves, and that's where the solution lies as well. When Google screws you over, remember that you chose it (maybe by omission rather than commission, but you chose).

