If you're concerned as a user of a malicious site:
* Link click tracking - So what, the site could route you through a server side proxy anyways
* Hover tracking - Can track movements of course, but doesn't really help fingerprinting. This is still annoying though and not an easy fix
* Media query - So what, user agent gives this away mostly anyways
* Font checking - Can help fingerprinting...browsers need to start restricting this list better IMO (not familiar w/ current tech, but would hope we could get it down to OS-specific at the most)
If you're concerned as a site owner that allows third party CSS:
* You should have stopped allowing this a long time ago (good on you, Reddit [0] though things like this weren't one of the stated reasons)
* You have your Content-Security-Policy header set anyways, right?
Really though, is there an extension that has a checkbox that says "no interactive CSS URLs"? I might make one, though I'm still figuring out how I might detect/squash such a thing. EDIT: I figure just blocking url() for content and @font-face src would be a good compromise so as not to break all sorts of background images for now.
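A minimal sketch of what that might look like (hypothetical and untested TypeScript content script; it can only walk same-origin stylesheets, since reading cssRules on cross-origin sheets throws):

```ts
// Hypothetical content-script sketch: neutralize interactive CSS URLs.
// Cross-origin stylesheets are skipped because reading their cssRules
// throws; a real extension would need request blocking for those.
function stripCssUrls(): void {
  for (const sheet of Array.from(document.styleSheets)) {
    let rules: CSSRuleList | null = null;
    try { rules = sheet.cssRules; } catch { /* cross-origin sheet */ }
    if (!rules) continue;
    for (const rule of Array.from(rules)) {
      if (rule instanceof CSSFontFaceRule) {
        // Drop remote font sources entirely (the "compromise" part).
        rule.style.removeProperty("src");
      } else if (rule instanceof CSSStyleRule) {
        const content = rule.style.getPropertyValue("content");
        if (content.includes("url(")) {
          rule.style.setProperty("content", "none");
        }
      }
    }
  }
}

stripCssUrls();
// A real extension would also re-run this when <style>/<link> nodes are
// added dynamically, e.g. via a MutationObserver.
```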
> * Media query - So what, user agent gives this away mostly anyways
I was genuinely surprised how much data iOS and Android devices tend to put into the user agent. Not only the exact patch level of the browser, but also the OS patch level, and Android devices even tend to broadcast the precise device model as well -- often more precisely than you could tell by looking at the device itself!
Some examples:
Mozilla/5.0 (iPad; CPU OS 10_3_3 like Mac OS X) AppleWebKit/603.3.8 (KHTML, like Gecko) Version/10.0 Mobile/14G60 Safari/602.1
Mozilla/5.0 (iPhone; CPU iPhone OS 11_2_1 like Mac OS X) AppleWebKit/604.4.7 (KHTML, like Gecko) iOS/16.0.7.121031 Mobile/15C153 Safari/9537.53
Mozilla/5.0 (Linux; Android 7.0; LG-H840 Build/NRD90U) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.111 Mobile Safari/537.36
A Linux in comparison:
Mozilla/5.0 (X11; Linux x86_64; rv:57.0) Gecko/20100101 Firefox/57.0
While it includes the version number, no patch level (57.0.X) is included.
The user agent is such a mess -- why should any website know all that? Why should a website know anything about the visiting guest? They should be using feature detection instead. Let's get rid of the user agent, or just put "Mobile/phone", "Desktop" or similar in it. Maybe OS and a short browser name and major version number for statistics.
Without user agent: How would I easily detect which browser breaks a certain feature on my project?
If I deploy a new feature and see through logging that a browser X is not able to do Y then I can install X on my machine and test and fix it.
If I don't have a user agent then I can just detect that after deploy there are more cases where Y fails but I don't know which browser is responsible for this.
As a developer: if we actually pushed browsers to fix things, you wouldn't need to worry about that. Why should the job fall to you to work around their shitty implementation of the spec?
Because when management asks you why their site that they paid hundreds of thousands of dollars for doesn't work on <insert major browser here>, your answer can't be "the browser's implementation of the spec is shitty, blame them." Your answer is going to be, "Yeah, sure, let me fix that."
> your answer can't be "the browser's implementation of the spec is shitty, blame them."
If it's a major browser which management cares about, then you should be testing with it already. If you're not, then logging user agent strings isn't going to help.
Logging user agent strings would help if, for example, an unexpectedly large proportion of users are using a "non-major" browser in which your site is broken.
If the proportion is small, management won't care.
If the proportion is expected, then market/demographic research is partly to blame; update the spec.
If the browser is "major", you should be testing with it anyway.
I see what you're saying, but unfortunately, especially in enterprise, the browser version is often locked to something quite old. One of our clients has locked to Chrome 48.
Even if Chrome followed the spec to a T, programmers still write bugs. So, I'm not going to expect a browser (at least) 15 versions old to behave perfectly. And we all know that the spec isn't perfectly implemented.
So, no. Unfortunately sometimes there are things that will make management care a lot about a browser that they really shouldn't.
> Unfortunately sometimes there are things that will make management care a lot about a browser that they really shouldn't.
I never said management should or shouldn't care about this or that browser. I never said anything about browsers being new or old.
I said that developers should be testing with whatever browsers management cares about. If management care about it, and there's some justification, then add it to the spec.
> unfortunately, especially in enterprise, the browser version is often locked to something quite old. One of our clients has locked to Chrome 48.
That's an excellent justification for having Chrome 48 compatibility as part of the spec, so you should already be testing your sites with it. What has that got to do with user agent strings?
Is Chrome 48 even old? I tend to ensure IE6 compatibility, unless I have a good reason otherwise (e.g. voice calls over WebRTC, or something). When I'm using w3m, e.g. to read documentation inside Emacs, I occasionally play around with my sites to ensure they still degrade gracefully.
Because 100% of implementations are differently shitty. There's no amount of "pushing browsers to fix things" that is going to catch 100% of novel interactions resulting from different combinations of the declarative HTML and CSS languages out in the wild (especially when JavaScript then comes along and moves all those declarations around anyway).
Sure, and the browsers that stray further off will get used less and die off.
And if you are using the latest and "greatest" JS features, you have to expect the failures that happen. If you enjoy sitting on the bleeding edge, don't complain about getting cut.
If you implement features using known, simple and stable tech, things will generally work great without needing to worry about special cases.
So you think a better way is spending your evening trying to fix your square pegs so that they fit in round holes?
Why would you willingly do that to yourself? If we pushed browser developers to actually do their job, they wouldn't be throwing their weight around like they do now.
Who said it's not their problem? But it's also relying on someone else to fix something that you could fix. If I need to get somewhere, it doesn't matter if my car's engine is broken because the company stabbed it with bolts; I just need a working car. I can sit around whining about how awful the car company is, but it doesn't get shit done fast.
Extreme ownership of problems. It's a really helpful concept. You'll stop trying to blame people all of the time for things that you can control and find solutions for them instead. On top of that, if you can't control it you can let it go as something that you can't fix.
If getting shit done fast is your goal, then you are gonna get burned, and I have very little sympathy for you. We should be focusing on getting shit done solid. If it's such a big deal that something works, why build unstable systems in the first place?
If you need your car to be reliable, don't bolt experimental features onto it, and test it before you need to take it on the road.
Not exactly related to your point about the user agent giving all kinds of arguably unnecessary information, but there's an interesting write-up about why the core user agent is the mess it is for those who've not seen it already.
The user agent string definitely has a place on the web, the problem is that it's been used and abused by web developers in the 90s and 2000s when trying to deal with the utter mess that was "browser compatibility" back then.
I run whatismybrowser.com and it's a perfect case of why user agents are useful information. It'll tell you what browser you've got, what OS, and whether you're up to date or not. It's extremely useful to know this info when helping non-tech users - you would not believe how many people still reply "I just click the internet" when you ask them what browser they're using. My site helps answer all those complicated "first" questions.
I completely agree that using user agents for feature detection/browser compatibility is a terrible idea, but apparently enough websites still do it to warrant having to keep all that useless, contradictory mumbo jumbo in it too - it isn't what they should be used for any more!
And also, I don't think there's any problem with including "too much" information in the user agent either - case in point: Firefox used to include the full version number of Firefox in the user agent, but now it only shows the major version number, not the exact revision etc. The problem is I can no longer reliably warn users if they're actually up to date or not.
The reasoning for this is given as a security concern, which I still don't understand - if there's a security problem in a particular point-revision of Firefox which can be exploited by a malicious web server, odds are they're just going to try that exploit against any version of Firefox and it either will or won't work - how does the malicious site knowing the exact version make the situation any worse?!
I've always thought this. Just code to the standard, and if the browser doesn't render it correctly, then tell the user to fuck off and fix their browser.
I don't know why we ever thought sending all this data to the server was a good idea
if 99% of websites you visit work great, and 1 website you visit tells you to fuck off and fix your browser, are you going to do that or are you going to just not use that site?
Remember: incentives. The goal of a web developer is to make sites people use.
I mean, in an ideal world, of course it does. But again: incentives. Keep in mind: search engines themselves are extremely empowering, and they are not generally considered to be something a person pays directly for.
Yeah, empowering users does get developers paid. I get paid to do that myself, and know a lot of other people who also get paid to do that. Of course it's sometimes easier to get paid by treating users like cattle. But if someone doesn't intuitively understand why screwing their users is a bad idea, I'm not sure I can help them.
I believe Safari will be freezing the user agent string soon; Safari Technology Preview is already doing this (it's "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1 Safari/605.1.15" if you're curious).
> If you're concerned as a user of a malicious site
Or if you're concerned as a user of a regular "safe" site... Google Analytics does all of these things: link tracking, hover tracking, media query tracking. GA or something like it is being used by vast swaths of the web. I don't claim it's the majority, because I don't know, but that's what I assume, that all sites are tracking (whether or not the site even knows it.)
IMHO they are different. So much of what makes the web and surveillance creepy is the ability to correlate across data sources. CCTV isn't creepy for one store owner to have - it's bad when the government has it everywhere.
It's not bad for people to analyze how users interact with their site. It's bad when one entity (or a handful) can track you across the Internet.
So in other words, I don't mind Piwik, and have considered sending in a patch to uBlock and others with a switch to disable "locally hosted analytics" or something similar. Like the drive to push "ethical advertising", I think it's reasonable to permit some benign tracking as a way to coerce more sites into decentralizing user analytics.
I'm fine with self-hosted analytics, and use a log analyzer myself.
My primary objection is automated profile-generation and identifier sharing - third parties don't need realtime updates on my reading habits. I like to think folks who run their own analytics aren't sharing identifiers with adtech shops, but of course can't know.
> * Font checking - Can help fingerprinting...browsers need to start restricting this list better IMO (not familiar w/ current tech, but would hope we could get it down to OS-specific at the most)
Oh my. I wish this madness ended. Quoth tedu:
> I don’t know a whole lot about typography and fonts, but there’s two things I know about font files. They’re ridiculously complex and their parsers have only just begun to experience life with hostile inputs. In short, I’d put fonts second on my list of files likely to pwn your browser, after Flash [...].
I've thought for some time that the only reason people have not exploited fonts to take over browsers is because even hackers don't understand how they work.
Note that browsers pass downloadable fonts through a sanitizer before they even consider handing them off to anything else that might need to parse the font. And browser security teams have spent years now fuzzing both those sanitizers and various font libraries...
There's still a lot of attack surface here, but "only just begun to experience life with hostile inputs" isn't quite true either.
> * Media query - So what, user agent gives this away mostly anyways
It doesn't; without media queries you can't detect things like browser window size or screen pixel density.
> * Font checking - Can help fingerprinting...browsers need to start restricting this list better IMO (not familiar w/ current tech, but would hope we could get it down to OS-specific at the most)
There's a lot of trade-offs here. Plenty of people have fonts installed for various reasons (some because none of the system fonts cover a script they want, some because they're using fonts designed to mitigate some issues caused by dyslexia, etc.), and breaking it for those people would not be good.
> It doesn't; without media queries you can't detect thing like browser window size or screen pixel density.
Sorry if I wasn't clear. I shouldn't have said media queries, I should have said "CSS property queries". What CSS properties you have doesn't leak any more than your UA I would guess.
> There's a lot of trade-offs here [...]
I'll take it as an option to have a strict subset (though I would prefer it as opt-out, to discourage font-list-based fingerprinting as a practice, even if metric-based fingerprinting may never go away). With downloadable fonts, I don't really like the "script they want" excuse. On the accessibility reasons I am admittedly naive, but I would assume such a font would be a substitute for an existing font name. Unique font names per user seem unnecessary.
>> Sorry if I wasn't clear. I shouldn't have said media queries, I should have said "CSS property queries". What CSS properties you have doesn't leak any more than your UA I would guess.
>> Unique font names per user seem unnecessary.
As a dev that has worked in CSS for over a decade, I have no idea what either of those mean.
EDIT: I see the unique font names part now, totally missed that while focusing on the other parts.
By "CSS property queries" I mean the technique in TFA using @supports + before/after-content URLs to query whether certain CSS properties are supported.
If somebody has fonts installed for specific reasons then presumably they don't want web pages changing them. Web pages shouldn't need any more control than selecting from "serif", "sans-serif", or "monospace". The most legible font is the font you're most familiar with. I don't want web pages to use different fonts just because some marketing drone thought it was good for branding. It's disappointing that we allowed websites to abuse fonts for vector icons.
Give one example where users benefit from the website forcing a special font. And even if that happens, the website can provide the font to the user. None of this requires the user telling the website host which fonts are installed.
All of this is an easy fix: disable css. In the same way that "I don't want to be tracked by javascript" can easily be resolved by disabling javascript. I'm not seriously suggesting everyone does that, but anyone who is so paranoid that they don't want a site knowing that they're reading its content might want to consider it.
Happily used to (5 years ago) surf the web with no JS and no CSS, or rather applying my own style-sheet for 90% of my web viewing. I'd fall back to Chrome when absolutely necessary. It was fast and comfortable, it just relies on well structured accessible content.
> it just relies on well structured accessible content.
Honest question: how much of this is left? What popular sites are still accessible this way? HN might be the only site I visit frequently where browsing with no js/css has any hope of working.
Try it and see. I block a ton by default[1]; most sites are just fine without it.
I get that some people have low tolerances for things not being perfect. CNN stories without JS usually have a pile of empty images at the top, for instance. But that is probably fixable; I just haven't bothered to figure out which bit of JS to allow for that.
Usability depends on your tolerance for imperfections vs. your tolerance for being observed.
[1] Current setup uses JS Blocker 5, uBlock, an aggressive cookie manager and my home proxy, which does a ton of things, many of which I don't even remember at this point.
Check out surfraw (by Assange); you can actually access a surprising amount of resources using sr and lynx from a terminal. Text only, but it still makes the internet pretty useful.
I browse HN using w3m, from which I'm commenting right now. It works on probably 80% of the links I attempt to visit. In many cases it works better than a graphical browser: I only see article text, and for reasons I haven't investigated I often seem to be ignored by paywalls. I never see subscription nagboxes or ads.
Sometimes I have to search forward for the title to skip the load of garbage that precedes the article text.
While we, as devs, may get tired of the constant beat-down between site flexibility and privacy, many of our users are unaware. They will go blindly towards flexibility and we have a duty to find as much compromise as possible between those two values lest we just say "it's an easy fix, just turn off your computer". There has to be a middle ground between extremely paranoid turn everything off and extremely liberal with my anonymity (and on the internet, it's not governments who are going to help find it).
I guess my point is "how much anonymity is it reasonable to expect?" Should I have a problem with the fact that nigh-on every URL in the world will leave behind a little footprint when I request it? I don't see an enormous problem with a website anonymously recording the fact that I've clicked a link.
("anonymously" assuming I'm blocking their cookies, which I would if I were that paranoid)
> ("anonymously" assuming I'm blocking their cookies, which I would if I were that paranoid)
Note that cookies are mostly a convenience vis-a-vis tracking. For user tracking, there's nothing really stopping the server from vending you a version of their site with custom CSS that loads images with a fingerprint in the URL, which would still work with cookies disabled. That'll get you coherent signal on a session (gluing different sessions together would be a bit more challenging, of course, but I wouldn't be surprised if it were possible).
I don’t mind sites tracking to know which products sell and which don’t, what browsers people use, or how long I spend on the site, etc.
What I hate is the fact that I go to Agoda, I search for hotels in Jiufen in Taiwan, I look at only 2, I book one of the 2, I close it. I open up Facebook on my phone seconds later and have adverts saying: hey, how about these 2 hotels in Jiufen?
That shit annoys me. Stop following me and tracking what I’m doing and sharing it with all these companies. It makes me want to not use the internet...
Sure. I think it's mainly interesting in that a CSS injection vulnerability can turn into tracking. It never occurred to me before that a CSS injection vulnerability could do anything actionable.
> So what, the site could route you through a server side proxy anyways
There's little interest in proxying through a token system (this would require a DB read at each click, and a DB write at each page generation), which means the actual link is available client-side and the whole thing can be bypassed.
It's easy to design a system like this where the actual link isn't available client-side, and the server doesn't need to wait on a DB read and write before responding to the client: make the URL parameter be the destination URL encrypted so that only the server can read it. That kills the need for a DB read. Then the server can respond to the request before the DB write finishes since the integrity/consistency of that write is likely less critical than the response time.
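A minimal sketch of that idea (hypothetical TypeScript/Node, assuming Node 16+ for base64url and the built-in crypto module; routes and names are made up):

```ts
import { createCipheriv, createDecipheriv, randomBytes } from "crypto";

const KEY = randomBytes(32); // in practice a fixed secret shared by your servers

// At page-generation time: seal the real destination into the tracking URL,
// so no DB write is needed to mint the link.
function makeTrackedLink(destination: string): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", KEY, iv);
  const body = Buffer.concat([cipher.update(destination, "utf8"), cipher.final()]);
  const token = Buffer.concat([iv, cipher.getAuthTag(), body]).toString("base64url");
  return `/click?u=${token}`;
}

// On click: decrypt and redirect immediately; no DB read is needed, and the
// analytics write can be queued and flushed after the 302 has been sent.
function resolveTrackedLink(token: string): string {
  const raw = Buffer.from(token, "base64url");
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28);
  const body = raw.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", KEY, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(body), decipher.final()]).toString("utf8");
}
```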
> I usually fetch web pages from other sites by sending mail to a program (see https://git.savannah.gnu.org/git/womb/hacks.git) that fetches them, much like wget, and then mails them back to me. Then I look at them using a web browser, unless it is easy to see the text in the HTML page directly. I usually try lynx first, then a graphical browser if the page needs it (using konqueror, which won't fetch from other sites in such a situation).
Using lynx, the only thing that makes reading Hacker News somewhat inconvenient is the lack of indentation to show the nesting hierarchy, but otherwise it works quite well.
Some other sites are so messed up that it's actually more comfortable to read them in a text-only browser that completely ignores CSS and replaces images by their alt-tags.
Of course I frequently do want to look at images, so my main browser remains Firefox, but it's still useful to remember that other browsers with different tradeoffs exist and can be used.
Sometimes, you really just want to read some text and don't need any of that fancy other stuff.
You can see the indentation if you use w3m. HN uses tables to structure the comment hierarchy, and the w3m browser does a pretty great job rendering tables.
w3m sets the column width for the HN nested table spacer all to one value, so you visually only get two levels of nesting. Here is a screenshot of this thread as rendered by w3m:
I always browse HN using the links2 browser. No CSS! (Although elinks is pretty interesting, in that it's a text-only browser that implements some CSS.)
If a website doesn't look like it was made in the last couple of years (think: Medium-like centered content with large fonts), I click that handy reader view button out of a habit.
I can't stand reading articles with <18px font size. Some pages (like HN) I simply zoom in to something like 150%, but if it's just an article, hitting that button is easier to me than zooming in.
Links (a fork of lynx IIRC) does images, it might be off by default, can't recall. Back when I used Slackware it was handy to have a terminal based browser for looking up how to fix things.
A more interesting question would be whether it is possible to disable the "dynamic" part of CSS in any browser - things like ":hover" and ":active" that this proof of concept abuses - and leave just the more benign static styling rules.
Probably not; you'd also need to disable a lot of other optimizations.
For example, a browser will not load an image if it's set to `display: none` in CSS (at least not right away). That could be abused to trigger the download when the CSS changes, without a URL needing to be in the CSS at any point.
Even if that's set to true (the default), doesn't Firefox prevent the page from reading the :visited state of links? I'm not sure what the privacy value of that pref is.
Screen readers interface with normal browsers, so JS and CSS will be loaded as per usual (unless the user has gone to the trouble of turning them off).
I don't see what's problematic about this. The tracking is not really done in CSS, so much as on the server. You could accomplish the same thing with 1x1 images, or loading any remote resource. Effectively the only difference is you're loading the URL conditionally via CSS, as opposed to within a `<script>` or `<img>` tag. Furthermore, this can be blocked in the same way as any tracking URL.
I concede this is a novel way of fingerprinting the browser from within the client, without using JS. However, I think a better way to describe this would be "initiating tracking on the frontend without the use of javascript."
The difference is that CSS can trigger remote resource loads in response to post-pageload user behavior, which intuitively seems like a JS-only thing. For example, tracking where the mouse has moved, as mentioned in the readme.
I wouldn't say it's some sudden, alarming capability, but it is distinctly more capable than <img> tags.
Think about people using extensions like NoScript to block JS because it offers this functionality. This is fairly relevant to these people, as they clearly also need a “NoCSS” extension.
This stuff is relevant for sites that allow users to upload CSS to be used by other people. If I can make a subreddit or a social media page on a site and upload custom CSS for it, then I can make the CSS trigger requests to my own personal server on certain events and track the people who visit my subreddit/page. (Reddit adds certain restrictions to CSS that can be uploaded to it to defend against this.)
About 8 years ago, a colleague and I interviewed a nervous kid fresh from undergrad. He was applying for a junior front-end position at our fast-growing startup. Dressed in a shiny, double-breasted suit and wingtip shoes, he followed us into a tiny office (space was so limited) where we conducted interviews.
"Tell us about your CSS experience," we asked him.
"Ah, yes. I, well, haha, of course. The CSS is where you make your calls, to the database, ah, server, ah, of course."
Unsurprisingly, we did not hire the applicant, though his answer to our question lived on in infamy for many years. But all that changed, today, reading this. The joke was on us. That kid was clearly from a future of which we had no awareness. Starting today, I'll always trust programmer applicants donning double-breasted suits.
> However using my method, its only possible to track, when a user visits a link the first time
This suggests that browser history sniffing is still possible - as long as you make the user click the link (in contrast to the old a:visited method, where this could be done with no user interaction).
Any part of a browser that can make a request can be used to do this sort of thing. Any part of a browser that can alter the view and its related DOM attributes can cause a user to interact with it and give up data involuntarily.
Turn off JavaScript and CSS media queries can cause resources to load based on a number of parameters. Have canvas enabled and you can be fingerprinted. Use one browser over another and get feature detected. Anchor states give away browsing history. Hell even your IP address sacrifices privacy, and that's before the page gets rendered.
So with that being said, if you're browsing the web, you're giving up information.
Very smart! This is a few lines of code away from a CSS-class-based mini tracking framework (roughly sketched below)...
Aside from the obvious, this could also be used as a fallback (restricted) form of A/B testing for no-JS users? I'm thinking data about just what was hovered and clicked, plus media queries, allows for some basic UI testing of responsive websites.
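Roughly, the whole "framework" could just be a stylesheet generator along these lines (hypothetical TypeScript sketch; the /t endpoint, selectors and ids are made up):

```ts
// Hypothetical sketch of the "mini framework": generate a stylesheet
// server-side (no JS on the page) with one beacon rule per tracked element.
type CssEvent = "hover" | "active";

function trackingRule(selector: string, id: string, event: CssEvent): string {
  // content: url() inside a pseudo-element only loads when the
  // pseudo-class matches, i.e. when the user actually interacts.
  return `${selector}:${event}::after { content: url("/t?el=${encodeURIComponent(id)}&ev=${event}"); }`;
}

function trackingSheet(targets: Array<[string, string, CssEvent]>): string {
  return targets.map(([sel, id, ev]) => trackingRule(sel, id, ev)).join("\n");
}

// Served as e.g. /tracking.css:
//   #buy-button:active::after { content: url("/t?el=buy&ev=active"); }
//   nav a:hover::after { content: url("/t?el=nav-link&ev=hover"); }
console.log(trackingSheet([["#buy-button", "buy", "active"], ["nav a", "nav-link", "hover"]]));
```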
This doesn’t mention my personal favorite CSS tracking trick: timing attacks that can be used to detect what sites you have loaded. This can be done by interleaving requests to a remote URL (say, a background-image) with requests to your server script, which times the differences.
The fanciest tracking trick is the HSTS supercookie.
You use a bunch of subdomains -- a.example.com, b.example.com, etc. -- each configured so that a particular URL (call it the 'set' URL) sends an HSTS header. A different URL (the 'get' URL) doesn't.
You generate an ID for the user, and encode it as a bit pattern using the subdomains to indicate positions of '1' digits. Say your ID is 101001 -- you serve a page which includes images loaded from the 'set' URLs for subdomains a, c, and f. On later page loads, you serve a page including images loaded from the 'get' URLs of every subdomain, and you pay attention to which ones are requested via HTTPS. Since the 'set' URL sent an HSTS header, subdomains a, c, and f get requested over HTTPS, and now you reconstruct the ID from that: 101001.
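Concretely, with hypothetical subdomains a.example.com through f.example.com, the encode/decode halves might look something like this (sketch only; a real deployment would use many more subdomains/bits):

```ts
// Hypothetical sketch of the HSTS-supercookie encoding described above.
// Each subdomain serves an HSTS header on /set.gif and nothing special on
// /get.gif; the domains here are made up.
const SUBS = ["a", "b", "c", "d", "e", "f"];

// First visit: emit <img> tags that hit the 'set' URL for every '1' bit.
function setMarkup(id: string): string {
  return SUBS
    .filter((_, i) => id[i] === "1")
    .map(s => `<img src="https://${s}.example.com/set.gif" width="1" height="1">`)
    .join("\n");
}

// Later visits: emit 'get' images for every subdomain over plain http;
// browsers that saw the HSTS header will upgrade those requests to https.
function getMarkup(): string {
  return SUBS
    .map(s => `<img src="http://${s}.example.com/get.gif" width="1" height="1">`)
    .join("\n");
}

// Server side: given which subdomains' /get.gif requests arrived over https,
// reconstruct the ID.
function decode(httpsSubdomains: Set<string>): string {
  return SUBS.map(s => (httpsSubdomains.has(s) ? "1" : "0")).join("");
}

// decode(new Set(["a", "c", "f"])) === "101001"
```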
I feel like this changes all the time; I was recently surprised to discover 'TLS Client Channel ID' (my nomenclature is a bit fuzzy - an RFC for automatic client certs "for security") and would love to learn more about the extent of its current implementation in Chrome.
>londons_explore: In Chrome, it also uses the TLS Client Channel ID, which is a persistent unique identifier established between a browser and a server which (on capable platforms) is derived from a key stored in a hardware security module, making it hard to steal. Ie. if you clone the hard drive of a computer, when you use the clone, Google will know you are a suspicious person, even though you have all the right cookies.
But how many users disable JavaScript in their browser to prevent tracking? And is the fact that a website can track all your clicks and mouse movements a privacy/security issue to begin with? Isn’t it by design that the website you’re visiting can track you?
By design, the web is a "1. send me the document <-> 2. here it is" transaction, not a series of many small notifications. By design, the url() property almost certainly wasn't intended to be dynamic. This is clearly 'bending the established rules' — cleverly, admittedly.
> By design, the web is a "1. send me the document <-> 2. here it is" transaction, not a series of many small notifications.
That was true in 1998, but most of the web has been turning, by design, into what you might call a series of many small notifications ever since then. Gmail's been doing it since 2004. Today, almost all large sites are running Google Analytics or something like it, which track everything this article discusses, and operates on constant micro transactions. All web apps are built on many small notifications, and many of them even use websockets which was explicitly, by design, built for streams of micro transactions.
> By design, the url() property almost certainly wasn't intended to be dynamic. This is clearly 'bending the established rules'
There was never a rule against it, even if dynamic usage wasn't expected or imagined (which I find unlikely). CSS allows it, therefore it's allowed by design.
The `url()` function wasn’t intended for tracking, sure. But my point is that it doesn’t matter, since it is accepted that the website you’re on can track you to begin with. I don’t think anyone in the standards bodies is trying to prevent that.
Call me naive but, as a dev, I don't see why this would be any better than using JS. The group of people that block JS is likely to do the same for this and, as mentioned by others, common sources of such mucking are blocked by a good ad blocker.
Then, there is the whole, "how could it be integrated into an existing site with minimal fuss" issue. With JS you can specify targets and the like for actions and observations, the only comparable thing would be to offer sass / less integration so that it works with clients that disable or block JS, which is arguably much more difficult.
While it is definitely clever, I just don't see a practical use for it. It would really only benefit those willing to put the work into using it and only work so long as their logging URL is available and not blocked. I just don't see the real value.
I seriously think we need an alternative to HTML that axes styling and scripting and concentrates solely on the markup / content description. Websites would use a certain set of elements/descriptors to describe the content they contain. The user’s website reader would parse the markup / content description and display a page how it thinks it should be displayed (according to the user’s preferences). All websites would have the same styling – the one chosen by the user. This HTML alternative could provide an API that makes it possible to have dynamic websites but still prevents scripting and fingerprinting.
I know this is a joke but you will still have tracking in the rss reader or in the images loaded along side the articles. The only solution is paying for software that does not track.
It’s pretty trivial to make server-side calls to Google Analytics [0], passing lots of different data using async commands so the user doesn’t even feel the hit.
Additionally you could queue these stats messages and send in bulk when your server load falls below a certain threshold. I’m not talking hours, just seconds. Like a workflow engine.
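As a sketch of what I mean (TypeScript; the hits are in the style of the GA Measurement Protocol v1 from memory, so check field names against the docs, and the tracking ID is a placeholder):

```ts
// Rough sketch of queued server-side hits. Endpoint and parameter names are
// in the style of the GA Measurement Protocol v1 -- verify against the docs.
type Hit = { cid: string; dp: string };   // client id, document path

const queue: Hit[] = [];

function enqueuePageview(cid: string, path: string): void {
  queue.push({ cid, dp: path });          // cheap: just remember it
}

async function flushQueue(): Promise<void> {
  // Called from a low-priority timer when server load allows.
  while (queue.length > 0) {
    const hit = queue.shift()!;
    const body = new URLSearchParams({
      v: "1",
      tid: "UA-XXXXXXX-1",                // placeholder property ID
      t: "pageview",
      cid: hit.cid,
      dp: hit.dp,
    });
    // fetch is global in Node 18+; older Node would need an HTTP client.
    await fetch("https://www.google-analytics.com/collect", {
      method: "POST",
      body,
    });
  }
}
```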
Honestly, I don't care if you use your own tracking solution on your own domain, as long as it's not passing the data to a third party.
I get that some of that data is genuinely useful in determining what parts of an app are popular and what is not. Even though I don't like being tracked for dumb shit like ads, it does have valid uses.
Most trackers are built by third parties. This is true for analytics tracking and for ad tracking. Few companies ought to track their own impressions for many, many reasons.
Why do you think it's mostly advertisers building CDNs? It's not because they really wanted to make the web faster. These are typically let through by ad blockers.
For many users (i.e. desktop) bandwidth is something you're throwing away if you're not using it. I'm not saying this is a good idea in all cases, but it might be in some.
Very interesting, it's always intriguing to see how much of a cat and mouse game this privacy stuff is. I'm always thinking that this needs an overhaul and slightly different approach altogether, sadly I can't produce any viable solutions.
With huge and complex issues like this, I don't think we have to find one solution but rather point in the right direction, and I'm not even sure we're doing that.
In my opinion the new approach should be: Let websites deliver content and let the user’s website reader interpret the markup / content description and style the page according to the user’s preferences. Websites shouldn’t be able to style and script themselves any longer.
The website should load more content when the user scrolled to the bottom? Let the website reader retrieve the content itself. The website wants to know the dimensions of the viewport to load the appropriately sized image or change the layout? Tough luck, this is none of the website’s business! Let the user’s website reader handle this.
This is where the talent, funds, and resources will go as ads and marketing are industries with lots of funding available. Even more tracking and of even more pervasive kind. We hate tracking while we bet our time and money on it. The web is cancelled, go back home everyone.
Is there a way to turn off CSS media queries in Firefox, or fake their conditions? Apart from the security issues, it's plain annoying when the page layout will change completely because a few pixels of window size are missing for the perfect experience.
This could easily be stopped by a change in browser behavior. If web browsers contacted every address specified with `url()` automatically on page load, without considering the conditions, this type of conditional request would be impossible.
Conceivably, you could solve it through a simple browser extension that looks through all of the page’s stylesheets and calls all URLs present in the CSS before the page is rendered.
In an ideal implementation, though, URLs dependent on “static”, non-identifiable conditions, such as an image with `display: none`, would be left alone.
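A rough sketch of such an extension's content script (hypothetical TypeScript; it only sees same-origin stylesheets and skips data: URLs):

```ts
// Hypothetical sketch of the "prefetch everything" countermeasure: pull every
// url(...) out of the stylesheets we can read and request it up front, so
// conditional loading no longer reveals anything about user behavior.
function prefetchAllCssUrls(): void {
  const urlPattern = /url\(\s*['"]?([^'")]+)['"]?\s*\)/g;
  for (const sheet of Array.from(document.styleSheets)) {
    let rules: CSSRuleList | null = null;
    try { rules = sheet.cssRules; } catch { /* cross-origin sheet, can't inspect */ }
    if (!rules) continue;
    const cssText = Array.from(rules).map(r => r.cssText).join("\n");
    for (const match of cssText.matchAll(urlPattern)) {
      const url = match[1];
      if (url.startsWith("data:")) continue;   // inline data, nothing to fetch
      // Fire the request up front. Relative URLs resolve against the document
      // here, which is a simplification (they really resolve against the sheet).
      new Image().src = url;
    }
  }
}

prefetchAllCssUrls();
```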
The obvious solution is to block the server-side pages that the CSS elements link to. This kind of tracking can be mitigated the same way any other kind of tracking is already handled by uBlock or uMatrix.
uBlock can't block manual tracking...just third-party scripts that do it.
Example:
You visit example-site.com
example-site.com is the PHP server that sends you the HTML. It is also the site that does the tracking. So when you click something, it sends that data to example-site.com, which can then forward the data to a third-party tracking service.
If you blocked those server-side pages or put them in your hosts file, then example-site.com itself would be completely blocked too.
Ultimately, if everyone uses ad blockers to block tracking scripts, the tracking can be moved to the back end. If you block the back end, you effectively block the website you are accessing in the first place.
Probably not everyone would be willing to create their own user tracking solutions, most websites use third party analytics, which can be handled by generalized rules. For those that do roll their own solutions, per-website block lists would be needed, but that's how site-specific adblocking already works. The lists are maintained by the community and updated very frequently.
* user and machine readable content (text with hyperlinks, pictures, audio, video, rest)
* universal app store (javascript, css, intents, permissions...)
Every user could consume or style content as they wish. If my IDE has a dark theme, I want all web pages to have a dark theme. Why do I need JavaScript to read news or browse pictures?
If a user wants to install an app from the app store, they should accept the software license and grant permissions to that application.
This is an interesting concept, but I'm not seeing anything that couldn't already be done with a properly set up website and server logging.
Things like "@supports (-webkit-appearance:none)" don't give you Chrome detection. They give you WebKit detection, which is a rather large subset of the whole. Plus, some of the other browsers have started supporting -webkit- prefixes.
At best, it would only give you outdated browser versions, unless you are going to create a huge set of rules checking certain properties against other properties. Plus it doesn't tell you which browser, only maybe which WebKit engine version, which tells you next to nothing.
”Interesting is, that this resource is only loaded when it is needed (for example when a link is clicked).”
The resource is retrieved using GET, so I wouldn’t think that behavior is required by the HTTP standard. If so, browsers could mitigate this kind of attack by pre-fetching these resources (even pre-fetching a fraction at random might already be enough).
It's not an "after" event; the "after" is for inserting a pseudo-element. That then sits inside the original link element and hits the tracking URL by trying to load a resource from it when the link is active.
Tracking seems to only really be done server-side. The CSS just dispatches requests with query-string params. Probably not an ideal production tracking solution, as it severely limits the data you can send back for better analytics.
How does 'check spelling as you type' work, via a dictionary that is previously downloaded, or is this an online service that leaks all/or some of your key presses?
0 - https://www.reddit.com/r/modnews/comments/66q4is/the_web_red...