Oh? So how do I use it to detect which box model to use in my CSS?
How do I use it to detect whether I can use gradients?
And if you're just thinking about JavaScript, I've run into a few text manipulation functions that always returned "not implemented". But hey, the sniffer says they're there!
edit: Oh, and there's still plenty of html fun, too. How does your browser support <video>? <thead>? <svg>? I'm guessing the answer starts with "javascript".
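For what it's worth, the element-based detection this alludes to is only a few lines. A minimal sketch of the trick libraries like Modernizr use (creating the element and checking for its interface, rather than trusting the sniffer):

```javascript
// Element-based feature detection: create the element and see
// whether the browser actually gave it the expected interface,
// instead of guessing from the User-Agent string.
function supportsVideo() {
  // A real <video> element exposes canPlayType(); a browser
  // without <video> support hands back an unknown element
  // with no such method.
  return !!document.createElement('video').canPlayType;
}

function supportsSvg() {
  return !!(document.createElementNS &&
            document.createElementNS('http://www.w3.org/2000/svg', 'svg')
                    .createSVGRect);
}
```

The point is that the test exercises the feature itself, so it can't report a function that "exists" but is unimplemented the way a UA sniff can.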
The user-agent has two implied pieces of functionality:
1) Describe the device that the agent is coming from (operating system)
2) Describe the capabilities of the agent (this browser, those plugins)
One of the things I loathe about the user agent header is the lack of reasonable maximum length, and the inconsistent way in which developers have overloaded the value. Parsing it is difficult (especially given that the length means there is a lot of scope for bad input).
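To illustrate the parsing pain: every modern browser's UA string claims to be several other browsers at once. A quick sketch with a typical Chrome-22-era string:

```javascript
// A typical Chrome UA string of this era -- note it also claims
// to be Mozilla, AppleWebKit ("like Gecko"), and Safari.
var ua = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.4 ' +
         '(KHTML, like Gecko) Chrome/22.0.1229.79 Safari/537.4';

// Naive "Product/Version" tokenizing finds four products in one
// string, and tells you nothing about which one to believe.
var products = ua.match(/[A-Za-z]+\/[\d.]+/g);
// -> ['Mozilla/5.0', 'AppleWebKit/537.4',
//     'Chrome/22.0.1229.79', 'Safari/537.4']
```

Any real parser has to layer heuristics on top of this, which is exactly the overloading problem described above.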
I would love to see user agent be a virtual header comprised of other headers.
The other headers would not be mandatory, but as most browsers would provide them you could reasonably use them in most cases.
These other headers may be things like:
os: Windows
os-version: 7
client: Gecko
client-version: 16
plugins: [{'flash':11}]
Basically... same info but more structure with known acceptable types for certain values.
Since headers take up uncompressed space, it would also be helpful if shorthand names were accepted: c-v for client-version, etc.
This is me thinking aloud, and perhaps it's an idea that has been thought of before and rejected... but by offering User-Agent as a virtual header comprised of all of the other headers, you maintain some backwards compatibility whilst providing something easier to parse, use and trust for developers.
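A rough sketch of the "virtual header" idea. To be clear, the header names here are the commenter's hypothetical ones, not any real standard:

```javascript
// Assemble a legacy-style User-Agent value from hypothetical
// structured headers (os, os-version, client, client-version).
// None of these are real standardized header names.
function buildUserAgent(headers) {
  var client        = headers['client'] || 'Unknown';
  var clientVersion = headers['client-version'] || '0';
  var os            = headers['os'] || 'Unknown';
  var osVersion     = headers['os-version'] || '';
  return client + '/' + clientVersion +
         ' (' + os + (osVersion ? ' ' + osVersion : '') + ')';
}

// buildUserAgent({ os: 'Windows', 'os-version': '7',
//                  client: 'Gecko', 'client-version': '16' })
//   -> "Gecko/16 (Windows 7)"
```

Old sites keep reading the synthesized User-Agent; new code reads the structured fields directly and skips the string parsing entirely.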
The problem with "fixing" the user-agent string is that making it easier to parse and use only means web developers will find it easier to continue abusing it.
In an ideal world, web developers would test whether individual pieces of functionality exist rather than inferring what is supported from the browser.
I think JS does a fairly good job of letting developers test for functionality; unfortunately, CSS does not. I'm well aware that it's meant to "fail gracefully", but many developers want to supply alternative looks where functionality isn't available, and CSS doesn't lend itself to that.
So you wind up inferring CSS support from JS support which is just as broken as inferring JS support from the browser's version/name/platform.
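The JS-probing-CSS workaround in question looks roughly like this (the classic Modernizr-style probe; property names are the camelCased DOM style names):

```javascript
// Inferring CSS support from JS: set a declaration on a detached
// element and check whether the engine kept it. Engines silently
// drop properties and values they don't understand.
function cssSupports(camelCaseProp, value) {
  var style = document.createElement('div').style;
  if (!(camelCaseProp in style)) return false; // property unknown
  style[camelCaseProp] = value;
  return style[camelCaseProp] !== '';          // value rejected -> ''
}

// e.g. cssSupports('backgroundImage', 'linear-gradient(red, blue)')
```

It works, but it is detecting the CSS engine through the JS engine's view of it, which is exactly the indirection being complained about here.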
The user agent string is too loaded with backwards compatibility to remove or change. So the next best thing to do is supersede it - add a new agent-id or some such which is mandated by standard to be in the form "BrowserName/Version", e.g. "Chrome/22.0" or "Firefox/15.0.1", while keeping the old user-agent. Problem is I guess it's not really worth it - it doesn't expose any new information not already in the user agent, and it doesn't stop site authors relying on specific agent-ids. So I guess the way forward is try to ignore the user agent completely and just use feature detection.
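Parsing such a strictly mandated format really would be trivial. A sketch, with "agent-id" being the hypothetical header name proposed above:

```javascript
// Parse a hypothetical "agent-id" header mandated to be exactly
// "BrowserName/Version" -- one anchored pattern, no heuristics.
function parseAgentId(value) {
  var m = /^([A-Za-z][\w ]*)\/(\d+(?:\.\d+)*)$/.exec(value);
  return m ? { name: m[1], version: m[2] } : null;
}

// parseAgentId('Firefox/15.0.1')
//   -> { name: 'Firefox', version: '15.0.1' }
```

Note that a legacy User-Agent string deliberately fails this parse, which is the point: the strict field carries the same information without the accumulated cruft.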
I enjoy sites which provide me with the correct download link for software, based on the fact Firefox places "Linux x86_64" in my User-Agent.
Having said that, the flip side is just as bad: getting completely rejected from sites because "We haven't tested this site for your browser or operating system". It's a website, for crying out loud. I don't get too mad at sites which implement this as long as they provide me a way to continue "at my own risk". However, it's the final straw when they flat out refuse to serve me anything other than a page telling me they haven't tested the website for my browser/OS combination... sigh
Edit: This might give some food for thought for AshleysBrain. I like your suggestion, but am curious if we can find a way to send OS and architecture to sites so that they can give me nice download links...
I wonder why browsers with a modern automatic-update process don't set their user-agent to something that discards all this madness ("Chrome/23.4.5678 (Windows)", or similar) for the cutting-edge/nightly builds only (or even betas, if they wanted to discourage casual users from switching to them, but I don't think that's the case at this point). Surely their users have signed up for a little breakage in exchange for the latest features? And if they actually get website operators to stop or at least fix their sniffing, the whole prisoners-dilemma situation would disappear.
(I guess this assumes that the huge user-agent that my Chrome is currently sending is necessarily bad, and in the real world maybe no one really cares...)
When we actually do this, it does not necessarily convince publishers to fix things. For example: for several months Mozilla has been testing one tiny reduction to the User-Agent string in Firefox Nightly builds (replacing "Gecko/20100101" with "Gecko/16.0"). Zillow.com is the highest-profile site that is broken by this change, and after five months they still haven't even responded to any of our attempts to contact them: http://bugzil.la/754680
It's much better to resist adding things to the UA in the first place, since removing anything later on is a huge pain and inevitably breaks things for users. Mozilla has managed to keep the UA relatively minimal (and successfully reduced it a bit in Firefox 4): https://developer.mozilla.org/en/Gecko_user_agent_string_ref...
You're implying that if nightly builds of a browser with a simplified UA broke a website that the website owners would fix their code, but that is unlikely to happen. Most websites, particularly the sort with bad UA sniffing, have a high cost to change (engineering, QA, making releases) and no incentive ("it broke on the new Chrome, probably a Chrome bug").
Even a relatively flexible company like Google gets UA sniffing wrong for many of its domains. At one point (as an author of Chrome and an employee of Google) I tried to track down the right people to get things fixed and ran into more or less the above problems. (The non-Chrome non-Safari webkit browsers these days must spoof Chrome to not fall into some "other" browser bucket.)
Ah, the pain of working with the development process of websites. I still remember the Hotmail globalStorage fiasco that led to Firefox 13.0.1 putting it back temporarily:
https://bugzilla.mozilla.org/show_bug.cgi?id=736731
1. I think the most interesting thing about that blog post is that it illustrates how the incentives in standards building get warped. I like to describe this sort of thing as "the effect of economics on programming" - not because there is money involved, but because of the nature of the incentives.
2. Graceful degradation. We've sniffed UA's from the minute they were invented. Any change whatsoever would create untold problems for untold millions of people. The UA is just an arbitrary string so… who cares? Very few people (you and I are amongst these "very few") have to be concerned with this compared to the people such a change would affect.
It's because of 1 and 2 (my second point is really an instance of the first) that we're stuck with Javascript. No one in their right mind thinks it's a good language, but getting all the different browser vendors to adopt a good bytecode would be nightmarish (and not necessarily in the interest of every browser vendor).
Imagine the problems that would cause! You have Chrome 58 Beta, and stuff works one way. Then they say it's good and release Chrome 58 final, and all of a sudden, stuff changes all over the web.
The UA string is just one example of the unfortunate hacks that evolved in the web's protocols. Compared to everything else in HTML, it's probably not even worth considering fixing. We'll always need the old string for compatibility, so a fix would really only save a few lines of parsing. Compared to the nightmare of parsing rules for HTTP and HTML, it's not even relevant.
Not only the user agent, either. Try `navigator.appName` in any browser, and you'll get "Netscape". `navigator.appCodeName` in most browsers returns "Mozilla".
Mike Taylor gave a talk about this and more at yesterday's GothamJS conference:
I wanted to try making an HTTP request from Telnet the other day. I tried Wikipedia, using the Host header. I got a 403 for not including a user agent, so I tried again with User-Agent: Telnet and it worked!
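The raw exchange is short enough to type by hand; a sketch of a request along those lines (the article path here is just illustrative, and the blank line terminates the headers):

```http
GET /wiki/Main_Page HTTP/1.1
Host: en.wikipedia.org
User-Agent: Telnet
Connection: close

```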
It's one of the most important headers for clients, since if you don't include it you might not get a 200.
In the particular case of Wikipedia, I think they check User-Agent to prevent people from unthinkingly wasting gigabytes of bandwidth scraping Wikipedia via tools like wget. In Wikipedia's case, better ways exist to download large quantities of their content in a more usable form.
They may do that (though requesting a single article works fine), but it's not very smart. Throttling heavy users - possibly returning 429 with a link to the download pages - would make much more sense. It's not like wget users can't change their UA.
I return a 403 if User-Agent or Host headers are missing. And my firewall will lock you out completely if you use "User-agent" instead of "User-Agent" (among many other obvious giveaways in the User-Agent header).
I block anything that looks like penetration testing or content scraping if there's no chance of false positives. Even when there's no vulnerability present, it conserves resources on dynamically generated sites.
This ugly quagmire makes me wary of compatibility fixes where mimicking another browser is somehow involved. When I heard about non-WebKit browsers adopting -webkit CSS vendor prefixes, the user agent string mess was the first thing that came to mind.
The problem with the user agent is that you can't fix it without repeating the same cycle. All you'd do is make it easier.
It's a good depiction of the issues you have with trying to write code once, and have it work the same in many different environments, though. It's just with browsers, rather than operating systems or hardware.
And then JavaScript-driven feature detection came to be, and everyone thought it was a good idea. And the people wrung their hands and wept.