Yet another demonstration, if you needed it, of why blacklisting user input to avoid code injection is highly unlikely to succeed. There are many ways to skin a cat, and you only need to miss one.
You can also assume that the bad guys start with a list similar to this, plus tools to semi-randomly perturb their inputs until they find the right combo of hocus-pocus that gets past your filter and is then "neutralized" by your regular expression magic into functioning exploit code.
(n.b. I'm talking about more interesting attacker goals than forcing a reload, obviously.)
What do you mean precisely? Normally you prevent code injection by not allowing user-supplied strings to be interpreted as JavaScript in the first place, not by trying to catch all the possible ways in which one can construct a script doing something harmful. There aren't so many ways to inject a <script> tag or something equivalent into the page.
Off-topic: Have you been hanging around tptacek a lot lately? You have suddenly started giving out security advice. A good thing, but quite noticeably different. ;-)
I've worked with Thomas (not primarily on the security side of his house), consider him a friend, and go out to dinner with him every time we're in the same time zone. That has probably increased my interest in web application security, since I nurse secret dreams of being a web pentester. (He would probably say much the same about running a primarily product business, depending on the level of seriousness we had in the conversation.)
That said, every professional engineer should understand enough about security to carry out their duties to their customers. Accordingly, I've been peripherally interested in it for years. See, among many others:
True as that may be, this doesn't really demonstrate that— it's just 535 different ways to access `location`. A regex would trivially disable every one. Of the things on that list.
I'm not saying there's a trivial regex solution to XSS, I'm saying there's a trivial regex solution to disable every one of these so-called "ways" of reloading the page: s/[\.\[=]//g will do the trick.
I mean, s/location//g will do the trick too, but I was aiming a little more generic.
In case the import of this epic retort is lost on anybody: in the real world, you have to pick and code your defense and then the attacker, who gets essentially infinite time to observe the behavior of your defenses, gets to pick their attack string(s).
Do not rely on regular expressions or blacklists to sanitize code for you. It will not end well.
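A minimal sketch of why the two substitutions proposed above do not hold up once the attacker picks the input. The function names and sample payloads are made up for illustration, and the payloads are only handled as strings, never executed.

```javascript
// The two blacklist filters from the thread, written as JavaScript replaces.
function stripPunctuation(input) {
  // s/[\.\[=]//g : drop every '.', '[' and '='
  return input.replace(/[.\[=]/g, '');
}

function stripLocation(input) {
  // s/location//g : drop every literal "location" in a single pass
  return input.replace(/location/g, '');
}

// Case 1: the filter does break the plain spellings on the list.
const plain = stripPunctuation('window.location = window.location');

// Case 2: a payload whose source text contains no '.', '[' or '=' passes untouched.
// \x2e is the hex escape for '.', which only becomes a dot when the string is evaluated.
const escaped = 'eval("location\\x2ereload()")';
const escapedAfter = stripPunctuation(escaped);

// Case 3: single-pass removal reassembles the banned word from what is left over.
const nested = stripLocation('loclocationation');

console.log(plain);
console.log(escapedAfter === escaped);
console.log(nested);
```

Case 3 is the "neutralized into functioning exploit code" scenario: the filter itself produces the string it was meant to remove.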
...and if that were on this list of ways to reload a page in JavaScript that I'm talking about, it would be relevant here.
Look, just read what I'm saying. I swear I know what I'm talking about, and I swear I'm not saying you can reliably sanitize JavaScript with a regex. Really.
This must be like the FizzBuzz thing— programmers just can't resist a challenge, even if it is explicitly marked as not being a challenge.
See, there's another clever trick that actually does something interesting. Really, there's a whole class of toString solutions missing from this list... but I guess you have to stop at some point.
and things that mean exactly the same thing. As others here have pointed out, why no meta tags? Form submissions? Surely there's another way or two as well. The lack of creativity in this list is rather astonishing - if it were sorted, it'd be merely annoying because of the blatant repetition.
1) one of hundreds of equivalent syntaxes for the AST that assigns to location, and
2) location.reload()
So if we're going to play this game and these count as different, why not encode the identifiers in hex notation? Or replace the identifiers with lambda calls to compute them? Or eval()?
Seems like if these 535 ways count, so should the other infinity of them.
That's not completely true— location.replace() and location.assign() do slightly different things. But, yeah, this is basically 535 different ways to spell location = location.
And not even very interesting ones. window.window is window, so you can just go nuts with the chaining. So is window.self, window.top, window.frames... window.self.top.frames.location = frames.top.self.window.location? Now we're talkin'.
Yeah, not very exciting. Once you go down that road you pass by window['l' + 'ocation'] on the way to window['loc' + (''+![])[!![]+![]] + (''+!![])[-![]] + 'ion'] and so on. I would have completed it but it's 2:30 and I'm tired.
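For anyone decoding that expression, here is a small sketch that evaluates the pieces one at a time. The fakeWindow object is a hypothetical stand-in so the snippet runs outside a browser; in a page you would index the real window instead.

```javascript
// Each piece relies only on JavaScript's type coercion rules.
const a = ('' + ![])[!![] + ![]]; // '' + false -> "false"; true + false -> 1; "false"[1]
const t = ('' + !![])[-![]];      // '' + true -> "true"; -false -> -0, used as key "0"; "true"[0]

const key = 'loc' + a + t + 'ion';

// Stand-in for the browser's global object (hypothetical, not the real window).
const fakeWindow = { location: 'https://example.com/' };
fakeWindow.window = fakeWindow; // window.window is window
fakeWindow.self = fakeWindow;   // so is window.self

console.log(key);
console.log(fakeWindow.window.self.window[key] === fakeWindow.location);
```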
Interesting to see so many ways to do it, but what's missing is explanations of pros and cons of the different approaches (though I'm sure there probably aren't many differences). Why not just stick with one then?
No real difference, but on an older browser, you might shave a few milliseconds off by using "window.location" instead of just "location", as the latter will need to traverse from local scope up to the global (window) scope.
Is there any way which doesn't break the back navigation button (since upon pressing back the JavaScript statement will execute again, causing an immediate change of location and jumping forward again in the history)?
I thought about checking whether history.forward is empty as a condition for changing the location, but I don't think you're allowed to do that check...
Alternatives such as meta refreshes and non-JavaScript solutions are cool ways to solve this, but for POST calls it remains an issue...
You can use history.replaceState in newer browsers to modify the URL in the address bar without affecting the history. It doesn't actually load the new page, though.
There is a JavaScript function that lets you change the URL as the page content changes (as with infinite scroll), so that leaving for another URL and coming back lands you on the page you left (I think it is pushState()).
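A rough sketch of the difference between the two calls mentioned above. makeFakeHistory and its current property are hypothetical stand-ins that only model the entry list, so the snippet runs outside a browser; in a page the calls would be history.replaceState(null, '', url) and history.pushState(null, '', url).

```javascript
// Minimal stand-in for window.history. It models the list of entries only,
// not real navigation or popstate events.
function makeFakeHistory(startUrl) {
  const entries = [startUrl];
  let index = 0;
  return {
    pushState(state, unused, url) {
      entries.length = index + 1; // drop any forward entries
      entries.push(url);          // append a new entry
      index += 1;
    },
    replaceState(state, unused, url) {
      entries[index] = url;       // overwrite the current entry in place
    },
    get length() { return entries.length; },
    get current() { return entries[index]; }, // hypothetical helper, not on the real object
  };
}

const h = makeFakeHistory('/list');
h.replaceState(null, '', '/list?page=2'); // same entry, new URL
const afterReplace = h.length;
h.pushState(null, '', '/list?page=3');    // new entry, so Back returns to page=2
const afterPush = h.length;

console.log(afterReplace, afterPush, h.current);
```

Neither call loads anything, so whatever the page shows for the new URL has to be rendered by your own code.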
Several years back I was on a job where the boss wanted to allow virtually any user-provided HTML/CSS/JS in a content area, while preventing redirects.
No implementation ever happened of course, but my first thought was, what would happen if we did a "<script>delete(window.location);</script>" near the top of the template?
Answer: nothing. But what would the implications be if browsers allowed it?
Interesting. I did a few tests and it seems Chrome will let me override window.location with __defineSetter__() but not document.location, while Safari was the other way around. Firefox 3.x won't let me redefine either one ("TypeError: redeclaration of var location").
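To show the mechanism being tested there, here is a sketch that uses a plain object in place of window. fakeWin and blocked are made-up names, and, as the results above suggest, whether a browser permits this on the real window.location or document.location varies.

```javascript
// Plain object standing in for window (hypothetical).
const fakeWin = {};
const blocked = [];

// __defineSetter__ is the legacy API named above; Object.defineProperty
// with a `set` function is the modern equivalent.
fakeWin.__defineSetter__('location', function (value) {
  blocked.push(String(value)); // record the attempt instead of navigating
});

// Both spellings hit the same setter, because it hooks the property itself
// rather than any particular way of writing the assignment.
fakeWin.location = 'https://example.com/elsewhere';
fakeWin['loc' + 'ation'] = '/another';

console.log(blocked);
```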
File under: Unintended Consequences. Also, Postel's law could probably be tricked into admitting itself as an accomplice.
1. Browser based Javascript provides a location object for managing and accessing the current browser location, and this location object is available as a global variable.
2. Browser based Javascript also provides a special case in the interpreter/processor/etc., where setting the location object equal to itself will reload the current page. This is also true for certain properties of the location object (href)
3a. There is also a more conventional reload method on the location object which accepts either a location object or string href. Also, many of the "go to this URL methods" exposed to Javascript will interpret "go to the url I'm at" as a request to reload the page. Many of these methods will accept a location object, or a string representation of a URL as a parameter.
3b. location.href is a string representation of a URL
3. There are many ways to access global variables in Javascript. There are many ways to assign a value in javascript. There are many ways to call a method in javascript.
4. All of the above can be combined into lots (likely more than the 535) of ways to achieve the same thing.
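The combinatorics in point 4 can be sketched by generating statements as strings. The two lists are short, illustrative samples rather than the original list, and nothing here is executed as code.

```javascript
// Illustrative and far from exhaustive lists of equivalent spellings.
const receivers = ['location', 'window.location', 'self.location', 'document.location'];
const values = ['location', 'location.href', 'window.location', 'window.location.href'];

function combine(lhsList, rhsList) {
  const out = [];
  for (const lhs of lhsList) {
    for (const rhs of rhsList) {
      out.push(`${lhs} = ${rhs}`);      // assign to the location object
      out.push(`${lhs}.href = ${rhs}`); // assign to its href property
    }
  }
  return out;
}

const statements = combine(receivers, values);
console.log(statements.length); // 4 receivers * 4 values * 2 forms
console.log(statements.slice(0, 4));
```

Each extra alias or alternative syntax multiplies the total, which is how a list like this gets into the hundreds.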
If the original title begins with a number or number + gratuitous adjective, we'd appreciate it if you'd crop it. E.g. translate "10 Ways To Do X" to "How To Do X," and "14 Amazing Ys" to "Ys." Exception: when the number is meaningful, e.g. "The 5 Platonic Solids."
I think in this case the number is meaningful, because the point of the list isn't to show us X amount of ways to actually refresh a page with javascript, but rather to highlight the (slightly humorous) vast number of ways it can be done, due to the ambiguity and redundancy found in Javascript (and in the DOM).
I would have guessed that this number would be high, but definitely not in the 500's.