Example HTML output in a user's profile: Would you like to contact ${NAME}? Wher...

Tehnix · on July 22, 2016

But that script tag would be taken care of in the input sanitation step. You normally remove all hints of HTML tags on input sanitation, which renders output sanitation a moot point.

tetrep · on July 22, 2016

Unless your application has a static mapping of input -> output that never changes, you can't properly sanitize input for all potential output contexts. The string ';alert(1) is perfectly safe to drop in between HTML tags, but it can be very dangerous in JavaScript, though only if it lands inside a single-quoted string.

You can try to filter for anything that may be potentially dangerous, but that's going to make a very long list of invalid inputs and once again you're playing whack-a-mole, hoping you correctly sanitize your input for all potential output contexts (unless you go through and re-sanitize all your user data whenever you add a new output context, which is a bit absurd).

From a programming perspective, it's akin to a function not checking that the input it has received is valid (because the caller is always going to do that...).
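To make the ';alert(1) point concrete, here is a minimal Python sketch of the same string going into two different output contexts. The function names (`html_text`, `naive_js`, `js_string`) are made up for illustration, and the JavaScript encoder is one plausible approach built on the standard library, not a complete defence.

```python
import html
import json

user_input = "';alert(1)"  # the string from the comment above

def html_text(value: str) -> str:
    # Context: text between HTML tags. html.escape handles &, <, > and quotes.
    return "<p>" + html.escape(value, quote=True) + "</p>"

def naive_js(value: str) -> str:
    # Context: a single-quoted JavaScript string, with no output encoding.
    # The leading quote in the input closes the string early.
    return "var name = '" + value + "';"

def js_string(value: str) -> str:
    # Context: a JavaScript string literal, encoded at output time.
    # json.dumps produces a double-quoted literal with quotes and
    # backslashes escaped; <, > and & are additionally escaped so the
    # value cannot close a surrounding <script> block.
    literal = json.dumps(value)
    literal = (literal.replace("<", "\\u003c")
                      .replace(">", "\\u003e")
                      .replace("&", "\\u0026"))
    return "var name = " + literal + ";"

print(html_text(user_input))  # <p>&#x27;;alert(1)</p>
print(naive_js(user_input))   # var name = '';alert(1)';
print(js_string(user_input))  # var name = "';alert(1)";
```

Nothing you strip on input could know which of those three templates the value will end up in, which is why the encoding has to sit next to the output.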

nommm-nommm · on July 22, 2016

>>You normally remove all hints of HTML tags on input sanitation

Then what happens when you want to use that input in an Excel export? PDF export? CSV file? Text file? How about if you want to use it in an HTML attribute? In a URL? Export the database elsewhere? (Such as a credit card company reporting to the CSAs). You can't assume that your data will always end up between tags in an HTML page; sanitizing it for that one case mucks up your data. Data should be usable in many different ways, because it will be, and it should not be tied to HTML.
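As a rough sketch of what "not tied to HTML" looks like in practice, the Python below stores the value raw and encodes it separately for each destination. The helper names are hypothetical and the sample value is invented; only the standard-library calls (`html.escape`, `urllib.parse.quote`, `csv.writer`) are real.

```python
import csv
import html
import io
from urllib.parse import quote

raw = 'Tom & "Jerry" <admin>'  # stored exactly as the user typed it

def as_html_text(value: str) -> str:
    # Between tags: only &, < and > need escaping.
    return html.escape(value, quote=False)

def as_html_attr(value: str) -> str:
    # Inside a quoted attribute: quotes must be escaped as well.
    return html.escape(value, quote=True)

def as_url_component(value: str) -> str:
    # Inside a URL path segment or query value: percent-encode everything
    # that is not an unreserved character.
    return quote(value, safe="")

def as_csv_row(values: list[str]) -> str:
    # In a CSV file: let the csv module do the quoting and quote-doubling.
    buf = io.StringIO()
    csv.writer(buf).writerow(values)
    return buf.getvalue()

print(as_html_text(raw))      # Tom &amp; "Jerry" &lt;admin&gt;
print(as_html_attr(raw))      # Tom &amp; &quot;Jerry&quot; &lt;admin&gt;
print(as_url_component(raw))  # Tom%20%26%20%22Jerry%22%20%3Cadmin%3E
```

Had the angle brackets and quotes been stripped on input, every one of these outputs would be working from damaged data.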

vollmond · on July 23, 2016

Ok, this is the comment that best explained it to me -- you want to sanitize (escape, etc., whatever) output because, even if you sanitize all HTML/CSS/JS on input, the user might have inserted malicious Excel scripts or PDF exploits, etc., that eventually do get executed in an output context.
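For the Excel case specifically, a value with no HTML in it at all can still be interpreted as a formula when a spreadsheet opens the export. A commonly recommended mitigation is to prefix such cells with a single quote at export time. The sketch below assumes that approach; `neutralize_cell` and the trigger list are illustrative, and this is not a complete defence against every spreadsheet quirk.

```python
import csv
import io

# Leading characters that spreadsheet applications may treat as the
# start of a formula. Assumed list for illustration.
FORMULA_TRIGGERS = ("=", "+", "-", "@", "\t", "\r")

def neutralize_cell(value: str) -> str:
    # Prefix with a single quote so the cell is treated as plain text.
    if value.startswith(FORMULA_TRIGGERS):
        return "'" + value
    return value

def export_csv(rows: list[list[str]]) -> str:
    # Apply the spreadsheet-specific encoding only here, at output time.
    buf = io.StringIO()
    writer = csv.writer(buf)
    for row in rows:
        writer.writerow([neutralize_cell(cell) for cell in row])
    return buf.getvalue()

rows = [["Alice", "=HYPERLINK(\"http://example.com\")"]]
print(export_csv(rows))
```

Note the trade-off: a legitimate value like -5 also gets the prefix, which is one more reason to apply this only in the spreadsheet export and leave the stored data alone.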