Is it an XSS if you can only make it output code to your own browser? I can already execute whatever JavaScript I want in the console, so what's the advantage of having the server deliver that code to (only) me?
I can definitely see the issue if the server saves the unfiltered input and tries to print that out for other uses, but it seems to me that outputting a raw $_GET variable will only go to the requester and therefore could only run unfiltered code on that requesters machine.
EDIT: Answering my own question:
This is a security hole because the unsafe JavaScript is stored in the URL for the page (it is a GET parameter). This bad URL could then be sent as a link in a spam email or similar. Victims clicking on the link would then see a page that comes from the legitimate source, but is running unsafe code compromising that user's session. This sort of attack relies on the attacker distributing the link with the bad code as a URL parameter, and is not a vulnerability that a user could encounter when just visiting that site as I had first assumed.
That's _exactly_ what XSS is about. One possible way to exploit things like this is if I send you a link to a website, that embeds the target page through an iframe with javascript output injected. I could then have the JS steal your cookies/session or worse.
Or bit.ly/shortener links - you got it. I used to think like you did - what harm is there in 'exploiting' my own browser? Took a while for the penny to drop in my case.
Yes. They are referred to as "reflective" or "non-persistent" XSS vulnerabilities. The attacker might exploit these using for example an "invisible" iframe on a different website, and thus loading the vulnerable website, in the background, with the desired parameters.
This will result in the malicious javascript being executed "on the vuln. website", in the victims browser.
Heh, cute. This means Github could probably do some automated means of informing these people that their code is insecure and would be a danger to themselves and their users. I'm not sure if they should, but it's interesting that they could.
This is a potentially great idea. You could make your build process include submitting your code to a search engine like this (perhaps in some obfuscated manner) and making illegal patterns fail if not manually "approved". Just because the halting problem exists doesn't mean there's not a low hanging fruit in approaching it.
Tons of code review tools already exist out there, runnable on your CI machine. Sonar and Fortify are a few off the top of my head, but there are many, many more.
Nothing new here. Google code search a while back was similar. Also search for index.php~, a great way to see the source for an index page with a unix temp file that Apache will forward as plain text. Oops.
Not sure if this search would be good, but your responses triggers the id of a distributed code review like reCAPTHA. Possibly a pattern search brings up suspicious code and then there is distributed review and once enough people flag it as SQL injection, it could be flagged on Github and someone could submit a patch, etc. Clearly not bulletproof, but this would provide a way of doing code review on projects you might have no other interest in helping.
This isn't a search for SQL injection, its a search for a couple things that you often find in older PHP code that is generally hacked together and likely to have SQL injection vulnerabilities for historical and cultural reasons. However it's perfectly easy to avoid SQL injection even using these things.
$id = mysql_real_escape_string($_GET['id']);
$res = mysql_query("SELECT foo FROM bar WHERE id='$id'");
That may be ugly, but it's bulletproof regarding injection.
"mysql_real_escape_string" is the silliest function name ever. I assume there is a "mysql_escape_string" function that doesn't do what you expect it to do?
PHP is a simplistic wrapper around the C API's, so you get all of the legacy of C without any of its performance and memory size benefits.
PHP has some of the insanest defaults due to its C heritage. String functions deal with bytes not characters, so cannot be used safely with utf8 without setting the mbstring.func_overload setting to replace them with unicode-aware versions (except for str_pad, which always deals in bytes). Sort() defaults to binary sorting, and cannot be tricked in any way to sort utf8 in dictionary order if you're running your server on windows (and even on linux it requires an extra parameter on every call). Natsort(), which is supposed to sort like a human would, cannot be made to sort in dictionary order at all. The proper way to sort is by using the Collator class, which is not referenced from the sort() documentation, didn't exist before PHP 5.3, and is in the optional intl extension which is usually disabled by default.
Still better than mysql though, which has a very unique interpretation of unicode collation.
Surprisingly, no one mentioned that mysql_real_escape_string still fails to properly escape a string when the character set has been changed using a SET NAMES or SET CHARACTER SET query. Never mind the big yellow warning sign in the manual.
The difference between the two is _real_ reacts based on the current connection's character set, and the other doesn't. Yet another reason to jump on the PDO bandwagon is to avoid having to know that subtle difference.
Currently, the use of PDO is preferred and anything involving the mysql libraries should be avoided, and support for them is being deprecated in PHP anyway.
basically the argument appears to boil down to mixed character sets causing escaping not to act as predicted. I can't speak to the validity of it though.
Support for the mysql_ library is being deprecated. Support for the mysqli library is alive and well (although I personally think people should adopt PDO anyway.)
It's not. The preferred method of preventing SQL injections is via prepared statements. mysql_real_escape_string is only suitable for strings (as the name implies). Something like
SELECT * FROM table WHERE id=$_GET['field']
where $_GET['field'] has been passed through mysql_real_escape_string is still vulnerable. Using prepared statements forces php to send data to the DBMS in such a way that it cannot confuse user input from the actual SQL. This is due to the fact that preparing data forces you to give types to the data before you use it in a query. Escaping input (such as with mysql_real_escape_string) makes this confusion still possible.
Most of the search results show exploitable code. Regardless, this is a search for SQL injection vulnerability, even if not every search result falls under that category.
use PDO;
use PDOException;
/**
* Used for interacting with the database. Usage:
* <pre>
* $db = Database::get();
* $db->call( ... );
* </pre>
*/
class Database extends Obj {
private static $instance;
private $dataStore;
/**
* Sets the connection that this class uses for database transactions.
*/
public function __construct() {
global $dbhost;
global $dbname;
global $dbuser;
global $dbpass;
try {
$this->setDataStore(
new PDO( "pgsql:dbname=$dbname;host=$dbhost", $dbuser, $dbpass ) );
}
catch( PDOException $ex ) {
$this->log( $ex->getMessage() );
}
}
/**
* Returns the singleton database instance.
*/
public function get() {
if( self::$instance === null ) {
self::$instance = new Database();
}
return self::$instance;
}
/**
* Call a database function and return the results. If there are
* multiple columns to return, then the value for $params must contain
* a comma; otherwise, without a comma, the value for $params is used
* as the return column name. For example:
*
*- SELECT $params FROM $proc( ?, ? ); -- with comma
*- SELECT $proc( ?, ? ) AS $params; -- without comma
*- SELECT $proc( ?, ? ); -- empty
*
* @param $proc Name of the function or stored procedure to call.
* @param $params Name of parameters to use as return columns.
*/
public function call( $proc, $params = "" ) {
$args = array();
$count = 0;
$placeholders = "";
// Key is zero-based (e.g., $proc = 0, $params = 1).
foreach( func_get_args() as $key => $parameter ) {
// Skip the $proc and $params arguments to this method.
if( $key < 2 ) continue;
$count++;
$placeholders = empty( $placeholders ) ? "?" : "$placeholders,?";
array_push( $args, $parameter );
}
$sql = "";
if( empty( $params ) ) {
// If there are no parameters, then just make a call.
$sql = "SELECT recipe.$proc( $placeholders )";
}
else if( strpos( $params, "," ) !== false ) {
// If there is a comma, select the column names.
$sql = "SELECT $params FROM recipe.$proc( $placeholders )";
}
else {
// Otherwise, select the result into the given column name.
$sql = "SELECT recipe.$proc( $placeholders ) AS $params";
}
$statement = $this->getDataStore()->prepare( $sql );
//$this->log( "SQL: $sql" );
for( $i = 1; $i <= $count; $i++ ) {
//$this->log( "Bind " . $i . " to " . $args[$i - 1] );
$statement->bindParam( $i, $args[$i - 1] );
}
$statement->execute();
$result = $statement->fetchAll();
$this->decodeArray( $result );
return $result;
}
/**
* Converts an array of numbers into an array suitable for usage with
* PostgreSQL.
*
* @param $array An array of integers.
*/
public function arrayToString( $array ) {
return "{" . implode( ",", $array ) . "}";
}
/**
* Recursive method to decode a UTF8-encoded array.
*
* @param $array - The array to decode.
* @param $key - Name of the function to call.
*/
private function decodeArray( &$array ) {
if( is_array( $array ) ) {
array_map( array( $this, "decodeArray" ), $array );
}
else {
$array = utf8_decode( $array );
}
}
private function getDataStore() {
return $this->dataStore;
}
private function setDataStore( $dataStore ) {
$this->dataStore = $dataStore;
}
}
I don't know much about PHP but I happened to rewrite some old forms a couple of years ago. The original author had relied on a technique called "magic quotes" (http://php.net/manual/en/security.magicquotes.php) which automatically sanitized user input. When we upgraded our version of PHP "magic quotes" had been deprecated and dropped.
It would be interesting to know if some of these developers are relying on "magic quotes" or something similar... and also to know how large share of the total number of projects these projects represent.
There's more low-hanging fruit, if you're willing to use more specialized searches. For example, guess what mode of operation the PyCrypto library uses by default for all its block ciphers if you don't explicitly pick a sane one:
Looking around I found a simple CMS sold to small online stores. Through their links you can find a listing of their customers (people who use their CMS). Problem is the CMS is open to SQL injection everywhere. If a script kiddie found this info they could take down a lot of online stores. Not good.
That's an example of hazardously bad programming practices. You're one mistake away from complete disaster. You should be sure that it takes more than one mistake to expose you to that sort of risk.
Casting to int is not a general purpose escaping system, and further, if you miss even one of these your entire application can be trashed.
Using mysql_query at all is a sign there's something severely wrong with your application.
It's a sign that the developer has had a fairly informal introduction to SQL. Personally I see no reason not to use the PDO extension in any situation where you would use mysql.
Yes, and given that (IIRC) $_GET always returns strings anyway, casting to int before adding to a database makes sense. Even so, you should still attempt to validate the string with ctype_digit() and make sure it accurately represents the integer you expect. If you just cast directly to an int, you can't really predict the results.
And also, doing the cast inside the sql statement makes it difficult to see. It should be done, if at all, outside the query where it's obvious to anyone looking at the code that this is something that should be paid attention to.
This is an actual, serious problem which I would note in a code review, as opposed to the int thing which cannot, as far as I can can tell, ever lead to an exploit or malfunction which would have been avoided by using a named escape function in this code. If anyone can think of a specific example to prove this wrong, please say so.
It still doesn't check if a particular id belongs to the user, so you can delete all the items in the table. But I agree that's different kind of problem :)
The search obviously doesn't find all cases, but is a good start.
While there's nothing technically wrong with the example given, I might argue that since that won't work in all cases, it might be better to enforce a more rigorous policy of SQL query cleansing, or using bound params. Although this example is so simple I might not.
Then again, the fact that $_GET is even available at the location the query is taking place means this is most likely a type of design that I abhor, that PHP makes easy. Put actions in functions or methods, and then call them.
Pretty much every language will make getting direct user input then passing it to a database easy. What generally makes this less easy (or at least less intuitive) is a framework. Don't compare the likes of Rails or Django to PHP. Compare Laravel4 or Symfony2.
That doesn't mean PHP doesn't deserve some stick, it does, but most of it's current reported problems spawn from backwards compatibility. Nobody should be using mysql_*, they should be using prepared statements via PDO.
They could solve this by deleteing all the functions you're not supposed to use, but a whole bunch of legacy PHP would stop working. I'm fairly sure this would illicit more hate than the current method of slowly deprecating.
What I meant is that $_GET, $_POST and $_REQUEST aren't available in functions unless you declare them global. The fact that $_GET is being used in the query means either that the query is being performed in the main of that script (probably lots of small php files meant to be called by the browser), or that they are in a function/method and then pull in the global $_GET array.
IMHO, neither are good design choices, as one couples the program to HTTP too tightly (and not in a sane way), the other leads to crazy spaghetti PHP as the control flow can be affected by global variables set (or received from the user) far from where they are used.
I would like to say i'm surprised, but I'm not. PHP makes this easier by not even supporting parameter binding in the older, original mysql binding, so it's more prevalent.
That said, I'm sure a slight tweak to the search would find a lot in other languages as well.
If you do it right, it won't. Many security audit companies are asked to do test audits, and I've seen security audit contracts starting with responsible disclosure of vulnerabilities. Of course, certain amount of trust is necessary so the sides have to behave in a way that is conductive to the establishment of the trust.
The person who checked in code isn't necessarily the person who wrote it in the first place.
When dealing with legacy code, especially code going back into the 1980s for example, it's not at all unusual for the code to have been stored in several different version control systems over that time. I know of projects that started with SCCS, moved to RCS, then to CVS, then to Perforce, then to Subversion, and most recently to Git.
If those transitions are done quickly, then somebody will often just take all of the code from a checkout of the old VCS, and check it into the new system.
The same can happen when initially using a VCS, after having not used one before, especially when taking a project over from a developer or even a team that was less-talented. Even if the new developer(s) are going to review the code, and fix bugs and other flaws, a version with the initial state of the code is often a useful thing to have.
I guess people would pay for a service that could identify 90% of all security issues with an online service by going through source code and available routes. Anything that is available today?
Tinfoil scans remotely for known vulnerabilities (i.e. WordPress version X has exploit Y). What you really want is source code analysis. WhitehatSec.com and Veracode.com are the two huge players in this space.
I believe the ext/mysql is deprecated, not the function names. Mysqlnd is a drop in replacement and unencumbered by the copyleft license issues that plague the original extension.
The mysql_* function API is part of ext/mysql and is deprecated — mysqlnd is a lower level library that basically replaces libmysqlclient in the stack (and is used from PDO and mysqli as well).
I'd be interested in a cite on the licensing issues, incidentally; I'm not aware of anything. (There _is_ an issue with the JSON extension at present, though, so it's not impossible.)
Without some real-world context for each project, we just can't say whether the code we're looking at represents a real security problem. I'm certain that github hosts just as many one-off, let-me-scratch-some-code-together-that-I'll-never-use-again type projects as it does hard core, production quality ones.
That said it's a neat technique for quickly auditing code. Someone should now write an automated tool for submitting security patches to all of these projects.