Google results for PHP tutorials contain SQL injection vulnerabilities

ChuckMcM · on July 25, 2021

Pretty much. The best way to insert supply chain exploits is to embed them in a stack exchange answer to a beginner's question.

This isn't new, we've always had programmers who programmed by "recipe" rather than first principles, and DRY paints that as a feature, but it underlies a lot of pain and cost over the years.

To give some context, I inherited some kernel code when I worked in the Systems Group at Sun Microsystems in the '80s that was written by a mathematician who had become a programmer because the money was in programming, not applied math. They had cut and pasted code they didn't understand in order to achieve the result they wanted out of the code they were "writing." When I inherited it I read through it and found a couple of dozen ways the code would panic the kernel[1]. Once I had fixed those obvious issues, it became clear that the original owner of the code didn't really understand what computation did. They had an idea, and mathematically they could show that it was correct, but they had literally no ability to express that algorithmically.

This is not a "new" problem but it is an important one that managers of software engineers need to watch for.

[1] At the time the only difference between "kernel" programmers and "application" programmers was that kernel programmers recognized that unsafe code crashed the whole system, not just the application. So they tended to be cultivated from paranoid programmers.

kadoban · on July 25, 2021

In PHP's case, Stack Exchange is not necessary to get SQL injected tutorials. The official docs for _years_, if not decades, included them. The docs for how you were supposed to do SQL were just full of the antipattern of building queries by string formatting and concatenation. I wouldn't be surprised if some dark corner of the docs still had those available.
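For anyone who hasn't seen the antipattern next to the fix, here is a minimal sketch using Python's standard-library sqlite3 module rather than the old PHP MySQL functions. The table, column names and input are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, is_admin INTEGER)")
conn.executemany(
    "INSERT INTO users (name, is_admin) VALUES (?, ?)",
    [("alice", 1), ("bob", 0)],
)

# Attacker-controlled input, as it might arrive from a form field.
user_input = "nobody' OR '1'='1"

# Antipattern: the input is spliced into the SQL text, so the quote inside it
# ends the string literal and the remainder is parsed as SQL.
unsafe_sql = "SELECT name FROM users WHERE name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Parameterised: the input travels separately from the SQL text and is only
# ever compared as a value.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated query matches every row because the `OR '1'='1'` ends up as part of the statement; the parameterised one matches nothing, since no user has that literal name.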

mike_d · on July 26, 2021

> The official docs for _years_, if not decades, included them. The docs for how you were supposed to do SQL were just full of the antipattern of building queries by string formatting and concatenation.

PHP directly exposed the libmysqlclient C library. Any language that provides the ability to send a raw SQL query (hint: almost all of them) has documentation you can copy and paste to introduce an injection vulnerability.

You'll find injection vulnerable examples in the MySQL docs themselves: https://dev.mysql.com/doc/c-api/8.0/en/mysql-real-query.html

nextaccountic · on July 26, 2021

> You'll find injection vulnerable examples in the MySQL docs themselves: https://dev.mysql.com/doc/c-api/8.0/en/mysql-real-query.html

I can't find those examples, is there something in the mysql docs that I can copy and paste and be instantly vulnerable?

iamgopal · on July 26, 2021

MySQL docs shouldn’t worry about injection.

iamgopal · on July 26, 2021

Why the downvote? Why does the database need to worry about wrong input that happens to be executable? That's an input (app) level problem. The database will do what it is asked to do.

anakaine · on July 26, 2021

Because people learn from and refer to the documentation. Examples therein should be presented in a manner that demonstrates safe execution.

mike_d · on July 26, 2021

So why should Perl, Go, PHP, or Rust?

iamgopal · on July 26, 2021

Because those languages are usually used at the application level, and the application does need to worry about wrong input. A database doesn't need to worry: if input with the correct security level asks it to delete the database, it should delete it, not question the input. At the application level, that is not acceptable. How much simpler can this get?

thallium205 · on July 26, 2021

Because not everything that uses MySQL is a web server?

cerved · on July 26, 2021

because it's serious if it's a web server?

mlang23 · on July 26, 2021

In defense of PHP, this is probably due to simplicity. PHP got many people started with server side web development, and docs were supposed to be "simple". The thinking back then probably was "if we make the example more complicated, we will lose people", which was probably even true for the target audience.

chriswarbo · on July 26, 2021

> In defense of PHP, this is probably due to simplicity

I don't consider this to be a good defence; in fact, I'd argue that makes PHP itself insecure.

I like to think of secure coding in terms of the 'path of least resistance' for a lazy/busy/inexperienced developer. If doing things securely makes life harder, things will be done insecurely.

We don't need to make the secure approach easier in absolute terms; we can just make the insecure approaches more painful. In this case: have database functions require arguments of an 'SQL' type, rather than strings; make it easy to write literal SQL values; make it easy to parameterise SQL values; make it as hard as possible to convert a string value to an SQL value, e.g. bury it in some deep namespace hierarchy, with a long and scary-sounding name, require a config value to be enabled (or even make it a compiler flag!), etc.

This way, the docs (plus stack overflow, blog posts, etc.) don't have to choose between showing a secure approach or showing the simplest approach; since they are the same!
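A rough sketch of that shape, in Python with sqlite3 only because it runs anywhere. Every name here (`SQL`, `query`, the long-winded escape hatch) is hypothetical, not an existing API. Note the limit: at runtime Python cannot tell a literal from a concatenated string, so nothing stops `SQL("..." + user_input)`. A static checker could enforce that with something like `typing.LiteralString` (Python 3.11+), which this sketch leaves out.

```python
import sqlite3

class SQL:
    """Query text plus bound values. Meant to be built from literals only."""
    def __init__(self, text, params=()):
        self.text = text
        self.params = tuple(params)

def dangerously_treat_untrusted_string_as_sql(text):
    """Deliberately long-winded escape hatch (hypothetical name)."""
    return SQL(text)

def query(conn, sql):
    # Plain strings are refused, so concatenated input fails loudly
    # instead of reaching the database.
    if not isinstance(sql, SQL):
        raise TypeError("query() expects an SQL object, got " + type(sql).__name__)
    return conn.execute(sql.text, sql.params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (foo TEXT, bar TEXT)")
conn.execute("INSERT INTO tbl VALUES ('f1', 'b1')")

rows = query(conn, SQL("SELECT foo FROM tbl WHERE bar = ?", ["b1"]))
```

Calling `query(conn, "SELECT ...")` with a bare string raises `TypeError`, which is the whole point: the lazy path and the safe path become the same path.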

michaelmior · on July 26, 2021

> make it easy to parameterise SQL values

It usually is easy to parameterize things like values used in WHERE clauses. It's often much harder to work with dynamic query conditions (e.g. optional filters on a particular column). I don't believe I've seen an approach that does this in a way that can provably provide any sort of safety guarantee.

chriswarbo · on July 26, 2021

Do you have an example? I can only imagine three scenarios:

- SQL logic. We should be able to write this as a literal query, parameterised as needed, e.g. (made up syntax)

    query(
      addParam(
        "MyParam",
        $myParam,
        SQL"SELECT foo FROM tbl WHERE @MyParam IS NULL OR bar = @MyParam"
      )
    )

 - PHP logic. This is just ordinary control flow, like anything else, e.g.

    query(
      ($myParam === null)
        ? SQL"SELECT foo FROM tbl"
        : addParam("MyParam", $myParam, SQL"SELECT foo FROM tbl WHERE bar = @MyParam")
    )

 - A mixture of SQL logic and PHP logic. This seems inherently unsafe to me, so it's not surprising that safety guarantees can't be proven. My point is that such things should be made difficult ("artificially", if needed), such that nobody would choose to go down that route when another option is available.

michaelmior · on July 26, 2021

I'm not quite sure what you're asking for an example of. But suppose you have a table tbl(foo, bar, baz). Depending on user input, you may want to query on any combination of foo, bar, and baz. With this and larger number of columns, it becomes impractical to have conditions for every combination.

One approach would be to construct a list of conditions as well as a list of parameters to be substituted. Shown below without any particular language syntax, but hopefully comprehensible.

    conditions = ("foo=?", "bar>?")
    parameters = (fooValue, barValue)

Then when building the SQL, you join the conditions together with AND and substitute in the parameters. This works in the sense that you are still prevented from injection. But it's rather messy. I suppose perhaps you can actually do what I'm discussing with some ORMs in a reasonably clean way. But my point is that most SQL interfaces make it easy to parameterize a single set of fixed values, but hard to do so for table and column names. Arguably this is a feature not a bug since you probably want to avoid such parameterization anyway. But having a safer way to do so would be nice.
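Spelled out as a sketch in Python with sqlite3, the conditions-plus-parameters approach could look like the following. The filter names and the `build_query` helper are made up for illustration. The key property is that user input only ever selects among fixed SQL fragments and never supplies SQL text itself.

```python
import sqlite3

def build_query(filters):
    """filters maps a known filter name to a value, or None to skip it."""
    # Fixed SQL fragments chosen by the programmer; user input never picks the text.
    fragments = {"foo": "foo = ?", "bar_min": "bar > ?", "baz": "baz = ?"}
    conditions, parameters = [], []
    for name, value in filters.items():
        if value is None:
            continue
        conditions.append(fragments[name])  # KeyError on unknown filter names
        parameters.append(value)
    sql = "SELECT foo, bar, baz FROM tbl"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, parameters

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (foo TEXT, bar INTEGER, baz TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)",
                 [("a", 1, "x"), ("a", 5, "y"), ("b", 9, "x")])

sql, params = build_query({"foo": "a", "bar_min": 2, "baz": None})
rows = conn.execute(sql, params).fetchall()
```

It is still string building, so it doesn't give the provable guarantee I was asking about; it only keeps the values out of the SQL text.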

mlang23 · on July 27, 2021

You are aware that PHP was made popular roughly 20 years ago, right? In any case, I agree with your theoretical approach, however, I think you are ignoring the reality of "coders" out there. Security is still a niche topic in CS. Most people still want to get their work done, not caring about security at all.

chriswarbo · on July 27, 2021

> Most people still want to get their work done, not caring about security at all.

That's exactly why tools and languages should be designed for security, so that their users don't have to care.

I also disagree about 'not caring about security at all': security is opposite to functionality, since it prevents things rather than enabling them. If developers truly didn't care about security, we would see far more use of 'eval' as a way to plug systems together. Instead, we see a huge amount of effort spent on defining data and interchange formats, parsing parameters, branching on their values, etc.

For example, any time an API/URL provides options like '?sortField=age&sortDirection=DESC', that indicates developers who do care about security. In contrast, we can make a far more flexible API by accepting arbitrary code instead, like '?postProcess=(x) => x.sortBy((elem) => elem.age).reverse'; and this is much easier to implement, since we could just send it to 'eval()'.

guskel · on July 26, 2021

>I don't consider this to be a good defence; in fact, I'd argue that makes PHP itself insecure.

Agree with you completely on that. PHP is, in fact, insecure by design.

pdimitar · on July 26, 2021

I don't see this as a defense, I am seeing it as a way to reach to more programmers, and basic security practices be damned in the process.

I have to wonder: what did these people have to gain if PHP got popular (which it did)? That's an egotistical way of popularizing your language.

tored · on July 26, 2021

Now do C.

pdimitar · on July 26, 2021

I know right. That's why I'm not doing C anymore, for 10+ years now.

tored · on July 26, 2021

Was Dennis Ritchie egotistical for popularizing C? He would have known all the security flaws in it, right?

pdimitar · on July 26, 2021

I strongly disagree with your analogy. By the time those PHP tutorials were written, SQL injection attacks were already known.

So not sure what you're trying to push here but I refuse to participate.

tored · on July 26, 2021

Still to this day you have use-after-free bugs in the Linux kernel, about 20% of all known bugs in the kernel. How is this not comparable?

pdimitar · on July 26, 2021

It is comparable, that's why I'm moving to Rust. Hence I said I don't want to participate simply because I am not defending C either. It played its role and it's time for it to be phased out.

NationalPark · on July 26, 2021

Which MySQL library did Ritchie wrap back in 1972?

tored · on July 26, 2021

https://news.ycombinator.com/item?id=27959145

NationalPark · on July 26, 2021

My bad, which version of Linux should Ritchie have been auditing in 1972?

Cthulhu_ · on July 26, 2021

I see what you mean, but at the same time, back then just getting a working PHP environment up and running was complicated enough - PHP, Apache, MySQL, and you had to get them all working together.

If it was about accessibility, they should have made an easy installer and even offered cheap hosting themselves I think.

As for SQL injection, were prepared statements even a thing back then? Either way they should never have allowed and normalized string concatenation to build up SQL queries.

piokoch · on July 26, 2021

"just getting a working PHP environment up and running was complicated enough"

Hah, that was one of the biggest strengths of the PHP stack: it was not complicated. On your MS Windows machine it was enough to install some WAMP/XAMPP etc. PHP/MySQL/Apache bundle, open an editor, put <? on the first line and start coding.

On production, typically some shared hosting (cheap! another advantage of said stack), this was already installed, so it was sufficient to FTP the files over and be done (one more advantage).

There was no other comparable stack in terms of simplicity and being able to do something quickly. I believe there is none today, only the PHP stack has matured, so there are frameworks, etc.

Yes, there were security concerns, but still much less compared to its server-side predecessor, CGI scripts (better known today as AWS lambdas or "serverless").

jimmaswell · on July 26, 2021

Configuring the "plugins" and everything to get PHP working on Apache on Linux can be complicated and annoying if you're not already familiar with the process.

tored · on July 26, 2021

There have existed multiple installer projects for PHP, Apache & MySQL on Windows for two decades.

Mysqli driver was released with PHP 5.0 in 2004, it has prepared statements.

zzzeek · on July 26, 2021

> As for SQL injection, were prepared statements even a thing back then?

wow.

yes, prepared statements have been a thing since there were relational databases.

but also, (server side) prepared statements are not required in order to use SQL with bound parameters. the binding can occur just as easily on the client side, and this is in fact quite common. the point is that the programmer is not manually deciding whether or not to escape a parameter on a query-by-query basis, the process is automated.

kstrauser · on July 26, 2021

> As for SQL injection, were prepared statements even a thing back then?

Yes. I was writing prepared statements in Perl before PHP3 was released.

prox · on July 26, 2021

When I started like this, I always followed up with a google query for securing what I did. Not the best way to learn, but as you say, it got you started.

mxd3 · on July 26, 2021

The fact that it got newbies started doesn't mean it was built for newbies. I think you're making too many assumptions here.

nine_k · on July 25, 2021

"Else you are not getting the authentic php4 experience!" /s

PHP has been in poor shape for many, many years. It only started shaping up in the last few years, and there is a large backlog to tackle, colossal if you include all the numerous tutorials and Q&As that have been getting copy-pasted since 2000.

mschuster91 · on July 25, 2021

The language never was the problem, not since PHP 5.1 at least (which introduced PDO) and that is 16 years ago.

The problem always was the ecosystem that took decades to update and the fact that Google's search is algorithm-ranked and not supposed to be curated by humans, which would have kicked out at least the most horribly insecure stuff.

acdha · on July 25, 2021

PDO was slower and, critically, tons of examples still either made prepared statements a side note or actively encouraged string concatenation, and it preserved the flaws of the previous interfaces like ignoring errors and warnings (silent is no longer the default in PHP 8).

I no longer consider myself part of the PHP community (started around 1999), in part due to the low priority reliability and security had. It was exhausting having to vet code so frequently because even experienced developers forgot all of the rakes in the grass.

jeltz · on July 25, 2021

I disagree. As someone who has used PDO a lot it still makes it unnecessarily hard to use parametrized queries compared to the libraries available in most other languages, even C (at least for PostgreSQL). The pgsql library for PHP is also pretty good, better than PDO.

tored · on July 26, 2021

You can do parametrized queries in two lines of code with PDO. I wouldn't classify that as "unnecessarily hard". Sure, you could cut that in half to one line of code, but for that you need to write your own wrapper function.

habibur · on July 26, 2021

But, you need to write the hard thing once. In a 10 line wrapper function and then call that function from everywhere.

sql_query($db,$sql,$params);

Problem solved.

I guess every PHP developer writes a bunch of these wrapper functions for common sql tasks before he starts his work.

porker · on July 26, 2021

Or installs Doctrine DBAL which handles it all for you.

At the cost of some performance, but the ease-of-use with a highly-tested and relied-upon library normally outweighs that.

Cthulhu_ · on July 26, 2021

Yup, make it work (securely!) before making it fast; you can't diagnose performance if your code isn't working as it should in the first place (and SQL injection safety is a non-functional requirement; you can't consider code working if there's that weakness). In practice, bad database design or access (like the n+1 problem) will weigh heavier than the overhead added by a library like Doctrine. Or, it should, if Doctrine adds THAT much overhead it's a problem.

arp242 · on July 26, 2021

This is the most PHP comment in this thread yet; "yeah, I know this very basic API is inadequate and broken, but you can just write a wrapper, everyone is writing the same wrappers anyway!"

Is PHP supposed to be a high-level language or what? Hell, I'd consider it a very flimsy excuse even in C (in most contexts anyway).

tored · on July 26, 2021

Already answered in this thread. There is nothing hard or wrong about the API. It is just lots of misunderstandings and commentators not looking things up before commenting.

  $stmt = $pdo->prepare('INSERT INTO user (email) VALUES(?)');
  $stmt->execute([$email]);

https://news.ycombinator.com/item?id=27954454

toast0 · on July 25, 2021

I mean, PDO wasn't great (in contrast to perl's DBI which was pretty good and offered parameterized queries even if the underlying interface didn't), so it made sense to prefer mysql_query. But mysql_query didn't do parameters, and mysqli did, but only if you ran a new enough version of MySQL.

I wouldn't call it a language failure per se, but a problem with the libraries that shipped with the language. That distinction may not make a difference.

Bad examples that stuck around don't help either, of course.

kadoban · on July 25, 2021

Long after PDO was a thing, the official docs still included the old insecure jank.

Is your contention that Google is to blame for indexing them?

sellyme · on July 25, 2021

> The docs for how you were supposed to do SQL were just full of the antipattern of building queries by string formatting and concatenation

This is still to this day the recommended way to construct a "WHERE foo IN (a,b,c,...)" query in PHP. It's insane that there's no way to pass an array of values into a parameterised query for that use case.

yourenotsmart · on July 26, 2021

If there's anything more annoying than the lingering bad examples of PHP code online, it's people lying about what PHP "recommends", like you do.

Common database drivers like PGSQL, MySQL, SQLite etc. don't accept arrays of values for parameterized queries. This is at C level, in their own client libraries, and their own communication protocols.

This means there's nothing specific to PHP about this problem. Many higher level libraries, including for PHP, abstract over this problem and do offer arrays by binding per query.

So where are statements like yours coming from? Probably just being eager to say something bad about PHP without checking your facts too much.

nh2 · on July 26, 2021

Maybe I'm misunderstanding you, but it looks like it is a PHP specific problem. Python does not have it.

Of course at C level parameterised queries ("prepared statements") do not accept arrays. It would not make sense for them to do.

This is because the placeholders can be substituted by values of different types.

In typed programming languages, arrays have elements of the same type.

For example in SQLite, you are supposed to call `sqlite3_bind_int()`, `sqlite3_bind_text()` [1], once for each query parameter.

In languages like PHP and Python, where arrays can carry values of different types, their wrappers around the SQLite C functions can do this function calling for each value in the array. In Python, that is easy, the default, and explained in the very beginning of the official standard library's sqlite documentation [2]:

It states at the very beginning that query construction by string construction is unsafe and must be avoided. It immediately provides an example of how to safely call a parameterised query with an array of values, using `execute()` and `executemany()`.

PHP's standard library simply does not seem to have such an `execute()` function that accepts an array [3], nor do the official docs seem to contain any prose that could explain how to use the library safely [4]. The only way you can find out is by reading user-contributed comments on some specific functions in the function reference.

So Python's standard library provides safe functions, and immediately instructs the user how to use them. PHP's does not. Unclear to me how one can conclude that this isn't a PHP specific problem.

[1]: https://www.sqlite.org/c3ref/bind_blob.html

[2]: https://docs.python.org/3/library/sqlite3.html

[3]: https://www.php.net/manual/en/sqlite3.prepare.php

[4]: https://www.php.net/manual/en/class.sqlite3.php

dreyfan · on July 26, 2021

You're misunderstanding. You cannot do something like the following:

   "WHERE status in (...?)" and then ->execute($status_array)

But you can pass an array of parameters just fine (to individually bound input placeholders). It depends on which API you're using but it's of the format:

    ->bind_param($types, ...$params); or ->execute($params);

ransom1538 · on July 26, 2021

"Maybe I'm misunderstanding you, but it looks like it is a PHP specific problem. Python does not have it."

You can concatenate a sql query just fine in Python any way you want. Adding a sql injection is just as easy in Python, PHP, or lisp. Thus this is a choice. Nothing to do with a language. Language bashing is gross and spreads lies. And yes, you can bind an array of params in PHP.

jve · on July 26, 2021

Actually if you use .NET EF Core 2 FromSql or ExecuteSqlCommand, you get parametrized queries from string interpolation for free: https://docs.microsoft.com/en-us/ef/core/what-is-new/ef-core...

yourenotsmart · on July 26, 2021

> Maybe I'm misunderstanding you, but it looks like it is a PHP specific problem. Python does not have it.

I don't know how carefully you read what I said, if you misunderstood that this is a native library limitation (C level) of the actual database clients, and that most of the popular PHP libraries can bind arrays.

sellyme · on July 26, 2021

> This means there's nothing specific to PHP about this problem.

I don't particularly care if languages that I am not using have the same problem. It's up to people who use those languages to make those criticisms.

> Many higher level libraries, including for PHP, abstract over this problem and do offer arrays by binding per query.

Feel free to link an in-built PHP library that supports this feature, that would be far more useful than just obliquely suggesting that one exists.

TrispusAttucks · on July 26, 2021

PDO + Prepared Statements

https://www.php.net/manual/en/pdo.prepared-statements.php

sellyme · on July 26, 2021

PDO explicitly does not support the functionality being discussed [1]:

> For example, you cannot bind multiple values to a single parameter in the IN() clause of an SQL statement.

You're required to roll your own implementation, something that thankfully isn't particularly difficult, but unfortunately also seems to be enough of a barrier that a lot of programmers don't bother.

[1]: https://www.php.net/manual/en/pdo.prepare.php

noduerme · on July 26, 2021

I'm pretty sure PDO doesn't support escaping WHERE/IN(...) because MySQL prepared statements don't support it. In theory, PDO could support it in emulation mode, but supporting something like that only in emulation would be an antipattern.

Similarly, PDO doesn't support dynamic table names. You're expected to roll your own precisely because it demands more situationally specific escaping than regular parameters would, e.g. testing a list of allowed tables.
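A minimal sketch of the allowed-tables check, in Python with sqlite3 purely for illustration. The table names and the `count_rows` helper are hypothetical, and a PHP version would follow the same shape.

```python
import sqlite3

ALLOWED_TABLES = {"users", "orders"}  # hypothetical table names

def count_rows(conn, table):
    # Identifiers cannot be bound as parameters, so the name is checked
    # against a fixed allowlist before it is placed into the SQL text.
    if table not in ALLOWED_TABLES:
        raise ValueError("unknown table: " + repr(table))
    return conn.execute('SELECT COUNT(*) FROM "' + table + '"').fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO users (id) VALUES (?)", [(1,), (2,)])

user_count = count_rows(conn, "users")
```

Anything not in the set, including `users; DROP TABLE users`, is rejected before a query is ever built.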

yourenotsmart · on July 26, 2021

> would be an antipattern

I just want to point out the above is identical to saying "is bad" without explaining why it's bad.

We can say it doesn't align with PDO's goal of being a barebones library that doesn't add non-native features on top. But that's not what it does. It both omits many native features on the various databases it supports, and has non-native features like hydrating objects, for some reason.

I'd say if a PHP library is at the same level as the C library that backs it, that's a failure of design goals. C is intentionally low-level (by modern standards) and PHP is supposed to be high-level and consumable by people with much less clue than C programmers.

It's unfortunate that you pretty much need a DBAL (like Doctrine DBAL) on top of a DBAL (PDO) to get some of the missing features. Like, say, escaping identifiers.

tored · on July 26, 2021

It is hard to design good APIs that can last for decades. Just look at Java, a language that is backed by large international corporations, but the Java standard library still sucks. PHP core is a much smaller community driven project.

Key to great APIs is battle tested APIs in real world projects, finetuned over years of experience. PHP has that in the much larger general PHP community and to access it you use composer.

We, the PHP community, must use all resources to compete with other languages, it is unrealistic that the core PHP team can implement a great API, for example we can look at the filter API, it works but it is not great. PHP core also has longer release cycles.

We need to push PHP developers to use composer more, maybe the PHP docs should state that.

TrispusAttucks · on July 26, 2021

To be fair prepared statements happen at the RDBMS. PDO is just a library to interface with that. It can't compile a query execution plan without knowing the table and columns involved. So dynamic tables or columns as variables doesn't make sense so you need to handle that outside the prepared statement as it doesn't belong there.

noduerme · on July 26, 2021

Just out of curiosity, do you have a preferred mysql library in nodejs? I wrote my own based on node-mysql that does server-side prepared statements, but it does them by compiling PREPARE and SET calls, not through a lower-level language. It works for my needs, but it's not really high-performance.

Sander_Marechal · on July 26, 2021

Doctrine DBAL: https://www.doctrine-project.org/projects/doctrine-dbal/en/l...

geon · on July 26, 2021

Are you just talking about prepared statements? Mysqli was released in 2004.

https://www.php.net/manual/en/mysqli.quickstart.prepared-sta...

pjungwir · on July 26, 2021

I see what you're saying, but maybe "array" is the wrong term. Indeed you can't say `WHERE foo IN (?)` and pass more than one parameter (or an array parameter) for the single `?`. That's a limitation of the databases, not the programming languages. But Postgres does have arrays, and you can pass them as parameters. In fact that's how you solve this problem. An equivalent to `foo IN (...)` is `foo = ANY (array[...])`. So parameterized that would be `foo = ANY (?)`. Instead of a bunch of parameters, you have one parameter of array type. The parens here signify a subquery, not a list of single-attribute tuples. In fact using ANY is more expressive, since you can say `ANY (array[])` (but you might need to add a cast), but you can't say `IN ()` (which is a syntax error).

Anyway your point is true that lots of languages' client libraries and ORMs implement sql "parameters" by string substitution. That's still better than having the programmer do it himself, but not as good as it could be.

Akronymus · on July 26, 2021

> Common database drivers like PGSQL, MySQL, SQLite etc. don't accept arrays of values for parameterized queries.

One workaround to that is passing the array as a string with some separator, deconstructing it into a temp table and then using that table as an array when it is part of a stored procedure.

matharmin · on July 26, 2021

Even better, you can pass it in a JSON array, and use the built-in JSON functionality to iterate through it. I started doing this in SQLite, and it made some queries a lot cleaner.
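Roughly like this, shown with Python's sqlite3 module. It assumes the SQLite build has the JSON functions available (`json_each`), and the table and data are made up.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO contacts VALUES (?, ?)",
                 [(1, "ann"), (2, "bo"), (3, "cy"), (4, "di")])

wanted_ids = [1, 3, 4]

# The whole list is bound as ONE text parameter; json_each() turns it back
# into rows on the SQLite side.
rows = conn.execute(
    "SELECT id, name FROM contacts "
    "WHERE id IN (SELECT value FROM json_each(?)) ORDER BY id",
    (json.dumps(wanted_ids),),
).fetchall()
```

The query text stays the same no matter how long the list is, and an empty list simply matches nothing.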

x0x0 · on July 26, 2021

postgres does. PQexecParams, which executes a sql statement with parameters, accepts oids that are array types.

fouc · on July 26, 2021

Probably good idea to use a framework or a database library in PHP. Laravel has Model::whereIn('foo', [a, b, c, ...])

billisonline · on July 26, 2021

I'm shocked no one else gave this answer earlier in the thread. If you're using PDO directly in 2021, you're absolutely doing it wrong. You don't need to use all of Laravel, or even all of Eloquent for that matter. If you don't want to depend on a framework or use an ORM, you can use "illuminate/database" (https://packagist.org/packages/illuminate/database) for a secure wrapper around PDO. No need to reinvent the wheel.

sellyme · on July 26, 2021

> If you're using PDO directly in 2021, you're absolutely doing it wrong.

This is somewhat the point. If using the language's standard libraries is "absolutely doing it wrong", that's an indictment of the language.

BoxOfRain · on July 26, 2021

>This is somewhat the point. If using the language's standard libraries is "absolutely doing it wrong", that's an indictment of the language.

Exactly, all languages have footguns but some have a lot more than others. You don't hear for example Java developers bitching about JDBC to anywhere close to the extent PHP developers bitch about the various common approaches to database connections.

billisonline · on July 26, 2021

> If using the language's standard libraries is "absolutely doing it wrong"

You are being deliberately obtuse. Other comments in this thread offer correct examples of using PDO to avoid SQL injection. I didn’t mean it was impossible to write safe database code using the standard library—obviously, PHP is a Turing-complete language, it can be done!—I just meant it’s awkward, and verbose, and developers are unlikely to do it consistently throughout an application. Hence this type of concern is best abstracted into a library.

To your point about “indicting a language,” most languages have footguns like this. The worst you can say about PHP is that the documentation should do more to discourage new users from working with PDO directly. (And I mean the official documentation—the language maintainers can’t be held responsible for the kind of unofficial tutorials the article complains about.) But regardless of what the official docs say, most PHP development today is done using frameworks like Laravel, Symfony, and Zend framework that do not suffer from SQL injection issues.

chipotle_coyote · on July 26, 2021

> It's insane that there's no way to pass an array of values into a parameterised query for that use case.

Maybe I'm misunderstanding you, but assuming $params is an array in the following code, isn't this passing an array into a parameterized query for that use case? (Edited to note this is literally an example from the PHP documentation, and not one of the squiffy comments.)

    $place_holders = implode(',', array_fill(0, count($params), '?'));
    $sth = $dbh->prepare("SELECT id, name FROM contacts WHERE id IN ($place_holders)");
    $sth->execute($params);

In Python, using MySQLdb, I believe this would be something like

    place_holders = ','.join(['%s'] * len(params))
    cursor.execute("DELETE FROM foo.bar WHERE baz IN (%s)" % place_holders,
                    tuple(params))

Which, while more succinct, seems to be functionally exactly the same thing. I don't see what PHP is doing that's "worse" here offhand.

One could argue, "Yes, but a Python programmer would use SQLAlchemy," which is probably true, but then you need to let the PHP programmer use Doctrine or Eloquent.

sellyme · on July 26, 2021

> isn't this passing an array into a parameterized query for that use case?

Yes, I definitely should have been more specific there - what I'm referring to is passing it as one parameter, instead of potentially dozens or hundreds. There's a lot of ways to do this safely, but none of them are elegant. In the example presented here I believe it's the case that you can't do the ->execute($params) and bind some parameters explicitly, so if you had something like " AND status = ? AND due_date < ?" at the end of your query you have to chuck those variables into the same nondescript array.

I prefer the looped bindParam() method for this reason, but that has its own challenges. Firstly, it requires some boilerplate (not a big deal, but no-one likes writing boilerplate), and more pressingly it still has the issue where it actually is each individual element of the array being parameterised, and spams the ever-loving crap out of any debug outputs.

Obviously all of these issues are way less concerning than SQL injection vulnerabilities, but life would be so much easier if you could just do $sth->bindParam(1, $params); on a single question mark, and have that show up logically in things like debugDumpParams(). Even if you had to use special syntax to indicate when a parameter is expected to be an array, that would be a huge improvement.

I'm sure there's technical reasons why this is more difficult to implement than it would initially seem, but I've seen enough string-concatenated queries on StackOverflow from people who just give up on getting the parameterisation to play nicely that I believe it's worth the effort to make doing things right as frictionless as possible.
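For what it's worth, the "one parameter that happens to be an array" behaviour can be approximated with a small wrapper. Below is a minimal sketch in Python with the standard sqlite3 module, not anything PDO ships. The helper name `expand_list_params` and the `contacts` table are made up for illustration, and it assumes the marker names never appear inside string literals in the SQL.

```python
import re
import sqlite3


def expand_list_params(sql, params):
    """Expand each list-valued named parameter into numbered markers.

    Hypothetical helper: ":ids" bound to [1, 6, 46] becomes
    ":ids_0, :ids_1, :ids_2"; scalar parameters pass through untouched.
    """
    flat = {}
    for name, value in params.items():
        if isinstance(value, (list, tuple)):
            if not value:
                raise ValueError(f"parameter {name!r} is an empty list")
            names = [f"{name}_{i}" for i in range(len(value))]
            markers = ", ".join(f":{n}" for n in names)
            # \b keeps ":id" from matching inside ":ids".
            sql = re.sub(rf":{re.escape(name)}\b", lambda _m: markers, sql)
            flat.update(zip(names, value))
        else:
            flat[name] = value
    return sql, flat


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [(1, "Ann", "open"), (6, "Bob", "closed"), (46, "Cy", "open")],
)

# The list and the scalar are both named, so neither ends up in a
# nondescript positional array.
sql, flat = expand_list_params(
    "SELECT id, name FROM contacts WHERE id IN (:ids) AND status = :status ORDER BY id",
    {"ids": [1, 6, 46], "status": "open"},
)
rows = conn.execute(sql, flat).fetchall()
```

Every element is still bound individually under the hood, so this only hides the boilerplate; it would not shorten the debug output.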

ipaddr · on July 26, 2021

How often are you passing dozens or hundreds of parameters to a single sql statement? Maybe there is a better way to structure things.

sellyme · on July 26, 2021

How often am I doing it? Not very. But when there's a WHERE foo IN(a,b,c,...) query that has an arbitrary list as input? Could be any number of parameters in there (although I think most SQL drivers start complaining in the early quadruple digits).
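If the list can get long enough to hit a driver's parameter limit, one common workaround is to run the query in batches. A rough sketch in Python with sqlite3 follows; the chunk size of 500 is an arbitrary assumption (the real ceiling depends on the driver and database), and `fetch_in_chunks` and the `users` table are hypothetical names.

```python
import sqlite3


def fetch_in_chunks(conn, ids, chunk_size=500):
    """Run the IN (...) query once per chunk and collect all rows."""
    rows = []
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        # Only question marks are concatenated; the values are bound.
        placeholders = ", ".join(["?"] * len(chunk))
        query = f"SELECT user_id, email FROM users WHERE user_id IN ({placeholders})"
        rows.extend(conn.execute(query, chunk).fetchall())
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(i, f"user{i}@example.com") for i in range(1, 2001)],
)

wanted = list(range(1, 1501))
rows = fetch_in_chunks(conn, wanted)  # three queries of 500 parameters each
```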

ec109685 · on July 26, 2021

Better is to concatenate question marks and then pass the params as an array.

tored · on July 26, 2021

Yes, something like this

  $user_ids = [1, 6, 46, 3, 17];
  $count = count($user_ids);
  $in = '(' . implode(', ', array_fill(0, $count, '?')) . ')';
  $sql = "SELECT user_id, email FROM user WHERE user_id IN {$in}";
  $stmt = $pdo->prepare($sql);
  $stmt->execute($user_ids);
  var_dump($stmt->fetchAll());

throwawayboise · on July 26, 2021

I realize this is just an example but if you're in control of all the values then there's nothing unsafe about concatenating them into a query. The problem comes with values that are submitted via a web form, API, or some other external, untrusted source.

sellyme · on July 26, 2021

Yep. Fortunately a lot of the use cases of IN() are with controlled values - often IDs obtained from a previous query - so string concatenation is safe (and a lot less hassle than the alternatives).

Unfortunately that gets people into the habit of using string concatenation, which is not a great habit to have.

wk_end · on July 26, 2021

This is true but it’s still better to default to the Safe Thing, as long as there isn’t a good reason not to. How long until those values you know you control carelessly get turned into values you don’t control somewhere along the way?
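To make that concrete, here is a small sketch (Python with sqlite3, made-up table and values) of what happens when a value that used to be "controlled" starts arriving from a request field instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com")],
)

# Hypothetical input: harmless while we generated it ourselves,
# hostile once it comes from a web form.
email = "nobody@example.com' OR '1'='1"

# Concatenation: the quote inside the value closes the string literal,
# and the rest is parsed as SQL.
unsafe_sql = "SELECT user_id FROM users WHERE email = '" + email + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Binding: the value is passed as data and never parsed as SQL.
safe_rows = conn.execute(
    "SELECT user_id FROM users WHERE email = ?", (email,)
).fetchall()
```

The concatenated query matches every row in the table, while the bound one matches none, because no user has that literal string as an email address.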

sumtechguy · on July 26, 2021

I used to mess with other devs by injecting bits of code from other sources (when blink in html worked it was one of my favorite ones). The correct way is to bind your params and do not trust that the data you got from some other system is 'OK'. What may be fine in one system could be an escape code in another.

I speculate that the reason this is such an issue is that the interface at the ODBC level is basically broken security-wise. It works 'OK' for getting/putting the data, but it has 2 modes of execution. One of those paths is not great for security; the other has a usage issue. 'Binding' can be a real pain as it takes at least 1 call per variable parameter, and then you have to manage the buffers correctly. So just building up the strings is an easy way to skip a lot of steps, and many take it. But that path leads to security vulnerabilities.

tored · on July 26, 2021

Not sure if I follow you completely, but in my example $user_ids could come from an external source because I’m only concatenating question marks (?), then binding $user_ids with execute, and that is safe. What I also always do before passing $user_ids to execute, if it comes from an external source, is validate that every one of them is an integer with filter_var.
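The validate-then-bind step might look something like the sketch below in Python. It is not a port of filter_var: `parse_ids` is a hypothetical helper, and it assumes ids are non-negative decimal integers, which is stricter than a general integer check.

```python
import re


def parse_ids(raw_values):
    """Return the values as ints, or raise ValueError on anything else."""
    ids = []
    for raw in raw_values:
        text = str(raw).strip()
        # Assumption: an id is one or more ASCII digits, nothing more.
        if not re.fullmatch(r"[0-9]+", text):
            raise ValueError(f"not an integer id: {raw!r}")
        ids.append(int(text))
    return ids


user_ids = parse_ids(["1", " 6", "46"])
```

The validated list is then bound through placeholders as usual; the check is an extra layer, not a replacement for binding.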

remram · on July 26, 2021

Can't you at least build a "IN (?, ?, ?)" string, if you're going to build a string dynamically?

sellyme · on July 26, 2021

Not only is it the case that you can do that, it's the case that you should do that. Which leads to the question of why it's not in-built functionality. Making secure code harder to write than insecure code is a great way to ensure that lots of people write insecure code.

adzm · on July 26, 2021

You end up writing code with 16 ? parameters and filling the empty ones with -1 or something ;)

kijin · on July 26, 2021

No, you use array_fill() and implode() to generate exactly as many placeholders as you need.

path411 · on July 26, 2021

What do you mean? You can Google php parameterize query and get at least 2 different methods of doing this. You shouldn't ever touch string concat in any language when doing queries.

sellyme · on July 26, 2021

> You can Google php parameterize query and get atleast 2 different methods of doing this

There's no way of doing this with a single parameter. You need to parameterise every single individual item in the IN clause to do it that way, which is a horrific solution when it's of a completely unknown length.

Still better than string concatenation in many cases, but that the language has no in-built way of doing it is one of the many reasons PHP code is so often vulnerable to injection attacks. There's so much friction to writing secure code.

hattmall · on July 26, 2021

That's what the docs should show because that's how it works. The docs should give you the streamlined barebones implementation. It's trivial to write your own function to use parameterized queries and add in all the type checking etc you need. It's only a few lines of code.

sellyme · on July 26, 2021

> It's trivial to write your own function to use parameterized queries and add in all the type checking etc you need. It's only a few lines of code.

It's fairly trivial to do this, but now you're potentially adding thousands of parameters per query in circumstances where the contents of the IN() are variable in count. This is not ideal for a number of reasons.

Additionally, a language should be designed such that the easiest possible way to do something is at least moderately secure. If you need to attach some boilerplate code on top of the standard libraries every single time you use them for it to be safe, then there is no reason for that boilerplate to not be in the standard libraries.

ssully · on July 26, 2021

My first job while in school was doing web development with a LAMP stack. I had zero PHP experience, so it was 100% learning on the job and my learning resources were basically the official docs, a PHP book (can't remember which) I got at a book store, and stackoverflow.

PHP has a very forgiving design; it makes it very easy to get any trash code up and running. It really is great for newbies to get their hands dirty. I look back fondly on that first job, but boy did I have to unlearn a lot of bad lessons from those days.

tyingq · on July 26, 2021

To be fair to PHP, a lot of other languages had bad examples around as well. Pretty much every language has a way to do string interpolation on untrusted input and pass it to a database.

flomo · on July 26, 2021

Java and C# 1.0 examples (mostly) did not have anything like this, because they shipped with a database layer (JDBC/ADO.NET) and not just a raw driver. PHP instead spent many years fucking around with hacks like addslashes() etc. before addressing the root issue.

tyingq · on July 26, 2021

Java ones were pretty easy for me to find...

  String insertQuery = "insert into student values('" + studentNo + "','" + studentName +"','"+ studentAddress + "','"+studentAge+"')"; 
  int result = statement.executeUpdate(insertQuery);

https://www.onlinetutorialspoint.com/jdbc/jdbc-insert-progra...

Result #5 on Google for: jdbc insert program example

And this one, #1 for "jdbc example variable where" on Google:

  String query = "select LastModified from CacheTable where " + " URL.equals(url)";

https://stackoverflow.com/questions/2608376/specifying-a-var...

kaba0 · on July 26, 2021

> Pretty much every language has a way to do string interpolation on untrusted input and pass it to to a database

Otherwise they would not be Turing complete.. but defaults/easiest route does matter very much in this case.

SilverRed · on July 26, 2021

And the default docs and tutorials. Look at Ruby on Rails: the approach every tutorial shows makes it impossible to get an SQL injection. Then, as you gain more skill, you will eventually find the function that lets you run a string query, but no guide or tutorial shows you this, so you are less likely to use it without understanding it.

codedokode · on July 26, 2021

A Ruby on Rails tutorial is not a Ruby tutorial. The ancestor comment talked about language tutorials, not framework tutorials.

kaba0 · on July 26, 2021

JDBC does have an option to pass parameters without doing string interpolation and I would not consider it a framework.

tyingq · on July 26, 2021

I don't mean going through hoops. String interpolation is easy in most languages. See https://news.ycombinator.com/item?id=27958651 for example.

nkozyra · on July 26, 2021

> I wouldn't be surprised if some dark corner of the docs still had those available.

It's not exactly a dark corner, but at least there's red boxes all over the place.

Back in the 5.x early days it had disclaimers but clearly not enough to discourage people from keeping unsafe code in place.

latchkey · on July 25, 2021

> The best way to insert supply chain exploits is to embed them in a stack exchange answer to a beginner's question.

I'd love to see a concrete example of this happening in this way!

The rest of your story just describes 'smart, but bad jr. programmer' and doesn't really discuss the exploit issue.

crazygringo · on July 26, 2021

> The best way to insert supply chain exploits is to embed them in a stack exchange answer to a beginner's question.

Do you have any actual evidence for that?

As Hanlon says, "never ascribe to malice that which is adequately explained by incompetence."

Incompetence explains this one fully for me.

eyelidlessness · on July 26, 2021

It doesn't take a malicious actor to make a honeypot, only to exploit it. If someone naively posts a widely used solution on a help site, they've done the job for the exploiter, who only needs to know what low hanging vulnerability fruit awaits them.

ericye16 · on July 26, 2021

"Programming from first principles" doesn't solve security problems, in fact security problems in this case come from being ignorant of best practices and industry experiences. You would _want_ programmers to use established techniques to avoid this problem. Of course developers should always understand what their code is doing though.

p1necone · on July 26, 2021

> DRY paints that as a feature

That's not what DRY should be. Good developers should understand at least a couple of levels of abstraction underneath what they're writing in order to produce sensible code.

The idea of abstraction is that you only have to spend the afternoon/day/week (depending on the complexity that's being abstracted over) learning how everything comes together once, as opposed to spending that time to grok a slightly different version of the same complex system every time you read/write something new.

ahmedalsudani · on July 26, 2021

Yeah; anyone who thinks DRY means copy the answer on SO is completely missing the point.

DRY should not introduce vulnerabilities. It avoids them by

1. reducing complexity and cognitive overhead

2. allowing you to fix your code in one place and propagate your fix throughout the codebase

p5a0u9l · on July 26, 2021

Ironic how in today's market the applied math dude could easily transition into some higher paying role like ML Engineer or Data Scientist. Not knocking those roles by any means, but the tech world seems ravenous for algorithms.

From my experience, algorithms are easy and the software engineering is hard.

What’s more, many, not all, but many of these algorithm scientists look down on their programmer counterparts, yet it’s the programmers who end up making the algorithm successful for the company.

treeman79 · on July 25, 2021

Thought the best way was to develop a basic and easy library. Then upload a malicious binary that doesn’t match the source.

There was a really good post on how to do it and evade detection.

raxxorrax · on July 26, 2021

I believe all programmers will resort to copying code without too much of a review at some point. Strict first principles would mean not to rely on mental work of predecessors, which is entirely unrealistic today. Maybe that could work in the 80s, but the amount of software layers today is astonishing.

I develop bare metal code for special µC, but I would never dream of building even a basic OS. This is work you spend a lifetime on. Of course, if I could just copy things from an established OS, things might look different.

That said, I don't think the first steps in SQL should guard against SQL injection. That is a topic for later and only hides the main learning target. You can understand SQL perfectly by first principles and still cause your first program to allow for such injection. But that should be a different lesson at first. Being able to identify it as a danger also relies on the experience of others.

tshaddox · on July 26, 2021

> This isn't new, we've always had programmers who programmed by "recipe" rather than first principles, and DRY paints that as a feature, but it underlies a lot of pain and cost over the years.

Something tells me that exploiters love the programmers who attempt to build user authentication systems from first principles.

dragonwriter · on July 26, 2021

> This isn't new, we've always had programmers who programmed by "recipe" rather than first principles, and DRY paints that as a feature

No, it doesn't. Programming by recipe rather than building the recipe into a reusable abstraction is the exact opposite of DRY.

simion314 · on July 25, 2021

The problem seems to me to be hiring newb developers without any mentorship or code review. Any non-newb dev will know that you ALWAYS have to sanitize strings for SQL, file names or whatever. Most SQL ORMs or libraries will let the developer run raw SQL, so you had better have some competent person writing the code or at least reviewing it.

bluedino · on July 26, 2021

Certain people just aren’t “real programmers”. I worked with a guy who would cut and paste some example and change a couple things, and respond with “I got this working”

Meanwhile I would say okay, what does it actually do, where’d you copy it from, what did you change and why...

At this point he would just get mad at me. I'm sorry, I don't want people cutting and pasting code they don't understand and sticking it in our codebase.

simion314 · on July 26, 2021

And the issue is the fucking search engine, I can't understand why when you search for JS,html,css documentation google sends me to outdated websites like w3c schools. I always have to force a search on MDN.

A newb can also copy paste bad Python code, mess up some ORM clause and delete your whole database.

But in this case of PHP and MySQL even the worst dev shop uses a framework or library, so this article is probably affecting almost nobody that matters.

bluedino · on July 26, 2021

>> I can't understand why when you search for JS,html,css documentation google sends me to outdated websites like w3c schools

Probably the same reason they still send people to sites like Experts Exchange

acdha · on July 25, 2021

> Any non-newb dev will know that you ALWAYS have to sanitize strings for SQL. file names or whatever.

This is the “No true Scotsman” fallacy of programming. There are many people with plenty of experience who haven’t yet learned that lesson memorably and at least an order of magnitude more who know it but don’t exhaustively trace every data flow through the system and assume something already handled validation or escaping, being correct all but a fatal few times. With unsafe defaults an attacker only has to find a mistake once — you have to find them all.

simion314 · on July 26, 2021

There are no unsafe defaults; all ORMs allow you to run raw SQL, so the same "newb dev" can do the same stupid thing if they find a SQL tutorial when googling "how to do X".

You need to teach developers to always escape strings. Even if we remove old and bad tutorials you still need to teach the devs about these issues, otherwise they will make the same mistake with file names, or with parameters to shell commands. It might mean having to read a book and not relying on Google, and soon on AI, to teach you to code or SQL.
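The shell-command case works the same way as the SQL one. A quick sketch with Python's standard library, using a made-up file name; nothing is actually executed here.

```python
import shlex

# Hypothetical untrusted input, e.g. an uploaded file's name.
filename = "report.txt; rm -rf ~"

# Concatenation: a shell would treat everything after ';' as a second command.
unsafe_cmd = "wc -l " + filename

# Quoting: shlex.quote() wraps the value so a POSIX shell sees one argument.
quoted_cmd = "wc -l " + shlex.quote(filename)

# Usually better: skip the shell and pass an argument list, for example
# subprocess.run(argv), so there is no command string to escape at all.
argv = ["wc", "-l", filename]
```

Passing the list form is the shell equivalent of a bound parameter: the value travels separately from the command.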

acdha · on July 26, 2021

There’s a difference between what you encourage and what’s technically possible. PHP code so commonly has bugs of this class because there’s a quarter century of tradition, examples, and tutorials normalizing the idea of taking request variables and passing them directly to other code. Contrast with something like the Django tutorials & docs where the examples pervasively use the ORM’s escaping & validation and the extension points describing how to use custom SQL tell you to use placeholders and emphasize why it’s important.

That might not seem like a lot at first thought but it lowers the bug frequency enormously in the code I’ve looked at because you only need extreme caution in rare cases rather than every view. That means that when someone is busy, having a bad day, etc. they either have no problem or it’s a safe crash rather than an exploitable hole.

simion314 · on July 26, 2021

That is not the reality. Even the most stupid and lazy person will find a framework to use.

These are just old pages that bad search engines surface. IMO the actual lessons here are:

- developers are lazy, and you need to fix that; there is no magic language that solves the issue, though some fanboys will say that their favorite language is more idiot friendly.

- search engines, and soon the AIs, are stupid, so let's try to encourage books or other quality materials. Recently I found a colleague who did not know that in JS the "addEventListener" function exists and that you can use it to add more than 1 listener at a time; this person can probably put 5+ years of JS experience and a few frameworks on his CV. Maybe if we stop focusing on "my language is cooler" we could find the actual problems.

Back in my starting years I was reading books to learn; when you got to the SQL chapter, it explained all about SQL injection and related bugs and how to use prepared statements. With PHP you can block these dangerous functions (PHP is flexible, and many things like "exec" are blocked at most hosting places), but the problem with newb developers remains; if we can agree about this real problem (IMO), we can maybe address it. Sure, downranking bad tutorials would be part of the solution. Google should also probably stop their shit where they put the solution directly on the search page and don't force you to actually visit SO and see the comments, limitations and alternatives. Shame, Google, you make the software industry worse with your greed...

pvg · on July 25, 2021

> The best way to insert supply chain exploits is to embed them in a stack exchange answer to a beginner's question.

None of these answers seem to come from SE so this might be harder than you might assume.

Godel_unicode · on July 25, 2021

It's not. Note that these threads are a few years old, but in recent research (stay tuned!) it has if anything gotten worse.

https://news.ycombinator.com/item?id=13099690

https://laurent22.github.io/so-injections/

pvg · on July 26, 2021

It may have but this research doesn't show that. It appears to be in questions, rather than answers, when I can find a page that hasn't been removed. Beside most of these not actually being there, the google result is for answers which additionally have been ranked by Google. The thing you link is not meaningfully comparable to the Google result, nor is it representative of 'if you search for something on SE, how often does it tell you to put SQLI in your code'.

dylan604 · on July 26, 2021

In 1980s, where did one go to find code to copy&paste?

dataviz1000 · on July 26, 2021

copy & paste ... ha I wish. We had to type it bit by bit from the back of a magazine back in my day. Do you remember this? [0]

[0] https://arstechnica.com/staff/2018/11/first-encounter-comput...

dylan604 · on July 26, 2021

This is exactly how I got into coding. A friend had the computer, and the 2 of us would hunt&peck the code in. We were maybe 12 years old. The DATA lines were the tricky spots. Everything else used words we could understand, and it just made sense after doing it enough. The DATA was just gibberish. We found it easiest if one typed while the other called out the data. Faster and fewer mistakes.

jonwinstanley · on July 26, 2021

Same. I used to buy Amiga Format magazine for the game reviews then found the code for creating my own game was way more fun

jussij · on July 26, 2021

Back in the late 80s and early 90s many people learned to code from a book.

You would just study the code samples found in those books.

hnick · on July 26, 2021

Or the existing code on your machine or local documentation. I learned quite a bit of QBASIC from reading nibbles.bas and gorillas.bas and the editor's own help was extensive.

fencepost · on July 26, 2021

Dr. Dobb's Journal of Computer Calisthenics & Orthodontia

Running Light without Overbyte

PeterisP · on July 26, 2021

You would use a reference source, which may be paper documentation provided by the tool vendor (e.g. API documentation with usage samples), online documentation provided by the tool vendor, sample source code provided by the tool vendor (including illustrative sample applications for reference was quite common), or third party reference books. Since documentation was much more necessary than nowadays, it was more thorough and generally of a higher quality than now, though of course not always perfect.

jimmygrapes · on July 26, 2021

Mailing lists and USENET mostly, with FTP listing of open source code here and there, followed by transcribing from printed material

pjmlp · on July 26, 2021

By using the brain as clipboard between book, eyes and keyboard.

gregjor · on July 25, 2021

I freelance fixing and maintaining legacy web apps, almost always PHP.

Anecdotally I see SQL injection vulnerabilities in about half the code I look at. It’s one type of problem among many other problems and vulnerabilities in code written by amateurs and often copy/pasted.

PHP programmers can find lots of resources online. Some of those are terrible, either very old or written by amateurs excited to show how they got something to work.

I have seen the same kind of thing with Java and Python, but the popularity of PHP means there’s a lot of junk info and examples online.

PHP has supported safe SQL and safe HTML for decades, but the programmer has to understand the problem and the solution.

gerdesj · on July 25, 2021

I run a small IT company and the Windows sysadmin stock answer to nearly all problems:

  C:\Windows\System32> sfc /scannow

... followed by "reinstall your operating system". OK so no harm done apart from rather a lot of downtime, assuming you can put it back together again. The number of times I see "disable your AV" still is frightening.

I have a browser plugin that I discovered thanks to this parish called uBlacklist which you can use to try and clean up your search results by banning known bad sites from your results. social.microsoft.whatever was first ... 8)

I also note an awful lot of Linux related link farms and "blogs" with ads and cloned content from other sources have surfaced over the last few years. WordPress is another quagmire. I could go on but basically, search is very close to completely screwed (but not quite.)

pixl97 · on July 25, 2021

Disable your AV is a perfectly cromulent suggestion. It is a root kit that operates at the lowest level of your operating system, and any issue with it and it will affect every layer above it.

Now, if disabling works you should set reasonable exclusions and enable the product again.

gerdesj · on July 25, 2021

Disabling your AV is never a good starter for 10 and is often proffered as the canonical fix for a problem. I shudder to think how many people have been debagged and radished (I'll take your cromulent and raise you really odd) as a result of following "sage" advice.

I read the logs and set exclusions until the damn thing works. I have briefly disabled the whole AV/firewall/browser plugin thing sometimes to double check but that is quite rare. When I smile my teeth make a "ping" sound and briefly flash white.

Aeolun · on July 25, 2021

So I install a rootkit… to save me from rootkits?

gerdesj · on July 25, 2021

Yes you do. You install something with all privileges on your system that claims to keep the baddies out.

Hopefully you choose wisely on what to install on your system. Hopefully you even know what is "wise" to install on your system.

If you find out what is wise to use on your system, please let us know.

DaiPlusPlus · on July 26, 2021

Microsoft Defender is pretty legit

syshum · on July 25, 2021

Now Now...

The Sysadmin stock answer to nearly all problems is

    shutdown /r /t 0

If that fails, then

    sfc /scannow

BeFlatXIII · on July 26, 2021

Blog spam is the bane of the n00b programmer. Even if you’re not totally new and are merely picking up a new language, it turns tutorial hell into eternal Hell.

Raspberry Pi tips is another quagmire of replicated garbage.

gregjor · on July 26, 2021

I've been mentoring a couple of junior programmers for a couple of years and I have seen the kind of junk tutorials and online misinformation they find. Some of it is useful because it shows so many bad ideas and implementations -- like studying a plane crash to find out what went wrong.

I wrote an article about this back in 2007, regarding Javascript examples in an O'Reilly book -- a source I used to recommend because of the quality of their writing and editing (I no longer have that opinion).

https://typicalprogrammer.com/learning-by-example-how-bad-co...

arp242 · on July 26, 2021

It's not just an issue with blogspam, plenty of "bookspam" as well. A few years ago a friend of my then-girlfriend was learning C for some project. The book she had was so badly written even I could hardly follow it, and I can already program in C. It's no surprise this part of her project failed.

I started programming on MSX-BASIC (kind of like C64), and when we finally got a PC in 2000 or so I got a book titled "Learn C++ in 10 minutes". It was so bad that I was turned off from programming for a few years, as I thought I just didn't have what it takes (it also didn't help that the tooling and "getting started" was much harder back then, especially on Windows; if I had known you could just download e.g. Python instead of mucking about with this pirated Visual Studio I probably would have had an easier time – but I didn't know about that. It wasn't until I started playing with FreeBSD a few years later that I got back into programming).

cerved · on July 26, 2021

also csharp, python, JavaScript

voltagex_ · on July 26, 2021

Yeah, but what are your other options?

* procmon, if you're lucky you'll catch WTF is going on somewhere deep in the registry

* Hoping Microsoft still has the answer in a KB article somewhere (hope you didn't need any Server 2008/2012 stuff that was on UserVoice, it's gone now)

* WinDBG if you're that good

Which brings us back to cargo culted answers like sfc /scannow

I wouldn't completely discount social.microsoft, very very occasionally it's had a tiny tidbit of information in between the people incorrecting each other.

arp242 · on July 26, 2021

I used to fix Windows computers for a living; this was 10 years ago and I don't really know what changed in Windows 8/10 as I never used it, but I imagine it's roughly similar to XP and 7.

With some knowledge and experience it's possible to fix a lot of problems. Actually, a lot of problems people chalk up to "Micro$ucks bad" are just hardware problems. If someone comes in with "I get random BSODs" then there's a good chance it's just a faulty RAM module, disk, or something like that. The first step for random issues should always be to run memtest and a disk check tool (I don't recall the name of the tool I used for that, but there are some subtleties involved in testing this well, and I don't know the status of SSDs as this was kind of before they became common). Checking hardware is easy, checking software isn't.

Software problems can be a bit trickier to solve, depending on what the issue is. They're very hard to debug remotely over the internet: but there's a lot more you can do than "sfc /scannow" if you're sitting in front of the computer.

You really don't need WinDBG in most cases.

pixl97 · on July 26, 2021

Most problems can be solved if you want to put massive amounts of time in it. The issue you state, without realizing you've stated it, is that 'knowledge and experience' is in demand and expensive. So I have the option of investing a few hundred bucks of someone's time into fixing the issue, or running 'sfc /scannow' and surprisingly often fixing the problem.

arp242 · on July 27, 2021

It's not that time-consuming if you know a little bit what you're doing, you can go a long way with just 30 minutes; and this saves time/money too as 1) hardware problems are correctly identified and fixed instead of lingering for ages (and reducing productivity), and 2) no need to reïnstall everything, which is time-consuming as well.

tannhaeuser · on July 25, 2021

> PHP has supported [...] safe HTML for decades, but the programmer has to understand the problem and the solution.

That's not good enough for a language advertised as "Hypertext Preprocessor" though. PHP's distinguishing feature is that it's kicked off from SGMLish processing instructions in otherwise static HTML, and it has all context available for perfect injection-free HTML-aware templating. Eg escaping quotes when it's outputting into attributes, escaping "]]>" when outputting into CDATA sections, or with the help of a real markup processor, suppressing/escaping <script> elements or onclick or other event handler attributes where advised through a grammar such as an SGML DTD. But it doesn't because it's just such a hack job of a language, by the developer's own admission.

DaiPlusPlus · on July 26, 2021

Further evidence of this is the fact that `<?= $foo ?>`, and the long-form `<?php echo $foo ?>`, don’t offer a way to easily HTML-encode the output; instead you have to use `htmlentities()`. Whereas ASP.NET has had `<%: foo %>` to encode output for almost 15 years now, and Razor defaults to encoding: they make it harder to render unencoded output.

wvenable · on July 26, 2021

Actually only 12 years since that syntax came out with ASP.NET version 4.0. ASP.NET went a long time without it (and classic ASP before that).

And, like with razor, you can use plenty of libraries with PHP that will encode by default.

DaiPlusPlus · on July 26, 2021

Razor is stock though, but there is still no in-box way using PHP's own syntax to auto-encode output.

corobo · on July 26, 2021

Comparing a language to a framework is a bit wonky isn't it? Laravel (PHP framework) for instance has {{ $foo }}

arp242 · on July 26, 2021

PHP started life as a template engine for CGI applications written in C. And some pretty major projects like WordPress use PHP as a "template language".

There are various constructs in the language rarely seen in PHP code that make this easier, such as <?=, but also "if (..):" which can be ended with "endif", and "foreach (...):" which can be ended with "endforeach".
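A small sketch of what those template constructs look like together, with made-up data. Output buffering is used here only so the rendered markup ends up in a variable; note that the escaping still has to be written out by hand on every `<?=`.

```php
<?php
// Sample data for illustration only.
$items = ['a & b', '<i>c</i>'];

// Capture the template output in a string instead of sending it directly.
ob_start();
?>
<?php if ($items): ?>
<ul>
<?php foreach ($items as $item): ?>
  <li><?= htmlspecialchars($item, ENT_QUOTES, 'UTF-8') ?></li>
<?php endforeach; ?>
</ul>
<?php else: ?>
<p>No items.</p>
<?php endif; ?>
<?php
$html = ob_get_clean();
echo $html;
```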

It's not a hard feature to add. PHP devs want to move away from this "PHP as a template language" (I think they tried to remove the <?= a few years back); that's all fine, but fact of the matter is that people ARE using it as a template language and will continue to do so in the foreseeable future. Not supporting that with something as simple as "automatic escape special HTML characters" is extremely disappointing, and would actually prevent a lot of problems.

DaiPlusPlus · on July 26, 2021

"I think they tried to remove the <?= a few years back)"

Not quite. They deprecated `<?` but not `<?=`, see here: https://wiki.php.net/rfc/deprecate_php_short_tags

Another PHP-RFC removed `<%`, `<%=`, and `<script language="php">` (I'll admit I didn't know about that one): https://wiki.php.net/rfc/remove_alternative_php_tags - but as with before, this specifically retained `<?=`.

arp242 · on July 29, 2021

Ah yes, it was just the "<?" tag and not "<?="; I misremembered.

The ASP tags were always a bit of a misfeature; I don't think I've ever seen it used even once. <script language="php"> is just weird because it's intended for client-side scripts :-/

DaiPlusPlus · on July 26, 2021

PHP is simultaneously a framework and a language, though. It features a very simple framework, though, and has been supplanted by others, including those that resort to reimplementing their own templating system, which defeats the point of using PHP in the first place as that was its main goal: to be a templating system for Personal Home Pages.

gregjor · on July 26, 2021

I won’t disagree that PHP has its flaws. A lot of them are legacy problems to support old code. Every language and tool with a big installed base has this problem — just look at the legacy crap in Windows.

It’s fairly easy to write clean and safe PHP. Any number of libraries and frameworks exist to do safe SQL queries and escape HTML. The problem is a lot of programmers don’t even know the vulnerability, not that it’s hard to fix.

I could bitch and moan about PHP or make a good living fixing bad code. Complaining won’t make that legacy code better or magically rewrite it.

tannhaeuser · on July 26, 2021

Fair enough, but in the case of PHP, what XSS attacks are after most of the time is not the PHP sites themselves, but weaponizing them for c&c attacks on third-party sites. Thus merely using PHP, with its well-known combination of copy/paste culture, popularity among newbs, and poor security practices opens site owners up to liability claims (if nothing else, such as gross negligence with PII). And PHP's defense is weak, with not even an attempt to bring its built-in web templating into something that could remotely be called state of the art, considering that eg SGML is 35 years old.

Not saying this to diss PHPers; in fact, I like the PHP community for their get stuff done mentality, and I think they deserve better. If I were contracting for PHP, though, I'd make sure to negotiate strong liability disclaimers.

gregjor · on July 26, 2021

Are you aware of an actual case of an owner of a web site getting sued because their site was used to attack other sites, without their knowledge? This could happen (and has happened) with other tools besides PHP. Is Microsoft liable because Windows is used as a launchpad for attacks?

tannhaeuser · on July 26, 2021

IANAL, much less a judge, but I think there's a plausible legal theory for suing isn't there?

marvinblum · on July 26, 2021

Same here. PHP is also often picked by beginners (including me 15 years ago) and you can see that. However, I have a lot of fun fixing these kinds of issues and improving the code. It feels like archaeology/restoration sometimes and it makes me happy to keep them running securely. Also, it usually pays really well.

cosmodisk · on July 25, 2021

From my anecdotal data, a whole lot of tutorials are written by 'learn Java in 21 day' stage developers. People are excited, want to put their name out there and start churning out tutorials on concepts only yesterday they had no clue about. Similar situation with many online courses too.

paulddraper · on July 26, 2021

> PHP has supported safe SQL and safe HTML for decades, but the programmer has to understand the problem and the solution.

The ecosystem has a ton of exposed wires in builtins and libraries.

When the function's name is literally `mysql_real_escape_string` ... what does that tell you?

chipotle_coyote · on July 26, 2021

While you're not wrong, per se, this has a bit of that "never give PHP credit for getting better when it's still possible to do bad things with it" vibe to it. I mean, it's fairly well-established PHP did some pretty boneheaded things in its history and one could argue they didn't get serious about cleaning those up until rather late in PHP 5's life cycle. (Some would say not until PHP 7.)

In the case of your example, what it tells me is that they had a "mysql_escape_string()" that they needed to remove but had to deprecate first to avoid breaking existing code, however bad it might be, and so replaced it with "mysql_real_escape_string()" -- which itself hasn't been in PHP for over 5 years, since that whole MySQL driver was deprecated. There's still a "mysqli_real_escape_string()", but that name is likely a quirk of history, as there's no matching "mysqli_escape_string()" for people who would like to use the supported driver but continue screwing up the charset.

(Edit: another comment reminded me of something that I knew once but had forgotten. The MySQL C API has the "escape_string" and "real_escape_string" functions in it which do precisely the same things the old PHP functions did. So this actually tells us even less about PHP the language, although it may tell us something more about MySQL.)

paulddraper · on July 26, 2021

> "never give PHP credit for getting better when it's still possible to do bad things with it" vibe to it

I was trying to go for the "PHP has tons and tons of terrible shoddy baggage" vibe.

https://preview.redd.it/v53przfht6n01.png?width=960&crop=sma...

gregjor · on July 26, 2021

What language that anyone uses, and that's older than a couple of years, doesn't have terrible shoddy baggage in at least someone's opinion? I have the same opinion about node.js/npm, and Java. My opinion doesn't make anyone stop using those languages.

Stroustrup quipped "There are only two kinds of languages: the ones people complain about and the ones nobody uses." PHP is the first kind. Like every language and tool before it that came with a low barrier to entry it led to a proliferation of bad code. My friends who work in ML/data science make the same complaints about Python -- it's easy to get something to work but the code quality -- ugh. And in a few years lots of that code will face the "upgrade and break it or keep it and cross our fingers" point that so much legacy PHP is at already.

paulddraper · on July 26, 2021

That's true. PHP just started so, so, so far down, it's had more to overcome than most.

gregjor · on July 26, 2021

Not sure if you're just trolling the low-hanging fruit or not but I'll assume not.

When PHP came out in 1997 the other available products for putting web sites together, at least for smaller organizations, were:

- ASP (classic, not .Net)

- ColdFusion

- Perl

The first two were proprietary packages that required a license for the software and a license for the operating system (Windows). I got into PHP when a customer wanted to migrate away from Windows/ASP because of licensing fees -- they took the leap with open source, which was a big gamble at the time. The CTO had read "The Cathedral and the Bazaar" and swallowed the kool-aid. We still had to use SQL Server though, that company was committed to it across all of their applications, so I got to use PHP + ODBC for a while. Fun.

Perl had a fairly big base of CGI scripts but in most respects seemed worse than ASP, CF, PHP because Perl had a steep barrier to entry. PHP was an easy choice for shops looking to get off of ASP -- which Microsoft was making noises about discontinuing -- and ColdFusion, which several of my customers back then used, but complained about the cost (Adobe now owns CF).

So it was PHP. Then along came WordPress and the PHP world exploded. As you point out the language has had a hard time keeping up with the demands placed on it (Rasmus certainly didn't imagine Facebook-scale sites back then), and the evolving security threats (lots of web sites were purely internal back then, not exposed on the public internet, and the script kiddie hackers were still in nursery school in Kiev). Hosting providers sprang up to offer turn-key PHP/MySQL hosting, with the proviso that the site owner and developers did not control the PHP configuration.

Since 1997 a lot has changed and it's easy to point to problems in PHP and say "That could have been done a lot better." And that's true, but no one had that crystal ball back in the mid-90s. The push was to get something on the web. Planning for future maintainability has never been an aspect of software development we can boast about and the PHP code out there today is no different, there's just a lot of it.

For my part I push my customers to upgrade to the latest version and to do a security analysis and vulnerability test so we can find and fix the most egregious problems. Even this level of upgrading can get expensive and risky. I wish no one was still running PHP 5.4 in production in 2021 but wishing won't change that it's still fairly common, and companies using that code are only going to call someone like me after they've had a serious problem.

pjmlp · on July 26, 2021

Nah, writing CGIs in C, that was my first handling of FORM submits.

chipotle_coyote · on July 26, 2021

> PHP just started so, so, so far down, it's had more to overcome than most.

This is probably fair. :) I think PHP tried to combine Python's "batteries included" approach with Perl's "more than one way to do it" style, but did it in a pretty disorganized way that created lots of Catch-22 issues later -- when you get that popular, it makes backward-incompatible changes fraught with peril, even if you're addressing obviously craptacular past mistakes.

I think PHP has become pretty solid from version 7 on, although my feelings about using it remain mixed. I've joked in the past that it's stopped being a cargo cult version of Perl and is now a cargo cult version of Java.

wvenable · on July 26, 2021

> The ecosystem has a ton of exposed wires in builtins and libraries.

PHP is a light wrapper around C libraries.

> When the function's name is literally `mysql_real_escape_string` ... what does that tell you?

That it comes directly from the MySQL C API:

https://dev.mysql.com/doc/c-api/8.0/en/mysql-real-escape-str...

paulddraper · on July 26, 2021

That doesn't surprise me.

MySQL is the PHP of databases.

gregjor · on July 26, 2021

Exactly. Free, well-supported, useful, widely-deployed, used by lots of developers.

zzt123 · on July 26, 2021

It makes me wonder if there’s a mysql_fake_escape_string or mysql_doesnt_actually_escape_string function. And why those functions would even exist in a language.

hnick · on July 26, 2021

It exists because "mysql_escape_string() does not take a connection argument and does not respect the current charset setting."

zzt123 · on July 26, 2021

I thought the security of escaping is dependent on not having mismatched charsets? In which case, not respecting charset settings seems potentially not actually escaping?

Seems like a strange function to have, although I could be foggy on my charsets.

SilverRed · on July 26, 2021

It shows the culture in PHP. They would rather keep a function around that doesn't work properly just so existing code still works instead of making everyone test that the new function works.

tored · on July 26, 2021

No, every PHP release deprecates or fixes functions that are either considered bad or not working as intended. Quick search & I found these (note that some of the RFCs include multiple deprecations), but there are more if you bother to actually look. Stop spreading misinformation.

https://wiki.php.net/rfc/deprecations_php_8_1

https://wiki.php.net/rfc/deprecations_php_7_4

https://wiki.php.net/rfc/deprecations_php_7_3

https://wiki.php.net/rfc/deprecations_php_7_2

https://wiki.php.net/rfc/remove_deprecated_functionality_in_...

https://wiki.php.net/rfc/deprecate_curly_braces_array_access

https://wiki.php.net/rfc/ternary_associativity

https://wiki.php.net/rfc/deprecate_null_to_scalar_internal_a...

https://wiki.php.net/rfc/deprecate-png-jpeg-2wbmp

https://wiki.php.net/rfc/mcrypt-viking-funeral

https://wiki.php.net/rfc/removal-of-deprecated-features

https://wiki.php.net/rfc/deprecate_mb_ereg_replace_eval_opti...

paulddraper · on July 26, 2021

https://www.reddit.com/r/ProgrammerHumor/comments/8667lt/sta...

gregjor · on July 26, 2021

It shows a trade-off between arbitrarily breaking code in production or not. Lots of PHP sites are hosted on services that don’t give the programmer control over the PHP version. If the hosting provider upgrades and breaks a bunch of sites that’s a problem every bit as serious (to the site’s owner) as unescaped HTML opening up XSS attacks.

SilverRed · on July 26, 2021

This is not sustainable or a desirable thing to keep. By accepting this state of things, no security fix with breaking changes can ever be implemented.

gregjor · on July 26, 2021

With all respect your comment is both arrogant and unrealistic. Exactly how do we not accept this state of things? No one claims it's desirable, it just is: bad code is out there, and it's not easy to fix.

What would you tell a small business that relies on clunky 10-year-old code to run their business? To rewrite it in a more modern language at huge expense and risk (given that a majority of rewrite projects fail)? Can you guarantee the new thing won't be just as obsolete and vulnerable and buggy in ten years?

These kinds of problems -- poorly-written and vulnerable code, amateur programmers, lack of professionalism, maintaining back-compatibility with an installed base -- are not specific to PHP. They afflict the entire software industry, and always have. Who could have seen into the future back in 2000 (when I first got exposed to PHP) that a new site would get probed by an army of bots within five minutes of going live? Or that it would be even harder today than back then to find and hire experienced programmers?

PHP has had many security fixes implemented since the early releases, but how can anyone force users of an open-source language to upgrade for their own good? Or pay someone to ferret out and fix vulnerabilities they have never got hit by?

Even brand new code has this problem. Look at all of the cryptocurrency code written in the last few years. We read about hacks and thefts and vulnerabilities every day, and that was written by supposedly smart people with access to modern languages and with knowledge of the contemporary security issues. And it still gets hacked. If we knew how to write perfect code that would still be perfect into the future I'm sure we would do that but until then we'll have to live with what we have. So far it has been sustainable, just less than optimal, if by optimal we mean what we can imagine rather than what we, as programmers, actually deliver.

eska · on July 26, 2021

Yes, yes, the poor companies. But do you ever consider the poor customers/users that put private information in the companies' databases? Or that the price they pay assumes the companies do not let their software rot for 10 years?

Then there's the typical logical fallacy of taking a trivial problem of escaping SQL and conflating it with something more complicated, and comparing to eternal perfection.. yawn.

SilverRed · on July 26, 2021

>companies do not let their software rot for 10 years?

I think that one of the big mistakes made in the last 20 years is the belief that every company needs its own custom software, and that software is an asset you buy once rather than a constant source of cost.

The vast majority of businesses have no need for custom software and should be using 3rd party services. Then those 3rd parties have the income to dedicate to keeping the software secure.

It's honestly terrible how many local businesses have their own complex software built on some ancient version of a framework which is sitting on an ancient server box in their office. It's a ticking time bomb no one wants to think about. Postponing the explosion is not the solution.

gregjor · on July 26, 2021

Agreed. I often tell customers to use an off-the-shelf solution and get on with their real business. Custom software development is expensive, risky, and incurs long-term maintenance costs. I outright refuse to take on custom e-commerce sites or accounting or CRM systems at this point.

About half the time the customer will find someone else who will happily bid on writing custom code despite my suggestion. That’s one reason the legacy code problem just gets bigger every year, and a lot of it shouldn’t have been written in the first place.

gregjor · on July 26, 2021

Look at the major data breaches over the past decade -- TransUnion, Experian, multiple US government sites, etc. and point to one that was caused by a PHP SQL injection attack. This kind of thing can happen to anything accessible on the public internet.

Do you know how old the software your bank uses is? Pretty much every government agency and utility you rely on? What price do you pay for that? A lot of that code has been rotting longer than any PHP web site.

There's no logical fallacy. I wrote multiple times that escaping SQL is essentially trivial in PHP, and has been not only easy but the recommended best practice. The problem is lots of inexperienced programmers don't know the problem to begin with. They would write vulnerable code in any language. I had to work on a Rails site a few years ago that was vulnerable to XSS and SQL injection, even though Rails by default protects against those things. Someone had gone around all of that because they didn't understand the problem in the first place. I don't know that any language can protect us from that.

eska · on July 26, 2021

Again, frantic hand-waving and pointing fingers, filled to the brim with logical fallacies. I can only imagine what kind of work culture exists in your company that you keep repeating the same tired, generic excuses that I've heard thousands of times before, thinking that they're not fallacies.

The fact that you and many others in this industry think these arguments are in any way rational or defensible puts our industry to shame.

gregjor · on July 26, 2021

I freelance supporting legacy software. I wrote that already. There’s no culture in my company, just me.

There’s a difference between an explanation and an excuse, and between counterexamples and “hand waving.” I’m sure it makes you feel superior to dismiss opinions and comments with vague references to logical fallacies or indefensible arguments, but just hauling out big words doesn’t make you right, or even make any sense.

I can’t fix everything wrong with software development. I’ve been doing it for 40 years and we just keep making the same mistakes. My small contribution is fixing broken code one customer at a time, at least leaving the campsite cleaner than I found it. I don’t lose a lot of sleep over our collective failure to write perfect software.

SilverRed · on July 26, 2021

Of course this is a lot of work but it means that unless PHP takes security seriously, no one will take PHP seriously and the language will die off / be relegated to dirt cheap contractor work.

No serious org is going to use a product where you have to remember that the sql escape function doesn't work and you have to use the one that says real sql escape.

gregjor · on July 26, 2021

This is a canard, really. The PDO library, which is a core PHP module, has SQL injection mitigation built-in (with escaped parameter substitution). It has shipped with PHP since 5.1 (2005), and was available as a PECL extension for PHP 5.0 before that. The popular PHP frameworks such as Laravel and CodeIgniter also protect against SQL injection and XSS by default.
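For anyone who hasn't seen the difference side by side, here is a minimal sketch using PDO with an in-memory SQLite database so it runs on its own (assuming the `pdo_sqlite` extension is enabled). The table, rows, and hostile input are made up for illustration; with MySQL only the DSN would change.

```php
<?php
// In-memory SQLite keeps the sketch self-contained; table and data are made up.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice'), ('bob')");

// Hostile input that rewrites the WHERE clause if spliced into the SQL string.
$input = "nobody' OR '1'='1";

// Antipattern from the old tutorials: build the query by concatenation.
$unsafe = $pdo
    ->query("SELECT name FROM users WHERE name = '$input'")
    ->fetchAll(PDO::FETCH_COLUMN);

// Prepared statement: the input is bound as a value and never parsed as SQL.
$stmt = $pdo->prepare('SELECT name FROM users WHERE name = :name');
$stmt->execute([':name' => $input]);
$safe = $stmt->fetchAll(PDO::FETCH_COLUMN);

printf("concatenated: %d rows, bound: %d rows\n", count($unsafe), count($safe));
```

The concatenated query matches every row because the input closes the string literal and adds its own condition; the bound version looks for a user literally named `nobody' OR '1'='1` and finds none.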

The MySQL escape functions are named the way they are because that's what they are called in the MySQL API, which PHP exposes pretty much verbatim. I don't see a lot of people using that interface in new PHP code (because Laravel and PDO), but it comes up on older code.

Again the problem is not obscure function names or that PHP makes it possible to shoot yourself in the foot. The problem is a whole lot of inexperienced programmers (and quite a few who should know better) not understanding the problem in the first place. If you don't know what SQL injection is or how it happens or how code can make it possible you aren't going to know how to protect against it. PHP does do it for you if you use PDO (more than 15 years old at this point), or any of the numerous other safe RDBMS libraries. This is like complaining that Honda makes shitty cars because some people put glass packs and spoilers on a Civic -- people use languages and tools wrong out of ignorance and inexperience.

I think it's clear that PHP has been taken seriously for some time, even if largely because of WordPress. It's not going to die off or get relegated to the language ghetto because it has some (obvious, well-known) flaws that serious programmers have known how to live with for literally decades. Regardless of what you think or see on Upwork, PHP contractors are not cheap. No one who can and will work on legacy code is cheap because most programmers won't even do that work if they can help it. Supporting legacy software, which includes improving and securing and upgrading it, is maybe the most lucrative and secure niche for programmers sitting there in plain sight.

corobo · on July 26, 2021

That command doesn't even exist anymore and hasn't since 2013

okeuro49 · on July 26, 2021

That's not completely true. Over its evolution, PHP has removed some functions completely to provide better and more secure functionality, such as mysql_* in PHP 7.

ericbarrett · on July 26, 2021

As another commenter points out, this is actually a quirk of the underlying MySQL C library, which has (or had) both functions.

hnick · on July 26, 2021

PHP (and old languages in general) are full of ASCII-only English-centric assumptions. I think both functions are now considered deprecated since we have even more variations like mysqli_real_escape_string (or just use PDO with bound params).

makeitdouble · on July 26, 2021

At this point there's a ton of CI tools to check for injection and dangerous patterns, and serious companies have been using them for years/decades now, ranging from local options to online tools like Scrutinizer or SonarQube. I'd wager even PHPCS would catch the copy/pasted ones.

To me, neither the language nor the online examples have been an excuse for SQL injection for a long time now.