I freelance fixing and maintaining legacy web apps, almost always PHP.
Anecdotally I see SQL injection vulnerabilities in about half the code I look at. It’s one type of problem among many other problems and vulnerabilities in code written by amateurs and often copy/pasted.
PHP programmers can find lots of resources online. Some of those are terrible, either very old or written by amateurs excited to show how they got something to work.
I have seen the same kind of thing with Java and Python, but the popularity of PHP means there’s a lot of junk info and examples online.
PHP has supported safe SQL and safe HTML for decades, but the programmer has to understand the problem and the solution.
I run a small IT company and the Windows sysadmin stock answer to nearly all problems:
C:\Windows\System32> sfc /scannow
... followed by "reinstall your operating system". OK so no harm done apart from rather a lot of downtime, assuming you can put it back together again. The number of times I see "disable your AV" still is frightening.
I have a browser plugin that I discovered thanks to this parish called uBlacklist which you can use to try and clean up your search results by banning known bad sites from your results. social.microsoft.whatever was first ... 8)
I also note an awful lot of Linux related link farms and "blogs" with ads and cloned content from other sources have surfaced over the last few years. WordPress is another quagmire. I could go on but basically, search is very close to completely screwed (but not quite.)
Disable your AV is a perfectly cromulent suggestion. It is a root kit that operates at the lowest level of your operating system, and any issue with it and it will affect every layer above it.
Now, if disabling works you should set reasonable exclusions and enable the product again.
Disabling your AV is never a good starter for 10 and is often proffered as the canonical fix for a problem. I shudder to think how many people have been debagged and radished (I'll take your cromulent and raise you really odd) as a result of following "sage" advice.
I read the logs and set exclusions until the damn thing works. I have briefly disabled the whole AV/firewall/browser plugin thing sometimes to double check but that is quite rare. When I smile my teeth make a "ping" sound and briefly flash white.
Blog spam is the bane of the n00b programmer. Even if you’re not totally new and are merely picking up a new language, it turns tutorial hell into eternal Hell.
Raspberry Pi tips is another quagmire of replicated garbage.
I've been mentoring a couple of junior programmers for a couple of years and I have seen the kind of junk tutorials and online misinformation they find. Some of it is useful because it shows so many bad ideas and implementations -- like studying a plane crash to find out what went wrong.
I wrote an article about this back in 2007, regarding Javascript examples in an O'Reilly book -- a source I used to recommend because of the quality of their writing and editing (I no longer have that opinion).
It's not just an issue with blogspam, plenty of "bookspam" as well. A few years ago a friend of my then-girlfriend was learning C for some project. The book she had was so badly written even I could hardly follow it, and I can already program in C. It's no surprise this part of her project failed.
I started programming on MSX-BASIC (kind of like C64), and when we finally got a PC in 2000 or so I got a book titled "Learn C++ in 10 minutes". It was so bad hat I was turned off from programming for a few years, as I thought I just didn't have what it takes (it also didn't help that the tooling and "getting started" was much harder back then, especially on Windows; if I had known you could just download e.g. Python instead of mucking about with this pirated Visual Studio I probably would have had an easier time – but I didn't know about that. It wasn't until I started playing with FreeBSD a few years later that I got back in to programming).
* procmon, if you're lucky you'll catch WTF is going on somewhere deep in the registry
* Hoping Microsoft still has the answer in a KB article somewhere (hope you didn't need any Server 2008/2012 stuff that was on UserVoice, it's gone now)
* WinDBG if you're that good
Which brings us back to cargo culted answers like sfc /scannow
I wouldn't compeltely discount social.microsoft, very very occasionally it's had a tiny tidbit of information in between the people incorrecting each other.
I used to fix Windows computers for a living; this was 10 years ago and I don't really know what changed in Windows 8/10 as I never used it, but I imagine it's roughly similar to XP and 7.
With some knowledge and experience it's possible to fix a lot of problems. Actually, a lot of problems people chuck up to "Micro$ucks bad" are just hardware problems. If someone comes in with "I get random BSODs" then there's a good chance it's just faulty a faulty RAM module, disk, or something like that. The first step for random issues should always be to run memtest and a disk check tool (I don't recall the name of the tool I used for that, but there are some subtleties involved in testing this well, and I don't know the status of SSDs as this was kind of before they became common). Checking hardware is easy, checking software isn't.
Software problems can be a bit trickier to solve, depending on what the issue is. They're very hard to debug remotely over the internet: but there's a lot more you can do than "sfc /scannow" if you're sitting in front of the computer.
Most problems can be solved if you want to put massive amounts of time in it. The issue you state, without realizing you've stated it, is that 'knowledge and experience' is in demand and expensive. So I have the option of investing a few hundred bucks of someone's time into fixing the issue, or running 'sfc /scannow' and surprisingly often fixing the problem.
It's not that time-consuming if you know a little bit what you're doing, you can go a long way with just 30 minutes; and this saves time/money too as 1) 1) hardware problems are correctly identified and fixed instead of lingering for ages (and reducing productivity), and 2) no need to reïnstall everything, which is time-consuming as well.
> PHP has supported [...] safe HTML for decades, but the programmer has to understand the problem and the solution.
That's not good enough for a language advertising as "Hypertext Preprocessor" though. PHP's distinguishing feature is that's kicked off from SGMLish processing instructions in otherwise static HTML, and it has all context available for perfect injection-free HTML-aware templating. Eg escaping quotes when it's outputting into attributes, escaping "]]>" when outputting into CDATA sections, or with the help of a real markup processor, suppressing/escaping <script> elements or onclick or other event handler attributes where advised through a grammar such as an SGML DTD. But it doesn't because it's just such a hack job of a language, by the developer's own admission.
Further evidence of this is the fact that `<?= $foo ?>`, and the long-form `<?php echo $foo ?>`, don’t offer a way to easily HTML-encode the output; instead you have to use `htmlentities()`. Whereas ASP.NET has had `<%: foo %>` to encode output for almost 15 years now, and Razor defaults to encoding: they make it harder to render unrecoded output.
PHP started life as a template engine for CGI applications written in C. And some pretty major projects like WordPress use PHP as a "template language".
There are various constructs in the language rarely seen in PHP code that make this easier, such as <?=, but also also "if (..):" which can be ended with "endif", and "foreach (...):" which can be ended with "endforeach".
It's not a hard feature to add. PHP devs want to move away from this "PHP as a template language" (I think they tried to remove the <?= a few years back); that's all fine, but fact of the matter is that people ARE using it as a template language and will continue to do so in the foreseeable future. Not supporting that with something as simple as "automatic escape special HTML characters" is extremely disappointing, and would actually prevent a lot of problems.
Another PHP-RFC removed `<%`, `<%=`, and `<script language="php">` (I'll admit I didn't know about that one): https://wiki.php.net/rfc/remove_alternative_php_tags - but as with before, this specifically retained `<?=`.
Ah yes, it was just the "<?" tag and not "<?="; I misremembered.
The ASP tags were always a bit of a misfeature; I don't think I've ever seen it used even once. <script language="php"> is just weird because it's intended for client-side scripts :-/
PHP is simultaneously a framework and a language, though. It features a very simple framework, though, and has been supplanted by others, including those that resort to reimplementing their own templating system, which defeats the point of using PHP in the first place as that was its main goal: to be a templating system for Personal Home Pages.
I won’t disagree that PHP has its flaws. A lot of them are legacy problems to support old code. Every language and tool with a big installed base has this problem — just look at the legacy crap in Windows.
It’s fairly easy to write clean and safe PHP. Any number of libraries and frameworks exist to do safe SQL queries and escape HTML. The problem is a lot of programmers don’t even know the vulnerability, not that it’s hard to fix.
I could bitch and moan about PHP or make a good living fixing bad code. Complaining won’t make that legacy code better or magically rewrite it.
Fair enough, but in the case of PHP, the target that XSS attacks are after are not the PHP sites themselves most of the time, but weaponizing those for c&c attacks on third-party sites. Thus merely using PHP, with its well-known combination of copy/paste culture, popularity among newbs, and poor security practices opens site owners up to liability claims (if nothing else, such as gross negligence with PII). And PHP's defense is weak, with not even an attempt to bring its built-in web templating into something that could remotely be called state of the art, considering that eg SGML is 35 years old.
Not saying this to diss PHPers; in fact, I like the PHP community for their get stuff done mentality, and I think they deserve better. If I were contracting for PHP, though, I'd make sure to negotiate strong liability disclaimers.
Are you aware of an actual case of an owner of a web site getting sued because their site was used to attack other sites, without their knowledge? This could (and has) happened with other tools besides PHP. Is Microsoft liable because Windows is used as a launchpad for attacks?
Same here. PHP is also often picked by beginners (including me 15 years ago) and you can see that. However, I have a lot of fun fixing these kind of issues and improving the code. It feels like archology/restauration sometimes and it makes me happy to keep them running securely. Also, it pays really well usually.
From my anecdotal data, a whole lot of tutorials are written by 'learn Java in 21 day' stage developers. People are excited, want to put their name out there and start churning out tutorials on concepts only yesterday they had no clue about. Similar situation with many online courses too.
While you're not wrong, per se, this has a bit of that "never give PHP credit for getting better when it's still possible to do bad things with it" vibe to it. I mean, it's fairly well-established PHP did some pretty boneheaded things in its history and one could argue they didn't get serious about cleaning those up until rather late in PHP 5's life cycle. (Some would say not until PHP 7.)
In the case of your example, what it tells me is that they had a "mysql_escape_string()" that they needed to remove but had to deprecate first to avoid breaking existing code, however bad it might be, and so replaced it with "mysql_real_escape_string()" -- which itself hasn't been in PHP for over 5 years, since that whole MySQL driver was deprecated. There's still a "mysqli_real_escape_string()", but that name is likely a quirk of history, as there's no matching "mysqli_escape_string()" for people who would like to use the supported driver but continue screwing up the charset.
(Edit: another comment reminded me of something that I knew once but had forgotten. The MySQL C API has the "escape_string" and "real_escape_string" functions in it which do precisely the same things the old PHP functions did. So this actually tells us even less about PHP the language, although it may tell us something more about MySQL.)
What language anyone uses that's older than a couple of years doesn't have terrible shoddy baggage in at least someone's opinion? I have the same opinion about node.js/npm, and Java. My opinion doesn't make anyone stop using those languages.
Stroustrup quipped "There are only two kinds of languages: the ones people complain about and the ones nobody uses." PHP is the first kind. Like every language and tool before it that came with a low barrier to entry it led to a proliferation of bad code. My friends who work in ML/data science make the same complaints about Python -- it's easy to get something to work but the code quality -- ugh. And in a few years lots of that code will face the "upgrade and break it or keep it and cross our fingers" point that so much legacy PHP is at already.
Not sure if you're just trolling the low-hanging fruit or not but I'll assume not.
When PHP came out in 1997 the other available products for putting web sites together, at least for smaller organizations, were:
- ASP (classic, not .Net)
- ColdFusion
- Perl
The first two were proprietary packages that required a license for the software and a license for the operating system (Windows). I got into PHP when a customer wanted to migrate away from Windows/ASP because of licensing fees -- they took the leap with open source, which was a big gamble at the time. The CTO had read "The Cathedral and the Bazaar" and swallowed the kool-aid. We still had to use SQL Server though, that company was committed to it across all of their applications, so I got to use PHP + ODBC for a while. Fun.
Perl had a fairly big base of CGI scripts but in most respects seemed worse than ASP, CF, PHP because Perl had a steep barrier to entry. PHP was an easy choice for shops looking to get off of ASP -- which Microsoft was making noises about discontinuing -- and ColdFusion, which several of my customers back then used, but complained about the cost (Adobe now owns CF).
So it was PHP. Then along came WordPress and the PHP world exploded. As you point out the language has had a hard time keeping up with the demands placed on it (Rasmus certainly didn't imagine Facebook-scale sites back then), and the evolving security threats (lots of web sites were purely internal back then, not exposed on the public internet, and the script kiddie hackers were still in nursery school in Kiev). Hosting providers sprung up to offer turn-key PHP/MySQL hosting, with the proviso that the site owner and developers did not control the PHP configuration.
Since 1997 a lot has changed and it's easy to point to problems in PHP and say "That could have been done a lot better." And that's true, but no one had that crystal ball back in the mid-90s. The push was to get something on the web. Planning for future maintainability has never been an aspect of software development we can boast about and the PHP code out there today is no different, there's just a lot of it.
For my part I push my customers to upgrade to the latest version and to do a security analysis and vulnerability test so we can find and fix the most egregious problems. Even this level of upgrading can get expensive and risky. I wish no one was still running PHP 5.4 in production in 2021 but wishing won't change that it's still fairly common, and companies using that code are only going to call someone like me after they've had a serious problem.
> PHP just started so, so, so far down, it's had more to overcome than most.
This is probably fair. :) I think PHP tried to combine Python's "batteries included" approach with Perl's "more than one way to do it" style, but did it in a pretty disorganized way that created lots of Catch-22 issues later -- when you get that popular, it makes backward-incompatible changes fraught with peril, even if you're addressing obviously craptacular past mistakes.
I think PHP has become pretty solid in version 7+ on, although my feelings about using it remain mixed. I've joked in the past that it's stopped being a cargo cult version of Perl and is now a cargo cult version of Java.
It makes me wonder if there’s a mysql_fake_escape_string or mysql_doesnt_actually_escape_string function. And why those functions would even exist in a language.
I thought the security of escaping is dependent on not having mismatched charsets? In which case, not respecting charset settings seems potentially not actually escaping?
Seems like a strange function to have, although I could be foggy on my charsets.
It shows the culture in PHP. They would rather keep a function around that doesn't work properly just so existing code still works instead of making everyone test that the new function works.
No, every PHP release deprecates functions or fixes functions that is either considered bad or not working as intended. Quick search & I found these (note that some of the rfc's include multiple deprecations), but there is more if you bother to actually look. Stop spreading misinformation.
It shows a trade-off between arbitrarily breaking code in production or not. Lots of PHP sites are hosted on services that don’t give the programmer control over the PHP version. If the hosting provider upgrades and breaks a bunch of sites that’s a problem every bit as serious (to the site’s owner) as unescaped HTML opening up XSS attacks.
This is not sustainable or a desirable thing to keep. By accepting this state of things, no security fix with breaking changes can ever be implemented.
With all respect your comment is both arrogant and unrealistic. Exactly how do we not accept this state of things? No one claims it's desirable, it just is: bad code is out there, and it's not easy to fix.
What would you tell a small business that relies on clunky 10-year-old code to run their business? To rewrite it in a more modern language at huge expense and risk (given that a majority of rewrite projects fail)? Can you guarantee the new thing won't be just as obsolete and vulnerable and buggy in ten years?
These kinds of problems -- poorly-written and vulnerable code, amateur programmers, lack of professionalism, maintaining back-compatibility with an installed base -- are not specific to PHP. They afflict the entire software industry, and always have. Who could have seen into the future back in 2000 (when I first got exposed to PHP) that a new site would get probed by an army of bots within five minutes of going live? Or that it would be even harder today than back then to find and hire experienced programmers?
PHP has had many security fixes implemented since the early releases, but how can anyone force users of an open-source language to upgrade for their own good? Or pay someone to ferret out and fix vulnerabilities they have never got hit by?
Even brand new code has this problem. Look at all of the cryptocurrency code written in the last few years. We read about hacks and thefts and vulnerabilities every day, and that was written by supposedly smart people with access to modern languages and with knowledge of the contemporary security issues. And it still gets hacked. If we knew how to write perfect code that would still be perfect into the future I'm sure we would do that but until then we'll have to live with what we have. So far it has been sustainable, just less than optimal, if by optimal we mean what we can imagine rather than what we, as programmers, actually deliver.
Yes, yes, the poor companies. But do you ever consider the poor customers/users that put private information in the companies' databases? Or that the price they pay assumes the companies do not let their software rot for 10 years?
Then there's the typical logical fallacy of taking a trivial problem of escaping SQL and conflating it with something more complicated, and comparing to eternal perfection.. yawn.
>companies do not let their software rot for 10 years?
I think that one of the big mistakes made in the last 20 years is that every company needs its own custom software and that software is like an asset that you buy once and not a constant cost source.
The vast majority of businesses have no need for custom software and should be using 3rd party services. Then those 3rd parties have the income to dedicate to keeping the software secure.
Its honestly terrible how many local businesses have their own complex software built on some ancient version of a frame work which is sitting on an ancient server box in their office. Its a ticking time bomb no one wants to think about. Prolonging the explosion is not the solution.
Agreed. I often tell customers to use an off-the-shelf solution and get on with their real business. Custom software development is expensive, risky, and incurs long-term maintenance costs. I outright refuse to take on custom e-commerce sites or accounting or CRM systems at this point.
About half the time the customer will find someone else who will happily bid on writing custom code despite my suggestion. That’s one reason the legacy code problem just gets bigger every year, and a lot of it shouldn’t have been written in the first place.
Look at the major data breaches over the past decade -- TransUnion, Experian, multiple US government sites, etc. and point to one that was caused by a PHP SQL injection attack. This kind of thing can happen to anything accessible on the public internet.
Do you know how old the software your bank uses is? Pretty much every government agency and utility you rely on? What price do you pay for that? A lot of that code has been rotting longer than any PHP web site.
There's no logical fallacy. I wrote multiple times that escaping SQL is essentially trivial in PHP, and has been not only easy but the recommended best practice. The problem is lots of inexperienced programmers don't know the problem to begin with. They would write vulnerable code in any language. I had to work on a Rails site a few years ago that was vulnerable to XSS and SQL injection, even though Rails by default protects against those things. Someone had gone around all of that because they didn't understand the problem in the first place. I don't know that any language can protect us from that.
Again, frantic hand-waving and pointing fingers, filled to the brim with logical fallacies. I can only imagine what kind of work culture exists in your company that you keep repeating the same tired, generic excuses that I've heard thousands of times before, thinking that they're not fallacies.
The fact that you and many others in this industry think these arguments are in any way rational or defensible puts our industry to shame.
I freelance supporting legacy software. I wrote that already. There’s no culture in my company, just me.
There’s a difference between an explanation and an excuse, and between counterexamples and “hand waving.” I’m sure it makes you feel superior to dismiss opinions and comments with vague references to logical fallacies or indefensible arguments, but just hauling out big words doesn’t make you right, or even make any sense.
I can’t fix everything wrong with software development. I’ve been doing it for 40 years and we just keep making the same mistakes. My small contribution is fixing broken code one customer at a time, at least leaving the campsite cleaner than I found it. I don’t lose a lot of sleep over our collective failure to write perfect software.
Of course this is a lot of work but it means that unless PHP takes security seriously, no one will take PHP seriously and the language will die off / be relegated to dirt cheap contractor work.
No serious org is going to use a product where you have to remember that the sql escape function doesn't work and you have to use the one that says real sql escape.
This is a canard, really. The PDO library, which is a core PHP module, has SQL injection mitigation built-in (with escaped parameter substitution). It was introduced with PHP 5 in 2004. The popular PHP frameworks such as Laravel and CodeIgniter also protect against SQL injection and XSS by default.
The MySQL escape functions are named the way they are because that's what they are called in the MySQL API, which PHP exposes pretty much verbatim. I don't see a lot of people using that interface in new PHP code (because Laravel and PDO), but it comes up on older code.
Again the problem is not obscure function names or that PHP makes it possible to shoot yourself in the foot. The problem is a whole lot of inexperienced programmers (and quite a few who should know better) not understanding the problem in the first place. If you don't know what SQL injection is or how it happens or how code can make it possible you aren't going to know how to protect against it. PHP does do it for you if you use PDO (more than 15 years old at this point), or any of the numerous other safe RDBMS libraries. This is like complaining that Honda makes shitty cars because some people put glass packs and spoilers on a Civic -- people use languages and tools wrong out of ignorance and inexperience.
I think it's clear that PHP has been taken seriously for some time, even if largely because of WordPress. It's not going to die off or get relegated to the language ghetto because it has some (obvious, well-known) flaws that serious programmers have known how to live with for literally decades. Regardless of what you think or see on Upwork, PHP contractors are not cheap. No one who can and will work on legacy code is cheap because most programmers won't even do that work if they can help it. Supporting legacy software, which includes improving and securing and upgrading it, is maybe the most lucrative and secure niche for programmers sitting there in plain sight.
That's not completely true. Over its evolution, PHP has removed some functions completely to provide better and more secure functionality, such as mysql_* in PHP 7.
PHP (and old languages in general) are full of ASCII-only English-centric assumptions. I think both functions are now considered deprecated since we have even more variations like mysqli_real_escape_string (or just use PDO with bound params).
At this point there's a ton of CI tools to check for injection and dangerous patterns, and serious companies have been using them for years/decades now, ranging from local options to online tools like Scrutinizer or Sonarqube. I'd wager even PHPCs would catch the copy/pasted ones.
To me the language or online examples is no excuse for SQL injections for a long time now.
It’s no excuse for professional developers working at “serious” companies. It is an excuse for the legions of amateur developers just trying to get something to work.
If Google and Amazon can’t find and hire enough developers imagine what that supply/demand and cost problem means for small companies. I have clients who have been trying to hire a f/t or p/t programmer for years. They can’t pay $100/hr for a simple web site. So they hire amateurs trying to get that experience needed to get a real job at a serious company.
Yes, they leave a trail of crappy code full of vulnerabilities and bugs. The only way to blame that on PHP is to criticize its low bar to entry, which is a good thing for beginners.
> Some of those are terrible, either very old or written by amateurs excited to show how they got something to work.
And the current bootcamp trend will only amplify that. Lambda has "instructors" that are students only 4 months ahead of the students they are teaching...
Anecdotally I see SQL injection vulnerabilities in about half the code I look at. It’s one type of problem among many other problems and vulnerabilities in code written by amateurs and often copy/pasted.
PHP programmers can find lots of resources online. Some of those are terrible, either very old or written by amateurs excited to show how they got something to work.
I have seen the same kind of thing with Java and Python, but the popularity of PHP means there’s a lot of junk info and examples online.
PHP has supported safe SQL and safe HTML for decades, but the programmer has to understand the problem and the solution.