Can you get StackOverflow to "sponsor" this idea? Not just for crypto (which you've done) - but for all PHP related answers?
Because the problem is there are many "old" accepted answers, with high upvotes, that will always come up as number 1 or 2 in google searches.
Given PHP has changed so much, many of those answers are outdated, use incorrect and insecure methods, and some are now just wrong. This is not just security - but a whole host of answers.
The "meta" StackOverflow rules will tell you to downvote the old answer and post a new one - but that is not practical - and will take years to take effect. Plus, many people simply read the first large upvoted answer, copy + paste the code, and move on.
edit: I guess it would be nice to be able to "flag" an accepted answer (not a question) as outdated, get 5x people with gold badges for that tag to accept it - and then the answer is highlighted as wrong/out of date (or even deleted). Something like that.
> Can you get StackOverflow to "sponsor" this idea?
Possibly. I wouldn't even know where to begin.
> Because the problem is there are many "old" accepted answers, with high upvotes, that will always come up as number 1 or 2 in google searches.
Yeah, that's my concern. I can definitely edit anything tagged [php], due to having a gold [php] badge, and I think I can edit anything because I have a reputation higher than 10,000.
The hardest problem for me, here, is identifying these old accepted answers with high upvotes.
IMO (as someone with 45k rep on StackOverflow, largely in PHP) - the best way is to modify the "flag" option on answers - and have an extra option called "out of date". If someone flags an answer as out of date, then anyone with a gold badge in that tag (i.e. PHP) - can review and accept or decline the flag.
If 5 gold people accept it - then the answer is formally highlighted as out of date, or even deleted.
edit: There probably should be another flag option - called "security" - where even if an answer "works" and is "in date" - people can flag it as insecure due to a better option. Think of all the stupid SQL injection answers. You can downvote it to hell - but sometimes they should just be flagged/deleted.
Ok, so it's not deleted but maybe it gets a banner with links on up to date stuff like in OP. Then, if they can't use that due to legacy software, they go forward with whatever the answer contains for what they're stuck with. That might help both types of users.
Ask the folks on meta.stackoverflow.com. Be prepared for some resistance -- the concept of calling out bad answers in any way more nuanced than downvotes or comments is not popular -- but your involvement with the greater PHP community may give you some extra leverage.
The other problem with StackOverflow is that even if you try to correct outdated and wrong answers that will not work anymore at all, the edits will often be refused by elitist assholes who think you don't have enough points to edit such popular answers.
I think it would be better to write a plugin for various PHP IDEs that contain fingerprints of bad code which is used to yell at the programmer if they are using these old, insecure code snippets. Heck, you could add a feature or note to the developer to down-vote the source of said code.
It would be a lot of work, but I think it's easier than trying to get authors of defunct blogs to take down 10 year old answers.
As a side-effect, this increases awareness among developers that you can't blindly accept what's on SO. If developers begin to lose trust in SO, then SO is going to be incentivized to do something about it, or go the way of Experts Exchange.
This is called static analysis and it's a very active field of research. The main pain is figuring out which analyses to enable in existing code bases, because if you have a hundred diagnostics from legacy code you don't want to deal with, you're going to miss the one or two new warnings in the code you just added.
For greenfield projects it can be a godsend, if you have the experience to know which warnings you should disable and the discipline to keep the entire code base warning-free. I've done this with StyleCop / FxCop in C#.
This could certainly be a starting off point for what I was imagining.
I was thinking that they could do analysis on code snippets that are copy-pasted into the editor. It should be straightforward to compare the contents to a fingerprint database and issue a warning when they find a match to online code examples that are marked as vulnerable.
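A rough sketch of how that fingerprint lookup might work, assuming a very crude normalization (strip all whitespace, lowercase, hash). The function names and the one-entry `$knownBad` database are hypothetical, made up for illustration; a real plugin would need token-level normalization so that renamed variables or changed string contents still match.

```php
<?php
/**
 * Reduce a snippet to a fingerprint that survives reformatting:
 * strip all whitespace, lowercase, then hash.
 * (Crude: this also alters whitespace inside string literals.)
 */
function snippetFingerprint(string $code): string
{
    $normalized = strtolower(preg_replace('/\s+/', '', $code));
    return hash('sha256', $normalized);
}

// Hypothetical database: fingerprint => advice shown to the developer.
$knownBad = [
    snippetFingerprint('mysql_query("SELECT * FROM users WHERE id = " . $_GET["id"]);')
        => 'Builds SQL from user input; use a prepared statement instead.',
];

// Returns the warning for a pasted snippet, or null if it is not in the database.
function checkPastedSnippet(string $pasted, array $knownBad): ?string
{
    return $knownBad[snippetFingerprint($pasted)] ?? null;
}

// The same snippet pasted with different spacing and a line break.
$pasted = "mysql_query( \"SELECT * FROM users WHERE id = \"\n    . \$_GET[\"id\"] );";
$warning = checkPastedSnippet($pasted, $knownBad);
```

Exact-hash matching only catches near-verbatim copies, which is arguably the copy + paste case being discussed here.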
If you ever type in mysql_query it should archive your project, encrypt it, and hold it for ransom until you apologize and promise to never touch that function ever again.
I've been using NetBeans recently and it has these very annoying and effective squiggles that warn you when you use deprecated stuff. It also tries to enforce some best practices by default.
Reputation management companies struggle to wipe "bad content" off of 3rd party sites.
Your goal is laudable, just not sure it's practical. Did you consider focusing on publishing and promoting net new content instead? Get traction that way, and that content will move up the Google serps, effectively moving the old/bad content down.
> Did you consider focusing on publishing and promoting net new content instead?
We've been doing that for years. We've hit diminishing returns because:
1. There is a lot of incumbent material in the same genre, most of which is 10+ years old, but people still link to it in droves.
2. We don't support the at-best-sketchy SEO industry that's sometimes arm-in-arm with adware.
If you read the 2018 guide this post mentions a few times, you'll notice that it, in turn, links to relevant blog posts (and open source libraries) spanning back to early 2015.
That's sort of what I'm asking people for, except presented as a choice.
I'll update the article to try to make it clear that "link to clearly superior content" is a viable alternative to rewriting said content, as long as the same goal is achieved.
Well, PHP has finally come full circle in the lifecycle. It's fully following in the footsteps of Perl. There was a big push and a lot of talk in the Perl community around 2010-2012 to find and remove old and poor quality Perl tutorials so people searching didn't end up with bad information. Part of this push included writing new and better tutorials and trying to get them placed higher in results for old tutorials that could not or would not be removed.
Since we all saw and remember how Perl regained its top spot as most popular programming language, I'm sure this will work out wonderfully. :/
Outside of HN, you'll find that PHP is by far the most popular server-side web application language. HN's kinda got a language hipster thing going where if you aren't using the latest language or framework then you're stale and old and irrelevant (headline: "Why you should be building your next ORM in Q#"; summary of thread: "I don't get why anyone uses anything else, my team of three has been very successful using Q# already").
The same thing happens about four times a year when COBOL is mentioned and someone realizes to their horror that the plastic card in their wallet works thanks to a language whose last major revision was in 1974.
Perl still powers a lot of stuff. It fell behind PHP because PHP outpaced it for growth in web development. Nothing has outpaced PHP's growth yet, no matter how badly trend-addicted programmers want it to.
This is the fallacy. Perl never stopped growing in popularity, it's just that it was suddenly eclipsed by other languages that came out of nowhere and grew well beyond what it ever achieved.
You're also falling victim to this same fallacy. Something did eclipse PHP.
The back-end is getting thinner, the front-end is where all the business logic happens these days. So yeah, you might have a PHP back-end, but you also have a mountain of view logic in JavaScript (TypeScript, CoffeeScript, etc.) that utterly buries it.
There's also Node.js which is catching on really fast because it fits that role of "JSON over REST/GraphQL provider" really well.
> Perl still powers a lot of stuff. It fell behind PHP because PHP outpaced it for growth in web development.
Oh, I'm well aware. I'm a full time Perl developer, have been for a long time, and love the language. It's just interesting to see history repeat itself.
I almost wrote a similar comment, but I figured I would be downvoted. I realize it sounds cynical, but why should this effort be made for PHP? I loved PHP back in the year 2000, and at that time it had advantages that no other language had, but those days are long gone. There are many better languages now. I wrote about this in my essay "PHP is obsolete" and there was a fairly good conversation about that essay on Hacker News:
It is still popular with newcomers, people that don't come from a structured CS background, etc. Because the barrier to entry is just so low. Meaning, it's a built in, working option on hosting providers, deceptively simple in the beginning, etc. Also, PHP still dominates the "host it yourself e-commerce" space, because of Magento, Opencart, and PrestaShop. Oh, and WordPress...it's some insane percentage of all sites. And lousy plugins are everywhere.
We could ignore that crowd, but guiding them down the right path is better for everyone.
Edit: I'm also not convinced that node.js, which is gaining ground with the same crowd, doesn't have similar issues. Footguns aren't unique to PHP.
The availability on hosting providers is a HUGE deal. Sure, you can write an application in Python, or Node, or Java, or whatever else is theoretically better... but good luck finding somewhere to host it that's cheap and doesn't require you to do all the sysadmin work yourself (i.e., not a VPS).
PHP may not be the best of all possible languages, but it's by far the most widely available.
I agree. And, as mentioned, these hosting providers will likely make node.js just as simple soon, and node will replace php as the whipping boy for flippant remarks. What's old is new again. This is more about providing newbies with good advice than it is about PHP.
Honestly, I don't see that happening soon, if ever. PHP was easy for hosting providers because it was easy to plug in to existing virtual hosting support in Apache and FTP servers. Node is more complicated; there's no obvious "right" way to handle many Node apps running on a single shared server.
There were several ways for PHP too, and still are, FWIW. The shared hosting providers eventually settled on the best compromise of price/performance/security.
What they provide now is better and more scalable than old school CGI. Most of the shared hosts are using LiteSpeed[1] as it squeezes as much as possible out of PHP in a shared environment. That vendor, and their low end shared host customers, aren't dumb. They will respond to market changes and make node.js a 1st class support item when it is clear that's where the money is.
For web development, PHP still has a key advantage that no other language can match: ease of deployment. Many developers woefully underestimate how important this is, particularly if non-technical users need to self-host your software.
Telling potential users to spin up a VPS or type some command line instructions to install code is too complicated for many users, even though it may seem trivially easy to a developer.
Actually, I think the current state of the art is significantly easier than PHP was, in that you just choose your language and the framework written in your language, start it and go to localhost and some port. It used to be that you had to have a web host, and then sure you could just drop PHP in place, but you still needed that host. Setting anything up yourself required twiddling with Apache and mod_php (or using a distro that shipped with packages that enabled it by default).
The whole "start up a little webserver in the same language" trend makes everything much easier to develop.
Where PHP still has an edge is deployment for non-development. If you just want to drop in some software and use it on the web (e.g. a bulletin board), PHP is still easier in many cases for novice users. But that's not development, it's just deployment.
I would even hazard that correct deployment of that software in a secure way is made harder by PHP's ease of deployment, as you have to rely on permissions in lieu of separate locations that have no remote read access, as many frameworks implement.
> The whole "start up a little webserver in the same language" trend makes everything much easier to develop.
I actually agree, but that's not the approach that's pushed. In python the "tiny http server" is explicitly advertised as insecure and unsuited to production use. In Java, embedding Jetty involves dozens of arcane lines and manually translating suggested XML files into code - not something a beginner can do. What other language than PHP can I dump a single file on a shared host and have it be running?
(Ironically Java is actually a legitimate answer here, if the shared host ran an application server like Tomcat and accepted .war uploads. But I don't think I've ever seen a shared host that offered that?)
> I actually agree, but that's not the approach that's pushed. In python the "tiny http server" is explicitly advertised as insecure and unsuited to production use.
Sure, but we're talking about developing, and that's a pretty easy way to get started.
In Python, don't you have Gunicorn, Tornado and Twisted?
In Perl there's Mojolicious and Dancer.
Ruby has Rails, Sinatra and probably at least a few more.
Java... Well, IMO Java has a habit of either not delivering simple smaller projects, or communicating they exist very poorly (but I'm not really in the loop so my opinion may be very uninformed). Tomcat was too big and complex to fit this need a decade ago, so I doubt it does it well now (but maybe I'm wrong?).
> What other language than PHP can I dump a single file on a shared host and have it be running?
My point is that needing a shared host to get started is actually a step you often don't have to wait for now. If you're a developer, it's easier to just start up your dev locally and work out hosting in a bit.
If you're developing something beyond the simplest things, the host allowing drop in scripts is not actually a problem I think. For anything that's super quick and dirty, you still have CGI.
That's the point I was making. The drop-in script support isn't really all that beneficial to PHP developers, but it is very beneficial to regular users who just want to deploy a PHP webapp of some sort, such as a bulletin board or blog software.
> Sure, but we're talking about developing, and that's a pretty easy way to get started.
> My point is that needing a shared host to get started is actually a step you often don't have to wait for now. If you're a developer, it's easier to just start up your dev locally and work out hosting in a bit.
I have to disagree. It's silly to try to develop something you can't actually run. Getting your software in front of actual users is the critical path; everything else is secondary. (And exposing your dev laptop as a web host on the internet is pretty complex in practice for most people, quite aside from whether it's a good idea)
> For anything that's super quick and dirty, you still have CGI.
Do any of the frameworks you listed (Gunicorn or Mojolicious or Rails or...) support running as CGI? Is it documented in their tutorials?
> I have to disagree. It's silly to try to develop something you can't actually run. ... (And exposing your dev laptop as a web host on the internet is pretty complex in practice for most people, quite aside from whether it's a good idea)
It's trivial to run. Get a minimal AWS instance, or a small digitalocean instance, or whatever. Hosting is not a problem these days. Godaddy is offering managed (they patch and provide support) VPS instances for $18/mo right now. Surely not as cheap as some dedicated host, but also more secure and if you're actually spending your time developing something, not a lot to pay.
Nobody is saying expose your laptop to the internet. But getting started isn't about getting someone to look at your stuff, it's about actually getting started and getting some code written.
> Do any of the frameworks you listed (Gunicorn or Mojolicious or Rails or...) support running as CGI? Is it documented in their tutorials?
When you control the webserver, running a subprogram as CGI (which is what I was referring to) is trivial, and can be handled in a few lines of code.
If you're talking about whether the framework has instructions on how to run as a CGI, I'm going to go out on a limb and say yes, they all do because that's step one in telling people how to run your stuff when making a framework, where it makes sense. In some cases it's just a WSGI/PSGI etc server and you can interchange them. It may not make sense to run twisted as a CGI, but if you write WSGI you can just use django, which does support deployment as a CGI. Mojolicious and Dancer support it. I'm pretty sure Rails and Sinatra do in some way too.
> It's trivial to run. Get a minimal AWS instance, or a small digitalocean instance, or whatever. Hosting is not a problem these days. Godaddy is offering managed (they patch and provide support) VPS instances for $18/mo right now. Surely not as cheap as some dedicated host, but also more secure and if you're actually spending your time developing something, not a lot to pay.
These trivial things matter when you're getting off the ground. Dropping money on that idea you were playing with in your evenings is a big psychological step. A couple of hours faffing with server admin can be even more offputting.
> Nobody is saying expose your laptop to the internet. But getting started isn't about getting someone to look at your stuff, it's about actually getting started and getting some code written.
Disagree. The goal isn't to write some code, the goal is to get a viable business, and these days business concerns - the famous product-market fit - are a much bigger risk factor than technical ones. A mostly-static page with a simple text form that can start getting you actual customers puts you much closer to that than any amount of code running on your dev laptop. (It's different when technical innovation is at the core of the business - when the actual code is the biggest risk factor - but in that case you're probably not doing web stuff and probably not using any of these languages).
> If you're talking about whether the framework has instructions on how to run as a CGI, I'm going to go out on a limb and say yes, they all do because that's step one in telling people how to run your stuff when making a framework, where it makes sense. In some cases it's just a WSGI/PSGI etc server and you can interchange them. It may not make sense to run twisted as a CGI, but if you write WSGI you can just use django, which does support deployment as a CGI.
It's never "just" though; these differences are important. Every extra step beyond "upload this file to the host" probably cuts the number of projects that get off the ground in half. And a lot of these frameworks will tell you how to "run" them in a one-off way but completely gloss over how to set them up so that they'll still be running after you restart the server you're hosting on.
I just had a quick look through the django tutorials to check I wasn't talking nonsense and it's even worse than I thought. The "installing" page gestures vaguely in the direction of virtualenv (which is how people actually use django in practice) but it's not integrated in the tutorial at all. The 7-page "beginner tutorial" only ever uses the officially-not-secure-enough-for-production local development server. I was excited by the link to an "advanced tutorial about creating reusable apps", but turns out that's just telling you how to package up your webapp so that other people can run it locally with the development server. Nothing about WSGI, not one word about how to actually put your "webapps" on the web. I don't mean this as an attack on django specifically - I like django, and its documentation and tutorials are better than a lot of options - but no wonder people keep using PHP.
>sure you could just drop PHP in place, but you still needed that host. Setting anything up yourself required twiddling with Apache and mod_php (or using a distro that shipped with packages that enabled it by default).
PHP has had a built-in web server since 5.4. That's like 5-6 years ago.
The figures on that page don’t mean that 83% of developers use PHP... Just that some things written in PHP are very popular. I’m sure that most of that figure is packages like WordPress.
Because people get into programming via PHP via writing extensions for popular PHP software (WordPress) because that's what they can get paid to do as an entry level dev. It would be prudent to have them get into it in a way that doesn't cause exploits.
Sure everyone and their dog may one day get into programming via Javascript via Ghost or something else along that line, but that's just not how it is right now. How it is right now is people writing HTML forms that post into raw SQL queries - you can make rookie mistakes like that in any language, no matter which one happens to be the top dog of the moment.
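To make the "forms that post into raw SQL queries" mistake concrete, here is a minimal sketch of the usual fix, a PDO prepared statement. The table, the column names and the `$input` value standing in for a form field are assumptions made for illustration, and the in-memory SQLite database is only there so the snippet is self-contained (it needs the pdo_sqlite extension).

```php
<?php
// In-memory SQLite so the sketch runs on its own.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice'), ('bob')");

// Stand-in for a hostile form field such as $_POST['name'].
$input = "nobody' OR '1'='1";

// Rookie version (do not use): interpolating $input into
// "SELECT ... WHERE name = '$input'" would turn the OR clause into SQL
// and match every row.

// Prepared statement: the value is bound as data and never parsed as SQL.
$stmt = $pdo->prepare('SELECT id, name FROM users WHERE name = :name');
$stmt->execute([':name' => $input]);
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC); // no user is literally named that

// A legitimate lookup goes through the same statement.
$stmt->execute([':name' => 'alice']);
$alice = $stmt->fetchAll(PDO::FETCH_ASSOC);
```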
I also have to note that on a skim a lot of the comments you linked to can be distilled to "No it isn't"
> we can raze the mountains of collective technical debt that have accumulated over the past decade.
Not likely when professionally written code is full of errors, many security-related (e.g. attempting to load and execute .gifs). I recently presented a compilation[1] of the errors produced by one widely-used commercial ad function. If this is how professionals write commercial code, I don't want to imagine what amateurs have been doing.
An admirable goal. One of my pet peeves with lots of PHP code and extensions is the unending desire to make every bit of code backwards compatible with some of the oldest versions of PHP.
There are some great, new cryptographic functions that have been implemented in PHP >= 5.5
password_hash() and password_verify() are so simple, it's hard to mess up password hashing now. When I upgraded my projects to PHP 7, I replaced dozens of lines of code with those 2 functions alone.
But I have seen plenty of implementations of them that still fall back to older, more convoluted and error-prone methods when you are using some old version of PHP.
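For anyone who hasn't used them, a minimal sketch of those two functions (plus `password_needs_rehash()`, which is part of the same API). The hard-coded password is only for illustration; in practice it would come from the registration or login form, and `$hash` would be stored in and loaded from the user table.

```php
<?php
// At registration: hash with the current default algorithm.
// A random salt is generated automatically and embedded in the result.
$hash = password_hash('correct horse battery staple', PASSWORD_DEFAULT);

// At login: check the submitted password against the stored hash.
$ok  = password_verify('correct horse battery staple', $hash);
$bad = password_verify('wrong guess', $hash);

// Optional: transparently upgrade stored hashes when the default changes.
if ($ok && password_needs_rehash($hash, PASSWORD_DEFAULT)) {
    $hash = password_hash('correct horse battery staple', PASSWORD_DEFAULT);
}
```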
The universe of developers with PHP projects to maintain is filled with people who do not share the same goals and constraints that you might enjoy.
There is an enormous amount of lousy PHP code, and more being made by clueless developers. But please do not dismiss the need to support old PHPs as being driven primarily by those reasons.
1. The PHP project itself has EOLed 5.3.3, however distros continue to support it with backported security and bug patches of their own.
2. PHP 5.3.3 (with backported security and bug patches) remains the default in CentOS 6.9, which is supported until 2020. More recent versions are not available via their repositories. Hosts would have to upgrade PHP outside of the CentOS packages and assume the maintenance burden from then on.
3. About 50% of sites running PHP are at versions less than 5.5:
4. Updating PHP is not like simply updating my Web browser. Real-world production hosts like mine are filled with various work by various developers over years. Bumping PHP further than a maintenance release would almost certainly mean unnecessarily breaking things that are hard to find, probably tricky to fix and written by people I’ve never met who are long gone.
5.3??? 5.3 has been EOL for 3.5 years now. I do not envy the person that has to maintain backports for so many years. In fact, I'm not even sure all the backports would always work properly...
I was referring to new code and code packages that are available out there.
It sounds like you are talking about providing hosting for older versions of PHP code. Which I of course understand is a necessary thing. You can't force people to upgrade their codebase. They'll just find another host.
Yeah, it's... ugh. I've got to deal with a mission-critical PHP 5.3 app. It goes down, and a subsidiary of around 700 people can't do their sales, control production flow, or create their invoices.
But they'll finally listen to engineering... any year now...
Please see point #2 if you're still confident that 5.3 to 5.5 should give a host like mine no trouble:
"PHP 5.3.3 (with backported security and bug patches) remains the default in CentOS 6.9, which is supported until 2020. More recent versions are not available via their repositories. Hosts would have to upgrade PHP outside of the CentOS packages and assume the maintenance burden from then on."
The point isn't that you can't or shouldn't do it, but that it is much more trouble than you might think for many people. There are implications that apply to others that you might not see or care about for your particular situation.
If your distro's package maintainer is asleep at the wheel then you'll have to take matters into your own hands and source install.
The Node team has done a great job here, they have a number of ways of doing managed source installs that avoid a lot of the ugly hassles you can usually hit. Ruby has rvm which also handles this quite well. This helps work around any friction you might get at the distribution level.
If PHP is being held back by distribution maintainers then that's a problem that the PHP community should fix. 5.3 came out in 2009, it's ancient.
This kind of breezy dismissal is really frustrating. Developers and admins in the field are dealing with this situation for the reasons I gave above, and not out of simple ignorance.
The biggest problem in the PHP world is there's a wealth of functions and software that people don't use.
Sometimes it's for lack of knowledge (PEAR? Composer? What's that?) but all too frequently it's outright hostility to the very idea, like if it isn't PHP core it's not even worth considering.
`password_hash` and `password_verify` have been around for ages now and if your PHP is old enough that it doesn't support it, that PHP version is no longer actively supported and is probably riddled with unpatched security holes, so good luck with that!
1) Sign (HMAC) all the session ids that your server issues. This allows requests with bogus session ids to be rejected at the network borders without doing any I/O or hitting the database.
2) Use web crypto (now supported by all major browsers) to have clients generate a private key with which to sign all requests. Using session keys as bearer tokens opens users up to attacks.
3) Do NOT send passwords to the server! Use passwords to decrypt the private key, and do not expose the decrypted key to external Javascript. And make sure the key can't be exported.
4) Clients authenticate new devices using two-factor authentication. If the person is using a previously authorized device and knows their password this may be considered two factors already. Unless the password was saved, in which case they better have a password on their device. Ultimately you gotta trust the OS.
5) Authentication and authorization for data may be done automatically by a side-channel (QR code via camera, or bluetooth) with the proof submitted to the server by either device. Revocation ultimately needs a blockchain.
6) If you lost all your authorized devices, the backup should be: M of N public keys, plus a passphrase you know. This is only for rare cases and the passphrase can be weak.
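Point 1 in the list above can be sketched roughly as follows. This illustrates only the HMAC-signing idea; the function names and the `id.mac` token layout are made up for the example, and `$serverKey` is assumed to come from server-side configuration in a real deployment rather than being generated on each run.

```php
<?php
// Assumption: in production this key is loaded from config, not regenerated.
$serverKey = random_bytes(32);

// Issue "<random id>.<HMAC of id>" as the session token.
function issueSessionToken(string $key): string
{
    $id  = bin2hex(random_bytes(32));
    $mac = hash_hmac('sha256', $id, $key);
    return $id . '.' . $mac;
}

// Return the session id if the signature checks out, otherwise null.
// No database or disk access is needed to reject a bogus token.
function verifySessionToken(string $token, string $key): ?string
{
    $parts = explode('.', $token, 2);
    if (count($parts) !== 2) {
        return null;
    }
    [$id, $mac] = $parts;
    $expected = hash_hmac('sha256', $id, $key);
    // hash_equals() compares in constant time to avoid timing leaks.
    return hash_equals($expected, $mac) ? $id : null;
}

$token  = issueSessionToken($serverKey);
$valid  = verifySessionToken($token, $serverKey);
$forged = verifySessionToken(str_repeat('a', 64) . '.' . str_repeat('b', 64), $serverKey);
```

The signature only lets the server cheaply reject IDs it never issued; it does nothing about a stolen valid token.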
I haven't seen this particular article, but having read it, I can tell you: there are different threat models. You can't be secure against them all on the Web. Of course we have to trust the server to deliver the initial code. The same is true with apps delivered via the appstore etc. But that doesn't mean you should be sending password hashes to the server and storing them there, even if generated by PHP. It doesn't mean you should be using the session cookie alone as a bearer token to access a session.
First of all, you have to agree that a security requirement in addition to a cookie doesn't make things less secure.
Secondly, with Web Crypto the Web has a way to mark keys "non exportable". If the website is sending you the wrong resources then of course anything can be sent, and web-based code isn't the ultimate way to protect the user. The same is true of other approaches. However if the initial code download wasn't tampered with, then you are far more protected. Because the secret private key won't be exported from the browser website. And it won't be accessible to anyone outside the JS environment that asks for your password or finger to derive a key to decrypt the master key from the local database. And in that JS environment, you can make sure (via closures) that no one gets access to it in "userland".
OH AND YOU SHOULD ALSO BE USING A PRIVATE KEY PER USER TO ENCRYPT DATA AT REST ON YOUR DATABASE, AND STORE THIS KEY IN THE DB MULTIPLE TIMES - EACH ONE ENCRYPTED BY THE USER'S DEVICE KEY. You don't store these device keys. Successful authentication requests from the device send this key. So once again you need to obtain this key in order to unlock user's info needed for the request. And users can send permissions to unlock their information to each other in sidechannels. You can take this security VERY far...
So it's strictly more secure than the server side database for passwords, even hashed and salted with key strengthening. BUT, don't advertise it because then it introduces security attacks where people over-rely on this to the detriment of the vectors mentioned in the article.
PS: Oh. This was written in 2011, before the Web Crypto standard I am referring to was published and adopted by all web browsers. I do NOT recommend doing the crypto methods in JS! And yes it has a secure RNG now.
> It doesn't mean you should be using the session cookie alone as a bearer token to access a session.
Well, maybe.
A random ID that just tells PHP where to look for the session data, with all the data persisted server-side, is secure as long as it's transferred over HTTPS. Most frameworks/libraries abstract the implementation details away, but generally:
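<?php
// Assumes the paragonie/constant_time_encoding package is autoloaded (e.g. via Composer).
use ParagonIE\ConstantTime\Base32;
// 32 bytes from the CSPRNG, Base32-encoded into a session identifier.
$random = Base32::encode(random_bytes(32));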
This value will be unpredictable and doesn't require an HMAC to ensure this property. The only time the HMAC adds value is if you're using the user's computer as a data mule for the entirety of session state rather than "look up this identifier in a database".
In this genre, we're working on PAST (although this is probably going to be renamed before it's finalized) to solve the cryptography flaws baked into the JWT standards (collectively, JOSE): https://github.com/paragonie/past
That doesn't solve the "replay attack" issue (which may be what you were referring to with bearer tokens).
> OH AND YOU SHOULD ALSO BE USING A PRIVATE KEY PER USER TO ENCRYPT DATA AT REST ON YOUR DATABASE, AND STORE THIS KEY IN THE DB MULTIPLE TIMES - EACH ONE ENCRYPTED BY THE USER'S DEVICE KEY.
I'm not entirely sure what you're getting at here. If I needed to share a key across devices, I'd either use Diffie-Hellman or Shamir Secret Sharing to accomplish the task (depending on use-case and threat model).
Perhaps there's a lot of implementation detail that's not being discussed here that I'm missing and what you're saying is a conservative local maximum, but it stuck out a tad bit.
Perhaps. What I mean is, the session id as a bearer token for example can still be insecure even if sent via https. For example, if PHP scripts use $_REQUEST and session_start() uses the id sent in $_GET from GPC. So you can have session fixation attacks. However, if you additionally require devices to sign all requests with their private key (stored in IndexedDB for the domain and not exportable) then Web Crypto can help mitigate many classes of attacks, including session fixation and CSRF. (It can even increase security over http without https, even though it's an academic point.)
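For the session fixation part specifically, the conventional server-side mitigation looks roughly like the sketch below. Note that this is the standard PHP session hardening, not the Web Crypto request signing proposed above; the two function names are hypothetical wrappers, while the `session.*` settings are PHP's own.

```php
<?php
// Call once per request, before any output is sent.
function startHardenedSession(): void
{
    // Reject session IDs the server did not issue itself.
    ini_set('session.use_strict_mode', '1');
    // Take the ID from the cookie only, never from the URL.
    ini_set('session.use_only_cookies', '1');
    ini_set('session.use_trans_sid', '0');
    // Keep the cookie away from JavaScript and plain HTTP.
    ini_set('session.cookie_httponly', '1');
    ini_set('session.cookie_secure', '1');
    session_start();
}

// Call at the moment the user's privileges change (e.g. after login).
function onSuccessfulLogin(int $userId): void
{
    // Swap to a fresh ID and delete the old session, so an ID planted
    // by an attacker before login is useless afterwards.
    session_regenerate_id(true);
    $_SESSION['user_id'] = $userId;
}
```

With `use_only_cookies` on, an ID supplied in `$_GET` is never picked up by `session_start()`, which covers the GPC scenario described above.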
Today we wrote up an article about this, actually, referencing your guide:
> Today we wrote up an article about this, actually, referencing your guide:
Neat. I'll have to give that a read in the morning.
Although, it looks like one of your links in the opening paragraph under the "Web Security in 2018" header is broken, and presumably that was the one meant to link to our guide.
I disagree with their suggestion to comment out or remove old PHP code. People need to have a date and timestamp at the top of their articles AND state what version of PHP they are coding for.
Information on older versions should not be pruned, as there are legitimate reasons to keep it. Not everyone is using PHP 7; what do you do when you inherit an older code base; what do you do when you just need to modify a few things on an older code base; what if you are trying to learn security-based programming; what if you are trying to learn how to break into systems (to then learn how to protect them); what if PHP 7 is not available or feasible for your project; etc.
Hi Scott, thanks for all the great efforts. This is a good start and baseline, but security is so much harder once you get past these basics. Once you get into the realm of timing attacks and all the ways to fuck up password recovery, for example, it becomes a minefield.
What can we do about that, short of “use framework X or Y” as they are the only ones peer reviewed?
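As one small illustration of the password recovery point, here is a sketch of a reset token that avoids the usual timing leak. The function names are hypothetical, and expiry, single use, and rate limiting are deliberately left out:

```php
<?php
declare(strict_types=1);

// Hypothetical helper: create a reset token. Only the hash is stored
// server-side; the raw token goes into the emailed link.
function issue_reset_token(): array
{
    $token = bin2hex(random_bytes(32));
    return [
        'token'       => $token,
        'stored_hash' => hash('sha256', $token),
    ];
}

// Hypothetical helper: check a presented token against the stored hash.
function reset_token_matches(string $presented, string $storedHash): bool
{
    // hash_equals() runs in constant time, unlike == or ===.
    return hash_equals($storedHash, hash('sha256', $presented));
}

$issued = issue_reset_token();
$ok = reset_token_matches($issued['token'], $issued['stored_hash']);
```

Storing only the hash also means a leaked database dump does not hand out usable reset links.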
I've been a PHP developer for over ten years. Paragon Initiative is amazing. Their Github repos are full of useful things: https://github.com/paragonie
An overdue step would be an interface to execve that isn't based on a single string, but uses actual lists.
I recall the Python docs have a big red warning box saying that you can enable shell-style single-string mode instead of a list, but that it's highly discouraged due to security problems. PHP has about 5 ways to execute programs, but all of them enforce this insecure interface.
Many of the PHP issues I see cited aren't unique to PHP, but more unique to its heavily neophyte-centric user base.
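For what it's worth, as of PHP 7.4 proc_open() also accepts the command as an array, in which case no shell is involved. A minimal sketch, assuming PHP 7.4 or newer running from the CLI; run_argv() is a hypothetical wrapper, and reading the pipes one after another like this is only safe for small outputs:

```php
<?php
declare(strict_types=1);

// Hypothetical wrapper: run a program from an argument list, no shell involved.
function run_argv(array $argv): array
{
    $spec = [1 => ['pipe', 'w'], 2 => ['pipe', 'w']];
    $proc = proc_open($argv, $spec, $pipes); // array form needs PHP 7.4+
    if (!is_resource($proc)) {
        throw new RuntimeException('Could not start process');
    }
    $stdout = stream_get_contents($pipes[1]);
    $stderr = stream_get_contents($pipes[2]);
    fclose($pipes[1]);
    fclose($pipes[2]);
    $exit = proc_close($proc);
    return ['exit' => $exit, 'stdout' => $stdout, 'stderr' => $stderr];
}

// The hostile string is passed as one argument and never parsed by a shell.
$hostile = 'hello; echo INJECTED';
$result = run_argv([PHP_BINARY, '-r', 'echo $argv[1];', $hostile]);
```

The child process here is just the PHP binary echoing its first argument back, so the example does not depend on any particular system command being installed.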
For example...there's nothing about current day PHP that makes it more susceptible to SQL injection than any other dynamically typed language with easy string interpolation. And the official PHP docs do guide you down the right path for SQL.
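The "right path" here is a prepared statement. A minimal sketch using PDO with an in-memory SQLite database; this assumes the pdo_sqlite extension is available, and the table and the find_users_by_name() helper are invented for illustration:

```php
<?php
declare(strict_types=1);

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice'), ('bob')");

// Hypothetical helper: the user-supplied value is bound as a parameter,
// never interpolated into the SQL string.
function find_users_by_name(PDO $pdo, string $name): array
{
    $stmt = $pdo->prepare('SELECT id, name FROM users WHERE name = :name');
    $stmt->execute([':name' => $name]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// A classic injection payload is treated as a literal name and matches nothing.
$hostile = "' OR '1'='1";
$rows = find_users_by_name($pdo, $hostile);
```

The same query built with string interpolation would have returned every row for that input.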
Hmm, I distinctly remember seeing a talk from the Chaos Computer Club Congress about how the PHP language is fundamentally broken in terms of security in comparison to other scripting languages. Edit: Apologies it was actually Perl (title "The Perl Jam"). In any event well worth watching if you have the time!
I remember that. That is not a good talk. That the software had those errors sucks, but it's not a fundamental problem with Perl; the problem was in how people had designed their internal APIs. In some cases he was citing problems in a module (CGI) that was in core, but had long been noted to have problems and was not considered acceptable to use in anything except a quick and dirty script (and has since gone through a long deprecation cycle and finally been removed).
What the talk really exposed was a few similar bugs found in various projects (which is good!), and exposed a fundamental misunderstanding of a language by the researcher due to unfamiliarity. This was all covered in depth here at the time.[1]
Golang community, please take the history of the PHP community as a series of models for 1) what to do and 2) what not to do! Also do this for Java, Ruby, Javascript, LISP, C++, and Smalltalk!
And if you think the history of PHP isn't applicable, then go and find the Golang library authors who are advocating filtering as a defense against SQL injection!
What's needed here is a "naming and shaming" effort. Make a public directory of bad PHP tutorials/references/etc., with the names of the people and companies who wrote and host them prominently attached. Maybe even give them a score, based on how frequently cited/linked to their bad advice has become, or how many pieces of it they've proffered. Then only take them off the list when the documents are cleaned up or removed.
If you're a professional PHP developer or a company that builds on PHP, it would be very embarrassing to find yourself prominently featured on such a list. Which would create an incentive for those people to clean up their work so they can get off it.
As things stand currently, publishing outdated and dangerous information costs the publisher nothing, so they see no reason to stop doing it. Create a cost by attaching reputational damage to the act, and you create a reason for them to stop.
This plan would fail, because someone sophisticated enough to find the naming and shaming list of awfulness would probably already be sophisticated enough to realize that old, bad advice is bad and why it is, and where to find new, good information. The problem is not merely that bad old content exists, but there is a class of PHP developer who is insufficiently sophisticated to distinguish between good and bad information independently.
Whether that's a feature or a bug of PHP is an exercise I leave for the reader.
That's certainly a novel approach to solving the problem.
I'm hesitant to do this myself, because it might open us to legal action, and I don't have the emotional bandwidth or cash reserves to fight a lawsuit right now.
YouTube is littered with awful, preposterously bad tutorials. If there's one thing that'd fix PHP it'd be for the community to put out better, more prominent, higher quality material than the toxic dreck that's out there.
This isn't to say that there aren't good YouTube videos, but for each one of these there's easily a hundred where people with no clue are explaining PHP as if they know everything.
If that list becomes popular, search engines will use it as validation that the linked articles are good, so the bad tutorials will get better SEO and even more newcomers will find and use them. The list can even be an incentive to write bad articles, since you get free references from this site, which gives scammy authors more visitors and more income from ads.