
Cool research. I like how you "connect the dots" from benign-looking MySQL behaviour to the bad code in Wordpress. This reminds me of http://www.suspekt.org/2008/08/18/mysql-and-sql-column-trunc....

I'm surprised that the fix in Wordpress wasn't explicitly marking fields that need to be serialized/unserialized, instead of second-guessing based on the broken promise by MySQL.

> MySQL replaces characters it doesn’t recognize (for the given character set), with a placeholder. MySQL will sometimes replace byte sequences with “?” or “�” (U+FFFD). Such replacements would not be harmful.

This is so wrong. A database must never change any data that it's asked to store. Wordpress and other applications always make that assumption, and when it no longer holds, all hell breaks loose.

PS: it blows my mind that it looks like strpos in PHP could return either boolean or integer [1].

[1] http://core.trac.wordpress.org/browser/tags/3.6.1/wp-include...




Would you rather it return -1 or something else? You are going to need a comparison in any case, because a match can sit at offset 0. The return type doesn't matter as much because PHP has dynamic typing. Pretend it is Option[Int] if you will.
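To make the Option[Int] comparison concrete, here is a minimal sketch in Python rather than PHP. The helper name `find_offset` is made up for illustration. It wraps `str.find`, which uses the -1 convention, and shows that a match at offset 0 forces an explicit comparison whichever sentinel you pick.

```python
from typing import Optional


def find_offset(haystack: str, needle: str) -> Optional[int]:
    """Hypothetical helper: offset of needle in haystack, or None if absent."""
    index = haystack.find(needle)  # str.find returns -1 when nothing matches
    return None if index == -1 else index


offset = find_offset("a:1:{}", "a:")

# A match at offset 0 is falsy, so a bare truthiness check misreports it.
naive_found = bool(offset)

# Comparing against the sentinel explicitly gives the right answer.
strict_found = offset is not None
```

The same trap exists with -1: the caller still has to write `index != -1` instead of relying on truthiness.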

I would also like to use this moment to go on a tangent with my unwilling audience: seizing every remote opportunity to badmouth PHP, or any other language for that matter, just for its standard library or for trying not to break legacy code, is bad form. Sure, the language syntax itself may suck, the semantics may suck, and I am always up for a good PL flamewar. But if you want to bash the library, don't blame the language itself for the library's poor choices.

cryptbe, I would like to apologise in advance and humbly request you to not take this personally.


I'm always sort of flabbergasted when I see PHP programmers doing this "maybe_xyz" stuff. I recall there's a PHP API for escaping stuff that has weird options for either allowing "double escaping" or ignoring successive invocations. It screams amateur hour to say "uh, I have a string, and I don't know if it's escaped yet, so I'll just call this API that escapes it because it magically avoids 'double escaping' for me." There's no such thing as "double escaping" -- it's just "escaping". The fact that you might be escaping something that appears to be an already-escaped string is irrelevant. If you are dealing with user input strings and you don't know for sure whether a string is escaped or not (or how many times), you are probably writing a security hole somewhere.
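A small sketch of why guessing breaks, using Python's standard `html` module instead of the PHP API mentioned above. `escape_once_guess` is a hypothetical helper written only to imitate the "escape unless it already looks escaped" idea. It is not any library's real implementation.

```python
import html

raw = "Tom & Jerry <b>"
once = html.escape(raw)    # "Tom &amp; Jerry &lt;b&gt;"
twice = html.escape(once)  # escaping again is just escaping: "&amp;amp; ..."


def escape_once_guess(text: str) -> str:
    """Hypothetical: skip escaping when the text already looks escaped."""
    looks_escaped = html.unescape(text) != text
    return text if looks_escaped else html.escape(text)


# A user who literally types "&amp;" expects to see "&amp;" on the page.
literal = "&amp;"
guessed = escape_once_guess(literal)  # left untouched, so a browser shows "&"
correct = html.escape(literal)        # "&amp;amp;", which renders as "&amp;"
```

Each `html.unescape` undoes exactly one `html.escape`, so the only reliable approach is to know how many times a string has been escaped, not to inspect it.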


While I totally agree, this is in no way specific to PHP.

Ruby on Rails has such ugliness too, a view helper called "escape_once": http://api.rubyonrails.org/classes/ActionView/Helpers/TagHel...

What's crazy is that I can't even find an "escape" helper. Oh, it's called html_escape. Oh, and there is an html_escape_once too!

Python Django too: https://docs.djangoproject.com/en/dev/ref/utils/#django.util...


Fair enough. It's funny how angry I get when I think of someone needing an "escape_once" function or "is_serialized". I think this discussion might have to become part of my interview process, because if someone doesn't understand the absolute undeniable terribleness of trying to determine if a string has been escaped or serialized by inspecting its contents, then I really don't want them in my code.
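For anyone who wants that interview question spelled out, here is a rough sketch. It uses JSON in Python as a stand-in for PHP's serialize format, and every function name in it is hypothetical. The first half sniffs the content. The second half records explicitly whether a value was serialized.

```python
import json


def looks_serialized(value: str) -> bool:
    """Hypothetical content-sniffing check, in the spirit of is_serialized."""
    try:
        json.loads(value)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


def maybe_deserialize(value: str):
    """Guess from the contents whether the value needs decoding."""
    return json.loads(value) if looks_serialized(value) else value


# A plain string typed by a user that happens to look like serialized data.
user_text = '{"admin": true}'
guessed = maybe_deserialize(user_text)  # comes back as a dict, not the string


def store(value):
    """Record how the value was encoded instead of guessing later."""
    if isinstance(value, str):
        return ("raw", value)
    return ("json", json.dumps(value))


def load(record):
    kind, payload = record
    return json.loads(payload) if kind == "json" else payload


round_tripped = load(store(user_text))  # the original string, unchanged
```

With the explicit tag, a string that looks like serialized data is never reinterpreted, because the decision no longer depends on bytes the user controls.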


A lot of PHP functions are like this because 0 evaluates to false. They need to return boolean false so you can use a strict comparison to distinguish failure from a result of 0.



