Hacker News new | past | comments | ask | show | jobs | submit login
Bugzilla CVE-2015-4499: All Your Bugs Are Belong to Us (perimeterx.com)
123 points by Chris911 on Sept 18, 2015 | hide | past | favorite | 35 comments



> Will the DB raise an exception? Will it crash? No – It automatically truncates the data so it fits the column size.

If you write software, never be "helpful" and try to fix problems with input data you can't handle. You have no idea whether the fix you have in mind is going to be appropriate in all situations and there is no guarantee that your users are going to acknowledge the warning you might (or might not) issue in that case (case in point: MySQL does fire warnings, but many libraries don't even expose them to the user, so even if they wanted to check for warnings after every statement, they couldn't)

"Fixing" broken input data is nothing but actually corrupting data because after the fact there is no way to know what the original was, nor is there a way to know whether the "fixed" data was actually fixed or whether it has been that way to begin with.

If you are in a position where you can't do something based on the input data, then blow up. That way you help users to avoid ever having to deal with corrupted data (or all their private bugs exposed to the public like what happened here)


> If you write software, never be "helpful" and try to fix problems with input data you can't handle.

Yep. Accepting bad data in any way is a bug, not a feature, and it isn't always (I'd suggest it isn't commonly) intentional: for instance mySQL used to accept dates like 2015-02-30 simply because its internal date format permitted it rather than because they were explicitly allowing it - it was an accident not an attempt to be helpful.

Accidental bugs that accept bad data are often indistinguishable from intentional "help", and can get passed up through a stack of calls (App A uses library B that wraps library C which calls service D that uses library E...) so even if the help is very well documented at source that knowledge is usually lost as you move up the chain and the application dev is left with a library that accepts bad data unbeknownst to him/her which could end up masking a serious bug that will bite them later.

> "Fixing" broken input data is nothing but actually corrupting

Exactly. The old "be strict in what you send, but generous in what you receive" is often misinterpreted - you should only be generous in what you receive if everything you receive can be properly handled all the way into your stack and all the way back out again, and never be generous enough to accept bad data. Back to the date example: yes, accept any unambiguous date format the user might throw at you, but never accept an invalid date.

> If you are in a position where you can't do something based on the input data, then blow up.

Definitely. If something could fail it should fail as early as possible, and not quietly in development (caveat: failures should not be outwardly verbose in production for security reasons), that way you are more likely to catch problems early, before they compound into something worse.


I don't think it is quite that clear cut. See, for example, Postel's Law AKA the Robustness Principle.

Be conservative in what you do, be liberal in what you accept from others

https://en.wikipedia.org/wiki/Robustness_principle


The Robustness principle isn't, and Postel's Law is pretty much a failure. This kind of thinking introduces tons of implementation compatibility issues. By being liberal in accepting from others (aka accepting malformed messages), you allow broken implementations to "work". Now those broken implementations form a de facto standard that everyone else must implement.

I demonstrated how this kind of thinking, coupled with "simple" text-based protocols, introduces security issues. SIP is a protocol with nutty parsing rules like HTTP. Lines end with CRLF, body is separated from headers by two CRLFs.

Some implementations act liberal and will accept any combination of CR and LF instead of just CRLF. So header \r\r body is OK with some implementations, and not others. Which means some stacks will read body as more headers. It's not hard to see how this creates a security problem, as you pass a message to a trusted proxy and it asserts things are OK, except the two stacks don't agree on what the headers actually are. Oops. This is a real, live, issue that affects SIP networks today and can be exploited for profit. And it's hard to fix, because some networks are actually sending non-CRLF lines, creating a compat issue. If implementations had been harsh on the CRLF requirement, those networks wouldn't be sending non-CRLF lines, as it would never have worked in the first place.

In short, being liberal just means "each implementation creates its own interpretation". This is because not all impls are going to agree on what "liberal" means. And if "liberal" could be defined, then it should be defined in the spec! No need for interpretations.


The problem with

> should accept non-conformant input as long as the meaning is clear

on that article is that the definition of "clear" is somewhat murky. Let's use the MySQL string truncation that spurred this thread as an example.

When asked to put 300 character string in a 255 character wide field, for MySQL it was "clear" that the user only meant to actually store 255 characters. From the perspective of MySQL, that actually makes sense: The field is declared as storing a maximum of 255 characters, so obviously, the user intends to store a maximum of 255 characters in that field.

Now look where this interpretation of "clear" facts has lead us to.

The issues that can be caused by "fixing" up invalid data are usually much harder (and more embarrassing) to fix than an exception that happens early.

Trust me: I'd rather get an exception than a CVE when given the choice.

Furthermore: By "fixing" up data, your fixes become part of the protocol. Implementations of clients derived from the broken implementation might not notice the brokenness and suddenly you're stuck with having to fix issues the same way for all eternity because becoming more strict will cause backwards compatibility breaks.

Worse: You also lock yourself out of the ability to later actually extend the protocol in a meaninful way because you're already accepting broken data.

Let's say you have a JSON based protocol that has a flag "foo" that can be set to some value: {"foo": 12}. Now there's a broken implementation of a client around that sends {"foobar": 12}. As you're "sure" that they actually mean "foo", you add an alias "foobar" to mean "foo".

Even though you've never intended it, now "foobar" is part of the official protocol and clients start sending this all over the place.

If at a later point, you actually want to support "foobar", you can't because that's already a hack to mean "foo", so now you'll end up with some crap like {"real_foobar": 1234}.

So not only is this behaviour irresponsible to clients (see the bugzilla issue), no, it's also a sure way to make your own life harder in the future as it makes for harder to maintain code and makes you lose flexibility in protocol design.


> should accept non-conformant input as long as the meaning is clear

This is the part I don't agree with their. If the data is wrong it isn't getting into my database.

The principal breaks itself IMO: if I accept junk, then when asked for data I may have no choice but to respond with junk. Accepting too liberally precludes being in control of how conservative you are in what you output.

Of course I'm talking ideally here. In the real world sometimes there are inputs you have to accept stuff from which you have no control to correct or authority to reject, but except when that is absolutely the case reject away and demand the other side fix their output. If you do accept iffy data make sure it is marked as such as early as possible and that mark stays with it for as long as relevant, so you can identify any data that has has an unsafe transformation applied or has simply been left incorrect.


Not really the full story. Bugzilla explicitly disables strict mode (due to another problem). Pretty stupid to have done that!

See for details https://bugzilla.mozilla.org/show_bug.cgi?id=321645#c20

PS: I wrote the patch to disable strict mode.


If I'm reading that correctly, the problem was that some MySQL versions couldn't actually restore database dumps they created using mysqldump if strict mode was enabled. That's a pretty ugly bug.


If you're on strict mode, TEXT and BLOB types in MySQL 5 don't support a default value. Bugzilla didn't set anything for those fields; relying on MySQL putting in the default value. At that time Bugzilla had loads of places where the database would insert things into the database. Fixing all of those would've been pretty daunting task and IMO MySQL not supporting those defaults anymore is a too invasive (backwards compatibility breaking) change. See http://dev.mysql.com/doc/refman/5.6/en/data-type-defaults.ht....

This was around the time that GNOME once again was upgrading its Bugzilla instance (IIRC), plus lots of people were moving to MySQL 4. Fixing this was pretty urgent and assumed that MySQL's behaviour would change back.

Because of this MySQL change being backwards incompatible, it also broke restoring backups. This patch wouldn't help with that though (it just changes the setting for the database connection created by Bugzilla; it doesn't change the server setting).

It took many years until the Bugzilla code was less of a mess. If the same bug would've been opened nowadays a proper fix could've been made. Back then, urgh.


> Because of this MySQL change being backwards incompatible, it also broke restoring backups.

No matter how you paint it, if MySQL isn't able to restore a backup created with the backup utility of the same version, then this is MySQL's fault.

It is absolutely not acceptable for a backup utility to run through but produce something that then cannot be restored. I would have expected mysqldump to blow up or even better, the newer daemon to not start up if it sees a schema that it doesn't support any more.


MySQL is basically taking Visual Basic's "On Error Resume Next" approach. Which is also what a lot of JS seems to be doing, too, based on limited experience.


Indeed, and in some cases this can have side effects like skipping certificate validation: https://github.com/github/hubot/issues/1046


I never understood why people use MySQL for anything where correctness is important. The auto-truncaction is only one of many nasty surprises.

Did you know that auto-truncation also kicks in if your columns values are small, but you run a CONCAT on them? That was a nasty surprise when a generated a CSV line on DB side, which by bad luck was truncated immediately before a comma, so the result was correst, just the jast few items were missing. And ever looked at the handling of invalid dates? Not to mention what happens if you forgot to specify a column in your GROUP BY clause, but use it in the SELECT part ...

To anyone who is so brave to write their application using that database: Be sure you know all those issues and take care of them in your application. Or make sure you aren't affected by them. Or, switch to a high-quality database!

I switched to PostgreSQL a long time ago and never looked back. These people care about exactness, quality and clean extensibility in all directions, even user-defined index structures are possible (GIN, GIST). The problem that PostgreSQL was slower than MySQL disappeared as soon as MySQL became transaction safe with InnoDB, because it was now a fair comparison as PostgreSQL was transaction safe from the very beginning. Also, the PostgreSQL project management is one of the best I have ever seen, meeting schedules, regular commit fests, quality-driven, honest beta releases, and so on.


  > I never understood why people use MySQL for anything where
  > correctness is important.
They probaly know how to configure it.


This doesn't seem to be any harder than MySQL, it's just a bit different. Also note that PostgreSQL can perform "peer" auth, so you don't have to fiddle with DB passwords unless you really want to access your DB over a network.

Install package:

    apt-get install postgresql-9.4
Become PostgreSQL admin:

    su - postgres
Create a DB user who is allowed to create databases (assuming your app will run as user "someuser"):

    createuser -d someuser
Login as that user:

    exit
    su - someuser
Create your database:

    createdb somedb
Play with your database:

    psql somedb
        CREATE TABLE ...


If you are going to the effort to learn all the ways you need to tell MySQL to not corrupt your data, you may as well spend that energy learning PostgreSQL


It is important to point out that access to security bugs (the ones that contain descriptions of unpatched vulnerabilities) is not automatically granted to all Mozilla employees, so simply having an @mozilla.com e-mail is not enough to get information on how to exploit Firefox.

It is enough to get access to potentially sensitive partner information (as can be seen from the posted screenshot), but those are usually a different class of bugs from security bugs.

Further, we now classify security bugs by the area of the code they affect, and give engineers access to just the areas for which they are responsible, rather than blanket access to all security bugs. This is a process that began even before the recent attacks, but we have been pushing it more aggressively in their wake.


1. This is mostly MySQL's fault for default silently truncating values longer than the requested length. Postgres and SQL Server don't do this, but that's all I tested

2. Lack of email validation by Bugzilla. Emails are max length 254, though this might not have been official when the code was written. See https://stackoverflow.com/questions/386294/what-is-the-maxim...

3. Consider using an HMAC'd URL for registration email links instead of putting a token in the database, as per https://neosmart.net/blog/2015/using-hmac-signatures-to-avoi...


1. I copletely agree it isn't exactly the best design decision, on the contrary, but the developpers are still too blame as well in my opinion. They chose the datatype so they should have read the documentation, especially the caveats like this, and they should have done something to deal with this.

This applies more generally. There are a ton of programming languages, standard libraries and APIs out there with weird/confusing/exceptional/undefined-behavior-invoking types and functions depedning on how they are used so a sane developper should alwyas check this and handle accordingly, even though it might lead to more code to handle edge cases than to the actual expected main code path.


You know, if you're going to use this meme, you should get it right. It should be "all your bug are belong to us" [0]. Yes, it's ungrammatical -- that's the point!

[0] https://en.wikipedia.org/wiki/All_your_base_are_belong_to_us


I would argue that in the original, "base" was referring to a singular object and was not reduced down from "bases" (plural) to "base" (singular). Therefore, I do not think it is necessary to reduce "bugs" to "bug".

I know you are joking but if we are being pedantic here, "all your swarm are belong to us", would be more apt.

EDIT: Instead of swarm, a name for a group of bugs (errors) would be appropriate.


I don't speak Japanese, but according to whoever translated it for Wikipedia it was really "bases":

https://en.wikipedia.org/wiki/All_your_base_are_belong_to_us...


Translation is an art.

Japanese does not have plurals in the same way that English does. It does have an optional pluralizing suffix. However, the fact that it was not used with "base" does not imply that "base" is singular.

Looking at the sentence structure, it seems to me that "base" was really intended to be singular (which also fits with the context provided by the script on your wikipedia link). Roughly the sentence is structured as follows:

By means of the Federation Army's corporation, as for yall's base, entirely CATS has taken

The word I translated into "entirely" could also be translated into "all". However, gramaticly, it still does not modify "base", so I can see no sense in which "all your bases" is a correct, literal translation.


The Wikipedia translation disagrees. Add your information to the page, please.


So "All your base" is Engrish for "The entirety of your base", not "All your bases"?


Are you guys really trying to "fix" a meme?


what you say?


I'm not sure which is worse: not validating your data, or silently truncating invalid data.

It seems that the fix [1] was to limit the length of email addresses to 127 characters. That'll do for the time being, but they really should pay more attention to database warnings. MySQL does all sorts of stupid things, but at least it tends to emit warnings when it's being stupid. Catching those warnings and turning them into exceptions should go a long way toward preventing bugs like this.

[1] https://git.mozilla.org/?p=bugzilla/bugzilla.git;a=commitdif...


It looks like the devs are doing exactly the right thing: Fixing the immediate problem quickly on released branches and following up by stricter checking in the development version (which looks like it will need a lot more changes) See: https://bugzilla.mozilla.org/show_bug.cgi?id=1202447#c17 https://bugzilla.mozilla.org/show_bug.cgi?id=1202509


MySQL can turn those warnings to exceptions on its own, via sql_mode [1]. New installations of MySQL 5.6 have this behaviour enabled by default on InnoDB tables.

[1] http://www.tocker.ca/2014/01/14/making-strict-sql_mode-the-d...


Incredibly interesting, and incredibly simple. Very cool stuff.

I do wonder... Why on earth is domain based privileges ever a good idea? There should be a proper roles system behind the scenes which doesn't grant privileges based on an email domain.


Because the business requirement is "Anyone from this company we are partnering with, which has signed a mutual NDA at the company level, should be able to see these bugs," and "The local Bugzilla admins shouldn't be involved in the hiring processes of that other company," and "No, they don't have a public facing identity management system we can integrate with."


There is a proper roles system. But as a shortcut, roles are assigned by domain. It's either this or manually creating/accepting accounts if they need privileges other than the default.

In this case, they could get away with the shortcut (if you exclude the MySQL misconfiguration), so that's what they have done.


For anyone interested in similar issues: here you can find a report for a vulnerability in Phabricator with exactly the same cause (truncation by MySQL), and pretty much the same result: https://hackerone.com/reports/2224

If Bugzilla would allow non-ASCII characters in the email address, MySQL's truncation behaviour with astral symbols (e.g. 𝌆) would probably have lead to a similar vulnerability as well. (It did so in Phabricator: https://hackerone.com/reports/2233)


The bug impact description is completely false. Mozilla has "corporate confidential" bugs behind the @mozilla.com email check, but everything with a security rating is restricted to specific accounts that have been explicitly vouched for.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: