You're right. I started using mysql in 2000 when it was still a toy. It's an outdated bias :) I was more flabergastered they would choose 'InifiniDB' when there are so many other great options out there.
Well, MySQL still doesn't implement the SQL standard from 1999 (20 years ago), because it's missing common table expressions. Although the next release will support them, thankfully. Sorry, I just had that axe to grind.
I disagree with your assertion that MySQL /is/ a serious database.
The questions I usually ask myself when evaluating database solutions is:
* Does it accept invalid data?
* Does it change data on error?
* Does the query planner change drastically between minor versions?
* How strong is transaction isolation? can I create constraints, columns or tables in a transaction?
* Does it scale vertically above 40~ CPU threads and 1M IOPS?
The answer to all these questions, for MySQL is "No". You could argue the value of some of them, but a lot of them highlight architectural or development procedural misgivings.
Not the OP but one example of accepting invalid data is Mysql defaulting values to null/0/0000-00-00 when no value assigned and no default on column. Unless one is in strict mode.
I appreciate that it might sound like that to someone who hasn't used MySQL in production for 10+ years.
To start with, this is still true today: https://vimeo.com/43536445 Despite being 6 years old, strict mode is still required.
Anything prior to MySQL 5.7 will accept "0000-00-00 00:00:00" as a valid date, 5.7 will not (which is sane) however this means migrating from 5.6 -> 5.7 just got a little harder.
In fact it wouldn't validate /any/ date so it would assume every year was a leap year and febuary always had 29 days.
Regarding performance:
This is what I found from my own experience: I was given the task of testing the limits of MySQL, MySQL was the chosen technology and I was no involved in making that decision so- whatever.
We were given 10 servers, with 40 cores (2014-2015~) 128G of DDR3/ECC and 8 SATA SSDs in RAID-0 with 1G of RAID cache for write-back.
We managed to get MySQL to bottleneck pretty quickly, our queries involved a lot of binary data so we should have been raw IOPS bound, but we weren't we were memory bound. So we replaced the memory allocator with a faster one (jemalloc) and we get a 30% performance improvement. We suspected that the kernel sockets implementation was slowing us down so we compiled a custom "fastsockets" linux kernel. The improvement was around 4%, but we were bottlenecked on memory. After doing a full trace of what MySQL was doing we saw that InnoDB was spinning on a lock quite a lot.
I asked if we could try other SQL solutions (MSSQL/PostgreSQL) Postgresql was first chosen because we could just install it, no license and no OS change... it was twice as fast as the optimised MySQL installation out of the box with a stock CentOS6 kernel.
We never even bothered testing MSSQL because PostgreSQL met our performance targets, we were now IOPS bound.
--
More anecdatum:
Regarding data consistency we (tried) to migrate to postgresql for performance reasons in 2014 (my previous company), and failed because MySQL had been corrupting our data very slowly and silently for many years (corrupting meaning not honouring NOT NULL, not honouring type safety, allowing invalid dates, inserting data on error) So far in that actually reimporting the output of `mysqldump` would not work.
Isn't it? I thought that today()-(2018 years, 4 months and 10 days) would be approximately that date? Maybe you prefer +0000 vs just 0000?
'ISO 8601 prescribes, as a minimum, a four-digit year [YYYY] to avoid the year 2000 problem. It therefore represents years from 0000 to 9999, year 0000 being equal to 1 BC and all others AD. However, years prior to 1583 are not automatically allowed by the standard. Instead "values in the range [0000] through [1582] shall only be used by mutual agreement of the partners in information interchange."
To represent years before 0000 or after 9999, the standard also permits the expansion of the year representation but only by prior agreement between the sender and the receiver.[19] An expanded year representation [±YYYYY] must have an agreed-upon number of extra year digits beyond the four-digit minimum, and it must be prefixed with a + or − sign[20] instead of the more common AD/BC (or CE/BCE) notation; by convention 1 BC is labelled +0000, 2 BC is labeled −0001, and so on.'
I think a lot of what have you written can be solved software-side. Good Database should be no excuse for bad code.
I do not think MySQL is a technological debt as in 80% startups moving to the different solution is cheap and non-problematic. The LAMP is good enough and quickest/cheapest for the majority of tech companies.
I'm being perfectly fair in being critical of a software which claims to be doing those things.
You can solve issues in your application if you know there will be issues like these, knowing the pitfalls and drawbacks of a technology is certainly noble- but if you do then why not choose something that follows principle of least surprise. (There might be reasons).
I would never claim that you should move everything from MySQL if you use it. However if you care about data consistency ensure that you change the defaults, engage strict mode, ensure that your application has no bugs in handling data.
This is actually hard to do correctly, it's overhead in development that you shouldn't be caring about. Just choose something that has sane error conditions and the problem vanishes.
Considering many, many of the world's largest tech companies use MySQL or MySQL compatible databases, it's rather absurd to say that MySQL isn't a serious database. Regardless of whether it matches yours or someone else's personal list of capabilities.
To be perfectly fair with you, you can make bad choices and still get something useful done.
Most companies are not alive "because they chose mysql over something else" they're alive because they have "good enough" tech to get the job done. The job that they're trying to accomplish is the thing that makes them successful.
Uber isn't super huge because it used a specific database technology. It's huge because it's good at marketing, it's providing some value to people.
If it got the work done at a reasonable cost and performed reasonably well (i.e. it served the purpose it was meant to serve), how “bad” a choice could it have been?
I'm reminded of an article about zombie companies I read recently: they're companies which are inefficient/pporly-managed/poorly-executing, but due to market/regulatory inefficiencies they're not dead yet. Companies which use MySQL are in a similar situation: they're not doing as well as they could be, and all other things being equal they ought to be put out of business by their competitors — but all other things are rarely equal.
Still, if you are making choices for yourself, you don't choose mediocrity and hope to muddle through: you choose excellence. Choosing MySQL isn't choosing excellence.
“Excellence” is a poor criterion for comparative analysis because it is (a) subjective and (b) unquantifiable.
Do you have objective or quantifiable data and references upon which your opinion is based, _and_ is universally applicable to any arbitrary problem that a SQL database might be an appropriate solution for?
If it costs development time because they need to be extra careful about not sending queries that ERROR and 'wipe' data.
If it silently corrupts data over years and gets discovered much later. (As was the case with my previous company, an e-commerce retailer that lost large chunks of order history)
Are those problems still unresolved in MySQL today? How do you know that similar or worse problems did not exist in alternative solutions at the time it was implemented?
MySQL is making strides to fix these kinds of issues ever since the Oracle acquisition for sure.
> How do you know that similar or worse problems did not exist in alternative solutions at the time it was implemented?
Because I've been working on database solutions for over 10 years, there are problems in other software but I consider data loss to be worse than any of them. For example the autovacuum in postgresql 8.3 and before was mostly garbage which ended up bloating highly transactional databases. But deleting data when you fail a constraint is worse.