Looking at TAO just as a data-model (instead of as a database system) and I'm really liking its simplicity.
But when you have a huge system like this, there are always engine-level rules (e.g. referential integrity) and business rules that need to apply: a Location object can't "follow" a Comment, a PM can't "check in" to a User, and a human User can't "friend" a declared bot User (I think?). Given the overall design, I assume the rules about which objects and associations can and cannot exist are enforced by the Association APIs when an association is created, so rejecting an invalid create-association request is straightforward. But how does it handle ever-changing business requirements that apply retroactively to existing objects and associations? Suppose FB decides to allow human users to befriend bot users, so Mark Zuckerberg suddenly has real Friend associations (shots fired), and then they decide to undo that policy change: how is the rule change applied? What happens to the existing objects? What happens if two physical FB servers running different versions of the software try to apply conflicting business rules?
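To make that assumption concrete, here's roughly the check I imagine the Association API doing (all names here are invented; this is my guess, not TAO's actual interface):

```python
from dataclasses import dataclass

@dataclass
class Obj:
    id: int
    otype: str  # e.g. "user", "page", "location", "comment"

# Hypothetical whitelist of (source type, association type, target type)
# triples. Not TAO's real rule engine, just the shape of an API-level check.
ALLOWED = {
    ("user", "friend", "user"),
    ("user", "follow", "page"),
    ("user", "checkin", "location"),
}

def create_assoc(src: Obj, atype: str, dst: Obj) -> None:
    if (src.otype, atype, dst.otype) not in ALLOWED:
        raise ValueError(f"a {src.otype} cannot '{atype}' a {dst.otype}")
    # ...otherwise persist the (src.id, atype, dst.id) edge...
```

But a whitelist like this only gates new writes: flipping an entry says nothing about edges already on disk, which is exactly the retroactivity problem.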
Another thing I don't yet understand about using MySQL (or any SQL-first RDBMS, for that matter) as the storage layer for an ostensibly object/document system (hierarchical structured data) is that you now run into the object-relational impedance mismatch, which has only unpleasant solutions: either use the EAV anti-pattern, which defeats the point of using an RDBMS in the first place, or go all-in on Codd-style normalization and end up with a rigid, strict schema design that is very difficult to update when business requirements change (i.e. one table per TAO object type, and an ALTER TABLE ADD/DROP COLUMN every time you want to update the TAO object schema, which obviously does not scale). I assume each TAO object can instead be treated as a keyed blob with a type name (i.e. a 3-tuple of (id, type, blob)), in which case an RDBMS is overkill and introduces a lot of overhead, so why use an RDBMS? Especially as I understand pretty much every major web service disables things like referential-integrity constraints for the sake of performance anyway (and you can't have a constraint on data inside a blob column in any case).
...so I assume I'm missing something, but what?
> how does it handle ever-changing business requirements that apply retroactively to existing objects and associations
If the change is backwards-compatible:
1. Update the schema definition
2. Run codegen
3. Done
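For instance, adding an optional field with a default is the canonical backwards-compatible change. In an invented schema format (the real definitions are whatever the codegen consumes):

```python
# Hypothetical schema definition, invented for illustration. Adding an
# optional field with a default is backwards-compatible: old readers
# ignore it, and rows written before the change report the default.
USER_SCHEMA = {
    "name":     {"type": "str"},
    "location": {"type": "str"},
    "pronouns": {"type": "str", "default": ""},  # the newly added field
}
```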
If it’s not:
1. Create a new field/assoc
2. Double-write data to the new and old fields/assoc (see the sketch after this list)
3. Backfill data from the old one to the new one
4. Delete the old one
5. Done
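A minimal sketch of steps 2-3 with invented field names (the real fields come out of codegen):

```python
# Sketch of steps 2-3: double-write new data to both fields, then
# backfill rows written before double-writing started.

def geocode(text: str) -> int:
    return hash(text) % 10**9  # stand-in for a real lookup service

class User:
    def __init__(self) -> None:
        self.location_text: str | None = None  # old field, being replaced
        self.location_id: int | None = None    # new field

def write_location(user: User, text: str) -> None:
    user.location_text = text         # old field, so existing readers keep working
    user.location_id = geocode(text)  # new field: the double-write

def backfill(users: list[User]) -> None:
    # One-off job to migrate rows that predate the double-writing.
    for u in users:
        if u.location_id is None and u.location_text is not None:
            u.location_id = geocode(u.location_text)
```

Because every step leaves both fields readable, nothing depends on the migration completing atomically.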
> I assume each TAO object can instead be treated as a keyed blob with a type name (i.e. a 3-tuple of (id, type, blob)), in which case an RDBMS is overkill and introduces a lot of overhead, so why use an RDBMS?
I believe data is stored in tuples, same as most other graph DBs. I can’t speak to why MySQL in particular.
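If I remember the TAO paper correctly, the two tuple shapes are roughly as follows (a sketch, not the actual MySQL schema):

```python
from dataclasses import dataclass

# Roughly the tuple shapes described in the TAO paper: objects are
# typed keyed blobs, associations are typed, timestamped edges.

@dataclass
class TaoObject:
    id: int      # 64-bit globally unique id
    otype: str   # e.g. "user", "comment", "location"
    data: bytes  # serialized key/value payload

@dataclass
class TaoAssoc:
    id1: int     # source object id
    atype: str   # e.g. "friend", "likes", "checkin"
    id2: int     # destination object id
    time: int    # used for time-ordered range queries
    data: bytes  # small serialized payload
```

As I recall, reads are dominated by (id1, atype) range queries ordered by time, which is an access pattern an ordinary RDBMS index serves well.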
Facebook's database is distributed - it's not as simple as "updating the schema definition", because that step alone is non-trivial, with many cases to consider.
The process is more complicated under the hood. I’m describing it from the POV of an engineer that wants to update a field — from that POV, it really is that simple.
The key is in making sure every step-wise change is backwards-compatible.
Creating a new field, double-writing, and backfilling are all relatively easy tasks in a distributed environment. Individual nodes can take their time catching up to the latest schema.
After a day or two, the engineer can verify that all data have been correctly migrated and double-written to the new ent/assoc. They then turn off double-writing to begin deprecating the old assoc. This can be done in an A/B test to ensure no major breakages.
Once the double-writing is fully turned off, the engineer checks one last time that all the data have been migrated. They check that the read/write rate to the old ent/assoc is 0, which then confirms that the old field is now safe to delete.
Overall, the trick is to think of the migration as a series of safe, backwards-compatible, easily-distributed steps, rather than a single atomic operation.
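That last gate is conceptually simple; something like the following (invented names, and in practice it's a metrics dashboard rather than one function):

```python
# Sketch of the final safety gate before deleting the old field/assoc.

def safe_to_delete(read_rate: float, write_rate: float,
                   unmigrated_rows: int) -> bool:
    # Delete only once nothing reads or writes the old data and
    # every row has been copied to the new field/assoc.
    return read_rate == 0.0 and write_rate == 0.0 and unmigrated_rows == 0
```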
Facebook started, like any other web app, with a relational database used as a relational database (FKs, indexes, all that stuff). They built tooling around it and acquired a great deal of knowledge about running MySQL at scale. Switching to a different system is a huge investment with no real benefits. In short, they are using MySQL because of inertia, caused by the difficulty of running such a system at this scale.
> Switching to a different system is a huge investment with no real benefits
But there would be "real benefits": there is significant performance overhead, especially at Facebook's scale, in using a general-purpose RDBMS compared to something written specifically for their use case. If a different storage system had even, say, 10% better performance overall, that would translate to a 10% reduction in hardware costs, which at Facebook's scale is easily tens of millions of dollars per year.
The first thing that comes to mind is RocksDB - of course, I then remembered that Facebook does use RocksDB, via MyRocks (which combines RocksDB with MySQL - though the MyRocks website and docs don't clearly explain exactly how and where MyRocks sits in relation to application servers, MySQL, or the different MySQL storage engines...).
> But there would be "real benefits": there is significant performance overhead, especially at Facebook's scale, in using a general-purpose RDBMS compared to something written specifically for their use case. If a different storage system had even, say, 10% better performance overall, that would translate to a 10% reduction in hardware costs, which at Facebook's scale is easily tens of millions of dollars per year.
And how would you know that? What did you base those numbers on? I find that hyperbole puts me off discussions of this kind.
Of note is that MyRocks halved the storage requirements for the user DB tier. That's a pretty big win.
As for why MySQL persists: I no longer work at FB, so I'm not up to speed on the current thinking, but one thing to remember is that TAO isn't the only thing that talks to MySQL. The web tier itself still has (or had) data types that were backed by MySQL and memcache. Lots of other systems for data processing, configuration, and so on store data in MySQL. Replacing all of that would be a huge undertaking. I doubt there's a 10% win lying around just waiting to be taken with all this work, especially after MyRocks rolls out.