This is probably not interesting to anyone, but I remember the first (and only) time I published a paper, based on Mr. Farach-Colton's and Mr. Bender's work (specifically this paper [0]). After I presented it, someone in the audience asked a few questions for which I was unprepared.
After I finished my presentation I walked to the patio and got to meet the guy that asked me those questions. Turns out that he was Martin Farach-Colton!
Got to chat with him and even went to eat something, and he turned out to be a really nice guy, laid back, not cocky at all. I mean, here was a guy who had done some great research on algorithms and other areas, and whom I happen to admire a lot, just chatting with me, telling me about this Tokutek project he was starting (at the time the website was just an "under construction" page), and in general being an interesting person to talk to.
If anyone is remotely interested in a different way of obtaining the Least Common Ancestor between two nodes in a tree, definitely check this paper out. When I managed to understand why it works, it rekindled my love for algorithm design.
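For anyone curious, here is a rough sketch of the core idea as I understand it: walk the tree with an Euler tour, record the depth at each step, and then the LCA of two nodes is the shallowest node between their first appearances in the tour, which is a range-minimum query. This version uses a plain sparse table (O(n log n) preprocessing, O(1) query), which is the simpler intermediate step; the paper goes further to get linear preprocessing, and that part is not shown here. The function name `build_lca` and the `children` dict input format are my own assumptions for illustration, not anything from the paper.

```python
def build_lca(children, root):
    """Return an lca(u, v) function for a rooted tree.

    children: dict mapping node -> list of child nodes (assumed input format).
    """
    euler, depth, first = [root], [0], {root: 0}
    # Iterative DFS that records the Euler tour and the depth at each step.
    stack = [(root, 0, iter(children.get(root, ())))]
    while stack:
        node, d, it = stack[-1]
        child = next(it, None)
        if child is None:
            stack.pop()
            if stack:  # step back up to the parent
                parent, pd, _ = stack[-1]
                euler.append(parent)
                depth.append(pd)
        else:
            first[child] = len(euler)
            euler.append(child)
            depth.append(d + 1)
            stack.append((child, d + 1, iter(children.get(child, ()))))

    # Sparse table: table[j][i] is the index of the minimum depth
    # in the tour window [i, i + 2**j).
    n = len(euler)
    table = [list(range(n))]
    j = 1
    while (1 << j) <= n:
        prev, half = table[-1], 1 << (j - 1)
        row = []
        for i in range(n - (1 << j) + 1):
            a, b = prev[i], prev[i + half]
            row.append(a if depth[a] <= depth[b] else b)
        table.append(row)
        j += 1

    def lca(u, v):
        lo, hi = sorted((first[u], first[v]))
        k = (hi - lo + 1).bit_length() - 1
        # Two overlapping power-of-two windows cover [lo, hi].
        a, b = table[k][lo], table[k][hi - (1 << k) + 1]
        return euler[a if depth[a] <= depth[b] else b]

    return lca


# Small example tree: 1 -> 2, 3; 2 -> 4, 5; 3 -> 6
lca = build_lca({1: [2, 3], 2: [4, 5], 3: [6]}, root=1)
print(lca(4, 5), lca(4, 6), lca(2, 4))  # 2 1 2
```

The nice part is that the tree structure disappears after preprocessing: every query is just two table lookups and a comparison.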
I'm nervous about what this means for TokuMX. We've used and loved it for a few years now, but Percona is a MySQL company, not a MongoDB company. Support on TokuMX has rather flagged over the last few months, and unless Percona makes it a first-class citizen, I'm nervous that it's going to wither and die in the face of MongoDB's new WiredTiger engine, which is approximately equivalent (though not entirely comparable).
I'm hoping to be proved wrong, but it's probably time to start drafting migration plans.
It is time. I think Tokutek saw the writing on the wall.
How revolutionary and how patent-encumbered is this fractal tree indexing? If it's based on academic research, are there other B-tree variants or improvements that accomplish some of the same advantages that Tokutek's fractal tree indexing does?
Having used MongoDB in a number of projects, I had been holding out to try TokuMX with production loads; however, I've since settled on PostgreSQL for new projects.
TokuMX was really "MongoDB done right", at least up until MongoDB's integration of the WiredTiger engine. They also improved on vanilla MongoDB in other areas, such as election protocols. It was really quite nice to work with. I've been a vocal advocate for it over the last couple of years.
All that said, I am rather concerned about a lack of ongoing support for it. TokuMX 2.0.0 was released 7 months ago, with the promise of MongoDB 2.6 support "shortly" (MongoDB 2.6 was released just over a year ago). That's yet to materialize, and I've reported a number of bugs which haven't received prompt fixes (including a segfault which was closed as "working as intended"). These are edge cases, and certainly not the norm, but I was really impressed with Tokutek's support of the product a couple of years ago, and it sadly feels like it's dwindled to life support.
Unless Percona makes a strong showing with it shortly, I'm not going to be able to justify building on it any longer.
Yeah, there's no way they will be able to easily merge in all the changes that have happened since MongoDB 2.6 and now 3.0. They're pretty far behind. 2.6 introduced a lot of refactoring (and therefore broke a lot of things), and 3.0 broke even more, I can only assume.
However, I believe they could sell their storage engine as MongoDB exposes a storage engine API now. Then again, I don't know why anyone would buy it now.
MongoDB has done a decent job of patching things up and I guess TokuMX just isn't as interesting as it was before.
TokuMX still provides a lot of features that TokuMXse (vanilla with the tokumx storage engine) doesn't, like multi-document ACID and clustering keys. I would happily use it over vanilla if I was confident it was going to be well supported.
Percona provides a patched version of MySQL with many enhancements. I believe they incorporate many of the useful community patches, such as those from Google, and some of their own to make InnoDB function better on SSDs, etc. (at least that was the case in the past; many of those may have made it into mainline MySQL now). They also provide their own storage engine, XtraDB, with its own set of interesting features.
Additionally, they provide a set of utilities, the Percona Toolkit, to automate more advanced things, such as replication testing through table data checksums, zero-downtime table changes, etc.
Most of it is free, I think some advanced XtraDB features require payment (?), but they sell support and training.
Tokutek make a database engine.
They offer it in two versions:
• One version (TokuDB) which runs inside MySQL (alongside InnoDB/XtraDB, MyISAM, etc.).
• One version (TokuMX) which runs in/as a fork of MongoDB. Since Mongo didn't historically use pluggable storage engines, they had to fork the MongoDB code in order to use their storage engine.
Their engine has some advantages over the traditional Mongo storage engine (now called MMAPv1). One of the most prominent was that it supported in-line compression.
Many of these advantages are mitigated somewhat by MongoDB's new WiredTiger engine.
Tokutek provides an "engine" for MySQL. Each table in MySQL has an associated engine that does the storing and retrieval. Like all software, it's optimized for certain use cases. Tokutek used a fractal tree index to allow faster lookups. I think they could reindex a table without locking.
More recently they built an engine for MongoDB.
A description from the Boston MySQL group a few years back (I was at the talk):
I was wondering so I went to look. This is what I see:
"ACQUISITION ENABLES PERCONA TO EXPAND AND EXTEND OFFERINGS DRAMATICALLY IMPROVING DATABASE PERFORMANCE WHILE LOWERING TOTAL COST OF OWNERSHIP FOR CUSTOMERS"
Does anyone know the details of the deal? Equity of Founders Vs VCs at acquisition? How many rounds of funding did the company take? How much did the founders make?
[0] The LCA Problem Revisited - https://dl.acm.org/citation.cfm?id=690192