A MySQL development release, not meant for production use and in development for almost two years with no release date announced, now includes memcached and a plugin that allows fast NoSQL-style access.
I hope they are taken seriously, however; I'd like to see all the major RDBMSes tightly integrate memcached in front. It only makes sense: web developers have been doing it themselves since LiveJournal's heyday.
"MySQL 5.6 includes a NoSQL interface, using an integrated memcached daemon that can automatically store data and retrieve it from an InnoDB table. By default, you use the memcached API purely to speed up database operations, letting InnoDB handle memory caching using its buffer pool mechanism. Advanced users can enable the traditional memcached in-memory caching and control whether operations look up and store data in memory, in InnoDB tables, or both. This offering provides users with the best of both worlds"
To really get this right they'd need to add some auto-rebalance / sharding magic and have it integrated much more deeply. This is really only a head-nod toward NoSQL.
The MySQL server now includes the widely used memcached in-memory caching system, and a plugin that allows fast NoSQL-style access to InnoDB tables through the memcached protocol. This access method avoids the overhead of SQL parsing and constructing a query optimization plan. You can store the underlying data in a single InnoDB table, or spread it across multiple tables. You can read and write data through both memcached and SQL. For example, you can do fast single-key lookups through memcached get calls, and do statistical reports across all the data through SQL.
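Since the plugin speaks the standard memcached text protocol, the single-key fast path is just ordinary memcached framing. A minimal sketch of what a client puts on the wire for `set` and `get` (pure framing, no server involved; the key and value are made up):

```python
def frame_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Build the bytes a memcached text-protocol client sends for `set`."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode()
    return header + value + b"\r\n"

def frame_get(key: str) -> bytes:
    """Build the bytes for a `get` request."""
    return f"get {key}\r\n".encode()

# Against MySQL 5.6's plugin these bytes would go to the memcached port
# (11211 by default) and read or write rows in the mapped InnoDB table,
# skipping SQL parsing and planning entirely.
print(frame_set("user:42", b"alice"))  # b'set user:42 0 0 5\r\nalice\r\n'
print(frame_get("user:42"))            # b'get user:42\r\n'
```

Any existing memcached client library should work unmodified against that port; only the framing above changes hands.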
Advanced users can enable the traditional memcached in-memory caching and control whether operations look up and store data in memory, in InnoDB tables, or both.
One of the versions[1] of the 5.6 documentation[2] has a comment regarding that.
> This configuration differs from another popular MySQL NoSQL solution, HandlerSocket, by skipping even the Handler API layer and directly accessing the InnoDB storage engine through low-level APIs, making the code path even shorter and more efficient.
However, I wish there were something in reverse: Redis as a MySQL storage engine. The most obviously useful thing is that you could store a `tags` table without many-to-many tables; just use Redis sets.
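The Redis-sets idea can be sketched without a server by modeling `SADD`/`SMEMBERS`/`SINTER` with Python sets (a toy stand-in for a real Redis client; all key names here are illustrative):

```python
from collections import defaultdict

# Toy Redis: one set per key, mirroring SADD / SMEMBERS / SINTER.
store = defaultdict(set)

def sadd(key, *members):
    store[key].update(members)

def smembers(key):
    return store[key]

def sinter(*keys):
    return set.intersection(*(store[k] for k in keys))

# Tag a couple of posts -- no many-to-many table in sight.
sadd("post:1:tags", "mysql", "nosql")
sadd("post:2:tags", "mysql", "redis")
sadd("tag:mysql:posts", 1, 2)  # reverse index maintained by the application

print(sinter("post:1:tags", "post:2:tags"))  # tags common to both posts
```

The trade-off, of course, is that the application now maintains the reverse index itself, which a relational join table gives you for free.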
Why have an m2m table? Just have a table with two columns: post_id (or whatever is being tagged) and tag (varchar). If you want to present a list of possible tags just query for uniques. I may be missing something, so please correct me if there is a better way or I am misunderstanding your point.
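The plain two-column approach is easy to demo with an in-memory SQLite table standing in for MySQL (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post_tags (post_id INTEGER, tag TEXT)")
conn.executemany("INSERT INTO post_tags VALUES (?, ?)",
                 [(1, "mysql"), (1, "nosql"), (2, "mysql")])

# List of possible tags: just query for uniques.
tags = [row[0] for row in
        conn.execute("SELECT DISTINCT tag FROM post_tags ORDER BY tag")]
print(tags)  # ['mysql', 'nosql']

# Posts carrying a given tag.
posts = [row[0] for row in
         conn.execute("SELECT post_id FROM post_tags WHERE tag = ? "
                      "ORDER BY post_id", ("mysql",))]
print(posts)  # [1, 2]
```

In practice you would index `(tag, post_id)` so both the uniques query and the per-tag lookup stay cheap.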
One might choose to use the array datatype, combined with GiST, GIN, and/or expression based indexes, depending on the workload. That would be somewhat more lucid in most situations than a string packed into an attribute.
"The Memcache Python API
High performance scalable web applications often use a distributed in-memory data cache in front of or in place of robust persistent storage for some tasks"
MySQL's memcached is not a cache; it's persistent storage with a memcached API.
And it's fast.
Given that the most common reason for sticking with MySQL is the maturity of InnoDB and the surrounding toolsets, and the ability to hire knowledge easily this sounds like an excellent development.
Last line: "If there is no innodb_memcache.containers.name value of default, the row with the first name value in alphabetical order is used."
This is in regards to which table is used as the default for get/set operations. Anyone else foresee this leading to mass confusion/issues for unsuspecting users down the road?
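The fallback rule is easy to mis-guess; in Python terms it amounts to roughly this (the container names are made up):

```python
# Mimics the documented fallback: use the container named "default" if one
# exists, otherwise fall back to the first name in alphabetical order.
def pick_default_container(names):
    if "default" in names:
        return "default"
    return min(names)  # first name alphabetically

# A stray test table that happens to sort first silently becomes the
# target of every unqualified get/set -- hence the potential for confusion.
print(pick_default_container(["sessions", "aaa_test", "users"]))  # 'aaa_test'
print(pick_default_container(["default", "sessions"]))            # 'default'
```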
This makes a lot more sense than adding the HandlerSocket API which provided marginally more utility at the cost of being a completely new API. Being able to bypass the query engine for key lookups is pretty cool and a write through cache is a bonus.
w.r.t "useless": Joins against memcached data, and expiry of keys via trigger. Different, yes, useless -- well, up to you. Not written for no reason, though.
Those expiry triggers could be somewhat useful.
But in most cases it's not worth querying memcached from the database (for selects), because it's much faster to query memcached directly, without even opening a database connection.
All these caching layers make applications complex, because cache invalidation is a hard problem.
MySQL is taking the right steps: it does the caching itself via the InnoDB buffer pool, and makes the interface faster by replacing SQL queries with the fast memcached API.
> The reason is it's much faster to query memcached directly without even opening database connection.
Sometimes you need to augment data in a SQL database with memcache information, and then subsequently compute a predicate, and it would be faster to use your sql executor than Ruby, say. I'm not suggesting you'd always use this access path, but it can be handy, just like dblink is.
> MySQL is doing right steps - it's doing caching itself via InnoDB buffer pool. And makes the interface faster by replacing SQL queries with fast memcached API.
I'd really like to see some benchmarks on that. If one has a protocol-level prepared plan in Postgres (which is not an exotic entity, some drivers even create them transparently), one just has to send "Bind" and "Execute" messages to call functions.
If there are major advantages to be had where any disk I/O is involved at all, my suspicion is that they come from a caching strategy better suited to the memcached workload. Most SQL implementations already have fast-path mechanisms, such as prepared statements (which, in my understanding, are relatively slow in MySQL due to some vagary of the protocol). I'm a bit hard pressed to believe, on intuitive grounds alone, that parsing (and not planning, if one uses prepared statements) is the principal culprit for poor performance.
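The parse-once/execute-many idea behind prepared statements can be illustrated with SQLite, which caches compiled statements for repeated parameterized queries (a rough analogue of a protocol-level prepared plan, not the Postgres wire protocol itself):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO kv VALUES (?, ?)",
                 [(f"k{i}", f"v{i}") for i in range(1000)])

# The same parameterized SQL string is compiled once and reused; each
# execution skips re-parsing -- only binding and execution remain, much
# like sending Bind/Execute against an already-prepared Postgres plan.
lookup = "SELECT v FROM kv WHERE k = ?"
rows = [conn.execute(lookup, (f"k{i}",)).fetchone()[0] for i in range(5)]
print(rows)  # ['v0', 'v1', 'v2', 'v3', 'v4']
```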
What does postgres have that's equivalent or better? If I'm understanding this MySQL improvement, it persists memcache data on disk, a great thing. Does postgres already do that, or better?
I think this MySQL feature has the potential to be neat, but if one is using it in the mode where one persists or faults to disk it seems like the biggest feature will not be performance, but protocol-compatibility and ease of development and deployment.
It seems like there are a few tuning modes relayed in the documents that have various persistence trade-offs, though, so the performance angle of the protocol could come into play.
I'd like Postgres to be able to support some kind of carefully sized multi-backend caching utility someday, but as an extension if at all possible. I think such a thing is technically not that exotic (though still a lot of work), yet it does not exist AFAIK. Prepared statements would usually be fast enough as-is if one had the right backend functions (UDFs), I suspect.
Does anybody know how this would work with a pool of servers? If servers were replicated, would the memcached pool stay in sync as well? I'd assume it has to work that way to be effective. It sounds cool, though; I'm glad MySQL continues to evolve.
This part of the manual lists several advantages as compared to simply packaging MySQL and memcached together, the first two probably being the most significant.