1] MySQL supports more replication options, whereas PostgreSQL focuses on the ones that 95% of users will care about and does not implement the rest.
2] MySQL's replication has several failure modes that can lead to data loss or non-synchronization. For example, it is statement based by default, so any statement which includes a call to NOW() will return different results on master and slave. PostgreSQL also has failure modes, but the defaults are designed to minimize them -- e.g. it always uses WAL and data file rewrites to ensure that data is properly synced and consistent.
3] Due to points 1 & 2, if you aren't an expert and don't need the extra features offered by MySQL, you are probably better off with PostgreSQL.
4] 3rd party tools exist for both servers which close many of the feature gaps between them.
Could someone who is more knowledgeable than me tell me if this is a fair reading?
Close. Postgres has traditionally focused on letting the community provide replication solutions, but we've been screaming so long for one that "just works" that they're finally getting around to building it.
I wouldn't say it covers 95% of the use cases. In fact, I find it very limiting, but if all you need is to make sure your database is backed up it's a great step in the right direction.
That being said, the topology limitations suck. If you lose your master server, you are running in a degraded environment until you can rebuild ALL slaves from scratch. This really sucks if you have a very large database that takes a long time to rebuild.
Postgres 9.0 replication is a great first step, but it's only a step. There's still a lot of work to be done.
If you lose your master server, you are running in a degraded environment until you can rebuild ALL slaves from scratch. This really sucks if you have a very large database that takes a long time to rebuild.
I just tested this, and it's not quite true.
Assume you have 1 master and 2 slaves (slave1 = the new master, slave2 = an additional slave), each replicating directly from the master. If they're in sync, then if you:
* shut down the master
* shut down slave1 (the new master)
* remove slave1's recovery.conf file
* assign slave1 the old master's IP address
* start slave1 as the new master
then slave2 will follow and you'll still have a replicated set. If, instead of doing the shutdown/remove-recovery.conf/startup dance, you touch the trigger file, a new timeline is created and slave2 isn't going to follow that timeline. I haven't found a way to make a slave follow a new timeline yet.
Yes, that's exactly what I was referring to, you're getting a new timeline. You need to rebase the slaves and you have no real-time backup for a few hours. Sucks. You have to be very careful, and there are all sorts of ways you can break it. It's not something I'd want to rely on right now when under fire.
I'd bet that there are some pretty basic failure modes in the timeline-preserving case where the slaves could be out of sync when one was promoted, and then that out-of-sync state could be propagated. I suspect that a missing transaction on the new master would trigger some sort of duplicate transaction oid issue, but a missing transaction on the slave would be more likely to go unnoticed.
I've just tested a case where I killed the master when I was running transactions against it, and it seems to work as well as the shutdown case. Bringing up the slave is where the timeline gets incremented.
I suppose I could route one of the replication connections through an IP delay and intentionally introduce replication lag.
edit: Looks like http://www.postgresql.org/docs/9.0/static/warm-standby.html#... (the section on monitoring) gives pg_current_xlog_location on the primary and pg_last_xlog_receive_location on the slaves, which would at least tell you whether the slaves are in sync, and if not, which one is the farthest ahead.
So, the process could be:
* master dies, either gracefully or not
* all slaves are queried
* the most advanced one is shut down, promoted by taking over the master's IP address and removing recovery.conf, then started
* all the other slaves should track those changes and bring themselves up to date
* resync the old master as a new slave
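The "most advanced one" selection in the list above can be sketched in Python. pg_last_xlog_receive_location returns an LSN as text in the form "xlogid/offset" (both parts hex); the helper names and hostnames below are invented for illustration, and the LSN values are made up:

```python
# Hypothetical helper: given the pg_last_xlog_receive_location reported by
# each slave, pick the slave that has received the most WAL and should be
# promoted. The "hi/lo" hex text format is what the function returns; the
# rest (names, hosts, values) is illustrative only.

def parse_lsn(lsn):
    """Convert a textual LSN like '0/3000060' into a comparable integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def most_advanced(slaves):
    """slaves: dict of hostname -> LSN text. Return the host to promote."""
    return max(slaves, key=lambda host: parse_lsn(slaves[host]))

# slave2 has received more WAL, so it is the promotion candidate:
candidates = {"slave1": "0/3000060", "slave2": "0/3A000B0"}
print(most_advanced(candidates))  # -> slave2
```

In a real failover script the dict would be filled by querying each standby for `SELECT pg_last_xlog_receive_location()`; the comparison logic stays the same.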
I just ran the same test. Set up a two-slave cluster, killed slave 2, then killed the master. Ran pg_last_xlog_receive_location on each slave, picked the one that was furthest along, promoted it to be the new master, and the other slave caught up as expected.
That's great. That's the first time I've been able to get this to work smoothly. I'm sure there are still ways to break it but this gives me hope.
1. I don't pretend to know what 95% of users will care about.
2. MySQL writes the timestamp for NOW() into the binlog, so provided your master and slave are in the same time zone, it's replication-safe. Functions like UUID() and RAND() will behave as you describe.
3. I disagree. MySQL's replication is more mature, widely understood, and has more documentation than PostgreSQL's.
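The NOW()-vs-UUID() distinction in point 2 can be shown with a toy simulation (invented names; this models the idea of shipping the master's timestamp alongside the statement, not MySQL's actual binlog format):

```python
import time
import uuid

# Toy model of statement-based replication: the master logs each statement
# together with its execution timestamp (as MySQL does via SET TIMESTAMP in
# the binlog), and the slave replays the statement using the logged timestamp.
# NOW() therefore replays deterministically; UUID() is re-evaluated and drifts.

def execute(statement, logged_now):
    """Evaluate a toy 'statement' on one server and return the stored value."""
    if statement == "INSERT NOW()":
        return logged_now            # uses the timestamp shipped in the log
    if statement == "INSERT UUID()":
        return str(uuid.uuid4())     # generated fresh on each server

# Master runs the statements and writes (statement, timestamp) to its "binlog":
binlog = [("INSERT NOW()", time.time()), ("INSERT UUID()", time.time())]
master_rows = [execute(stmt, ts) for stmt, ts in binlog]

# Slave replays the same binlog entries:
slave_rows = [execute(stmt, ts) for stmt, ts in binlog]

print(master_rows[0] == slave_rows[0])  # True: NOW() is safe when replayed
print(master_rows[1] == slave_rows[1])  # False: UUID() diverged
```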
I disagree strongly with "more mature", "widely understood" and "more documentation".
1. Mature?
I have seen MySQL replication blow up in so many awful ways, it's not funny. And yes, I've seen it happen on 5.* deployments. The worst part is that MySQL hardly ever detects a problem, and you hardly ever know why it went wrong - it just silently corrupts. The only way to detect the corruption is through regular checksum runs which can be costly on large databases (there's a reason why the checksum-utility from maatkit has fairly sophisticated options for incremental checksumming).
2. Widely understood?
I challenge you to prove your understanding and explain only a small subset of the functionality: Please enumerate all failure modes that can lead to corruption in statement based replication, and how to avoid them.
3. More documentation?
That has to be a joke. The MySQL documentation is a mess. It is poorly structured, poorly written and full of conflicting and misleading bits. Please show me the MySQL document about replication that comes remotely close to http://www.postgresql.org/docs/9.0/static/hot-standby.html in terms of clarity and exhaustiveness. Notice the long section about "query conflicts" - where is the MySQL equivalent?
1. This is the first release of baked in replication for PostgreSQL. MySQL has had years to evolve their approach and to fix bugs. That's not to say either system is perfect, but their relative maturity seems obvious.
2. No. It's widely understood because MySQL has a much larger mindshare than PostgreSQL. Statement based replication has its faults, but complexity isn't one of them.
3. It's not a joke. There is more documentation. There is the official documentation, but if you don't like that there are plenty of blog posts, books, and recorded talks on the subject.
MySQL has had years to evolve their approach and to fix bugs.
That sounds nice in theory, but completely ignores the reality of the respective code-bases. Features in MySQL have a tendency to rot rather than mature. It could hardly be more obvious than in the comparison of this version 1 of postgres replication with the "evolved" MySQL equivalent.
Postgres may have the smaller feature set, it may be the first version, yet I'm still inclined to trust it a lot more than what MySQL has to offer.
Simply because postgres has a track-record of being rock solid and MySQL, well, not so much.
Sure, in theory you can build fancy rings, cascades and even multi-master topologies in MySQL. But what is that worth when even the simplest master/slave mode still randomly and silently desyncs in various, undocumented situations?
Statement based replication has its faults, but complexity isn't one of them.
There's a nice quote from Alan Perlis:
Fools ignore complexity,
experts avoid it;
geniuses remove it.
Some complexity is inherent to replication. The question remains which parts, if any, they removed and which parts they ignored...
It's not a joke. There is more documentation.
Sorry, measuring documentation in terms of quantity sounds like yet another joke to me. As you can tell from my snarky comments, I'm responsible for babysitting the odd MySQL cluster in my day job (and postgres, too, so I have a frame of reference). It is indeed true that for most common problems I'll find the solution not in the official docs but rather on some more or less obscure forum, in broken English, linked from page >3 of the Google search results (after 2 pages of outdated and plain wrong information).
So, in all fairness, yes, most common problems can eventually be solved that way. But good luck debugging a not-so-common ring, multi-master or otherwise non-standard scenario that way. Or encoding issues. Or silent truncation. Or hotbackups that are strangely much smaller than they should be. Or the dreaded 'mysql' database being screwed after restoring a backup to a newer minor version...
Anyways, the way postgres handles this seems more sensible to me. In postgres the overwhelming majority of the "common" MySQL-issues simply doesn't exist.
Do you have any experience with MySQL replication? I do, and mine aligns with what moe is describing. Nice in theory isn't worth much if it's crappy in reality.
Yes. Over the last five years I've supported replication on hundreds of servers, powering sites you've probably used. I've had to clean up after crashes with unsynched binlogs, writes that completed partially on master, corrupted relay logs, failed hubs, and endless inconsistencies for a myriad of reasons.
Guess what, things have gotten incrementally better and continue to improve. Bugs get found and fixed. MySQL replication is mature enough to power huge sites like Twitter, Facebook, and countless others. There has been exactly one PostgreSQL release that supports replication suitable for read scaling.
I measure stability by how well something works, not by how many releases it has had. MySQL replication is getting better, but it does still have major problems. Sure, it's good enough to power huge sites (we use it here as well), but it's frustrating to deal with. Fortunately it's being used by web 2.0 companies and not banks, so generally if a slave is corrupted and returns weird results, customers aren't going to be too upset and in a lot of cases might not even notice, but it's still an annoying problem to deal with. I wonder how many more releases it's going to take for them to get it solid.
Almost. RAND() writes the random seed as metadata to the binary log. It's safe (as is NOW(), like you pointed out). UUID() is not safe though.
But examples like this are so special-case that they don't show the real danger. A DELETE with a LIMIT clause and no ORDER BY can delete different rows on the slave than on the master.
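A toy illustration of that DELETE ... LIMIT hazard (invented data, not MySQL internals): both replicas hold the same set of rows, but in a different physical order, so replaying "delete the first 2 rows the engine scans" removes different rows on each side.

```python
# Two replicas with the same rows in a different on-disk order. A DELETE with
# LIMIT 2 and no ORDER BY deletes whichever rows the engine happens to scan
# first, so statement-based replay silently diverges.

master_rows = [("alice", 1), ("bob", 2), ("carol", 3)]
slave_rows = [("carol", 3), ("alice", 1), ("bob", 2)]  # same set, other order

def delete_limit(rows, n):
    """Delete the first n rows in whatever order the engine scans them."""
    return rows[n:]

after_master = set(delete_limit(master_rows, 2))
after_slave = set(delete_limit(slave_rows, 2))
print(after_master)  # {('carol', 3)}
print(after_slave)   # {('bob', 2)} -- the replicas have silently diverged
```

Adding an ORDER BY on a unique key (or using row-based logging) makes the statement deterministic and removes the divergence.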
You never, EVER want to use circular replication (A->B->C->A). Surrounding a crash it's such a pain to re-piece together the data.
Pretty reasonable, from someone who has been following postgres for 10 years. I'm not sure I'd say that postgres covers 95% of use cases, as it really only covers single master. That is probably the most common case, but I'd have a hard time putting numbers on it. Extra features cut both ways, some apps are better with postgres, some with MySql.
Postgres was 'late' to the replication party, and only in this last release do they have hot slave machines with core functionality (3rd-party tools offer a range of alternatives). Currently, it is single-master, multi-slave replication. Previously, it was a warm-spare arrangement, up to 1 WAL log behind, for durability and not for load balancing. There were ways of making it less laggy, but it was not trivial to restore a log, bring the db up, and then be in a position to restore another log.
This is a wonderful in-depth article and the PostgreSQL team deserves a lot of kudos for the work they put into this. I'm looking forward to using it. The thing I find interesting that MySQL offers is multi-master and circular replication. Cal Henderson noted that Flickr uses multiple masters replicating to slave servers in their setup during DjangoCon 2008 (a great presentation if you haven't seen it: http://www.youtube.com/watch?v=i6Fr65PFqfk).
Granted, multi-master setups aren't needed for most sites, and PostgreSQL's new WAL replication, with the guarantee that the WAL is written on the slave before the transaction is committed on the master, means that data integrity should be top notch. It's exciting to see PostgreSQL's development progressing so well.
it makes sure that the WAL is written on the slave before the transaction is committed on the master means that data integrity should be top notch.
That's not my reading of the documentation. The slaves may delay applying a transaction for some configurable period of time to allow conflicting read-only queries to finish. Slaves can also disconnect and reconnect at will without stopping the master.
The slaves may delay applying the transaction, but the data is already on the slave machine. As the article points out, the data gets written to the WAL on the slave before the transaction is committed on the master. The slave might not update the database from the WAL immediately, but the data is on the slave in case the master goes down.
Basically, it means that there is the potential for replication lag and so you have to make sure you query from the master for just-updated data. However, it also means that if the master goes down, any data committed on the master is already on the slave in the WAL, just possibly not applied yet. So, it doesn't isolate you from replication lag, but it is a nice thing when it comes to making sure that the data remains intact even if a complete failure of the master occurs.
As the article points out, the data gets written to the WAL on the slave before the transaction is committed on the master
Then the article is wrong; it doesn't. I can reboot or kill the slave and the master will still happily accept transactions. It's also pretty clear from tcpdump that the data only goes out on commit, and it flows one way, from master to slave.