I have a production system that uses MySQL statement based replication for hot failover should a master database die. Running version 5.5 Percona. I have to use statement based replication for reasons that are immutable for the purposes of this question.
Now, I'd like to see the row-based replication stream of the same data, with the goal of trying to adapt this into an HBase based data store.
Is it possible to set up a MySQL server to slave (read) using statement-based replication, but at the same time be replication master (write to other slaves) using row-based replication? If so, how do I set this up? I've looked in the docs but failed to find it.
Best Answer
I assume, for the purpose of this question, that there may indeed be compelling reasons for using statement-based logging¹ but it is not generally recommended, because statement-based logging is relatively fragile. On any system where you have the flexibility not to use statement-based logging, don't use it -- use
MIXED
orROW
.²MySQL Server (and compatible systems, like Percona Server, MariaDB, and Aurora for MySQL) automatically "translate" from one format to the other, based on the configuration of each individual server.
To restate this with some additional implications, what you want to do “just works,” because the setting of
binlog_format
on the slave does not specify what the slave expects. It only sets what the slave will generate.Configure the slave with
binlog_format=ROW
and enablelog_slave_updates
inmy.cnf
on the slave (which causes incoming events to be rewritten to the slave's binlog)....and you're done.
The slave will log all of its DML as row-based events in spite of the master's binlog format. You don't really have to do anything else to make a slave also be a master, since every MySQL server with binary logging enable is, essentially, already a master -- it just may happen to be a master without any actual slaves.
Any combination of master and slave
binlog_format
is valid except for a master configured asROW
and a slave configured asSTATEMENT
(the opposite of what you're doing here), because while statements can be translated into row events (they affect rows on the slave, after all), the converse isn't true -- you can't necessarily determine the specific statement that changed rows, if your only knowledge is the actual changed data. But for the application you're asking about, the above should do exactly what you intend.I also discussed the interactions of the possible combinations of master and slave binlog format here, on dba.stackexchange.com in some additional detail.
¹logging is used here rather than "replication" because it's a more accurate description of what is actually being configured, though arguably the meaning is unchanged.
²
STATEMENT
logs the actual queries that made the change;ROW
logs "row images" of the rows inserted/updated/deleted by the query. For updates, boththe old and new values are logged.MIXED
mode allows the server to choose the format for each query, always usingROW
when there is anything about the query that makes its impact on the database potentially unsafe for statement-based replication, because the replica could potentially interpret it in a way that would cause the replica's data to diverge from the master, because the query isn't deterministic. Examples might include an unorderedUPDATE ... LIMIT
where the replica might update a different set of rows depending on its index selection, and statements using non-deterministic functions likeUUID()
. Other seemingly-non-deterministic functions likeNOW()
andRAND()
are compatible with statement-based replication, because there are hints written to the statement log to indicate the master's system time, and the master's random seed, at the time the query was executed.