Mysql – How to sync two MySQL tables (on demand or via cron)

MySQLmysql-replicationpercona

I have seen Cron sync mysql tables . But I cannot use replication. I used to use percona-toolkit and it worked perfectly. I could run the sync command on demand or just run it via cron. It would compare checksums on the two tables and do inserts, updates, deletes etc. However Perl(DBD::mysql) and MySQL have some incompatibilities in a new server I'm going to run this on and I cannot use pt-table-sync. Is there a similar solution which uses something other than Perl/DBD?

Edit : (more details, clarity)

  1. Both the source and target tables are live tables and in constant use. Hence a solution with the table not existing (eg. a DROP TABLE is done) would not be acceptable.
  2. Cannot use replication or any such modifications to the server itself. The solution has to work on the client side.
  3. In this particular scenario both the tables are in the same server, but different DBs (eg. db1.tbl db2.tbl). However, a solution which does not rely on this fact would definitely be a bonus
  4. Network latency is unlikely to be an issue. In this case the script runs on a server in the same datacenter.
  5. Cannot use Perl (incompatibility between Perl and MySQL – 64 bit vs 32 bit)

Best Answer

use mysqldump --opt <database> <tablename> to create a dump of your table and feed it to your new server. As you apparently have access to the remote database via TCP/IP, you simply could use

mysqldump --opt --user=<youruser> --password=<yourpassword> -host <yourhost> \
<yourDB> <yourtable> | mysql -u <newserveruser> -p<password>

to connect to the remote database, dump it and feed the output into your new server.

If you did not have direct TCP/IP access to the remote database, you still could do pretty much the same by tunneling the data through SSH after setting up public key authentication:

ssh -C -l <remoteuser> <remoteserver> \
'mysqldump --opt --user=<youruser> --password=<yourpassword> <yourDB> <yourtable>' \
| mysql -u <newserveruser> -p<password>

See the documentation to mysqldump and the man page for SSH for more details.

If you need more bandwidth efficiency, consider creating a dump with mysqldump, storing it on the source server and using rsync for copying/updating the counterpart on the destination server before importing. As rsync will create rolling checksums on the source and the destination file, it likely will not need to transfer most of the dump's contents on subsequent runs.

There has been a mysqldump patch which was meant to use temporary tables when inserting rows and rename the table afterwards to the original table name to reduce lock time, but I would consider it experimental as it has unresolved issues and never made it into the main branch. See this discussion for patch code and details.

If you simply can't drop the table on the destination for whatever reasons, you might insert the dumped data into a new table (a quick & dirty but somewhat unsafe approach would pipe the mysqldump output to sed -e 's/mytable/newtable/g' before further piping to mysql) and then run an UPDATE / DELETE / INSERT cycle with a couple of JOINs like this (untested, do a sanity check):

/* set write lock on the table so it cannot be read while updating */
LOCK TABLES mytable WRITE;

/* update all rows which are present in mytable and newtable */
UPDATE mytable AS M LEFT JOIN newtable AS N ON M.primarykey = N.primarykey 
SET M.column1=N.column1, M.column2=N.column2 [...]
WHERE N.primarykey Is Not NULL;

/* delete all rows from mytable which are no longer present in newtable */
DELETE M FROM mytable AS M LEFT JOIN newtable AS N on M.primarykey = N.primarykey 
WHERE N.primarykey Is NULL;

/* insert new rows from newtable */
INSERT INTO mytable (primarykey, column1, column2, [...]) 
SELECT (N.primarykey, N.column1, N.column2, [...]) FROM mytable AS M 
RIGHT JOIN newtable AS N ON M.primarykey=N.primarykey WHERE M.primarykey Is NULL

/* release lock */
UNLOCK TABLES;

Note: of course, your database's data would be inconsistent while you are inserting/updating its data, but as long as you are not using transactions (not available for MyISAM tables), this would be the case no matter what you do - dropping and recreating the table would create temporary inconsistencies just as doing the update/delete/insert cycle would. This is due to the very nature of a non-atomic transactionless design of MyISAM.