We use replication across datacenters in several European countries (so they aren't across the world from each other, but they are certainly not local) and it works without any problem.
Replication will automatically restart if possible. If there is a problem with a query (e.g. a database is present on the master and not the slave, and a query uses it), then it will require manual correction by default (but you can set it to ignore such errors). If the databases are exact mirrors, then you should never need to manually restart replication.
If you have two servers and the master disappears, then to turn the slave into the 'master', just stop replication and alter your code (to write to the new 'master'). If you have three or more servers and the master disappears, then stop replication on the slaves, change them to use the new master, and start again. If they aren't exactly in sync (depends how much data is being transferred, how busy the servers are, how good the network connection is, etc), then you might have to do more work than that. The replication section of the MySQL documentation covers this in more detail.
I would suggest that you ensure that you are replicating over SSL (i.e. set the replication user to require a SSL connection).
There are two alternatives
ALTERNATIVE #1 : Use XtraBackup
It is capable of copying MyISAM as well as InnoDB on a running master.
ALTERNATIVE #2 : Run rsync multiple times
You could run rsync against /var/lib/mysql on a master and copy it to /var/lib/mysql on a slave. Of course, I would run rsync several times. Until the final rsync you should run the FLUSH TABLES WITH READ LOCK. Before copying make sure you hose all binary logs and start from scratch.
Before running anything, please make sure binary logs are written in /var/lib/mysql on both master and slave by having something like the following in /etc/my.cnf:
[mysqld]
log-bin=mysql-bin
Please try running this script in the event you do not want to shutdown MySQL on the master:
mysql -u... -p... -e"SET GLOBAL innodb_max_dirty_pages_pct = 0; RESET MASTER;"
RSYNCSTOTRY=7
cd /var/lib/mysql
X=0
while [ ${X} -lt ${RSYNCSTOTRY} ]
do
X=`echo ${X}+1|bc`
rsync -r * slaveserver:/var/lib/mysql/.
sleep 60
done
mysql -u... -p... -e"FLUSH TABLES WITH READ LOCK; SELECT SLEEP(86400);"
sleep 60
SLEEPID=`mysql -u... -p... -e"SHOW PROCESSLIST;" | grep "SELECT SLEEP(86400)" | awk '{print $1}'`
rsync -r * slaveserver:/var/lib/mysql/.
mysql -u... -p... -e"KILL ${SLEEPID};"
I am a little more conservative in terms of data and index pages being cached while doing this. Personally, I prefer to shutdown mysql after several rsyncs instead of the FLUSH TABLES WITH READ LOCK. Another alternative to this script would be the following script which shuts down mysql for the final rsync:
mysql -u... -p... -e"RESET MASTER;"
RSYNCSTOTRY=7
cd /var/lib/mysql
X=0
while [ ${X} -lt ${RSYNCSTOTRY} ]
do
X=`echo ${X}+1|bc`
rsync -r * slaveserver:/var/lib/mysql/.
sleep 60
done
service mysql stop
rsync -r * slaveserver:/var/lib/mysql/.
service mysql start
That's all for the rsync portion from the master. What about the Slave ???
Before you start mysql on the slave, you need to have the log file and log position from the master. The binary logs you copied have you need, particularly the last binary log. Here is how you get it on the slave:
cd /var/lib/mysql
for X in `ls -l mysql-bin.0* | awk '{print $9}'`
do
LOGFIL=${X}
done
LOGPOS=`ls -l ${LOGFIL} | awk '{print $5}'`
echo "Master Log File ${LOGFIL} Position ${LOGPOS}"
You can trust these numbers because you copied them personally from the master. Now that you have the master log and position, you can start up mysql on the slave and setup replication using the log file and log position that was just reported.
Give it a try !!!
CAVEAT
If you have any InnoDB data, you should set this about 1 hour before attempt to rsync:
SET GLOBAL innodb_max_dirty_pages_pct = 0;
This will cause InnoDB to page out uncommitted data from the InnoDB Buffer Pool faster.
Best Answer
This would be interesting to write down while catching up:
how much bandwidth is used
how much cpu utilization (especially i/o wait)
ram usage
In response to the new information about i/o wait being high; for innodb a lot can be done, take a look at. I've learned many things from mysqlperformanceblog. Here's some hints:
innodb_flush_method=O_DIRECT
"Avoid double buffering and reduce swap pressure, in most cases this setting improves performance. Though be careful if you do not have battery backed up RAID cache as when write IO may suffer."
innodb_flush_log_at_trx_commit=2
"If you’re not concern about ACID and can loose transactions for last second or two in case of full OS crash than set this value. It can dramatic effect especially on a lot of short write transactions."
Those did WONDERS for us, but the drawback is that you might loose a second of written data. This is because instead of write (flush to disk) every record, you flush every second.
You can read more on: http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/