Mysql – Replicate main MySql db to a development server to play with real data

MySQLmysql-replication

I have a main architecture with a MySql db, replication and backup, it's working fine. Now I also have a development server where I play with the code and I would like to use the data from the main db to play too, with reads and writes. How can I set up the replication to have some real data coming from the main db to the development db? Should I replicate permanently or only once a day to keep some sort of consistency during tests on dev db? Any idea / strategy is welcome.

Edit: ok so to clarify the kind of application I use is a web app development with Django. The idea is to test some new features on the development server. The need to write is important, a read only database won't be enough to run the tests thoroughly. For the moment the database takes a fairly reasonable amount of time to dump to another server (something like 10 minutes), but it's growing.

Best Answer

The "right answer" is highly application dependant, but here are a few strategies I've seen used before.

Daily Backup / Restore-on-demand

This works well if you have a small data set, and the need to be able to have various developers use the backups ad-hoc. The idea is basically a mysqldump on a replication target system and some scripts to do a MySQL import.

ETL Process

This approach makes a lot of sense if you want a subset of data, or you want to mask it somehow to protect the data from breach. You can transform sensitive data within the ETL scripts and either load directly into a development database, or create a dump like in the above approach, but you'll know that it's been cleaned already.

An ETL process can be nice if you want to run it hourly to, say, extract the last hour from production, do some cleanup/reorganization and import it into a development "master" database for export to development systems, etc.

Binary Log Replication

This method probably won't work for you, as you specifically want to read and write to the development database. Sometimes people will use replication to run read-only regression testing against, but modifying the database will result in failed replication and/or inconsistent data.

Related Topic