Mysql – Modifying columns of very large thesql tables with little or no downtime

databasedatabase-administrationinnodbmaster-slaveMySQL

I periodically need to make changes to tables in mysql 5.1, mostly adding columns. Very simple with the alter table command. But my tables have up to 40 million rows now and they are growing fast… So those alter table commands take several hours. In a couple months they'll take days I'm guessing.

Since I'm using amazon RDS, I can't have slave servers to play with and then promote to master. So my question is if there's a way to do this with minimal downtime? I don't mind an operation taking hours or even days if users can still use the db of course… Can they at least read while columns are being added? What happens if my app tries to write? Insert or update? If it fails immediately that's actually not so bad, if it just hangs and causes problems for the db server that's a big problem..

This must be a fairly common scaling issue, everyone needs to add columns.. What's typically done to a production db? Slave -> master migration?

Update – I forgot to mention I'm using the innodb storage engine

Best Answer

I periodically need to make changes to tables in mysql 5.1, mostly adding columns.

Don't. No really. Just don't. It should be a very rare occasion when this is ever necessary.

Assuming your data really is normalized to start with, the right way to solve the problem is to add a new table with a 1:1 relationship to the base table (non-obligatory on the new table).

Having to add columns regularly is usually an indicator of a database which is not normalized - if your schema is not normalized then that's the problem you need to fix.

Finally, if your schema really, really is normalized and you really, really must keep adding columns then:

  1. Ensure you've got a timestamp column on the database or that it is generating replication logs
  2. Create a copy (B) of the table (A)
  3. add the new columns to B (this will still block with myisam)
  4. disable transactions
  5. rename the original table (A) as something else (backup)
  6. rename the new table (B) with the name of the original table (A)
  7. replay the transactions from the start of the operation from the replication log or from the backup table
  8. enable transactions.