Database – Handling Schema Changes When Pushing New Versions

databasepushversioning

During times of heavy development, the database schema changes both rapidly and continuously, and by the time our weekly push to the beta build comes around, the schema has changed so much that the only sensible option is to nuke all of the tables I can and copy the new versions from my dev database. Obviously, this isn't going to work once we launch, since nuking production data is a recipe for disaster, so I was wondering what strategies were out there for managing database schema changes from one version/revision to another?

Some I've found or experienced:

  1. Straight nuke-and-dump from one database to another (what I am doing now)
  2. Maintaining an UPDATE.sql file with SQL statements that get run either via script or by hand.
  3. Maintaining an update.php file with a corresponding "db-schema-version" value in the active database

The third option seems to be the most sensible, but there still exists the possibility of a badly constructed SQL query failing mid-script, leaving the database in a half-updated state, necessitating a restoration of a backup.

It seems like a non-issue, but it does happen, since we as a team, we use phpMyAdmin, and I can't seem to even rely on myself remember copying the executed SQL statement to paste into the update.php file. Once you navigate to another page, I have to re-write the SQL statement out by hand, or reverse my change and do it again.

I guess what I am hoping for is a solution that doesn't impact our established development workflow?

Best Answer

Automate. Automate. Automate.

You're on the right track with an explicit DB version number, but I would go a step further and make the code explicitly aware of exactly what schema it expects to work against (e.g. by committing the actual DDL script and having the updater parse it); then at update time you only have to discover the existing scheme via database metadata and INSERT/DROP/ALTER as necessary, no matter from what version to what version you are updating. (You could also keep an explicit version number in the database itself and deliver the entire schema history with the installer so that you don't even need schema discovery.)

Potential syntax errors in the update SQL script are a problem, but you can solve that by verifying that the updater can only produce correct DDL statements. (Formal proofs are almost never worthwhile in enterprise software development - too much effort for too little assurance - but I feel that database integrity is one of the few exceptions: basic SQL isn't a particularly hard language to capture formally, and the benefit of protecting production data is so great that almost any amount of one-time up-front effort is justified, particularly if it has to work for unattended installs,)