Deployment CI – How to Incrementally Update a Database When Deploying a Website?

continuous integrationdeploymenttomcat

We use JSF over Tomcat along with a backing proprietary DB.

I want to know how the addition of new database fields or/and stored procedures should be handled when we deploy a new version of the application.

We deploy by sending a WAR file to the hosting Team, as such our WAR should be able to
setup the stored procedure when it starts.

What are the possible appropriate strategies to be able to incrementally update the database?

Best Answer

Jelayn's answer is pretty good, and represents roughly standard practice, esp in the Rails world where a similar solution is baked into the framework. However, the state of the art, as practiced by Facebook, Flickr, Heroku, and IMVU - a pioneer in this - is a bit more advanced.

The problem with the standard practice is that it assumes you can write 2 versions of code, and switch from one to the other with a quick migration as you deploy. There's lots of problems with this:

  • If you have a lot of data, a migration will take time. It takes months for Facebook to migrate data, but unless you have very little data, it's rare for the migration to be instant. The solution to this is to migrate slowly. IMVU pioneered the strategy here:

    • instead of migrating the table, make a separate table with the new schema
    • when a row is read that isn't present in the new table, migrate only that row and save it to the new table. You can then delete the old row.
    • also run a background job that runs the migrations in batches (watch out for conflicts!)
  • during a migration, your code needs to support the old schema and the new one. So you should write the code that supports both, then migrate, then remove the old code in a separate deploy.

  • upon a failed migration, you should decide whether you can migrate in both directions or just forwards. IMVU goes forward only I think - so if they have a bug in their migration, they fix it instead of rolling it back. I personally prefer fixing, but there are good arguments for rolling it back too.

This stuff is all highly related to continuous deployment as well, and a lot of those strategies can help. For example, a common CD technique is selectively enable new code only for a small user base to give it more testing, and only then enable it for larger and larger audiences (typically, you use a DB setting for each feature). This practice allows you gain more confidence in your migrations, esp when they are more complicated than just adding a field, and without betting all your data on a relatively untested migration.

Source: market research and talking to customers while making my Continuous Integration and Deployment start-up.

Related Topic