How to update your production codebase/database schema without causing downtime

deployment

What are some techniques for updating a production server's code base / database schema without causing any downtime?

Best Answer

Generally, the websites I've worked on that had this sort of requirement were all behind load balancers, or had separate failover locations. In this example, I'll presume that you've got a single load balancer, 2 web servers (A & B) and 2 database servers (M & N - usually DB servers are linked via log shipping, at least in the SQL Server world).

  1. Disconnect Webserver A from the load balancer (so all incoming traffic goes to B).
  2. Stop log shipping (DB Server M is going to get updated first).
  3. Update Webserver A. Point the configuration to DB Server M.
  4. Test and verify that the update worked (usually testers hit the server's IP address directly, bypassing the load balancer).
  5. Set the load balancer so that existing sessions continue to go to B. New sessions go to A.
  6. Wait for all sessions on B to expire (this might take half an hour or more; usually we watch traffic and have a 1 hour break scheduled).
  7. Update B and N.
  8. Test and verify that the update worked.
  9. Set up log shipping again and test it works.
  10. Set the load balancer to regular operation.
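
The web-server half of the steps above can be sketched in a few lines. This is only a toy in-memory model, not a real load balancer API: `WebServer`, `LoadBalancer`, `drain`, `enable` and `update_server` are hypothetical names made up for illustration, sessions are assumed to be sticky, and the database steps (stopping and restoring log shipping) are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class WebServer:
    name: str
    version: str
    accepts_new_sessions: bool = True
    sessions: set = field(default_factory=set)


class LoadBalancer:
    """Toy load balancer with sticky sessions (hypothetical, illustration only)."""

    def __init__(self, servers):
        self.servers = {s.name: s for s in servers}

    def route(self, session_id):
        # Existing sessions stay on the server that already holds them.
        for server in self.servers.values():
            if session_id in server.sessions:
                return server.name
        # New sessions go to the first server still accepting them.
        for server in self.servers.values():
            if server.accepts_new_sessions:
                server.sessions.add(session_id)
                return server.name
        raise RuntimeError("no server accepts new sessions")

    def drain(self, name):
        self.servers[name].accepts_new_sessions = False

    def enable(self, name):
        self.servers[name].accepts_new_sessions = True


def update_server(lb, name, new_version):
    """Refuse to update a server that is still taking or holding traffic."""
    server = lb.servers[name]
    if server.accepts_new_sessions:
        raise RuntimeError(f"{name} must be drained first")
    if server.sessions:
        raise RuntimeError(f"{name} still has active sessions")
    server.version = new_version


def demo():
    a, b = WebServer("A", "v1"), WebServer("B", "v1")
    lb = LoadBalancer([a, b])

    # Steps 1-4: take A out of rotation and update it while B serves traffic.
    lb.drain("A")
    during = lb.route("s1")
    update_server(lb, "A", "v2")

    # Step 5: new sessions go to A, existing sessions stay on B.
    lb.enable("A")
    lb.drain("B")
    sticky = lb.route("s1")
    fresh = lb.route("s2")

    # Step 6: wait for B's sessions to expire (simulated here by clearing them).
    b.sessions.clear()

    # Steps 7 and 10: update B, then return to regular operation.
    update_server(lb, "B", "v2")
    lb.enable("B")
    return during, sticky, fresh, a.version, b.version
```

The guard in `update_server` is the point of the sketch: a server is only touched once it is out of rotation and holds no sessions, which is what steps 1 and 6 are there to guarantee. In a real deployment the draining and the session count would come from whatever load balancer you actually run.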

In very complicated web applications, what is described as steps 1-5 might take all night and be driven by a 50-page Excel spreadsheet with times and emergency contact numbers. In such situations, updating half the system is scheduled for 6pm to 6am, with the system remaining available to users throughout. Handling the update for the DR site is usually scheduled for the following night - just hope nothing breaks the first day.

Where uptime is a requirement, patches are tested first on the QA environment, which ideally runs on the same hardware as production. If they show no disruption there, they can be applied on the regular schedule, which is usually on the weekend.