Scripting successful migrations

One of the first projects I inherited at Live Oak was a rather large custom CMS implementation with several modules for managing users and organizations in the nonprofit sector. The first iteration had some issues (I won’t get into them in this post, but they will most certainly be the topic of future posts on this blog), and for the last few months, I have been performing a massive overhaul of the data model and several of the frontend processes. It has been great to rip out some of the more obscure code and refactor the rough parts of the application.

However, there were some complications as well. New users have been registering and entering all kinds of valuable information for the last 7 months, and much of this data needed to be moved around to new tables and models. This presented some interesting challenges during development because the changes had to all be implemented in a one time application upgrade, eliminating the possibility of gradually moving the information into the new data model.

Thankfully, the upgrade went very smoothly, so I thought I would take a bit and summarize some of the principles that I utilized to make this happen. This should generalize pretty well to any web application that is heavily data-centric and needs to undergo significant model/schema changes with minimal downtime.

  • Automate as much as possible - For small changes to an application, it is acceptable to just keep a running list of steps you have completed in your development environment and then perform these in your live environment when you roll out your changes. However, this is hugely prone to human error, and cannot be tested with great reliability. Maintain a sql file with all of your schema changes, and begin generating a script to perform all data migration operations as early as possible in your development cycle. Ideally, these will encompass all of your data migration (with the possible exception of filesystem changes). You will be most in-tune with the changes that need to happen at the time of implementation - not right before you are ready to go live!
  • Use your data models in your migration script - When I originally started working on my migration script for this project I thought I would write an old-school procedural “quick and dirty” script to just shuffle the data around. But as I thought about it, I realized that I was ignoring the dependencies that existed within the database. For example, whenever a new user is created, it is not as simple as just inserting a row in the users table. I needed to create a user group, the associated group memberships, assign permissions, etc… Of course, all of this already happens inside of the data models in the application. So instead of going in the back door, have your migration script utilize your application’s data models so you can rely on code that you have already verified (hopefully!) to follow your business rules.
  • Simulate your migration, and then do it again! - Creating a comprehensive migration script during the development phase of your application affords you the awesome opportunity to be able to do some simulations before you actually run your script in the live environment. This is pretty simple. Do a dump of the database from your live environment and import it into your testing database (if your database is too large for this, just choose a representative sample). Now you can run your migration script until you feel comfortable that you are covering all of the different scenarios that exist in your live database. Once you have fixed an issue in the script, just re-import the data from your database dump and try again. I had tested my migration script like crazy before I started simulations, and I still found quite a few issues that only turned up when I encountered the live data that had been entered by real users and not programmers. Once you have run the simulation several times with success, you are ready to think about deploying your application.
  • Stage your application in the live environment - I know that it is not possible for everybody to do this. However, I have encountered several ugly MySQL bugs that affected either my development server or my live server (but rarely both). It is not possible to anticipate environment variables such as these - so if it is at all possible, get your application running on a new virtual host in your live environment as soon as possible. This is the perfect place for your co-workers, testers, or even your clients to interact with the application. After a while, you can be sure that you will not encounter issues with your live environment when you go live with your changes, allowing you to focus on more important things.

That should give you a good start on lowering the stress associated with large application changes in a production environment. Of course, I still ran into a couple of issues with the migration process - so don’t get down on yourself. The bottom line is that migrations are really ugly and almost always come accompanied with unexpected issues. The first goal of developers should be to minimize the chances that the affects of the resulting bugs will significantly affect the end user. Automation is a significant step in that direction.

So what have you learned about upgrades and migrations? Please share any stories and lessons that you have learned in the comments.