Strategy for Backwards Compatibility of Persistent Storage

backward-compatibility, language-agnostic, persistence

In my experience, trying to ensure that new versions of an application retain compatibility with data storage from previous versions can often be a painful process.

What I currently do is to save a version number for each 'unit' of data (be it a file, database row/table, or whatever) and ensure that the version number gets updated each time the data changes in some way. I also create methods to convert from v1 to v2, v2 to v3, and so on. That way, if I'm at v7 and I encounter a v3 file, I can do v3->v4->v5->v6->v7.
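As a rough illustration of the chain described above, here is a minimal Python sketch. The record layout, the field names, and the `MIGRATIONS` table are all hypothetical and chosen only to make the example runnable; the point is that each function upgrades by exactly one version and a loop walks the record forward.

```python
CURRENT_VERSION = 3

def v1_to_v2(record):
    # Hypothetical change: split "name" into "first_name" and "last_name".
    first, _, last = record["name"].partition(" ")
    return {"version": 2, "first_name": first, "last_name": last}

def v2_to_v3(record):
    # Hypothetical change: add an "email" field with an empty default.
    return {**record, "version": 3, "email": ""}

# Maps a version number to the function that upgrades it by exactly one step.
MIGRATIONS = {1: v1_to_v2, 2: v2_to_v3}

def upgrade(record):
    """Apply single-step conversions until the record is at CURRENT_VERSION."""
    while record["version"] < CURRENT_VERSION:
        step = MIGRATIONS[record["version"]]
        record = step(record)
    return record
```

Calling `upgrade({"version": 1, "name": "Ada Lovelace"})` runs v1->v2->v3 in order, while a record that is already at v3 passes through unchanged. Adding v4 would mean writing one new function and one new table entry, without touching the older steps.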

So far this approach seems to be working out well, but I haven't had to make use of it extensively yet, so there may be unforeseen problems. I'm also concerned that if the objects I'm loading change significantly, I'll either have to keep around old versions of the classes or face updating all my conversion methods to handle the new class definition.

Is my approach sound? Are there other/better approaches I could be using? Are there any design patterns applicable to this problem?

Best Answer

You're doing it right. You're stamping data with its version, which means you have a definite interpretation of it. The only open question is how to handle "old" data. Your choices are essentially between upgrading data where it lives, having your code adapt the data in real time, or having the code handle multiple data versions. From 30+ years of experience, I can tell you the first option, upgrading the data where it lives, is the only sane way to go. Bite the bullet and write a conversion routine for each step along the history, and run them in sequence. If you find that a later step obviates an earlier one (e.g., why upgrade the rows in a table if a later step deletes the table?), resist the temptation to short-circuit things unless there is a large, demonstrable performance gain in the update process.
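To make "upgrade the data where it lives" concrete, here is one possible sketch using Python's standard `sqlite3` module. It assumes the store is a SQLite database and uses SQLite's `PRAGMA user_version` as the version stamp; the `notes` table and the two steps are invented for illustration, and a real upgrade would also need backups and error handling.

```python
import sqlite3

def step_1(conn):
    # Hypothetical v0 -> v1 step: create the initial table.
    conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")

def step_2(conn):
    # Hypothetical v1 -> v2 step: add a column with a default value.
    conn.execute("ALTER TABLE notes ADD COLUMN pinned INTEGER NOT NULL DEFAULT 0")

# STEPS[i] upgrades a database at version i to version i + 1.
STEPS = [step_1, step_2]

def upgrade_in_place(conn):
    """Run every pending step in order, stamping the version after each one."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    for target, step in enumerate(STEPS[version:], start=version + 1):
        step(conn)
        # target is an int produced by enumerate, so formatting it in is safe.
        conn.execute(f"PRAGMA user_version = {target}")
        conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]
```

Every step runs even if a later one would make it redundant, which matches the advice above: the sequence stays simple to reason about, and a database at any historical version reaches the current one by the same path. After the upgrade, the rest of the application only ever sees the latest schema.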
