C# Refactoring – Using a Factory Method Instead of a Constructor While Maintaining Backwards Compatibility

Tags: c#, design-patterns, factory-method, refactoring

The problem

Let's say I have a class called DataSource which provides a ReadData method (and maybe others, but let's keep things simple) to read data from an .mdb file:

var source = new DataSource("myFile.mdb");
var data = source.ReadData();

A few years later, I decide that I want to be able to support .xml files in addition to .mdb files as data sources. The implementation for "reading data" is quite different for .xml and .mdb files; thus, if I were to design the system from scratch, I'd define it like this:

public abstract class DataSource {
    public abstract Data ReadData();
    public static DataSource OpenDataSource(string fileName) {
        // return MdbDataSource or XmlDataSource, as appropriate
    }
}

public class MdbDataSource : DataSource {
    public override Data ReadData() { /* implementation 1 */ }
}

public class XmlDataSource : DataSource {
    public override Data ReadData() { /* implementation 2 */ }
}

Great, a textbook static factory method. Unfortunately, DataSource is located in a library, and refactoring the code like this would break all existing calls of

var source = new DataSource("myFile.mdb");

in the various clients using the library. Woe is me, why didn't I use a factory method in the first place?
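For illustration, the factory method in the from-scratch design could dispatch on the file extension. This is only a sketch under stated assumptions: the extension-based rule, the placeholder Data class with its Origin property, and the constructor signatures are hypothetical and not taken from the actual library.

```csharp
using System;
using System.IO;

// Hypothetical placeholder for whatever ReadData returns.
public class Data {
    public string Origin { get; }
    public Data(string origin) { Origin = origin; }
}

public abstract class DataSource {
    protected string FileName { get; }
    protected DataSource(string fileName) { FileName = fileName; }

    public abstract Data ReadData();

    // Assumption: the file extension alone decides the concrete type.
    public static DataSource OpenDataSource(string fileName) {
        string ext = Path.GetExtension(fileName).ToLowerInvariant();
        switch (ext) {
            case ".mdb": return new MdbDataSource(fileName);
            case ".xml": return new XmlDataSource(fileName);
            default:
                throw new NotSupportedException("Unsupported data source: " + fileName);
        }
    }
}

public class MdbDataSource : DataSource {
    public MdbDataSource(string fileName) : base(fileName) { }
    // Stand-in for "implementation 1".
    public override Data ReadData() { return new Data("mdb:" + FileName); }
}

public class XmlDataSource : DataSource {
    public XmlDataSource(string fileName) : base(fileName) { }
    // Stand-in for "implementation 2".
    public override Data ReadData() { return new Data("xml:" + FileName); }
}

public static class Program {
    public static void Main() {
        DataSource source = DataSource.OpenDataSource("myFile.xml");
        Console.WriteLine(source.GetType().Name); // XmlDataSource
    }
}
```

Because DataSource is abstract in this design, new DataSource("myFile.mdb") no longer compiles, which is exactly the compatibility problem described above.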


Solutions

These are the solutions I could come up with:

  1. Make the DataSource constructor return a subtype (MdbDataSource or XmlDataSource). That would solve all my problems. Unfortunately, C# does not support that: a new expression always creates an instance of exactly the class it names.

  2. Use different names:

    public abstract class DataSourceBase { ... }    // corresponds to DataSource in the example above
    
    public class DataSource : DataSourceBase {      // corresponds to MdbDataSource in the example above
        [Obsolete("New code should use DataSourceBase.OpenDataSource instead")]
        public DataSource(string fileName) { ... }
        ...
    }
    
    public class XmlDataSource : DataSourceBase { ... }
    

    That's what I ended up using since it keeps the code backwards-compatible (i.e. calls to new DataSource("myFile.mdb") still work). Drawback: The names are not as descriptive as they should be.

  3. Make DataSource a "wrapper" for the real implementation:

    public class DataSource {
        private readonly DataSourceImpl impl;
    
        public DataSource(string fileName) {
            impl = ... ? new MdbDataSourceImpl(fileName) : new XmlDataSourceImpl(fileName);
        }
    
        // forwards to the selected implementation
        public Data ReadData() {
            return impl.ReadData();
        }
    
        private abstract class DataSourceImpl { ... }
        private class MdbDataSourceImpl : DataSourceImpl { ... }
        private class XmlDataSourceImpl : DataSourceImpl { ... }
    }
    

    Drawback: Every data source method (such as ReadData) must be routed by boilerplate code. I don't like boilerplate code. It's redundant and clutters the code.
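To make the trade-off of option 3 concrete, here is a compilable sketch of the wrapper. The original leaves the selection condition open; the extension check below, the placeholder Data class, and the stand-in method bodies are assumptions made purely for illustration.

```csharp
using System;
using System.IO;

// Hypothetical placeholder for whatever ReadData returns.
public class Data {
    public string Origin { get; }
    public Data(string origin) { Origin = origin; }
}

public class DataSource {
    private readonly DataSourceImpl impl;

    public DataSource(string fileName) {
        // Assumption: an .xml extension selects the XML reader;
        // everything else stays on the original MDB path.
        bool isXml = string.Equals(Path.GetExtension(fileName), ".xml",
                                   StringComparison.OrdinalIgnoreCase);
        if (isXml) impl = new XmlDataSourceImpl(fileName);
        else impl = new MdbDataSourceImpl(fileName);
    }

    // Forwarding boilerplate: one such method per public operation.
    public Data ReadData() { return impl.ReadData(); }

    private abstract class DataSourceImpl {
        protected readonly string FileName;
        protected DataSourceImpl(string fileName) { FileName = fileName; }
        public abstract Data ReadData();
    }

    private sealed class MdbDataSourceImpl : DataSourceImpl {
        public MdbDataSourceImpl(string fileName) : base(fileName) { }
        public override Data ReadData() { return new Data("mdb:" + FileName); }
    }

    private sealed class XmlDataSourceImpl : DataSourceImpl {
        public XmlDataSourceImpl(string fileName) : base(fileName) { }
        public override Data ReadData() { return new Data("xml:" + FileName); }
    }
}

public static class Program {
    public static void Main() {
        var source = new DataSource("myFile.mdb"); // existing call sites compile unchanged
        Console.WriteLine(source.ReadData().Origin); // mdb:myFile.mdb
    }
}
```

Callers never see the implementation classes, so the public surface stays exactly as it was; the cost is one forwarding method on DataSource for every operation the implementations offer.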

Is there any elegant solution that I have missed?

Best Answer

I would go for a variant of your second option that lets you phase out the old, overly generic name DataSource:

public abstract class AbstractDataSource { ... } // corresponds to the abstract DataSource in the ideal solution

public class XmlDataSource : AbstractDataSource { ... }
public class MdbDataSource : AbstractDataSource { ... } // contains all the code of the existing DataSource class

[Obsolete("New code should use AbstractDataSource instead")]
public class DataSource : MdbDataSource { // an 'empty shell' to keep old code working.
    public DataSource(string fileName) { ... }
}

The only drawback here is that the new base class can't have the most obvious name, because that name is already claimed by the original class and must stay that way for backwards compatibility. All the other classes get descriptive names.
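A compilable sketch of this layout may help show what old callers experience. The placeholder Data class, the constructor signatures, and the stand-in ReadData bodies are assumptions; a factory method on AbstractDataSource is left out to keep the focus on the obsolete shell.

```csharp
using System;

// Hypothetical placeholder for whatever ReadData returns.
public class Data {
    public string Origin { get; }
    public Data(string origin) { Origin = origin; }
}

public abstract class AbstractDataSource {
    public abstract Data ReadData();
}

public class XmlDataSource : AbstractDataSource {
    private readonly string fileName;
    public XmlDataSource(string fileName) { this.fileName = fileName; }
    public override Data ReadData() { return new Data("xml:" + fileName); }
}

// Holds the code that used to live in DataSource.
public class MdbDataSource : AbstractDataSource {
    protected string FileName { get; }
    public MdbDataSource(string fileName) { FileName = fileName; }
    public override Data ReadData() { return new Data("mdb:" + FileName); }
}

// Empty shell: adds nothing, only keeps the old name alive.
[Obsolete("New code should use AbstractDataSource instead")]
public class DataSource : MdbDataSource {
    public DataSource(string fileName) : base(fileName) { }
}

public static class Program {
    public static void Main() {
        // A legacy call site. Without the pragma the compiler reports
        // warning CS0618 with the Obsolete message, but still compiles.
#pragma warning disable CS0618
        var legacy = new DataSource("myFile.mdb");
#pragma warning restore CS0618
        AbstractDataSource asBase = legacy;
        Console.WriteLine(asBase is MdbDataSource);  // True
        Console.WriteLine(legacy.ReadData().Origin); // mdb:myFile.mdb
    }
}
```

Since the attribute is used without its error flag, existing clients keep building and only get a warning nudging them toward the new names; the shell can be deleted once no caller references DataSource any more.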