Data Import Design Pattern – Various Source and Destination Types

c# design-patterns

I have to design and build an import script (in C#) that can handle the following:

read data from various sources (XML, XLSX, CSV)
verify data
write the data to various object types (customer, address)

The data will come from a number of sources, but each source will always have exactly one import format (CSV, XML or XLSX). Import formats can vary from source to source. New import formats may be added in the future.
The destination object types are always the same (customer, address and some more).

I've been thinking about using generics and I read something about the factory pattern but I'm a pretty big noob in this area so any advice is more than welcome.

What is an appropriate design pattern to solve this problem?

Best Answer

You are going overboard with fancy concepts way too soon. Generics: when you see a case for them, use them, but otherwise don't worry. Factory pattern: way too much flexibility (and added confusion) for this yet.

Keep it simple. Use fundamental practices.

Try to imagine what reading XML, reading CSV and so on have in common: things like 'next record' or 'next line'. Since new formats may be added, try to imagine what a yet-to-be-determined format would share with the known ones. Use this commonality to define an 'interface', a contract that all formats must adhere to. Although they all adhere to the common ground, each may have its own internal rules.
For validating the data, provide a way to easily plug in new or different validator code blocks. So again, define an interface where each validator, responsible for a particular kind of data construction, adheres to a contract.
For creating the data constructions, you will probably be constrained more than anything by whoever designs the suggested output objects. Try to figure out what the next step for the data objects is, and whether there are any optimizations you can make by knowing the final use. For example, if you know the objects are going to be used in an interactive application, you could help the developer of that app by providing 'summations' or counts of the objects or other kinds of derived information.
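
To make the reader and validator contracts above concrete, here is a minimal sketch. Every name in it (IRecordReader, SimpleCsvReader, IValidator, CustomerValidator, CustomerImport, and the Customer class with its Name and Email properties) is hypothetical, since the question does not show the real types. The CSV reader is deliberately naive: it assumes a header row and does not handle quoted fields.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical destination type; the real Customer class is not shown in the question.
public sealed class Customer
{
    public string Name { get; set; } = "";
    public string Email { get; set; } = "";
}

// Contract every import format follows: yield one raw record at a time
// as a field-name -> value dictionary.
public interface IRecordReader
{
    IEnumerable<IDictionary<string, string>> ReadRecords(TextReader input);
}

// Naive CSV reader (assumption: first line is the header, no quoting or escaping).
public sealed class SimpleCsvReader : IRecordReader
{
    public IEnumerable<IDictionary<string, string>> ReadRecords(TextReader input)
    {
        var headerLine = input.ReadLine();
        if (headerLine == null) yield break;
        var headers = headerLine.Split(',');

        for (var line = input.ReadLine(); line != null; line = input.ReadLine())
        {
            if (line.Length == 0) continue; // skip blank lines
            var fields = line.Split(',');
            var record = new Dictionary<string, string>();
            for (int i = 0; i < headers.Length && i < fields.Length; i++)
                record[headers[i].Trim()] = fields[i].Trim();
            yield return record;
        }
    }
}

// Contract every validator follows: an empty list means the item is valid.
public interface IValidator<T>
{
    IReadOnlyList<string> Validate(T item);
}

public sealed class CustomerValidator : IValidator<Customer>
{
    public IReadOnlyList<string> Validate(Customer c)
    {
        var errors = new List<string>();
        if (string.IsNullOrWhiteSpace(c.Name)) errors.Add("Name is required.");
        if (!c.Email.Contains("@")) errors.Add("Email looks invalid.");
        return errors;
    }
}

// Glue code: read, map to the destination type, validate, keep the valid ones.
public static class CustomerImport
{
    public static List<Customer> Run(IRecordReader reader, IValidator<Customer> validator,
                                     TextReader input, List<string> errorLog)
    {
        var result = new List<Customer>();
        foreach (var record in reader.ReadRecords(input))
        {
            var customer = new Customer
            {
                Name = record.TryGetValue("name", out var n) ? n : "",
                Email = record.TryGetValue("email", out var e) ? e : ""
            };
            var errors = validator.Validate(customer);
            if (errors.Count == 0) result.Add(customer);
            else errorLog.Add(string.Join(" ", errors));
        }
        return result;
    }
}
```

Calling CustomerImport.Run(new SimpleCsvReader(), new CustomerValidator(), new StringReader(csvText), log) would return only the customers that pass validation and collect the messages for the rest in log. Supporting XML or XLSX later should then only mean writing another IRecordReader.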

I'd say most of these are Template patterns or Strategy patterns. The whole project would be an Adapter pattern.
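
As a rough illustration of the Strategy idea, each source can be mapped to the one format it always delivers. This is only a sketch under assumptions: IImportStrategy, CsvStrategy, XmlStrategy and SourceRegistry are made-up names, the CSV layout (name in the first column, no header) and the XML layout (`<customer name="..."/>` elements) are invented for the example, and only customer names are extracted to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Strategy contract: each import format knows how to pull customer names out of raw text.
public interface IImportStrategy
{
    IEnumerable<string> ReadCustomerNames(string content);
}

// Assumed layout: one customer per line, name in the first column, no header row.
public sealed class CsvStrategy : IImportStrategy
{
    public IEnumerable<string> ReadCustomerNames(string content) =>
        content.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries)
               .Select(line => line.Split(',')[0].Trim());
}

// Assumed layout: <customers><customer name="..."/></customers>.
public sealed class XmlStrategy : IImportStrategy
{
    public IEnumerable<string> ReadCustomerNames(string content) =>
        XDocument.Parse(content)
                 .Descendants("customer")
                 .Select(c => c.Attribute("name")?.Value ?? "");
}

// Maps each source to its single import format (source ids are case-insensitive).
public sealed class SourceRegistry
{
    private readonly Dictionary<string, IImportStrategy> _bySource =
        new Dictionary<string, IImportStrategy>(StringComparer.OrdinalIgnoreCase);

    public void Register(string sourceId, IImportStrategy strategy) =>
        _bySource[sourceId] = strategy;

    public IImportStrategy For(string sourceId)
    {
        if (_bySource.TryGetValue(sourceId, out var strategy)) return strategy;
        throw new NotSupportedException(
            "No import format registered for source '" + sourceId + "'.");
    }
}
```

The calling code only asks the registry for a source's strategy and never switches on the format itself, so adding a format means adding one class and one Register call.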

Related Solutions

C# – Data Scraping – One application or multiple

What you are working on is basically ETL. So at a high level you need an extract component (get stuff), a transform component (map to a known format) and a load component (take the known format and put stuff somewhere). If you are comfortable being tied to an RDBMS, you could use something like SQL Server SSIS packages. What I would do is create a host application that manages the common aspects of the overall process (error handling and pipeline processing), then make the specifics of the E, T and L pluggable. A low-ceremony way to get this would be to host the PowerShell runtime and create each session with common context objects that the scripts use to communicate. You get a built-in pipe-and-filter model for scripts and easy, safe extensibility. This design has worked extremely well for my team in a similar situation.
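
The host-plus-pluggable-stages idea can be sketched in plain C# as well. Note that this swaps the hosted PowerShell runtime for ordinary interfaces, so it is not the setup described above, only the same shape. All names (IExtractor, ITransformer, ILoader, PipelineHost and the three sample stages) are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// The three pluggable stages.
public interface IExtractor<TRaw> { IEnumerable<TRaw> Extract(); }
public interface ITransformer<TRaw, TOut> { TOut Transform(TRaw raw); }
public interface ILoader<TOut> { void Load(TOut item); }

// The host owns the common concerns: it drives the pipeline and records
// errors thrown by the transform or load step instead of aborting the run.
public sealed class PipelineHost<TRaw, TOut>
{
    private readonly IExtractor<TRaw> _extractor;
    private readonly ITransformer<TRaw, TOut> _transformer;
    private readonly ILoader<TOut> _loader;

    public List<string> Errors { get; } = new List<string>();

    public PipelineHost(IExtractor<TRaw> extractor,
                        ITransformer<TRaw, TOut> transformer,
                        ILoader<TOut> loader)
    {
        _extractor = extractor;
        _transformer = transformer;
        _loader = loader;
    }

    // Returns the number of items that were loaded successfully.
    public int Run()
    {
        int loaded = 0;
        foreach (var raw in _extractor.Extract())
        {
            try
            {
                _loader.Load(_transformer.Transform(raw));
                loaded++;
            }
            catch (Exception ex)
            {
                Errors.Add(ex.Message);
            }
        }
        return loaded;
    }
}

// Sample stages, only here to show how the parts plug together.
public sealed class ListExtractor : IExtractor<string>
{
    private readonly IEnumerable<string> _items;
    public ListExtractor(IEnumerable<string> items) { _items = items; }
    public IEnumerable<string> Extract() => _items;
}

public sealed class IntTransformer : ITransformer<string, int>
{
    // int.Parse throws FormatException on non-numeric input; the host records it.
    public int Transform(string raw) => int.Parse(raw);
}

public sealed class ListLoader : ILoader<int>
{
    public List<int> Loaded { get; } = new List<int>();
    public void Load(int item) => Loaded.Add(item);
}
```

Wiring a ListExtractor, IntTransformer and ListLoader into a PipelineHost<string, int> and calling Run() processes every item, with failures ending up in Errors rather than stopping the import.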

C# – Implement Generic DataSet Builder with C#

Here is a link describing this approach.

Below is my resultant class based on Dan's suggestion

using System;
using System.Data;
using System.Data.Common;

namespace CustomDataAccess{

public class DataSetBuilder
{

    #region Properties

    private DataSet _DataSet;

    public DataSet DataSet { get { return _DataSet; } }

    #endregion

    #region Constructors

    public DataSetBuilder()
    {
        this._DataSet = new DataSet();
    }

    public DataSetBuilder(string DataSetName)
    {
        this._DataSet = new DataSet(DataSetName);
    }

    public DataSetBuilder(DataSet DataSet)
    {
        this._DataSet = DataSet;
    }

    #endregion

    #region Public Methods

    public DataSetBuilder InsertTables(DataTable Table)
    {
        this._DataSet.Tables.Add(Table);

        return this;
    }

    public DataSetBuilder InsertTables(string DbProviderName, string ConnectionString, string TableName, string CommandText)
    {
        System.Data.Common.DbDataAdapter adapter = Create_Adapter(DbProviderName, ConnectionString);

        try
        {
            Fill_Adapter(adapter, TableName, CommandText);
        }
        finally
        {
            // Close the connection even if the fill throws.
            adapter.SelectCommand.Connection.Close();
        }

        return this;
    }

    public DataSetBuilder InsertTables(string DbProviderName, string ConnectionString, string[] TableName, string[] CommandText)
    {
        if (TableName.Length != CommandText.Length)
        {
            throw new ArgumentException("Error: Must provide a table name for each command.");
        }

        System.Data.Common.DbDataAdapter adapter = Create_Adapter(DbProviderName, ConnectionString);

        try
        {
            // Reuse one adapter and connection for all commands.
            for (int i = 0; i < TableName.Length; i++)
            {
                Fill_Adapter(adapter, TableName[i], CommandText[i]);
            }
        }
        finally
        {
            // Close the connection even if one of the fills throws.
            adapter.SelectCommand.Connection.Close();
        }

        return this;
    }

    public void AddRelations(string ParentTable, string PrimaryKey, string ChildTable, string ForeignKey, bool NestingRule)
    {
        Add_Relations(ParentTable, PrimaryKey, ChildTable, ForeignKey, NestingRule);
    }

    public void AddRelations(string[] ParentTable, string[] PrimaryKey, string[] ChildTable, string[] ForeignKey, bool[] NestingRule)
    {

        for (int i = 0; i < ParentTable.Length; i++)
        {
            Add_Relations(ParentTable[i], PrimaryKey[i], ChildTable[i], ForeignKey[i], NestingRule[i]);
        }
    }

    #endregion

    #region Private Methods

    private System.Data.Common.DbDataAdapter Create_Adapter(string DbProviderName, string ConnectionString)
    {
        DbProviderFactory dbFactory = System.Data.Common.DbProviderFactories.GetFactory(DbProviderName);

        System.Data.Common.DbConnection connection = dbFactory.CreateConnection();

        connection.ConnectionString = ConnectionString;

        connection.Open();

        System.Data.Common.DbCommand command = dbFactory.CreateCommand();

        command.Connection = connection;

        System.Data.Common.DbDataAdapter adapter = dbFactory.CreateDataAdapter();

        adapter.SelectCommand = command;

        return adapter;
    }

    private void Fill_Adapter(System.Data.Common.DbDataAdapter Adapter, string TableName, string CommandText)
    {
        Adapter.SelectCommand.CommandText = CommandText;

        Adapter.Fill(_DataSet, TableName);
    }

    private void Add_Relations(string ParentTable, string PrimaryKey, string ChildTable, string ForeignKey, bool NestingRule)
    {
        DataColumn pk = _DataSet.Tables[ParentTable].Columns[PrimaryKey];

        DataColumn fk = _DataSet.Tables[ChildTable].Columns[ForeignKey];

        DataRelation relation = _DataSet.Relations.Add(pk, fk);

        relation.Nested = NestingRule;
    }

    #endregion

}}
