Object-Oriented Readability – Using Empty Subclasses to Enhance Class Hierarchy Semantics

object-oriented, readability, semantics

Is using (almost) empty derived classes to give additional semantics to a class hierarchy a good idea (or good practice)?

Because a (not so short) example is always better than a long speech, let's assume we're writing a class hierarchy for SQL generation. We want to represent a data manipulation command (INSERT, UPDATE or DELETE), and there are two alternatives:

First version, with switch statement

enum CommandType {
    Insert,
    Update,
    Delete
}

class SqlCommand {
    String TableName { get; }
    String SchemaName { get; }
    CommandType CommandType { get; }
    IEnumerable<string> Columns { get; }
    // ... and so on
}

class SqlGenerator  {
    string GenerateSQLFor(SqlCommand command)
    {
        switch (command.CommandType) 
        {
            case CommandType.Insert:
                return generateInsertCommand(command);
            case CommandType.Update:
                return generateUpdateCommand(command);
            case CommandType.Delete:
                return generateDeleteCommand(command);
            default:
                throw new NotSupportedException();
        }
    }
}

Second version, with the Visitor pattern (note that the question is NOT about whether the Visitor pattern should be used, nor about adding new operations)

abstract class SqlCommand {
    String TableName { get; }
    String SchemaName { get; }
    IEnumerable<string> Columns { get; }
    // ... and so on
    abstract void Accept(SqlCommandVisitor visitor);
}

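class InsertCommand : SqlCommand 
{
    override void Accept(SqlCommandVisitor visitor) 
    {
        visitor.VisitInsertCommand(this);
    }
}

class DeleteCommand : SqlCommand 
{
    override void Accept(SqlCommandVisitor visitor) 
    {
        visitor.VisitDeleteCommand(this);
    }
}

// Same shape as the two classes above; referenced by VisitUpdateCommand below.
class UpdateCommand : SqlCommand 
{
    override void Accept(SqlCommandVisitor visitor) 
    {
        visitor.VisitUpdateCommand(this);
    }
}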

class SqlCommandVisitor  {
    void InitStringBuffer() { ... }
    string GetStringBuffer() { ... }

    string GenerateSQLFor(SqlCommand command)
    {
        InitStringBuffer();
        command.Accept(this);
        return GetStringBuffer();
    }

    void Visit(SqlCommand command) { ... }
    void VisitInsertCommand(InsertCommand command) { ... }
    void VisitDeleteCommand(DeleteCommand command) { ... }
    void VisitUpdateCommand(UpdateCommand command) { ... }
}

Both examples achieve the same result, but:

  • In the first version, my code feels more DRY, even though polymorphism is generally preferable to a switch statement.
  • In the second version, I feel like I'm needlessly deriving SqlCommand just so that the code carries more meaning.

Another similar case is deriving a class from a generic class just to give additional meaning to the collection, e.g.:

class CustomerList : List<Customer> { }

I'm left wondering whether this is a good idea. Are there any pitfalls to doing so?

I personally prefer the second version exactly because it adds meaning to the code, and helps me when I read my code later… But I clearly lack the experience to see any downside to this pattern.

Best Answer

Your second example has the benefit of no case statement and no CommandType. This means you can add a new statement type without having to modify a case statement. A case statement ought to cause an OOP programmer to pause a moment and consider whether the design should be changed to get rid of it. A variable denoting a thing's type ought to cause the programmer to pause a long while.

However, the visitor pattern doesn't seem to be needed here. Simpler would be to use a class hierarchy, as you did, but then put the SQL generation inside that hierarchy rather than having it be a separate class:

abstract class SqlCommand {
  abstract String sql();
}

class InsertCommand : SqlCommand {
  override String sql() {...}
}

class DeleteCommand : SqlCommand {
  override String sql() {...}
}
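As a rough, compilable C# sketch of that idea: the constructor shape, the `Columns` type, and the exact SQL text (named `@` parameters, no identifier quoting, a DELETE without a WHERE clause) are illustrative assumptions, not something the answer specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Base class: holds the data shared by every command and declares
// the one operation each concrete command must implement.
abstract class SqlCommand
{
    public string TableName { get; }
    public IReadOnlyList<string> Columns { get; }

    protected SqlCommand(string tableName, params string[] columns)
    {
        TableName = tableName;
        Columns = columns;
    }

    public abstract string Sql();
}

class InsertCommand : SqlCommand
{
    public InsertCommand(string tableName, params string[] columns)
        : base(tableName, columns) { }

    // Hypothetical output format: one named parameter per column.
    public override string Sql()
    {
        string columnList = string.Join(", ", Columns);
        string parameterList = string.Join(", ", Columns.Select(c => "@" + c));
        return $"INSERT INTO {TableName} ({columnList}) VALUES ({parameterList})";
    }
}

class DeleteCommand : SqlCommand
{
    public DeleteCommand(string tableName) : base(tableName) { }

    // Hypothetical: a real command would also carry a WHERE clause.
    public override string Sql() => $"DELETE FROM {TableName}";
}

static class Demo
{
    static void Main()
    {
        var commands = new SqlCommand[]
        {
            new InsertCommand("customers", "name", "email"),
            new DeleteCommand("customers"),
        };

        // No switch and no CommandType: the runtime type picks the SQL.
        foreach (var command in commands)
            Console.WriteLine(command.Sql());
    }
}
```

Supporting another statement type then means adding one more subclass; neither the loop in `Main` nor the existing classes change.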

The code inside the sql functions will probably turn out to have shared behavior, or enough complexity, that moving the shared behavior and complexity into a separate SqlBuilder class would be good:

class InsertCommand : SqlCommand {
  override String sql() {
    var builder = new SqlBuilder();
    builder.add("insert into");
    builder.add_identifier(table_name);
    builder.add("values");
    builder.add_value_list(column_names);
    ...
    return builder.sql;
  }
}

The SqlCommand drives the process of creating the sql, pushing its details into the SqlBuilder. This keeps the SqlCommand and SqlBuilder decoupled, since the SqlCommands do not have to expose their entire state for the builder to use. This has the benefits of your second example, but without the complexity of the visitor pattern.
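A minimal C# sketch of this split might look like the following. The `SqlBuilder` internals are assumptions made for illustration: ANSI-style double-quoted identifiers, `@name` placeholders for the value list, and parts joined by single spaces. Note that `InsertCommand` keeps its fields private; the builder only ever sees what the command chooses to push into it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical builder: owns the formatting rules (spacing, quoting).
class SqlBuilder
{
    private readonly List<string> _parts = new List<string>();

    public void Add(string text) => _parts.Add(text);

    // Assumption: double-quoted identifiers, embedded quotes doubled.
    public void AddIdentifier(string identifier) =>
        _parts.Add("\"" + identifier.Replace("\"", "\"\"") + "\"");

    // Assumption: one named placeholder per column, e.g. (@name, @email).
    public void AddValueList(IEnumerable<string> names) =>
        _parts.Add("(" + string.Join(", ", names.Select(n => "@" + n)) + ")");

    public string Sql => string.Join(" ", _parts);
}

abstract class SqlCommand
{
    public abstract string Sql();
}

class InsertCommand : SqlCommand
{
    // Private state: nothing here is exposed to the builder.
    private readonly string _tableName;
    private readonly string[] _columnNames;

    public InsertCommand(string tableName, params string[] columnNames)
    {
        _tableName = tableName;
        _columnNames = columnNames;
    }

    // The command drives the process and pushes its details into the builder.
    public override string Sql()
    {
        var builder = new SqlBuilder();
        builder.Add("insert into");
        builder.AddIdentifier(_tableName);
        builder.Add("values");
        builder.AddValueList(_columnNames);
        return builder.Sql;
    }
}

static class Demo
{
    static void Main()
    {
        SqlCommand command = new InsertCommand("customers", "name", "email");
        Console.WriteLine(command.Sql());
    }
}
```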

If it turns out that sometimes you need the sql built one way and sometimes another (perhaps you want to support more than one SQL back end), then you abstract the SqlBuilder class, like so:

abstract class SqlBuilder {
  abstract void add(String s);
  abstract void add_identifier(String identifier);
  abstract void add_value_list(String[] value_list);
  ...
}

class PostgresSqlBuilder : SqlBuilder {
  override void add(String s) {...}
  override void add_identifier(String identifier) {...}
  override void add_value_list(String[] value_list) {...}
  ...
}

class MysqlSqlBuilder : SqlBuilder {
  override void add(String s) {...}
  override void add_identifier(String identifier) {...}
  override void add_value_list(String[] value_list) {...}
  ...
}

Then the sql functions use the builder passed into them, rather than creating one:

class InsertCommand : SqlCommand {
  override String sql(SqlBuilder builder) {
    builder.add("insert into");
    builder.add_identifier(table_name);
    builder.add("values");
    builder.add_value_list(column_names);
    ...
    return builder.sql;
  }
}
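Put together, the final design could be sketched in C# roughly as follows. To keep it short, this version deviates from the outline above in one respect: only identifier quoting is abstract, while `Add` and `AddValueList` live in the base class. The quoting rules shown (double quotes for PostgreSQL, backticks for MySQL) are the two dialects' default identifier quoting; the `@name` placeholder format is an assumption made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

abstract class SqlBuilder
{
    private readonly List<string> _parts = new List<string>();

    public void Add(string text) => _parts.Add(text);

    // The only back-end-specific step in this sketch.
    public abstract void AddIdentifier(string identifier);

    // Assumption: one named placeholder per column, e.g. (@name, @email).
    public void AddValueList(IEnumerable<string> names) =>
        _parts.Add("(" + string.Join(", ", names.Select(n => "@" + n)) + ")");

    public string Sql => string.Join(" ", _parts);
}

class PostgresSqlBuilder : SqlBuilder
{
    // PostgreSQL quotes identifiers with double quotes.
    public override void AddIdentifier(string identifier) =>
        Add("\"" + identifier.Replace("\"", "\"\"") + "\"");
}

class MysqlSqlBuilder : SqlBuilder
{
    // MySQL quotes identifiers with backticks by default.
    public override void AddIdentifier(string identifier) =>
        Add("`" + identifier.Replace("`", "``") + "`");
}

abstract class SqlCommand
{
    public abstract string Sql(SqlBuilder builder);
}

class InsertCommand : SqlCommand
{
    private readonly string _tableName;
    private readonly string[] _columnNames;

    public InsertCommand(string tableName, params string[] columnNames)
    {
        _tableName = tableName;
        _columnNames = columnNames;
    }

    // The command no longer creates the builder; the caller picks the dialect.
    public override string Sql(SqlBuilder builder)
    {
        builder.Add("insert into");
        builder.AddIdentifier(_tableName);
        builder.Add("values");
        builder.AddValueList(_columnNames);
        return builder.Sql;
    }
}

static class Demo
{
    static void Main()
    {
        var command = new InsertCommand("customers", "name", "email");
        Console.WriteLine(command.Sql(new PostgresSqlBuilder()));
        Console.WriteLine(command.Sql(new MysqlSqlBuilder()));
    }
}
```

The same `InsertCommand` instance renders for either back end, and a fresh builder is passed on each call because the builder accumulates state.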