Keep the need for versioning low
I assume here that the JSON files are generated for some other component/service/client to consume. So try not to remove fields. If you only add new fields, there is no need to create a new version, as long as consumers of the JSON files ignore fields they don't know about.
An interface where consumers ignore features they don't know about is more robust. Imagine, for example, that I consume your data at V0, and then for a year you add new fields on a biweekly basis, arriving at V26. Then in V27 you add a field that I also want to consume. Should I be bothered with updating my code to handle the fields added in V1 through V26, even though I don't use them? I don't think so.
Removing fields, on the other hand, is a different beast. You should do that rarely, and in bulk. The whole idea of major and minor versions in semantic versioning is about this: if you add something, it is a minor version and should not affect users; if you remove things, it is a major update that can break dependent code.
This also coincides with the notions of subtyping, polymorphism and substitutability. Essentially, to add a new field to a FirstPOJO, you could modify it, or you could subclass it as SubPOJO, which extends FirstPOJO by adding someField. Code written against FirstPOJO will be able to handle SubPOJO transparently. Of course, if you start removing things, then code can break.
I know this doesn't exactly answer your question. But your basic problem is that you have a code architecture that doesn't scale. Reducing the need to scale in the first place circumvents the problem.
I think it is a poor strategy to make Derived_1::Impl derive from Base::Impl.
The main purpose of using the Pimpl idiom is to hide the implementation details of a class. By letting Derived_1::Impl derive from Base::Impl, you've defeated that purpose. Now, not only does the implementation of Base depend on Base::Impl, the implementation of Derived_1 also depends on Base::Impl.
Is there a better solution?
That depends on what trade-offs are acceptable to you.
Solution 1
Make Impl
classes totally independent. This will imply that there will be two pointers to Impl
classes -- one in Base
and another one in Derived_N
.
class Base {
protected:
Base() : pImpl{new Impl()} {}
private:
// Its own Impl class and pointer.
class Impl { };
std::shared_ptr<Impl> pImpl;
};
class Derived_1 final : public Base {
public:
Derived_1() : Base(), pImpl{new Impl()} {}
void func_1() const;
private:
// Its own Impl class and pointer.
class Impl { };
std::shared_ptr<Impl> pImpl;
};
Solution 2
Expose the classes only as handles. Don't expose the class definitions and implementations at all.
Public header file:
struct Handle {unsigned long id;};
struct Derived1_tag {};
struct Derived2_tag {};
Handle constructObject(Derived1_tag tag);
Handle constructObject(Derived2_tag tag);
void deleteObject(Handle h);
void fun(Handle h, Derived1_tag tag);
void bar(Handle h, Derived2_tag tag);
Here's a quick implementation:
#include <map>
class Base
{
public:
virtual ~Base() {}
};
class Derived1 : public Base
{
};
class Derived2 : public Base
{
};
namespace Base_Impl
{
struct CompareHandle
{
bool operator()(Handle h1, Handle h2) const
{
return (h1.id < h2.id);
}
};
using ObjectMap = std::map<Handle, Base*, CompareHandle>;
ObjectMap& getObjectMap()
{
static ObjectMap theMap;
return theMap;
}
unsigned long getNextID()
{
    static unsigned long id = 0;
    return ++id;
}
Handle getHandle(Base* obj)
{
auto id = getNextID();
Handle h{id};
getObjectMap()[h] = obj;
return h;
}
Base* getObject(Handle h)
{
    // Use find() rather than operator[] so that looking up an
    // unknown handle doesn't insert a null entry into the map.
    auto it = getObjectMap().find(h);
    return (it != getObjectMap().end()) ? it->second : nullptr;
}
template <typename Der>
Der* getObject(Handle h)
{
return dynamic_cast<Der*>(getObject(h));
}
} // namespace Base_Impl
using namespace Base_Impl;
Handle constructObject(Derived1_tag tag)
{
// Construct an object of type Derived1
Derived1* obj = new Derived1;
// Get a handle to the object and return it.
return getHandle(obj);
}
Handle constructObject(Derived2_tag tag)
{
// Construct an object of type Derived2
Derived2* obj = new Derived2;
// Get a handle to the object and return it.
return getHandle(obj);
}
void deleteObject(Handle h)
{
// Get a pointer to Base given the Handle.
//
Base* obj = getObject(h);
// Remove it from the map.
// Delete the object.
if ( obj != nullptr )
{
getObjectMap().erase(h);
delete obj;
}
}
void fun(Handle h, Derived1_tag tag)
{
// Get a pointer to Derived1 given the Handle.
Derived1* obj = getObject<Derived1>(h);
if ( obj == nullptr )
{
// Problem.
// Decide how to deal with it.
return;
}
// Use obj
}
void bar(Handle h, Derived2_tag tag)
{
Derived2* obj = getObject<Derived2>(h);
if ( obj == nullptr )
{
// Problem.
// Decide how to deal with it.
return;
}
// Use obj
}
Pros and Cons
With the first approach, you can construct Derived objects on the stack. With the second approach, that is not an option.
With the first approach, you incur the cost of two dynamic allocations and deallocations when constructing and destroying a Derived object on the stack. If you construct and destroy a Derived object on the heap, you incur the cost of one more allocation and deallocation. With the second approach, you only incur the cost of one dynamic allocation and one deallocation for every object.
With the first approach, you have the ability to use virtual member functions in Base. With the second approach, that is not an option.
My suggestion
I would go with the first solution, so I can use the class hierarchy and virtual member functions in Base, even though it is a little more expensive.
Best Answer
Visitor Pattern:
Note that the code contains no if-then-else or switch-case structures to select the appropriate compressor. That's not needed, since the visitor pattern allows for the appropriate dispatch.