C++ – Alternatives to the Visitor Design Pattern

c++ visitor-pattern

I have been trying to come up with a method to "serialize" various objects into various different formats. For example:

class Shape {
public:
    virtual std::string_view name() const = 0;
    virtual double area() const = 0;
};

class Square : public Shape {
...
};

class Triangle: public Shape {
...
};

Suppose I have the two types of shapes above. Now, I want to be able to serialize (and eventually deserialize) these different classes into different formats (e.g., string, JSON, bytes, …).

The first solution is to perform the serialization and deserialization within the class itself (this is what overloading the insertion operator does). However, if I start adding different types of serialization, I have to modify every single Shape class.

class Shape {
public:
    ...
    virtual std::string serializeToString() const = 0;
    virtual json_object serializeToJSON() const = 0;
    //Repeat for every type of serialized output...
};

The second solution I found was to use the visitor pattern. With that pattern, I can create a different visitor for each serialized format. And since I have far fewer serialized formats than shape classes, I suppose it is acceptable to have to modify every visitor class when a new Shape class is added.

class ShapeVisitor;  // forward declaration, needed by Shape::accept

class Shape {
public:
    virtual void accept(ShapeVisitor& v) = 0;
};

class Square : public Shape { ... };
class Triangle : public Shape { ... };

class ShapeVisitor {
public:
    virtual void visit(const Square& s) = 0;
    virtual void visit(const Triangle& s) = 0;
};

class StringShapeVisitor : public ShapeVisitor {
public:
    // No trailing const here: a const-qualified visit() would not override the base version.
    void visit(const Square& s) override;
    void visit(const Triangle& s) override;
};

But of course, the problem with the visitor pattern is the visitors have no way to access the private data of each class. And since this is serialization I am talking about, I have to access every single private data member which I cannot see how to do without breaking encapsulation of the shapes completely.

A third option I thought of is just using some form of templates and template specialization to choose the correct function for serialization based upon the format and class. The problem is, this doesn't work at runtime on a generic Shape instance…

So my questions are:

  • Is there a modification to the visitor pattern which overcomes the private data problem?
  • Is there an alternative to the visitor pattern that would ideally not require updating the same classes over and over?
  • Whatever method I use, is there any way to make the process as "reversible" as possible (e.g., for deserialization)?

Best Answer

What you need is an intermediary unified representation. The problem now is that your serialization procedures need to understand the details/semantics of the various shape types. Instead, what you could do is provide the shapes with the ability to return a self-describing unified representation of some sort, that the serialization code can just treat as generic structured data, without needing to understand what the data means in the context of a specific shape.

Depending on what you're doing and on what data is associated with the shapes, you might come up with different schemes for this intermediary representation. E.g., it could just be some metadata followed by a list of key-value pairs ({ "width": 1.0, "height": 1.0 }), or perhaps you'd treat shapes as polygons and use a list of vertices and edges. Understand your goals and constraints and try to come up with some scheme that's suitable for what you're doing. Note that, for the shape polymorphism to be useful, there should be parts of your application that are able to work entirely through the abstract Shape interface, never needing to know any details of the concrete shapes. If there are aspects of the application (other than serialization) for which this doesn't quite work, perhaps you can make use of this unified representation there too, if you design it well.
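
To make the key-value idea concrete, here is a minimal sketch of one possible scheme. The layout of UnifiedRepr (a type tag plus named numeric fields), the "side" key, and the Square constructor are all assumptions made for illustration; your real data may call for something richer.

```cpp
#include <map>
#include <string>

// Hypothetical layout: some metadata (a type tag) plus key-value pairs.
struct UnifiedRepr {
    std::string type;                      // e.g. "Square"
    std::map<std::string, double> fields;  // e.g. {"side": 2.0}
};

class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;
    virtual UnifiedRepr toUnifiedRepr() const = 0;
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}

    double area() const override { return side_ * side_; }

    // The shape itself decides which private data to expose and under
    // which names, so no outside code has to reach into its members.
    UnifiedRepr toUnifiedRepr() const override {
        return UnifiedRepr{"Square", {{"side", side_}}};
    }

private:
    double side_;
};
```

Since the private members are only read inside the class, encapsulation stays intact: consumers see names and values, not the object's layout.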

You'd then create various shape serializers (which may or may not form a hierarchy) that take the unified representation and output different formats. For deserialization, reconstitute the unified representation and pass it along to the shape, or a factory associated with the shape, or a shape prototype. A factory would have to be aware of different Shape subtypes, but this knowledge would be confined there (in a single place).

So, something like

class Shape {
public:
    virtual UnifiedRepr toUnifiedRepr() const = 0;
    ...
};

class Square : public Shape {
...
};

class Triangle: public Shape {
...
};

// ----------------
// Elsewhere:  
void JSONSerializer::serialize(const Shape& shape) {
    UnifiedRepr inputData = shape.toUnifiedRepr();
    // encode as JSON
    // ...
}
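
A slightly fuller sketch of both directions might look like the following. It assumes UnifiedRepr is a type tag plus named numeric fields; toJson, ShapeFactory, registerType and fromUnifiedRepr are hypothetical names, not part of any library. Parsing JSON text back into a UnifiedRepr is left out, and the JSON writer does no string escaping. It needs C++17.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Assumed representation: a type tag plus named numeric fields.
struct UnifiedRepr {
    std::string type;
    std::map<std::string, double> fields;
};

class Shape {
public:
    virtual ~Shape() = default;
    virtual UnifiedRepr toUnifiedRepr() const = 0;
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}

    UnifiedRepr toUnifiedRepr() const override {
        return UnifiedRepr{"Square", {{"side", side_}}};
    }

    // Rebuilds a Square from generic data; std::map::at throws
    // std::out_of_range if the "side" field is missing.
    static std::unique_ptr<Shape> fromUnifiedRepr(const UnifiedRepr& r) {
        return std::make_unique<Square>(r.fields.at("side"));
    }

private:
    double side_;
};

// Format-specific code sees only generic data, never a concrete shape.
std::string toJson(const UnifiedRepr& r) {
    std::ostringstream out;
    out << "{\"type\":\"" << r.type << "\"";
    for (const auto& [key, value] : r.fields) {
        out << ",\"" << key << "\":" << value;
    }
    out << "}";
    return out.str();
}

// The single place that knows about concrete Shape subtypes.
class ShapeFactory {
public:
    using Maker = std::function<std::unique_ptr<Shape>(const UnifiedRepr&)>;

    void registerType(const std::string& type, Maker maker) {
        makers_[type] = std::move(maker);
    }

    std::unique_ptr<Shape> create(const UnifiedRepr& r) const {
        auto it = makers_.find(r.type);
        if (it == makers_.end()) {
            throw std::invalid_argument("unknown shape type: " + r.type);
        }
        return it->second(r);
    }

private:
    std::map<std::string, Maker> makers_;
};
```

Under these assumptions, adding a new output format means writing one more function like toJson, and adding a new shape means implementing toUnifiedRepr and registering one maker with the factory, e.g. factory.registerType("Square", &Square::fromUnifiedRepr). Neither change touches the other side.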