Serialization – Should Classes Handle Their Own Serialization and Deserialization?

c#, object-oriented, python, serialization

I'm currently in the (re)design phase of several model classes of a C# .NET application (model as in the M of MVC). The model classes already have plenty of well-designed data, behaviors, and interrelationships. I am rewriting the model from Python to C#.

In the old Python model, I think I see a wart. Each model knows how to serialize itself, and the serialization logic has nothing to do with the rest of the behavior of any of the classes. For example, imagine:

  • Image class with .toJPG(String filePath) and .fromJPG(String filePath) methods
  • ImageMetaData class with .toString() and .fromString(String serialized) methods

You can imagine how these serialization methods are not cohesive with the rest of the class, yet only the class itself is guaranteed to know enough about its own data to serialize it.

Is it common practice for a class to know how to serialize and deserialize itself? Or am I missing a common pattern?

Best Answer

I generally avoid having the class know how to serialize itself, for a couple of reasons. First, if you want to (de)serialize to/from a different format, you now need to pollute the model with that extra logic. Second, if the model is accessed via an interface, you also pollute the contract.

public class Image
{
    public void toJPG(String filePath) { ... }

    public Image fromJPG(String filePath) { ... }
}

But what if you also want to serialize it to/from PNG and GIF? Now the class becomes

public class Image
{
    public void toJPG(String filePath) { ... }

    public Image fromJPG(String filePath) { ... }

    public void toPNG(String filePath) { ... }

    public Image fromPNG(String filePath) { ... }

    public void toGIF(String filePath) { ... }

    public Image fromGIF(String filePath) { ... }
}

Instead, I typically like to use a pattern similar to the following:

public interface ImageSerializer
{
    void serialize(Image src, Stream outputStream);

    Image deserialize(Stream inputStream);
}

public class JPGImageSerializer : ImageSerializer
{
    public void serialize(Image src, Stream outputStream) { ... }

    public Image deserialize(Stream inputStream) { ... }
}

public class PNGImageSerializer : ImageSerializer
{
    public void serialize(Image src, Stream outputStream) { ... }

    public Image deserialize(Stream inputStream) { ... }
}

public class GIFImageSerializer : ImageSerializer
{
    public void serialize(Image src, Stream outputStream) { ... }

    public Image deserialize(Stream inputStream) { ... }
}
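
For readers coming from the Python side of the rewrite, here is a minimal sketch of the same separation in Python. It is an illustration under assumptions, not the answer's implementation: the Image fields and the two toy formats (RawImageSerializer and HexImageSerializer) are hypothetical stand-ins for real JPG/PNG/GIF encoders, which would normally come from an imaging library.

```python
import io
import struct
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import BinaryIO


@dataclass
class Image:
    """Model class: holds data only and knows nothing about file formats."""
    width: int
    height: int
    pixels: bytes  # hypothetical layout: one grayscale byte per pixel, row-major


class ImageSerializer(ABC):
    """Python counterpart of the ImageSerializer interface above."""

    @abstractmethod
    def serialize(self, src: Image, output_stream: BinaryIO) -> None: ...

    @abstractmethod
    def deserialize(self, input_stream: BinaryIO) -> Image: ...


class RawImageSerializer(ImageSerializer):
    """Toy binary format (hypothetical): two big-endian uint32s, then the pixels."""

    def serialize(self, src, output_stream):
        output_stream.write(struct.pack(">II", src.width, src.height))
        output_stream.write(src.pixels)

    def deserialize(self, input_stream):
        width, height = struct.unpack(">II", input_stream.read(8))
        return Image(width, height, input_stream.read(width * height))


class HexImageSerializer(ImageSerializer):
    """Toy text format (hypothetical): 'width height hexpixels' in ASCII."""

    def serialize(self, src, output_stream):
        text = f"{src.width} {src.height} {src.pixels.hex()}"
        output_stream.write(text.encode("ascii"))

    def deserialize(self, input_stream):
        width, height, hex_pixels = input_stream.read().decode("ascii").split(" ")
        return Image(int(width), int(height), bytes.fromhex(hex_pixels))


def round_trip(image: Image, serializer: ImageSerializer) -> Image:
    """Serialize to an in-memory stream and read the image back."""
    buffer = io.BytesIO()
    serializer.serialize(image, buffer)
    buffer.seek(0)
    return serializer.deserialize(buffer)
```

Adding a new format here means adding one more ImageSerializer subclass; the Image class and any code that only uses Image stay untouched.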

Now, at this point, one of the caveats with this design is that each serializer needs to know the internal structure of the object it's serializing. Some would say that this is bad design, as the implementation leaks outside of the class. The risk/reward of this is really up to you, but you could slightly tweak the classes to do something like

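public class Image
{
    // Hypothetical field; PixelData stands in for whatever type holds the raw pixels.
    private PixelData pixelData;

    // Assumes ImageSerializer is changed to accept and return PixelData instead of Image.
    public void serializeTo(ImageSerializer serializer, Stream outputStream)
    {
        serializer.serialize(this.pixelData, outputStream);
    }

    public void deserializeFrom(ImageSerializer serializer, Stream inputStream)
    {
        this.pixelData = serializer.deserialize(inputStream);
    }
}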

This is more of a general example, as images usually have metadata that goes along with them (things like compression level, colorspace, etc.), which may complicate the process.
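
The tweaked design, where the image hands only its pixel data to the serializer, could be sketched in Python roughly as follows. The names PixelSerializer and LengthPrefixedSerializer and the length-prefixed format are assumptions made for illustration; the point is only that the serializer never sees the Image object itself.

```python
import io
from typing import BinaryIO, Protocol


class PixelSerializer(Protocol):
    """Hypothetical interface: works on raw pixel bytes, not on Image."""

    def serialize(self, pixel_data: bytes, output_stream: BinaryIO) -> None: ...

    def deserialize(self, input_stream: BinaryIO) -> bytes: ...


class LengthPrefixedSerializer:
    """Toy format (hypothetical): 4-byte big-endian length, then the bytes."""

    def serialize(self, pixel_data, output_stream):
        output_stream.write(len(pixel_data).to_bytes(4, "big"))
        output_stream.write(pixel_data)

    def deserialize(self, input_stream):
        length = int.from_bytes(input_stream.read(4), "big")
        return input_stream.read(length)


class Image:
    def __init__(self, pixel_data: bytes = b""):
        # Internal state; the serializer only ever receives these bytes.
        self._pixel_data = pixel_data

    def serialize_to(self, serializer: PixelSerializer, output_stream: BinaryIO) -> None:
        serializer.serialize(self._pixel_data, output_stream)

    def deserialize_from(self, serializer: PixelSerializer, input_stream: BinaryIO) -> None:
        self._pixel_data = serializer.deserialize(input_stream)
```

The trade-off is that Image regains two serialization-related methods, but they are format-agnostic, so supporting another format still does not change the class.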
