Java Serialization for long-ish-term storage

java, json, serialization

I have an application that uses a database of about 15,000 Java objects, which I have to read every time the application starts. So far I've been using JSON to store the data, but that has a few issues. Mainly, it's slow (reading all the objects can take 8-10 seconds on my lower-end machine). Also, objects in my database very commonly have fields that point to the same object. Java serialization handles this by writing references to the same object, whereas with JSON I have to write out the full state of each object and then intern them while reading, which also bloats the file.
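To make the shared-reference point concrete, here is a minimal sketch of how default Java serialization keeps two fields pointing at one object when both are written to the same stream. The `Publisher` and `Book` classes are hypothetical stand-ins, not the actual classes in my database.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SharedReferenceDemo {
    // Hypothetical classes standing in for the real database objects.
    static class Publisher implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Publisher(String name) { this.name = name; }
    }

    static class Book implements Serializable {
        private static final long serialVersionUID = 1L;
        final String title;
        final Publisher publisher;
        Book(String title, Publisher publisher) {
            this.title = title;
            this.publisher = publisher;
        }
    }

    // Serializes the array to memory and reads it back as a fresh object graph.
    static Book[] roundTrip(Book[] books) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(books);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Book[]) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Publisher shared = new Publisher("Acme");
        Book[] copy = roundTrip(new Book[] {
            new Book("First", shared),
            new Book("Second", shared)
        });
        // Both books still point at one Publisher instance after reading.
        System.out.println(copy[0].publisher == copy[1].publisher);
    }
}
```

The sharing is only preserved for objects written through the same `ObjectOutputStream`; writing each book to a separate stream would produce separate `Publisher` copies.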

The contents of this database will be updated fairly infrequently (maybe about once a month or so). I've heard from pretty much every source that Java serialization is always a poor choice for long-term storage, and I understand why. Given these conditions, however, is there a good reason not to use Java serialization here?

Best Answer

You may not anticipate changing the structure of your objects much right now, but needs change over time.

I would strongly suggest you consider Protocol Buffers (protobuf), Apache Thrift, or a similar schema-based format instead of relying on default Java serialization.

Their advantages over default Java serialization include strong support for tolerating minor version changes, typically much faster serialization and deserialization, and a smaller serialized footprint.

When I do this, I typically include the message version as a field on the object. That lets me handle version-to-version changes by wrapping the serialization method, should the need arise in the future.
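As a rough illustration of the version-field idea, here is a minimal sketch using plain `java.io` data streams rather than protobuf or Thrift, so it runs without extra dependencies. The `Item` class, its fields, and the two format versions are all hypothetical; the assumption is that version 1 stored only a name and version 2 added a quantity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class VersionedItemCodec {
    static final int CURRENT_VERSION = 2;

    // Hypothetical record type; 'quantity' is assumed to have been added in version 2.
    static final class Item {
        final String name;
        final int quantity;
        Item(String name, int quantity) {
            this.name = name;
            this.quantity = quantity;
        }
    }

    // Always writes the current format, with the version number first.
    static byte[] write(Item item) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(CURRENT_VERSION);
            out.writeUTF(item.name);
            out.writeInt(item.quantity);
        }
        return bytes.toByteArray();
    }

    // Simulates data written by an older release that only knew version 1.
    static byte[] writeV1(String name) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(1);
            out.writeUTF(name);
        }
        return bytes.toByteArray();
    }

    // The wrapper: reads the version first, then dispatches to the matching layout.
    static Item read(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int version = in.readInt();
            switch (version) {
                case 1: {
                    // Version 1 had no quantity, so fall back to a default of 1.
                    return new Item(in.readUTF(), 1);
                }
                case 2: {
                    String name = in.readUTF();
                    int quantity = in.readInt();
                    return new Item(name, quantity);
                }
                default:
                    throw new IOException("Unsupported format version: " + version);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Item old = read(writeV1("widget"));
        Item current = read(write(new Item("gadget", 5)));
        System.out.println(old.name + " x" + old.quantity);
        System.out.println(current.name + " x" + current.quantity);
    }
}
```

Because the reader checks the version before touching any other field, files written by an older release stay readable after the class changes, and an unknown version fails loudly instead of being misread.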
