JVM Memory – Zero-Copy Serialization and Deserialization in JVM

jvm memory serialization

I am trying to understand the JVM memory model. In particular, I would like to understand whether it would be feasible to have zero-copy (de)serialization libraries, such as Cap'n Proto or FlatBuffers. Both libraries have partial support for Java, but it is not entirely clear to me whether their Java implementations are zero-copy.

Let me assume that I want to deserialize something, and that this region will be read-only. I will start with an array of bytes that is contiguous, at least from the JVM's point of view, although I understand the bytes may not be physically contiguous because of the virtual memory manager (and possibly the JVM?). This array is somewhere on the JVM heap.

What I would like to do is produce an instance of a given class in such a way that accessing a field of this class just means dereferencing a pointer into this array, without having to copy individual parts of the array somewhere else.

Is this feasible on the JVM? If not, what would be the optimum? Can I do this with a single copy regardless of the class layout?

Best Answer

The JVM does not mandate any specific internal representation for objects. How objects are laid out in memory is hidden and it is not possible to directly map a memory area such as an array to an object or to use pointers like in C/C++.

It is possible to use memory-mapped files, but you have to call methods which return the byte/int/double etc. located at a specific offset. Access is therefore less direct than in C, since every read is routed through a method call, and you can only access primitive data types this way.
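As a rough illustration of what offset-based access to a memory-mapped file looks like, here is a minimal sketch using `java.nio`. The record layout (an `int` at offset 0 followed by a `double` at offset 4), the little-endian byte order, and the class and method names are assumptions made up for this example; they do not come from Cap'n Proto or FlatBuffers.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedRead {
    // Hypothetical layout: int id at offset 0, double value at offset 4.
    static final int ID_OFFSET = 0;
    static final int VALUE_OFFSET = 4;
    static final int RECORD_SIZE = 12;

    static String writeThenRead() {
        try {
            Path file = Files.createTempFile("record", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Map the file into memory; reads and writes go to the mapping,
                // not to an intermediate byte[] copy.
                MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, RECORD_SIZE);
                map.order(ByteOrder.LITTLE_ENDIAN);

                // Absolute puts/gets address the data by offset.
                map.putInt(ID_OFFSET, 42);
                map.putDouble(VALUE_OFFSET, 2.5);

                // Each field read is a method call returning a primitive.
                int id = map.getInt(ID_OFFSET);
                double value = map.getDouble(VALUE_OFFSET);
                return id + " " + value;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead());
    }
}
```

Note that nothing here turns the mapped region into a Java object: the program only ever gets primitives back from `getInt` and `getDouble`.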

Similarly, there are libraries which try to emulate simple data structures, like the pseudo-structs in Javolution. However, this again only works for simple data types. You could try some tricks with JNI, but you always end up with more indirection than in C: you can't just have a pointer to an area of memory and access it like a normal object. In particular, object references can't be directly serialized, because taken out of the context of a concrete running instance of the JVM they wouldn't make any sense.

Therefore, if your data structure is more complex than just a set of fields of primitive data types, I think what you request is not possible on the JVM. If you are happy with just primitives, you can check out the Struct class from Javolution.
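For the primitives-only case, the same idea can be hand-rolled without any library. The sketch below wraps an existing heap `byte[]` with `ByteBuffer.wrap`, which does not copy the array, and exposes fields through a small reusable "view" object. The `PointView` class, its 16-byte layout, and the helper methods are hypothetical names invented for illustration; this is not Javolution's API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class FlyweightDemo {
    /** Hypothetical fixed layout: int x @0, int y @4, double weight @8. */
    static final class PointView {
        static final int SIZE = 16;
        private final ByteBuffer buf;
        private int base; // byte offset of the record currently viewed

        PointView(ByteBuffer buf) { this.buf = buf; }

        // Re-point the same view object at another record; no allocation, no copy.
        PointView moveTo(int index) { base = index * SIZE; return this; }

        int x() { return buf.getInt(base); }
        int y() { return buf.getInt(base + 4); }
        double weight() { return buf.getDouble(base + 8); }
    }

    // Builds two sample records so the example is self-contained.
    static byte[] sampleData() {
        ByteBuffer out = ByteBuffer.allocate(2 * PointView.SIZE).order(ByteOrder.LITTLE_ENDIAN);
        out.putInt(1).putInt(2).putDouble(0.5);
        out.putInt(3).putInt(4).putDouble(1.5);
        return out.array();
    }

    static int sumX(byte[] data) {
        // wrap() shares the array with the buffer instead of copying it.
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        PointView p = new PointView(buf);
        int sum = 0;
        for (int i = 0; i < data.length / PointView.SIZE; i++) {
            sum += p.moveTo(i).x();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumX(sampleData()));
    }
}
```

The view is an ordinary Java object holding a buffer reference and an offset; the data itself stays in the byte array, and each accessor decodes a primitive on demand. That is the extra indirection described above, and it does not extend to fields that are themselves object references.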
