I would like to iterate over an XML document that is essentially a list of identically structured XML elements. The elements will be deserialized into Java objects.
<root>
<element attribute="value" />
<element attribute="value" />
<element attribute="value" />
...
</root>
There are a lot of elements within the root element, and I would prefer not to load them all into memory. I realize I could use a SAX handler for this, but using a SAX handler to deserialize everything into Java objects seems rather awkward. I find JDOM very easy to use, but as far as I can tell JDOM always parses the entire tree. Is there a way I can use JDOM to parse the subelements one at a time?
Another reason for using JDOM is that it makes writing serialization/deserialization code easy for the corresponding Java objects, each of which is only meaningful once it is entirely in memory. However, I don't want to load all of the Java objects into memory at the same time. Rather, I want to iterate over them once.
Update: here is an example of how to do this in dom4j: http://docs.codehaus.org/display/GROOVY/Reading+XML+with+Groovy+and+DOM4J. Is there any way to do this in JDOM?
Best Answer
Why not use StAX (javax.xml.stream.*, an implementation is included in Java SE 6) to stream in the XML, and convert individual portions to objects?
With this approach, each individual "element" is unmarshalled into a POJO using JAXB (an implementation is included in Java SE 6), but you could process each fragment however you see fit.
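As a rough illustration of the streaming idea, here is a minimal sketch that reads the document with StAX and hands one object at a time to a callback. It is not the original answer's code: the class names `ElementPojo` and `ElementStreamDemo` and the method `forEachElement` are hypothetical, and the attribute is mapped by hand instead of through JAXB so that the sketch runs on a stock JDK (the `javax.xml.bind` packages are no longer bundled as of Java 11). It also assumes Java 8 or later for the lambda callback.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical POJO for one <element attribute="..."/> entry.
final class ElementPojo {
    private final String attribute;

    ElementPojo(String attribute) {
        this.attribute = attribute;
    }

    String getAttribute() {
        return attribute;
    }
}

final class ElementStreamDemo {

    // Pulls events from the parser and passes each <element> to the callback
    // as soon as it is read, so the full document is never held as a tree.
    static void forEachElement(Reader xml, Consumer<ElementPojo> callback)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Defensive defaults: no DTD processing, no external entities.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "element".equals(reader.getLocalName())) {
                    // Hand-written mapping; a JAXB Unmarshaller could be
                    // plugged in here instead if JAXB is on the classpath.
                    String value = reader.getAttributeValue(null, "attribute");
                    callback.accept(new ElementPojo(value));
                }
            }
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><element attribute=\"a\"/><element attribute=\"b\"/></root>";
        forEachElement(new StringReader(xml), e -> System.out.println(e.getAttribute()));
    }
}
```

Because the callback receives each object individually, nothing forces the caller to keep earlier objects around, which matches the "iterate over them once" requirement in the question.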
Note:
StAX and JAXB are also compatible with Java SE 5; you just need to download the implementations separately.