We have a rather large code base that interacts with many SOAP based XML services.
Each one of these services makes 1 to n service calls. A typical low-level web service call looks like this (simplified):
public XElement ExecuteWebService(string xmlRequest)
We use WCF to build the SOAP message, send it, and get back the response. We get the body using GetReaderAtBodyContents()
and then convert that string to an XElement using:
XElement.Parse(response)
Then we use that XElement throughout the rest of the layers of the application. There are not any strongly typed data contracts or classes that are marked with XmlSerialization attributes.
This type of structure makes it very difficult to write tests, since an XElement can hold any valid XML structure. It also requires many additional lines of code to read, parse, and update the XElement objects as they are passed around, which results in some pretty messy code.
Are there valid reasons for this type of structure? I was told it was done for performance reasons and it's more flexible.
Is XElement parsing and reading really faster than one-time serialization (data contract, XML serialization)? Is this really more flexible than using a strongly typed object model?
On other systems I have always used serialization and strongly typed objects because it's easier to understand and maintain. I am not certain if this XElement approach is valid.
Best Answer
I did some research and testing and here are my results:
Analysis
We have several possibilities when reading and writing messages for web services under the .Net platform.
XmlSerializer can serialize both elements and attributes. It's the default choice when dealing with legacy or existing message structures. When creating the data model classes, one can simply decorate the public properties or fields with the appropriate attributes to produce or consume the outgoing or incoming XML.
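As a rough illustration of the attribute-driven approach, here is a minimal sketch. The CustomerMessage type, its element and attribute names, and the CustomerXml helper are hypothetical and chosen only for the example; they are not taken from any real service.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical message type; names are illustrative only.
[XmlRoot("Customer")]
public class CustomerMessage
{
    [XmlAttribute("id")]      // written and read as an XML attribute
    public int Id { get; set; }

    [XmlElement("Name")]      // written and read as a child element
    public string Name { get; set; }
}

public static class CustomerXml
{
    // Creating an XmlSerializer is expensive, so one instance is reused.
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(CustomerMessage));

    public static CustomerMessage Read(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return (CustomerMessage)Serializer.Deserialize(reader);
        }
    }

    public static string Write(CustomerMessage message)
    {
        using (var writer = new StringWriter())
        {
            Serializer.Serialize(writer, message);
            return writer.ToString();
        }
    }
}
```

Calling CustomerXml.Read on a string such as `<Customer id="7"><Name>Ada</Name></Customer>` should yield an object with Id 7 and Name "Ada", which the rest of the code can use without touching XML.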
Implementing IXmlSerializable is very similar to using the XmlSerializer, except that one will write code to manage the reading and writing of Xml by implementing
GetSchema(), ReadXml(), and WriteXml().
This method still uses XmlReader and XmlWriter to read and write the XML message.
The Data Contract Serializer is also very easy to implement and use. It does not support XML attributes, so it is best suited to green field development where the XML format is not dictated by an existing schema. The data contract serializer is ideal for code-first scenarios where the exact message structure is not important.
It may be difficult or not viable to use the data contract serializer when an existing message structure is present. In those cases, one should fall back to the XmlSerializer, as it offers much more granular control over the format of the incoming and outgoing message structure.
The Data Contract Serializer is the default serializer for WCF and usually offers similar or better performance than the XmlSerializer when serializing XML. Data contract types can also be serialized to JSON (via DataContractJsonSerializer), which makes this a more modern approach when dealing with messages.
Data Contract Serializer should always be the first choice for green field development as it produces the least amount of code to maintain and offers good performance.
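For comparison, a data contract version of a message could look like the sketch below. OrderMessage and OrderXml are hypothetical names used only for illustration; the empty namespace and explicit member order are choices made for the example, not requirements.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

// Hypothetical message type; every member becomes a child element,
// because the data contract serializer does not emit XML attributes.
[DataContract(Name = "Order", Namespace = "")]
public class OrderMessage
{
    [DataMember(Order = 1)]
    public int Id { get; set; }

    [DataMember(Order = 2)]
    public string Product { get; set; }
}

public static class OrderXml
{
    private static readonly DataContractSerializer Serializer =
        new DataContractSerializer(typeof(OrderMessage));

    public static string Write(OrderMessage message)
    {
        var builder = new StringBuilder();
        using (var writer = XmlWriter.Create(builder))
        {
            Serializer.WriteObject(writer, message);
        }
        return builder.ToString();
    }

    public static OrderMessage Read(string xml)
    {
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            return (OrderMessage)Serializer.ReadObject(reader);
        }
    }
}
```

The amount of code is close to the XmlSerializer version, but there is less control over the shape of the XML, which is why this fits best when both sides of the exchange are generated from the same contracts.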
XElement/XDocument loading and parsing allows for great flexibility when reading and writing Xml messages.
Some of the things to keep in mind:
This method is suitable for cherry-picking data from a large XML structure. For example, if one has an XML message with 100 elements and only a few of those elements are needed, this method is appropriate. It is recommended that message processing be centralized and that the parsed data elements be put into a strongly typed object that can be used throughout the code base. Do not pass XElement or XDocument instances around in the code, as this makes it difficult to test. Also, if message processing is not centralized, XElement parsing may be duplicated throughout the code base, with different approaches being used in different places.
Typically, this method offers better performance than serialization when a small amount of parsing is done. As the amount of manual processing and parsing increases, serialization is a better option, even if the performance is slightly worse as there will be less code to maintain and test.
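The centralized cherry-picking described above could be sketched as follows. QuoteSummary, QuoteParser, and the element names are hypothetical, and the example assumes the elements carry no XML namespace; a real SOAP body would usually need an XNamespace prefix on each element name.

```csharp
using System.Xml.Linq;

// Hypothetical strongly typed result that the rest of the code base uses
// instead of passing an XElement around.
public sealed class QuoteSummary
{
    public string Symbol { get; set; }
    public decimal Price { get; set; }
}

public static class QuoteParser
{
    // The single place where this response is parsed.
    public static QuoteSummary Parse(string responseXml)
    {
        XElement root = XElement.Parse(responseXml);

        return new QuoteSummary
        {
            // The explicit conversions read the element value; the decimal
            // conversion throws if the element is missing.
            Symbol = (string)root.Element("Symbol"),
            Price = (decimal)root.Element("Price")
        };
    }
}
```

Only the two needed elements are read and everything else in the response is ignored. Callers depend on QuoteSummary, so tests can construct one directly without building any XML.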
When performance is of utmost concern, implementing a custom approach can be suitable, but it is costly from a code and maintenance standpoint. Here's a naïve implementation that produces the same XML used in the performance tests. We have a constructor that takes in an XML string and creates the object. We have also overridden
ToString()
to create the XML string.
Here we use string parsing techniques and hardcoded values to read and write the XML message. This method offers the best performance at the cost of additional code, maintenance, and a custom implementation. It is not recommended unless there is a critical need for the best performance.
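A minimal sketch of this style of hand-rolled message class might look like the following. The PersonMessage type, its element names, and the ReadBetween helper are hypothetical; this is not the code that was used in the performance tests.

```csharp
using System;
using System.Globalization;

// Hypothetical message with a fixed, known layout:
// <Person><Name>...</Name><Age>...</Age></Person>
public sealed class PersonMessage
{
    public string Name { get; private set; }
    public int Age { get; private set; }

    // Reads the values with plain string searches instead of an XML parser.
    public PersonMessage(string xml)
    {
        Name = ReadBetween(xml, "<Name>", "</Name>");
        Age = int.Parse(ReadBetween(xml, "<Age>", "</Age>"),
                        CultureInfo.InvariantCulture);
    }

    // Writes the message using hardcoded tags. No escaping is performed,
    // so values containing '<' or '&' would produce invalid XML.
    public override string ToString()
    {
        return "<Person><Name>" + Name + "</Name><Age>"
            + Age.ToString(CultureInfo.InvariantCulture)
            + "</Age></Person>";
    }

    private static string ReadBetween(string xml, string open, string close)
    {
        int start = xml.IndexOf(open, StringComparison.Ordinal);
        if (start < 0) throw new FormatException("Missing " + open);
        start += open.Length;

        int end = xml.IndexOf(close, start, StringComparison.Ordinal);
        if (end < 0) throw new FormatException("Missing " + close);

        return xml.Substring(start, end - start);
    }
}
```

The sketch ignores namespaces, attributes, escaping, and whitespace inside tags, which is exactly the maintenance burden that makes this approach a last resort.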
Performance Summary
The chart lists the initial (first-call) time and the average over 1,000 reads. The hardware used was a W530 laptop with an i7-3840QM processor at 2.80 GHz, running Visual Studio 2013 and .NET 4.5.2. All serializers offer nanosecond read times. The warm-up times can be mitigated by performing proper initialization at startup. For the XmlSerializer, the warm-up cost can be reduced by pre-generating the serialization assembly with SGEN.
All times are in ticks. Ticks are hardware dependent but as a guideline there are roughly 10,000,000 ticks in 1 second.
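One way to collect a first-call time and an average over many reads is sketched below. It assumes the measurements are Stopwatch ticks, whose length depends on Stopwatch.Frequency; how the numbers were actually measured is not stated here, and ReadTimer is a hypothetical helper.

```csharp
using System;
using System.Diagnostics;

public static class ReadTimer
{
    // Returns the ticks taken by the first call (which includes warm-up)
    // and the average ticks per call over 'iterations' further calls.
    public static Tuple<long, double> Measure(Action read, int iterations)
    {
        var first = Stopwatch.StartNew();
        read();
        first.Stop();

        var rest = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            read();
        }
        rest.Stop();

        return Tuple.Create(first.ElapsedTicks,
                            (double)rest.ElapsedTicks / iterations);
    }

    // Converts Stopwatch ticks to microseconds on the current machine.
    public static double TicksToMicroseconds(long ticks)
    {
        return ticks * 1000000.0 / Stopwatch.Frequency;
    }
}
```

Passing a lambda that performs one deserialization, with 1000 iterations, gives numbers that can be compared across approaches on the same machine.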
Final Recommendations
The following guidelines should be followed when designing messages in which services exchange data:
By following these guidelines clients consuming those messages can be made simpler and faster.