Utility to LOGICALLY compare two XML files

xml

Right now we are attempting to build gold configurations for our environment. One piece of software that we use relies on large XML files to contain the bulk of its configuration. We want to take our lab environment, catalog it as our "gold configuration" and then be able to audit against that configuration in the future.

Since diff is a textual, line-by-line comparison and NOT a logical comparison, we can't use it to compare files in this case: for our purposes the order of elements and attributes doesn't matter, so two logically identical files can still differ as text. What I am looking for is something that can parse the two XML files and compare them element by element. So far we have yet to find any utilities that can do this. OS doesn't matter; I can do it on anything where it will work. The preference is something off the shelf.

Any ideas?

Edit: One issue we have run into is that one vendor's config files occasionally repeat the same element several times, each time with different attributes. Whatever diff utility we use would need to either tell those occurrences apart by their sets of attributes or treat them all as part of one element. Tall order 🙂

Best Answer

Two approaches that I use are (a) to canonicalize both XML files and then compare their serializations, and (b) to use the XPath 2.0 deep-equal() function. Both approaches are OK for telling you whether the files are the same, but not very good at telling you where they differ.
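As a rough illustration of approach (a), here is a minimal sketch using Python's standard library (xml.etree.ElementTree.canonicalize, available from Python 3.8). The function name same_after_c14n is made up for this example, and the choice to strip whitespace around text is an assumption about what should count as "the same". Canonicalization puts attributes in a fixed order but keeps element order, so reordered elements still compare as different.

```python
from xml.etree.ElementTree import canonicalize


def same_after_c14n(xml_a: str, xml_b: str) -> bool:
    """Return True if both XML strings have the same canonical (C14N 2.0) form.

    same_after_c14n is a hypothetical helper name, not part of any library.
    strip_text=True drops leading/trailing whitespace in text content, which
    is an assumption that indentation is not significant in these configs.
    """
    canon_a = canonicalize(xml_a, strip_text=True)
    canon_b = canonicalize(xml_b, strip_text=True)
    return canon_a == canon_b


if __name__ == "__main__":
    # Attribute order and empty-element syntax differ, but the content is the same.
    left = '<a y="2" x="1"><b/></a>'
    right = '<a x="1" y="2">\n  <b></b>\n</a>'
    print(same_after_c14n(left, right))
```

Like the approaches described above, this only gives a yes/no answer; it does not say where the two files differ.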

A commercial tool that specializes in this problem is DeltaXML.

If you have things that you consider equivalent, but which aren't equivalent at the XML level (for example, elements in a different order), then you may need to transform both documents to normalize them before comparison.
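One possible normalization for the "elements in a different order" case can be sketched in Python as follows. It assumes that sibling order never matters and that whitespace around text is insignificant, and it ignores tail text, comments, and namespace prefixes; normal_form and logically_equal are hypothetical names for this example. Repeated elements with different attributes stay separate entries and are told apart by their attribute sets.

```python
import xml.etree.ElementTree as ET


def normal_form(elem):
    """Return a nested tuple that ignores attribute order and sibling order.

    Assumptions: sibling order is irrelevant, whitespace around text is
    irrelevant, and tail text (text between child elements) is ignored.
    """
    # Normalize the children first, then sort them so order no longer matters.
    children = sorted(normal_form(child) for child in elem)
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        (elem.text or "").strip(),
        tuple(children),
    )


def logically_equal(xml_a: str, xml_b: str) -> bool:
    """Compare two XML strings after normalizing both."""
    return normal_form(ET.fromstring(xml_a)) == normal_form(ET.fromstring(xml_b))


if __name__ == "__main__":
    gold = '<cfg><item name="a" v="1"/><item name="b" v="2"/><opt>x</opt></cfg>'
    audit = '<cfg><opt>x</opt><item v="2" name="b"/><item v="1" name="a"/></cfg>'
    print(logically_equal(gold, audit))
```

The same normalization could equally be done as an XSLT transformation that sorts children before either of the comparisons above; the Python version is just one way to show the idea.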
