You can use the xsi:type attribute for this purpose. It has to be the type attribute from the XMLSchema-instance namespace (http://www.w3.org/2001/XMLSchema-instance), not an attribute in your own namespace; otherwise validators will not recognize it.
In the schema, declare an abstract base type, then create an additional complex type for each subtype that holds the elements and attributes specific to that subtype.
Be aware that while this solution works, it would be better to use a different element name for each type. xsi:type goes against the grain: the type is now determined by the element name in combination with the type attribute rather than by the element name alone.
For example:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Creature" type="CreatureType">
  </xs:element>
  <xs:complexType name="CreatureType" abstract="true">
    <!-- any common validation goes here -->
  </xs:complexType>
  <xs:complexType name="Human">
    <xs:complexContent>
      <xs:extension base="CreatureType">
        <xs:sequence maxOccurs="1">
          <xs:element name="Address"/>
        </xs:sequence>
        <xs:attribute name="nationality" type="xs:string"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:complexType name="Animal">
    <xs:complexContent>
      <xs:extension base="CreatureType">
        <xs:sequence maxOccurs="1">
          <xs:element name="Habitat"/>
        </xs:sequence>
        <xs:attribute name="species" type="xs:string"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
This schema will validate these two:
<?xml version="1.0" encoding="UTF-8"?>
<Creature xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:type="Human"
          nationality="British">
  <Address>London</Address>
</Creature>

<?xml version="1.0" encoding="UTF-8"?>
<Creature xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:type="Animal"
          species="Tiger">
  <Habitat>Jungle</Habitat>
</Creature>
but not this, because SomeUnknownThing is not a type defined in the schema:
<?xml version="1.0" encoding="UTF-8"?>
<Creature xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:type="SomeUnknownThing"
          something="something">
  <Something>Something</Something>
</Creature>
or this, because the Human type allows neither the species attribute nor the Habitat element:
<?xml version="1.0" encoding="UTF-8"?>
<Creature xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:type="Human"
          species="Tiger">
  <Habitat>Jungle</Habitat>
</Creature>
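The namespace requirement on xsi:type can be seen from the consuming side as well. The following is a minimal sketch, not a schema validator: it only shows that a parser exposes the attribute under the XMLSchema-instance namespace URI, so a type attribute bound to any other namespace is a different attribute. The helper name declared_type and the http://example.com/my-namespace URI are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI that the xsi prefix must be bound to.
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def declared_type(xml_text):
    """Return the xsi:type value on the root element, or None if absent.

    Hypothetical helper for illustration; it performs no schema validation.
    """
    root = ET.fromstring(xml_text)
    # ElementTree stores namespaced attributes as "{namespace-uri}localname".
    return root.get("{%s}type" % XSI_NS)


human = """<Creature xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:type="Human" nationality="British">
  <Address>London</Address>
</Creature>"""

# A type attribute in some other (made-up) namespace is not xsi:type.
other = """<Creature xmlns:my="http://example.com/my-namespace"
          my:type="Human" nationality="British">
  <Address>London</Address>
</Creature>"""

print(declared_type(human))
print(declared_type(other))
```

The first call finds the attribute; the second does not, which is why a schema processor would not treat my:type as a type override.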
I suggest ElementTree. There are other compatible implementations of the same API, such as lxml, and cElementTree in the Python standard library itself (deprecated since Python 3.3, when xml.etree.ElementTree began using the C accelerator automatically, and removed in Python 3.9). In this context, what they chiefly add is even more speed; the ease of programming depends on the API, which ElementTree defines.
First build an Element instance root from the XML, e.g. with the XML function, or by parsing a file with something like:
import xml.etree.ElementTree as ET
root = ET.parse('thefile.xml').getroot()
Or use any of the many other ways documented for ElementTree. Then do something like:
for type_tag in root.findall('bar/type'):
    value = type_tag.get('foobar')
    print(value)
And similar, usually pretty simple, code patterns.
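As a self-contained illustration of that pattern, here is a small sketch that parses an in-memory string instead of a file. The sample document (a foo root with bar/type children carrying a foobar attribute) is invented to match the path used above; substitute your own structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: <type> elements nested under <bar>, some with a foobar attribute.
SAMPLE = """<foo>
  <bar>
    <type foobar="1"/>
    <type foobar="2"/>
  </bar>
  <bar>
    <type foobar="3"/>
    <type/>
  </bar>
</foo>"""

# ET.fromstring returns the root Element directly (no getroot() call needed).
root = ET.fromstring(SAMPLE)

# 'bar/type' matches every <type> child of every <bar> child of the root.
# Element.get returns None when the attribute is missing.
values = [type_tag.get('foobar') for type_tag in root.findall('bar/type')]
print(values)
```

Note that get returns None for the last type element, which has no foobar attribute; pass a second argument to get if you prefer a different default.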
The following declares the root element, which can only occur once and must be specified, and a sequence of skill elements with an id attribute of type xs:IDREF. xs:attribute declares an attribute for the element. The name attribute specifies the attribute name. The type attribute specifies the data type.
No, you don't need to have maxOccurs. There is an implicit maxOccurs="1" if you don't specify it.
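To make the default concrete, here is a rough sketch that reads occurrence bounds from a schema fragment and falls back to "1" when minOccurs or maxOccurs is omitted, mirroring the XML Schema default. The schema below is a hypothetical reconstruction based only on the description above (a root element holding a sequence of skill elements with an xs:IDREF id attribute), not the original one, and effective_occurs is an illustrative helper.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema written to match the prose description; not the original.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="skill" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="id" type="xs:IDREF"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def effective_occurs(particle):
    """Return (minOccurs, maxOccurs) for a particle, applying the default of "1"."""
    return particle.get("minOccurs", "1"), particle.get("maxOccurs", "1")


schema = ET.fromstring(SCHEMA)
sequence = schema.find(".//" + XS + "sequence")  # declares neither attribute
skill = sequence.find(XS + "element")            # declares only maxOccurs

print(effective_occurs(sequence))
print(effective_occurs(skill))
```

The sequence carries neither attribute, so both bounds come out as "1"; writing maxOccurs="1" on it explicitly would change nothing.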