Xml – VB.NET validating XML file against XSD file and parsing through the xml

vb.net xml xsd

What I need to do?
I need to validate an XML file (given its file path) against an XSD file (also given by file path). I need to check that it is well-formed, contains no illegal characters, has all the tags defined in the XSD (i.e., no tag is missing), and matches the data types defined in the XSD. Once that is done, I need to parse the XML file to get the data and store it in a database.

Questions?
1) Will using XmlReaderSettings with XmlDocument and XmlReader, together with the Validate method, help me achieve what I need? Can anyone help me with sample code?

2) What is the best way to parse an xml file to get specific tags?

I am new to VB.net so any sample code help will be appreciated. Thanks!

Best Answer

Yes, you are on the right track. Validating an XML document can be done using either XmlDocument or XmlReader (as I'll describe later, you can also use XDocument). Which one you choose will depend on your situation, but they both work similarly. When they find an error with the document, they call a ValidationEventHandler delegate. The XmlReader calls it via an event in the XmlReaderSettings object whereas the XmlDocument calls it via a delegate passed as a parameter to its Validate method. Here is a simple class which can be used to collect the errors:

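Imports System.Collections.Generic
Imports System.Text
Imports System.Xml
Imports System.Xml.Schema

' Collects schema validation errors so they can be reported together.
Public Class XmlValidationErrorBuilder
    Private _errors As New List(Of ValidationEventArgs)()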

    Public Sub ValidationEventHandler(ByVal sender As Object, ByVal args As ValidationEventArgs)
        If args.Severity = XmlSeverityType.Error Then
            _errors.Add(args)
        End If
    End Sub

    Public Function GetErrors() As String
        If _errors.Count <> 0 Then
            Dim builder As New StringBuilder()
            builder.Append("The following ")
            builder.Append(_errors.Count.ToString())
            builder.AppendLine(" error(s) were found while validating the XML document against the XSD:")
            For Each i As ValidationEventArgs In _errors
                builder.Append("* ")
                builder.AppendLine(i.Message)
            Next
            Return builder.ToString()
        Else
            Return Nothing
        End If
    End Function
End Class

The ValidationEventHandler method in that class matches the signature of the ValidationEventHandler delegate, so you can use it to collect the errors from either the XmlReader or the XmlDocument. Here's how you could use it with the XmlDocument:

Public Function LoadValidatedXmlDocument(xmlFilePath As String, xsdFilePath As String) As XmlDocument
    Dim doc As New XmlDocument()
    doc.Load(xmlFilePath)
    doc.Schemas.Add(Nothing, xsdFilePath)
    Dim errorBuilder As New XmlValidationErrorBuilder()
    doc.Validate(New ValidationEventHandler(AddressOf errorBuilder.ValidationEventHandler))
    Dim errorsText As String = errorBuilder.GetErrors()
    If errorsText IsNot Nothing Then
        Throw New Exception(errorsText)
    End If
    Return doc
End Function
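
As a usage sketch, the function above could be called like this. The file paths and the module name are placeholders, not something from the question, and the sketch assumes LoadValidatedXmlDocument is accessible from the calling module. Note that doc.Load itself throws an XmlException when the file is not well-formed (an unclosed tag or an illegal character, for example), so well-formedness is checked before schema validation even starts.

```vbnet
Imports System
Imports System.Xml

Module ValidationDemo
    Sub Main()
        ' Hypothetical paths; replace them with your own files.
        Dim xmlPath As String = "C:\data\orders.xml"
        Dim xsdPath As String = "C:\data\orders.xsd"
        Try
            Dim doc As XmlDocument = LoadValidatedXmlDocument(xmlPath, xsdPath)
            Console.WriteLine("Valid. Root element: " & doc.DocumentElement.Name)
        Catch ex As XmlException
            ' Thrown by doc.Load when the XML is not well-formed.
            Console.WriteLine("Not well-formed (line " & ex.LineNumber.ToString() & "): " & ex.Message)
        Catch ex As Exception
            ' Schema violations collected by LoadValidatedXmlDocument end up here,
            ' as do other failures such as a missing file.
            Console.WriteLine(ex.Message)
        End Try
    End Sub
End Module
```

Catching XmlException separately lets you tell "this is not XML at all" apart from "this is XML, but it does not match the schema".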

And here's how you could use it with the XmlReader:

Public Sub LoadXml(xmlFilePath As String, xsdFilePath As String)
    Dim settings As New XmlReaderSettings()
    settings.Schemas.Add(Nothing, xsdFilePath)
    settings.ValidationType = ValidationType.Schema
    Dim errorBuilder As New XmlValidationErrorBuilder()
    AddHandler settings.ValidationEventHandler, New ValidationEventHandler(AddressOf errorBuilder.ValidationEventHandler)
    ' Validation happens as the reader advances, so the whole document
    ' must be read before the list of errors is complete.
    Using reader As XmlReader = XmlReader.Create(xmlFilePath, settings)
        While reader.Read()
            ' Process each node here...
        End While
    End Using
    Dim errorsText As String = errorBuilder.GetErrors()
    If errorsText IsNot Nothing Then
        ' Handle the errors
    End If
End Sub
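
If you want to pull out specific tags while the XmlReader is validating, the read loop is one place to do it. The following is only a sketch: the module and function names are made up for illustration, settings is assumed to be configured with the schema and handler as in LoadXml above, and it only handles elements that contain plain text (no nested child elements).

```vbnet
Imports System.Collections.Generic
Imports System.Xml

Public Module XmlReaderExtraction
    ' Collects the text of every text-only element whose local name matches elementName.
    ' settings is assumed to carry the schema and validation handler, as in LoadXml.
    Public Function ReadElementValues(xmlFilePath As String, settings As XmlReaderSettings, elementName As String) As List(Of String)
        Dim values As New List(Of String)()
        Dim insideTarget As Boolean = False
        Using reader As XmlReader = XmlReader.Create(xmlFilePath, settings)
            While reader.Read()
                Select Case reader.NodeType
                    Case XmlNodeType.Element
                        ' Empty elements such as <title/> have no text node to collect.
                        insideTarget = (reader.LocalName = elementName) AndAlso Not reader.IsEmptyElement
                    Case XmlNodeType.Text, XmlNodeType.CDATA
                        If insideTarget Then values.Add(reader.Value)
                    Case XmlNodeType.EndElement
                        insideTarget = False
                End Select
            End While
        End Using
        Return values
    End Function
End Module
```

This streams through the file without loading it all into memory, which mainly matters for large documents. For small files, XmlDocument or XDocument is usually easier to work with.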

Alternatively, you can also use the newer XDocument class. The way to do it with XDocument is very similar to XmlDocument. There is a Validate extension method for the XDocument which takes, yet again, a ValidationEventHandler delegate. Here's an example of that:

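' Requires Imports System.Xml.Linq (for XDocument) and Imports System.Xml.Schema
' (for XmlSchemaSet and the Validate extension method).
Public Function LoadValidatedXDocument(xmlFilePath As String, xsdFilePath As String) As XDocument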
    Dim doc As XDocument = XDocument.Load(xmlFilePath)
    Dim schemas As New XmlSchemaSet()
    schemas.Add(Nothing, xsdFilePath)
    Dim errorBuilder As New XmlValidationErrorBuilder()
    doc.Validate(schemas, New ValidationEventHandler(AddressOf errorBuilder.ValidationEventHandler))
    Dim errorsText As String = errorBuilder.GetErrors()
    If errorsText IsNot Nothing Then
        Throw New Exception(errorsText)
    End If
    Return doc
End Function

As for loading the data from the XML document into a database, it's impossible to say how, precisely, to do that without knowing the schema of the XML document, the schema of the database, the kind of database, etc. I would recommend doing some research both into reading XML data and writing data to databases and see how far you get. If you have any specific questions when you run into trouble, we'll be here to help :)
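
To give a rough idea of the "get specific tags" part, here is a minimal sketch using XDocument and LINQ to XML. The document layout (orders, order, customer, total) is entirely hypothetical, since the question doesn't show the real schema. In practice you would use the XDocument returned by LoadValidatedXDocument instead of parsing an inline string.

```vbnet
Imports System
Imports System.Globalization
Imports System.Linq
Imports System.Xml.Linq

Module ParseDemo
    Sub Main()
        ' Hypothetical document; stands in for an already validated XDocument.
        Dim doc As XDocument = XDocument.Parse(
            "<orders>" &
            "<order id=""1""><customer>Alice</customer><total>12.50</total></order>" &
            "<order id=""2""><customer>Bob</customer><total>7.25</total></order>" &
            "</orders>")

        ' Walk every <order> element directly under the root.
        For Each order As XElement In doc.Root.Elements("order")
            Dim id As Integer = Integer.Parse(order.Attribute("id").Value)
            Dim customer As String = order.Element("customer").Value
            ' XML numbers use "." as the decimal separator, so parse with the invariant culture.
            Dim total As Decimal = Decimal.Parse(order.Element("total").Value, CultureInfo.InvariantCulture)

            ' This is where the values would be handed to your database code,
            ' for example as parameters of a parameterized INSERT command.
            Console.WriteLine(id.ToString() & ": " & customer & ", " & total.ToString(CultureInfo.InvariantCulture))
        Next
    End Sub
End Module
```

Because the document has already passed validation, required elements and attributes can be assumed to exist, which is why the sketch does not check for Nothing. If your XSD declares a targetNamespace, the element names must be qualified with an XNamespace rather than passed as plain strings.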
