Java – How to tell Java SAX Parser to ignore invalid character references

Tags: error-handling, java, sax, xml

When trying to parse incorrect XML with a character reference such as &#x1;, Java's SAX Parser dies a horrible death with a fatal error such as

    org.xml.sax.SAXParseException: Character reference "&#x1"
                                   is an invalid XML character.

Is there any way around this? Will I have to clean up the XML file before I hand it off to the SAX Parser? If so, is there an elegant way of going about this?

Best Answer

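Use XML 1.1! skaffman is completely right, but you can just stick <?xml version="1.1"?> at the top of your files and you'll be in good shape: XML 1.1 allows character references to the control characters #x1 through #x1F, which XML 1.0 forbids (&#x0; is still illegal in both versions). If you're dealing with streams, write a wrapper that rewrites or adds that XML declaration. Strictly speaking it is a declaration rather than a processing instruction, and it has to be the very first thing in the document.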
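
A rough sketch of such a stream wrapper, in case it helps. The class name Xml11Wrapper and the methods forceVersion11 and parseText are made up for illustration and are not part of any library. It assumes the input is already a Reader (so the encoding in the old declaration no longer matters), that nothing such as a byte-order mark comes before the declaration, and that the parser in use supports XML 1.1, as the one bundled with the JDK should.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only.
class Xml11Wrapper {
    private static final String DECL = "<?xml version=\"1.1\"?>";
    private static final int HEAD = 6; // "<?xml" plus one whitespace character

    /**
     * Wraps a Reader so that the document it yields starts with an XML 1.1
     * declaration. An existing declaration is discarded, including any
     * encoding or standalone attributes (assumption: they are not needed).
     */
    public static Reader forceVersion11(Reader in) throws IOException {
        PushbackReader pr = new PushbackReader(in, DECL.length() + HEAD);
        char[] head = new char[HEAD];
        int n = 0;
        while (n < HEAD) {
            int r = pr.read(head, n, HEAD - n);
            if (r < 0) break; // input shorter than HEAD characters
            n += r;
        }
        boolean hasDecl = n == HEAD
                && new String(head, 0, 5).equals("<?xml")
                && Character.isWhitespace(head[5]);
        if (hasDecl) {
            // Skip the rest of the old declaration, up to and including "?>".
            int prev = -1;
            int c;
            while ((c = pr.read()) != -1) {
                if (prev == '?' && c == '>') break;
                prev = c;
            }
        } else {
            pr.unread(head, 0, n); // not a declaration: put the characters back
        }
        pr.unread(DECL.toCharArray()); // pushed back last, so it is read first
        return pr;
    }

    /** Parses the XML with the JDK SAX parser and returns its character data. */
    public static String parseText(String xml) throws Exception {
        StringBuilder text = new StringBuilder();
        SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(forceVersion11(new StringReader(xml))),
                new DefaultHandler() {
                    @Override
                    public void characters(char[] ch, int start, int length) {
                        text.append(ch, start, length);
                    }
                });
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String text = parseText("<?xml version=\"1.0\"?><doc>&#x1;</doc>");
        System.out.println("code point of parsed text: " + (int) text.charAt(0));
    }
}
```

For byte streams you would need to do the same thing on an InputStream and take care to keep the original encoding attribute, which this sketch does not attempt. Note that the control character still ends up in the parsed text, so whatever consumes the SAX events has to cope with it.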