Java – How to load old Microsoft Office XML file (Excel) using Java

exceljavams-officexml

I'm not able to load an Excel file in the older Office XML format (think Office 2002 or 2003 version) into Java. I tried JXL and Apache's POI (version 3.7). POI doesn't work since it appears to want the newer Office .xlsx format.

Here's an example of the older Office XML format.

One can generate a similar XML file from MS Excel 2010 by saving the workbook as the format "XML Spreadsheet 2003"?

Are there any open-source Java libraries that will load the XMLSS format? Otherwise I have no choice but to write a custom parser: read the XML file then interpret the cell tags to build out the cell matrix. In this XML format, any rows with empty cell values are skipped, the next cell with data positioned with an index attribute that acts like an offset in the columns, I assume to save space in the XML file.

Best Answer

The format is called SpreadsheetML (do not confuse with .xlsx which is also xml-based), a library called Xelem can handle it:

import nl.fountain.xelem.excel.Workbook;
import nl.fountain.xelem.lex.ExcelReader;
//...
ExcelReader reader = new ExcelReader();
Workbook xlWorkbook = reader.getWorkbook("c:\\my\\spreadsheet.xml");
System.out.println(xlWorkbook.getSheetNames());