Xml – XSLT – remove whitespace from template

xmlxslt

I am using XML to store a small contact list and trying to write a XSL template that will transform it into a CSV file. The problem I am having is with whitespace in the output.

The output:

Friend, John, Smith, Home,
        123 test,
       Sebastopol,
       California,
       12345,
     Home 1-800-123-4567, Personal john.smith@gmail.com

I have indented/spaced both the source XML file and the associated XSL Template to make it easier to read and develop, but all that extra white space is getting itself into the output. The XML itself doesn't have extra whitespace inside the nodes, just outside of them for formatting, and the same goes for the XSLT.

In order for the CSV file to be valid, each entry needs to be on it's own line, not broken up. Besides stripping all extra white space from the XML and XSLT (making them just one long line of code), is there another way to get rid of the whitespace in the output?

Edit:
Here is a small XML sample:

<PHONEBOOK>
    <LISTING>
        <FIRST>John</FIRST>
        <LAST>Smith</LAST>
        <ADDRESS TYPE="Home">
            <STREET>123 test</STREET>
            <CITY>Sebastopol</CITY>
            <STATE>California</STATE>
            <ZIP>12345</ZIP>
        </ADDRESS>
        <PHONE>1-800-123-4567</PHONE>
        <EMAIL>john.smith@gmail.com</EMAIL>
        <RELATION>Friend</RELATION>
    </LISTING>
</PHONEBOOK>

And here is the XSLT:

<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />

 <xsl:template match="/">
   <xsl:for-each select="//LISTING">
    <xsl:value-of select="RELATION" /><xsl:text>, </xsl:text>
    <xsl:value-of select="FIRST" /><xsl:text>, </xsl:text>
    <xsl:value-of select="LAST" /><xsl:text>, </xsl:text>

    <xsl:if test="ADDRESS">
     <xsl:for-each select="ADDRESS">
       <xsl:choose>
        <xsl:when test="@TYPE">
         <xsl:value-of select="@TYPE" />,
        </xsl:when>
            <xsl:otherwise>
            <xsl:text>Home </xsl:text>
            </xsl:otherwise>
       </xsl:choose>
       <xsl:value-of select="STREET" />,
       <xsl:value-of select="CITY" />,
       <xsl:value-of select="STATE" />,
       <xsl:value-of select="ZIP" />,
     </xsl:for-each>
    </xsl:if>

    <xsl:for-each select="PHONE">
      <xsl:choose>
       <xsl:when test="@TYPE">
        <xsl:value-of select="@TYPE" />  
       </xsl:when>
       <xsl:otherwise><xsl:text>Home </xsl:text></xsl:otherwise>
      </xsl:choose>
     <xsl:value-of select="."  /><xsl:text  >, </xsl:text>
    </xsl:for-each>

    <xsl:if test="EMAIL">
     <xsl:for-each select="EMAIL">
      <xsl:choose>
       <xsl:when test="@TYPE">
        <xsl:value-of select="@TYPE" /><xsl:text  > </xsl:text> 
       </xsl:when>
       <xsl:otherwise><xsl:text  >Personal </xsl:text></xsl:otherwise>
      </xsl:choose>
      <xsl:value-of select="."  /><xsl:text  >, </xsl:text>
     </xsl:for-each>
    </xsl:if>
    <xsl:text>&#10;&#13;</xsl:text>
   </xsl:for-each>
 </xsl:template>

</xsl:stylesheet>

Best Answer

In XSLT, white-space is preserved by default, since it can very well be relevant data.

The best way to prevent unwanted white-space in the output is not to create it in the first place. Don't do:

<xsl:template match="foo">
  foo
</xsl:template>

because that's "\n··foo\n", from the processor's point of view. Rather do

<xsl:template match="foo">
  <xsl:text>foo</xsl:text>
</xsl:template>

White-space in the stylesheet is ignored as long as it occurs between XML elements only. Simply put: never use "naked" text anywhere in your XSLT code, always enclose it in an element.

Also, using an unspecific:

<xsl:apply-templates />

is problematic, because the default XSLT rule for text nodes says "copy them to the output". This applies to "white-space-only" nodes as well. For instance:

<xml>
  <data> value </data>
</xml>

contains three text nodes:

  1. "\n··" (right after <xml>)
  2. "·value·"
  3. "\n" (right before </xml>)

To avoid that #1 and #3 sneak into the output (which is the most common reason for unwanted spaces), you can override the default rule for text nodes by declaring an empty template:

<xsl:template match="text()" />

All text nodes are now muted and text output must be created explicitly:

<xsl:value-of select="data" />

To remove white-space from a value, you could use the normalize-space() XSLT function:

<xsl:value-of select="normalize-space(data)" />

But careful, since the function normalizes any white-space found in the string, e.g. "·value··1·" would become "value·1".

Additionally you can use the <xsl:strip-space> and <xsl:preserve-space> elements, though usually this is not necessary (and personally, I prefer explicit white-space handling as indicated above).