In XSLT, white-space is preserved by default, since it can very well be relevant data.
The best way to prevent unwanted white-space in the output is not to create it in the first place. Don't do:
<xsl:template match="foo">
  foo
</xsl:template>
because that's "\n··foo\n" from the processor's point of view (· marks a space here and below). Rather do
<xsl:template match="foo">
<xsl:text>foo</xsl:text>
</xsl:template>
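White-space in the stylesheet is ignored only when it forms a text node made up of nothing but white-space, i.e. when it sits between XML elements with no other text next to it. Simply put: never use "naked" text anywhere in your XSLT code, always enclose it in an element such as <xsl:text>.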
Also, using an unspecific:
<xsl:apply-templates />
is problematic, because the default XSLT rule for text nodes says "copy them to the output". This applies to "white-space-only" nodes as well. For instance:
<xml>
  <data> value </data>
</xml>
contains three text nodes:
1. "\n··" (right after <xml>)
2. "·value·"
3. "\n" (right before </xml>)
To keep #1 and #3 from sneaking into the output (the most common cause of unwanted spaces), you can override the default rule for text nodes by declaring an empty template:
<xsl:template match="text()" />
All text nodes are now muted and text output must be created explicitly:
<xsl:value-of select="data" />
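Putting the two pieces together, a minimal sketch of a complete stylesheet for the sample document above might look like this. The XSLT 1.0 version, the text output method and the match="xml" pattern are assumptions made for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Override the built-in rule: text nodes produce no output. -->
  <xsl:template match="text()"/>

  <!-- Emit the string value of <data> explicitly. -->
  <xsl:template match="xml">
    <xsl:value-of select="data"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the sample, this should emit only the string value of <data>, including the spaces around "value"; the two white-space-only nodes no longer reach the output.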
To remove white-space from a value, you could use the normalize-space() XPath function:
<xsl:value-of select="normalize-space(data)" />
But be careful: the function not only strips leading and trailing white-space, it also collapses every run of white-space inside the string to a single space, e.g. "·value··1·" would become "value·1".
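To see both effects at once, here is a small sketch that normalizes a literal string and wraps the result in brackets so the edges are visible. The string literal and the brackets are only there for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Ends are trimmed and the inner double space collapses to one. -->
    <xsl:text>[</xsl:text>
    <xsl:value-of select="normalize-space(' value  1 ')"/>
    <xsl:text>]</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Run against any input document, this should print [value 1].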
Additionally, you can use the <xsl:strip-space> and <xsl:preserve-space> elements, though usually this is not necessary (and personally, I prefer explicit white-space handling as indicated above).
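For completeness, a rough sketch of how those two declarations are typically written. The element name pre is a hypothetical example, and the template for data is assumed just to produce some output.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Drop white-space-only text nodes from every element of the input... -->
  <xsl:strip-space elements="*"/>
  <!-- ...except inside <pre> (a hypothetical element name). -->
  <xsl:preserve-space elements="pre"/>

  <xsl:template match="data">
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

Note that this only removes text nodes consisting entirely of white-space. The spaces inside "·value·" are part of a node with real content and stay untouched.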
@Oded, @khachik,
Try checking his desired output for well-formedness. It is indeed well-formed XML. ("Valid" is not even a question here, as there is no schema.)
It is a common misconception that ">" is not legal in well-formed XML.
In most contexts, "<" is not legal, but ">" is legal everywhere with one rare exception. The relevant paragraph of the spec:
The ampersand character (&) and the
left angle bracket (<) MUST NOT appear
in their literal form, except when
used as markup delimiters, or within a
comment, a processing instruction, or
a CDATA section. If they are needed
elsewhere, they MUST be escaped using
either numeric character references or
the strings " &amp; " and " &lt; "
respectively. The right angle bracket
(>) may be represented using the
string " &gt; ", and MUST, for
compatibility, be escaped using either
" &gt; " or a character reference when
it appears in the string " ]]> " in
content, when that string is not
marking the end of a CDATA section.
With XSLT 2.0, the "right" way to do what you want is to use <xsl:character-map>.
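A minimal sketch of what that could look like, assuming your processor supports serialization with character maps. The map name raw-gt is made up, and the identity template is only there so the stylesheet has something to serialize.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Tell the serializer to apply the map named "raw-gt" (a made-up name). -->
  <xsl:output method="xml" use-character-maps="raw-gt"/>

  <xsl:character-map name="raw-gt">
    <!-- Both attributes hold the single character ">" once parsed;
         the mapped string is written out without further escaping. -->
    <xsl:output-character character="&gt;" string="&gt;"/>
  </xsl:character-map>

  <!-- Identity transform, assumed here for illustration. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The map is applied at serialization time, so it has no effect if the result tree is handed to another component instead of being written out.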
With XSLT 1.0, I think the only way to force the use of ">" in the output is to use disable-output-escaping, as @khachik suggested. Note however that XSLT processors are not required to honor DOE or character maps, and some can't (e.g. if they're in a pipeline and are not connected to serialization). But you probably know by now whether yours can, and if it can't, you'll need to handle serialization issues at the end of the pipeline.
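For reference, a sketch of the disable-output-escaping variant. The <out> wrapper element is hypothetical, and as noted the processor is free to ignore the attribute.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"/>

  <xsl:template match="/">
    <out>
      <!-- Ask the serializer to write the character as-is;
           a processor may ignore this request. -->
      <xsl:text disable-output-escaping="yes">&gt;</xsl:text>
    </out>
  </xsl:template>
</xsl:stylesheet>
```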
However, it is worth asking: why do you want the ">" serialized as ">"? As seen in the spec, "&gt;" is a perfectly acceptable way to express exactly the same information as far as XML is concerned. No downstream XML consumer should know the difference or care. Do you want it for aesthetic reasons?
Update: the OP wants this because the output needs to be not only well-formed XML but also well-formed Literate Haskell.
Best Answer
Use the entity code &#160; instead.
&nbsp; is an HTML "character entity reference". There is no named entity for non-breaking space in XML, so you use the code &#160;.
Wikipedia includes a list of XML and HTML entities, and you can see that there are only 5 "predefined entities" in XML, but HTML has over 200. I'll also point over to Creating a space (&#160;) in XSL which has excellent answers.
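As a rough sketch, this is how the numeric reference might be used in a stylesheet. The second form, declaring nbsp yourself in an internal DTD subset, is a commonly used alternative that goes beyond the answer above. The <p> element and the sample text are assumptions.

```xml
<!DOCTYPE xsl:stylesheet [
  <!-- Optional: declare the HTML name so that it resolves in this file. -->
  <!ENTITY nbsp "&#160;">
]>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <p>
      <!-- Numeric character reference: works in any XML document. -->
      <xsl:text>a&#160;b</xsl:text>
      <!-- Named reference: works only because of the declaration above. -->
      <xsl:text>c&nbsp;d</xsl:text>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

Both forms put the same character (U+00A0) into the result tree. How it appears in the serialized output depends on the processor and the output encoding.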