To change and to change for the better are two different things.
One of the beauties of XML is that if you don’t like something, you can change it. Since it is impossible to please everyone, transforming XML to XML is extremely common. However, you will not transform XML only to improve the structure of a poorly designed schema. Sometimes you need to merge disparate XML documents into a single document. At other times you want to break up a large document into smaller subdocuments. You might also wish to preprocess a document to filter out only the relevant information, without changing its structure, before sending it off for further processing.
A simple but important tool in many XML-to-XML transformations is the identity transform. This tool is a stylesheet that copies an input document to an output document without changing it. This task may seem better suited to the operating system's copy operation, but as the following examples demonstrate, this simple stylesheet can be imported into other stylesheets to yield very common types of transformations with little added coding effort.
Example 6-1 shows the identity stylesheet. I actually prefer calling this stylesheet the copying stylesheet, and I call the techniques that utilize it the overriding copy idiom.
Example 6-1. copy.xslt
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
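As a minimal sketch of how little code an override takes, the following hypothetical stylesheet (not one of this chapter's numbered examples) imports copy.xslt and adds a single empty template that matches comments. It assumes copy.xslt sits in the same directory. Because templates in an importing stylesheet take precedence over imported ones, comments are dropped while everything else is still copied:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Pull in the copying stylesheet (assumed to be in the same directory) -->
  <xsl:import href="copy.xslt"/>

  <!-- Override the copy behavior for comments: an empty template emits nothing -->
  <xsl:template match="comment()"/>

</xsl:stylesheet>
```

The recipes that follow use the same shape: import the copying stylesheet, then add templates only for the nodes that should be treated differently.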
You have a document that encodes information with attributes, and you would like to use child elements instead.
This problem is tailor-made for what the introduction to this chapter calls the overriding copy idiom. This example transforms attributes to elements globally:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:import href="copy.xslt"/>

  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="@*">
    <xsl:element name="{local-name(.)}" namespace="{namespace-uri(..)}">
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
The stylesheet works by overriding the copy behavior for attributes. It replaces the behavior with a template that converts an attribute into an element (of the same name) whose value is the attribute’s value. It also assumes that this new element should be in the same namespace as the attribute’s parent. If you prefer not to make assumptions, then use the following code:
<xsl:template match="@*">
  <xsl:variable name="namespace">
    <xsl:choose>
      <!-- Use the namespace of the attribute, if there is one -->
      <xsl:when test="namespace-uri()">
        <xsl:value-of select="namespace-uri()"/>
      </xsl:when>
      <!-- Otherwise, use the parent's namespace -->
      <xsl:otherwise>
        <xsl:value-of select="namespace-uri(..)"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>
  <xsl:element name="{name()}" namespace="{$namespace}">
    <xsl:value-of select="."/>
  </xsl:element>
</xsl:template>
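To see where the two templates differ, consider a small hypothetical input in which one attribute is namespace-qualified and the other is not. The element names and namespace URIs here are invented purely for illustration:

```xml
<doc xmlns="http://example.com/doc"
     xmlns:m="http://example.com/meta"
     m:status="draft" id="d1"/>
```

Assuming the namespace-aware template replaces the `@*` template in the stylesheet that imports copy.xslt, the result should be equivalent to the following. Whitespace, the order of the two children, and the prefix the processor chooses may vary:

```xml
<doc xmlns="http://example.com/doc" xmlns:m="http://example.com/meta">
  <m:status>draft</m:status>
  <id>d1</id>
</doc>
```

The unqualified id attribute has no namespace of its own, so its element falls back to the parent's namespace, while m:status keeps its own. The simpler template would instead turn m:status into a status element in the parent's namespace, because it uses only the local name and the parent's namespace URI.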
You’ll often want to be selective when transforming attributes to elements (see Example 6-2 to Example 6-4).
Example 6-2. Input
<people which="MeAndMyFriends">
  <person firstname="Sal" lastname="Mangano" age="38" height="5.75"/>
  <person firstname="Mike" lastname="Palmieri" age="28" height="5.10"/>
  <person firstname="Vito" lastname="Palmieri" age="38" height="6.0"/>
  <person firstname="Vinny" lastname="Mari" age="37" height="5.8"/>
</people>
Example 6-3. A stylesheet that transforms person attributes only
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:import href="copy.xslt"/>

  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="person/@*">
    <xsl:element name="{local-name(.)}" namespace="{namespace-uri(..)}">
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
Example 6-4. Output
<people which="MeAndMyFriends">
  <person>
    <firstname>Sal</firstname>
    <lastname>Mangano</lastname>
    <age>38</age>
    <height>5.75</height>
  </person>
  <person>
    <firstname>Mike</firstname>
    <lastname>Palmieri</lastname>
    <age>28</age>
    <height>5.10</height>
  </person>
  <person>
    <firstname>Vito</firstname>
    <lastname>Palmieri</lastname>
    <age>38</age>
    <height>6.0</height>
  </person>
  <person>
    <firstname>Vinny</firstname>
    <lastname>Mari</lastname>
    <age>37</age>
    <height>5.8</height>
  </person>
</people>
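One caveat applies when only some attributes of an element are converted. In XSLT 1.0, attributes must be added to an element before any of its children, and the relative order in which a processor visits attribute nodes is implementation-dependent. If a converted attribute is visited first, its new child element is written before the remaining attributes are copied, and the processor may report an error or silently drop those attributes. The following is a sketch of one way to avoid this, assuming (hypothetically) that firstname and lastname should stay attributes while age and height become elements. It overrides the copy of person so that the kept attributes are always processed first:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:import href="copy.xslt"/>

  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- Copy person, emitting the attributes that remain attributes first -->
  <xsl:template match="person">
    <xsl:copy>
      <!-- Kept as attributes (copied by the imported template) -->
      <xsl:apply-templates select="@firstname | @lastname"/>
      <!-- Converted to child elements by the template below -->
      <xsl:apply-templates select="@age | @height"/>
      <!-- Any existing children follow -->
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Convert only the selected attributes to elements -->
  <xsl:template match="person/@age | person/@height">
    <xsl:element name="{local-name(.)}" namespace="{namespace-uri(..)}">
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the input in Example 6-2, each person should keep its firstname and lastname attributes and gain age and height child elements. Example 6-3 does not need this precaution because it converts every attribute of person.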
This section and Recipe 6.2 address the problems that arise when a document designer makes a poor choice between encoding information in attributes versus elements. The attribute-versus-element decision is one of the most controversial aspects of document design.[9] These examples are helpful because they allow you to correct your own or others’ (perceived) mistakes.
[9] The only other stylistic issue I have seen software developers get more passionate about is where to put the curly braces in C-like programming languages (e.g., C++ and Java).