You have a poorly designed document that could use extra structure.[13]
This is the opposite problem from that solved in Recipe 6.7. Here you need to add additional structure to a document, possibly to organize its elements by some additional criteria.
This deepening transformation undoes the flattening transformation performed in Recipe 6.7:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:import href="copy.xslt"/>
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="people">
    <!-- Keep the people root so the result has a single document element -->
    <xsl:copy>
      <union>
        <xsl:apply-templates select="person[@class = 'union']"/>
      </union>
      <salaried>
        <xsl:apply-templates select="person[@class = 'salaried']"/>
      </salaried>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
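The stylesheet imports copy.xslt, which is not shown in this recipe. A minimal sketch, assuming it is the conventional identity transform that copies every node and attribute unchanged unless a more specific template overrides it:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumed contents of copy.xslt: the identity transform.
       Every node and attribute is copied as-is; importing stylesheets
       override this behavior with their own, higher-precedence templates. -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Because imported templates have lower import precedence, the templates in the importing stylesheet win wherever both match the same node.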
In a misguided effort to streamline XML, some people attempt to encode information by inserting sibling elements rather than parent elements.[14]
For example, suppose someone distinguished between union and salaried employees in the following way:
<people>
  <class name="union"/>
  <person>
    <firstname>Warren</firstname>
    <lastname>Rosenbaum</lastname>
    <age>37</age>
    <height>5.75</height>
  </person>
  ...
  <person>
    <firstname>Theresa</firstname>
    <lastname>Archul</lastname>
    <age>37</age>
    <height>5.5</height>
  </person>
  <class name="salaried"/>
  <person>
    <firstname>Sal</firstname>
    <lastname>Mangano</lastname>
    <age>37</age>
    <height>5.75</height>
  </person>
  ...
  <person>
    <firstname>James</firstname>
    <lastname>O'Riely</lastname>
    <age>33</age>
    <height>5.5</height>
  </person>
</people>
Notice that the class elements signifying union and salaried employees are now empty. The intent is that all following siblings of a class element belong to that class until another class element is encountered or there are no more siblings. This type of encoding is easy to grasp but more difficult for an XSLT program to process. To correct this representation, you need to create a stylesheet that computes the set difference between all person elements following the current class element and the person elements following the next class element. XSLT 1.0 does not have an explicit set difference function. You can get essentially the same effect, and be more efficient, by selecting the person elements following a class element whose position is no greater than the number of person elements that precede the next class element:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:import href="copy.xslt"/>
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="class">
    <!-- The last position we want to consider: the number of people that
         follow this class but precede the next class, if there is one. -->
    <xsl:variable name="pos"
      select="count(following-sibling::person) - count(following-sibling::class/following-sibling::person)"/>
    <xsl:element name="{@name}">
      <!-- Copy people that follow this class but whose position is
           less than or equal to $pos. -->
      <xsl:copy-of select="following-sibling::person[position() &lt;= $pos]"/>
    </xsl:element>
  </xsl:template>
  <!-- Ignore person elements. They were copied above. -->
  <xsl:template match="person"/>
</xsl:stylesheet>
More subtly, a key can be used as follows:
<xsl:key name="people" match="person" use="preceding-sibling::class[1]/@name"/>
<xsl:template match="people">
  <people>
    <xsl:apply-templates select="class"/>
  </people>
</xsl:template>
<xsl:template match="class">
  <xsl:element name="{@name}">
    <xsl:copy-of select="key('people', @name)"/>
  </xsl:element>
</xsl:template>
A step-by-step approach is another alternative:
<xsl:template match="people">
  <people>
    <xsl:apply-templates select="class[1]"/>
  </people>
</xsl:template>
<xsl:template match="class">
  <xsl:element name="{@name}">
    <xsl:apply-templates select="following-sibling::*[1][self::person]"/>
  </xsl:element>
  <xsl:apply-templates select="following-sibling::class[1]"/>
</xsl:template>
<xsl:template match="person">
  <xsl:copy-of select="."/>
  <xsl:apply-templates select="following-sibling::*[1][self::person]"/>
</xsl:template>
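If an XSLT 2.0 processor is available, the same regrouping can be stated more directly. The following is only a sketch under that assumption, and it also assumes that the first child of people is a class element, so that every group starts with one:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="people">
    <people>
      <!-- Start a new group at each class element. Inside the loop, the
           context item is the first item of the group, that is, the class. -->
      <xsl:for-each-group select="*" group-starting-with="class">
        <xsl:element name="{@name}">
          <!-- Copy only the person members of the group, not the class marker -->
          <xsl:copy-of select="current-group()[self::person]"/>
        </xsl:element>
      </xsl:for-each-group>
    </people>
  </xsl:template>
</xsl:stylesheet>
```

This version needs neither position arithmetic nor recursion along the sibling axis, but it will not run on an XSLT 1.0 processor.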
When you added structure based on existing data, you explicitly referred to the criteria that formed the categories of interest (e.g., union and salaried). It would be better if the stylesheet figured these categories out by itself. This makes the stylesheet more generic at the cost of added complexity:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:import href="copy.xslt"/>
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <!-- Build a unique list of all classes -->
  <xsl:variable name="classes"
    select="/*/*/@class[not(. = ../preceding-sibling::*/@class)]"/>
  <xsl:template match="/*">
    <!-- Keep the root so the result has a single document element -->
    <xsl:copy>
      <!-- For each class, create an element named after that class
           that contains the elements of that class -->
      <xsl:for-each select="$classes">
        <xsl:variable name="class-name" select="."/>
        <xsl:element name="{$class-name}">
          <xsl:for-each select="/*/*[@class=$class-name]">
            <!-- Copy the element and its children but not its attributes,
                 so the class attribute is dropped -->
            <xsl:copy>
              <xsl:apply-templates/>
            </xsl:copy>
          </xsl:for-each>
        </xsl:element>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
Although not 100% generic, this stylesheet avoids making assumptions about what kinds of classes exist in the document. The only application-specific information in this stylesheet is the fact that the categories are encoded in an attribute, @class, and that the attribute occurs in elements that are two levels down from the root.
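The unique list of classes can also be built with a key rather than by scanning preceding siblings, a technique commonly called Muenchian grouping. The following is a sketch, not the recipe's own method; the key name by-class is a hypothetical choice, and the stylesheet relies on the same assumption that class values are legal element names:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <!-- Index the children of the document element by their class attribute.
       The key name "by-class" is hypothetical. -->
  <xsl:key name="by-class" match="/*/*[@class]" use="@class"/>
  <xsl:template match="/*">
    <xsl:copy>
      <!-- Visit only the first element of each class: the one whose
           identity equals that of the first node the key returns -->
      <xsl:for-each select="*[@class][generate-id() = generate-id(key('by-class', @class)[1])]">
        <xsl:element name="{@class}">
          <xsl:for-each select="key('by-class', @class)">
            <!-- Copy the element and its children, dropping its attributes -->
            <xsl:copy>
              <xsl:copy-of select="node()"/>
            </xsl:copy>
          </xsl:for-each>
        </xsl:element>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The key lets the processor look up each group's members directly instead of comparing every element against all of its preceding siblings, which usually matters on large documents.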
The sibling-encoded document can also be restructured explicitly in terms of set difference. This solution is elegant but impractical for large documents with many categories. The trick used here for computing set difference is explained in Recipe 7.1:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:import href="copy.xslt"/>
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="class">
    <!-- All people following this class element -->
    <xsl:variable name="nodes1" select="following-sibling::person"/>
    <!-- All people following the next class element -->
    <xsl:variable name="nodes2"
      select="following-sibling::class/following-sibling::person"/>
    <xsl:element name="{@name}">
      <!-- Keep the nodes of $nodes1 that are not members of $nodes2 -->
      <xsl:copy-of select="$nodes1[count(. | $nodes2) != count($nodes2)]"/>
    </xsl:element>
  </xsl:template>
  <xsl:template match="person"/>
</xsl:stylesheet>
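For comparison, XPath 2.0 provides set difference directly through the except operator. A minimal sketch, assuming an XSLT 2.0 processor and the same sibling-encoded input:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="people">
    <xsl:copy>
      <xsl:apply-templates select="class"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="class">
    <xsl:element name="{@name}">
      <!-- People following this class, minus people following any later class -->
      <xsl:copy-of select="following-sibling::person
                           except following-sibling::class/following-sibling::person"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

The expression says the same thing as the count-based predicate, but the intent is visible in the operator itself.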