You want to know how to exploit some of the useful extensions available in these popular XSLT implementations.
This recipe is broken into a series of mini-recipes showcasing the most important Saxon and Xalan extensions. For all examples, the saxon namespace prefix is associated with http://icl.com/saxon, and the xalan namespace prefix is associated with http://xml.apache.org/xslt.
This book has used Saxon’s facility for writing results to more than one file several times. Saxon uses the saxon:output element for this purpose. It also provides the xsl:document element, but that element works only if the stylesheet’s version attribute is 1.1, so it is not preferred. The href attribute specifies the output destination and can be an attribute value template:
<saxon:output href="toc.html">
  <html>
    <head><title>Table of Contents</title></head>
    <body>
      <xsl:apply-templates mode="toc" select="*"/>
    </body>
  </html>
</saxon:output>
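Because href can be an attribute value template, the destination can be computed per node. The following is a minimal sketch, assuming a hypothetical input in which a book element contains chapter elements, each with a unique id attribute and a title child; none of these names come from the recipe itself.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon"
    extension-element-prefixes="saxon">

  <!-- Assumption: <book> contains <chapter id="..."> elements,
       each with a <title> child. -->
  <xsl:template match="book">
    <xsl:for-each select="chapter">
      <!-- The braces make href an attribute value template, so each
           chapter goes to its own file (for example, ch01.html). -->
      <saxon:output href="{@id}.html" method="html">
        <html>
          <head><title><xsl:value-of select="title"/></title></head>
          <body>
            <xsl:apply-templates/>
          </body>
        </html>
      </saxon:output>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The saxon prefix must be listed in extension-element-prefixes; otherwise, the processor treats saxon:output as a literal result element.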
Xalan takes a significantly different approach to multidestination output. Rather than one instruction, Xalan gives you three: redirect:open, redirect:write, and redirect:close. The extension namespace associated with these elements is xmlns:redirect="org.apache.xalan.xslt.extensions.Redirect". For the most common cases, you can get away with using redirect:write by itself because, when used alone, it opens, writes, and closes the file.
Each element includes a file attribute and/or a select attribute to designate the output file. The file attribute takes a string, so you can use it to specify the output filename directly. The select attribute takes an XPath expression, so you can use it to generate the output filename dynamically. If you include both attributes, the redirect extension first evaluates the select attribute and falls back to the file attribute if the select expression does not return a valid filename:
<redirect:write file="toc.html">
  <html>
    <head><title>Table of Contents</title></head>
    <body>
      <xsl:apply-templates mode="toc" select="*"/>
    </body>
  </html>
</redirect:write>
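The fallback from select to file can be sketched as follows. This is an illustration only: the chapter element and its out attribute are hypothetical names chosen for the example. When a chapter has no out attribute, the select expression yields an empty string and the file attribute is used instead.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:redirect="org.apache.xalan.xslt.extensions.Redirect"
    extension-element-prefixes="redirect">

  <!-- Assumption: <chapter> may carry an optional out attribute
       that names its output file. -->
  <xsl:template match="chapter">
    <!-- select is tried first; file is the fallback destination. -->
    <redirect:write select="@out" file="chapter.html">
      <html>
        <body>
          <xsl:apply-templates/>
        </body>
      </html>
    </redirect:write>
  </xsl:template>

</xsl:stylesheet>
```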
By using Xalan’s extended capabilities, you can switch from writing a primary output file to other secondary files while the primary remains open. This step undermines the no-side-effects nature of XSLT, but presumably, Xalan will ensure a predictable operation:
<xsl:template match="doc">
  <redirect:open file="regular.xml"/>
  <xsl:apply-templates select="*"/>
  <redirect:close file="regular.xml"/>
</xsl:template>

<xsl:template match="regular">
  <redirect:write file="regular.xml">
    <xsl:copy-of select="."/>
  </redirect:write>
</xsl:template>

<xsl:template match="*">
  <xsl:variable name="file" select="concat(local-name( ),'.xml')"/>
  <redirect:write select="$file">
    <xsl:copy-of select="."/>
  </redirect:write>
</xsl:template>
XSLT 2.0 provides native support for multiple result destinations via a new element called xsl:result-document:
<xsl:result-document format="html" href="toc.html">
  <html>
    <head><title>Table of Contents</title></head>
    <body>
      <xsl:apply-templates mode="toc" select="*"/>
    </body>
  </html>
</xsl:result-document>
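In XSLT 2.0, the format attribute refers to a named xsl:output declaration, so a complete stylesheet needs one with a matching name. The sketch below assumes the same hypothetical book and chapter input used for illustration elsewhere; only xsl:result-document and the named xsl:output come from the XSLT 2.0 specification.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Named output definition referenced by format="html" below. -->
  <xsl:output name="html" method="html" indent="yes"/>

  <!-- Assumption: <book> contains <chapter id="..."> elements. -->
  <xsl:template match="book">
    <xsl:for-each select="chapter">
      <!-- href is an attribute value template, as with saxon:output. -->
      <xsl:result-document format="html" href="{@id}.html">
        <html>
          <head><title><xsl:value-of select="title"/></title></head>
          <body>
            <xsl:apply-templates/>
          </body>
        </html>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```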
Developers who have worked a lot with Unix are intimately familiar with the notion of a processing pipeline, in which the output of a command is fed into the input of another. This facility is also available in other operating systems, such as Windows. The genius of the pipelining approach to software development is that it enables the assembly of complex tasks from more basic commands.
Since an XSLT transformation is ultimately a tree-to-tree transformation, applying the pipelining approach is natural. Here the result tree of one transform becomes the input tree of the next. You have seen numerous examples in which the node-set extension function can create intermediate results that can be processed by subsequent stages. Alternatively, Saxon provides this functionality via the saxon:next-in-chain extension attribute of xsl:output. The saxon:next-in-chain attribute directs the output to another stylesheet. The value is the URL of a stylesheet that should be used to process the output stream. The output stream must always be pure XML, and attributes that control the output’s format (e.g., method and cdata-section-elements) have no effect. The second stylesheet’s output is directed to the destination that would have been used for the first stylesheet if no saxon:next-in-chain attribute were present.
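As a rough illustration of chaining, the first stage below is an identity transform that drops note elements and then hands its result to a second stylesheet. The note element and the contents.xslt filename are assumptions made for the example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon">

  <!-- Pass this stylesheet's result tree to contents.xslt
       (a hypothetical second-stage stylesheet). -->
  <xsl:output saxon:next-in-chain="contents.xslt"/>

  <!-- Identity transform: copy everything by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop <note> elements (hypothetical) before the next stage sees them. -->
  <xsl:template match="note"/>

</xsl:stylesheet>
```

The trade-off is that the first stylesheet now names its successor, so it can no longer be reused unchanged outside the chain.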
Xalan has a different approach to this functionality; it uses a pipeDocument extension element. The nice thing about pipeDocument is that you can use it in an otherwise empty stylesheet to create a pipeline between independent stylesheets that do not know they are used in this way. The Xalan implementation is therefore much more like the Unix pipe because the pipeline is not hardcoded into the participating stylesheets. Imagine that a stylesheet called strip.xslt stripped out specific elements from an XML document representing a book, and a stylesheet called contents.xslt created a table of contents based on the hierarchical structure of the document’s markup. You could create a pipeline between the stylesheets as follows:
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:pipe="xalan://PipeDocument"
    extension-element-prefixes="pipe">

  <xsl:param name="source"/>
  <xsl:param name="target"/>
  <!-- A list of elements to preserve. All others are stripped. -->
  <xsl:param name="preserve-elems"/>

  <pipe:pipeDocument source="{$source}" target="{$target}">
    <stylesheet href="strip.xslt">
      <param name="preserve-elems" value="{$preserve-elems}"/>
    </stylesheet>
    <stylesheet href="contents.xslt"/>
  </pipe:pipeDocument>

</xsl:stylesheet>
This code would create a table of contents based on the specified elements without disabling the independent use of strip.xslt or contents.xslt.
Chapter 3 provided a host of recipes dealing with dates and times but no pure XSLT facility that could determine the current date and time. Both Saxon and Xalan implement core functions from the EXSLT dates and times module. This section includes EXSLT’s date-and-time documentation for easy reference. The functions are shown in Table 12-1 with their return type, followed by the function and arguments. A question mark (?) indicates optional arguments.
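Table 12-1. EXSLT’s date-and-time functions

| Function | Behavior |
|---|---|
| string date:date-time( ) | The date:date-time function returns the current date and time as a date/time string. |
| string date:date(string?) | The date:date function returns the date portion of a date/time string. |
| string date:time(string?) | The date:time function returns the time portion of a date/time string. |
| number date:year(string?) | The date:year function returns the year of a date as a number. |
| boolean date:leap-year(string?) | The date:leap-year function returns true if the year of the date is a leap year. |
| number date:month-in-year(string?) | The date:month-in-year function returns the month of a date as a number (1 through 12). |
| string date:month-name(string?) | The date:month-name function returns the full English name of the month of a date. |
| string date:month-abbreviation(string?) | The date:month-abbreviation function returns the three-letter English abbreviation of the month of a date. |
| number date:week-in-year(string?) | The date:week-in-year function returns the week of the year as a number. |
| number date:day-in-year(string?) | The date:day-in-year function returns the day of a date within its year as a number. |
| number date:day-in-month(string?) | The date:day-in-month function returns the day of a date within its month as a number. |
| number date:day-of-week-in-month(string?) | The date:day-of-week-in-month function returns the ordinal of the day of the week within the month (for example, 3 for the third Tuesday). |
| number date:day-in-week(string?) | The date:day-in-week function returns the day of the week as a number, starting with 1 for Sunday. |
| string date:day-name(string?) | The date:day-name function returns the full English name of the day of the week of a date. |
| string date:day-abbreviation(string?) | The date:day-abbreviation function returns the three-letter English abbreviation of the day of the week of a date. |
| number date:hour-in-day(string?) | The date:hour-in-day function returns the hour of the day as a number. |
| number date:minute-in-hour(string?) | The date:minute-in-hour function returns the minute of the hour as a number. |
| number date:second-in-minute(string?) | The date:second-in-minute function returns the second of the minute as a number. |

When the optional string argument is omitted, each function operates on the current date and time.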
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:date="http://exslt.org/dates-and-times">

  <xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <html>
      <head><title>My Dull Home Page</title></head>
      <body>
        <h1>My Dull Homepage</h1>
        <div>It's <xsl:value-of select="date:time( )"/> on <xsl:value-of select="date:date( )"/> and this page is as dull as it was yesterday.</div>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
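The functions that take an optional string can also decompose a date you supply rather than the current one. The sketch below uses a fixed sample date, which is an assumption made purely for illustration; in practice the value would usually come from the input document.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:date="http://exslt.org/dates-and-times">

  <xsl:output method="text"/>

  <!-- Assumption: a hard-coded sample date in xs:date format. -->
  <xsl:variable name="d" select="'2003-02-01'"/>

  <xsl:template match="/">
    <xsl:text>year: </xsl:text>
    <xsl:value-of select="date:year($d)"/>
    <xsl:text>&#10;month: </xsl:text>
    <xsl:value-of select="date:month-name($d)"/>
    <xsl:text>&#10;day of year: </xsl:text>
    <xsl:value-of select="date:day-in-year($d)"/>
    <xsl:text>&#10;leap year: </xsl:text>
    <xsl:value-of select="date:leap-year($d)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

For this sample date, the stylesheet should report the year 2003, the month February, day 32 of the year, and false for the leap-year test.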
Chapter 7 investigated various means of implementing set operations other than set union, which XPath supplies natively via the union operator (|). These solutions were not necessarily the most efficient or obvious.
Both Saxon and Xalan remedy this problem by implementing many of the set operations defined by EXSLT’s set module (see Table 12-2).
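Table 12-2. EXSLT’s set module’s set operations

| Function | Behavior |
|---|---|
| node-set set:difference(node-set, node-set) | The set:difference function returns the nodes that are in the first node set but not in the second. |
| node-set set:intersection(node-set, node-set) | The set:intersection function returns the nodes that are in both node sets. |
| node-set set:distinct(node-set) | The set:distinct function returns the subset of nodes whose string values are unique, keeping the first node in document order for each value. |
| boolean set:has-same-node(node-set, node-set) | The set:has-same-node function returns true if the two node sets have at least one node in common. |
| node-set set:leading(node-set, node-set) | The set:leading function returns the nodes in the first node set that precede, in document order, the first node in the second node set. |
| node-set set:trailing(node-set, node-set) | The set:trailing function returns the nodes in the first node set that follow, in document order, the first node in the second node set. |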
set:distinct is a convenient way to remove duplicates, as long as equality is defined as string-value equality:
<xsl:variable name="firstNames" select="set:distinct(person/firstname)"/>
set:leading and set:trailing can extract nodes bracketed by other nodes. For example, Recipe 10.9 used a complex expression to locate the xslx:elsif and xslx:else nodes that went with your enhanced xslx:if. Extensions can simplify this process:
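<xsl:apply-templates select="set:leading(following-sibling::*, following-sibling::xslx:if)[self::xslx:else or self::xslx:elsif]"/>
This code specifies that you select all xslx:else and xslx:elsif siblings that come after the current node, but before the next xslx:if. The first argument includes every following sibling because, as EXSLT defines it, set:leading returns an empty node set when the first node of its second argument is not a member of its first argument; the predicate then keeps only the xslx:else and xslx:elsif elements.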
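The other set functions follow the same calling pattern. Here is a small sketch, assuming a hypothetical people document whose person elements carry dept and manager attributes; those names are invented for the example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:set="http://exslt.org/sets"
    exclude-result-prefixes="set">

  <!-- Assumption: <people> contains <person dept="..." manager="yes|no">. -->
  <xsl:template match="people">
    <xsl:variable name="engineers" select="person[@dept = 'eng']"/>
    <xsl:variable name="managers" select="person[@manager = 'yes']"/>
    <report>
      <!-- Nodes present in both sets. -->
      <eng-managers>
        <xsl:copy-of select="set:intersection($engineers, $managers)"/>
      </eng-managers>
      <!-- Nodes in the first set that are not in the second. -->
      <eng-non-managers>
        <xsl:copy-of select="set:difference($engineers, $managers)"/>
      </eng-non-managers>
      <!-- true if the two sets share at least one node. -->
      <overlap>
        <xsl:value-of select="set:has-same-node($engineers, $managers)"/>
      </overlap>
    </report>
  </xsl:template>

</xsl:stylesheet>
```

Note that set:difference and set:intersection compare node identity, whereas set:distinct compares string values.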
Xalan provides functions that allow you to get information about the location of nodes in the source tree. Saxon 6.5.2 provides only saxon:systemId and saxon:lineNumber. Debugging is one application of these functions. To use the functions, set the TransformerFactory source_location attribute to true with either the command-line utility -L flag or the TransformerFactory.setAttribute( ) method.
systemId( ), systemId(node-set)
Returns the system ID of the current node or of the first node in the node set, respectively.
lineNumber( ), lineNumber(node-set)
Returns the line number in the source document of the current node or of the first node in the node set, respectively. This function returns -1 if the line number is unknown (for example, when the source is a DOM Document).
columnNumber( ), columnNumber(node-set)
Returns the column number in the source document of the current node or of the first node in the node set, respectively. This function returns -1 if the column number is unknown (for example, when the source is a DOM Document):
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xalan="http://xml.apache.org/xslt"
    xmlns:info="xalan://org.apache.xalan.lib.NodeInfo">

  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="foo">
    <xsl:comment>Matched a foo on line <xsl:value-of select="info:lineNumber( )"/> and column <xsl:value-of select="info:columnNumber( )"/>.</xsl:comment>
    <!-- ... -->
  </xsl:template>

</xsl:stylesheet>
Interfacing XSLT to a relational database opens up a whole new world of possibilities. Both Saxon and Xalan have extensions to support SQL. If you write stylesheets that modify databases, you violate the XSLT no-side-effects rule.
Michael Kay has this to say about Saxon’s SQL extensions, “These are not intended as being necessarily a production-quality piece of software (there are many limitations in the design), but more as an illustration of how extension elements can be used to enhance the capability of the processor.”
Saxon provides database interaction via five extension elements: sql:connect, sql:query, sql:insert, sql:column, and sql:close. Anyone who has ever interacted with a relational database through ODBC or JDBC should feel comfortable using these elements.
<sql:connect driver="jdbc-driver" database="db name" user="user name" password="user password"/>
Creates a database connection. Each attribute can be an attribute
value template. The driver
attribute names the
JDBC driver class, and the database
must be a name
that JDBC can associate with an actual database.
<sql:query table="the table" column="column names" where="where clause" row-tag="row element name" column-tag="column element name" disable-output-escaping="yes or no"/>
Performs a query and writes the results to the output tree using elements to represent the rows and columns. The names of these elements are specified by row-tag and column-tag, respectively. The column attribute can contain a list of columns or use * for all.
<sql:insert table="table name">
Performs an SQL INSERT. The child elements
(sql:column
) specify the data to be added to the
table.
<sql:column name="col name" select="xpath expr"/>
Used as a child of sql:insert. The value can be specified by the select attribute or by the evaluation of the sql:column’s child elements. However, in both cases only the string value can be used. Hence, there is no way to deal with other standard SQL data types.
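Put together, these elements might be used along the following lines. Treat this as a sketch under several assumptions: the sql namespace URI shown is the binding I believe Saxon 6.5.x uses for its SQL element factory, so check it against your release; the driver and database values are placeholders; and the booklist, book, title, and author names are hypothetical.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:sql="http://icl.com/saxon/extensions/com.icl.saxon.sql.SQLElementFactory"
    extension-element-prefixes="sql">

  <!-- Placeholder connection details; supply real values as parameters. -->
  <xsl:param name="driver" select="'sun.jdbc.odbc.JdbcOdbcDriver'"/>
  <xsl:param name="database" select="'jdbc:odbc:test'"/>
  <xsl:param name="user"/>
  <xsl:param name="password"/>

  <!-- Assumption: <booklist> contains <book> elements with
       <title> and <author> children. -->
  <xsl:template match="booklist">
    <sql:connect driver="{$driver}" database="{$database}"
                 user="{$user}" password="{$password}">
      <!-- Standard XSLT fallback if the extension is not installed. -->
      <xsl:fallback>
        <xsl:message terminate="yes">SQL extensions are not available</xsl:message>
      </xsl:fallback>
    </sql:connect>

    <!-- One INSERT per book; only string values can be stored. -->
    <xsl:for-each select="book">
      <sql:insert table="book">
        <sql:column name="title" select="title"/>
        <sql:column name="author" select="author"/>
      </sql:insert>
    </xsl:for-each>

    <sql:close/>
  </xsl:template>

</xsl:stylesheet>
```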
Xalan’s SQL support is richer than Saxon’s. This chapter covers only the basics. The “See Also” section provides pointers to more details. Unlike Saxon, Xalan uses extension functions that provide relational database access.
sql:new(driver, db, user, password)
Establishes a connection.
sql:new(nodelist)
Sets up a connection using information embedded as XML in the input document or stylesheet. For example:
<DBINFO>
  <dbdriver>org.enhydra.instantdb.jdbc.idbDriver</dbdriver>
  <dburl>jdbc:idb:../../instantdb/sample.prp</dburl>
  <user>jbloe</user>
  <password>geron07moe</password>
</DBINFO>

<xsl:param name="cinfo" select="//DBINFO"/>
<xsl:variable name="db" select="sql:new($cinfo)"/>
query(xconObj, sql-query)
Queries the database. The xconObj is returned by new( ). The function returns a streamable result set in the form of a row-set node. You can work your way through the row set one row at a time. The same row element is used repeatedly, so you can begin transforming the row set before the entire result set is returned.
pquery(xconObj,sql-query-with-params)
addParameter(xconObj, paramValue)
addParameterFromElement(xconObj,element)
addParameterFromElement(xconObj,node-list)
clearParameters(xconObj)
Used together to implement parameterized queries. Parameters take the form of ? characters embedded in the query. The various addParameter( ) functions set these parameters with actual values before the query is executed. Use clearParameters( ) to make the connection object forget about prior values.
close(xconObj)
Closes the connection to the database.
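A parameterized query might be wired up roughly as follows. This is a sketch, not a tested program: the sql namespace binding is the one I recall from the Xalan samples and should be checked against your Xalan version, and the book table, its columns, and the stylesheet parameters are all hypothetical.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:sql="org.apache.xalan.lib.sql.XConnection"
    extension-element-prefixes="sql">

  <!-- Hypothetical parameters supplied by the caller. -->
  <xsl:param name="driver"/>
  <xsl:param name="dburl"/>
  <xsl:param name="user"/>
  <xsl:param name="password"/>
  <xsl:param name="author" select="'Kay'"/>

  <xsl:template match="/">
    <xsl:variable name="db"
        select="sql:new($driver, $dburl, $user, $password)"/>

    <!-- Bind a value to the single ? placeholder in the query below. -->
    <xsl:value-of select="sql:addParameter($db, $author)"/>
    <xsl:variable name="table"
        select="sql:pquery($db, 'SELECT title FROM book WHERE author = ?')"/>

    <titles>
      <!-- Each row holds one col element per selected column. -->
      <xsl:for-each select="$table//row">
        <title><xsl:value-of select="col[1]"/></title>
      </xsl:for-each>
    </titles>

    <!-- Forget the bound value, then release the connection. -->
    <xsl:value-of select="sql:clearParameters($db)"/>
    <xsl:value-of select="sql:close($db)"/>
  </xsl:template>

</xsl:stylesheet>
```

Binding values with addParameter( ) rather than concatenating them into the SQL string avoids quoting problems when the value contains apostrophes.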
The query( ) and pquery( ) extension functions return a Document node that contains (as needed) an array of column-header elements, a single row element that is used repeatedly, and an array of col elements. Each column-header element (one per column in the row set) contains an attribute (ColumnAttribute) for each column descriptor in the ResultSetMetaData object. Each col element contains a text node with a textual representation of the value for that column in the current row.
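The structure just described can be walked like any other node tree. The following sketch renders a result set as an HTML table. It assumes the same namespace binding as the Xalan samples, a hypothetical book table, and that column-label is among the attributes Xalan places on each column-header element; the descendant axis is used so that the paths do not depend on the exact wrapper elements of a given Xalan version.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:sql="org.apache.xalan.lib.sql.XConnection"
    extension-element-prefixes="sql">

  <xsl:output method="html" indent="yes"/>

  <!-- Hypothetical connection parameters supplied by the caller. -->
  <xsl:param name="driver"/>
  <xsl:param name="dburl"/>
  <xsl:param name="user"/>
  <xsl:param name="password"/>

  <xsl:template match="/">
    <xsl:variable name="db"
        select="sql:new($driver, $dburl, $user, $password)"/>
    <xsl:variable name="table"
        select="sql:query($db, 'SELECT title, author FROM book')"/>
    <table>
      <tr>
        <!-- One header cell per column-header element. -->
        <xsl:for-each select="$table//column-header">
          <th><xsl:value-of select="@column-label"/></th>
        </xsl:for-each>
      </tr>
      <xsl:apply-templates select="$table//row"/>
    </table>
    <xsl:value-of select="sql:close($db)"/>
  </xsl:template>

  <!-- One table row per result row, one cell per col element. -->
  <xsl:template match="row">
    <tr>
      <xsl:for-each select="col">
        <td><xsl:value-of select="."/></td>
      </xsl:for-each>
    </tr>
  </xsl:template>

</xsl:stylesheet>
```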
You can find more information on using XSLT to access relational data in Doug Tidwell’s XSLT (O’Reilly, 2001).
Saxon and Xalan have a very powerful extension function called evaluate that takes a string and evaluates it as an XPath expression. Such a feature was under consideration for XSLT 2.0, but the XSLT 2.0 working group has decided not to pursue it at this time. Their justification is that dynamic evaluation “...has significant implications on the runtime architecture of the processor, as well as the ability to do static optimization.”
Dynamic capabilities can come in handy when creating a table-driven stylesheet. The following stylesheet can format information on people into a table, but you can customize it to handle an almost infinite variety of XML formats simply by altering entries in a table:
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon"
    xmlns:paths="http://www.ora.com/XSLTCookbook/NS/paths"
    exclude-result-prefixes="paths">

  <xsl:output method="html"/>

  <!-- This parameter is used to specify a document containing a table that -->
  <!-- specifies how to locate info on people -->
  <xsl:param name="pathsDoc"/>

  <xsl:template match="/">
    <html>
      <head>
        <title>People</title>
      </head>
      <body>
        <!-- We load an XPath expression out of a table -->
        <!-- in an external document. -->
        <xsl:variable name="peoplePath"
            select="document($pathsDoc)/*/paths:path[@type='people']/@xpath"/>
        <table>
          <tbody>
            <tr>
              <th>First</th>
              <th>Last</th>
            </tr>
            <!-- Dynamically evaluate the xpath that locates information on -->
            <!-- each person -->
            <xsl:for-each select="saxon:evaluate($peoplePath)">
              <xsl:call-template name="process-person"/>
            </xsl:for-each>
          </tbody>
        </table>
      </body>
    </html>
  </xsl:template>

  <xsl:template name="process-person">
    <xsl:variable name="firstnamePath"
        select="document($pathsDoc)/*/paths:path[@type='first']/@xpath"/>
    <xsl:variable name="lastnamePath"
        select="document($pathsDoc)/*/paths:path[@type='last']/@xpath"/>
    <tr>
      <!-- Dynamically evaluate the xpath that locates the person -->
      <!-- specific info we want to process -->
      <td><xsl:value-of select="saxon:evaluate($firstnamePath)"/></td>
      <td><xsl:value-of select="saxon:evaluate($lastnamePath)"/></td>
    </tr>
  </xsl:template>

</xsl:stylesheet>
You can use this table to process person data encoded as elements:
<paths:paths xmlns:paths="http://www.ora.com/XSLTCookbook/NS/paths">
  <paths:path type="people" xpath="people/person"/>
  <paths:path type="first" xpath="first"/>
  <paths:path type="last" xpath="last"/>
</paths:paths>
Use this table instead to process person data encoded as attributes:
<paths:paths xmlns:paths="http://www.ora.com/XSLTCookbook/NS/paths">
  <paths:path type="people" xpath="people/person"/>
  <paths:path type="first" xpath="@first"/>
  <paths:path type="last" xpath="@last"/>
</paths:paths>
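To make the two encodings concrete, here are sample input documents that the respective tables would match. The names and values are invented for illustration. With the element-oriented table, the paths people/person, first, and last locate data such as:

```xml
<!-- Person data encoded as elements (hypothetical sample). -->
<people>
  <person>
    <first>Grace</first>
    <last>Hopper</last>
  </person>
  <person>
    <first>Alan</first>
    <last>Turing</last>
  </person>
</people>
```

With the attribute-oriented table, the paths people/person, @first, and @last locate the same information in this form:

```xml
<!-- Person data encoded as attributes (hypothetical sample). -->
<people>
  <person first="Grace" last="Hopper"/>
  <person first="Alan" last="Turing"/>
</people>
```

In both cases the stylesheet is unchanged; only the document named by the pathsDoc parameter differs.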
Almost any book you read
on
XSLT will describe the inability to change the value of variables and
parameters once they are bound as a feature of XSLT rather than a
defect. This is true because it prevents a certain class of bugs,
makes stylesheets easier to understand, and enables certain
performance optimizations. However, sometimes being unable to change
the values is simply inconvenient. Saxon provides a way around this
obstacle with its saxon:assign
extension element.
You can use saxon:assign only on variables designated as assignable with the extension attribute saxon:assignable="yes":
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon"
    extension-element-prefixes="saxon">

  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:variable name="countFoo" select="0" saxon:assignable="yes"/>

  <xsl:template name="foo">
    <saxon:assign name="countFoo" select="$countFoo + 1"/>
    <xsl:comment>This is invocation number <xsl:value-of select="$countFoo"/> of template foo.</xsl:comment>
  </xsl:template>

  <!-- ... -->

</xsl:stylesheet>
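Another common use is an accumulator inside a loop. The sketch below keeps a running total over the items of an order; the order and item elements and the price attribute are hypothetical names used only for this example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon"
    extension-element-prefixes="saxon">

  <!-- Global accumulator, explicitly marked as assignable. -->
  <xsl:variable name="total" select="0" saxon:assignable="yes"/>

  <!-- Assumption: <order> contains <item price="..."> elements. -->
  <xsl:template match="order">
    <order>
      <xsl:for-each select="item">
        <!-- Update the accumulator before emitting each item. -->
        <saxon:assign name="total" select="$total + @price"/>
        <item price="{@price}" running-total="{$total}"/>
      </xsl:for-each>
    </order>
  </xsl:template>

</xsl:stylesheet>
```

A portable alternative computes the same value without assignment, using sum(preceding-sibling::item/@price) + @price, at the cost of rescanning the earlier siblings for every item.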
Many examples in this
book
are implemented as named templates
accessed via xsl:call-template
. Often, this
implementation is inconvenient and awkward because what you really
want is to access this code as first-class functions that can be
invoked as easily as native XPath functions. Help is on the way in
XSLT 2.0, but in the meantime, you might consider using an EXSLT
extension called func:function
that is implemented
by Saxon and the latest version of Xalan (Version 2.3.2). The
following code is a template from Chapter 2
reimplemented as a function:
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:func="http://exslt.org/functions"
    xmlns:str="http://www.ora.com/XSLTCookbook/namespaces/strings"
    extension-element-prefixes="func">

  <xsl:template match="/">
    <xsl:value-of select="str:substring-before-last('123456789a123456789a123', 'a')"/>
  </xsl:template>

  <func:function name="str:substring-before-last">
    <xsl:param name="input"/>
    <xsl:param name="substr"/>
    <func:result>
      <xsl:if test="$substr and contains($input, $substr)">
        <xsl:variable name="temp" select="substring-after($input, $substr)"/>
        <xsl:value-of select="substring-before($input, $substr)"/>
        <xsl:if test="contains($temp, $substr)">
          <xsl:value-of select="concat($substr, str:substring-before-last($temp, $substr))"/>
        </xsl:if>
      </xsl:if>
    </func:result>
  </func:function>

</xsl:stylesheet>
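When the result is a single XPath value, func:result can take a select attribute instead of content, which returns the value with its original type rather than as a result tree fragment. A minimal sketch follows; the my prefix and its namespace URI are made up for the example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:func="http://exslt.org/functions"
    xmlns:my="http://example.com/ns/my-functions"
    extension-element-prefixes="func"
    exclude-result-prefixes="my">

  <xsl:output method="text"/>

  <!-- Returns the square of its argument as a number. -->
  <func:function name="my:square">
    <xsl:param name="x" select="0"/>
    <func:result select="$x * $x"/>
  </func:function>

  <xsl:template match="/">
    <!-- Called like a native XPath function; the result here is 49. -->
    <xsl:value-of select="my:square(7)"/>
  </xsl:template>

</xsl:stylesheet>
```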
Using vendor-specific extensions is a double-edged sword. On the one hand, they can provide you with the ability to deliver an XSLT solution faster or more simply than you could if you constrained yourself to standard XSLT. In a few cases, they allow you to do things that are impossible with standard XSLT. On the other hand, they can lock you into an implementation whose future is uncertain.
EXSLT.org encourages implementers to adopt uniform conventions for the most popular extensions, so you should certainly prefer an EXSLT solution to a vendor-specific one if you have a choice.
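One way to limit lock-in is to test for an extension at runtime and supply a standard fallback. The sketch below uses the standard function-available( ) function to choose between set:distinct and a pure XSLT 1.0 expression; the people, person, and firstname names are assumptions, and the fallback presumes one firstname per person.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:set="http://exslt.org/sets"
    exclude-result-prefixes="set">

  <!-- Assumption: <people> contains <person> elements,
       each with a single <firstname> child. -->
  <xsl:template match="people">
    <names>
      <xsl:choose>
        <!-- Use the EXSLT function when the processor provides it. -->
        <xsl:when test="function-available('set:distinct')">
          <xsl:for-each select="set:distinct(person/firstname)">
            <name><xsl:value-of select="."/></name>
          </xsl:for-each>
        </xsl:when>
        <!-- Portable fallback: keep a firstname only if no earlier
             person has the same value. -->
        <xsl:otherwise>
          <xsl:for-each select="person/firstname[not(. = ../preceding-sibling::person/firstname)]">
            <name><xsl:value-of select="."/></name>
          </xsl:for-each>
        </xsl:otherwise>
      </xsl:choose>
    </names>
  </xsl:template>

</xsl:stylesheet>
```

For extension elements, the analogous tools are element-available( ) and xsl:fallback.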
Another tactic is to avoid vendor-specific implementations altogether in favor of your own custom implementation. In this way, you control the source and can port the extension to more than one processor, if necessary. Recipe 12.2, Recipe 12.3, and Recipe 12.4 address custom extensions.
This book has not covered all of the extensions available in Saxon and Xalan. Additional information and features of Saxon extensions can be found at http://saxon.sourceforge.net/saxon6.5.2/extensions.html or http://saxon.sourceforge.net/saxon7.2/extensions.html (the XSLT 2.0 beta version). Additional Xalan extension information can be found at http://xml.apache.org/xalan-j/extensionslib.html.