Prior sections considered how extensions provided by the XSLT implementers could be used to your advantage. This section develops your own extension elements from scratch. Unlike extension functions, creating extension elements requires much more intimacy with a particular processor’s implementation details. Because processor designs vary widely, much of the code will not be portable between processors.
This section begins with a simple extension that provides syntactic sugar rather than extended functionality. A common requirement in XSLT coding is to switch context to another node. Using an xsl:for-each is an idiomatic way of accomplishing this. The process is somewhat confusing because the intent is not to loop but to change context to the single node defined by the xsl:for-each's select:
    <xsl:for-each select="document('new.xml')">
      <!-- Process new document -->
    </xsl:for-each>
You will implement an extension element called xslx:set-context, which acts exactly like xsl:for-each, but only on the first node of the node set defined by the select (normally, you have only one node anyway).
Saxon requires an implementation of the com.icl.saxon.style.ExtensionElementFactory interface for all extension elements associated with a particular namespace. The factory is responsible for mapping each element's local name to the class that implements it. The second extension, named templtext, is covered later:
    package com.ora.xsltckbk;

    import com.icl.saxon.style.ExtensionElementFactory;
    import org.xml.sax.SAXException;

    public class CkBkElementFactory implements ExtensionElementFactory {
        public Class getExtensionClass(String localname) {
            if (localname.equals("set-context")) return CkBkSetContext.class;
            if (localname.equals("templtext")) return CkBkTemplText.class;
            return null;
        }
    }
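To make the factory's role concrete, here is a minimal, self-contained sketch of how a processor might consult such a factory and then instantiate the returned class by reflection. It does not use Saxon: ElementFactorySketch, SetContextStub, TemplTextStub, and FactoryLookupDemo are hypothetical stand-ins, and the reflective instantiation step is an assumption about what a processor does with the returned class.

```java
// Hypothetical stand-in for com.icl.saxon.style.ExtensionElementFactory.
interface ElementFactorySketch {
    Class<?> getExtensionClass(String localName);
}

// Hypothetical stand-ins for the real extension element classes.
class SetContextStub {}
class TemplTextStub {}

final class FactoryLookupDemo implements ElementFactorySketch {
    @Override
    public Class<?> getExtensionClass(String localName) {
        if (localName.equals("set-context")) return SetContextStub.class;
        if (localName.equals("templtext")) return TemplTextStub.class;
        return null; // unknown local name
    }

    /** Looks up the class for a local name and instantiates it, or returns null if unknown. */
    static Object instantiate(ElementFactorySketch factory, String localName) {
        Class<?> c = factory.getExtensionClass(localName);
        if (c == null) return null;
        try {
            return c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + c.getName(), e);
        }
    }

    public static void main(String[] args) {
        Object element = instantiate(new FactoryLookupDemo(), "set-context");
        System.out.println(element.getClass().getSimpleName());
    }
}
```

Returning null for an unrecognized name mirrors the factory above, which leaves the decision about how to report an unknown element to the caller.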
When using a stylesheet extension, you must use a namespace URI that ends in a / followed by the factory's fully qualified class name. The namespace prefix must also appear in the xsl:stylesheet's extension-element-prefixes attribute:
    <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:xslx="http://com.ora.xsltckbk.CkBkElementFactory"
      extension-element-prefixes="xslx">
      <xsl:template match="/">
        <xslx:set-context select="foo/bar">
          <xsl:value-of select="."/>
        </xslx:set-context>
      </xsl:template>
    </xsl:stylesheet>
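The naming rule can be checked in isolation. The following is a minimal sketch, not Saxon's actual code: it assumes the processor takes everything after the last / of the namespace URI as the factory class name. NamespaceToFactory and factoryClassName are hypothetical names used only for illustration.

```java
// Hypothetical helper; not part of Saxon.
final class NamespaceToFactory {
    /** Returns the text that follows the last '/' in the namespace URI. */
    static String factoryClassName(String namespaceUri) {
        int slash = namespaceUri.lastIndexOf('/');
        if (slash < 0 || slash == namespaceUri.length() - 1) {
            throw new IllegalArgumentException(
                "Namespace must end in '/' followed by a class name: " + namespaceUri);
        }
        return namespaceUri.substring(slash + 1);
    }

    public static void main(String[] args) {
        System.out.println(
            factoryClassName("http://com.ora.xsltckbk.CkBkElementFactory"));
    }
}
```

Under this assumption, the second slash of http:// already satisfies the rule, which is why the namespace in the stylesheet above needs no further path component.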
The set-context element implementation derives from com.icl.saxon.style.StyleElement and must implement prepareAttributes( ) and process( ), but it will usually implement the others shown in Table 12-3.
Table 12-3. Important Saxon StyleElement methods
Method | Effect |
---|---|
isInstruction( ) | Extensions always return true. |
mayContainTemplateBody( ) | Returns true if this element can contain a template body (that is, child instructions). |
prepareAttributes( ) | Called at compile time to allow the class to parse information contained in the extension's attributes. It is also the time to do local validation. |
validate( ) | Called at compile time after all stylesheet elements have done local validation. It allows cross validation between this element and its parents or children. |
process(Context context) | Called at runtime to execute the extension. This method can access or modify information in the context, but must not modify the stylesheet tree. |
The xslx:set-context element was easy to implement because the code was stolen from Saxon's XSLForEach implementation and modified to do what XSLForEach does, but only once:
    public class CkBkSetContext extends com.icl.saxon.style.StyleElement {
        Expression select = null;

        public boolean isInstruction( ) {
            return true;
        }

        public boolean mayContainTemplateBody( ) {
            return true;
        }
Here you make sure @select is present. If it is, call makeExpression, which parses it into an XPath expression:
        public void prepareAttributes( ) throws TransformerConfigurationException {
            StandardNames sn = getStandardNames( );
            AttributeCollection atts = getAttributeList( );
            String selectAtt = null;
            for (int a=0; a<atts.getLength( ); a++) {
                int nc = atts.getNameCode(a);
                int f = nc & 0xfffff;
                if (f==sn.SELECT) {
                    selectAtt = atts.getValue(a);
                } else {
                    checkUnknownAttribute(nc);
                }
            }
            if (selectAtt==null) {
                reportAbsence("select");
            } else {
                select = makeExpression(selectAtt);
            }
        }

        public void validate( ) throws TransformerConfigurationException {
            checkWithinTemplate( );
        }
This code is identical to Saxon's for-each, except that instead of looping while selection.hasMoreElements( ) returns true, it simply checks once, extracts the element, sets the context and current node, processes children, and returns the result to the context:
        public void process(Context context) throws TransformerException {
            NodeEnumeration selection = select.enumerate(context, false);
            if (!(selection instanceof LastPositionFinder)) {
                selection = new LookaheadEnumerator(selection);
            }
            Context c = context.newContext( );
            c.setLastPositionFinder((LastPositionFinder)selection);
            int position = 1;
            if (selection.hasMoreElements( )) {
                NodeInfo node = selection.nextElement( );
                c.setPosition(position++);
                c.setCurrentNode(node);
                c.setContextNode(node);
                processChildren(c);
                context.setReturnValue(c.getReturnValue( ));
            }
        }
    }
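The only structural difference from a loop is the use of if where for-each uses while. The following self-contained sketch shows that difference with plain java.util iterators instead of Saxon's NodeEnumeration; FirstNodeOnly and its two methods are hypothetical names, and the body callback stands in for processChildren.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration; plain iterators stand in for Saxon's NodeEnumeration.
final class FirstNodeOnly {
    /** for-each style: runs the body once per selected item. */
    static <T> int forEach(Iterator<T> selection, Consumer<T> body) {
        int visited = 0;
        while (selection.hasNext()) {
            body.accept(selection.next());
            visited++;
        }
        return visited;
    }

    /** set-context style: the same test, but checked only once. */
    static <T> int setContext(Iterator<T> selection, Consumer<T> body) {
        if (selection.hasNext()) {
            body.accept(selection.next());
            return 1;
        }
        return 0;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("bar1", "bar2", "bar3");
        forEach(nodes.iterator(), n -> System.out.println("for-each: " + n));
        setContext(nodes.iterator(), n -> System.out.println("set-context: " + n));
    }
}
```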
The next example extension is not as simple because it extends XSLT’s capabilities rather than creating an alternate implementation for existing functionality.
Because a whole chapter of this book is dedicated to code generation, you can see that the task interests me. However, although XSLT is near optimal in its XML manipulation capabilities, its facilities for producing text output suffer from XML's verbosity. Consider a simple C++ code generation task in native XSLT, starting from the following input:
    <classes>
      <class>
        <name>MyClass1</name>
      </class>
      <class>
        <name>MyClass2</name>
      </class>
      <class>
        <name>MyClass3</name>
        <bases>
          <base>MyClass1</base>
          <base>MyClass2</base>
        </bases>
      </class>
    </classes>
A stylesheet that transforms this XML into C++ might look like this:
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>
      <xsl:template match="class">
    class <xsl:value-of select="name"/> <xsl:apply-templates select="bases"/>
    {
    public:
      <xsl:value-of select="name"/>( ) ;
      ~<xsl:value-of select="name"/>( ) ;
      <xsl:value-of select="name"/>(const <xsl:value-of select="name"/>&amp; other) ;
      <xsl:value-of select="name"/>&amp; operator =(const <xsl:value-of select="name"/> &amp; other) ;
    } ;
      </xsl:template>
      <xsl:template match="bases">
        <xsl:text>: public </xsl:text>
        <xsl:for-each select="base">
          <xsl:value-of select="."/>
          <xsl:if test="position( ) != last( )">
            <xsl:text>, public </xsl:text>
          </xsl:if>
        </xsl:for-each>
      </xsl:template>
      <xsl:template match="text( )"/>
    </xsl:stylesheet>
This code is tedious to write and difficult to read because the C++ is lost in a rat’s nest of markup.
The extension xslx:templtext addresses this problem by creating an alternate implementation of xsl:text that can contain special escapes that indicate special processing. An escape is indicated by surrounding backslashes (\) and comes in two forms. An obvious alternative would use { and } to mimic attribute value templates and XQuery; however, because these characters are common in the code that generators emit, I opted for the backslashes.
Escape | Equivalent XSLT |
---|---|
\expression\ | <xsl:value-of select="expression"/> |
\expression%delimit\ [a] | <xsl:for-each select="expression"> <xsl:value-of select="."/> <xsl:if test="position( ) != last( )"> <xsl:value-of select="delimit"/> </xsl:if> </xsl:for-each> |
[a] XSLT 2.0 will provide this functionality via the separator attribute of xsl:value-of.
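The semantics of the second form can be stated compactly outside XSLT. The following is a minimal sketch under the assumption that the selected nodes have already been reduced to their string values; DelimitSketch and expand are hypothetical names, not part of the extension.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the list escape: values joined by a delimiter.
final class DelimitSketch {
    /** Emits each value, writing the delimiter only between values. */
    static String expand(List<String> values, String delimit) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < values.size(); i++) {
            out.append(values.get(i));
            if (i != values.size() - 1) {   // same test as position( ) != last( )
                out.append(delimit);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> bases = Arrays.asList("MyClass1", "MyClass2");
        // prints ": public MyClass1, public MyClass2"
        System.out.println(": public " + expand(bases, ", public "));
    }
}
```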
Given this facility, your code generator would look as follows:
    <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:xslx="http://com.ora.xsltckbk.CkBkElementFactory"
      extension-element-prefixes="xslx">
      <xsl:output method="text"/>
      <xsl:template match="class">
    <xslx:templtext>
    class \name\ <xsl:apply-templates select="bases"/>
    {
    public:
      \name\( ) ;
      ~\name\( ) ;
      \name\(const \name\&amp; other) ;
      \name\&amp; operator =(const \name\&amp; other) ;
    } ;
    </xslx:templtext>
      </xsl:template>
      <xsl:template match="bases">
        <xslx:templtext>: public \base%', public '\</xslx:templtext>
      </xsl:template>
      <xsl:template match="text( )"/>
    </xsl:stylesheet>
This code is substantially easier to read and write. This facility is applicable to any context where a lot of boilerplate text will be generated. An XSLT purist may frown on such an extension because it introduces a foreign syntax into XSLT that is not subject to simple XML manipulation. This argument is valid; however, from a practical standpoint, many developers would reject XSLT (in favor of Perl) for boilerplate generation simply because it lacks a concise and unobtrusive syntax for getting the job done. So enough hemming and hawing; let’s just code it:
    package com.ora.xsltckbk;

    import java.util.Vector ;
    import java.util.Enumeration ;
    import com.icl.saxon.*;
    import com.icl.saxon.expr.*;
    import javax.xml.transform.*;
    import com.icl.saxon.output.*;
    import com.icl.saxon.trace.TraceListener;
    import com.icl.saxon.om.NodeInfo;
    import com.icl.saxon.om.NodeEnumeration;
    import com.icl.saxon.style.StyleElement;
    import com.icl.saxon.style.StandardNames;
    import com.icl.saxon.tree.AttributeCollection;
    import com.icl.saxon.tree.NodeImpl;
Your extension class first declares constants that will be used in a simple state machine that parses the escapes:
    public class CkBkTemplText extends com.icl.saxon.style.StyleElement {
        private static final int SCANNING_STATE = 0 ;
        private static final int FOUND1_STATE = 1 ;
        private static final int EXPR_STATE = 2 ;
        private static final int FOUND2_STATE = 3 ;
        private static final int DELIMIT_STATE = 4 ;
Then you define four private classes that implement the mini-language contained within the xslx:templtext element. The base class, CkBkTemplParam, captures literal text that may come before an escape:
        private class CkBkTemplParam {
            public CkBkTemplParam(String prefix) {
                m_prefix = prefix ;
            }

            public void process(Context context) throws TransformerException {
                if (!m_prefix.equals("")) {
                    Outputter out = context.getOutputter( );
                    out.setEscaping(false);
                    out.writeContent(m_prefix);
                    out.setEscaping(true);
                }
            }

            protected String m_prefix ;
        }
The CkBkValueTemplParam class derives from CkBkTemplParam and implements the behavior of a simple value-of escape, \expr\. To simplify the implementation in this example, output escaping is always disabled inside an xslx:templtext element:
        private class CkBkValueTemplParam extends CkBkTemplParam {
            public CkBkValueTemplParam(String prefix, Expression value) {
                super(prefix) ;
                m_value = value ;
            }

            public void process(Context context) throws TransformerException {
                super.process(context) ;
                Outputter out = context.getOutputter( );
                out.setEscaping(false);
                if (m_value != null) {
                    m_value.outputStringValue(out, context);
                }
                out.setEscaping(true);
            }

            private Expression m_value ;
        }
The CkBkListTemplParam class implements the \expr%delimit\ behavior, largely by mimicking the behavior of Saxon's XSLForEach class:
        private class CkBkListTemplParam extends CkBkTemplParam {
            public CkBkListTemplParam(String prefix, Expression list,
                                      Expression delimit) {
                super(prefix) ;
                m_list = list ;
                m_delimit = delimit ;
            }

            public void process(Context context) throws TransformerException {
                super.process(context) ;
                if (m_list != null) {
                    NodeEnumeration m_listEnum = m_list.enumerate(context, false);
                    Outputter out = context.getOutputter( );
                    out.setEscaping(false);
                    while (m_listEnum.hasMoreElements( )) {
                        NodeInfo node = m_listEnum.nextElement( );
                        if (node != null) {
                            node.copyStringValue(out);
                        }
                        if (m_listEnum.hasMoreElements( ) && m_delimit != null) {
                            m_delimit.outputStringValue(out, context);
                        }
                    }
                    out.setEscaping(true);
                }
            }

            private Expression m_list = null;
            private Expression m_delimit = null ;
        }
The last private class is CkBkStyleTemplParam, and it is used as a holder of elements nested within the xslx:templtext, for example, xsl:apply-templates:
        private class CkBkStyleTemplParam extends CkBkTemplParam {
            public CkBkStyleTemplParam(StyleElement snode) {
                super("") ;   // a nested element carries no literal prefix
                m_snode = snode ;
            }

            public void process(Context context) throws TransformerException {
                if (m_snode.validationError != null) {
                    fallbackProcessing(m_snode, context);
                } else {
                    try {
                        context.setStaticContext(m_snode.staticContext);
                        m_snode.process(context);
                    } catch (TransformerException err) {
                        throw m_snode.styleError(err);
                    }
                }
            }

            private StyleElement m_snode ;
        }
The next three methods are standard. If you wanted the standard disable-output-escaping attribute to control output escaping, you would capture its value in prepareAttributes( ). The Saxon XSLText.java source provides the necessary code:
        public boolean isInstruction( ) {
            return true;
        }

        public boolean mayContainTemplateBody( ) {
            return true;
        }

        public void prepareAttributes( ) throws TransformerConfigurationException {
            StandardNames sn = getStandardNames( );
            AttributeCollection atts = getAttributeList( );
            for (int a=0; a<atts.getLength( ); a++) {
                int nc = atts.getNameCode(a);
                checkUnknownAttribute(nc);
            }
        }
The validate stage is an opportunity to parse the contents of the xslx:templtext element, looking for escapes. You send every text node to a parser function. Element style content is converted into instances of CkBkStyleTemplParam. The member m_TemplParms is a vector where the results of parsing are stored:
        public void validate( ) throws TransformerConfigurationException {
            checkWithinTemplate( );
            m_TemplParms = new Vector( ) ;
            NodeImpl node = (NodeImpl)getFirstChild( );
            String value ;
            while (node!=null) {
                if (node.getNodeType( ) == NodeInfo.TEXT) {
                    parseTemplText(node.getStringValue( )) ;
                } else if (node instanceof StyleElement) {
                    StyleElement snode = (StyleElement) node;
                    m_TemplParms.addElement(new CkBkStyleTemplParam(snode)) ;
                }
                node = (NodeImpl)node.getNextSibling( );
            }
        }
The process method loops over m_TemplParms and calls each implementation's process method:
        public void process(Context context) throws TransformerException {
            Enumeration iter = m_TemplParms.elements( ) ;
            while (iter.hasMoreElements( )) {
                CkBkTemplParam param = (CkBkTemplParam) iter.nextElement( ) ;
                param.process(context) ;
            }
        }
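The design is a small class hierarchy whose instances are collected at compile time and replayed in order at runtime. The following self-contained sketch shows that shape without Saxon. ParamListSketch, Literal, and Value are hypothetical stand-ins for CkBkTemplText, CkBkTemplParam, and CkBkValueTemplParam; a StringBuilder plays the role of the Outputter, and the value is a precomputed string rather than an XPath expression evaluated against a context.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; the real classes write to Saxon's Outputter.
final class ParamListSketch {
    /** Literal text only (the role of CkBkTemplParam). */
    static class Literal {
        final String prefix;
        Literal(String prefix) { this.prefix = prefix; }
        void process(StringBuilder out) { out.append(prefix); }
    }

    /** Literal prefix followed by a value (the role of CkBkValueTemplParam). */
    static class Value extends Literal {
        final String value;
        Value(String prefix, String value) { super(prefix); this.value = value; }
        @Override
        void process(StringBuilder out) {
            super.process(out);   // the prefix is written first
            out.append(value);
        }
    }

    /** Replays the parsed parameters in document order. */
    static String run(List<Literal> params) {
        StringBuilder out = new StringBuilder();
        for (Literal p : params) {
            p.process(out);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Literal> params = new ArrayList<>();
        params.add(new Value("class ", "MyClass1"));
        params.add(new Literal(" { } ;"));
        System.out.println(run(params));
    }
}
```

Because every parameter type shares one process method, the runtime loop never needs to know which kind of escape it is replaying.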
The following private functions implement a simple state-machine-driven parser. It would be easier to implement with a regular-expression engine (which Java has provided in java.util.regex since Version 1.4). The parser handles two consecutive backslashes (\\) as a request for a literal backslash. Likewise, %% is translated into a single %:
        private void parseTemplText(String value)
                throws TransformerConfigurationException {
            // This state machine parses the text looking for parameters
            int ii = 0 ;
            int len = value.length( ) ;
            int state = SCANNING_STATE ;
            StringBuffer temp = new StringBuffer("") ;
            StringBuffer expr = new StringBuffer("") ;
            while (ii < len) {
                char c = value.charAt(ii++) ;
                switch (state) {
                case SCANNING_STATE:
                    if (c == '\\') {
                        state = FOUND1_STATE ;
                    } else {
                        temp.append(c) ;
                    }
                    break ;
                case FOUND1_STATE:
                    if (c == '\\') {
                        // Doubled backslash: emit one literal backslash
                        temp.append(c) ;
                        state = SCANNING_STATE ;
                    } else {
                        expr.append(c) ;
                        state = EXPR_STATE ;
                    }
                    break ;
                case EXPR_STATE:
                    if (c == '\\') {
                        state = FOUND2_STATE ;
                    } else {
                        expr.append(c) ;
                    }
                    break ;
                case FOUND2_STATE:
                    if (c == '\\') {
                        state = EXPR_STATE ;
                        expr.append(c) ;
                    } else {
                        // The escape is complete; c starts the next literal run
                        processParam(temp, expr) ;
                        state = SCANNING_STATE ;
                        temp = new StringBuffer("") ;
                        temp.append(c) ;
                        expr = new StringBuffer("") ;
                    }
                    break ;
                }
            }
            if (state == FOUND1_STATE || state == EXPR_STATE) {
                compileError("xslx:templtext dangling \\");
            } else if (state == FOUND2_STATE) {
                processParam(temp, expr) ;
            } else {
                processParam(temp, new StringBuffer("")) ;
            }
        }

        private void processParam(StringBuffer prefix, StringBuffer expr)
                throws TransformerConfigurationException {
            if (expr.length( ) == 0) {
                m_TemplParms.addElement(new CkBkTemplParam(new String(prefix))) ;
            } else {
                processParamExpr(prefix, expr) ;
            }
        }

        private void processParamExpr(StringBuffer prefix, StringBuffer expr)
                throws TransformerConfigurationException {
            int ii = 0 ;
            int len = expr.length( ) ;
            int state = SCANNING_STATE ;
            StringBuffer list = new StringBuffer("") ;
            StringBuffer delimit = new StringBuffer("") ;
            while (ii < len) {
                char c = expr.charAt(ii++) ;
                switch (state) {
                case SCANNING_STATE:
                    if (c == '%') {
                        state = FOUND1_STATE ;
                    } else {
                        list.append(c) ;
                    }
                    break ;
                case FOUND1_STATE:
                    if (c == '%') {
                        // Doubled percent: emit one literal percent
                        list.append(c) ;
                        state = SCANNING_STATE ;
                    } else {
                        delimit.append(c) ;
                        state = DELIMIT_STATE ;
                    }
                    break ;
                case DELIMIT_STATE:
                    if (c == '%') {
                        state = FOUND2_STATE ;
                    } else {
                        delimit.append(c) ;
                    }
                    break ;
                }
            }
            // Errors propagate to the caller instead of being swallowed
            if (state == FOUND1_STATE) {
                compileError("xslx:templtext trailing %");
            } else if (state == FOUND2_STATE) {
                compileError("xslx:templtext extra %");
            } else if (state == SCANNING_STATE) {
                String prefixStr = new String(prefix) ;
                Expression value = makeExpression(new String(list)) ;
                m_TemplParms.addElement(
                    new CkBkValueTemplParam(prefixStr, value)) ;
            } else {
                String prefixStr = new String(prefix) ;
                Expression listExpr = makeExpression(new String(list)) ;
                Expression delimitExpr = makeExpression(new String(delimit)) ;
                m_TemplParms.addElement(
                    new CkBkListTemplParam(prefixStr, listExpr, delimitExpr)) ;
            }
        }

        // A vector of CkBkTemplParam objects parsed from the text
        private Vector m_TemplParms = null;
    }
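For comparison, here is a rough sketch of how the scanning step might look with java.util.regex. It is a deliberate simplification, not a drop-in replacement for the state machine: it recognizes \\ only outside an escape, treats the first % inside an escape as the list/delimiter separator without supporting %%, and reports no errors for dangling backslashes. EscapeScannerSketch, scan, and the TEXT/VALUE/LIST token strings are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified scanner; see the limitations noted in the text.
final class EscapeScannerSketch {
    // Either a doubled backslash, or \body\ where body contains no backslash.
    private static final Pattern ESCAPE =
        Pattern.compile("\\\\\\\\|\\\\([^\\\\]+)\\\\");

    /** Returns tokens of the form "TEXT:...", "VALUE:expr", or "LIST:expr|delimit". */
    static List<String> scan(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder literal = new StringBuilder();
        Matcher m = ESCAPE.matcher(text);
        int last = 0;
        while (m.find()) {
            literal.append(text, last, m.start());   // text before this match
            last = m.end();
            if (m.group(1) == null) {                // doubled backslash
                literal.append('\\');
                continue;
            }
            if (literal.length() > 0) {
                tokens.add("TEXT:" + literal);
                literal.setLength(0);
            }
            String body = m.group(1);
            int pct = body.indexOf('%');
            if (pct < 0) {
                tokens.add("VALUE:" + body);
            } else {
                tokens.add("LIST:" + body.substring(0, pct)
                           + "|" + body.substring(pct + 1));
            }
        }
        literal.append(text, last, text.length());   // trailing literal text
        if (literal.length() > 0) {
            tokens.add("TEXT:" + literal);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("class \\name\\ {"));
        System.out.println(scan(": public \\base%', public '\\"));
    }
}
```

The hand-written state machine remains the more faithful implementation because it handles doubled backslashes and percent signs inside an escape and reports malformed input as compile errors.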
You can make some useful enhancements to the functionality of xslx:templtext. For example, you could expand the functionality of the list escape to multiple lists (e.g., \expr1%delim1%expr2%delim2\). This enhancement would roughly translate into the following XSLT equivalent:
    <xsl:for-each select="expr1">
      <xsl:variable name="pos" select="position( )"/>
      <xsl:value-of select="."/>
      <xsl:if test="$pos != last( )">
        <xsl:value-of select="delim1"/>
      </xsl:if>
      <xsl:value-of select="expr2[$pos]"/>
      <xsl:if test="$pos != last( )">
        <xsl:value-of select="delim2"/>
      </xsl:if>
    </xsl:for-each>
This facility would be useful when pairs of lists need to be sequenced into text. For example, consider a C++ function's parameters, which consist of name and type pairs. The XSLT code is only a rough specification of semantics because it assumes that the node sets specified by expr1 and expr2 have the same number of elements. I believe that an actual implementation would continue to expand the lists as long as any set still had nodes, suppressing delimiters for those that did not. Better yet, the behavior could be controlled by attributes of xslx:templtext.
Space does not permit full implementations of these extension elements in Xalan. However, based on the information provided in the introduction, the path should be relatively clear.
Developers interested in extending Saxon should read Michael Kay’s article on Saxon design (http://www-106.ibm.com/developerworks/library/x-xslt2).