The abstract form of sum for processors that support tail-recursive optimization is as follows:
<xsl:template name="math:sum"> <!-- Initialize nodes to empty node set --> <xsl:param name="nodes" select="/.."/> <xsl:param name="result" select="0"/> <xsl:choose> <xsl:when test="not($nodes)"> <xsl:value-of select="$result"/> </xsl:when> <xsl:otherwise> <!-- call or apply template that will determine value of node unless the node is literally the value to be summed --> <xsl:variable name="value"> <xsl:call-template name="some-function-of-a-node"> <xsl:with-param name="node" select="$nodes[1]"/> </xsl:call-template> </xsl:variable> <!-- recurse to sum rest --> <xsl:call-template name="math:sum"> <xsl:with-param name="nodes" select="$nodes[position( ) != 1]"/> <xsl:with-param name="result" select="$result + $value"/> </xsl:call-template> </xsl:otherwise> </xsl:choose> </xsl:template>
Two techniques can handle a large number of nodes in the absence of
tail-recursive optimization. The first is commonly called
divide and conquer
.
The idea behind this technique is to
reduce the amount of work by at least a factor of two on each
recursive step:
<xsl:template name="math:sum-dvc"> <xsl:param name="nodes" select="/.."/> <xsl:param name="result" select="0"/> <xsl:param name="dvc-threshold" select="100"/> <xsl:choose> <xsl:when test="count($nodes) <= $dvc-threshold"> <xsl:call-template name="math:sum"> <xsl:with-param name="nodes" select="$nodes"/> <xsl:with-param name="result" select="$result"/> </xsl:call-template> </xsl:when> <xsl:otherwise> <xsl:variable name="half" select="floor(count($nodes) div 2)"/> <xsl:variable name="sum1"> <xsl:call-template name="math:sum-dvc"> <xsl:with-param name="nodes" select="$nodes[position( ) <= $half]"/> <xsl:with-param name="result" select="$result"/> <xsl:with-param name="dvc-threshold" select="$dvc-threshold"/> </xsl:call-template> </xsl:variable> <xsl:call-template name="math:sum-dvc"> <xsl:with-param name="nodes" select="$nodes[position( ) > $half]"/> <xsl:with-param name="result" select="$sum1"/> <xsl:with-param name="dvc-threshold" select="$dvc-threshold"/> </xsl:call-template> </xsl:otherwise> </xsl:choose> </xsl:template>
The second is called batching
,
which
uses two recursive stages. The first stage divides the large problem
into batches of reasonable size. The second stage processes each
batch recursively.
<xsl:template name="math:sum-batcher"> <xsl:param name="nodes" select="/.."/> <xsl:param name="result" select="0"/> <xsl:param name="batch-size" select="500"/> <xsl:choose> <xsl:when test="not($nodes)"> <xsl:value-of select="$result"/> </xsl:when> <xsl:otherwise> <xsl:variable name="batch-sum"> <xsl:call-template name="math:sum"> <xsl:with-param name="nodes" select="$nodes[position( ) < $batch-size]"/> <xsl:with-param name="result" select="$result"/> </xsl:call-template> </xsl:variable> <xsl:call-template name="math:sum-batcher"> <xsl:with-param name="nodes" select="$nodes[position( ) >= $batch-size]"/> <xsl:with-param name="result" select="$batch-sum"/> <xsl:with-param name="batch-size" select="$batch-size"/> </xsl:call-template> </xsl:otherwise> </xsl:choose> </xsl:template>
The solutions for product
are similar:
<xsl:template name="math:product"> <xsl:param name="nodes" select="/.."/> <xsl:param name="result" select="1"/> <xsl:choose> <xsl:when test="not($nodes)"> <xsl:value-of select="$result"/> </xsl:when> <xsl:otherwise> <!-- call or apply template that will determine value of node unless the node is literally the value to be multiplied --> <xsl:variable name="value"> <xsl:call-template name="some-function-of-a-node"> <xsl:with-param name="node" select="$nodes[1]"/> </xsl:call-template> </xsl:variable> <xsl:call-template name="math:product"> <xsl:with-param name="nodes" select="$nodes[position( ) != 1]"/> <xsl:with-param name="result" select="$result * $value"/> </xsl:call-template> </xsl:otherwise> </xsl:choose> </xsl:template> <xsl:template name="math:product-batcher"> <xsl:param name="nodes" select="/.."/> <xsl:param name="result" select="1"/> <xsl:param name="batch-size" select="500"/> <xsl:choose> <xsl:when test="not($nodes)"> <xsl:value-of select="$result"/> </xsl:when> <xsl:otherwise> <xsl:variable name="batch-product"> <xsl:call-template name="math:product"> <xsl:with-param name="nodes" select="$nodes[position( ) < $batch-size]"/> <xsl:with-param name="result" select="$result"/> </xsl:call-template> </xsl:variable> <xsl:call-template name="math:product-batcher"> <xsl:with-param name="nodes" select="$nodes[position( ) >= $batch-size]"/> <xsl:with-param name="result" select="$batch-product"/> <xsl:with-param name="batch-size" select="$batch-size"/> </xsl:call-template> </xsl:otherwise> </xsl:choose> </xsl:template> <xsl:template name="math:product-dvc"> <xsl:param name="nodes" select="/.."/> <xsl:param name="result" select="1"/> <xsl:param name="dvc-threshold" select="100"/> <xsl:choose> <xsl:when test="count($nodes) <= $dvc-threshold"> <xsl:call-template name="math:product"> <xsl:with-param name="nodes" select="$nodes"/> <xsl:with-param name="result" select="$result"/> </xsl:call-template> </xsl:when> <xsl:otherwise> <xsl:variable name="half" select="floor(count($nodes) div 2)"/> <xsl:variable name="product1"> <xsl:call-template name="math:product-dvc"> <xsl:with-param name="nodes" select="$nodes[position( ) <= $half]"/> <xsl:with-param name="result" select="$result"/> <xsl:with-param name="dvc-threshold" select="$dvc-threshold"/> </xsl:call-template> </xsl:variable> <xsl:call-template name="math:product-dvc"> <xsl:with-param name="nodes" select="$nodes[position( ) > $half]"/> <xsl:with-param name="result" select="$product1"/> <xsl:with-param name="dvc-threshold" select="$dvc-threshold"/> </xsl:call-template> </xsl:otherwise> </xsl:choose> </xsl:template>
Using the built-in XPath sum( )
function
is the
simplest way to perform simple sums. However, if you want to compute
sums of the nodes’ arbitrary function in a node-set,
then you need either to:
Use one of the recipes in this section.
Compute the function of the nodes first, capturing the result in a
variable as a result-tree fragment. Then use an extension function to
convert the fragment to a node set that can be fed to
sum
. In XSLT 2.0, generalized sums will become
trivial because of the banishment of result-tree fragments.
Batching and divide and conquer are two techniques for managing recursion that are useful whenever you must process a potentially large set of nodes. Experimentation shows that even when using an XSLT processor that recognizes tail recursion, better performance results from these approaches.
Chapter 14 shows how to make a reusable batch and divide-and-conquer drivers.
Dimitre Novatchev and Slawomir Tyszko compare batching with divide and conquer at http://www.vbxml.com/xsl/articles/recurse/.