Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Finding Minimums and Maximums

Problem

You need to find the minimum (or maximum) numerical node (or nodes) in a node set.

Solution

The EXSLT functions that perform these operations are math:min, math:max, math:lowest, and math:highest. min and max find the value of the node with minimum and maximum numerical value, respectively. EXSLT defines math:min as follows:

The minimum value is defined as follows. The node set passed as an argument is sorted in ascending order as it would be by xsl:sort with a data type of number. The minimum is the result of converting the string value of the first node in this sorted list to a number using the number function.
If the node set is empty, or if the result of converting the string values of any of the nodes to a number is NaN, then NaN is returned.

math:max is defined similarly. EXSLT provides pure XSLT implementations that are literal implementations of this definition, as shown in Example 2-9.

Example 2-9. EXSLT min and max implement directly from the definition

<xsl:template name="math:min">
   <xsl:param name="nodes" select="/.." />
   <xsl:choose>
      <xsl:when test="not($nodes)">NaN</xsl:when>
      <xsl:otherwise>
         <xsl:for-each select="$nodes">
            <xsl:sort data-type="number" />
            <xsl:if test="position(  ) = 1">
               <xsl:value-of select="number(.)" />
            </xsl:if>
         </xsl:for-each>
      </xsl:otherwise>
   </xsl:choose>
</xsl:template>
   
<xsl:template name="math:max">
   <xsl:param name="nodes" select="/.." />
   <xsl:choose>
      <xsl:when test="not($nodes)">NaN</xsl:when>
      <xsl:otherwise>
         <xsl:for-each select="$nodes">
            <xsl:sort data-type="number" order="descending" />
            <xsl:if test="position(  ) = 1">
               <xsl:value-of select="number(.)" />
            </xsl:if>
         </xsl:for-each>
      </xsl:otherwise>
   </xsl:choose>
</xsl:template>

You may be scratching your head over the default value for the nodes parameter (select="/.."). This is simply an idiomatic way to initialize nodes to an empty node set (i.e., the parent of the root is empty by definition).

Although the definitions of math:min and math:max use xsl:sort, the implementations need not, so results that are more efficient are possible provided your XSLT processor supports tail recursion (see Example 2-10).

Example 2-10. min and max implemented with divide and conquer

<xsl:template name="math:max">
     <xsl:param name="nodes" select="/.."/>
     <xsl:param name="max"/>
  <xsl:variable name="count" select="count($nodes)"/>
  <xsl:variable name="aNode" select="$nodes[ceiling($count div 2)]"/>
  <xsl:choose>
    <xsl:when test="not($count)">
      <xsl:value-of select="number($max)"/>
    </xsl:when>
    <xsl:when test="number($aNode) != number($aNode)">
      <xsl:value-of select="number($aNode)"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:call-template name="math:max">
        <xsl:with-param name="nodes" select="$nodes[not(. &lt;= number($aNode))]"/>
        <xsl:with-param name="max">
          <xsl:choose>
            <xsl:when test="not($max) or $aNode > $max">
              <xsl:value-of select="$aNode"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="$max"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:with-param>
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
   
<xsl:template name="math:min">
  <xsl:param name="nodes" select="/.."/>
  <xsl:param name="min"/>
  <xsl:variable name="count" select="count($nodes)"/>
  <xsl:variable name="aNode" select="$nodes[ceiling($count div 2)]"/>
  <xsl:choose>
    <xsl:when test="not($count)">
      <xsl:value-of select="number($min)"/>
    </xsl:when>
    <xsl:when test="number($aNode) != number($aNode)">
      <xsl:value-of select="number($aNode)"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:call-template name="math:min">
        <xsl:with-param name="nodes" select="$nodes[not(. >= number($aNode))]"/>
        <xsl:with-param name="min">
          <xsl:choose>
            <xsl:when test="not($min) or $aNode &lt; $min">
              <xsl:value-of select="$aNode"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="$min"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:with-param>
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

Typically, the preceding implementations are faster than the version using xsl:sort. In some degenerate cases, they are likely to be slower. The reason is that efficiency hinges on removing half the nodes from consideration (on average) at each recursive step. One can imagine a scenario in which at each pass aNode is the minimum node (in the case of math:max) or the maximum remaining node (in the case of math:min). If this were to occur, each pass would remove only one node, resulting in poor performance. Luckily, data tends to come in two configurations: presorted and random. In both cases, these implementations should hold their own.

While you had to look out for non-numeric data explicitly, EXSLT’s implementations let xsl:sort take care of this. The XSLT standard mandates that non-numeric data be placed up front by sort when data-type='number'.

Warning

Don’t be tempted to write not(number($var)) to test for NaN! I often catch myself doing this because it “sounds” correct. The number function does not test for a number; instead, it attempts to convert its argument to a number. This is not what you want—this test will conclude 0 is not a number due to the conversion of 0 to false. The correct test is number($var) != number($var). This test works because NaN is never equal to NaN, but any number is always equal to itself. Do not be tempted to shorten this idiom to number($var) != $var. Doing so works most of the time, but if $var is an empty node set, it will fail. If you prefer, a more direct approach string(number($var)) = 'NaN' also works.

EXSLT defines math:lowest as follows.

The math:lowest function returns the nodes in the node set whose value is the minimum value for the node set. The minimum value for the node set is the same as the value as calculated by math:min. A node has this minimum value if the result of converting its string value to a number as if by the number function is equal to the minimum value, where the equality comparison is defined as a numerical comparison using the = operator.
If any of the nodes in the node set has a non-numeric value, the math:min function will return NaN. The definition numeric comparisons entails that NaN != NaN. Therefore if any of the nodes in the node set has a non-numeric value, math:lowest will return an empty node set.

The EXSLT implementation is literally based on this definition and might not be very efficient:

<xsl:template name="math:lowest">
   <xsl:param name="nodes" select="/.." />
   <xsl:if test="$nodes and not($nodes[number(.) != number(.)])">
      <xsl:variable name="min">
         <xsl:for-each select="$nodes">
            <xsl:sort data-type="number" />
            <xsl:if test="position(  ) = 1">
               <xsl:value-of select="number(.)" />
            </xsl:if>
         </xsl:for-each>
      </xsl:variable>
      <xsl:copy-of select="$nodes[. = $min]" />
   </xsl:if>
</xsl:template>

The xsl:if test scans all nodes to handle cases when a non-numeric is present. It then sorts to find the min and finally collects all nodes with that min. Example 2-11 reuses math:min to do the same without needing to sort.

Example 2-11. Lowest implemented by reusing math:min

<xsl:template name="math:lowest">
     <xsl:param name="nodes" select="/.."/>
   
     <xsl:variable name="min">
       <xsl:call-template name="math:min">
         <xsl:with-param name="nodes" select="$nodes"/>
       </xsl:call-template>
     </xsl:variable> 
     <xsl:choose>
       <xsl:when test="number($min) = number($min)">
         <xsl:copy-of select="$nodes[. = $min]"/>
       </xsl:when>
     </xsl:choose>
</xsl:template>

Finally, you can implement math:lowest with only one pass over the nodes if you are willing to forego reuse of math:min (see Example 2-12).

Example 2-12. Lowest implemented without reuse of math:min

<xsl:template name="math:lowest">
  <xsl:param name="nodes" select="/.."/>
  <xsl:param name="lowest" select="/.."/>
  <xsl:variable name="index" select="ceiling(count($nodes) div 2)"/>
  <xsl:variable name="aNode" select="$nodes[$index]"/>
  <xsl:choose>
    <xsl:when test="not($index)">
      <xsl:copy-of select="$lowest"/>
    </xsl:when>
    <xsl:when test="number($aNode) != number($aNode)"/>
    <xsl:otherwise>
      <xsl:choose>
        <xsl:when test="not($lowest) or $aNode &lt; $lowest">
           <xsl:call-template name="math:lowest">
            <xsl:with-param name="nodes" select="$nodes[not(. >= $aNode)]"/>
            <xsl:with-param name="lowest" select="$nodes[. = $aNode]"/>
          </xsl:call-template>
        </xsl:when>
        <xsl:when test="$aNode = $lowest">
           <xsl:call-template name="math:lowest">
            <xsl:with-param name="nodes" select="$nodes[not(. >= $aNode)]"/>
            <xsl:with-param name="lowest" select="$lowest|$nodes[$index]"/>
          </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
           <xsl:call-template name="math:lowest">
            <xsl:with-param name="nodes" select="$nodes[not(. >= $aNode)]"/>
            <xsl:with-param name="lowest" select="$lowest"/>
          </xsl:call-template>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

Interestingly, this implementation does worse, probably because of the additional copying that occurs. In performance tests on 10,000 data points using various distributions of data (sorted, reverse sorted, semirandom, and random), the math:min-based implementation beat the xsl:sort-based implementation by about 40% on average (often better). The recursive implementation that did not use math:min was 24% slower than the one that did.

The math:highest definition and implementations follow directly from inverting the relational logic of math:lowest, so I will not discuss them here.

Discussion

The minimum and maximum values of a node set can be determined by the simple XPath expressions <xsl:value-of select="$nodes[not($nodes < .)]"/> and <xsl:value-of select="$nodes[not($nodes > .)]"/>. In English, the first says, “Select all nodes for which there is no node less than its value,” and the second says, “Select all nodes for which there is no node greater than its value.”

Although very simple, these expressions have O(N²) performance, where N is the number of nodes. Therefore, unless you are certain that the number of nodes is small, avoid this shortcut if you can. Occasionally, you are forced to use it because, for example, you have to find the min/max within the select attribute of xsl:sort or the use attribute of xsl:key (for which you cannot call a template).

In another XSLT publication, the following recursive implementation for finding minimums is described as more efficient than one using xsl:sort:

<xsl:template name="math:min">
  <xsl:param name="nodes" select="/.."/>
  <xsl:param name="min" select="number('NaN')"/>
  <xsl:choose>
    <xsl:when test="not($nodes)">
      <xsl:value-of select="number($min)"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:variable name="aNode" select="$nodes[1]"/>
      <xsl:call-template name="math:min">
        <xsl:with-param name="nodes" select="$nodes[position(  ) > 1]"/>
        <xsl:with-param name="min">
          <xsl:choose>
            <xsl:when test="$aNode &lt; $min or string($min) = 'NaN'">
              <xsl:value-of select="$aNode"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="$min"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:with-param>
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

This is simply not the case on any XSLT implementation I tested, and I can’t imagine that it ever could be. The reason this is much more likely to be slow is because only one node is removed from consideration at each step. Even with tail recursion, there will be a lot of copying of nodes. It is easy to be fooled into thinking that this recursive solution is as efficient as the iterative-indexing solution you might come up with in C or Java. However, indexing is not the same as creating a brand-new node set with the first item removed, as is the case with $nodes[position( ) > 1].

In many cases when you need to find the minimum of a data set, you end up also needing the maximum. It would be nice to have a function handy that gives two for the price of one. The following will do so for a slight increase in complexity:

<xsl:template name="math:min-max">
  <xsl:param name="nodes" select="/.."/>
  <xsl:param name="nodes-for-max" select="$nodes"/>
  <xsl:param name="min"/>
  <xsl:param name="max"/>
  <xsl:variable name="count1" select="count($nodes)"/>
  <xsl:variable name="aNode1" select="$nodes[ceiling($count1 div 2)]"/>
  <xsl:variable name="count2" select="count($nodes-for-max)"/>
  <xsl:variable name="aNode2" select="$nodes-for-max[ceiling($count2 div 2)]"/>
  <xsl:choose>
    <xsl:when test="not($count1) and not($count2)">
      <xsl:value-of select="concat(number($min),',',number($max))"/>
    </xsl:when>
    <xsl:when test="number($aNode1) != number($aNode1) and 
               number($aNode2) != number($aNode2)">
      <xsl:value-of select="concat(number($aNode1),',',number($aNode2))"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:call-template name="math:min-max">
        <xsl:with-param name="nodes" select="$nodes[not(. >= number($aNode1))]"/>
        <xsl:with-param name="nodes-for-max" 
          select="$nodes-for-max[not(. &lt;= number($aNode2))]"/>
        <xsl:with-param name="min">
          <xsl:choose>
            <xsl:when test="not($min) or $aNode1 &lt; $min">
              <xsl:value-of select="$aNode1"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="$min"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:with-param>
        <xsl:with-param name="max">
          <xsl:choose>
            <xsl:when test="not($max) or $aNode2 > $max">
              <xsl:value-of select="$aNode2"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="$max"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:with-param>
      </xsl:call-template>
    </xsl:otherwise>
    </xsl:choose>
</xsl:template>

Testing shows that this approach continues to outperform a sort-based solution.

When considering minima and maxima, only one special case was addressed: when the nodes are literally the numbers you must process. A more general problem involves finding the minimum or maximum of a function of the nodes in a node set. For example, consider the case in which you have a set of positive and negative numbers and you need the minimum of the square of their value. Although hacking the previous implementations to do the squaring is simple, a single reusable implementation is more desirable. Chapter 14 revisits this problem and describes several alternatives for creating generic solutions.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Finding Minimums and Maximums

Create new playlist

Sign In

Sign Up

Finding Minimums and Maximums

Problem

Solution

Warning

Discussion

Table of Contents for
Finding Minimums and Maximums