Data types and functions

The expression language handles boolean values, numeric values, strings and nodes. A number of functions are supplied to transform data items from one type to another, and to compare, manipulate and analyze them.

Functions

Functions perform a specific operation and return a value. A return value effectively replaces the function at its position within the expression. Functions have names that represent their purpose, such as 'round' and 'string-length', followed by brackets:

function()

In most cases, functions operate on supplied information, passed to the function as comma-separated parameters placed within the brackets:

function(parameter1, parameter2)

There are a number of functions available to support each of the data types discussed below. However, the examples in this section are generally not very useful except to illustrate the form and behaviour of particular functions. The later discussion on the use of operators shows how these functions may be put to more practical use.

Boolean values

A boolean value must be either 'true' or 'false'. This type of value is used in tests, such as in the If element, to decide whether or not to take an action. The function 'true()' simply returns the value 'true', so the following If element is always triggered:

<if test="true()"><!-- ENABLED --></if>

The function 'false()' always returns the value 'false'. The following test is never enabled:

<if test="false()"><!-- NOT ENABLED --></if>

The 'not()' function reverses a boolean value, from 'true' to 'false' or 'false' to 'true':

<if test="not(true())"><!-- NOT ENABLED --></if>

<if test="not(false())"><!-- ENABLED --></if>

When a boolean test is needed but the expression does not return a value of this kind, the value is automatically converted to a boolean as if it were enclosed within the 'boolean()' function, which converts values to 'true' or 'false' according to the following rules.

A number is considered to be 'true' if it is a valid number and is not zero. The value zero is therefore 'false', as is NaN (Not a Number):

<if test="1"><!-- ENABLED --></if>

<if test="0"><!-- NOT ENABLED --></if>

<if test="number('1bad')"><!-- NOT ENABLED (NaN) --></if>

A string is considered to be 'true' if it contains at least one character. An empty string is 'false':

<if test="'x'"><!-- ENABLED --></if>

<if test="''"><!-- NOT ENABLED (empty string) --></if>

A set of nodes is true if there is at least one node in the set. A list of zero nodes is false. This is demonstrated by the simple test for the presence of elements of a specific type. In this example, if at least one Title element is present as a child of the current element, then 'true' is returned:

<if test="title">
  <!-- ENABLED (if embedded title(s)) -->
</if>
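
The same conversion can be requested explicitly, and combined with 'not()' to test for the absence of an element. The following lines are only an illustrative sketch, reusing the Title element from the example above:

```xml
<if test="boolean(title)">
  <!-- ENABLED (if embedded title(s)); same effect as test="title" -->
</if>

<if test="not(title)">
  <!-- ENABLED (only if there is no embedded title) -->
</if>
```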

Numeric values

A numeric value can be any positive or negative real number, such as '3', '55.5' and '-9999'. Numbers are stored as double-precision floating-point values in eight bytes (64 bits), and so there is a (very high) limit on the values allowed.

A number of functions are included for manipulating numeric values. First, other data items can be converted to numbers using the 'number()' function. Boolean values are converted to '1' when true, and '0' when false:

<if test="number(true())"><!-- ENABLED --></if>

<if test="number(false())"><!-- NOT ENABLED --></if>

Strings are converted to numbers if they contain only a number (an optional leading '-' symbol, digits and an optional decimal point), optionally surrounded by whitespace. Any other string is converted to NaN (Not a Number), which is then considered 'false' as a boolean value:

<if test="number('1')"><!-- ENABLED --></if>

<if test="number(' -1 ')"><!-- ENABLED --></if>

<if test="number('-0')"><!-- NOT ENABLED (zero) --></if>

<if test="number('99 bad')"><!-- NOT ENABLED (NaN) --></if>

Other functions, described below, take parameters that need to be numeric values. But it is possible to pass other data types to them without having to use the number() function described above. This function can be omitted, as its presence can be implied. The two examples below are functionally identical, but the latter is preferable as it is more legible:

<if test="function(number('0.5'))">...</if>

<if test="function('0.5')">...</if>
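
As a concrete sketch of this implied conversion, the two tests below use the 'ceiling()' function, described next, in place of the generic name. Both should behave identically, because the string '0.5' is converted to the number 0.5 before it is rounded up:

```xml
<if test="ceiling(number('0.5'))"><!-- ENABLED (1) --></if>

<if test="ceiling('0.5')"><!-- ENABLED (1) --></if>
```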

The 'floor()' function returns the largest integer that is not greater than the value passed to it. For example, the floor of '3.2' is '3'. The 'ceiling()' function rounds real numbers up to the nearest higher integer value, so '3.2' becomes '4'. The 'round()' function rounds real numbers up or down to the nearest integer, with '0.5' rounded up to '1', and '0.4' rounded down to '0':

<if test="ceiling(0.5)"><!-- ENABLED (1) --></if>

<if test="floor(0.5)"><!-- NOT ENABLED (0) --></if>

<if test="round(0.5)"><!-- ENABLED (1) --></if>

<if test="round(0.4)"><!-- NOT ENABLED (0) --></if>
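
Negative values are a common source of confusion with these functions. The following illustrative tests assume the XPath 1.0 rules, in which 'floor()' always moves towards negative infinity, 'ceiling()' moves towards positive infinity, and 'round()' resolves a tie by choosing the higher integer:

```xml
<if test="floor(-3.2)"><!-- ENABLED (-4) --></if>

<if test="ceiling(-3.2)"><!-- ENABLED (-3) --></if>

<if test="round(-2.5)"><!-- ENABLED (-2) --></if>

<if test="round(-0.4)"><!-- NOT ENABLED (minus zero) --></if>
```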

The 'sum()' function takes a single parameter, which must be a set of nodes, and returns the total of the values of those nodes. The value of each node is first obtained as a string (as described below), then each string is converted to a number. Finally the numbers are added together.
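
As a minimal sketch, assume a hypothetical Order element (not part of the examples used elsewhere in this section) that contains several Price children. With the Order element as the current node, the test totals the Price values:

```xml
<!-- Hypothetical source fragment -->
<order>
  <price>10.50</price>
  <price>4.50</price>
</order>

<!-- In the stylesheet, with the Order element as the current node -->
<if test="sum(price)"><!-- ENABLED (15) --></if>
```

If any one of the values cannot be converted to a number, the total is NaN and the test is not enabled.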

String values

A string value is a sequence of characters that is enclosed by quotes. Other data types can be converted into strings using the 'string()' function. When a number is converted into a string, the string consists of characters that represent each digit of the number, plus any decimal point or leading '-' symbol (though minus zero becomes just '0'). An invalid number becomes the string 'NaN'. When a boolean value is converted to a string, the string holds the word 'true' or the word 'false'.
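
These conversions can be sketched with the Value Of instruction, which outputs the string form of whatever its expression returns. The expressions below are illustrative only, with the expected output shown in the comment beside each one:

```xml
<xsl:value-of select="string(12.50)"/>          <!-- 12.5 -->
<xsl:value-of select="string(-9999)"/>          <!-- -9999 -->
<xsl:value-of select="string(number('bad'))"/>  <!-- NaN -->
<xsl:value-of select="string(not(true()))"/>    <!-- false -->
```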

Strings can be analyzed using the 'string-length()', 'starts-with()' and 'contains()' functions. The first of these returns a number representing the number of characters in the string. The second and third return a boolean value indicating whether or not the string starts with, or contains, the given series of characters. In both cases, the first parameter is the string to examine and the second parameter is the series of characters to look for:

<if test="string-length('1')"><!-- ENABLED (1) --></if>

<if test="string-length('')"><!-- NOT ENABLED (0) --></if>

<if test="starts-with('the boat', 'the')">
  <!-- ENABLED (true) -->
</if>

<if test="starts-with('a boat', 'the')">
  <!-- NOT ENABLED (false) -->
</if>

<if test="contains('the boat is yellow', 'boat')">
  <!-- ENABLED (true) -->
</if>

<if test="contains('the car is yellow', 'boat')">
  <!-- NOT ENABLED (false) -->
</if>

Other functions are used to create new strings from parts of existing strings. The 'substring-before()' function and the 'substring-after()' function extract the leading or trailing characters from a string, with the cut-off point given as a character (or series of characters) in the second parameter to the function. The two examples below return 'abc' and '123' respectively:

substring-before('abc-123', '-')

substring-after('abc-123', '-')

If the given character is repeated in the string, only the first occurrence is relevant.

A more flexible version of this feature, called 'substring()', specifies an offset and range for the characters to extract. In the following example, four characters are selected starting at the fifth character of the string:
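
As an illustration (the string used here is an assumption, chosen to match the earlier examples), positions are counted from '1', so the fifth character is the 'b' of 'boat':

```xml
<xsl:value-of select="substring('the boat is yellow', 5, 4)"/>
<!-- boat -->
```

The third parameter is optional in XPath 1.0. When it is omitted, all characters from the given position to the end of the string are returned, so 'substring('the boat is yellow', 13)' returns 'yellow'.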

Characters within strings can be replaced. The 'translate()' function takes three parameters. The first parameter is the string containing characters to be replaced. The second parameter is a list of characters that must be converted to other characters. The third parameter is the list of replacement characters. A specific character in the second string is replaced by the character at the same position in the third string. For example, values of 'abc' and 'xyz' for these two parameters specify that 'a' is to become 'x', 'b' is to become 'y', and 'c' is to become 'z'. Every occurrence in the first string is replaced, so 'translate('abc abc', 'b', 'x')' returns the string 'axc axc'. The following example returns the upper-case version of any string of alphabetic characters:

translate('Convert This',
          'abcdefghijklmnopqrstuvwxyz',
          'ABCDEFGHIJKLMNOPQRSTUVWXYZ')
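
A further use of 'translate()' is to strip unwanted characters. This relies on the XPath 1.0 rule that a character in the second parameter with no counterpart in the third parameter is removed altogether. The date string in this sketch is an assumed example:

```xml
<xsl:value-of select="translate('2024-01-15', '-', '')"/>
<!-- 20240115 -->
```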
					

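The 'normalize-space()' function returns a copy of the parameter string with unnecessary whitespace removed. Leading and trailing whitespace is stripped, and each internal run of whitespace characters is reduced to a single space. For example, '  the    boat  ' becomes 'the boat'.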

Node-set values

Every node has a value, which is represented by a string. The string value of a node is determined in different ways, depending on the node type. The value of the root node of the document is the concatenation of all the text nodes in the document (which is not generally useful). The value of an element node is also the concatenation of all the text nodes within that element, and within all descendant elements. For example, the value of the following Para element is the text in the paragraph, including the text in the embedded highlighted section ('All the text here is the value'):

<para>All the <emph>text</emph> here is the value</para>

Note that, for this reason, the Value Of instruction in XSLT can be useful for extracting the text from an element for reuse elsewhere, ignoring any embedded tags.

The string value of an attribute node is its normalized value (tabs and line-feeds are converted to spaces first).

The string value of a comment node is the entire content of the comment, not including the surrounding markup '<!--' and '-->':

<!--All the text of the comment-->

The string value of a processing instruction node is the instruction part of the code and so does not include the target name. The target name can instead be obtained using the 'name()' function. Processing instructions can also be selected on a target-specific basis, by including the target name in the node test, as in 'processing-instruction('ACME')'. The string value of the following instruction is 'page-break':

<?ACME page-break?>

The string value of a text node is all the characters in that node. A text node cannot be empty as it only exists in order to hold one or more characters.

Unlike the simpler data types described earlier, a node already exists at a specific location in the document. Therefore, to use a node in an expression it must first be found. A location path is used to find the required node. The other issue this raises is that location paths often target multiple nodes. For example, '//title' returns a list of all Title element nodes. Some of the functions described below work on a single node. When a list is passed to them, only the first node is processed.

When no location path is included, the default location is the direct content of the current node. For example, a test can be made for the presence of child nodes, of any kind (except attributes), using the 'node()' function. This returns a list of all the children, such as elements, text blocks, comments or processing instructions, and a list containing at least one node is considered 'true'. If there are no children, the list is empty, and therefore 'false':

<if test="node()">...</if>

This test, made on the Para elements below, has the following effect on the If element:

<para></para> <!-- NOT ENABLED -->


<para/> <!-- NOT ENABLED -->


<para>some text</para> <!-- ENABLED -->


<para><!-- a comment --></para> <!-- ENABLED -->

More specific variants use the 'text()', 'comment()' and 'processing-instruction()' functions, and specific or general ('*') element names.
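
These variants can be sketched as follows. The tests are illustrative only, and reuse the ACME target and the Para element from the earlier examples:

```xml
<if test="text()"><!-- ENABLED (if any text children) --></if>

<if test="comment()"><!-- ENABLED (if any comment children) --></if>

<if test="processing-instruction('ACME')">
  <!-- ENABLED (if any ACME processing instruction children) -->
</if>

<if test="*"><!-- ENABLED (if any element children) --></if>

<if test="para"><!-- ENABLED (if any Para element children) --></if>
```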

Context positions

The current node is always part of a current node list, though it is possible for this list to only contain the one, current node. The length of the node list, and the position of the current node within this list, can be important. After each step in a location path, the list contains all the nodes selected by this step.

The 'position()' function returns a number that indicates the position of the node amongst the current list of nodes. In the simplest case, this is the position of the node amongst its sibling nodes:

<if test="position()">
  <!-- ENABLED (the position is always at least 1) -->
</if>

The position value is not always calculated in the same direction. In a location path, it depends on the direction of the step to be taken. The simple rule underlying the various directions illustrated below is that when the direction is forward, searching for elements that start after the start-tag of the current element, then counting is in document order. Otherwise, it is in reverse order.

The 'last()' function returns the number of nodes in the current context list. The position will always be a value between '1' and this maximum:

<if test="last()">
  <!-- ENABLED (the current list always holds at least 1 node) -->
</if>

The 'count()' function returns the total number of nodes passed to it. For example, 'count(para)' returns the number of paragraphs, such as the number of paragraph children of the current element, and 'count(*)' counts all elements:

<if test="count(para)">
  <!-- ENABLED (if embedded paragraphs) -->
</if>
<if test="count(*)">
  <!-- ENABLED (if embedded elements) -->
</if>

The 'id()' function returns the node with the given unique identifier, if such a node exists. The function 'id('xyz')' returns the node with the unique identifier 'xyz'. Unique identifiers are only recognized if a DTD is used, and the DTD assigns the attribute type ID to one or more attributes. This function is typically used to apply formatting to, or to reuse the content of, a specific element:

<xsl:template match="id('JSmith')">
  <P>J. Smith details:</P>
  <P><xsl:apply-templates/></P>
  <P>J. Smith works for company -
    <xsl:value-of select="id('company19')"/>
  </P>
</xsl:template>

The 'name()' function returns the string value of the name of the given node. The name returned is the qualified name, with a namespace prefix where appropriate. For example, 'name(.)' returns the name of the current node. This could be used in a named template to debug a stylesheet:

<xsl:template name="debug">
  <P>NAME = <xsl:value-of select="name(.)"/></P>
</xsl:template>
