A variety of numerical calculations can be performed using XQuery. This chapter describes the major numeric types, along with the operators and functions that act on numeric values. These include comparisons, arithmetic operations, and functions that operate on numeric values such as round
and sum
.
The four main numeric types supported in XQuery are xs:decimal
, xs:integer
, xs:float
, and xs:double
. All the operations and functions that can be performed on these types of numeric values can also be performed on values whose types are restrictions of these types. This includes user-defined types that appear in a schema, as well as the built-in derived types such as xs:positiveInteger
and xs:unsignedByte
. For a complete list and explanation of these built-in derived types, see Appendix B.
xs:integer
TypeThe type xs:integer
represents a signed integer. The limit on how large an xs:integer
value can be is implementation-defined. Numeric literals that contain only digits (no decimal points or the letter E
or e
) are considered integers, with the type xs:integer
. For example, 25
is an xs:integer
value.
In the type hierarchy, xs:integer
is derived from xs:decimal
. Therefore, anywhere XQuery is expecting an xs:decimal
value, an xs:integer
value may be used in its place because of subtype substitution.
xs:float
and xs:double
TypesThe type xs:float
is patterned after IEEE single-precision 32-bit floating-point numbers, and xs:double
is patterned after IEEE double-precision 64-bit floating-point numbers. The representation of both xs:float
and xs:double
values is a mantissa (a decimal number) followed by the character E
or e
, followed by an exponent, which must be an integer. For example, 3E2
represents 3 × 102, or 300. Numeric literals that contain an E
or e
are considered to have the type xs:double
.
In addition, the following values are represented: INF
(infinity), -INF
(negative infinity), and NaN
(not a number).
xs:numeric
TypeThe type xs:numeric
is a union type that includes xs:double
, xs:float
, and xs:decimal
. It is most commonly used in the signatures of functions that accept any numeric type. For example, the abs
function accepts an argument of type xs:numeric
, which means that an xs:double
, xs:float
, or xs:decimal
value can be passed to it. Because of the function conversion rules, xs:integer
values are also allowed, since xs:integer
is derived from xs:decimal
.
Despite its xs
prefix, this type is not defined by XML Schema. However, it is implicitly supported by XQuery 3.1 processors. In previous versions of XQuery, a special keyword numeric
(no prefix) was used in built-in function signatures instead.
How does a value become “numeric”? As with any type, a value may be assigned one of the numeric types in a number of ways, for example:
It may be selected from an input document that has a schema declaring it to have a numeric type.
It may be a numeric literal value that appears in the query and is not surrounded by quotes. For example, $price > 25.5
compares $price
to the xs:decimal
value 25.5
.
It may be the result of a function that returns a number, such as count($products)
, which returns an xs:integer
.
It may be the result of one of the standard constructor functions, such as:
xs:float("25.5E3")
, which constructs an xs:float
value from a string
xs:decimal($prod/price)
, which constructs an xs:decimal
value from an element
It may be the result of an explicit cast, such as $prod/price cast as xs:decimal
.
If it is untyped, it may be cast automatically when it is passed to a function, such as the sum
function.
A random number can be generated with the random-number-generator
function.
number
FunctionIn addition to the standard type constructors, the number
function is useful for telling the processor to treat a node or atomic value as a number, regardless of its declared type (if any). It returns that argument cast as an xs:double
. If no argument is provided, the number
function uses the context node.
One difference between using the number
function and the xs:double
constructor is that the number
function returns the xs:double
value NaN
in the case that the value cannot be cast to a numeric value, whereas the xs:double
constructor raises an error. Table 17-1 shows some examples that use the number
function.
Example | Return value |
---|---|
number(doc("prices.xml")//prod[1]/price)
|
29.99
|
number(doc("prices.xml")//prod[1]/price/@currency)
|
NaN
|
number("29.99")
|
29.99
|
number( () )
| NaN |
If an operation, such as a comparison or arithmetic operation, is performed on values of two different primitive numeric types, one value is promoted to the type of the other value. Specifically, an xs:decimal
value can be promoted to xs:float
or xs:double
, and an xs:float
value to xs:double
. For example, the expression 1.0 + 1.2E0
adds a decimal number to a floating-point number. The xs:decimal
number (1.0) is promoted to xs:double
before the expression is evaluated.
Numeric type promotion happens automatically in arithmetic expressions and comparison expressions. It is also used in calls to functions that expect numeric values. For example, if a function expects an xs:double
value, you can pass it an xs:decimal
value, and xs:decimal
will be promoted to xs:double
.
In addition to these specific promotion rules, any numeric value can be treated as if it has its type’s base type or any ancestor type. This is known as subtype substitution. For example, if in your schema you define a type myDecimal
that is derived by restriction from xs:decimal
, a myDecimal
value can be added to an xs:decimal
value, returning an xs:decimal
value. This rule also applies to built-in types. For example, since xs:integer
is derived from xs:decimal
, an xs:integer
value can be used anywhere an xs:decimal
value is expected.
Two numeric values can be compared using the general comparison operators: =
, !=
, <
, <=
, >
, and >=
. Values of different numeric types can be compared; one is promoted to the other’s type. Nodes that contain numeric values can also be compared using these operators; in that case, they are atomized to extract their typed values. Table 17-2 shows some examples of comparing numeric values.
Some caution should be used when comparing untyped values using the general comparison operators. When an untyped value is compared to a numeric value (for example, a numeric literal), it is cast to the numeric type. However, when two untyped values are compared, they are treated like strings. This means that, for example, the untyped value 100
would evaluate to less than the value 99
. If you want to compare two untyped values, you must explicitly cast the value(s) to a numeric type, as shown in the fourth example in Table 17-2. The table assumes that prices.xml is untyped, i.e., has not been validated with a schema.
Example | Return value |
---|---|
doc("prices.xml")//prod[3]/discount > 10
|
false
|
doc("prices.xml")//prod[3]/discount gt 10
| Error XPTY0004 |
doc("prices.xml")//prod[3]/discount > doc("prices.xml")//prod[1]/discount
| true (it is comparing the string 3.99 to the string 10.00 ) |
doc("prices.xml")//prod[3]/number(discount) > doc("prices.xml")//prod[1]/number(discount)
| false (it is comparing the number 3.99 to the number 10.00 ) |
3 gt 2
|
true
|
1 = 1.0
|
true
|
xs:float("NaN") = xs:float("NaN")
|
false
|
xs:string(xs:float("NaN")) = "NaN"
|
true
|
Numeric values can also be compared using the value comparison operators: eq
, ne
, lt
, le
, gt
, and ge
. However, the value comparison operators treat every untyped operand like a string, even if the other operand is numeric. This means that if you want a numeric comparison, you have to say so, by using an explicit cast.
The value INF
(positive infinity) is greater than all other values, and -INF
(negative infinity) is less than all other values, but each equals itself. The value NaN
cannot usefully be compared with any other value (including itself) by using comparison operators, because the result of the comparison operation is always false
(unless the operator is !=
, in which case it’s always true
). To determine whether a value is NaN
, you can compare its string value to the string NaN
, as in string($myVal) = "NaN"
. In some functions, such as the distinct-values
function, NaN
is considered to be equal to itself.
The following typical arithmetic operations can be performed on numeric values:
Addition and subtraction using the plus (+) and the minus (-) sign
Integer division (with results truncated) using the idiv
operator
Modulus (the remainder of a division) using the mod
operator
Some of these arithmetic operators can be used on date and time types in addition to numeric types. Date/time arithmetic is described in “Using Arithmetic Operators on Dates, Times, and Durations”.
If the value NaN
is involved in an arithmetic operation (and the other operand is not the empty sequence), the result is always NaN
. If the empty sequence is used in an arithmetic operation, the result is always the empty sequence. It is important to understand that the empty sequence is different from zero. For example, $prod/price - $prod/discount
is equal to the empty sequence (not the value of $prod/price
) if there is no element that matches the $prod/discount
path.
Arithmetic operators cannot accept a sequence of more than one value as one of their operands. For example:
doc("prices.xml")//price * 2
will raise a type error because more than one price
element is returned by the path expression. To perform an arithmetic operation on a sequence of values, you can put parentheses around the arithmetic operation, as in:
doc("prices.xml")//(price * 2)
which will perform the operation on each price element individually and return a sequence of doubled price values. You could get the same results by using a simple map operator, as in:
doc("prices.xml")//price ! (. * 2)
When an operation is performed on two values that are the same type, the result is also a value of that type. For example, adding two xs:integer
values results in an xs:integer
. However, if an operation is performed on values of two different numeric types, one value is promoted to the type of the other value. For example, adding an xs:decimal
to an xs:float
results in an xs:float
. This is true for all arithmetic operations except division of two xs:integer
values, which results in an xs:decimal
, and integer division, which always results in an xs:integer
.
If an untyped value is used in an arithmetic operation, it is automatically cast to xs:double
. For example, when adding the xs:integer
2
to the untyped value 3
, the untyped value is cast to xs:double
, and the result is the xs:double
value 5
. All non-numeric types must be explicitly cast to a numeric type before being used in an arithmetic operation.
Atomization occurs on the operands of arithmetic expressions. This means that the operations can be performed on nodes that contain numeric values, as well as numeric atomic values themselves. For example, an arithmetic expression might be ($price * 2)
if $price
is bound to a single node that contains a numeric value. For more information on atomization, see “Atomization”.
Multiplication and division take precedence over addition and subtraction, as is customary in mathematical expressions. For example, 2 + 3 * 5
is equal to 2 + (3 * 5)
, or 17
, rather than (2 + 3) * 5
, or 25
. The unary minus operator has precedence over all others. For example, - 3 + 5
is equal to 2
, not
-(3 + 5)
, or -8
.
Multiplication and division operators (*, div
, idiv
, and mod
) have equal precedence and are evaluated from left to right. Likewise, addition and subtraction operators have equal precedence and are evaluated from left to right. When in doubt, it is a good practice to use parentheses to delimit expressions for the sake of clarity.
Addition, subtraction, and multiplication are straightforward. Table 17-3 shows some examples.
Example | Return value | Return type |
---|---|---|
5 + 3
|
8
|
xs:integer
|
5 + 3.0
|
8
|
xs:decimal
|
5 + 3.0E0
|
8
|
xs:double
|
5 * 3
|
15
|
xs:integer
|
2 + 3 * 5
|
17
|
xs:integer
|
(2 + 3) * 5
|
25
|
xs:integer
|
- 3 + 5
|
2
|
xs:integer
|
() + 3
|
()
| N/A |
doc("prices.xml")//prod[1]/price+5
|
34.99
|
xs:double
|
doc("prices.xml")//prod[1]/price-5
|
()
| N/A |
doc("prices.xml")//prod[1]/price - 5
|
24.99
|
xs:double
|
Generally, you are not required to put whitespace before or after arithmetic operators. For example, price+5
, with no spaces, is a valid expression meaning “the value of the price
child plus 5.” However, there is a special rule for subtraction. Because the hyphen (-
) is a valid character in XML names, it is necessary to put whitespace after any valid XML name that precedes it. For example, price-5
is interpreted as a single name, so to subtract, you should use price - 5
instead. (The space after the hyphen is technically unnecessary since a name cannot start with a hyphen, but it looks cleaner.)
There are two division operators: div
and idiv
. A slash (/) cannot be used to indicate division because the / operator is used to delimit steps in a path expression. The div
operator is used to perform division of the first operand (the dividend) by the second operand (the divisor). If both numbers being divided are xs:integer
-based values, the result is an xs:decimal
. Otherwise, normal type promotion rules apply, and the type of the result is the same as the type of the operands.
The idiv
operator is used to divide two numbers and obtain the integer portion of the division result. The operands can have any numeric type. If the result of the division is not an even integer, the decimal portion of the number is truncated rather than rounded. For example, (14 div 4)
is equal to 3.5
, but (14 idiv 4)
is equal to 3
.
Table 17-4 shows examples of the div
and idiv
operators.
Attempting to divide by zero will raise error FOAR0001
when using the idiv
operator or when using the div
operator with values of type xs:integer
or xs:decimal
. Dividing by zero using the div
operator on values of type xs:float
or xs:double
will not raise an error; it will return NaN
(if the dividend is 0
), or INF
or -INF
.
When using idiv
, error FOAR0002
will be raised if either operand is NaN
, or if the divisor is INF
or -INF
.
The mod
operator is used to obtain the remainder after dividing the first operand (the dividend) by the second operand (the divisor). For example, (14 mod 4)
equals 2
. The sign of the result is the same as the sign of the first operand. Table 17-5 shows examples of the mod
operator.
Example | Return value | Return type |
---|---|---|
14 mod 4
|
2
|
xs:integer
|
-14 mod 4
|
-2
|
xs:integer
|
14 mod -4
|
2
|
xs:integer
|
14.9 mod 2.1
|
0.2
|
xs:decimal
|
14.5E1 mod 2E1
|
5
|
xs:double
|
xs:float("14") mod 0
|
NaN
|
xs:float
|
xs:double("INF") mod 2
|
NaN
|
xs:double
|
14 mod ()
|
()
| N/A |
14 mod xs:double("INF")
|
14
|
xs:double
|
Special rules, depicted in Table 17-6, apply when one of the operands is INF
, -INF
, or 0
. The cases marked "NaN
or division by zero” depend on the types of the operands. If the operands are of type xs:decimal
or xs:integer
, a “division by zero” error FOAR0001
is raised. Otherwise, NaN
is returned.
Divisor | ______________ | Dividend | ______________ |
---|---|---|---|
INF or -INF | Finite number | 0 or -0 | |
INF or -INF |
NaN
| The dividend | The dividend |
Finite number |
NaN
| The remainder |
0
|
0 or -0 |
NaN
| NaN or division by zero | NaN or division by zero |
XQuery provides built-in functions that operate on numeric values. Some operate on single numeric values. They are summarized in Table 17-7 and covered in more detail in Appendix A. Each of these functions returns a numeric value whose type is the same as its argument, or the empty sequence if the argument is the empty sequence.
Function name | Description |
---|---|
round
| The argument rounded to the nearest whole number |
round-half-to-even
| The argument rounded, with half values rounded to the nearest even number |
floor
| The largest whole number that is not greater than the argument |
ceiling
| The smallest whole number that is not smaller than the argument |
abs
| The absolute value of the argument |
Table 17-8 lists some additional functions that can be used to aggregate or summarize numeric data. These functions accept a sequence of numeric values and return a single numeric result. All of these functions will automatically cast untyped values to xs:double
, so it is not necessary to perform any explicit casting to have the values treated like numbers.
Function name | Description |
---|---|
avg
| The average of a sequence of numbers |
sum
| The sum of a sequence of numbers |
min
| The minimum value of a sequence of numbers |
max
| The maximum value of a sequence of numbers |
Table 17-9 lists the trigonometric and exponential functions that were introduced in version 3.0. They are all in the namespace http://www.w3.org/2005/xpath-functions/math
, commonly associated with the prefix math
.
Function name | Description |
---|---|
math:acos
| The arc cosine of the argument |
math:asin
| The arc sine of the argument |
math:atan
| The arc tangent of the argument |
math:atan2
| The arc tangent based on two arguments |
math:cos
| The cosine of the argument |
math:exp
| The value of ex, where x is the argument |
math:exp10
| The value of 10x, where x is the argument |
math:log
| The natural logarithm of the argument |
math:log10
| The base-ten logarithm of the argument |
math:pi
| An approximation of the mathematical constant π (pi) |
math:pow
| The result of raising the first argument to the power of the second |
math:sin
| The sine of the argument |
math:sqrt
| The non-negative square root of the argument |
math:tan
| The tangent of the argument |
XQuery 3.0 adds some capabilities to format numbers according to specified patterns. They consist of two functions: format-integer
and format-number
. There is additionally a prolog declaration that specifies defaults for formatting decimal numbers.
The format-integer
function formats an integer according to the pattern you specify in the $picture
(second) argument. This can be used to pad integers with leading zeros, insert grouping separators between segments of an integer, or convert integers to other formats like letters or Roman numerals.
The $picture
argument can take one of several forms. It can be a decimal digit pattern that shows optional and mandatory digits, along with grouping separators. For example, the pattern 0000
represents four mandatory digits, so the result will contain at least four digits (padding with leading zeros if necessary). The pattern #,##0
uses the optional digit sign #
, which will not insert leading zeros but will cause a comma to be inserted between each group of three digits that is present.
It can instead be a letter indicating a style of number, for example a
to use lowercase letters, I
to use uppercase Roman numerals, or w
to use lowercase words (“one”, “two”, etc.)
Ordinal numbers (1st, 2nd, etc.) are handled by appending ;o
to the end of the $picture
argument.
Table 17-10 shows some examples. A much more detailed explanation of the syntax of the format-integer
function and its $picture
argument can be found in Appendix A, in the “format-integer” section.
Example | Return value | Purpose |
---|---|---|
format-integer(123, '0000')
|
0123
| Adding leading zeros |
format-integer(12345678, '#,##0')
|
12,345,678
| Inserting commas between groups |
format-integer(4, 'a')
|
d
| Using other numbering systems (letters, Roman numerals, etc.) |
format-integer(4, 'Ww')
|
Four
| Using words |
format-integer(15, '0;o')
|
15th
| Using ordinal numbers |
The format-number
function formats a number according to the pattern in the $picture
argument, and optionally according to the decimal format in the $decimal-format-name
argument. It can be used to format values of any numeric type, but for integers it may be best to use the format-integer
function described in the previous section, because of its support for different number styles.
The $picture
argument can be used to pad numbers with leading or trailing zeros, insert grouping separators between segments of a number, or show numbers with exponents. It is a decimal digit pattern that shows optional and mandatory digits, along with grouping separators. For example, the pattern 0000
represents four mandatory digits, so the result will contain at least four digits (padding with leading zeros if necessary). The pattern #,##0
uses the optional digit sign #
, which will not insert leading zeros but will cause a comma to be inserted between each group of three digits that is present.
The optional $decimal-format-name
argument allows the specification of a decimal format, which controls other aspects of formatting the number, including the characters used for the decimal separator and grouping separator, the value to show when the number is NaN
, and so on.
Table 17-11 shows some examples. A much more detailed explanation of the syntax of the $picture
argument and decimal formats can be found in Appendix A, in the “format-number” section.
Example | Return value | Purpose |
---|---|---|
format-number(12, '0000.00')
|
0012.00
| Padding with leading and trailing zeros |
format-number(12345.6, '#,###.0')
|
12,345.6
| Inserting commas between groups |
format-number(-1, "#,##0.00;(#,##0.00)")
|
(1.00)
| Using different formats for negative numbers |
format-number(0.18, '0%')
|
18%
| Calculating percentages |
format-number(12, 'Number: 0')
|
Number: 12
| Inserting other characters before or after |
A decimal format declaration can be used to override the default values for decimal separators, grouping separators, and other settings used by the format-number
function. For example, the default value for a decimal separator is a period, and the default value for a grouping separator is a comma, so a decimal number might be formatted as 1,000.50
. To reverse these two characters, so that the number would be formatted as 1.000,50
as is customary in some countries, the following default decimal declaration could be used:
declare default decimal-format decimal-separator="," grouping-separator=".";
The syntax of a decimal format declaration is shown in Figure 17-1. It is possible to specify a default decimal declaration, as shown in the previous example, which is used by the two-argument version of the format-number
. If you need to format numbers multiple different ways in the same query, you can also name your decimal format declarations and pass that name to the format-number
function as the third argument.
Any of the available properties shown in the syntax diagram can be specified. Most of the properties must be a single character, except for infinity
and NaN
, which can be strings of any length. The characters must be unique; it is not acceptable to use the same character for two different purposes. If a property is not included in the decimal format declaration, a default value is used.
An explanation of each of the keywords is provided in Table 17-12.