Now that you are an expert on the syntax of XQuery expressions, let’s look at some more advanced queries. This chapter describes syntax and techniques for some commonly requested query capabilities. You may have these same requirements for your queries, but even if you don’t, this chapter will show you some creative ways to apply XQuery syntax.
Determining positions and generating sequence numbers are sometimes challenging to query authors who are accustomed to procedural programming languages. Because XQuery is a declarative rather than a procedural language, it is not possible to use familiar techniques like counters. In addition, the sorting and filtering of results can interfere with sequence numbers. This section describes some techniques for working with positions and sequence numbers.
Suppose you want to return a list of product names preceded by a sequence number. Your first approach might be to use a variable as a counter, as shown in Example 9-1. However, the results are not what you might expect. This is because the return
clause for each iteration is evaluated in parallel rather than sequentially. This means that you cannot make changes to the value of a variable in one iteration, and expect it to affect the next iteration of the for
clause. At the beginning of every iteration, the $count
variable is equal to 0.
Query
let
$
count
:=
0
for
$
prod
in
doc
(
"catalog.xml"
)//
product
[
@dept
=
(
"ACC"
,
"WMN"
)]
let
$
count
:=
$
count
+
1
return
<p>
{
$
count
}
.
{
data
(
$
prod
/
name
)}
</p>
Results
<p>
1. Fleece Pullover</p>
<p>
1. Floppy Sun Hat</p>
<p>
1. Deluxe Travel Bag</p>
Another temptation might be to use the position
function, as shown in Example 9-2. However, this will return the same results as the previous example. In XQuery, unlike in XSLT, the position
function only has meaning inside a predicate in a path expression. In this case, the value bound to $prod
is no longer in the context of three product items; therefore, it has no relative sequence number within that sequence.
position
functionfor
$
prod
in
doc
(
"catalog.xml"
)//
product
[
@dept
=
(
"ACC"
,
"WMN"
)]
return
<p>
{
$
prod
/
position
()}
.
{
data
(
$
prod
/
name
)}
</p>
Luckily, FLWORs have a special syntax that enables you to define a positional variable in the for
clause. This variable, which is preceded by the keyword at
, is bound to an integer representing the iteration number, as shown in Example 9-3.
for
clauseQuery
for
$
prod
at
$
count
in
doc
(
"catalog.xml"
)//
product
[
@dept
=
(
"ACC"
,
"WMN"
)]
return
<p>
{
$
count
}
.
{
data
(
$
prod
/
name
)}
</p>
Results
<p>
1. Fleece Pullover</p>
<p>
2. Floppy Sun Hat</p>
<p>
3. Deluxe Travel Bag</p>
However, the positional variable in the at
clause does not always work. For example, suppose you wanted to use both where
and order by
clauses in your FLWOR. This interferes with the sequencing, as shown in Example 9-4. First, the numbers are not in ascending order. This is because the results are ordered after the positional number is evaluated. Another problem is that the sequence numbers are 2, 3, and 4 instead of 1, 2, and 3. This is because there are four product
elements returned in the for
clause, and the first one was eliminated by the where
clause.
where
clauseQuery
for
$
prod
at
$
count
in
doc
(
"catalog.xml"
)//
product
where
$
prod
/
@dept
=
(
"ACC"
,
"MEN"
)
order by
$
prod
/
name
return
<p>
{
$
count
}
.
{
data
(
$
prod
/
name
)}
</p>
Results
<p>
4. Cotton Dress Shirt</p>
<p>
3. Deluxe Travel Bag</p>
<p>
2. Floppy Sun Hat</p>
One way to resolve this is by embedding a second FLWOR in the let
clause, as shown in Example 9-5. This embedded FLWOR returns all the products sorted and filtered appropriately. Then, the for
clause contained in the main FLWOR uses the positional variable on the sorted sequence.
where
clauseQuery
let
$
sortedProds
:=
for
$
prod
in
doc
(
"catalog.xml"
)//
product
where
$
prod
/
@dept
=
"ACC"
or
$
prod
/
@dept
=
"MEN"
order by
$
prod
/
name
return
$
prod
for
$
sortedProd
at
$
count
in
$
sortedProds
return
<p>
{
$
count
}
.
{
data
(
$
sortedProd
/
name
)}
</p>
Results
<p>
1. Cotton Dress Shirt</p>
<p>
2. Deluxe Travel Bag</p>
<p>
3. Floppy Sun Hat</p>
count
ClauseStarting in version 3.0, the count
clause offers a new way to address the challenges of keeping track of position. Its simple syntax is shown in Figure 9-1.
Like the at
keyword in a for
clause, the count
clause creates a variable whose value is the iteration number. It is more useful, though, because it can appear anywhere between the first clause and the return
clause of a FLWOR. Example 9-6 shows how you can use it after the where
and order by
clauses, eliminating the problem shown in Example 9-4.
count
clausexquery
version
"3.0"
;
for
$
prod
in
doc
(
"catalog.xml"
)//
product
where
$
prod
/
@dept
=
(
"ACC"
,
"MEN"
)
order by
$
prod
/
name
count
$
count
return
<p>
{
$
count
}
.
{
data
(
$
prod
/
name
)}
</p>
Being able to use the count
clause after an order by
clause but before a where
clause also makes it easier to rank items and only return a certain number. Example 9-7 shows how you might return only the highest prices in the price document. It refers to the variable $count
in the where
to limit it to the two highest values.
count
clause for rankingQuery
xquery
version
"3.0"
;
for
$
prod
in
doc
(
"prices.xml"
)//
prod
order by
$
prod
/
price
/
number
(.)
descending
count
$
count
where
$
count
<=
2
return
<p>
{
$
count
}
.
{
data
(
$
prod
/
price
)}
</p>
Results
<p>
1. 69.99</p>
<p>
2. 39.99</p>
The count
clause can also be used after a group by
clause to count the position of the group within all the groups. This is shown in Example 9-8.
count
clause with groupsQuery
xquery
version
"3.0"
;
for
$
item
in
doc
(
"order.xml"
)//
item
group by
$
d
:=
$
item
/
@dept
count
$
count
return
<p>
{
concat
(
"Group "
,
$
count
,
": "
,
$
d
)}
</p>
Results
<p>
Group 1: ACC</p>
<p>
Group 2: MEN</p>
<p>
Group 3: WMN</p>
Sometimes it is also useful to test whether an item is last in a sequence. Earlier in this chapter, we saw that the position
function is not useful unless it is in a predicate. The same is true of the last
function, which limits its usefulness when testing for the last item, for example, in a FLWOR.
Suppose you want to concatenate the names of the products together, separated by commas. At the end of the last product name, you want to specify a period instead of a comma. (Actually, this particular example would be best accomplished using the string-join
function. However, the example is useful for illustrative purposes.) One approach is to bind the number of items to a variable in a let
clause, as shown in Example 9-9.
Query
<p>
{
let
$
prods
:=
doc
(
"catalog.xml"
)//
product
let
$
numProds
:=
count
(
$
prods
)
for
$
prod
at
$
count
in
$
prods
return
if
(
$
count
=
$
numProds
)
then
concat
(
$
prod
/
name
,
"."
)
else
concat
(
$
prod
/
name
,
","
)
}
</p>
Results
<p>
Fleece Pullover, Floppy Sun Hat, Deluxe Travel Bag, Cotton Dress Shirt.</p>
The $numProds
variable is bound to the number of products. A positional variable, $count
, is used to keep track of the iteration number. When the $count
variable equals the $numProds
variable, you have arrived at the last item in the sequence.
Another approach is to use the is
operator to determine whether the current $prod
element is the last one in the sequence. This query is shown in Example 9-10. In this case, it is not necessary to count the number of items or to use a positional variable. The results are the same as in Example 9-9.
is
operator<p>
{
let
$
prods
:=
doc
(
"catalog.xml"
)//
product
for
$
prod
in
$
prods
return
if
(
$
prod
is
$
prods
[
last
()])
then
concat
(
$
prod
/
name
,
"."
)
else
concat
(
$
prod
/
name
,
", "
)
}
</p>
The window
clause, new in version 3.0, provides a way to group items that are adjacent to each other in more sophisticated ways than are possible with just a group by
clause. Like a for
clause, a window
clause creates an iteration over a sequence of items. However, it creates windows based on starting and ending conditions, and evaluates the return
clause once per window.
Example 9-11 uses the window
clause to iterate through a sequence of integers and create a new window when it encounters an even number (where that number mod
2 is 0). The variable $s
is named so that it can be referenced in the when
expression, which specifies the condition that creates a new window. In the return
clause, the variable $w
refers to all the items in the original input sequence that are in the window. Note that in this example, the value 1 does not appear in the results, because there was no window for it.
window
clauseQuery
xquery
version
"3.0"
;
for
tumbling
window
$
w
in
(
1
,
4
,
3
,
12
,
5
,
13
,
8
)
start
$
s
when
$
s
mod
2
=
0
return
<window>
{
$
w
}
</window>
Results
<window>
4 3</window>
<window>
12 5 13</window>
<window>
8</window>
Creating windows with the window
clause is different from creating groups with the group by
clause in several ways. One is that the order of the items is retained, whereas in grouping the items are rearranged into their groups. In Example 9-11, the integers were kept in their original order, not reorganized into two separate groups containing even and odd numbers. Because you retain the order of the items, you can put constraints on items that come before and after the start or end of the window. For example, you can specify a start condition to create a new window if a particular value changes compared to the previous item.
There are two kinds of window
clauses: tumbling and sliding. The difference is that tumbling windows never overlap each other, while sliding windows might. This is another difference from grouping, where the groups cannot overlap. The syntax of a window
clause is shown in Figure 9-2. The diagram shows the end
condition as optional, but it is only optional when the tumbling
option is chosen. With sliding windows, the end
condition is required.
The start
and end
conditions use a series of window variables. Figure 9-3 shows their syntax.
In all, up to nine variables can be declared in a window
clause. They are listed in Table 9-1. All but the first are optional. It is possible to name these variables anything you want, as long as their names are distinct from each other. For clarity, in all of the examples in this chapter, the variables are named the same thing, as listed in the second column of the table. If all nine variables were used (which is very unusual), a window
clause might look like:
for tumbling window $w in (2, 4, 6, 8, 10) start $s at $s-pos previous $s-prev next $s-next when true() end $e at $e-pos previous $e-prev next $e-next when true() return $s
The window variables are declared so that they can be accessed in the when
expression (either in the start
or end
conditions) or in the return
clause. They are optional; if they are not referenced, they do not need to be specified. In Example 9-11, only the $s
variable was used.
start
and end
ConditionsExample 9-11 only had a start
condition, saying when to start a new window. Every item starting with the start item, up until the next start item, was included in the window. It is also possible to use an end
condition to end the window before the next window starts. With tumbling windows, this usually means that you do not want to include all the items in your results, just certain subsets.
Example 9-12 shows some input XML that has sequences of three elements (propname
, value
, and alt-value
). If you want to create a new window when you reach a propname
element, but want to ignore the alt-value
elements, you could end the window when you encounter a value
element, thus excluding alt-value
elements from the results.
end
conditionQuery
xquery
version
"3.0"
;
let
$
props
:=
<properties>
<propname>
x
</propname>
<value>
xval
</value>
<alt-value>
xval2
</alt-value>
<propname>
y
</propname>
<value>
yval
</value>
<alt-value>
yval2
</alt-value>
</properties>
for
tumbling
window
$
w
in
$
props
/
*
start
$
s
when
name
(
$
s
)
=
'propname'
end
$
e
when
name
(
$
e
)
=
'value'
return
<property>
{
$
w
}
</property>
Results
<property>
<propname>
x</propname>
<value>
xval</value>
</property>
<property>
<propname>
y</propname>
<value>
yval</value>
</property>
The start-item-position and end-item-position variables are useful for creating windows based on the position of the item in the sequence. For example, to create windows of three items, one way would be to test the difference between the start and end positions, as shown in Example 9-13.
Query
xquery
version
"3.0"
;
for
tumbling
window
$
w
in
(
1
,
4
,
3
,
12
,
5
,
13
,
8
)
start
at
$
s-pos
when
true
()
end
at
$
e-pos
when
$
e-pos
-
$
s-pos
=
2
return
<window>
{
$
w
}
</window>
Results
<window>
1 4 3</window>
<window>
12 5 13</window>
<window>
8</window>
In that example, the when
expression in the start
condition is set to true()
, meaning that a new window should always start immediately after another one ends (or at the beginning). In the end
condition, it tests the difference between the positions of the start item and the end item, and if it is equal to two, that means there are three items in the window and it should end the window. In the results, the value 8 appears in a window by itself because that window was started, even though it never satisfied the end condition.
Example 9-14 shows a slightly revised query that uses only end
instead of just end
. This has the effect of removing the last window that contained only 8, because the end condition was never true.
only
keywordQuery
xquery
version
"3.0"
;
for
tumbling
window
$
w
in
(
1
,
4
,
3
,
12
,
5
,
13
,
8
)
start
at
$
s-pos
when
true
()
only
end
at
$
e-pos
when
$
e-pos
-
$
s-pos
=
2
return
<window>
{
$
w
}
</window>
Results
<window>
1 4 3</window>
<window>
12 5 13</window>
The previous
and next
keywords in a window
clause allow you to create windows by testing the previous or next items. This is useful for testing if something has changed. For example, suppose you want to iterate through the order document and display some information from it, but you want to put a header in every time the department changes. You don’t want to group by department, because for whatever reason, it is important that the items retain their original order. An example of this is shown in Example 9-15.
previous
keywordQuery
xquery
version
"3.0"
;
for
tumbling
window
$
w
in
doc
(
"order.xml"
)//
item
start
$
s
previous
$
s-prev
when
$
s
/
@dept
!=
$
s-prev
/
@dept
return
(
<p>
Department
{
data
(
$
s
/
@dept
)}
</p>
,
$
w
)
Results
<p>
Department ACC</p>
<item
dept=
"ACC"
num=
"563"
quantity=
"1"
/>
<item
dept=
"ACC"
num=
"443"
quantity=
"2"
/>
<p>
Department MEN</p>
<item
dept=
"MEN"
num=
"784"
quantity=
"1"
color=
"white"
/>
<item
dept=
"MEN"
num=
"784"
quantity=
"1"
color=
"gray"
/>
<p>
Department WMN</p>
<item
dept=
"WMN"
num=
"557"
quantity=
"1"
color=
"black"
/>
The example uses the keyword previous
and declares a variable $s-prev
that will contain the previous item. The when
expression then checks if the current and previous dept
attributes are equal to each other. If not, a new window is created, which allows the insertion of a p
element at the beginning of the window.
However, you may have noticed that the first item in the order is missing from the results. This is because $s-prev
is the empty sequence when it encounters the first item, and comparing anything to the empty sequence is always false
. This could be remedied by adding an additional test for the existence of $s-prev
, as in:
start $s previous $s-prev when $s/@dept != $s-prev/@dept or not($s-prev)
or alternatively, you could compare their string values, as in:
start $s previous $s-prev when string($s/@dept) != string($s-prev/@dept)
It is possible to specify a similar condition in the end
condition by checking the next item, as shown in Example 9-16, where we use the technique of comparing string values to avoid dropping the last item.
next
keywordQuery
xquery
version
"3.0"
;
for
tumbling
window
$
w
in
doc
(
"order.xml"
)//
item
start
$
s
when
true
()
end
$
e
next
$
e-next
when
string
(
$
e
/
@dept
)
!=
string
(
$
e-next
/
@dept
)
return
(
<p>
Department
{
data
(
$
s
/
@dept
)}
</p>
,
$
w
)
Results
<p>
Department WMN</p>
<item
dept=
"WMN"
num=
"557"
quantity=
"1"
color=
"navy"
/>
<p>
Department ACC</p>
<item
dept=
"ACC"
num=
"563"
quantity=
"1"
/>
<item
dept=
"ACC"
num=
"443"
quantity=
"2"
/>
<p>
Department MEN</p>
<item
dept=
"MEN"
num=
"784"
quantity=
"1"
color=
"white"
/>
<item
dept=
"MEN"
num=
"784"
quantity=
"1"
color=
"gray"
/>
<p>
Department WMN</p>
<item
dept=
"WMN"
num=
"557"
quantity=
"1"
color=
"black"
/>
Sliding windows differ from tumbling windows in that the windows can overlap. This can be useful for things like moving averages, or for data that naturally overlaps, like overlapping events over time. Example 9-14 showed how to create windows with three items each. Using tumbling windows, each item was only in one window. If we change that example slightly to make it a sliding window, as shown in Example 9-17, more windows are created, with overlapping values.
Query
xquery
version
"3.0"
;
for
sliding
window
$
w
in
(
1
,
4
,
3
,
12
,
5
,
13
,
8
)
start
at
$
s-pos
when
true
()
only
end
at
$
e-pos
when
$
e-pos
-
$
s-pos
=
2
return
<window>
{
$
w
}
</window>
Results
<window>
1 4 3</window>
<window>
4 3 12</window>
<window>
3 12 5</window>
<window>
12 5 13</window>
<window>
5 13 8</window>
You could calculate the integer average of each window by changing the return
clause to:
return round(avg($w))
which would result in the sequence:
3 6 7 10 9
Often you will want to include elements from an input document, but with minor modifications. For example, you may wish to eliminate or add attributes, or change their names. However, the XQuery language does not have any special functions or operators that perform these minor modifications. For example, there is no direct syntax that means “select all the product
elements, but leave out their dept
attributes.”
The good news is that you can accomplish these modifications by “reconstructing” the elements. For example, you can write a query to construct a new product
element and include all the children and attributes (except dept
) of the original product
element in the input document. Even better, you can write a user-defined function that uses computed constructors to handle these modifications in the general case. This section describes some common modifications and provides useful functions to handle these cases.
Note that these functions are intended to change the way elements and attributes appear in the results of a query, not to update them in an XML database. To update your XML database, you should check the documentation for your XQuery implementation to find out about its update capabilities, which may consist of support for the XQuery Update Facility, or implementation-specific update functions.
To add an attribute to an element, you could use a function like the one shown in Example 9-18. It takes as arguments a sequence of elements, along with a sequence of attribute names and values, and returns newly constructed elements with those attributes added.
functx:add-attributes
declare
namespace
functx
=
"http://www.functx.com"
;
declare
function
functx:add-attributes
(
$
elements
as
element
()*,
$
attrNames
as
xs:QName
*
,
$
attrValues
as
xs:anyAtomicType
*
)
as
element
()*
{
for
$
element
in
$
elements
return
element
{
node-name
(
$
element
)}
{
for
$
attrName
at
$
seq
in
$
attrNames
return
if
(
$
element
/
@*
[
node-name
(.)
=
$
attrName
])
then
()
else
attribute
{
$
attrName
}
{
$
attrValues
[
$
seq
]},
$
element
/
@*
,
$
element
/
node
()
}
};
The function makes use of computed constructors to create dynamically new elements with the same names as the original elements passed to the function. It also uses a computed constructor to create the attributes, specifying their names and values as expressions.
It then copies the attribute and child nodes from the original elements. The expression $element/node()
is used rather than *
because node()
will return text, processing instruction, and comment nodes in addition to child elements. However, node()
does not return attributes, so a separate expression $element/@*
is used to copy those.
The expression:
functx:add-attributes(doc("catalog.xml")//product, xs:QName("xml:lang"), "en")
uses this function to return all the product
elements from the catalog, with an additional xml:lang="en"
attribute.
Removing attributes also requires the original element to be reconstructed. Example 9-19 shows a function that “removes” attributes from a sequence of elements. It does this by reconstructing the elements, copying the content and all the attributes, except those whose names are specified in the second argument.
functx:remove-attributes
declare
namespace
functx
=
"http://www.functx.com"
;
declare
function
functx:remove-attributes
(
$
elements
as
element
()*,
$
names
as
xs:string
*
)
as
element
()*
{
for
$
element
in
$
elements
return
element
{
node-name
(
$
element
)}
{
$
element
/
@*
[
not
(
name
()
=
$
names
)],
$
element
/
node
()
}
};
The function will accept a sequence of strings that represent attribute names to remove. For example, the expression:
functx:remove-attributes(doc("order.xml")//item, ("quantity", "color"))
returns all the item
elements from the order document, minus the quantity
and color
attributes. The extra parentheses are necessary around the "quantity"
and "color"
strings to combine them into a single argument that is a sequence of two items. Otherwise, they would be considered two separate arguments.
Notice that the predicate [not(name() = $names)]
does not need to explicitly iterate through each of the strings in the $names
sequence. This is because the =
operator, unlike the eq
operator, can be used on lists of values. The comparison will return true for every attribute name that is equal to any one of the strings in $names
. Using the not
function means that the predicate will allow through only the attributes whose names do not match any of the strings in $names
.
It may seem simpler to use the !=
operator rather than the not
function, but that does not have the same meaning. If the predicate were [name() != $names]
, it would return all the attributes whose names don’t match either
quantity
or color
. This means that it will return all the attributes, because no single attribute name will match both strings.
You could go further and remove attributes from an element as well as remove all its descendants. The recursive function functx:remove-attributes-deep
, shown in Example 9-20, accomplishes this.
functx:remove-attributes-deep
declare
namespace
functx
=
"http://www.functx.com"
;
declare
function
functx:remove-attributes-deep
(
$
nodes
as
node
()*,
$
names
as
xs:string
*
)
as
node
()*
{
for
$
node
in
$
nodes
return
if
(
$
node
instance
of
element
())
then
element
{
node-name
(
$
node
)}
{
$
node
/
@*
[
not
(
name
()
=
$
names
)],
functx:remove-attributes-deep
(
$
node
/
node
(),
$
names
)}
else
if
(
$
node
instance
of
document-node
())
then
functx:remove-attributes-deep
(
$
node
/
node
(),
$
names
)
else
$
node
};
This function uses an algorithm similar to the previous example, but it differs in the way it processes the children. Instead of simply copying all the child nodes, the function uses a FLWOR to iterate through them. If a child is an element, it recursively calls itself to process that child. If the node is a document node, it calls functx:remove-attributes-deep
on its child (root) element. If the child is not an element or document node (for example, if it is a text or processing-instruction node), the function returns it as is. Iterating through all of the child nodes in a FLWOR preserves their original order in the results.
The three previous functions relate to adding and removing attributes, and they could apply equally to child elements. For example, the function in Example 9-21 could be used to eliminate certain elements from a document by name. It is very similar to functx:remove-attributes-deep
, in Example 9-20, except that it removes child elements instead of the attributes.
functx:remove-elements-deep
declare
namespace
functx
=
"http://www.functx.com"
;
declare
function
functx:remove-elements-deep
(
$
nodes
as
node
()*,
$
names
as
xs:string
*
)
as
node
()*
{
for
$
node
in
$
nodes
return
if
(
$
node
instance
of
element
())
then
if
(
name
(
$
node
)
=
$
names
)
then
()
else
element
{
node-name
(
$
node
)}
{
$
node
/
@*
,
functx:remove-elements-deep
(
$
node
/
node
(),
$
names
)}
else
if
(
$
node
instance
of
document-node
())
then
functx:remove-elements-deep
(
$
node
/
node
(),
$
names
)
else
$
node
};
Another common use case is to remove certain elements but keep their contents. For example, if you want to remove any inline formatting from the desc
element in the product catalog, you will want to remove any i
or b
tags, but keep the content of those elements. You can use the function shown in Example 9-22 for that.
functx:remove-elements-not-contents
declare
namespace
functx
=
"http://www.functx.com"
;
declare
function
functx:remove-elements-not-contents
(
$
nodes
as
node
()*,
$
names
as
xs:string
*
)
as
node
()*
{
for
$
node
in
$
nodes
return
if
(
$
node
instance
of
element
())
then
if
(
name
(
$
node
)
=
$
names
)
then
functx:remove-elements-not-contents
(
$
node
/
node
(),
$
names
)
else
element
{
node-name
(
$
node
)}
{
$
node
/
@*
,
functx:remove-elements-not-contents
(
$
node
/
node
(),
$
names
)}
else
if
(
$
node
instance
of
document-node
())
then
functx:remove-elements-not-contents
(
$
node
/
node
(),
$
names
)
else
$
node
};
Another minor modification is to change the names of certain elements. This could be useful for implementing changes to an XML vocabulary or for language translation.
For example, suppose you want to preserve the structure of the order document but want to change the name order
to purchaseOrder
, and the name item
to purchasedItem
. Since the document is very simple, this could be done in a hardcoded way by using direct constructors. However, there may be situations where the document is more complex and you do not know exactly where these elements appear, or they appear in a variety of places in different document types.
You can modify names more flexibly using the function shown in Example 9-23. The function provided here is slightly altered from the version in the FunctX function library in order to make it more clear for educational purposes, but the result is the same.
functx:change-element-names-deep
declare
namespace
functx
=
"http://www.functx.com"
;
declare
function
functx:change-element-names-deep
(
$
nodes
as
node
()*,
$
oldNames
as
xs:QName
*
,
$
newNames
as
xs:QName
*
)
as
node
()*
{
if
(
count
(
$
oldNames
)
!=
count
(
$
newNames
))
then
error
(
xs:QName
(
"Different_Number_Of_Names"
))
else
for
$
node
in
$
nodes
return
if
(
$
node
instance
of
element
())
then
let
$
newName
:=
if
(
node-name
(
$
node
)
=
$
oldNames
)
then
$
newNames
[
index-of
(
$
oldNames
,
node-name
(
$
node
))]
else
node-name
(
$
node
)
return
element
{
$
newName
}
{
$
node
/
@*
,
functx:change-element-names-deep
(
$
node
/
node
(),
$
oldNames
,
$
newNames
)}
else
$
node
};
Like the functx:remove-attributes-deep
function, functx:change-element-names-deep
calls itself recursively to traverse the entire XML document. Every time it finds an element, it figures out what its new name should be and binds it to the variable $newName
. If it finds its original name in the $oldNames
sequence, it selects the name in $newNames
that is in the same position. Otherwise, it uses the original name. It then reconstructs the element using the new name, copies all of its attributes, and recursively calls itself to process all of its children.
Example 9-24 calls the functx:change-element-names-deep
function with sequences of old and new names. It returns the same data and basic structure as the order.xml document, except with the two names changed.
functx:change-element-names-deep
functionQuery
declare
namespace
functx
=
"http://www.functx.com"
;
let
$
order
:=
doc
(
"order.xml"
)/
order
let
$
oldNames
:=
(
xs:QName
(
"order"
),
xs:QName
(
"item"
))
let
$
newNames
:=
(
xs:QName
(
"purchaseOrder"
),
xs:QName
(
"purchasedItem"
))
return
functx:change-element-names-deep
(
$
order
,
$
oldNames
,
$
newNames
)
Results
<purchaseOrder
num=
"00299432"
date=
"2015-09-15"
cust=
"0221A"
>
<purchasedItem
dept=
"WMN"
num=
"557"
quantity=
"1"
color=
"navy"
/>
<purchasedItem
dept=
"ACC"
num=
"563"
quantity=
"1"
/>
<purchasedItem
dept=
"ACC"
num=
"443"
quantity=
"2"
/>
<purchasedItem
dept=
"MEN"
num=
"784"
quantity=
"1"
color=
"white"
/>
<purchasedItem
dept=
"MEN"
num=
"784"
quantity=
"1"
color=
"gray"
/>
<purchasedItem
dept=
"WMN"
num=
"557"
quantity=
"1"
color=
"black"
/>
</purchaseOrder>
Your query results may consist of several FLWORs or other expressions that each return a result sequence. In addition, the sequences you use in your for
and let
clauses may be composed from more than one sequence.
There are four ways to combine two or more sequences to form a third sequence. They differ in which items are selected, whether their order is affected, whether duplicates are eliminated, and whether non-nodes are allowed in the sequences.
The first way to merge two sequences is simply to create a third sequence that is the concatenation of the first two. This is known as a sequence constructor, and it uses parentheses and commas to concatenate two sequences together. For example:
let $prods := doc("catalog.xml")//product let $items := doc("order.xml")//item return ($prods, $items)
returns a sequence that is the concatenation of two other sequences, $prods
and $items
. The items in $prods
are first in the sequence, then the items in $items
, in the order they appear in that sequence. No attempt is made to eliminate duplicates or sort the items in any way.
Note that concatenation is the only way to combine sequences that contain items that are not nodes; union
, intersect
, and except
work on sequences that contain nodes only.
union
ExpressionAnother approach to combining sequences of nodes is via a union
expression, which is indicated by the keyword union
or the vertical bar character (|). The two operators have the exact same meaning. The resulting nodes are rearranged into document order. Path expressions such as:
doc("catalog.xml")//product/(number | name)
use the vertical bar operator to select the union of the number
children and the name
children of product
. An equivalent alternative is to use the vertical bar operator to separate two entire multistep path expressions, as in:
doc("catalog.xml")//product/number | doc("catalog.xml")//product/name
Unlike simple concatenation, using a union expression eliminates duplicate nodes. Duplicates are determined based on node identity, not typed value or string value. That means an expression like:
doc("catalog.xml")//product/@dept | doc("order.xml")//item/@dept
results in all 10 dept
attributes (four from catalog.xml and six from order.xml), because it does not eliminate the duplicate department values.
A union eliminates duplicate nodes not just between the sequences, but also within either of the original sequences.
intersect
ExpressionAn intersect
expression results in a sequence that contains only those nodes that are in both of the original sequences. As with union
expressions, duplicate nodes (based on node identity) are eliminated, and the resulting nodes are rearranged into document order. For example, the expression:
let $prods := doc("catalog.xml")//product
return $prods[@dept = "ACC"] intersect
$prods[number = 443]
returns the third product
element in the catalog.
except
ExpressionAn except
expression results in a sequence that contains only nodes that are in the first sequence, but not in the second. As with union
expressions, duplicate values (based on node identity) are eliminated, and the resulting nodes are rearranged into document order. For example, the expression:
doc("catalog.xml")//product/(* except
number)
returns all the element children of product
except for number
elements. The parentheses are required because the slash (/) operator has precedence over the except
operator. Without the parentheses, it would be interpreted as two separate path expressions: doc("catalog.xml")//product/*
and number
. This is equally true for the other operators in this section.
When we think of XML structures, we tend to think of the input documents and the results. However, XQuery also allows you to create intermediate XML structures in your queries that are not included in the results. This can be useful for many reasons, among them creating lookup tables and narrowing down input documents to reduce complexity or improve performance.
Suppose you want to create a summary of the product catalog that lists the departments. However, you would like to convert the department codes to more descriptive names. You could use the query shown in Example 9-25 to accomplish this.
Query
let
$
cat
:=
doc
(
"catalog.xml"
)/
catalog
for
$
dept
in
distinct-values
(
$
cat
/
product
/
@dept
)
return
<li>
Department:
{
if
(
$
dept
=
"ACC"
)
then
"Accessories"
else
if
(
$
dept
=
"MEN"
)
then
"Menswear"
else
if
(
$
dept
=
"WMN"
)
then
"Womens"
else
()
}
(
{
$
dept
}
)
</li>
Results
<li>
Department: Womens (WMN)</li>
<li>
Department: Accessories (ACC)</li>
<li>
Department: Menswear (MEN)</li>
This gives the desired results, namely a descriptive name for the department, with the department code in parentheses. However, the query is somewhat cluttered, and anyone maintaining the query would have to be careful to insert any new department codes in the right place, using the right XQuery syntax. A more elegant solution is shown in Example 9-26, which uses an intermediate XML structure as a lookup table. It has the same results as the previous example.
let
$
deptNames
:=
<deptNames>
<dept
code
=
"
ACC
"
name
=
"
Accessories
"
/>
<dept
code
=
"
MEN
"
name
=
"
Menswear
"
/>
<dept
code
=
"
WMN
"
name
=
"
Womens
"
/>
</deptNames>
let
$
cat
:=
doc
(
"catalog.xml"
)/
catalog
for
$
dept
in
distinct-values
(
$
cat
/
product
/
@dept
)
return
<li>
Department:
{
data
(
$
deptNames
/
dept
[
@code
=
$
dept
]/
@name
)
}
(
{
$
dept
}
)
</li>
Just as you can use path expressions on the input documents, you can use them to traverse the intermediate XML structure. So, the expression:
$deptNames/dept[@code = $dept]/@name
traverses the deptNames
structure looking for the department name where the code matches the department code in question.
This solution is easier to maintain and it makes the mappings more obvious. Of course, if this is a general-purpose and unchanging lookup table that might be used in many queries, it can alternatively be stored as a separate XML document that is referenced using the doc
function.
In the previous example, the intermediate XML was hardcoded into the query. You can also build a temporary XML structure in the query, based on values from the input data. This can be handy to reduce the complexity of an input document before performing further querying or transformation on it.
Suppose you want to perform a join on the product catalog and the order. With the results of the join, you want to create an HTML table that formats the information. While this can probably be done in one FLWOR, it may be easier to write (and read!) a query that does it in two steps.
Such a query is shown in Example 9-27. It constructs a series of item
elements that have attributes representing all the data items from the join. It binds the variable $tempResults
to the six resulting item
elements by using a let
clause. Within the main return
clause, it uses an embedded FLWOR to iterate through the item
elements and turn them into table rows (tr
elements).
Query
let
$
tempResults
:=
for
$
item
in
doc
(
"order.xml"
)//
item
,
$
prod
in
doc
(
"catalog.xml"
)//
product
where
$
item
/
@num
=
$
prod
/
number
return
<item
num
=
"{
$
item
/
@num
}"
name
=
"{
$
prod
/
name
}"
color
=
"{
$
item
/
@color
}"
quant
=
"{
$
item
/
@quantity
}"
/>
return
<table>
<tr>
<th>
#
</th><th>
Name
</th><th>
Color
</th><th>
Quan
</th>
</tr>
{
for
$
lineItem
in
$
tempResults
return
<tr>
<td>
{
data
(
$
lineItem
/
@num
)}
</td>
<td>
{
data
(
$
lineItem
/
@name
)}
</td>
<td>
{
data
(
$
lineItem
/
@color
)}
</td>
<td>
{
data
(
$
lineItem
/
@quant
)}
</td>
</tr>
}
</table>
Value of $tempResults
<item
num=
"557"
color=
"navy"
name=
"Fleece Pullover"
quant=
"1"
/>
<item
num=
"563"
color=
""
name=
"Floppy Sun Hat"
quant=
"1"
/>
<item
num=
"443"
color=
""
name=
"Deluxe Travel Bag"
quant=
"2"
/>
<item
num=
"784"
color=
"white"
name=
"Cotton Dress Shirt"
quant=
"1"
/>
<item
num=
"784"
color=
"gray"
name=
"Cotton Dress Shirt"
quant=
"1"
/>
<item
num=
"557"
color=
"black"
name=
"Fleece Pullover"
quant=
"1"
/>
Partial Results
<table>
<tr>
<th>
#</th><th>
Name</th><th>
Color</th><th>
Quan</th>
</tr>
<tr>
<td>
557</td>
<td>
Fleece Pullover</td>
<td>
navy</td>
<td>
1</td>
</tr>
<!-- ... -->
</table>
In this case, the example input documents are fairly simple, so this approach may be overkill. However, as documents and the joins between them become more complex, intermediate XML results can be very useful in simplifying queries.
This technique is sometimes called pipelining. Rather than putting the whole pipeline in one query, you could also consider chaining together a sequence of separate queries. This makes it easier to reuse each of the queries in different pipelines. A good use case for pipelining is to handle variants of the input vocabulary—for example, different flavors of RSS. Rather than have one query that handles all the variations, you can build a pipeline in which you first convert a particular variant to your chosen “canonical form,” and then operate on that. It’s also possible to combine XQuery with other technologies (such as XSLT) by using pipelines, where different technologies are used to implement different steps in the pipeline.