Introduction to XPath

XPath, short for XML Path Language, is a language for retrieving information from an XML document. XPath finds information by navigating through the elements and attributes in the document. You access an XML node by specifying an expression that refers to the node by its position (either relative or absolute), type, content, or other criteria.

Note

XPath is a standard from the World Wide Web Consortium (W3C), a body that develops specifications, guidelines, software, and tools for the Web. Currently at version 2.0, the XPath specification can be found at http://www.w3.org/TR/xpath20/.


Consider the XML document in Listing 8.1.

Listing 8.1. The library.xml document
<library>
    <book available="true">
        <isbn>1234567890</isbn>
        <title>In the Middle of the Night</title>
        <author>Jane Volvic</author>
        <pageCount>256</pageCount>
        <price>19.95</price>
    </book>
    <book available="false">
        <isbn>1234567893</isbn>
        <title>The Man who Never Grows Old</title>
        <author>Anthony Hophop</author>
        <pageCount>400</pageCount>
        <price>29.95</price>
    </book>
    <book available="true">
        <isbn>1234567894</isbn>
        <title>No Excuses</title>
        <author>Sherma Shaun</author>
        <pageCount>160</pageCount>
        <price>9.95</price>
    </book>
</library>

The root node of the XML document is library. It describes the book collection in a library. There are three books, each specified using the book element. Each book has the isbn, title, author, pageCount, and price subelements. In addition, the available attribute of the book element indicates whether or not the book is available in the library.

In XPath you use the forward slash / to access the root node. Therefore, /library references the root node in the library.xml document. To refer to the first book element under <library>, you specify the name of the child node, followed by an index number in square brackets. Index numbers start at 1, not 0. For example, the following expression refers to the first book child element of <library>:

/library/book[1]

To refer to all book elements under <library>, remove the index:

/library/book

The following expression references the first author of the first book in the XML document:

/library/book[1]/author[1]
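
As a rough illustration, the three expressions above can be tried out with Python's standard xml.etree.ElementTree module. This is only a sketch: ElementTree supports a limited subset of XPath, and its paths are written relative to the element you search from, so book[1] here stands in for /library/book[1]. The XML is an abbreviated copy of Listing 8.1, and the variable names are chosen for this example only.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of Listing 8.1 (only the elements used below).
LIBRARY_XML = """
<library>
    <book available="true">
        <title>In the Middle of the Night</title>
        <author>Jane Volvic</author>
    </book>
    <book available="false">
        <title>The Man who Never Grows Old</title>
        <author>Anthony Hophop</author>
    </book>
    <book available="true">
        <title>No Excuses</title>
        <author>Sherma Shaun</author>
    </book>
</library>
"""

# fromstring returns the <library> element, so every path below is
# relative to it rather than starting with /library.
library = ET.fromstring(LIBRARY_XML)

first_book = library.find("book[1]")               # like /library/book[1]
all_books = library.findall("book")                # like /library/book
first_author = library.find("book[1]/author[1]")   # like /library/book[1]/author[1]

print(first_book.find("title").text)
print(len(all_books))
print(first_author.text)
```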

To reach a node without qualifying the whole path, you can use the // characters, followed by the name of the element. For instance, this expression represents all book elements in the XML document, regardless of where they are:

//book

In the case of the XML document in Listing 8.1, //book returns the same nodes as /library/book. However, this is not always the case. If <book> can appear anywhere other than directly under <library>, //book includes those book elements too, while /library/book returns only the book elements directly under <library>.
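
The difference shows up as soon as a book element is nested more deeply. The sketch below uses a made-up variant of the document: the <shelf> element is hypothetical and does not appear in Listing 8.1. It relies on Python's xml.etree.ElementTree, where .//book plays the role of //book and book plays the role of /library/book when searching from the <library> element.

```python
import xml.etree.ElementTree as ET

# Hypothetical variant of library.xml: the <shelf> element is not part of
# Listing 8.1 and exists only to place one <book> deeper in the tree.
NESTED_XML = """
<library>
    <book><title>No Excuses</title></book>
    <shelf>
        <book><title>In the Middle of the Night</title></book>
    </shelf>
</library>
"""

library = ET.fromstring(NESTED_XML)

# Only the book elements that are direct children of <library>.
direct_books = library.findall("book")

# Every book element at any depth below <library>.
all_books = library.findall(".//book")

print(len(direct_books))
print(len(all_books))
```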

In addition to specifying a node position, you can search for nodes by attribute, using the @ character. For example, the following expression refers to all book elements whose available attribute is true.

//book[@available='true']

Because the value of the attribute is not specified, the following expression refers to all book elements that have the available attribute, whatever its value.

//book[@available]
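
Both attribute predicates happen to be part of the XPath subset that Python's xml.etree.ElementTree understands, so they can be checked against a trimmed-down copy of Listing 8.1. Treat this as a sketch: .//book is the ElementTree spelling of //book when searching from the <library> element.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of Listing 8.1: only the titles and the attribute.
LIBRARY_XML = """
<library>
    <book available="true"><title>In the Middle of the Night</title></book>
    <book available="false"><title>The Man who Never Grows Old</title></book>
    <book available="true"><title>No Excuses</title></book>
</library>
"""

library = ET.fromstring(LIBRARY_XML)

# Books whose available attribute equals 'true'.
available_books = library.findall(".//book[@available='true']")

# Books that carry an available attribute, whatever its value.
books_with_attribute = library.findall(".//book[@available]")

print([book.findtext("title") for book in available_books])
print(len(books_with_attribute))
```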

XPath also allows you to pass search criteria in your expression. For example, the following expression returns the titles of the books priced under $20.

/library/book[price<20.00]/title
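
To see which titles that expression selects, the same filter can be written out by hand. Python's xml.etree.ElementTree does not evaluate comparison predicates such as price<20.00, so this sketch applies the price test in ordinary Python code instead of in the path. It is an equivalent filter for illustration, not an XPath evaluation, and it uses a trimmed-down copy of Listing 8.1.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of Listing 8.1: only the titles and prices.
LIBRARY_XML = """
<library>
    <book><title>In the Middle of the Night</title><price>19.95</price></book>
    <book><title>The Man who Never Grows Old</title><price>29.95</price></book>
    <book><title>No Excuses</title><price>9.95</price></book>
</library>
"""

library = ET.fromstring(LIBRARY_XML)

# Same condition as the predicate [price<20.00], applied in Python
# because ElementTree has no numeric comparison in its path syntax.
cheap_titles = [
    book.findtext("title")
    for book in library.findall("book")
    if float(book.findtext("price")) < 20.00
]

print(cheap_titles)
```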

XPath has built-in functions that provide another way of retrieving information. Here are some of the functions:

last()

Returns the index number of the last element in the current set of nodes.

position()

Returns the index number of the current element.

There is no corresponding first() function, because the first element is always at index 1.

For example, the following expression returns the first book element under <library>:

/library/book[position() = 1]

And this one returns the last book element under <library>:

/library/book[last()]

This one, on the other hand, refers to the second book element in the document.

//book[position() = 2]

The following expression specifies the first two book elements in the document:

//book[position() < 3]
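
The last() predicate can be tried with Python's xml.etree.ElementTree, which accepts [last()] in its limited path syntax. It does not accept position() comparisons, so in this sketch the first two book elements are taken with a list slice instead, as a stand-in for [position() < 3]. The XML is a trimmed-down copy of Listing 8.1.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of Listing 8.1: only the titles.
LIBRARY_XML = """
<library>
    <book><title>In the Middle of the Night</title></book>
    <book><title>The Man who Never Grows Old</title></book>
    <book><title>No Excuses</title></book>
</library>
"""

library = ET.fromstring(LIBRARY_XML)

# Equivalent of /library/book[last()], searched from <library>.
last_book = library.find("book[last()]")

# Stand-in for [position() < 3]: findall returns the books in document
# order, so the first two entries are the ones at positions 1 and 2.
first_two_books = library.findall("book")[:2]

print(last_book.findtext("title"))
print([book.findtext("title") for book in first_two_books])
```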
