Navigating in XML Documents

As you saw earlier in Table 11.4, the Node interface contains all the standard W3C DOM methods for navigating in a document that we've already used with JavaScript in Chapter 7, including getNextSibling, getPreviousSibling, getFirstChild, getLastChild, and getParentNode. You can put those methods to work here as easily as in Chapter 7. For example, here's the XML document that we navigated through in Chapter 7, meetings.xml:

<?xml version="1.0"?>
<MEETINGS>
   <MEETING TYPE="informal">
       <MEETING_TITLE>XML In The Real World</MEETING_TITLE>
       <MEETING_NUMBER>2079</MEETING_NUMBER>
       <SUBJECT>XML</SUBJECT>
       <DATE>6/1/2002</DATE>
       <PEOPLE>
           <PERSON ATTENDANCE="present">
               <FIRST_NAME>Edward</FIRST_NAME>
               <LAST_NAME>Samson</LAST_NAME>
           </PERSON>
           <PERSON ATTENDANCE="absent">
               <FIRST_NAME>Ernestine</FIRST_NAME>
               <LAST_NAME>Johnson</LAST_NAME>
           </PERSON>
           <PERSON ATTENDANCE="present">
               <FIRST_NAME>Betty</FIRST_NAME>
               <LAST_NAME>Richardson</LAST_NAME>
           </PERSON>
       </PEOPLE>
   </MEETING>
</MEETINGS>

In Chapter 7, we navigated through this document to display the third person's name, and I'll do the same here. The main difference between the XML for Java and the JavaScript implementations in this case is that the XML for Java implementation treats all text as text nodes, including the spacing used to indent meetings.xml. This means that I can use essentially the same code to navigate through the document that we used in Chapter 7, bearing in mind that we must step over the text nodes that contain only indentation. Here's what that looks like in a program named nav.java:

import org.w3c.dom.*;
import org.apache.xerces.parsers.DOMParser;

public class nav
{
    public static void displayDocument(String uri)
    {
        try {
            DOMParser parser = new DOMParser();
            parser.parse(uri);
            Document document = parser.getDocument();

            display(document);

        } catch (Exception e) {
            e.printStackTrace(System.err);
        }
    }

    public static void display(Node node)
    {
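        // Indentation whitespace appears as text nodes, so each step lands
        // on a text node first and then moves on to the element next to it.
        Node textNode;
        Node meetingsNode = ((Document)node).getDocumentElement();
        // The first child of <MEETINGS> is indentation; <MEETING> follows it.
        textNode = meetingsNode.getFirstChild();
        Node meetingNode = textNode.getNextSibling();
        // The last child of <MEETING> is indentation; <PEOPLE> precedes it.
        textNode = meetingNode.getLastChild();
        Node peopleNode = textNode.getPreviousSibling();
        // Likewise, the last <PERSON> sits just before the trailing text node.
        textNode = peopleNode.getLastChild();
        Node personNode = textNode.getPreviousSibling();
        // <FIRST_NAME> follows the leading text node inside <PERSON>.
        textNode = personNode.getFirstChild();
        Node first_nameNode = textNode.getNextSibling();
        // Another text node separates <FIRST_NAME> from <LAST_NAME>.
        textNode = first_nameNode.getNextSibling();
        Node last_nameNode = textNode.getNextSibling();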

        System.out.println("Third name: " +
            first_nameNode.getFirstChild().getNodeValue() + ' '
            + last_nameNode.getFirstChild().getNodeValue());
    }

    public static void main(String args[])
    {
        displayDocument("meetings.xml");
    }
}

And here are the results of this program:

%java nav
Third name: Betty Richardson
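
The stepping-over pattern in nav.java can also be wrapped in small helpers so that each navigation call skips any non-element node automatically. What follows is only a sketch under stated assumptions: the class ElementNav and its methods are hypothetical names, not part of XML for Java, and the sketch parses a shortened inline copy of meetings.xml with the JDK's built-in JAXP parser rather than the Xerces DOMParser so that it runs without extra libraries.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; none of these names come from XML for Java.
class ElementNav
{
    // Shortened, indented copy of meetings.xml (an assumption for the demo).
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n" +
        "<MEETINGS>\n" +
        "   <MEETING TYPE=\"informal\">\n" +
        "       <SUBJECT>XML</SUBJECT>\n" +
        "       <PEOPLE>\n" +
        "           <PERSON ATTENDANCE=\"present\">\n" +
        "               <FIRST_NAME>Edward</FIRST_NAME>\n" +
        "               <LAST_NAME>Samson</LAST_NAME>\n" +
        "           </PERSON>\n" +
        "           <PERSON ATTENDANCE=\"present\">\n" +
        "               <FIRST_NAME>Betty</FIRST_NAME>\n" +
        "               <LAST_NAME>Richardson</LAST_NAME>\n" +
        "           </PERSON>\n" +
        "       </PEOPLE>\n" +
        "   </MEETING>\n" +
        "</MEETINGS>\n";

    // Parse an XML string with the JDK's default JAXP parser.
    static Document parse(String xml)
    {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Starting at 'start', move forward until an element node is found.
    static Node nextElement(Node start)
    {
        Node node = start;
        while (node != null && node.getNodeType() != Node.ELEMENT_NODE) {
            node = node.getNextSibling();
        }
        return node;
    }

    // Starting at 'start', move backward until an element node is found.
    static Node previousElement(Node start)
    {
        Node node = start;
        while (node != null && node.getNodeType() != Node.ELEMENT_NODE) {
            node = node.getPreviousSibling();
        }
        return node;
    }

    static Node firstChildElement(Node parent)
    {
        return nextElement(parent.getFirstChild());
    }

    static Node lastChildElement(Node parent)
    {
        return previousElement(parent.getLastChild());
    }

    // Same route as nav.java, but with no explicit text-node steps.
    static String lastPersonName(Document document)
    {
        Node meetingsNode = document.getDocumentElement();
        Node meetingNode = firstChildElement(meetingsNode);
        Node peopleNode = lastChildElement(meetingNode);
        Node personNode = lastChildElement(peopleNode);
        Node first_nameNode = firstChildElement(personNode);
        Node last_nameNode = nextElement(first_nameNode.getNextSibling());

        return first_nameNode.getFirstChild().getNodeValue() + ' '
            + last_nameNode.getFirstChild().getNodeValue();
    }

    public static void main(String[] args)
    {
        // prints "Last person: Betty Richardson"
        System.out.println("Last person: " + lastPersonName(parse(SAMPLE)));
    }
}
```

Because the helpers test each node's type rather than assuming exactly one text node between elements, the same navigation code should keep working whether or not the document is indented.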

Ignoring Whitespace

You can eliminate the indentation spaces, called "ignorable" whitespace, if you want. To do so, you must give the XML for Java parser some way of checking the grammar of your XML document so that it knows which whitespace it may ignore: whitespace is ignorable only inside elements whose content model allows child elements but no text. You can supply that grammar by giving the document a DTD:

<?xml version="1.0"?>
<!DOCTYPE MEETINGS [
<!ELEMENT MEETINGS (MEETING*)>
<!ELEMENT MEETING (MEETING_TITLE,MEETING_NUMBER,SUBJECT,DATE,PEOPLE*)>
<!ELEMENT MEETING_TITLE (#PCDATA)>
<!ELEMENT MEETING_NUMBER (#PCDATA)>
<!ELEMENT SUBJECT (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT PEOPLE (PERSON*)>
<!ELEMENT PERSON (FIRST_NAME,LAST_NAME)>
<!ATTLIST MEETING
    TYPE CDATA #IMPLIED>
<!ATTLIST PERSON
    ATTENDANCE CDATA #IMPLIED>
]>
<MEETINGS>
   <MEETING TYPE="informal">
       <MEETING_TITLE>XML In The Real World</MEETING_TITLE>
       <MEETING_NUMBER>2079</MEETING_NUMBER>
       <SUBJECT>XML</SUBJECT>
       <DATE>6/1/2002</DATE>
       <PEOPLE>
           <PERSON ATTENDANCE="present">
               <FIRST_NAME>Edward</FIRST_NAME>
               <LAST_NAME>Samson</LAST_NAME>
           </PERSON>
           <PERSON ATTENDANCE="absent">
               <FIRST_NAME>Ernestine</FIRST_NAME>
               <LAST_NAME>Johnson</LAST_NAME>
           </PERSON>
           <PERSON ATTENDANCE="present">
               <FIRST_NAME>Betty</FIRST_NAME>
               <LAST_NAME>Richardson</LAST_NAME>
           </PERSON>
       </PEOPLE>
   </MEETING>
</MEETINGS>

Now I call the parser's setIncludeIgnorableWhitespace method with a value of false so that ignorable whitespace is left out of the DOM tree. The indentation spaces no longer show up as text nodes, which makes the code considerably shorter:

import org.w3c.dom.*;
import org.apache.xerces.parsers.DOMParser;

public class nav
{
    public static void displayDocument(String uri)
    {
        try {
            DOMParser parser = new DOMParser();
            parser.setIncludeIgnorableWhitespace(false);
            parser.parse(uri);
            Document document = parser.getDocument();

            display(document);

        } catch (Exception e) {
            e.printStackTrace(System.err);
        }
    }

    public static void display(Node node)
    {
        // With ignorable whitespace left out of the tree, every call lands
        // directly on an element node; there are no text nodes to step over.
        Node meetingsNode = ((Document)node).getDocumentElement();
        Node meetingNode = meetingsNode.getFirstChild();
        Node peopleNode = meetingNode.getLastChild();
        Node personNode = peopleNode.getLastChild();
        Node first_nameNode = personNode.getFirstChild();
        Node last_nameNode = first_nameNode.getNextSibling();

        System.out.println("Third name: " +
            first_nameNode.getFirstChild().getNodeValue() + ' '
            + last_nameNode.getFirstChild().getNodeValue());
    }

    public static void main(String args[])
    {
        displayDocument("meetings.xml");
    }
}
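
For comparison, here is a sketch of the same idea using the standard JAXP API that ships with the JDK, where the corresponding switch is DocumentBuilderFactory.setIgnoringElementContentWhitespace. This is a swapped-in technique, not the XML for Java call shown above. The class name JaxpNav and the shortened inline document with its reduced DTD are assumptions made for illustration. JAXP documents this setting as relying on the content model, so the sketch also turns validation on.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name; a JAXP variant of the whitespace-free nav program.
class JaxpNav
{
    // Shortened meetings.xml with a reduced DTD (an assumption for the demo).
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n" +
        "<!DOCTYPE MEETINGS [\n" +
        "<!ELEMENT MEETINGS (MEETING*)>\n" +
        "<!ELEMENT MEETING (SUBJECT,PEOPLE*)>\n" +
        "<!ELEMENT SUBJECT (#PCDATA)>\n" +
        "<!ELEMENT PEOPLE (PERSON*)>\n" +
        "<!ELEMENT PERSON (FIRST_NAME,LAST_NAME)>\n" +
        "<!ELEMENT FIRST_NAME (#PCDATA)>\n" +
        "<!ELEMENT LAST_NAME (#PCDATA)>\n" +
        "<!ATTLIST MEETING TYPE CDATA #IMPLIED>\n" +
        "<!ATTLIST PERSON ATTENDANCE CDATA #IMPLIED>\n" +
        "]>\n" +
        "<MEETINGS>\n" +
        "   <MEETING TYPE=\"informal\">\n" +
        "       <SUBJECT>XML</SUBJECT>\n" +
        "       <PEOPLE>\n" +
        "           <PERSON ATTENDANCE=\"present\">\n" +
        "               <FIRST_NAME>Edward</FIRST_NAME>\n" +
        "               <LAST_NAME>Samson</LAST_NAME>\n" +
        "           </PERSON>\n" +
        "           <PERSON ATTENDANCE=\"present\">\n" +
        "               <FIRST_NAME>Betty</FIRST_NAME>\n" +
        "               <LAST_NAME>Richardson</LAST_NAME>\n" +
        "           </PERSON>\n" +
        "       </PEOPLE>\n" +
        "   </MEETING>\n" +
        "</MEETINGS>\n";

    static Document parse(String xml)
    {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Validate against the DTD so the parser knows the content models.
            factory.setValidating(true);
            // Drop whitespace that sits in element-only content.
            factory.setIgnoringElementContentWhitespace(true);

            DocumentBuilder builder = factory.newDocumentBuilder();
            // Treat validation errors as failures instead of ignoring them.
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXParseException
                {
                    throw e;
                }
            });
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Direct navigation, as in the shorter nav.java listing.
    static String lastPersonName(Document document)
    {
        Node meetingsNode = document.getDocumentElement();
        Node meetingNode = meetingsNode.getFirstChild();
        Node peopleNode = meetingNode.getLastChild();
        Node personNode = peopleNode.getLastChild();
        Node first_nameNode = personNode.getFirstChild();
        Node last_nameNode = first_nameNode.getNextSibling();

        return first_nameNode.getFirstChild().getNodeValue() + ' '
            + last_nameNode.getFirstChild().getNodeValue();
    }

    public static void main(String[] args)
    {
        System.out.println("Last person: " + lastPersonName(parse(SAMPLE)));
    }
}
```

Text inside FIRST_NAME and LAST_NAME is unaffected, because those elements are declared as #PCDATA and their content is therefore never treated as ignorable.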
