Chapter 11. XML Processing

There are many third-party Java tools for the direct manipulation of XML. In addition, many new Java APIs are currently in early release as part of the Java Community Process. However, Java version 1.4 includes only basic XML processing features, providing support for the Document Object Model (DOM), Simple API for XML (SAX), and Extensible Stylesheet Language Transformations (XSLT) standards, detailed in Table 11-1.

Table 11-1. The Three XML Standards Supported by Java

Standard: Description

Document Object Model Level 2: A standardized interface for accessing and updating XML documents as a hierarchical tree of nodes. See www.w3.org/DOM/ for details.

Simple API for XML 2.0: An event-driven interface for the efficient parsing of XML documents. See www.saxproject.org/ for details.

XSL Transformations 1.0: A language for transforming XML documents into other XML documents. See www.w3.org/Style/XSL/ for details.
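For readers who want a reminder of the Java side of the comparison, the following is a minimal sketch of the three standards in Table 11-1 as exposed through the JAXP packages (javax.xml.parsers and javax.xml.transform). The sample document, the stylesheet, and the class and method names are assumptions made for illustration; each method simply counts the book elements in its own way.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class JaxpSketch {
    // Hypothetical sample document, used only for illustration.
    static final String XML =
        "<library><book id=\"1\"/><book id=\"2\"/></library>";

    // Hypothetical stylesheet that emits the number of book elements as text.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:value-of select=\"count(//book)\"/>"
        + "</xsl:template></xsl:stylesheet>";

    // DOM: load the whole document into a tree, then query the tree.
    static int countWithDom() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        return doc.getElementsByTagName("book").getLength();
    }

    // SAX: receive a callback for each element as the parser streams the input.
    static int countWithSax() throws Exception {
        final int[] count = {0};
        SAXParserFactory.newInstance().newSAXParser().parse(
            new InputSource(new StringReader(XML)),
            new DefaultHandler() {
                public void startElement(String uri, String localName,
                                         String qName, Attributes attrs) {
                    if ("book".equals(qName)) {
                        count[0]++;
                    }
                }
            });
        return count[0];
    }

    // XSLT: apply the stylesheet and capture the transformed output.
    static String countWithXslt() throws Exception {
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)))
            .transform(new StreamSource(new StringReader(XML)),
                       new StreamResult(out));
        return out.toString().trim();
    }

    // Runs all three approaches and joins the results for easy comparison.
    static String summary() {
        try {
            return countWithDom() + " " + countWithSax() + " " + countWithXslt();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

The DOM version holds the entire document in memory, whereas the SAX version keeps only a counter, which is the trade-off the table descriptions hint at.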

This chapter explores the Microsoft .NET XML processing features that are comparable to those provided by Java version 1.4. We will also discuss features for which Java version 1.4 provides no equivalent, including support for XPath, and a mechanism that simplifies the writing of well-formed XML documents. Throughout this chapter, we assume the reader has knowledge of XML and related technologies.

In addition to explicit manipulation, the designers of .NET have factored implicit XML support into many aspects of the .NET Framework. Table 11-2 highlights the extent of XML integration within the .NET Framework and includes references to relevant chapters.

Table 11-2. XML Integration into .NET

Feature: Usage (Reference)

Serialization: Objects can be serialized as Simple Object Access Protocol (SOAP) messages using a SoapFormatter. (Chapter 10)

ADO.NET: Communication with SQL databases, database schema, XPath queries on data sets. (Chapter 16)

Remoting: HttpChannels transmit data encoded as SOAP messages. (Chapter 15)

XML Web services: SOAP, Web Service Definition Language (WSDL), Universal Description, Discovery, and Integration (UDDI). (Chapter 19)

Configuration files: Configuration files affecting many parts of the .NET Framework are specified using XML. (Appendix C)

XmlNameTable

One of the most frequent tasks when parsing XML is comparing strings, and comparing the contents of two strings is significantly more expensive than comparing two object references. The .NET XML classes such as XmlDocument, XmlReader, and XPathNavigator (all discussed later in this chapter) use the System.Xml.NameTable class, which derives from System.Xml.XmlNameTable, to reduce the overhead of repeatedly comparing string values.

The NameTable class maintains a table of strings stored as objects. As strings are parsed from an XML document, the parser calls the NameTable.Add method. Add takes a String or char array, places it in the string table, and returns a reference to the contained object. If the string is already in the table, Add returns the reference to the previously created object instead of storing a second copy. String values added to the NameTable are referred to as atomized strings.

The NameTable.Get method returns a reference to an atomized string based on the String or char array argument provided; null is returned if the string has not been atomized.
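To make the atomization idea concrete for Java readers, here is a minimal sketch of a table with the Add and Get behavior described above. AtomTable is a hypothetical Java analog written for illustration, not the .NET NameTable class, and it makes no claim about how NameTable is implemented internally.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Java analog of an atomized string table (not the .NET class).
class AtomTable {
    private final Map<String, String> atoms = new HashMap<>();

    // Returns the stored instance for this value, storing it first if absent.
    String add(String value) {
        String existing = atoms.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }

    // Returns the stored instance, or null if the value was never added.
    String get(String value) {
        return atoms.get(value);
    }

    public static void main(String[] args) {
        AtomTable table = new AtomTable();
        // new String(...) forces two distinct objects with equal contents.
        String first = table.add(new String("book"));
        String second = table.add(new String("book"));
        System.out.println(first == second);            // prints true
        System.out.println(table.get("title") == null); // prints true
    }
}
```

Once both names have passed through the table, a parser can compare them with a reference check (== in Java) rather than a character-by-character comparison, which is the saving that atomized strings are meant to provide.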

Classes such as XmlDocument and XmlReader provide a NameTable property that returns a reference to the internal NameTable. The programmer can pass an existing NameTable instance to the constructor of other XML classes. A populated NameTable can be used to prime a new instance of an XML class and improve efficiency where the same element and attribute names recur across XML documents.
