Chapter 10. Using XML in Visual Basic 2008

This chapter describes how you can generate and manipulate Extensible Markup Language (XML) using Visual Basic 2008. Using XML in Visual Basic is a vast subject, more than a single chapter can cover. The .NET Framework exposes five XML-specific namespaces that contain over a hundred different classes. In addition, dozens of other classes support and implement XML-related technologies, such as ADO.NET, SQL Server, and BizTalk. Consequently, this chapter focuses on the general concepts and the most important classes.

Visual Basic relies on the classes exposed in the following XML-related namespaces to transform, manipulate, and stream XML documents:

  • System.Xml provides core support for a variety of XML standards, including DTD, namespace, DOM, XDR, XPath, XSLT, and SOAP.

  • System.Xml.Serialization provides the objects used to transform objects to and from XML documents or streams using serialization.

  • System.Xml.Schema provides a set of objects that enable schemas to be loaded, created, and streamed. This support is achieved using a suite of objects that support in-memory manipulation of the entities that compose an XML schema.

  • System.Xml.XPath provides a parser and evaluation engine for the XML Path language (XPath).

  • System.Xml.Xsl provides the objects necessary when working with Extensible Stylesheet Language (XSL) and XSL Transformations (XSLT).

The XML-related technologies utilized by Visual Basic include other technologies that generate XML documents and enable XML documents to be managed as a data source:

  • ADO — The legacy COM objects provided by ADO can generate XML documents in stream or file form. ADO can also retrieve a previously persisted XML document and manipulate it. (Although ADO is not used in this chapter, ADO and other legacy COM APIs can be accessed seamlessly from Visual Basic.)

  • ADO.NET — This uses XML as its underlying data representation: The in-memory data representation of the ADO.NET DataSet object is XML; the results of data queries are represented as XML documents; XML can be imported into a DataSet and exported from a DataSet. (ADO.NET is covered in Chapter 9.)

  • SQL Server 2000 — XML-specific features were added to SQL Server 2000 (FOR XML queries to retrieve XML documents and OPENXML to represent an XML document as a rowset). Visual Basic can use ADO.NET to access SQL Server's XML-specific features (the documents generated and consumed by SQL Server can then be manipulated programmatically). Recently, Microsoft also released SQLXML, which provides an SQL Server 2000 database with some excellent XML capabilities, such as querying a database using XQuery, getting back XML result sets from a database, working with data just as if it were XML, taking huge XML files and having SQLXML convert them to relational data, and much more. SQLXML enables you to perform these functions and more via a set of managed .NET classes. You can download SQLXML free from the Microsoft SQLXML website at http://msdn2.microsoft.com/aa286527.aspx.

  • SQL Server 2005 — SQL Server 2005 was designed with XML in mind and can natively understand XML, because XML support is built into the underlying foundation of the database. It includes an XML data type that also supports XSD schema validation. The capability to query and understand XML documents is a valuable addition to this database server. SQL Server 2005 also comes in a lightweight (and free) version called SQL Server Express Edition.

  • SQL Server 2008 — The latest edition of SQL Server, version 2008, builds on the SQL Server 2005 release and adds an improved XSD schema validation process as well as enhanced support for XQuery.

This chapter makes sense of this range of technologies by introducing some basic XML concepts and demonstrating how Visual Basic, in conjunction with the .NET Framework, can make use of XML. Specifically, in this chapter you will do all of the following:

  • Learn the rationale behind XML.

  • Look at the namespaces within the .NET Framework class library that deal with XML and XML-related technologies.

  • Take a close look at some of the classes contained within these namespaces.

  • Gain an overview of some of the other Microsoft technologies that utilize XML, particularly SQL Server and ADO.NET.

At the end of this chapter, you will be able to generate, manipulate, and transform XML using Visual Basic.

This book also covers LINQ to XML and the new XML objects found in the System.Xml.Linq namespace. These items are covered in Chapter 11.

An Introduction to XML

XML is a tagged markup language similar to HTML. In fact, XML and HTML are distant cousins and have their roots in the Standard Generalized Markup Language (SGML). This means that XML leverages one of the most useful features of HTML — readability. However, XML differs from HTML in that XML represents data, whereas HTML is a mechanism for displaying data. The tags in XML describe the data, as shown in the following example:

<?xml version="1.0" encoding="utf-8" ?>
<Movies>
  <FilmOrder name="Grease" filmId="1" quantity="21"></FilmOrder>
  <FilmOrder name="Lawrence of Arabia" filmId="2" quantity="10"></FilmOrder>
  <FilmOrder name="Star Wars" filmId="3" quantity="12"></FilmOrder>
  <FilmOrder name="Shrek" filmId="4" quantity="14"></FilmOrder>
</Movies>

This XML document represents a store order for a collection of movies. The standard used to represent an order of films would be useful to movie rental firms, collectors, and others. This information can be shared using XML for the following reasons:

  • The data tags in XML are self-describing.

  • XML is an open standard and supported on most platforms today.

XML supports the parsing of data by applications not familiar with the contents of the XML document. XML documents can also be associated with a description (a schema) that informs an application as to the structure of the data within the XML document.

At this stage, XML looks simple — it is just a human-readable way to exchange data in a universally accepted format. The essential points that you should understand about XML are as follows:

  • XML data can be stored in a plain text file.

  • A document is said to be well formed if it adheres to the XML standard.

  • Tags are used to specify the contents of a document — for example, <FilmOrder>.

  • XML elements (also called nodes) can be thought of as the objects within a document.

  • Elements are the basic building blocks of the document. Each element contains a start tag and end tag. A tag can be both a start tag and an end tag in one — for example, <FilmOrder />. In this case, the tag specifies that there is no content (or inner text) to the element (there isn't a closing tag because none is required due to the lack of inner-text content). Such a tag is said to be empty.

  • Data can be contained in the element (the element content) or within attributes contained in the element.

  • XML is hierarchical. One document can contain multiple elements, which can themselves contain child elements, and so on. However, an XML document can only have one root element.

This last point means that the XML document hierarchy can be thought of as a tree containing nodes:

  • The example document has a root node, <Movies>.

  • The branches of the root node are elements of type <FilmOrder>.

  • The leaves of the XML element, <FilmOrder>, are its attributes: name, quantity, and filmId.

Of course, we are interested in the practical use of XML by Visual Basic. A practical manipulation of the example XML, for example, is to display (for the staff of a movie supplier) a particular movie order in an application so that the supplier can fill the order and then save the information to a database. This chapter explains how you can perform such tasks using the functionality provided by the .NET Framework class library.
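To make the tree view concrete, the following is a minimal sketch (not part of the chapter's code download) that loads the sample document into an XmlDocument and walks from the root node to its branches and their attributes. The module name MovieTreeSketch and the function DescribeOrders are hypothetical names chosen for illustration, and the XML declaration is omitted from the embedded string for brevity.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module MovieTreeSketch

    ' The sample <Movies> document from this section (XML declaration omitted).
    Public Const MoviesXml As String = _
        "<Movies>" & _
        "<FilmOrder name=""Grease"" filmId=""1"" quantity=""21""></FilmOrder>" & _
        "<FilmOrder name=""Lawrence of Arabia"" filmId=""2"" quantity=""10""></FilmOrder>" & _
        "<FilmOrder name=""Star Wars"" filmId=""3"" quantity=""12""></FilmOrder>" & _
        "<FilmOrder name=""Shrek"" filmId=""4"" quantity=""14""></FilmOrder>" & _
        "</Movies>"

    ' Hypothetical helper: returns one "name, filmId, quantity" entry
    ' for each <FilmOrder> branch under the root node.
    Public Function DescribeOrders(ByVal source As String) As List(Of String)
        Dim doc As New XmlDocument()
        doc.LoadXml(source)

        ' DocumentElement is the single root node, <Movies>.
        Dim root As XmlElement = doc.DocumentElement
        Dim descriptions As New List(Of String)

        For Each node As XmlNode In root.ChildNodes
            If node.NodeType = XmlNodeType.Element Then
                Dim order As XmlElement = CType(node, XmlElement)
                ' The attributes are the leaves of each <FilmOrder> element.
                descriptions.Add(String.Format("{0}, {1}, {2}", _
                    order.GetAttribute("name"), _
                    order.GetAttribute("filmId"), _
                    order.GetAttribute("quantity")))
            End If
        Next

        Return descriptions
    End Function

    Sub Main()
        For Each description As String In DescribeOrders(MoviesXml)
            Console.WriteLine(description)
        Next
        Console.ReadLine()
    End Sub

End Module
```

XmlDocument belongs to the DOM-style classes discussed later in this chapter; it is used here only because it mirrors the root, branch, and leaf vocabulary directly.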

XML Serialization

The simplest way to demonstrate Visual Basic's support for XML is not with a complicated technology, such as SQL Server or ADO.NET, but with a practical use of XML: serializing a class.

The serialization of an object means that it is written out to a stream, such as a file or a socket (this is also known as dehydrating an object). The reverse process can also be performed: An object can be deserialized (or rehydrated) by reading it from a stream.

The type of serialization described in this chapter is XML serialization, whereby XML is used to represent a class in serialized form.

To help you understand XML serialization, let's examine a class named FilmOrder (which can be found in the code download from www.wrox.com). This class is implemented in Visual Basic and is used by the company for processing a movie order. The class could be instantiated on a firm's PDA, laptop, or even mobile phone (as long as the device had the .NET Framework installed).

An instance of FilmOrder corresponding to each order could be serialized to XML and sent over a socket using the PDA's cellular modem. (If the person making the order had a PDA that did not have a cellular modem, then the instance of FilmOrder could be serialized to a file.) The order could then be processed when the PDA was dropped into a docking cradle and synced. The point is that data in a proprietary form, an instance of FilmOrder, is converted into a generic form, XML, that can be universally understood.

The System.Xml.Serialization namespace contains classes and interfaces that support the serialization of objects to XML, and the deserialization of objects from XML. Objects are serialized to documents or streams using the XmlSerializer class.

Let's look at how you can use XmlSerializer. First, you need to define an object that implements a default constructor, such as FilmOrder:

Public Class FilmOrder

  ' These are Public because we have yet to implement
  ' properties to provide program access.

  Public name As String
  Public filmId As Integer
  Public quantity As Integer

  Public Sub New()

  End Sub

  Public Sub New(ByVal name As String, _
                 ByVal filmId As Integer, _
                 ByVal quantity As Integer)
    Me.name = name
    Me.filmId = filmId
    Me.quantity = quantity
  End Sub
End Class

This class should be created in a console application. From there, we can move on to the module. Within the module's Sub Main, create an instance of XmlSerializer, specifying the type of the object to serialize in the constructor (you need to import System.Xml.Serialization for this to work):

Dim serialize As XmlSerializer = _
  New XmlSerializer(GetType(FilmOrder))

Create an instance of the same type passed as a parameter to the constructor of XmlSerializer:

Dim MyFilmOrder As FilmOrder = _
  New FilmOrder("Grease", 101, 10)

Call the Serialize method of the XmlSerializer instance and specify the stream to which the serialized object is written (parameter one, Console.Out) and the object to be serialized (parameter two, MyFilmOrder):

serialize.Serialize(Console.Out, MyFilmOrder)
Console.ReadLine()

To make reference to the XmlSerializer object, you need to make reference to the System.Xml.Serialization namespace:

Imports System.Xml.Serialization

Running the module, the following output is generated by the preceding code:

<?xml version="1.0" encoding="IBM437"?>
<FilmOrder xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <name>Grease</name>
  <filmId>101</filmId>
  <quantity>10</quantity>
</FilmOrder>

This output demonstrates the default way in which the Serialize method serializes an object:

  • Each object serialized is represented as an element with the same name as the class — in this case, FilmOrder.

  • The individual data members of the class serialized are contained in elements named for each data member — in this case, name, filmId, and quantity.

Also generated are the following:

  • The specific version of XML generated — in this case, 1.0

  • The encoding used — in this case, IBM437

  • The schemas used to describe the serialized object — in this case, www.w3.org/2001/XMLSchema-instance and www.w3.org/2001/XMLSchema

A schema can be associated with an XML document and describe the data it contains (name, type, scale, precision, length, and so on). Either the actual schema or a reference to where the schema resides can be contained in the XML document. In either case, an XML schema is a standard representation that can be used by all applications that consume XML. This means that applications can use the supplied schema to validate the contents of an XML document generated by the Serialize method of the XmlSerializer object.

The code snippet that demonstrated the Serialize method of XmlSerializer displayed the XML generated to Console.Out. Clearly, we do not expect an application to use Console.Out when it would like to access a FilmOrder object in XML form. The point was to show how serialization can be performed in just two lines of code (one call to a constructor and one call to a method). The entire section of code responsible for serializing the instance of FilmOrder is presented here:

Try
    Dim serialize As XmlSerializer = _
                New XmlSerializer(GetType(FilmOrder))
    Dim MyMovieOrder As FilmOrder = _
            New FilmOrder("Grease", 101, 10)
    serialize.Serialize(Console.Out, MyMovieOrder)
    Console.Out.WriteLine()
    Console.ReadLine()
Catch ex As Exception
    Console.Error.WriteLine(ex.ToString())
End Try

The Serialize method is overloaded so that its first parameter can be a Stream, a TextWriter, or an XmlWriter. To serialize to a file, pass a stream or writer opened on that file (for example, a FileStream or a StreamWriter). When serializing to any of these targets, adding a third parameter to the Serialize method is permissible. This third parameter is of type XmlSerializerNamespaces and is used to specify a list of namespaces that qualify the names in the XML-generated document. The permissible overloads of the Serialize method are as follows:

Public Sub Serialize(Stream, Object)
Public Sub Serialize(TextWriter, Object)
Public Sub Serialize(XmlWriter, Object)
Public Sub Serialize(Stream, Object, XmlSerializerNamespaces)
Public Sub Serialize(TextWriter, Object, XmlSerializerNamespaces)
Public Sub Serialize(XmlWriter, Object, XmlSerializerNamespaces)
Public Sub Serialize(XmlWriter, Object, XmlSerializerNamespaces, String)
Public Sub Serialize(XmlWriter, Object, XmlSerializerNamespaces, String, _
   String)
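As an illustration of the three-parameter form, the sketch below serializes a FilmOrder to a StringWriter (a TextWriter) and passes an XmlSerializerNamespaces instance. Adding a single empty prefix and namespace is a commonly used way to suppress the default xsi and xsd declarations; treat that behavior as an assumption to verify against your framework version. The module name and the SerializeToString helper are hypothetical, and FilmOrder is repeated so that the sketch compiles on its own.

```vb
Imports System
Imports System.IO
Imports System.Xml.Serialization

' Repeated from the earlier listing so that the sketch is self-contained.
Public Class FilmOrder
    Public name As String
    Public filmId As Integer
    Public quantity As Integer

    Public Sub New()
    End Sub

    Public Sub New(ByVal name As String, _
                   ByVal filmId As Integer, _
                   ByVal quantity As Integer)
        Me.name = name
        Me.filmId = filmId
        Me.quantity = quantity
    End Sub
End Class

Module SerializeOverloadSketch

    ' Hypothetical helper: uses Serialize(TextWriter, Object,
    ' XmlSerializerNamespaces) and returns the generated XML as a string.
    Public Function SerializeToString(ByVal order As FilmOrder) As String
        Dim serializer As New XmlSerializer(GetType(FilmOrder))

        ' An empty prefix/namespace pair replaces the default
        ' xmlns:xsi and xmlns:xsd declarations.
        Dim namespaces As New XmlSerializerNamespaces()
        namespaces.Add("", "")

        Using writer As New StringWriter()
            serializer.Serialize(writer, order, namespaces)
            Return writer.ToString()
        End Using
    End Function

    Sub Main()
        Console.WriteLine(SerializeToString(New FilmOrder("Grease", 101, 10)))
        Console.ReadLine()
    End Sub

End Module
```

Because the target is a StringWriter rather than Console.Out, the encoding named in the XML declaration will differ from the IBM437 shown earlier.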

An object is reconstituted using the Deserialize method of XmlSerializer. This method is overloaded and can deserialize XML presented as a Stream, a TextReader, or an XmlReader. The overloads for Deserialize are as follows:

Public Function Deserialize(Stream) As Object
Public Function Deserialize(TextReader) As Object
Public Function Deserialize(XmlReader) As Object
Public Function Deserialize(XmlReader, XmlDeserializationEvents) As Object
Public Function Deserialize(XmlReader, String) As Object
Public Function Deserialize(XmlReader, String, XmlDeserializationEvents) _
   As Object

Before demonstrating the Deserialize method, we will introduce a new class, FilmOrder_Multiple. This class contains an array of film orders (actually an array of FilmOrder objects). FilmOrder_Multiple is defined as follows:

Public Class FilmOrder_Multiple
    Public multiFilmOrders() As FilmOrder

    Public Sub New()

    End Sub

    Public Sub New(ByVal multiFilmOrders() As FilmOrder)
        Me.multiFilmOrders = multiFilmOrders
    End Sub
End Class

The FilmOrder_Multiple class contains a fairly complicated object, an array of FilmOrder objects. The underlying serialization and deserialization of this class is more complicated than that of a single instance of a class that contains several simple types, but the programming effort involved on your part is just as simple as before. This is one of the great ways in which the .NET Framework makes it easy for you to work with XML data, no matter how it is formed.

To work through an example of the deserialization process, first create a sample order stored as an XML file called Filmorama.xml:

<?xml version="1.0" encoding="utf-8" ?>
<FilmOrder_Multiple xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <multiFilmOrders>
         <FilmOrder>
            <name>Grease</name>
            <filmId>101</filmId>
            <quantity>10</quantity>
         </FilmOrder>
         <FilmOrder>
            <name>Lawrence of Arabia</name>
            <filmId>102</filmId>
            <quantity>10</quantity>
         </FilmOrder>
         <FilmOrder>
            <name>Star Wars</name>
            <filmId>103</filmId>
            <quantity>10</quantity>
         </FilmOrder>
   </multiFilmOrders>
</FilmOrder_Multiple>

In order for this to run, you should either have the .xml file in the location of the executable or define the full path of the file within the code example.

Once the XML file is in place, change your console application so that it deserializes the contents of this file. First, ensure that your console application imports the proper namespaces:

Imports System.Xml
Imports System.Xml.Serialization
Imports System.IO

The following code demonstrates an object of type FilmOrder_Multiple being deserialized (or rehydrated) from a file, Filmorama.xml. This object is deserialized using this file in conjunction with the Deserialize method of XmlSerializer:

' Open file, ..Filmorama.xml
Dim dehydrated As FileStream = _
   New FileStream("Filmorama.xml", FileMode.Open)

' Create an XmlSerializer instance to handle deserializing,
' FilmOrder_Multiple
Dim serialize As XmlSerializer = _
   New XmlSerializer(GetType(FilmOrder_Multiple))

' Deserialize the file and cast the result, because Deserialize
' returns Object.
Dim myFilmOrder As FilmOrder_Multiple = _
   CType(serialize.Deserialize(dehydrated), FilmOrder_Multiple)

' Release the file handle once the object has been rehydrated.
dehydrated.Close()

Once deserialized, the array of film orders can be displayed:

Dim SingleFilmOrder As FilmOrder

For Each SingleFilmOrder In myFilmOrder.multiFilmOrders
   Console.Out.WriteLine("{0}, {1}, {2}", _
      SingleFilmOrder.name, _
      SingleFilmOrder.filmId, _
      SingleFilmOrder.quantity)
Next

Console.ReadLine()

This example deserializes an instance of type FilmOrder_Multiple. The output generated by displaying the deserialized object containing an array of film orders is as follows:

Grease, 101, 10
Lawrence of Arabia, 102, 10
Star Wars, 103, 10
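The same round trip can be sketched without a file on disk, which is convenient for experimenting. The following assumes nothing beyond the two classes already shown (repeated here so the sketch compiles on its own); the module name and the RoundTrip helper are hypothetical. It serializes to a StringWriter and rehydrates through the Deserialize(TextReader) overload.

```vb
Imports System
Imports System.IO
Imports System.Xml.Serialization

' Both classes are repeated from the earlier listings.
Public Class FilmOrder
    Public name As String
    Public filmId As Integer
    Public quantity As Integer

    Public Sub New()
    End Sub

    Public Sub New(ByVal name As String, _
                   ByVal filmId As Integer, _
                   ByVal quantity As Integer)
        Me.name = name
        Me.filmId = filmId
        Me.quantity = quantity
    End Sub
End Class

Public Class FilmOrder_Multiple
    Public multiFilmOrders() As FilmOrder

    Public Sub New()
    End Sub

    Public Sub New(ByVal multiFilmOrders() As FilmOrder)
        Me.multiFilmOrders = multiFilmOrders
    End Sub
End Class

Module RoundTripSketch

    ' Hypothetical helper: dehydrates the orders to an XML string and
    ' rehydrates a new object graph from that string.
    Public Function RoundTrip(ByVal orders As FilmOrder_Multiple) As FilmOrder_Multiple
        Dim serializer As New XmlSerializer(GetType(FilmOrder_Multiple))
        Dim xmlText As String

        Using writer As New StringWriter()
            serializer.Serialize(writer, orders)
            xmlText = writer.ToString()
        End Using

        Using reader As New StringReader(xmlText)
            ' Deserialize returns Object, so the result is cast explicitly.
            Return CType(serializer.Deserialize(reader), FilmOrder_Multiple)
        End Using
    End Function

    Sub Main()
        Dim original As New FilmOrder_Multiple(New FilmOrder() { _
            New FilmOrder("Grease", 101, 10), _
            New FilmOrder("Star Wars", 103, 10)})

        For Each order As FilmOrder In RoundTrip(original).multiFilmOrders
            Console.WriteLine("{0}, {1}, {2}", order.name, order.filmId, order.quantity)
        Next
        Console.ReadLine()
    End Sub

End Module
```

The Using blocks ensure that the writer and reader are disposed even if serialization throws, which is the same concern that applies to the FileStream in the file-based example.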

XmlSerializer also implements a CanDeserialize method. The prototype for this method is as follows:

Public Overridable Function CanDeserialize(ByVal xmlReader As XmlReader) _
   As Boolean

If CanDeserialize returns True, then the XML document specified by the xmlReader parameter can be deserialized. If the return value of this method is False, then the specified XML document cannot be deserialized.
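A short sketch of how CanDeserialize might be used as a guard before deserializing follows. The LooksLikeFilmOrder helper and the module name are hypothetical, and a reduced FilmOrder class is declared only to keep the sketch self-contained. As far as this sketch is concerned, the method is a lightweight check of the document's starting element rather than a full validation, so a True result is best treated as a hint.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Serialization

' Reduced version of the FilmOrder class, declared for this sketch only.
Public Class FilmOrder
    Public name As String
    Public filmId As Integer
    Public quantity As Integer
End Class

Module CanDeserializeSketch

    ' Hypothetical helper: reports whether the serializer for FilmOrder
    ' recognizes the document held in xmlText.
    Public Function LooksLikeFilmOrder(ByVal xmlText As String) As Boolean
        Dim serializer As New XmlSerializer(GetType(FilmOrder))

        Using reader As XmlReader = XmlReader.Create(New StringReader(xmlText))
            Return serializer.CanDeserialize(reader)
        End Using
    End Function

    Sub Main()
        Console.WriteLine(LooksLikeFilmOrder( _
            "<FilmOrder><name>Grease</name></FilmOrder>"))
        Console.WriteLine(LooksLikeFilmOrder("<Movies />"))
        Console.ReadLine()
    End Sub

End Module
```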

The FromTypes method of XmlSerializer facilitates the creation of arrays that contain XmlSerializer objects. This array of XmlSerializer objects can be used in turn to process arrays of the type to be serialized. The prototype for FromTypes is shown here:

Public Shared Function FromTypes(ByVal types() As Type) As XmlSerializer()
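One possible use of FromTypes is sketched below: a single call builds a serializer for each type in a mixed batch of objects, and each object is then serialized with the serializer at the matching index. The SerializeEach helper and the module name are hypothetical, and the classes are reduced copies of the earlier listings.

```vb
Imports System
Imports System.IO
Imports System.Xml.Serialization

' Reduced copies of the earlier classes, declared for this sketch only.
Public Class FilmOrder
    Public name As String
    Public filmId As Integer
    Public quantity As Integer
End Class

Public Class FilmOrder_Multiple
    Public multiFilmOrders() As FilmOrder
End Class

Module FromTypesSketch

    ' Hypothetical helper: serializes each item with the serializer
    ' that FromTypes created for the item's type.
    Public Function SerializeEach(ByVal items() As Object) As String()
        Dim types(items.Length - 1) As Type
        For i As Integer = 0 To items.Length - 1
            types(i) = items(i).GetType()
        Next

        ' The returned array is parallel to the array of types passed in.
        Dim serializers() As XmlSerializer = XmlSerializer.FromTypes(types)

        Dim results(items.Length - 1) As String
        For i As Integer = 0 To items.Length - 1
            Using writer As New StringWriter()
                serializers(i).Serialize(writer, items(i))
                results(i) = writer.ToString()
            End Using
        Next
        Return results
    End Function

    Sub Main()
        Dim order As New FilmOrder()
        order.name = "Grease"
        order.filmId = 101
        order.quantity = 10

        Dim batch As New FilmOrder_Multiple()
        batch.multiFilmOrders = New FilmOrder() {order}

        For Each xmlText As String In SerializeEach(New Object() {order, batch})
            Console.WriteLine(xmlText)
        Next
        Console.ReadLine()
    End Sub

End Module
```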

Before exploring the System.Xml.Serialization namespace, take a moment to consider the various uses of the term "attribute."

Source Code Style Attributes

Thus far, you have seen attributes applied to a specific portion of an XML document. Visual Basic has its own flavor of attributes, as do C# and each of the other .NET languages. These attributes refer to annotations to the source code that specify information, or metadata, that can be used by other applications without the need for the original source code. We will call such attributes Source Code Style attributes.

In the context of the System.Xml.Serialization namespace, Source Code Style attributes can be used to change the names of the elements generated for the data members of a class or to generate XML attributes instead of XML elements for the data members of a class. To demonstrate this, we will use a class called ElokuvaTilaus, which contains data members named name, filmId, and quantity. It just so happens that the default XML generated when serializing this class is not in a form that can be readily consumed by an external application.

For example, assume that a Finnish development team has written this external application — hence, the XML element and attribute names are in Finnish (minus the umlauts), rather than English. To rename the XML generated for a data member, name, a Source Code Style attribute will be used. This Source Code Style attribute specifies that when ElokuvaTilaus is serialized, the name data member is represented as an XML element, <Nimi>. The actual Source Code Style attribute that specifies this is as follows:

<XmlElementAttribute("Nimi")> Public name As String

ElokuvaTilaus, which means MovieOrder in Finnish, also contains other Source Code Style attributes:

  • <XmlAttributeAttribute("ElokuvaId")> specifies that filmId is to be serialized as an XML attribute named ElokuvaId.

  • <XmlAttributeAttribute("Maara")> specifies that quantity is to be serialized as an XML attribute named Maara.

ElokuvaTilaus is defined as follows:

Imports System.Xml.Serialization

Public Class ElokuvaTilaus

  ' These are Public because we have yet to implement
  ' properties to provide program access.

  <XmlElementAttribute("Nimi")> Public name As String
  <XmlAttributeAttribute("ElokuvaId")> Public filmId As Integer
  <XmlAttributeAttribute("Maara")> Public quantity As Integer

  Public Sub New()
  End Sub

  Public Sub New(ByVal name As String, _
                 ByVal filmId As Integer, _
                 ByVal quantity As Integer)
      Me.name = name
      Me.filmId = filmId
      Me.quantity = quantity
  End Sub

End Class

ElokuvaTilaus can be serialized as follows:

Dim serialize As XmlSerializer = _
    New XmlSerializer(GetType(ElokuvaTilaus))
Dim MyMovieOrder As ElokuvaTilaus = _
    New ElokuvaTilaus("Grease", 101, 10)

serialize.Serialize(Console.Out, MyMovieOrder)
Console.ReadLine()

The output generated by this code reflects the Source Code Style attributes associated with the class ElokuvaTilaus:

<?xml version="1.0" encoding="IBM437"?>
<ElokuvaTilaus xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
 ElokuvaId="101" Maara="10">
    <Nimi>Grease</Nimi>
</ElokuvaTilaus>

The value of filmId is contained in an XML attribute, ElokuvaId, and the value of quantity is contained in an XML attribute, Maara. The value of name is contained in an XML element, Nimi.

The example only demonstrates the Source Code Style attributes exposed by the XmlAttributeAttribute and XmlElementAttribute classes in the System.Xml.Serialization namespace. A variety of other Source Code Style attributes exist in this namespace that also control the form of XML generated by serialization. The classes associated with such Source Code Style attributes include XmlTypeAttribute, XmlTextAttribute, XmlRootAttribute, XmlIncludeAttribute, XmlIgnoreAttribute, and XmlEnumAttribute.
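To give a feel for two of the other attribute classes, the sketch below applies XmlRootAttribute to rename the root element and XmlIgnoreAttribute to keep a field out of the generated XML. The class FilmOrderSummary, its internalNote field, the root name Order, and the helper function are all hypothetical and exist only for this illustration.

```vb
Imports System
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical class: the root element is renamed to <Order>, and
' internalNote is excluded from the serialized output.
<XmlRoot("Order")> _
Public Class FilmOrderSummary
    <XmlElement("Title")> Public name As String
    <XmlAttribute("Quantity")> Public quantity As Integer
    <XmlIgnore()> Public internalNote As String
End Class

Module AttributeSketch

    ' Hypothetical helper: returns the XML generated for a summary.
    Public Function SerializeSummary(ByVal summary As FilmOrderSummary) As String
        Dim serializer As New XmlSerializer(GetType(FilmOrderSummary))
        Using writer As New StringWriter()
            serializer.Serialize(writer, summary)
            Return writer.ToString()
        End Using
    End Function

    Sub Main()
        Dim summary As New FilmOrderSummary()
        summary.name = "Grease"
        summary.quantity = 10
        summary.internalNote = "Do not publish"

        Console.WriteLine(SerializeSummary(summary))
        Console.ReadLine()
    End Sub

End Module
```

In Visual Basic 2008 an attribute must sit on the same logical line as the declaration it decorates, which is why the class-level attribute ends with a line-continuation character.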

System.Xml Document Support

The System.Xml namespace implements a variety of objects that support standards-based XML processing. The XML-specific standards facilitated by this namespace include XML 1.0, Document Type Definition (DTD) support, XML namespaces, XML schemas, XPath, XQuery, XSLT, DOM Level 1 and DOM Level 2 (Core implementations), as well as SOAP 1.1, SOAP 1.2, SOAP Contract Language, and SOAP Discovery. The System.Xml namespace exposes over 30 separate classes in order to facilitate this level of XML standards compliance.

To generate and navigate XML documents, there are two styles of access:

  • Stream-based: System.Xml exposes a variety of classes that read XML from and write XML to a stream. This approach tends to be a fast way to consume or generate an XML document because it represents a set of serial reads or writes. The limitation of this approach is that it does not view the XML data as a document composed of tangible entities, such as nodes, elements, and attributes. An example of where a stream could be used is when receiving XML documents from a socket or a file.

  • Document Object Model (DOM)-based: System.Xml exposes a set of objects that access XML documents as data. The data is accessed using entities from the XML document tree (nodes, elements, and attributes). This style of XML generation and navigation is flexible but may not yield the same performance as stream-based XML generation and navigation. DOM is an excellent technology for editing and manipulating documents. For example, the functionality exposed by DOM could simplify merging your checking, savings, and brokerage accounts.
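The contrast between the two styles can be sketched side by side. In the following illustration, TotalQuantity reads forward through the document once with an XmlReader, while DoubleQuantities loads the whole document into an XmlDocument so that it can be edited in place. The module name, both function names, and the shortened two-order sample are assumptions made for this sketch.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module AccessStyleSketch

    ' A shortened version of the <Movies> sample document.
    Public Const SampleXml As String = _
        "<Movies>" & _
        "<FilmOrder name=""Grease"" filmId=""1"" quantity=""21"" />" & _
        "<FilmOrder name=""Shrek"" filmId=""4"" quantity=""14"" />" & _
        "</Movies>"

    ' Stream style: one forward-only pass, nothing is kept in memory.
    Public Function TotalQuantity(ByVal source As String) As Integer
        Dim total As Integer = 0
        Using reader As XmlReader = XmlReader.Create(New StringReader(source))
            While reader.Read()
                If reader.NodeType = XmlNodeType.Element _
                   AndAlso reader.Name = "FilmOrder" Then
                    total += XmlConvert.ToInt32(reader.GetAttribute("quantity"))
                End If
            End While
        End Using
        Return total
    End Function

    ' DOM style: the document is loaded as a tree and edited in place.
    Public Function DoubleQuantities(ByVal source As String) As String
        Dim doc As New XmlDocument()
        doc.LoadXml(source)

        For Each node As XmlNode In doc.SelectNodes("/Movies/FilmOrder")
            Dim order As XmlElement = CType(node, XmlElement)
            Dim quantity As Integer = _
                XmlConvert.ToInt32(order.GetAttribute("quantity"))
            order.SetAttribute("quantity", XmlConvert.ToString(quantity * 2))
        Next

        Return doc.OuterXml
    End Function

    Sub Main()
        Console.WriteLine(TotalQuantity(SampleXml))
        Console.WriteLine(DoubleQuantities(SampleXml))
        Console.ReadLine()
    End Sub

End Module
```

The stream version never needs more than the current node, whereas the DOM version pays for holding the tree in memory in exchange for being able to modify it.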

XML Stream-Style Parsers

When demonstrating XML serialization, XML stream-style parsers were mentioned. After all, when an instance of an object is serialized to XML, it has to be written to a stream, and when it is deserialized, it is read from a stream. When an XML document is parsed using a stream parser, the parser always points to the current node in the document. The basic architecture of stream parsers is shown in Figure 10-1.

The following classes that access a stream of XML (read XML) and generate a stream of XML (write XML) are contained in the System.Xml namespace:

  • XmlWriter — This abstract class specifies a non-cached, forward-only stream that writes an XML document (data and schema).

    [Figure 10-1: Basic architecture of the XML stream-style parsers]

  • XmlReader — This abstract class specifies a non-cached, forward-only stream that reads an XML document (data and schema).

The diagram of the classes associated with the XML stream-style parser referred to one other class, XslTransform. This class is found in the System.Xml.Xsl namespace and is not an XML stream-style parser. Rather, it is used in conjunction with XmlWriter and XmlReader. This class is covered in detail later.

The System.Xml namespace exposes many XML manipulation classes beyond those shown in the architecture diagram. The classes shown in the diagram include the following:

  • XmlResolver — This abstract class resolves an external XML resource using a Uniform Resource Identifier (URI). XmlUrlResolver is an implementation of an XmlResolver.

  • XmlNameTable — This abstract class provides a fast means by which an XML parser can access element or attribute names.

Writing an XML Stream

An XML document can be created programmatically in .NET. One way to perform this task is by writing the individual components of an XML document (schema, attributes, elements, and so on) to an XML stream. Using a unidirectional write-stream means that each element and its attributes must be written in order — the idea is that data is always written at the head of the stream. To accomplish this, you use a writable XML stream class (a class derived from XmlWriter). Such a class ensures that the XML document you generate correctly implements the W3C Extensible Markup Language (XML) 1.0 specification and the Namespaces in XML specification.

Why is this necessary when you have XML serialization? You need to be very careful here to separate interface from implementation. XML serialization works for a specific class, such as the ElokuvaTilaus class. This class is a proprietary implementation and not the format in which data is exchanged. For this one specific case, the XML document generated when ElokuvaTilaus is serialized just so happens to be the XML format used when placing an order for some movies. ElokuvaTilaus was given a little help from Source Code Style attributes so that it would conform to a standard XML representation of a film order summary.

In a different application, if the software used to manage an entire movie distribution business wants to generate movie orders, then it must generate a document of the appropriate form. The movie distribution management software achieves this using the XmlWriter object.

Before reviewing the subtleties of XmlWriter, note that this class exposes over 40 methods and properties. The example in this section provides an overview that touches on a subset of these methods and properties. This subset enables the generation of an XML document that corresponds to a movie order.

This example builds the module that generates the XML document corresponding to a movie order. It uses an instance of XmlWriter, called FilmOrdersWriter, which writes to a file on disk. This means that the XML document generated is streamed to this file. Because the FilmOrdersWriter variable represents a file, you have to take a few actions against the file. For instance, you have to ensure the file is:

  • Created — The instance of XmlWriter, FilmOrdersWriter, is created by using the Create method as well as by assigning all the properties of this object with the XmlWriterSettings object.

  • Opened — The file the XML is streamed to, FilmOrdersProgrammatic.xml, is opened by passing the filename to the Create method of XmlWriter (XmlWriter is an abstract class, so you do not call a constructor on it directly).

  • Generated — The process of generating the XML document is described in detail at the end of this section.

  • Closed — The file (the XML stream) is closed using the Close method of XmlWriter or by simply making use of the Using keyword, which ensures that the object is closed at the end of the Using statement.

Before you create the XmlWriter object, you first need to customize how the object operates by using the XmlWriterSettings object. This object, introduced in .NET 2.0, enables you to configure the behavior of the XmlWriter object before you instantiate it:

Dim myXmlSettings As New XmlWriterSettings()
myXmlSettings.Indent = True
myXmlSettings.NewLineOnAttributes = True

You can specify a few settings for the XmlWriterSettings object that define how XML creation will be handled by the XmlWriter object. The following table details the properties of the XmlWriterSettings class:

Property | Initial Value | Description

CheckCharacters | True | If set to True, performs a character check on the contents of the XmlWriter object. Legal characters can be found at www.w3.org/TR/REC-xml#charsets.

CloseOutput | False | Specifies whether the XmlWriter should also close the stream or the System.IO.TextWriter object when the writer is closed.

ConformanceLevel | ConformanceLevel.Document | Allows the XML to be checked to ensure that it follows certain specified rules. Possible conformance-level settings include Document, Fragment, and Auto.

Encoding | Encoding.UTF8 | Defines the encoding of the XML generated.

Indent | False | Defines whether the XML generated should be indented. When this value is False, child nodes are not indented from parent nodes.

IndentChars | Two spaces | Specifies the string used to indent child nodes from parent nodes. This setting only works when the Indent property is set to True. If you want, you can assign this any string value you choose.

NewLineChars | \r\n (carriage return, line feed) | Assigns the characters that are used to define line breaks.

NewLineHandling | NewLineHandling.Replace | Defines whether to normalize line breaks in the output. Possible values include Replace, Entitize, and None.

NewLineOnAttributes | False | Defines whether a node's attributes should each be written on a new line. This occurs only if set to True.

OmitXmlDeclaration | False | Defines whether an XML declaration should be generated in the output. The declaration is omitted only if this is set to True.

OutputMethod | XmlOutputMethod.Xml | Defines the method used to serialize the output. Possible values include Xml, Html, Text, and AutoDetect.

Once the XmlWriterSettings object has been instantiated and assigned the values you deem necessary, the next step is to create the XmlWriter object and associate the XmlWriterSettings object with it.

The basic infrastructure for managing the file (the XML text stream) and applying the settings class is either

Dim FilmOrdersWriter As XmlWriter = _
   XmlWriter.Create("..\FilmOrdersProgrammatic.xml", myXmlSettings)

FilmOrdersWriter.Close()

or the following, if you are utilizing the Using keyword, which is the recommended approach:

Using FilmOrdersWriter As XmlWriter = _
   XmlWriter.Create("..\FilmOrdersProgrammatic.xml", myXmlSettings)
End Using

With the preliminaries completed (file created and formatting configured), the process of writing the actual attributes and elements of your XML document can begin. The sequence of steps used to generate your XML document is as follows:

  1. Write an XML comment using the WriteComment method. This comment describes where the concept for this XML document originated and generates the following XML:

    <!-- Same as generated by serializing, ElokuvaTilaus -->

  2. Begin writing the XML element, <ElokuvaTilaus>, by calling the WriteStartElement method. You can only begin writing this element because its attributes and child elements must be written before the element can be ended with a corresponding </ElokuvaTilaus>. The XML generated by the WriteStartElement method is as follows:

    <ElokuvaTilaus>

  3. Write the attributes associated with <ElokuvaTilaus> by calling the WriteAttributeString method twice. The two calls add the attributes to the <ElokuvaTilaus> start tag that is currently being written, which becomes the following:

    <ElokuvaTilaus ElokuvaId="101" Maara="10">

  4. Using the WriteElementString method, write the child XML element <Nimi> contained in the XML element, <ElokuvaTilaus>. The XML generated by calling this method is as follows:

    <Nimi>Grease</Nimi>

  5. Complete writing the <ElokuvaTilaus> parent XML element by calling the WriteEndElement method. The XML generated by calling this method is as follows:

    </ElokuvaTilaus>

Let's now put all this together in the Module1.vb file shown here:

Imports System.Xml
Imports System.Xml.Serialization
Imports System.IO

Module Module1

    Sub Main()

        Dim myXmlSettings As New XmlWriterSettings
        myXmlSettings.Indent = True
        myXmlSettings.NewLineOnAttributes = True

        Using FilmOrdersWriter As XmlWriter = _
            XmlWriter.Create("..\FilmOrdersProgrammatic.xml", myXmlSettings)

            FilmOrdersWriter.WriteComment(" Same as generated " & _
               "by serializing, ElokuvaTilaus ")
            FilmOrdersWriter.WriteStartElement("ElokuvaTilaus")
            FilmOrdersWriter.WriteAttributeString("ElokuvaId", "101")
            FilmOrdersWriter.WriteAttributeString("Maara", "10")
            FilmOrdersWriter.WriteElementString("Nimi", "Grease")
            FilmOrdersWriter.WriteEndElement() ' End ElokuvaTilaus

        End Using

    End Sub

End Module

Once this is run, you will find the XML file FilmOrdersProgrammatic.xml created in the same folder as the Module1.vb file or in the bin directory. The content of this file is as follows:

<?xml version="1.0" encoding="utf-8"?>
<!-- Same as generated by serializing, ElokuvaTilaus -->
<ElokuvaTilaus
  ElokuvaId="101"
  Maara="10">
  <Nimi>Grease</Nimi>
</ElokuvaTilaus>

The previous XML document is the same in form as the XML document generated by serializing the ElokuvaTilaus class. Notice that in the previous XML document, the <Nimi> element is indented two characters and that each attribute is on a different line in the document. This was achieved using the XmlWriterSettings class.

The sample application covered only a small portion of the methods and properties exposed by the XML stream-writing class, XmlWriter. Other methods implemented by this class manipulate the underlying file, such as the Flush method; and some methods allow XML text to be written directly to the stream, such as the WriteRaw method.

The XmlWriter class also exposes a variety of methods that write a specific type of XML data to the stream. These methods include WriteBinHex, WriteCData, WriteString, and WriteWhitespace.
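A few of these methods can be sketched as follows. This is only an illustration under stated assumptions: it writes to a StringBuilder rather than a file, and the Kommentti and Raaka element names and the sample values are invented for the example rather than taken from the movie order format.

```vb
Imports System.Text
Imports System.Xml

Module WriterMethodsSketch

    ' Builds a small XML document in memory. The Kommentti and Raaka
    ' element names and the sample values are hypothetical.
    Public Function BuildOrderXml() As String
        Dim output As New StringBuilder()
        Dim settings As New XmlWriterSettings()
        settings.OmitXmlDeclaration = True

        Using writer As XmlWriter = XmlWriter.Create(output, settings)
            writer.WriteStartElement("ElokuvaTilaus")

            writer.WriteStartElement("Nimi")
            writer.WriteString("Tom & Jerry")   ' the ampersand is escaped
            writer.WriteEndElement()

            writer.WriteStartElement("Kommentti")
            writer.WriteCData("5 < 10")         ' wrapped in a CDATA section
            writer.WriteEndElement()

            writer.WriteRaw("<Raaka/>")         ' copied to the output as is
            writer.WriteEndElement()            ' End ElokuvaTilaus
            writer.Flush()                      ' push buffered text out
        End Using

        Return output.ToString()
    End Function

    Sub Main()
        Console.WriteLine(BuildOrderXml())
    End Sub

End Module
```

Note that WriteRaw performs no checking or escaping, so it is up to the caller to pass well-formed markup.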

You can now generate the same XML document in two different ways. You have used two different applications that took two different approaches to generating a document that represents a standardized movie order. However, there are even more ways to generate XML, depending on the circumstances. Using the previous scenario, you could receive a movie order from a store, and this order would have to be transformed from the XML format used by the supplier to your own order format.

Reading an XML Stream

In .NET, XML documents can be read from a stream as well. Data is traversed in the stream in order (first XML element, second XML element, and so on). This traversal is very quick because the data is processed in one direction only; writing to the stream and moving backward through it are not supported. At any given instant, only the data at the current position in the stream can be accessed.

Before exploring how an XML stream can be read, you need to understand why it should be read in the first place. Returning to our movie supplier example, imagine that the application managing the movie orders can generate a variety of XML documents corresponding to current orders, preorders, and returns. All the documents (current orders, preorders, and returns) can be extracted in stream form and processed by a report-generating application. This application prints the orders for a given day, the preorders that are going to be due, and the returns that are coming back to the supplier. The report-generating application processes the data by reading in and parsing a stream of XML.

One class that can be used to read and parse such an XML stream is XmlReader. Other classes in the .NET Framework are derived from XmlReader, such as XmlTextReader, which can read XML from a file (specified by a string corresponding to the file's name), a Stream, or a TextReader. This example uses an XmlReader to read an XML document contained in a file. Reading XML from a file and writing it to a file is not the norm when it comes to XML processing, but a file is the simplest way to access XML data. This simplified access enables you to focus on XML-specific issues.

In creating a sample, the first step is to make the proper imports into the Module1.vb file:

Imports System.Xml
Imports System.Xml.Serialization
Imports System.IO

From there, the next step in accessing a stream of XML data is to create an instance of the object that will open the stream (the readMovieInfo variable of type XmlReader) and then open the stream itself. Your application performs this as follows (where fileName holds the path to MovieManage.xml, the file containing the XML document):

Dim myXmlSettings As New XmlReaderSettings()
Using readMovieInfo As XmlReader = XmlReader.Create(fileName, myXmlSettings)

Note that just as the XmlWriter has a settings class, so does the XmlReader. Although you can make assignments to the XmlReaderSettings object, in this case you do not. The XmlReaderSettings object is covered later in this chapter.

The basic mechanism for traversing each stream is to traverse from node to node using the Read method. Node types in XML include Element and Whitespace. Numerous other node types are defined, but this example focuses on traversing XML elements and the white space that is used to make the elements more readable (carriage returns, linefeeds, and indentation spaces). Once the stream is positioned at a node, the MoveToNextAttribute method can be called to read each attribute contained in an element. The MoveToNextAttribute method only traverses attributes for nodes that contain attributes (nodes of type element). An example of an XmlReader traversing each node and then traversing the attributes of each node follows:

While readMovieInfo.Read()
   ' Process node here.
   While readMovieInfo.MoveToNextAttribute()
      ' Process attribute here.
   End While
End While

This code, which reads the contents of the XML stream, does not utilize any knowledge of the stream's contents. However, a great many applications know exactly how the stream they are going to traverse is structured. Such applications can use XmlReader in a more deliberate manner and not simply traverse the stream without foreknowledge.

Once the example stream has been read, it is cleaned up by the End Using statement:

End Using

This ReadMovieXml subroutine takes the filename containing the XML to read as a parameter. The code for the subroutine is as follows (and is basically the code just outlined):

Private Sub ReadMovieXml(ByVal fileName As String)
   Dim myXmlSettings As New XmlReaderSettings()
   Using readMovieInfo As XmlReader = XmlReader.Create(fileName, _
      myXmlSettings)
      While readMovieInfo.Read()
         ShowXmlNode(readMovieInfo)
         While readMovieInfo.MoveToNextAttribute()
            ShowXmlNode(readMovieInfo)
         End While
      End While
   End Using
   Console.ReadLine()
End Sub

For each node encountered after a call to the Read method, ReadMovieXml calls the ShowXmlNode subroutine. Similarly, for each attribute traversed, the ShowXmlNode subroutine is called. This subroutine breaks down each node into its sub-entities:

  • Depth — This property of XmlReader determines the level at which a node resides in the XML document tree. To understand depth, consider the following XML document composed solely of elements: <A><B></B><C><D></D></C></A>.

    Element <A> is the root element, and when parsed would return a Depth of 0. Elements <B> and <C> are contained in <A> and hence reflect a Depth value of 1. Element <D> is contained in <C>. The Depth property value associated with <D> (depth of 2) should, therefore, be one more than the Depth property associated with <C> (depth of 1).

  • Type — The type of each node is determined using the NodeType property of XmlReader. The value returned is of enumeration type XmlNodeType. Permissible node types include Attribute, Element, and Whitespace. (Numerous other node types can also be returned, including CDATA, Comment, Document, Entity, and DocumentType.)

  • Name — The name of each node is retrieved using the Name property of XmlReader. The name of the node could be an element name, such as <ElokuvaTilaus>, or an attribute name, such as ElokuvaId.

  • Attribute Count — The number of attributes associated with a node is retrieved using the AttributeCount property of XmlReader.

  • Value — The value of a node is retrieved using the Value property of XmlReader. For example, the text node contained in the element <Nimi> has a value of Grease; the element node itself has an empty value.
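The Depth values described above can be checked with a short sketch. It is only an illustration: the ElementDepths helper is a hypothetical name, and the XML is read from a string through a StringReader rather than from a file.

```vb
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml

Module DepthSketch

    ' Returns one "name:depth" entry for every element start tag.
    ' ElementDepths is a hypothetical helper used only for this sketch.
    Public Function ElementDepths(ByVal xmlText As String) As List(Of String)
        Dim result As New List(Of String)()

        Using reader As XmlReader = XmlReader.Create(New StringReader(xmlText))
            While reader.Read()
                If reader.NodeType = XmlNodeType.Element Then
                    result.Add(reader.Name & ":" & reader.Depth.ToString())
                End If
            End While
        End Using

        Return result
    End Function

    Sub Main()
        ' Prints A:0, B:1, C:1, and D:2, one entry per line.
        For Each entry As String In ElementDepths("<A><B></B><C><D></D></C></A>")
            Console.WriteLine(entry)
        Next
    End Sub

End Module
```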

The subroutine ShowXmlNode is implemented as follows:

Private Sub ShowXmlNode(ByVal reader As XmlReader)
  If reader.Depth > 0 Then
     For depthCount As Integer = 1 To reader.Depth
        Console.Write(" ")
     Next
  End If

  If reader.NodeType = XmlNodeType.Whitespace Then

     Console.Out.WriteLine("Type: {0} ", reader.NodeType)

  ElseIf reader.NodeType = XmlNodeType.Text Then

     Console.Out.WriteLine("Type: {0}, Value: {1} ", _
                          reader.NodeType, _
                          reader.Value)

  Else

     Console.Out.WriteLine("Name: {0}, Type: {1}, " & _
                          "AttributeCount: {2}, Value: {3} ", _
                          reader.Name, _
                          reader.NodeType, _
                          reader.AttributeCount, _
                          reader.Value)
  End If

End Sub

Within the ShowXmlNode subroutine, each level of node depth adds one space to the output generated:

If reader.Depth > 0 Then
  For depthCount As Integer = 1 To reader.Depth
    Console.Write(" ")
  Next
End If

You add these spaces in order to create human-readable output (so you can easily determine the depth of each node displayed). For each type of node, ShowXmlNode displays the value of the NodeType property. The ShowXmlNode subroutine makes a distinction between nodes of type Whitespace and other types of nodes. The reason for this is simple: A node of type Whitespace does not contain a name or attribute count. The value of such a node is any combination of white-space characters (space, tab, carriage return, and so on). Therefore, it doesn't make sense to display these properties if the NodeType is XmlNodeType.Whitespace. Nodes of type Text have no name associated with them, so for this type, subroutine ShowXmlNode only displays the properties NodeType and Value. For all other node types, the Name, AttributeCount, Value, and NodeType properties are displayed.

To finalize this module, add a Sub Main as follows:

Sub Main(ByVal args() As String)
   ReadMovieXml("..\MovieManage.xml")
End Sub

Here is an example construction of the MovieManage.xml file:

<?xml version="1.0" encoding="utf-8" ?>
<MovieOrderDump>

 <FilmOrder_Multiple>
    <multiFilmOrders>
       <FilmOrder>
          <name>Grease</name>
          <filmId>101</filmId>
          <quantity>10</quantity>
       </FilmOrder>
       <FilmOrder>
          <name>Lawrence of Arabia</name>
          <filmId>102</filmId>
          <quantity>10</quantity>
       </FilmOrder>
       <FilmOrder>
          <name>Star Wars</name>
          <filmId>103</filmId>
          <quantity>10</quantity>
       </FilmOrder>
    </multiFilmOrders>
 </FilmOrder_Multiple>

 <PreOrder>
    <FilmOrder>
       <name>Shrek III - Shrek Becomes a Programmer</name>
       <filmId>104</filmId>
       <quantity>10</quantity>
    </FilmOrder>
 </PreOrder>

 <Returns>
    <FilmOrder>
       <name>Star Wars</name>
       <filmId>103</filmId>
       <quantity>2</quantity>
    </FilmOrder>
 </Returns>

</MovieOrderDump>

Running this module produces the following output (a partial display, as it would be rather lengthy):

Name: xml, Type: XmlDeclaration, AttributeCount: 2, Value: version="1.0"
encoding="utf-8"
Name: version, Type: Attribute, AttributeCount: 2, Value: 1.0
Name: encoding, Type: Attribute, AttributeCount: 2, Value: utf-8
Type: Whitespace
Name: MovieOrderDump, Type: Element, AttributeCount: 0, Value:
 Type: Whitespace
 Name: FilmOrder_Multiple, Type: Element, AttributeCount: 0, Value:
  Type: Whitespace
  Name: multiFilmOrders, Type: Element, AttributeCount: 0, Value:
   Type: Whitespace
   Name: FilmOrder, Type: Element, AttributeCount: 0, Value:
    Type: Whitespace
    Name: name, Type: Element, AttributeCount: 0, Value:
     Type: Text, Value: Grease

This example managed to use three methods and five properties of XmlReader. The output generated was informative but far from practical. XmlReader exposes over 50 methods and properties, which means that we have only scratched the surface of this highly versatile class. The remainder of this section looks at the XmlReaderSettings class, introduces a more realistic use of XmlReader, and demonstrates how the classes of System.Xml handle errors.

The XmlReaderSettings Class

Just like the XmlWriter object, the XmlReader object accepts a settings object when it is created. This means that you can specify how the XmlReader object behaves when it reads XML, including how it deals with white space, schemas, and more. The following table details the properties of the XmlReaderSettings class:

Property | Initial Value | Description
CheckCharacters | True | If set to True, the XmlReader checks that the characters it reads are legal XML characters. Legal characters can be found at www.w3.org/TR/REC-xml#charsets.
CloseInput | False | Specifies whether closing the XmlReader should also close the underlying stream or System.IO.TextReader object
ConformanceLevel | ConformanceLevel.Document | Allows the XML to be checked to ensure that it follows certain specified rules. Possible conformance-level settings include Document, Fragment, and Auto.
IgnoreComments | False | Defines whether comments should be ignored or not
IgnoreProcessingInstructions | False | Defines whether processing instructions contained within the XML should be ignored
IgnoreWhitespace | False | Defines whether the XmlReader object should ignore all insignificant white space
LineNumberOffset | 0 | Defines the line number at which the LineNumber property starts counting within the XML file
LinePositionOffset | 0 | Defines the position within the line at which the LinePosition property starts counting within the XML file
NameTable | An empty XmlNameTable object | Enables the XmlReader to work with a specific XmlNameTable object that is used for atomized string comparisons
ProhibitDtd | True | Defines whether DTD processing is prohibited. When this is True, the XmlReader throws an exception if it encounters a DTD.
Schemas | An empty XmlSchemaSet object | Enables the XmlReader to work with an instance of the XmlSchemaSet class
ValidationFlags | XmlSchemaValidationFlags.AllowXmlAttributes and XmlSchemaValidationFlags.ProcessIdentityConstraints | Enables you to apply validation schema settings. Possible values include AllowXmlAttributes, ProcessIdentityConstraints, ProcessInlineSchema, ProcessSchemaLocation, ReportValidationWarnings, and None.
ValidationType | None | Specifies whether the XmlReader will perform validation or type assignment when reading. Possible values include Auto, DTD, None, Schema, and XDR.
XmlResolver | A new XmlUrlResolver object | A write-only property that sets the XmlResolver used to access external documents

An example of using this settings class to modify the behavior of the XmlReader class is as follows:

Dim myXmlSettings As New XmlReaderSettings()
myXmlSettings.IgnoreWhitespace = True
myXmlSettings.IgnoreComments = True

Using readMovieInfo As XmlReader = XmlReader.Create(fileName, myXmlSettings)
   ' Use XmlReader object here.
End Using

In this case, the XmlReader object that is created ignores the white space that it encounters, as well as any of the XML comments. These settings, once established with the XmlReaderSettings object, are then associated with the XmlReader object through its Create method.
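The effect of these two settings can be observed by counting the nodes that Read returns. The following is a minimal sketch rather than part of the sample application: CountNodes is a hypothetical helper, and the XML is supplied as a string through a StringReader.

```vb
Imports System.IO
Imports System.Xml

Module ReaderSettingsSketch

    ' Counts the nodes returned by Read. CountNodes is a hypothetical
    ' helper; ignoreExtras switches both Ignore settings together.
    Public Function CountNodes(ByVal xmlText As String, _
                               ByVal ignoreExtras As Boolean) As Integer
        Dim settings As New XmlReaderSettings()
        settings.IgnoreWhitespace = ignoreExtras
        settings.IgnoreComments = ignoreExtras

        Dim count As Integer = 0
        Using reader As XmlReader = _
           XmlReader.Create(New StringReader(xmlText), settings)
            While reader.Read()
                count += 1
            End While
        End Using

        Return count
    End Function

    Sub Main()
        Dim xmlText As String = "<!-- order -->" & vbCrLf & _
           "<ElokuvaTilaus>" & vbCrLf & _
           "  <Nimi>Grease</Nimi>" & vbCrLf & _
           "</ElokuvaTilaus>"

        ' The second count is smaller because the comment and the
        ' white-space nodes are no longer reported.
        Console.WriteLine(CountNodes(xmlText, False))
        Console.WriteLine(CountNodes(xmlText, True))
    End Sub

End Module
```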

Traversing XML Using XmlTextReader

An application can easily use XmlReader to traverse a document that is received in a known format. The document can thus be traversed in a deliberate manner. You just implemented a class that serialized arrays of movie orders. The next example takes an XML document containing multiple XML documents of that type and traverses them. Each movie order is forwarded to the movie supplier via fax. The document is traversed as follows:

Read root element: <MovieOrderDump>
    Process each <FilmOrder_Multiple> element
        Read <multiFilmOrders> element
            Process each <FilmOrder>
                Send fax for each movie order here

The basic outline for the program's implementation is to open a file containing the XML document to parse and to traverse it from element to element:

Dim myXmlSettings As New XmlReaderSettings()

Using readMovieInfo As XmlReader = XmlReader.Create(fileName, myXmlSettings)
      readMovieInfo.Read()
      readMovieInfo.ReadStartElement("MovieOrderDump")

      Do While (True)
         '****************************************************
         '* Process FilmOrder elements here                  *

         '****************************************************
      Loop

      readMovieInfo.ReadEndElement()  '  </MovieOrderDump>

End Using

The preceding code opened the file using the Create method of XmlReader, and the End Using statement takes care of shutting everything down for you. The code also introduced two methods of the XmlReader class:

  • ReadStartElement(String) — This verifies that the current node in the stream is an element and that the element's name matches the string passed to ReadStartElement. If the verification is successful, then the stream is advanced to the next node.

  • ReadEndElement() — This verifies that the current node is an end tag; and if the verification is successful, then the stream is advanced to the next node.

The application knows that an element, <MovieOrderDump>, will be found at a specific point in the document. The ReadStartElement method verifies this foreknowledge of the document format. After all the elements contained in element <MovieOrderDump> have been traversed, the stream should point to the end tag </MovieOrderDump>. The ReadEndElement method verifies this.

The code that traverses each element of type <FilmOrder> similarly uses the ReadStartElement and ReadEndElement methods to indicate the start and end of the <FilmOrder> and <multiFilmOrders> elements. The code that ultimately parses the list of movie orders and faxes the movie supplier (using the FranticallyFaxTheMovieSupplier subroutine) is as follows:

Dim myXmlSettings As New XmlReaderSettings()
Dim movieName, movieId, quantity As String

Using readMovieInfo As XmlReader = XmlReader.Create(fileName, myXmlSettings)
      readMovieInfo.Read()
      readMovieInfo.ReadStartElement("MovieOrderDump")

      Do While (True)
         readMovieInfo.ReadStartElement("FilmOrder_Multiple")
         readMovieInfo.ReadStartElement("multiFilmOrders")

         Do While (True)
            readMovieInfo.ReadStartElement("FilmOrder")
            movieName = readMovieInfo.ReadElementString()
            movieId = readMovieInfo.ReadElementString()
            quantity = readMovieInfo.ReadElementString()
            readMovieInfo.ReadEndElement() ' clear </FilmOrder>

            FranticallyFaxTheMovieSupplier(movieName, movieId, quantity)

            ' Should read next FilmOrder node
            ' else quits
            readMovieInfo.Read()

            If ("FilmOrder" <> readMovieInfo.Name) Then
               Exit Do
            End If
         Loop

         readMovieInfo.ReadEndElement() ' clear </multiFilmOrders>
         readMovieInfo.ReadEndElement() ' clear </FilmOrder_Multiple>

         ' Should read next FilmOrder_Multiple node
         ' else you quit
         readMovieInfo.Read() ' advance to the next node

         If ("FilmOrder_Multiple" <> readMovieInfo.Name) Then
            Exit Do
         End If
      Loop

      readMovieInfo.ReadEndElement()  '  </MovieOrderDump>

End Using

Three lines within the preceding code contain a call to the ReadElementString method:

movieName = readMovieInfo.ReadElementString()
movieId = readMovieInfo.ReadElementString()
quantity = readMovieInfo.ReadElementString()

While parsing the stream, it was known that an element named <name> existed and that this element contained the name of the movie. Rather than parse the start tag, get the value, and parse the end tag, it was easier to get the data using the ReadElementString method. This method retrieves the data string associated with an element and advances the stream to the next element. The ReadElementString method was also used to retrieve the data associated with the XML elements <filmId> and <quantity>.

The output of this example is a fax (not shown here because the point of this example is to demonstrate that it is simpler to traverse a document when its form is known). The format of the document is still verified by XmlReader as it is parsed.

The XmlReader class also exposes members that provide more insight into the data contained in the XML document and the state of parsing: the IsEmptyElement and EOF properties and the IsStartElement method.
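As a rough illustration of these three members, the following sketch collects the names of elements written in the empty form, such as <quantity/>. The EmptyElementNames helper and the sample document are assumptions made for this example only.

```vb
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml

Module ReaderStateSketch

    ' Returns the names of elements written as empty tags.
    ' EmptyElementNames is a hypothetical helper for this sketch.
    Public Function EmptyElementNames(ByVal xmlText As String) As List(Of String)
        Dim result As New List(Of String)()

        Using reader As XmlReader = XmlReader.Create(New StringReader(xmlText))
            ' EOF becomes True once Read has moved past the last node.
            While Not reader.EOF
                ' IsStartElement skips ahead to the next content node
                ' and reports whether that node is a start tag.
                If reader.IsStartElement() AndAlso reader.IsEmptyElement Then
                    result.Add(reader.Name)
                End If
                reader.Read()
            End While
        End Using

        Return result
    End Function

    Sub Main()
        Dim xmlText As String = _
           "<FilmOrder><name>Grease</name><quantity/></FilmOrder>"
        For Each name As String In EmptyElementNames(xmlText)
            Console.WriteLine(name)
        Next
    End Sub

End Module
```

Note that IsEmptyElement is True only for the <quantity/> form; an element written as <quantity></quantity> reports False.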

.NET CLR types do not line up exactly with XML types, so the .NET Framework 2.0 added methods to XmlReader that make converting from XML types to .NET types easier.

Using the ReadElementContentAs method, you can perform the required conversion. The second argument is a namespace resolver, which can be Nothing when no prefixes need to be resolved:

Dim username As String = _
   CStr(myXmlReader.ReadElementContentAs(GetType(String), Nothing))
Dim myDate As DateTime = _
   CDate(myXmlReader.ReadElementContentAs(GetType(DateTime), Nothing))

Also available is a series of direct casts through new methods such as the following:

  • ReadElementContentAsBase64()

  • ReadElementContentAsBinHex()

  • ReadElementContentAsBoolean()

  • ReadElementContentAsDateTime()

  • ReadElementContentAsDecimal()

  • ReadElementContentAsDouble()

  • ReadElementContentAsFloat()

  • ReadElementContentAsInt()

  • ReadElementContentAsLong()

  • ReadElementContentAsObject()

  • ReadElementContentAsString()

In addition to these methods, the raw XML associated with the document can also be retrieved, using ReadInnerXml and ReadOuterXml. Again, this only scratches the surface of the XmlReader class, a class quite rich in functionality.
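The following sketch shows one of the typed methods, ReadElementContentAsInt, together with ReadOuterXml. It is an illustration under assumptions: the TotalQuantity and FirstOrderXml helpers and the <Orders> wrapper element are invented for the example, and the XML is read from a string.

```vb
Imports System.IO
Imports System.Xml

Module TypedReadSketch

    ' Adds up every <quantity> element. TotalQuantity is a
    ' hypothetical helper for this sketch.
    Public Function TotalQuantity(ByVal xmlText As String) As Integer
        Dim total As Integer = 0
        Using reader As XmlReader = XmlReader.Create(New StringReader(xmlText))
            While Not reader.EOF
                If reader.IsStartElement("quantity") Then
                    ' Reads the content as an Integer and moves past
                    ' the end tag, so no extra Read call is needed.
                    total += reader.ReadElementContentAsInt()
                Else
                    reader.Read()
                End If
            End While
        End Using
        Return total
    End Function

    ' Returns the raw markup of the first <FilmOrder> element.
    Public Function FirstOrderXml(ByVal xmlText As String) As String
        Using reader As XmlReader = XmlReader.Create(New StringReader(xmlText))
            If reader.ReadToFollowing("FilmOrder") Then
                Return reader.ReadOuterXml()
            End If
        End Using
        Return String.Empty
    End Function

    Sub Main()
        Dim xmlText As String = "<Orders>" & _
           "<FilmOrder><name>Grease</name><quantity>10</quantity></FilmOrder>" & _
           "<FilmOrder><name>Star Wars</name><quantity>2</quantity></FilmOrder>" & _
           "</Orders>"
        Console.WriteLine(TotalQuantity(xmlText))
        Console.WriteLine(FirstOrderXml(xmlText))
    End Sub

End Module
```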

Handling Exceptions

XML is text and could easily be read using mundane methods such as Read and ReadLine. A key feature of each class that reads and traverses XML is inherent support for error detection and handling. To demonstrate this, consider the following malformed XML document found in the file named Malformed.xml:

<?xml version="1.0" encoding="IBM437" ?>
<ElokuvaTilaus ElokuvaId="101", Maara="10">
   <Nimi>Grease</Nimi>
<ElokuvaTilaus>

This document may not immediately appear to be malformed. By wrapping a call to the method you developed (ReadMovieXml), you can see what type of exception is raised when XmlReader detects the malformed XML within this document:

Try
    ReadMovieXml("Malformed.xml")
Catch xmlEx As XmlException
    Console.Error.WriteLine("XML Error: " + xmlEx.ToString())
Catch ex As Exception
    Console.Error.WriteLine("Some other error: " + ex.ToString())
End Try

The methods and properties exposed by the XmlReader class raise exceptions of type System.Xml.XmlException. In fact, every class in the System.Xml namespace raises exceptions of type XmlException. Although this is a discussion of errors using an instance of type XmlReader, the concepts reviewed apply to all errors generated by classes found in the System.Xml namespace.

Properties exposed by XmlException include the following:

  • Data — A set of key-value pairs that enable you to display user-defined information about the exception

  • HelpLink — The link to the help page that deals with the exception

  • InnerException — The System.Exception instance indicating what caused the current exception

  • LineNumber — The number of the line within an XML document where the error occurred

  • LinePosition — The position within the line specified by LineNumber where the error occurred

  • Message — The error message that corresponds to the error that occurred. This error took place at the line in the XML document specified by LineNumber and within the line at the position specified by LinePosition.

  • Source — Provides the name of the application or object that triggered the error

  • SourceUri — Provides the URI of the element or document in which the error occurred

  • StackTrace — Provides a string representation of the frames on the call stack when the error was triggered

  • TargetSite — The method that triggered the error

The error displayed when subroutine ReadMovieXml processes Malformed.xml is as follows:

XML Error: System.Xml.XmlException: The ',' character, hexadecimal value 0x2C,
 cannot begin a name. Line 2, position 49.

The preceding snippet indicates that a comma separates the attributes in element <ElokuvaTilaus> (ElokuvaId="101", Maara="10"). This comma is invalid. Removing it and running the code again results in the following output:

XML Error: System.Xml.XmlException: This is an unexpected token. Expected
'EndElement'. Line 5, position 27.

Again, you can recognize the precise error. In this case, you do not have an end element, </ElokuvaTilaus>, but you do have an opening element, <ElokuvaTilaus>.

The properties provided by the XmlException class (such as LineNumber, LinePosition, and Message) provide a useful level of precision when tracking down errors. The reader also exposes a level of precision with respect to the parsing of the XML document. XmlTextReader and the readers created by XmlReader.Create over text implement the IXmlLineInfo interface, which provides the LineNumber and LinePosition properties.
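Both sources of position information can be sketched as follows. The FirstErrorLocation and LineOfElement helpers are hypothetical names for this illustration, and the sketch assumes that the reader returned by XmlReader.Create for text input can be cast to IXmlLineInfo.

```vb
Imports System.IO
Imports System.Xml

Module LineInfoSketch

    ' Returns "line,position" for the first XmlException raised while
    ' reading, or an empty string if the text parses cleanly.
    Public Function FirstErrorLocation(ByVal xmlText As String) As String
        Try
            Using reader As XmlReader = XmlReader.Create(New StringReader(xmlText))
                While reader.Read()
                    ' Nothing to do; reading is enough to trigger errors.
                End While
            End Using
        Catch xmlEx As XmlException
            Return xmlEx.LineNumber.ToString() & "," & xmlEx.LinePosition.ToString()
        End Try
        Return String.Empty
    End Function

    ' Returns the line on which the named element starts, or 0 if it
    ' is not found or no line information is available.
    Public Function LineOfElement(ByVal xmlText As String, _
                                  ByVal elementName As String) As Integer
        Using reader As XmlReader = XmlReader.Create(New StringReader(xmlText))
            Dim lineInfo As IXmlLineInfo = TryCast(reader, IXmlLineInfo)
            While reader.Read()
                If reader.NodeType = XmlNodeType.Element AndAlso _
                   reader.Name = elementName AndAlso _
                   lineInfo IsNot Nothing AndAlso lineInfo.HasLineInfo() Then
                    Return lineInfo.LineNumber
                End If
            End While
        End Using
        Return 0
    End Function

    Sub Main()
        Console.WriteLine(FirstErrorLocation("<a b=""1"", c=""2""></a>"))
        Console.WriteLine(LineOfElement("<ElokuvaTilaus>" & vbLf & _
           "  <Nimi>Grease</Nimi>" & vbLf & "</ElokuvaTilaus>", "Nimi"))
    End Sub

End Module
```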

Using the MemoryStream Object

A very useful class that can greatly help you when working with XML is System.IO.MemoryStream. Rather than needing a network or disk resource behind the stream (as in System.Net.Sockets.NetworkStream and System.IO.FileStream), MemoryStream is backed by a block of memory. Imagine that you want to generate an XML document and e-mail it. The built-in classes for sending e-mail rely on having a System.String containing a block of text for the message body, but if you want to generate an XML document, then you need a stream.

If the document is reasonably sized, then write the document directly to memory and copy that block of memory to the e-mail. This is good from a performance and reliability perspective because you don't have to open a file, write it, rewind it, and read the data back in again. However, you must consider scalability in this situation because if the file is very large, or if you have a great number of smaller files, then you could run out of memory (in which case you have to go the "file" route).
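The write-then-read-back idea, without the e-mail step, can be sketched as follows. This is a minimal illustration, not the EmailStream class itself: WriteOrderToString is a hypothetical helper, and it relies on the default CloseOutput value of False so that the MemoryStream stays open after the writer is disposed.

```vb
Imports System.IO
Imports System.Text
Imports System.Xml

Module MemoryStreamSketch

    ' Writes a small document to memory and returns it as a String.
    ' WriteOrderToString is a hypothetical helper for this sketch.
    Public Function WriteOrderToString() As String
        Using stream As New MemoryStream()
            Dim settings As New XmlWriterSettings()
            settings.Encoding = New UTF8Encoding(False) ' no byte order mark

            ' CloseOutput is False by default, so disposing the writer
            ' flushes it but leaves the MemoryStream open.
            Using writer As XmlWriter = XmlWriter.Create(stream, settings)
                writer.WriteStartElement("ElokuvaTilaus")
                writer.WriteElementString("Nimi", "Grease")
                writer.WriteEndElement()
            End Using

            ' Rewind before reading the bytes back as text.
            stream.Seek(0, SeekOrigin.Begin)
            Using reader As New StreamReader(stream, Encoding.UTF8)
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function

    Sub Main()
        Console.WriteLine(WriteOrderToString())
    End Sub

End Module
```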

This section describes how to generate an XML document to a MemoryStream object, reading the document back out again as a System.String value and e-mailing it. What you will do is create a new class called EmailStream that extends MemoryStream. This new class contains an extra method called CloseAndSend that, as its name implies, closes the stream and sends the e-mail message.

First, create a new console application project called "EmailStream." The first task is to create a basic Customer object that contains a few basic members and can be automatically serialized by .NET through use of the SerializableAttribute attribute:

<Serializable()> Public Class Customer

  ' members...
  Public Id As Integer
  Public FirstName As String
  Public LastName As String
  Public Email As String

End Class

The fun part is the EmailStream class itself. This needs access to the System.Net.Mail namespace, so import this namespace into your code for your class. The new class should also extend System.IO.MemoryStream, as shown here:

Imports System.IO
Imports System.Net.Mail

Public Class EmailStream
    Inherits MemoryStream

The first job of CloseAndSend is to start putting together the mail message. This is done by creating a new System.Net.Mail.MailMessage object and configuring the sender, recipient, and subject:

' CloseAndSend - close the stream and send the email...
Public Sub CloseAndSend(ByVal fromAddress As String, _
                        ByVal toAddress As String, _
                        ByVal subject As String)

   ' Create the new message...
   Dim message As New MailMessage()

   message.From = New MailAddress(fromAddress)
   message.To.Add(New MailAddress(toAddress))
   message.Subject = subject

This method will be called after the XML document has been written to the stream, so you can assume at this point that the stream contains a block of data. To read the data back out again, you have to rewind the stream and use a System.IO.StreamReader. Before you do this, however, call Flush. Traditionally, streams have always been buffered — that is, the data is not sent to the final destination (the memory block in this case, but a file in the case of a FileStream, and so on) each time the stream is written. Instead, the data is written in (mostly) a nondeterministic way. Because you need all the data to be written, you call Flush to ensure that all the data has been sent to the destination and that the buffer is empty.

In a way, EmailStream is a great example of buffering. All the data is held in a memory "buffer" until you finally send the data on to its destination in response to an explicit call to this method:

' Flush and rewind the stream...

Flush()
Seek(0, SeekOrigin.Begin)

Once you have flushed and rewound the stream, you can create a StreamReader and dredge all the data out into the Body property of the MailMessage object:

' Read out the data...

   Dim reader As New StreamReader(Me)
   message.Body = reader.ReadToEnd()

After you have done that, close the stream by calling the base class method:

' Close the stream...

Close()

Finally, send the message:

' Send the message...
   Dim SmtpMail As New SmtpClient()
   SmtpMail.Send(message)

End Sub

End Class
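The flush, rewind, and read sequence used by CloseAndSend can be tried in isolation. The following is a minimal sketch, not part of the EmailStream class: it assumes a plain MemoryStream as the destination, and the module name FlushAndRewindSketch and the function RoundTrip are hypothetical names chosen for illustration.

```vb
Imports System.IO

Module FlushAndRewindSketch

    ' Hypothetical helper: writes text through a StreamWriter, then
    ' flushes, rewinds, and reads everything back from the MemoryStream.
    Function RoundTrip(ByVal text As String) As String
        Dim stream As New MemoryStream()
        Dim writer As New StreamWriter(stream)

        writer.Write(text)

        ' Without this call, some or all of the text may still be
        ' sitting in the writer's buffer rather than in the stream.
        writer.Flush()

        ' Move the position back to the start before reading.
        stream.Seek(0, SeekOrigin.Begin)

        Dim reader As New StreamReader(stream)
        Return reader.ReadToEnd()
    End Function

    Sub Main()
        Console.WriteLine(RoundTrip("<Customer><Id>27</Id></Customer>"))
    End Sub

End Module
```

If the call to Flush is removed, the reader is likely to return an empty string for short input, because the text has not yet left the writer's buffer.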

To call this method, you need to add some code to the Main method. First, create a new Customer object and populate it with some test data:

Imports System.Xml.Serialization

Module Module1

  Sub Main()

    ' Create a new customer...
    Dim customer As New Customer
    customer.Id = 27
    customer.FirstName = "Bill"
    customer.LastName = "Gates"
    customer.Email = "[email protected]"

After you have done that, you can create a new EmailStream object. You then use XmlSerializer to write an XML document representing the newly created Customer instance to the block of memory that backs EmailStream:

' Create a new email stream...
Dim stream As New EmailStream

' Serialize...
Dim serializer As New XmlSerializer(customer.GetType())
serializer.Serialize(stream, customer)

At this point, the stream will be filled with data; and after all the data has been flushed, the block of memory that backs EmailStream will contain the complete document. Now you can call CloseAndSend to e-mail the document:

' Send the email...
    stream.CloseAndSend("[email protected]", _
       "[email protected]", "XML Customer Document")
  End Sub

End Module

You probably already have the Microsoft SMTP service properly configured — this service is necessary to send e-mail. You also need to ensure that the e-mail addresses used in your code go to your e-mail address! Run the project and check your e-mail; you should see something similar to what is shown in Figure 10-2.

Figure 10-2

Document Object Model (DOM)

The classes of the System.Xml namespace that support the Document Object Model (DOM) interact as illustrated in Figure 10-3.

Figure 10-3

Within this diagram, an XML document is contained in a class named XmlDocument. Each node within this document is accessible and managed using XmlNode. Nodes can also be accessed and managed using a class specifically designed to process a specific node's type (XmlElement, XmlAttribute, and so on). XML documents are extracted from XmlDocument using a variety of mechanisms exposed through such classes as XmlWriter, TextWriter, Stream, and a file (specified by filename of type String). XML documents are consumed by an XmlDocument using a variety of load mechanisms exposed through the same classes.

A DOM-style parser differs from a stream-style parser with respect to movement. Using the DOM, the nodes can be traversed forward and backward. Nodes can be added to the document, removed from the document, and updated. However, this flexibility comes at a performance cost. It is faster to read or write XML using a stream-style parser.

The DOM-specific classes exposed by System.Xml include the following:

  • XmlDocument — Corresponds to an entire XML document. A document is loaded using the Load method. XML documents are loaded from a file (the filename specified as type String), TextReader, or XmlReader. A document can be loaded using LoadXml in conjunction with a string containing the XML document. The Save method is used to save XML documents. The methods exposed by XmlDocument reflect the intricate manipulation of an XML document. For example, the following self-documenting creation methods are implemented by this class: CreateAttribute, CreateCDataSection, CreateComment, CreateDocumentFragment, CreateDocumentType, CreateElement, CreateEntityReference, CreateNavigator, CreateNode, CreateProcessingInstruction, CreateSignificantWhitespace, CreateTextNode, CreateWhitespace, and CreateXmlDeclaration. The elements contained in the document can be retrieved. Other methods support the retrieving, importing, cloning, loading, and writing of nodes.

  • XmlNode — Corresponds to a node within the DOM tree. This class supports data types, namespaces, and DTDs. A robust set of methods and properties is provided to create, delete, and replace nodes: AppendChild, Clone, CloneNode, CreateNavigator, InsertAfter, InsertBefore, Normalize, PrependChild, RemoveAll, RemoveChild, ReplaceChild, SelectNodes, SelectSingleNode, Supports, WriteContentTo, and WriteTo. The contents of a node can similarly be traversed in a variety of ways: FirstChild, LastChild, NextSibling, ParentNode, and PreviousSibling.

  • XmlElement — Corresponds to an element within the DOM tree. The functionality exposed by this class contains a variety of methods used to manipulate an element's attributes: AppendChild, Clone, CloneNode, CreateNavigator, GetAttribute, GetAttributeNode, GetElementsByTagName, GetNamespaceOfPrefix, GetPrefixOfNamespace, InsertAfter, InsertBefore, Normalize, PrependChild, RemoveAll, RemoveAllAttributes, RemoveAttribute, RemoveAttributeAt, RemoveAttributeNode, RemoveChild, ReplaceChild, SelectNodes, SelectSingleNode, SetAttribute, SetAttributeNode, Supports, WriteContentTo, and WriteTo.

  • XmlAttribute — Corresponds to an attribute of an element (XmlElement) within the DOM tree. An attribute contains data rather than lists of subordinate data, so it is a less complicated object than an XmlNode or an XmlElement. An XmlAttribute can retrieve its owner document (property, OwnerDocument), retrieve its owner element (property, OwnerElement), retrieve its parent node (property, ParentNode), and retrieve its name (property, Name). The value of an XmlAttribute is available via a read-write property named Value. Methods available to XmlAttribute include AppendChild, Clone, CloneNode, CreateNavigator, GetNamespaceOfPrefix, GetPrefixOfNamespace, InsertAfter, InsertBefore, Normalize, PrependChild, RemoveAll, RemoveChild, ReplaceChild, SelectNodes, SelectSingleNode, WriteContentTo, and WriteTo.

Given the diverse number of methods and properties exposed by XmlDocument, XmlNode, XmlElement, and XmlAttribute (and there are many more than those listed here), it's clear that any XML 1.0-compliant document can be generated and manipulated using these classes. In comparison to their XML stream counterparts, these classes offer more flexible movement within the XML document and more flexible editing of it.
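To make the creation methods listed above concrete, here is a small sketch that builds a document in memory and then revisits a node to change it. The module name DomBuildSketch and the function BuildOrderXml are hypothetical; the element and attribute names are borrowed from the movie order examples in this chapter.

```vb
Imports System.Xml

Module DomBuildSketch

    ' Hypothetical helper: builds a one-order document and returns its markup.
    Function BuildOrderXml() As String
        Dim doc As New XmlDocument()

        ' Create the root element and attach it to the document.
        Dim root As XmlElement = doc.CreateElement("multiFilmOrders")
        doc.AppendChild(root)

        ' Create a <FilmOrder> element with one attribute and one child.
        Dim filmOrder As XmlElement = doc.CreateElement("FilmOrder")
        filmOrder.SetAttribute("filmId", "101")
        root.AppendChild(filmOrder)

        Dim nameElement As XmlElement = doc.CreateElement("name")
        nameElement.InnerText = "Grease"
        filmOrder.AppendChild(nameElement)

        ' Nodes can be revisited and changed after they are created,
        ' which a forward-only stream writer does not allow.
        filmOrder.SetAttribute("filmId", "102")

        Return doc.OuterXml
    End Function

    Sub Main()
        Console.WriteLine(BuildOrderXml())
    End Sub

End Module
```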

A similar comparison could be made between DOM and data serialized and deserialized using XML. Using serialization, the type of node (for example, attribute or element) and the node name are specified at compile time. There is no on-the-fly modification of the XML generated by the serialization process.

Other technologies that generate and consume XML are not as flexible as the DOM. This includes ADO.NET and ADO, which generate XML of a particular form. The default install of SQL Server 2000 does expose a certain amount of flexibility when it comes to the generation (FOR XML queries) and consumption (OPENXML) of XML. SQL Server 2005 has more support for XML and even supports an XML data type. SQL Server 2005 also expands upon the FOR XML query with FOR XML TYPE. The choice between using classes within the DOM and a version of SQL Server is a choice between using a language such as Visual Basic to manipulate objects or installing SQL Server and performing most of the XML manipulation in SQL.

DOM Traversing Raw XML Elements

The first DOM example loads an XML document into an XmlDocument object using a string that contains the actual XML document. This scenario is typical of an application that uses ADO.NET to generate XML, but then uses the objects of the DOM to traverse and manipulate this XML. ADO.NET's DataSet object contains the results of ADO.NET data access operations. The DataSet class exposes a GetXml method that retrieves the underlying XML associated with the DataSet. The following code demonstrates how the contents of the DataSet are loaded into the XmlDocument:

Dim xmlDoc As New XmlDocument
Dim ds As New DataSet()

' Set up ADO.NET DataSet() here
xmlDoc.LoadXml(ds.GetXml())

This example over the next few pages simply traverses each XML element (XmlNode) in the document (XmlDocument) and displays the data accordingly. The data associated with this example is not retrieved from a DataSet but is instead contained in a string, rawData, which is initialized as follows:

Dim rawData As String = _
    "<multiFilmOrders>" & _
    "  <FilmOrder>" & _
    "    <name>Grease</name>" & _
    "    <filmId>101</filmId>" & _
    "    <quantity>10</quantity>" & _
    "  </FilmOrder>" & _
    "  <FilmOrder>" & _
    "    <name>Lawrence of Arabia</name>" & _
    "    <filmId>102</filmId>" & _
    "    <quantity>10</quantity>" & _
    "  </FilmOrder>" & _
    "</multiFilmOrders>"

The XML document in rawData is a portion of the XML hierarchy associated with a movie order. The preceding example is what you would do if you were using any of the .NET Framework versions before version 3.5. If you are working on the .NET Framework 3.5, then you can use the new XML literal capability offered. This means that you can now put XML directly in your code as XML and not as a string. Because an XML literal produces an XElement rather than a String, call ToString on the literal when you need the markup as text. This approach is presented here:

Dim rawData As String = _
   <multiFilmOrders>
      <FilmOrder>
         <name>Grease</name>
         <filmId>101</filmId>
         <quantity>10</quantity>
      </FilmOrder>
      <FilmOrder>
         <name>Lawrence of Arabia</name>
         <filmId>102</filmId>
         <quantity>10</quantity>
      </FilmOrder>
   </multiFilmOrders>.ToString()

The basic idea in processing this data is to traverse each <FilmOrder> element in order to display the data it contains. Each node corresponding to a <FilmOrder> element can be retrieved from your XmlDocument using the GetElementsByTagName method (specifying a tag name of FilmOrder). The GetElementsByTagName method returns a list of XmlNode objects in the form of a collection of type XmlNodeList. Using the For Each statement to iterate over this list, the XmlNodeList (movieOrderNodes) can be traversed as individual XmlNode elements (movieOrderNode). The code for handling this is as follows:

Dim xmlDoc As New XmlDocument
Dim movieOrderNodes As XmlNodeList
Dim movieOrderNode As XmlNode

xmlDoc.LoadXml(rawData)

' Traverse each <FilmOrder>
movieOrderNodes = xmlDoc.GetElementsByTagName("FilmOrder")

For Each movieOrderNode In movieOrderNodes
    '**********************************************************
    ' Process <name>, <filmId> and <quantity> here
    '**********************************************************
Next

Each XmlNode can then have its contents displayed by traversing the children of this node using the ChildNodes property. This property returns an XmlNodeList (baseDataNodes) that can be traversed one XmlNode list element at a time:

Dim baseDataNodes As XmlNodeList
Dim bFirstInRow As Boolean

baseDataNodes = movieOrderNode.ChildNodes
bFirstInRow = True

For Each baseDataNode As XmlNode In baseDataNodes
  If (bFirstInRow) Then
    bFirstInRow = False
  Else
    Console.Out.Write(", ")
  End If
  Console.Out.Write(baseDataNode.Name & ": " & baseDataNode.InnerText)
Next
Console.Out.WriteLine()

The bulk of the preceding code retrieves the name of each node using the Name property and its data using the InnerText property. The InnerText property of each XmlNode retrieved contains the data associated with the XML elements (nodes) <name>, <filmId>, and <quantity>. The example displays the contents of the XML elements using Console.Out. The XML document is displayed as follows:

name: Grease, filmId: 101, quantity: 10
name: Lawrence of Arabia, filmId: 102, quantity: 10

Other, more practical, methods for using this data could have been implemented, including the following:

  • The contents could have been directed to an ASP.NET Response object, and the data retrieved could have been used to create an HTML table (<table> table, <tr> row, and <td> data) that would be written to the Response object.

  • The data traversed could have been directed to a ListBox or ComboBox Windows Forms control. This would enable the data returned to be selected as part of a GUI application.

  • The data could have been edited as part of your application's business rules. For example, you could have used the traversal to verify that the <filmId> matched the <name>. Something like this could be done if you wanted to validate the data entered into the XML document in any manner.

Here is the example in its entirety:

' An XML literal is an XElement, so ToString is called to get the markup
Dim rawData As String = _
   <multiFilmOrders>
      <FilmOrder>
         <name>Grease</name>
         <filmId>101</filmId>
         <quantity>10</quantity>
      </FilmOrder>
      <FilmOrder>
         <name>Lawrence of Arabia</name>
         <filmId>102</filmId>
         <quantity>10</quantity>
      </FilmOrder>
   </multiFilmOrders>.ToString()

Dim xmlDoc As New XmlDocument
Dim movieOrderNodes As XmlNodeList
Dim movieOrderNode As XmlNode
Dim baseDataNodes As XmlNodeList
Dim bFirstInRow As Boolean

xmlDoc.LoadXml(rawData)

' Traverse each <FilmOrder>
movieOrderNodes = xmlDoc.GetElementsByTagName("FilmOrder")

For Each movieOrderNode In movieOrderNodes
  baseDataNodes = movieOrderNode.ChildNodes
  bFirstInRow = True
  For Each baseDataNode As XmlNode In baseDataNodes
    If (bFirstInRow) Then
      bFirstInRow = False
    Else
      Console.Out.Write(", ")
    End If
    Console.Out.Write(baseDataNode.Name & ": " & baseDataNode.InnerText)
  Next
  Console.Out.WriteLine()
Next
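As one illustration of the alternatives listed earlier, the same traversal can build an HTML table instead of writing to the console. This is only a sketch: the module name HtmlTableSketch and the function BuildHtmlTable are hypothetical, and the result is returned as a string rather than written to an ASP.NET Response object so that the example stays self-contained.

```vb
Imports System.Text
Imports System.Xml

Module HtmlTableSketch

    ' Hypothetical helper: returns an HTML table with one row per <FilmOrder>.
    Function BuildHtmlTable(ByVal rawXml As String) As String
        Dim xmlDoc As New XmlDocument()
        xmlDoc.LoadXml(rawXml)

        Dim html As New StringBuilder("<table>")
        For Each orderNode As XmlNode In xmlDoc.GetElementsByTagName("FilmOrder")
            html.Append("<tr>")
            For Each dataNode As XmlNode In orderNode.ChildNodes
                ' Skip anything that is not an element, such as comments.
                If dataNode.NodeType = XmlNodeType.Element Then
                    ' InnerText is not HTML-encoded here; a real page would need that.
                    html.Append("<td>").Append(dataNode.InnerText).Append("</td>")
                End If
            Next
            html.Append("</tr>")
        Next
        html.Append("</table>")
        Return html.ToString()
    End Function

    Sub Main()
        Dim rawData As String = _
            "<multiFilmOrders><FilmOrder><name>Grease</name>" & _
            "<filmId>101</filmId><quantity>10</quantity></FilmOrder></multiFilmOrders>"
        Console.WriteLine(BuildHtmlTable(rawData))
    End Sub

End Module
```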

DOM Traversing XML Attributes

This next example demonstrates how to traverse data contained in attributes and how to update the attributes based on a set of business rules. In this example, the XmlDocument object is populated by retrieving an XML document from a file. After the business rules edit the object, the data is persisted back to the file:

Dim xmlDoc As New XmlDocument

xmlDoc.Load("..\MovieSupplierShippingListV2.xml")
'*******************************************
' Business rules process document here
'*******************************************

xmlDoc.Save("..\MovieSupplierShippingListV2.xml")

The data contained in the file, MovieSupplierShippingListV2.xml, is a variation of the movie order. You have altered your rigid standard (for the sake of example) so that the data associated with individual movie orders is contained in XML attributes instead of XML elements. An example of this movie order data is as follows:

<FilmOrder name="Grease" filmId="101" quantity="10" />

You already know how to traverse the XML elements associated with a document, so let's assume that you have successfully retrieved the XmlNode associated with the <FilmOrder> element:

Dim attributes As XmlAttributeCollection
Dim filmId As Integer
Dim quantity As Integer
attributes = node.Attributes

For Each attribute As XmlAttribute In attributes
  ' Convert the attribute text explicitly rather than relying on
  ' an implicit String-to-Integer conversion
  If attribute.Name = "filmId" Then
    filmId = Integer.Parse(attribute.Value)
  ElseIf attribute.Name = "quantity" Then
    quantity = Integer.Parse(attribute.Value)
  End If
Next

The preceding code traverses the attributes of an XmlNode by retrieving a list of attributes using the Attributes property. The value of this property is assigned to the attributes variable (data type, XmlAttributeCollection). The individual XmlAttribute objects (variable, attribute) contained in attributes are traversed using a For Each loop. Within the loop, the contents of the filmId and the quantity attribute are saved for processing by your business rules.

Your business rules execute an algorithm that ensures that the movies in the company's order are provided in the correct quantity. This rule specifies that the movie associated with filmId=101 must be sent to the customer in batches of six at a time due to packaging. In the event of an invalid quantity, the code for enforcing this business rule will remove a single order from the quantity value until the number is divisible by six. Then this number is assigned to the quantity attribute. The Value property of the XmlAttribute object is used to set the correct value of the order's quantity. The code performing this business rule is as follows:

If filmId = 101 Then
  ' This film comes packaged in batches of six.
  ' Mod yields the remainder, so the loop stops at a multiple of six.
  Do Until (quantity Mod 6) = 0
    quantity -= 1
  Loop

  attributes.ItemOf("quantity").Value = quantity.ToString()
End If

What is elegant about this example is that the list of attributes was traversed using For Each. Then ItemOf was used to look up a specific attribute that had already been traversed. This would not have been possible by reading an XML stream with an object derived from the XML stream reader class, XmlReader.

You can use this code as follows:

Sub TraverseAttributes(ByVal node As XmlNode)
    Dim attributes As XmlAttributeCollection
    Dim filmId As Integer
    Dim quantity As Integer

    attributes = node.Attributes

    For Each attribute As XmlAttribute In attributes
        If attribute.Name = "filmId" Then
            filmId = Integer.Parse(attribute.Value)
        ElseIf attribute.Name = "quantity" Then
            quantity = Integer.Parse(attribute.Value)
        End If
    Next

    If filmId = 101 Then
       ' This film comes packaged in batches of six
       Do Until (quantity Mod 6) = 0
          quantity -= 1
       Loop

       attributes.ItemOf("quantity").Value = quantity.ToString()
    End If
End Sub

Sub WXReadMovieDOM()
    Dim xmlDoc As New XmlDocument
    Dim movieOrderNodes As XmlNodeList
    xmlDoc.Load("..\MovieSupplierShippingListV2.xml")

    ' Traverse each <FilmOrder>
    movieOrderNodes = xmlDoc.GetElementsByTagName("FilmOrder")

    For Each movieOrderNode As XmlNode In movieOrderNodes
        TraverseAttributes(movieOrderNode)
    Next

    xmlDoc.Save("..\MovieSupplierShippingListV2.xml")
End Sub
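The batch rule can also be exercised without a file on disk. The sketch below is an assumption-laden variant rather than the chapter's listing: it loads a single attribute-based <FilmOrder> from a string, replaces the decrementing loop with an equivalent Mod calculation (valid for non-negative quantities), and uses the hypothetical names BatchRuleSketch, RoundDownToBatch, and ApplyRule.

```vb
Imports System.Xml

Module BatchRuleSketch

    ' Hypothetical helper: rounds a non-negative quantity down to the
    ' nearest multiple of batchSize.
    Function RoundDownToBatch(ByVal quantity As Integer, _
                              ByVal batchSize As Integer) As Integer
        Return quantity - (quantity Mod batchSize)
    End Function

    ' Hypothetical helper: applies the batch-of-six rule for filmId 101
    ' to one <FilmOrder> fragment and returns the resulting markup.
    Function ApplyRule(ByVal orderXml As String) As String
        Dim xmlDoc As New XmlDocument()
        xmlDoc.LoadXml(orderXml)

        Dim filmOrder As XmlElement = xmlDoc.DocumentElement
        If filmOrder.GetAttribute("filmId") = "101" Then
            Dim quantity As Integer = _
                Integer.Parse(filmOrder.GetAttribute("quantity"))
            filmOrder.SetAttribute("quantity", _
                RoundDownToBatch(quantity, 6).ToString())
        End If
        Return xmlDoc.OuterXml
    End Function

    Sub Main()
        Console.WriteLine(ApplyRule( _
            "<FilmOrder name=""Grease"" filmId=""101"" quantity=""10"" />"))
    End Sub

End Module
```

For the sample order, the quantity attribute changes from 10 to 6; an order for any other filmId is returned unchanged.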

XSLT Transformations

XSLT is a language that is used to transform XML documents into another format altogether. One popular use of XSLT is to transform XML into HTML so that XML documents can be presented visually. You have performed a similar task before. When working with XML serialization, you rewrote the FilmOrder class. This class was used to serialize a movie order object to XML using nodes that contained English-language names. The rewritten version of this class, ElokuvaTilaus, serialized XML nodes containing Finnish names. Source Code Style attributes were used in conjunction with the XmlSerializer class to accomplish this transformation. Two words in this paragraph send chills down the spine of any experienced developer: rewrote and rewritten. The point of an XSL transform is to use an alternate language (XSLT) to transform the XML, rather than rewrite the source code, SQL commands, or some other mechanism used to generate XML.

Conceptually, XSLT is straightforward. A file with an .xslt extension describes the changes (transformations) that will be applied to a particular XML file. Once this is completed, an XSLT processor is provided with the source XML file and the XSLT file, and performs the transformation. The System.Xml.Xsl.XslTransform class is such an XSLT processor. Another processor you will find (introduced in the .NET Framework 2.0) is the XslCompiledTransform class found at System.Xml.Xsl.XslCompiledTransform. This section looks at using both of these processors.

There are also some new features to be found in Visual Studio 2008 that deal with XSLT. The new version of the IDE supports items such as XSLT data breakpoints and better support in the editor for loading large documents. Additionally, XSLT stylesheets can be compiled into assemblies even more easily with the new command-line stylesheet compiler, XSLTC.exe.

The XSLT file is itself an XML document, although certain elements within this document are XSLT-specific commands. Dozens of XSLT commands can be used in writing an XSLT file. The first example explores the following XSLT elements (commands):

  • stylesheet — This element indicates the start of the style sheet (XSL) in the XSLT file.

  • template — This element denotes a reusable template for producing specific output. This output is generated using a specific node type within the source document under a specific context. For example, the text <xsl:template match="/"> matches the root node ("/") of the source document for the specific transform template.

  • for-each — This element applies the same template to each node in the specified set. Recall the example class (FilmOrder_Multiple) that could be serialized. This class contained an array of movie orders. Given the XML document generated when a FilmOrder_Multiple is serialized, each movie order serialized could be processed using <xsl:for-each select = "FilmOrder_Multiple/multiFilmOrders/FilmOrder">.

  • value-of — This element retrieves the value of the specified node and inserts it into the document in text form. For example, <xsl:value-of select="name" /> would take the value of the XML element <name> and insert it into the transformed document.

When serialized, the FilmOrder_Multiple class generates XML such as the following (where...indicates where additional <FilmOrder> elements may reside):

<?xml version="1.0" encoding="UTF-8" ?>
<FilmOrder_Multiple>
    <multiFilmOrders>
        <FilmOrder>
            <name>Grease</name>
            <filmId>101</filmId>
            <quantity>10</quantity>
        </FilmOrder>
        ...
    </multiFilmOrders>
</FilmOrder_Multiple>

The preceding XML document is used to generate a report that is viewed by the manager of the movie supplier. This report is in HTML form, so that it can be viewed via the Web. The XSLT elements you previously reviewed (stylesheet, template, for-each, and value-of) are the only XSLT elements required to transform the XML document (in which data is stored) into an HTML file (data that can be displayed). An XSLT file, DisplayThatPuppy.xslt, contains the following text, which is used to transform a serialized FilmOrder_Multiple object:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <HTML>
      <TITLE>What people are ordering</TITLE>
      <BODY>
        <TABLE BORDER="1">
          <TR>
            <TD><B>Film Name</B></TD>
            <TD><B>Film ID</B></TD>
            <TD><B>Quantity</B></TD>
          </TR>
          <xsl:for-each select=
           "FilmOrder_Multiple/multiFilmOrders/FilmOrder">
            <TR>
              <TD><xsl:value-of select="name" /></TD>
              <TD><xsl:value-of select="filmId" /></TD>
              <TD><xsl:value-of select="quantity" /></TD>
            </TR>
          </xsl:for-each>
        </TABLE>
      </BODY>
    </HTML>
  </xsl:template>
</xsl:stylesheet>

In the preceding XSLT file, the XSLT elements are the ones that carry the xsl: prefix. These elements perform operations on the source XML file containing a serialized FilmOrder_Multiple object and generate the appropriate HTML file. Your file contains a table (marked by the table tag, <TABLE>) that contains a set of rows (each row marked by a table row tag, <TR>). The columns of the table are contained in table data tags, <TD>. The XSLT file contains the header row for the table:

<TR>
    <TD><B>Film Name</B></TD>
    <TD><B>Film ID</B></TD>
    <TD><B>Quantity</B></TD>
</TR>

Each row containing data (an individual movie order from the serialized object, FilmOrder_Multiple) is generated using the XSLT element, for-each, to traverse each <FilmOrder> element within the source XML document:

<xsl:for-each select=
    "FilmOrder_Multiple/multiFilmOrders/FilmOrder">

The individual columns of data are generated using the value-of XSLT element, in order to query the elements contained within each <FilmOrder> element (<name>, <filmId>, and <quantity>):

<TR>
    <TD><xsl:value-of select="name" /></TD>
    <TD><xsl:value-of select="filmId" /></TD>
    <TD><xsl:value-of select="quantity" /></TD>
</TR>

The code to create a displayable HTML file using the XslCompiledTransform object is as follows:

Dim myXslTransform As New XslCompiledTransform()

Dim destFileName As String = "..\ShowIt.html"
myXslTransform.Load("..\DisplayThatPuppy.xslt")
myXslTransform.Transform("..\FilmOrders.xml", destFileName)

System.Diagnostics.Process.Start(destFileName)

This consists of only five lines of code, with the bulk of the coding taking place in the XSLT file. The previous code snippet created an instance of a System.Xml.Xsl.XslCompiledTransform object named myXslTransform. The Load method of this class is used to load the XSLT file you previously reviewed, DisplayThatPuppy.xslt. The Transform method takes a source XML file as the first parameter, which in this case was a file containing a serialized FilmOrder_Multiple object. The second parameter is the destination file created by the transform (ShowIt.html). The Start method of the Process class is used to display the HTML file. This method launches a process that is best suited for displaying the file provided. Basically, the extension of the file dictates which application will be used to display the file. On a typical Windows machine, the program used to display this file is Internet Explorer, as shown in Figure 10-4.

Figure 10-4

Don't confuse displaying this HTML file with ASP.NET. Displaying an HTML file in this manner takes place on a single machine without the involvement of a Web server. Using ASP.NET is more complex than displaying an HTML page in the default browser.

As demonstrated, the backbone of the System.Xml.Xsl namespace is the XslCompiledTransform class. This class uses XSLT files to transform XML documents. Its predecessor, XslTransform, exposes the following methods and properties:

  • XmlResolver — This get/set property is used to specify a class (abstract base class, XmlResolver) that is used to handle external references (import and include elements within the style sheet). These external references are encountered when a document is transformed (the method, Transform, is executed). The System.Xml namespace contains a class, XmlUrlResolver, which is derived from XmlResolver. The XmlUrlResolver class resolves the external resource based on a URI.

  • Load — This overloaded method loads an XSLT style sheet to be used in transforming XML documents. It is permissible to specify the XSLT style sheet as a parameter of type XPathNavigator, filename of XSLT file (specified as parameter type String), XmlReader, or IXPathNavigable. For each type of XSLT supported, an overloaded member is provided that enables an XmlResolver to also be specified. For example, it is possible to call Load(String, XmlResolver), where String corresponds to a filename and XmlResolver is an object that handles references in the style sheet of type xsl:import and xsl:include. It would also be permissible to pass in a value of Nothing for the second parameter of the Load method (so that no XmlResolver would be specified).

  • Transform — This overloaded method transforms a specified XML document using the previously specified XSLT style sheet and an XmlResolver. The location where the transformed XML is to be output is specified as a parameter to this method. The first parameter of each overloaded method is the XML document to be transformed. This parameter can be represented as an IXPathNavigable, XML filename (specified as parameter type String), or XPathNavigator.

The most straightforward variant of the Transform method is Transform(String, String). In this case, a file containing an XML document is specified as the first parameter, and a filename that receives the transformed XML document is specified as the second parameter. A related variant, Transform(String, String, XmlResolver), accepts the XmlResolver as a third parameter. The first XSLT example utilized the two-parameter form of the Transform method:

myXslTransform.Transform("..\FilmOrders.xml", destFileName)

The first parameter to the Transform method can also be specified as IXPathNavigable or XPathNavigator. Either of these parameter types allows the XML output to be sent to an object of type Stream, TextWriter, or XmlWriter. When these two flavors of input are specified, a parameter containing an object of type XsltArgumentList can be specified. An XsltArgumentList object contains a list of arguments that are used as input to the transform.

When working with a .NET 2.0/3.5 project, it is preferable to use the XslCompiledTransform object instead of the XslTransform object, because the XslTransform object is considered obsolete.

The XslCompiledTransform object uses the same Load and Transform methods to pull the data. The Transform method provides the following signatures:

XslCompiledTransform.Transform(IXPathNavigable, XmlWriter)
XslCompiledTransform.Transform(IXPathNavigable, XsltArgumentList, XmlWriter)
XslCompiledTransform.Transform(IXPathNavigable, XsltArgumentList, TextWriter)
XslCompiledTransform.Transform(IXPathNavigable, XsltArgumentList, Stream)
XslCompiledTransform.Transform(XmlReader, XmlWriter)
XslCompiledTransform.Transform(XmlReader, XsltArgumentList, XmlWriter)
XslCompiledTransform.Transform(XmlReader, XsltArgumentList, TextWriter)
XslCompiledTransform.Transform(XmlReader, XsltArgumentList, Stream)
XslCompiledTransform.Transform(XmlReader, XsltArgumentList, XmlWriter,
   XmlResolver)
XslCompiledTransform.Transform(String, String)
XslCompiledTransform.Transform(String, XmlWriter)
XslCompiledTransform.Transform(String, XsltArgumentList, XmlWriter)
XslCompiledTransform.Transform(String, XsltArgumentList, TextWriter)
XslCompiledTransform.Transform(String, XsltArgumentList, Stream)

In these signatures, String represents the location of a file: the source XML document as the first parameter and, in Transform(String, String), the output file as the second. The .xslt file itself is supplied beforehand through the Load method. Some of the signatures also allow for output to XmlWriter objects, streams, and TextWriter objects, and most of those accept additional arguments for the transformation through the XsltArgumentList object.

The preceding example used the signature XslCompiledTransform.Transform(String, String), which asked for the source file and the destination file (both string representations of the location of said files):

myXslTransform.Transform("..\FilmOrders.xml", destFileName)
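The overloads that take an XsltArgumentList and a TextWriter can be sketched entirely in memory, with no files involved. In the following example the style sheet, the parameter name heading, the module name InMemoryTransformSketch, and the function TransformToString are all hypothetical and exist only for illustration; the calls used are Load(XmlReader), XsltArgumentList.AddParam, and Transform(XmlReader, XsltArgumentList, TextWriter).

```vb
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module InMemoryTransformSketch

    ' Hypothetical helper: transforms xmlText with xsltText, passing one
    ' parameter to the style sheet, and returns the result as a string.
    Function TransformToString(ByVal xmlText As String, _
                               ByVal xsltText As String, _
                               ByVal heading As String) As String
        Dim compiled As New XslCompiledTransform()
        compiled.Load(XmlReader.Create(New StringReader(xsltText)))

        ' Supplies a value for <xsl:param name="heading"/> in the style sheet.
        Dim args As New XsltArgumentList()
        args.AddParam("heading", "", heading)

        Dim output As New StringWriter()
        compiled.Transform(XmlReader.Create(New StringReader(xmlText)), _
                           args, output)
        Return output.ToString()
    End Function

    Sub Main()
        ' A tiny text-output style sheet, assumed for this sketch only.
        Dim xslt As String = _
            "<xsl:stylesheet version=""1.0"" " & _
            "xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">" & _
            "<xsl:output method=""text""/>" & _
            "<xsl:param name=""heading""/>" & _
            "<xsl:template match=""/"">" & _
            "<xsl:value-of select=""$heading""/>: " & _
            "<xsl:value-of select=""FilmOrder/name""/>" & _
            "</xsl:template></xsl:stylesheet>"

        Console.WriteLine(TransformToString( _
            "<FilmOrder><name>Grease</name></FilmOrder>", xslt, "Film"))
    End Sub

End Module
```

Writing to a StringWriter keeps the result in memory, which is convenient when the transformed document is passed on to other code rather than saved to disk.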

XSLT Transforming between XML Standards

The first example used four XSLT elements to transform an XML file into an HTML file. Such an example has merit, but it doesn't demonstrate an important use of XSLT: transforming XML from one standard into another standard. This may involve renaming elements/attributes, excluding elements/attributes, changing data types, altering the node hierarchy, and representing elements as attributes, and vice versa.

Returning to the example, a case of differing XML standards could easily affect your software that automates movie orders coming into a supplier. Imagine that the software, including its XML representation of a movie order, is so successful that you sell 100,000 copies. However, just as you are celebrating, a consortium of the largest movie supplier chains announces that they are no longer accepting faxed orders and that they are introducing their own standard for the exchange of movie orders between movie sellers and buyers.

Rather than panic, you simply ship an upgrade that includes an XSLT file. This upgrade (a bit of extra code plus the XSLT file) transforms your XML representation of a movie order into the XML representation dictated by the consortium of movie suppliers. Using an XSLT file enables you to ship the upgrade immediately. If the consortium of movie suppliers revises their XML representation, then you are not obliged to change your source code. Instead, you can simply ship the upgraded XSLT file that ensures each movie order document is compliant.

The specific source code that executes the transform is as follows:

Dim myXslCompiledTransform As XslCompiledTransform = New XslCompiledTransform

myXslCompiledTransform.Load("..\ConvertLegacyToNewStandard.xslt")
myXslCompiledTransform.Transform("..\MovieOrdersOriginal.xml", _
   "..\MovieOrdersModified.xml")

Those three lines of code accomplish the following:

  • Create an XslCompiledTransform object

  • Use the Load method to load an XSLT file (ConvertLegacyToNewStandard.xslt)

  • Use the Transform method to transform a source XML file (MovieOrdersOriginal.xml) into a destination XML file (MovieOrdersModified.xml)

Recall that the input XML document (MovieOrdersOriginal.xml) does not match the format required by your consortium of movie supplier chains. The content of this source XML file is as follows:

<?xml version="1.0" encoding="utf-8" ?>
<FilmOrder_Multiple>
    <multiFilmOrders>
        <FilmOrder>
            <name>Grease</name>
            <filmId>101</filmId>
            <quantity>10</quantity>
        </FilmOrder>
        ...
    </multiFilmOrders>
</FilmOrder_Multiple>

The format exhibited in the preceding XML document does not match the format of the consortium of movie supplier chains. To be accepted by the collective of suppliers, you must transform the document as follows:

  • Remove element <FilmOrder_Multiple>.

  • Remove element <multiFilmOrders>.

  • Rename element <FilmOrder> to <DvdOrder>.

  • Remove element <name> (the film's name is not to be contained in the document).

  • Rename element <quantity> to HowMuch and make HowMuch an attribute of <DvdOrder>.

  • Rename element <filmId> to FilmOrderNumber and make FilmOrderNumber an attribute of <DvdOrder>.

  • Display attribute HowMuch before attribute FilmOrderNumber.

Many of the steps performed by the transform could have been achieved using an alternative technology. For example, you could have used Source Code Style attributes with your serialization to generate the correct XML attribute and XML element name. Had you known in advance that a consortium of suppliers was going to develop a standard, you could have written your classes to be serialized based on the standard. The point is that you did not know and now one standard (your legacy standard) has to be converted into a newly adopted standard of the movie suppliers' consortium. The worst thing you could do would be to change your working code and then force all users working with the application to upgrade. It is vastly simpler to add an extra transformation step to address the new standard.

The XSLT file that facilitates the transform is named ConvertLegacyToNewStandard.xslt. A portion of this file is implemented as follows:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="FilmOrder">
    <!-- rename <FilmOrder> to <DvdOrder> -->
    <xsl:element name="DvdOrder">
      <!-- Make element 'quantity' attribute HowMuch
            Notice attribute HowMuch comes before attribute FilmOrderNumber -->
      <xsl:attribute name="HowMuch">
        <xsl:value-of select="quantity"></xsl:value-of>
      </xsl:attribute>
      <!-- Make element filmId attribute FilmOrderNumber -->
      <xsl:attribute name="FilmOrderNumber">
        <xsl:value-of select="filmId"></xsl:value-of>
      </xsl:attribute>
    </xsl:element>
    <!-- end of DvdOrder element -->
  </xsl:template>
</xsl:stylesheet>

In the previous snippet of XSLT, the following XSLT elements are used to facilitate the transformation:

  • <xsl:template match="FilmOrder"> — All operations in this template XSLT element take place on the original document's FilmOrder node.

  • <xsl:element name="DvdOrder"> — The element corresponding to the source document's FilmOrder element will be called DvdOrder in the destination document.

  • <xsl:attribute name="HowMuch"> — An attribute named HowMuch will be contained in the previously specified element, <DvdOrder>. This attribute XSLT element for HowMuch comes before the attribute XSLT element for FilmOrderNumber. This order was specified as part of your transform to adhere to the new standard.

  • <xsl:value-of select='quantity'> — Retrieve the value of the source document's <quantity> element and place it in the destination document. This instance of XSLT element value-of provides the value associated with the attribute HowMuch.

Two new XSLT elements have crept into your vocabulary: element and attribute. Both of these XSLT elements live up to their names. Specifying the XSLT element named element places an element in the destination XML document. Specifying the XSLT element named attribute places an attribute in the destination XML document.

The XSLT transform found in ConvertLegacyToNewStandard.xslt is too long to review in full here. When reading this file in its entirety, remember that it contains inline documentation that specifies precisely which aspect of the transformation is being performed at each location in the XSLT document. For example, the following XML code comments indicate what the XSLT element attribute is about to do:

<!-- Make element 'quantity' attribute HowMuch
      Notice attribute HowMuch comes before attribute FilmOrderNumber -->
<xsl:attribute name="HowMuch">
     <xsl:value-of select='quantity'></xsl:value-of>
</xsl:attribute>

The preceding example spans several pages but contains just three lines of code. This demonstrates that there is more to XML than learning how to use it in Visual Basic and the .NET Framework. Among other things, you also need a good understanding of XSLT, XPath, and XQuery.
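To experiment with the transform without touching the file system, the same Load and Transform calls can be pointed at in-memory strings. The following is a minimal sketch, not the book's ConvertLegacyToNewStandard.xslt: the module name TransformSketch, the helper ApplyTransform, and the cut-down style sheet (which only handles FilmOrder elements and is fed a single order) are assumptions made for illustration.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module TransformSketch
    ' Cut-down style sheet (an assumption, not the full ConvertLegacyToNewStandard.xslt).
    ' Single quotes are used inside the XML so the VB string needs no escaping.
    Public Const Stylesheet As String = _
        "<xsl:stylesheet version='1.0' " & _
        "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" & _
        "<xsl:template match='/'>" & _
        "<xsl:apply-templates select='//FilmOrder'/>" & _
        "</xsl:template>" & _
        "<xsl:template match='FilmOrder'>" & _
        "<xsl:element name='DvdOrder'>" & _
        "<xsl:attribute name='HowMuch'>" & _
        "<xsl:value-of select='quantity'/></xsl:attribute>" & _
        "<xsl:attribute name='FilmOrderNumber'>" & _
        "<xsl:value-of select='filmId'/></xsl:attribute>" & _
        "</xsl:element>" & _
        "</xsl:template>" & _
        "</xsl:stylesheet>"

    ' One order in the legacy format shown earlier.
    Public Const LegacyOrder As String = _
        "<FilmOrder_Multiple><multiFilmOrders><FilmOrder>" & _
        "<name>Grease</name><filmId>101</filmId><quantity>10</quantity>" & _
        "</FilmOrder></multiFilmOrders></FilmOrder_Multiple>"

    ' Hypothetical helper: applies an XSLT string to an XML string in memory.
    Public Function ApplyTransform(ByVal xmlText As String, _
                                   ByVal xsltText As String) As String
        Dim transform As New XslCompiledTransform()
        Using xsltReader As XmlReader = XmlReader.Create(New StringReader(xsltText))
            transform.Load(xsltReader)
        End Using

        Dim output As New StringWriter()
        Using source As XmlReader = XmlReader.Create(New StringReader(xmlText))
            ' Nothing means that no XsltArgumentList is supplied
            transform.Transform(source, Nothing, output)
        End Using
        Return output.ToString()
    End Function

    Sub Main()
        Console.WriteLine(ApplyTransform(LegacyOrder, Stylesheet))
    End Sub
End Module
```

Run against the sample order, this should emit a single DvdOrder element whose HowMuch attribute precedes FilmOrderNumber, with the film's name dropped.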

Other Classes and Interfaces in System.Xml.Xsl

We just took a good look at XSLT and the System.Xml.Xsl namespace, but there is a lot more to it than that. Other classes and interfaces exposed by the System.Xml.Xsl namespace include the following:

  • IXsltContextFunction — This interface accesses at runtime a given function defined in the XSLT style sheet.

  • IXsltContextVariable — This interface accesses at runtime a given variable defined in the XSLT style sheet.

  • XsltArgumentList — This class contains a list of arguments. These arguments are XSLT parameters or XSLT extension objects. The XsltArgumentList object is used in conjunction with the Transform method of XslCompiledTransform (and of the older XslTransform class).

  • XsltContext — This class contains the state of the XSLT processor. This context information enables XPath expressions to have their various components resolved (functions, parameters, and namespaces).

  • XsltException, XsltCompileException — These classes contain the information pertaining to an exception raised while transforming data. XsltCompileException is derived from XsltException.
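Of the classes listed above, XsltArgumentList is the one most likely to appear in everyday code. The sketch below passes a value into a style sheet parameter; the style sheet, the parameter name supplier, and the helper Summarize are hypothetical and exist only to illustrate the call sequence.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module ArgumentListSketch
    ' Hypothetical style sheet that declares a parameter named 'supplier'
    ' and writes plain text: the supplier name and the number of FilmOrder elements.
    Public Const Stylesheet As String = _
        "<xsl:stylesheet version='1.0' " & _
        "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" & _
        "<xsl:output method='text'/>" & _
        "<xsl:param name='supplier'/>" & _
        "<xsl:template match='/'>" & _
        "<xsl:value-of select='$supplier'/>" & _
        "<xsl:text>: </xsl:text>" & _
        "<xsl:value-of select='count(//FilmOrder)'/>" & _
        "</xsl:template>" & _
        "</xsl:stylesheet>"

    Public Function Summarize(ByVal xmlText As String, _
                              ByVal supplier As String) As String
        Dim transform As New XslCompiledTransform()
        Using xsltReader As XmlReader = XmlReader.Create(New StringReader(Stylesheet))
            transform.Load(xsltReader)
        End Using

        ' AddParam(name, namespaceUri, value): an empty namespace URI
        ' matches an unqualified <xsl:param>.
        Dim args As New XsltArgumentList()
        args.AddParam("supplier", "", supplier)

        Dim output As New StringWriter()
        Using source As XmlReader = XmlReader.Create(New StringReader(xmlText))
            transform.Transform(source, args, output)
        End Using
        Return output.ToString()
    End Function

    Sub Main()
        Console.WriteLine(Summarize("<Orders><FilmOrder/><FilmOrder/></Orders>", _
                                    "Consortium"))
    End Sub
End Module
```

Keeping such values in parameters rather than hard-coding them in the XSLT file means the same style sheet can be reused for different suppliers without being edited.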

ADO.NET

ADO.NET enables Visual Basic applications to generate XML documents and use such documents to update persisted data. ADO.NET natively represents its DataSet's underlying data store in XML. ADO.NET also enables SQL Server-specific XML support to be accessed. This chapter focuses on those features of ADO.NET that enable the XML generated and consumed to be customized. ADO.NET is covered in detail in Chapter 9.

The DataSet properties and methods that are pertinent to XML include Namespace, Prefix, GetXml, GetXmlSchema, InferXmlSchema, ReadXml, ReadXmlSchema, WriteXml, and WriteXmlSchema. An example of code that uses the GetXml method is shown here:

Dim adapter As New _
    SqlClient.SqlDataAdapter("SELECT ShipperID, CompanyName, Phone " & _
                   "FROM Shippers", _
                   "SERVER=localhost;UID=sa;PWD=sa;Database=Northwind;")
Dim ds As New DataSet()

adapter.Fill(ds)
Console.Out.WriteLine(ds.GetXml())

The preceding code uses the sample Northwind database, retrieving all rows from the Shippers table. This table was selected because it contains only three rows of data.

The examples in this section make use of the Northwind.mdf SQL Server Express Database file. To get this database, search for "Northwind and pubs Sample Databases for SQL Server 2000." You can find this link at www.microsoft.com/downloads/details.aspx?familyid=06616212-0356-46a0-8da2-eebc53a68034&displaylang=en. Once you've installed it, you'll find the Northwind.mdf file in the C:\SQL Server 2000 Sample Databases directory. To add this database to your application, right-click on the solution you are working with and select Add Existing Item. From the provided dialog, you can then browse to the location of the Northwind.mdf file that you just installed. When the file is added, you will encounter a Data Source Configuration Wizard. For the purposes of this chapter, simply press the Cancel button when you encounter this dialog. If you have trouble getting permissions to work with the database, make a data connection to the file from the Visual Studio Server Explorer. You will be asked whether you want to be made the appropriate user of the database, and Visual Studio will make the necessary changes on your behalf.

The XML returned by GetXml is as follows (where ... signifies that <Table> elements were removed for the sake of brevity):

<NewDataSet>
  <Table>
    <ShipperID>1</ShipperID>
    <CompanyName>Speedy Express</CompanyName>
    <Phone>(503) 555-9831</Phone>
  </Table>
  ...
</NewDataSet>

What you are trying to determine from this XML document is how to customize the XML generated. The more customization you can perform at the ADO.NET level, the less will be needed later. With this in mind, note that the root element is <NewDataSet> and that each row of the DataSet is returned as an XML element, <Table>. The data returned is contained in an XML element named for the column in which the data resides (<ShipperID>, <CompanyName>, and <Phone>, respectively).

The root element, <NewDataSet>, is just the default name of the DataSet. This name could have been changed when the DataSet was constructed by specifying the name as a parameter to the constructor:

Dim ds As New DataSet("WeNameTheDataSet")

If the previous version of the constructor were executed, then the <NewDataSet> element would be renamed <WeNameTheDataSet>. After the DataSet has been constructed, you can still set the property DataSetName, thus changing <NewDataSet> to a name such as <WeNameTheDataSetAgain>:

ds.DataSetName = "WeNameTheDataSetAgain"

The <Table> element is actually the name of a table in the DataSet's Tables property. Programmatically, you can change <Table> to <WeNameTheTable>:

ds.Tables("Table").TableName = "WeNameTheTable"

You can customize the names of the data columns returned by modifying the SQL to use alias names. For example, you could retrieve the same data but generate different elements using the following SQL code:

SELECT ShipperID As TheID, CompanyName As CName, Phone
   As TelephoneNumber FROM Shippers

Using the preceding SQL statement, the <ShipperID> element would become the <TheID> element. The <CompanyName> element would become <CName>, and <Phone> would become <TelephoneNumber>. The column names can also be changed programmatically by using the Columns property associated with the table in which the column resides. An example of this follows, where the XML element <TheID> is changed to <AnotherNewName>:

ds.Tables("WeNameTheTable").Columns("TheID").ColumnName = "AnotherNewName"

This XML could be transformed using System.Xml.Xsl. It could be read as a stream (XmlTextReader) or written as a stream (XmlTextWriter). The XML returned by ADO.NET could even be deserialized and used to create an object or objects using XmlSerializer. The point is to recognize what ADO.NET-generated XML looks like. If you know its format, then you can transform it into whatever you like.
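The renaming steps described above can be tried without a database connection. The following sketch builds a one-row DataSet in memory and applies the same DataSetName, TableName, and ColumnName changes before calling GetXml. The module and function names are hypothetical, and the hand-built table merely stands in for the Shippers query.

```vb
Imports System
Imports System.Data

Module DataSetNamingSketch
    ' Builds a one-row DataSet in memory (no SQL Server needed), renames the
    ' table and a column, and returns the XML produced by GetXml.
    Public Function BuildXml() As String
        ' The constructor argument replaces the default <NewDataSet> root
        Dim ds As New DataSet("WeNameTheDataSet")

        ' "Table" is the name a data adapter would assign by default
        Dim table As DataTable = ds.Tables.Add("Table")
        table.Columns.Add("TheID", GetType(Integer))
        table.Columns.Add("CName", GetType(String))
        table.Rows.Add(1, "Speedy Express")

        ' Rename the row element and one of the column elements
        ds.Tables("Table").TableName = "WeNameTheTable"
        ds.Tables("WeNameTheTable").Columns("TheID").ColumnName = "AnotherNewName"

        Return ds.GetXml()
    End Function

    Sub Main()
        Console.WriteLine(BuildXml())
    End Sub
End Module
```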

ADO.NET and SQL Server 2000's Built-in XML Features

Those interested in fully exploring the XML-specific features of SQL Server should take a look at Professional SQL Server 2000 Programming by Robert Vieira (Wrox Press, 2000). However, because the content of that book is not .NET-specific, the next example forms a bridge between Professional SQL Server 2000 Programming and the .NET Framework.

Two of the major XML-related features exposed by SQL Server are as follows:

  • FOR XML — The FOR XML clause of an SQL SELECT statement enables a rowset to be returned as an XML document. The XML document generated by a FOR XML clause is highly customizable with respect to the document hierarchy generated, per-column data transforms, representation of binary data, XML schema generated, and a variety of other XML nuances.

  • OPENXML — The OPENXML extension to Transact-SQL enables a stored procedure call to manipulate an XML document as a rowset. Subsequently, this rowset can be used to perform a variety of tasks, such as SELECT, INSERT INTO, DELETE, and UPDATE.

SQL Server's support for OPENXML is a matter of calling a stored procedure. A developer who can execute a stored procedure call using Visual Basic in conjunction with ADO.NET can take full advantage of SQL Server's support for OPENXML. FOR XML queries have a certain caveat when it comes to ADO.NET. To understand this caveat, consider the following FOR XML query:

SELECT ShipperID, CompanyName, Phone FROM Shippers FOR XML RAW

Using SQL Server's Query Analyzer, this FOR XML RAW query generated the following XML:

<row ShipperID="1" CompanyName="Speedy Express" Phone="(503) 555-9831" />
<row ShipperID="2" CompanyName="United Package" Phone="(503) 555-3199" />
<row ShipperID="3" CompanyName="Federal Shipping" Phone="(503) 555-9931" />

The same FOR XML RAW query can be executed from ADO.NET as follows:

Dim adapter As New _
    SqlDataAdapter("SELECT ShipperID, CompanyName, Phone " & _
                   "FROM Shippers FOR XML RAW", _
                   "SERVER=localhost;UID=sa;PWD=sa;Database=Northwind;")
Dim ds As New DataSet

adapter.Fill(ds)
Console.Out.WriteLine(ds.GetXml())

The caveat with respect to a FOR XML query is that all data (the XML text) must be returned via a result set containing a single row and a single column named XML_F52E2B61-18A1-11d1-B105-00805F49916B.

The output from the preceding code snippet demonstrates this caveat (where ... represents similar data not shown for reasons of brevity):

<NewDataSet>
  <Table>
    <XML_F52E2B61-18A1-11d1-B105-00805F49916B>
      &lt;row ShipperID="1" CompanyName="Speedy Express"
      Phone="(503) 555-9831"/&gt;
      ...
    </XML_F52E2B61-18A1-11d1-B105-00805F49916B>
  </Table>
</NewDataSet>

The value of the single row and single column returned contains what looks like XML, but it contains &lt; instead of the less-than character, and &gt; instead of the greater-than character. A literal < cannot appear inside XML character data (and > is conventionally escaped along with it), so both must be entity-encoded, that is, represented as &lt; and &gt;. The data returned in element <XML_F52E2B61-18A1-11d1-B105-00805F49916B> is not XML, but data contained in an XML document.
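The difference between XML and data that merely looks like XML can be reproduced with XmlDocument alone. In this sketch (the module, the helper names, and the Holder element are hypothetical), markup assigned to InnerText is stored as character data and is therefore entity-encoded when serialized; reading InnerText back decodes it again.

```vb
Imports System
Imports System.Xml

Module EntityEncodingSketch
    Public Const RawRow As String = _
        "<row ShipperID='1' CompanyName='Speedy Express' />"

    ' Stores markup as the *text* of an element and returns the serialized element.
    Public Function WrapAsText(ByVal markup As String) As String
        Dim doc As New XmlDocument()
        Dim holder As XmlElement = doc.CreateElement("Holder")
        ' InnerText treats the string as character data, not as child nodes
        holder.InnerText = markup
        doc.AppendChild(holder)
        Return doc.OuterXml
    End Function

    ' Reading InnerText back decodes the entities and recovers the markup.
    Public Function Unwrap(ByVal wrapped As String) As String
        Dim doc As New XmlDocument()
        doc.LoadXml(wrapped)
        Return doc.DocumentElement.InnerText
    End Function

    Sub Main()
        Dim wrapped As String = WrapAsText(RawRow)
        Console.WriteLine(wrapped)          ' the row appears entity-encoded
        Console.WriteLine(Unwrap(wrapped))  ' the original markup again
    End Sub
End Module
```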

To fully utilize FOR XML queries, the data must be accessible as XML. The solution to this quandary is the ExecuteXmlReader method of the SqlCommand class. When this method is called, a SqlCommand object assumes that it is executing a FOR XML query and returns the results of this query as an XmlReader object. An example of this follows:

Dim connection As New _
    SqlConnection("SERVER=localhost;UID=sa;PWD=sa;Database=Northwind;")
Dim command As New _
    SqlCommand("SELECT ShipperID, CompanyName, Phone " & _
                    "FROM Shippers FOR XML RAW", connection)

connection.Open()
' ExecuteXmlReader returns the FOR XML result as an XmlReader
Dim xmlReader As XmlReader = command.ExecuteXmlReader()
' Extract results from the XmlReader, then close it and the connection

You will need to import the System.Data.SqlClient and System.Xml namespaces for this example to work.

The object returned by ExecuteXmlReader is typed as XmlReader, the abstract base class from which XmlTextReader derives. It provides forward-only access to the XML produced by the query, which can be traversed using the methods and properties exposed by XmlReader. Streaming XML generation and retrieval was discussed earlier.
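What "extract results" might look like is sketched below. Because a live SQL Server is not assumed, the XmlReader here is created over a hard-coded fragment that imitates FOR XML RAW output; in real code, the reader returned by ExecuteXmlReader would be passed to the same helper. The module and helper names are hypothetical.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml

Module RowReaderSketch
    ' Hypothetical helper: walks any XmlReader positioned over FOR XML RAW
    ' output and collects the CompanyName attribute of each <row> element.
    Public Function ReadCompanyNames(ByVal reader As XmlReader) As List(Of String)
        Dim names As New List(Of String)()
        While reader.Read()
            If reader.NodeType = XmlNodeType.Element AndAlso reader.Name = "row" Then
                names.Add(reader.GetAttribute("CompanyName"))
            End If
        End While
        Return names
    End Function

    ' Stand-in for command.ExecuteXmlReader(). FOR XML RAW yields several
    ' top-level <row> elements, so this reader must accept XML fragments.
    Public Function CreateSampleReader() As XmlReader
        Dim settings As New XmlReaderSettings()
        settings.ConformanceLevel = ConformanceLevel.Fragment
        Dim fragment As String = _
            "<row ShipperID='1' CompanyName='Speedy Express' />" & _
            "<row ShipperID='2' CompanyName='United Package' />"
        Return XmlReader.Create(New StringReader(fragment), settings)
    End Function

    Sub Main()
        Using reader As XmlReader = CreateSampleReader()
            For Each companyName As String In ReadCompanyNames(reader)
                Console.WriteLine(companyName)
            Next
        End Using
    End Sub
End Module
```

Wrapping the reader in a Using block matters in the database case as well, since the connection typically cannot be reused for another command until the reader has been closed.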

Using the ExecuteXmlReader method of the SqlCommand class, it is possible to retrieve the result of FOR XML queries. What makes the FOR XML style of queries so powerful is that it can configure the data retrieved. The three types of FOR XML queries support the following forms of XML customization:

  • FOR XML RAW — This type of query returns each row of a result set inside an XML element named <row>. The data retrieved is contained as attributes of the <row> element. The attributes are named for the column name or column alias in the FOR XML RAW query.

  • FOR XML AUTO — By default, this type of query returns each row of a result set inside an XML element named for the table or table alias contained in the FOR XML AUTO query. The data retrieved is contained as attributes of this element. The attributes are named for the column name or column alias in the FOR XML AUTO query. By specifying FOR XML AUTO, ELEMENTS, it is possible to retrieve all data inside elements, rather than inside attributes. All data retrieved must be in attribute or element form. There is no mix-and-match capability.

  • FOR XML EXPLICIT — This form of the FOR XML query enables the precise XML type of each column returned to be specified. The data associated with a column can be returned as an attribute or an element. Specific XML types, such as CDATA and ID, can be associated with a column returned. Even the level in the XML hierarchy in which data resides can be specified using a FOR XML EXPLICIT query. This style of query is fairly complicated to implement.

FOR XML queries are flexible. Using FOR XML EXPLICIT and the movie rental database, it would be possible to generate any form of XML movie order standard. The decision that needs to be made is where XML configuration takes place. Using Visual Basic, a developer could use XmlTextReader and XmlTextWriter to create any style of XML document. Using the XSLT language and an XSLT file, the same level of configuration can be achieved. SQL Server and, in particular, FOR XML EXPLICIT, enable the same level of XML customization, but this customization takes place at the SQL level and may even be encapsulated in stored procedure calls.

XML and SQL Server 2005

As a representation for data, XML is ideal in that it is a self-describing data format that enables you to provide your data sets as complex data types. It also provides order to your data. SQL Server 2005 embraces this direction.

More and more developers are turning to XML as a means of data storage. For instance, Microsoft Office enables documents to be saved and stored as XML documents. As an increasing number of products and solutions turn toward XML as a means of storage, this allows for a separation between the underlying data and the presentation aspect of what is being viewed. XML is also being used as a means of communicating data sets across platforms and the enterprise. The entire XML Web Services story is a result of this new capability. Simply said, XML is a powerful alternative to your data storage solutions.

Just remember that the power of using XML isn't only about storing data as XML somewhere (whether that is XML files or not); it is also about the capability to quickly access this XML data and to be able to query the data that is retrieved.

SQL Server 2005 makes a big leap toward XML in adding an XML data type as an option. This enables you to unify the relational aspects of the database and the current desires to work with XML data.

FOR XML has also been expanded in this edition of SQL Server. This includes a new TYPE directive that returns an XML data type instance. In addition, the .NET Framework 2.0 introduced a new class, SqlXml (in the System.Data.SqlTypes namespace), that enables you to easily work with the XML data that comes from SQL Server 2005. A SqlXml object exposes its content through an XmlReader returned by its CreateReader method. Another addition is the SqlDataReader object's GetSqlXml method, which returns a column value as a SqlXml instance.

XML and SQL Server 2008

SQL Server 2008 continues on this path and introduces some new XML features. First, it supports lax validation using XSD schemas. This wasn't possible prior to this release. Another big change is related to how SQL Server handles the storage of dateTime values. In SQL Server 2005, when you stored dateTime values, the database would first normalize everything to UTC time, regardless of whether or not you wanted to store the information in a specific time zone. In addition, if you excluded the time in your dateTime declaration, SQL Server 2005 would add it back for you so that there was a full dateTime stored within the database. SQL Server 2008, conversely, enables you to store the dateTime value exactly as you declared it. No modifications or alterations are made to your value as it is stored in the database.

Another new feature of SQL Server 2008 is support of union types that contain list types. This means that you can now work with elements such as the following:

<Stocks>INTC MSFT CSCO IBM RTRSY</Stocks>

A list type enables you to define multiple items within a single element, separated by spaces, rather than define each item as a separate element, as shown here:

<Stocks>
   <Item>INTC</Item>
   <Item>MSFT</Item>
   <Item>CSCO</Item>
   <Item>IBM</Item>
   <Item>RTRSY</Item>
</Stocks>

XML in ASP.NET 3.5

Most Microsoft-focused Web developers have usually concentrated on either Microsoft SQL Server or Microsoft Access for their data storage needs. Today, however, a considerable amount of data is stored in XML format, so considerable inroads have been made in improving Microsoft's core Web technology to work easily with this format.

The XmlDataSource Server Control

ASP.NET 3.5 contains a series of data source controls designed to bridge the gap between your data stores (such as XML) and the data-bound controls at your disposal. These new data controls not only enable you to retrieve data from various data stores, they also enable you to easily manipulate the data (using paging, sorting, editing, and filtering) before the data is bound to an ASP.NET server control.

With XML being as important as it is, a specific data source control is available in ASP.NET just for retrieving and working with XML data: XmlDataSource. This control enables you to connect to your XML data and use this data with any of the ASP.NET data-bound controls. Just like the SqlDataSource and the ObjectDataSource controls (which are two of the other data source controls), the XmlDataSource control enables you to not only retrieve data, but also insert, delete, and update data items. With increasing numbers of users turning to XML data formats, such as Web services, RSS feeds, and more, this control is a valuable resource for your Web applications.

To show the XmlDataSource control in action, first create a simple XML file and include this file in your application. The following code reflects a simple XML file of Russian painters:

<?xml version="1.0" encoding="utf-8" ?>
<Artists>
   <Painter name="Vasily Kandinsky">
      <Painting>
         <Title>Composition No. 218</Title>
         <Year>1919</Year>
      </Painting>
   </Painter>
   <Painter name="Pavel Filonov">
      <Painting>
         <Title>Formula of Spring</Title>
         <Year>1929</Year>
      </Painting>
   </Painter>
   <Painter name="Pyotr Konchalovsky">
      <Painting>
         <Title>Sorrento Garden</Title>
         <Year>1924</Year>
      </Painting>
   </Painter>
</Artists>

Now that the Painters.xml file is in place, the next step is to use an ASP.NET DataList control and connect this DataList control to an <asp:XmlDataSource> control, as shown here:

<%@ Page Language="VB"%>

<html xmlns="http://www.w3.org/1999/xhtml" >
<head runat="server">
    <title>XmlDataSource</title>
</head>
<body>
    <form id="form1" runat="server">
        <asp:DataList ID="DataList1" Runat="server"
         DataSourceID="XmlDataSource1">
            <ItemTemplate>
                <p><b><%# XPath("@name") %></b><br />
                <i><%# XPath("Painting/Title") %></i><br />
                <%# XPath("Painting/Year") %></p>
            </ItemTemplate>
        </asp:DataList>

        <asp:XmlDataSource ID="XmlDataSource1" Runat="server"
         DataFile="~/Painters.xml" XPath="Artists/Painter">
        </asp:XmlDataSource>
    </form>
</body>
</html>

This is a simple example, but it shows you the power and ease of using the XmlDataSource control. Pay attention to two attributes in this example. The first is the DataFile attribute. This attribute points to the location of the XML file. Because the file resides in the root directory of the application, it is simply ~/Painters.xml. The next attribute included in the XmlDataSource control is the XPath attribute. The XmlDataSource control uses XPath for the filtering of XML data. In this case, the XmlDataSource control is taking everything within the <Painter> set of elements. The value Artists/Painter means that the XmlDataSource control navigates to the <Artists> element and then to the <Painter> element within the specified XML file.

The DataList control next must specify the DataSourceID as the XmlDataSource control. In the <ItemTemplate> section of the DataList control, you can retrieve specific values from the XML file by using XPath commands. The XPath commands filter the data from the XML file. The first value retrieved is an element attribute (name) contained in the <Painter> element. When you retrieve an attribute of an element, you preface the name of the attribute with an @ symbol. In this case, you simply specify @name to get the painter's name. The next two XPath commands go deeper into the XML file, getting the specific painting and the year of the painting. Remember to separate nodes with a /. When run in the browser, this code produces the results shown in Figure 10-5.

Figure 10-5
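The XPath expressions used by the page can also be checked outside ASP.NET. The following sketch evaluates the same expressions (Artists/Painter, @name, Painting/Title, and Painting/Year) with XmlDocument against a shortened copy of Painters.xml. The module and function names are hypothetical, and the data-binding XPath() method is approximated here with SelectSingleNode.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module PainterXPathSketch
    ' Shortened copy of Painters.xml (two of the three painters).
    Public Const PaintersXml As String = _
        "<Artists>" & _
        "<Painter name='Vasily Kandinsky'><Painting>" & _
        "<Title>Composition No. 218</Title><Year>1919</Year>" & _
        "</Painting></Painter>" & _
        "<Painter name='Pavel Filonov'><Painting>" & _
        "<Title>Formula of Spring</Title><Year>1929</Year>" & _
        "</Painting></Painter>" & _
        "</Artists>"

    Public Function Describe(ByVal xmlText As String) As List(Of String)
        Dim doc As New XmlDocument()
        doc.LoadXml(xmlText)

        Dim lines As New List(Of String)()
        ' Same expression as the XPath attribute of the XmlDataSource control
        For Each painter As XmlNode In doc.SelectNodes("Artists/Painter")
            ' Same relative expressions as the XPath() calls in the ItemTemplate
            Dim painterName As String = painter.SelectSingleNode("@name").Value
            Dim title As String = painter.SelectSingleNode("Painting/Title").InnerText
            Dim paintingYear As String = painter.SelectSingleNode("Painting/Year").InnerText
            lines.Add(painterName & ": " & title & " (" & paintingYear & ")")
        Next
        Return lines
    End Function

    Sub Main()
        For Each line As String In Describe(PaintersXml)
            Console.WriteLine(line)
        Next
    End Sub
End Module
```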

Besides working from static XML files such as the Painters.xml file, the XmlDataSource control can work from dynamic, URL-accessible XML files. One XML format pervasive on the Internet today is the RSS feed published by blogs, or weblogs. Blogs, or personal diaries, can be viewed in the browser, through an RSS aggregator, or just as pure XML.

Figure 10-6 shows blog entries directly in the browser (if you are using IE7). Behind this blog is an actual XML document that can be worked with by your code. You can find a lot of blogs to play with for this example at weblogs.asp.net. This screen shot uses the blog found at www.geekswithblogs.net/evjen.

Figure 10-6

Now that you know the location of the XML from the blog, you can use this XML with the XmlDataSource control and display some of the results in a DataList control. The code for this example is shown here:

<%@ Page Language="VB"%>

<html xmlns="http://www.w3.org/1999/xhtml" >
<head runat="server">
    <title>XmlDataSource</title>
</head>
<body>
    <form id="form1" runat="server">
        <asp:DataList ID="DataList1" Runat="server"
         DataSourceID="XmlDataSource1">
            <HeaderTemplate>
                <table border="1" cellpadding="3">
            </HeaderTemplate>
            <ItemTemplate>
                <tr><td><b><%# XPath("title") %></b><br />
                <i><%# XPath("pubDate") %></i><br />
                <%# XPath("description") %></td></tr>
            </ItemTemplate>
            <AlternatingItemTemplate>
                <tr bgcolor="LightGrey"><td><b><%# XPath("title") %></b><br />
                <i><%# XPath("pubDate") %></i><br />
                <%# XPath("description") %></td></tr>
            </AlternatingItemTemplate>
            <FooterTemplate>
                </table>
            </FooterTemplate>
        </asp:DataList>

        <asp:XmlDataSource ID="XmlDataSource1" Runat="server"
         DataFile="http://geekswithblogs.net/evjen/Rss.aspx"
         XPath="rss/channel/item">
        </asp:XmlDataSource>
    </form>
</body>
</html>

This example shows that the DataFile points to a URL where the XML is retrieved. The XPath property filters out all the <item> elements from the RSS feed. The DataList control creates an HTML table and pulls out specific data elements from the RSS feed, such as the <title>, <pubDate>, and <description> elements.

Running this page in the browser results in something similar to what is shown in Figure 10-7.

Figure 10-7

This approach also works with XML Web Services, even those for which you can pass in parameters using HTTP-GET. You just set up the DataFile value in the following manner:

DataFile="http://www.someserver.com/GetWeather.asmx/ZipWeather?zipcode=63301"

The XmlDataSource Control's Namespace Problem

One big issue with using the XmlDataSource control is that when using the XPath capabilities of the control, it is unable to understand namespace-qualified XML. The XmlDataSource control chokes on any XML data that contains namespaces, so it is important to yank out any prefixes and namespaces contained in the XML.

To make this a bit easier, the XmlDataSource control includes the TransformFile attribute. This attribute takes your XSLT transform file, which can be applied to the XML pulled from the XmlDataSource control. That means you can use an XSLT file, which will transform your XML in such a way that the prefixes and namespaces are completely removed from the overall XML document. An example of this XSLT document is illustrated here:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
   <xsl:template match="*">
      <!-- Remove any prefixes -->
      <xsl:element name="{local-name()}">
          <!-- Work through attributes -->
          <xsl:for-each select="@*">
             <!-- Remove any attribute prefixes -->
             <xsl:attribute name="{local-name()}">
                <xsl:value-of select="."/>
             </xsl:attribute>
          </xsl:for-each>
          <xsl:apply-templates/>
      </xsl:element>
   </xsl:template>
</xsl:stylesheet>

Now, with this XSLT document in place within your application, you can use the XmlDataSource control to pull XML data and strip that data of any prefixes and namespaces:

<asp:XmlDataSource ID="XmlDataSource1" runat="server"
 DataFile="NamespaceFilled.xml" TransformFile="~/RemoveNamespace.xsl"
 XPath="ItemLookupResponse/Items/Item"></asp:XmlDataSource>
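The effect of the prefix-stripping style sheet can be verified from Visual Basic before wiring it into a page. In the sketch below, the style sheet from the text is condensed into a string and applied to a small namespaced document. The sample document, its urn:example:items namespace URI, and the helper Strip are placeholders invented for illustration.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module StripNamespaceSketch
    ' The prefix-removing style sheet from the text, condensed into a string.
    Public Const RemoveNamespaces As String = _
        "<xsl:stylesheet version='1.0' " & _
        "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" & _
        "<xsl:template match='*'>" & _
        "<xsl:element name='{local-name()}'>" & _
        "<xsl:for-each select='@*'>" & _
        "<xsl:attribute name='{local-name()}'>" & _
        "<xsl:value-of select='.'/></xsl:attribute>" & _
        "</xsl:for-each>" & _
        "<xsl:apply-templates/>" & _
        "</xsl:element>" & _
        "</xsl:template>" & _
        "</xsl:stylesheet>"

    ' Hypothetical namespaced input; the URI is a placeholder.
    Public Const Namespaced As String = _
        "<a:Items xmlns:a='urn:example:items'><a:Item>Book</a:Item></a:Items>"

    ' Applies the style sheet and loads the result into a new XmlDocument.
    Public Function Strip(ByVal xmlText As String) As XmlDocument
        Dim transform As New XslCompiledTransform()
        Using xsltReader As XmlReader = _
                XmlReader.Create(New StringReader(RemoveNamespaces))
            transform.Load(xsltReader)
        End Using

        Dim output As New StringWriter()
        Using source As XmlReader = XmlReader.Create(New StringReader(xmlText))
            transform.Transform(source, Nothing, output)
        End Using

        Dim result As New XmlDocument()
        result.LoadXml(output.ToString())
        Return result
    End Function

    Sub Main()
        Dim original As New XmlDocument()
        original.LoadXml(Namespaced)
        ' An unprefixed XPath finds nothing while the namespace is present
        Console.WriteLine(original.SelectSingleNode("Items/Item") Is Nothing)
        ' After stripping, the same XPath locates the element
        Console.WriteLine(Strip(Namespaced).SelectSingleNode("Items/Item").InnerText)
    End Sub
End Module
```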

The Xml Server Control

Since the very beginning of ASP.NET, there has always been a server control called the Xml server control. This control performs the simple operation of XSLT transformation upon an XML document. The control is easy to use: All you do is point to the XML file you wish to transform using the DocumentSource attribute, and the XSLT transform file using the TransformSource attribute.

To see this in action, use the Painters.xml file shown earlier. Create your XSLT transform file, as shown in the following example:

<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
      <html>
      <body>
        <h3>List of Painters &amp; Paintings</h3>
        <table border="1">
          <tr bgcolor="LightGrey">
            <th>Name</th>
            <th>Painting</th>
            <th>Year</th>
          </tr>
          <xsl:apply-templates select="//Painter"/>
        </table>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="Painter">
    <tr>
      <td>
        <xsl:value-of select="@name"/>
      </td>
      <td>
        <xsl:value-of select="Painting/Title"/>
      </td>
      <td>
        <xsl:value-of select="Painting/Year"/>
      </td>
    </tr>
  </xsl:template>

</xsl:stylesheet>

With the XML document and the XSLT document in place, the final step is to combine the two using the Xml server control provided by ASP.NET:

<%@ Page Language="VB" %>

<html xmlns="http://www.w3.org/1999/xhtml" >
<head id="Head1" runat="server">
    <title>XmlDataSource</title>
</head>
<body>
    <form id="form1" runat="server">
        <asp:Xml ID="Xml1" runat="server" DocumentSource="~/Painters.xml"
         TransformSource="~/PaintersTransform.xsl"></asp:Xml>
    </form>
</body>
</html>

The result is shown in Figure 10-8.

Figure 10-8

Summary

Ultimately, XML could be the underpinning of electronic commerce, banking transactions, and data exchange of almost every conceivable kind. The beauty of XML is that it isolates data representation from data display. Technologies such as HTML contain data that is tightly bound to its display format. XML does not suffer this limitation, and at the same time it has the readability of HTML. Accordingly, the XML facilities available to a Visual Basic application are vast, and a large number of XML-related features, classes, and interfaces are exposed by the .NET Framework.

This chapter showed you how to use System.Xml.Serialization.XmlSerializer to serialize classes. Source Code Style attributes were introduced in conjunction with serialization. This style of attributes enables the customization of the XML serialized to be extended to the source code associated with a class. What is important to remember about the direction of serialization classes is that a required change in the XML format becomes a change in the underlying source code. Developers should resist the temptation to rewrite serialized classes in order to conform to some new XML data standard (such as the example movie order format endorsed by your consortium of movie supplier chains). Technologies such as XSLT, exposed via the System.Xml.Xsl namespace, should be examined first as alternatives. This chapter demonstrated how to use XSLT style sheets to transform XML data using the classes found in the System.Xml.Xsl namespace.

The most useful classes and interfaces in the System.Xml namespace were reviewed, including those that support document-style XML access: XmlDocument, XmlNode, XmlElement, and XmlAttribute. The System.Xml namespace also contains classes and interfaces that support stream-style XML access: XmlReader and XmlWriter.

Finally, you looked at Microsoft's SQL Server 2005, 2008, and XQuery, as well as how to use XML with ASP.NET 3.5. The next chapter takes a look at LINQ, one of the biggest new features related to how the .NET Framework 3.5 works with XML. LINQ, which provides a new means of querying your data, is a lightweight façade over ADO.NET. You will likely find that the new LINQ to XML is a great way to work with XML.
