This chapter describes how to generate and manipulate Extensible Markup Language (XML) using Visual Basic 2008. XML support in Visual Basic is a vast topic, far more than a single chapter can cover. The .NET Framework exposes five XML-specific namespaces that contain over a hundred different classes. In addition, dozens of other classes support and implement XML-related technologies, such as ADO.NET, SQL Server, and BizTalk. Consequently, this chapter focuses on the general concepts and the most important classes.
Visual Basic relies on the classes exposed in the following XML-related namespaces to transform, manipulate, and stream XML documents:
System.Xml provides core support for a variety of XML standards, including DTD, namespace, DOM, XDR, XPath, XSLT, and SOAP.
System.Xml.Serialization provides the objects used to transform objects to and from XML documents or streams using serialization.
System.Xml.Schema provides a set of objects that enable schemas to be loaded, created, and streamed. This support is achieved using a suite of objects that support in-memory manipulation of the entities that compose an XML schema.
System.Xml.XPath provides a parser and evaluation engine for the XML Path language (XPath).
System.Xml.Xsl provides the objects necessary when working with Extensible Stylesheet Language (XSL) and XSL Transformations (XSLT).
The XML-related technologies utilized by Visual Basic include other technologies that generate XML documents and enable XML documents to be managed as a data source:
ADO: The legacy COM objects provided by ADO can generate XML documents in stream or file form. ADO can also retrieve a previously persisted XML document and manipulate it. (Although ADO is not used in this chapter, ADO and other legacy COM APIs can be accessed seamlessly from Visual Basic.)
ADO.NET: XML is its underlying data representation. The in-memory data representation of the ADO.NET DataSet object is XML; the results of data queries are represented as XML documents; XML can be imported into a DataSet and exported from a DataSet. (ADO.NET is covered in Chapter 9.)
SQL Server 2000: XML-specific features were added to SQL Server 2000 (FOR XML queries to retrieve XML documents and OPENXML to represent an XML document as a rowset). Visual Basic can use ADO.NET to access SQL Server's XML-specific features (the documents generated and consumed by SQL Server can then be manipulated programmatically). Microsoft also released SQLXML, which gives a SQL Server 2000 database additional XML capabilities, such as querying a database using XQuery, getting back XML result sets from a database, working with data just as if it were XML, and converting huge XML files to relational data. SQLXML exposes these functions via a set of managed .NET classes. You can download SQLXML free from the Microsoft SQLXML website at http://msdn2.microsoft.com/aa286527.aspx.
SQL Server 2005: SQL Server has now been modified with XML in mind. SQL Server 2005 can natively understand XML because it is now built into the underlying foundation of the database. SQL Server 2005 includes an XML data type that also supports XSD schema validation. The capability to query and understand XML documents is a valuable addition to this database server. SQL Server 2005 also comes in a lightweight (and free) version called SQL Server Express Edition.
SQL Server 2008: The latest edition of SQL Server, version 2008, builds on the SQL Server 2005 release and adds an improved XSD schema validation process as well as enhanced support for XQuery.
This chapter makes sense of this range of technologies by introducing some basic XML concepts and demonstrating how Visual Basic, in conjunction with the .NET Framework, can make use of XML. Specifically, in this chapter you will do all of the following:
Learn the rationale behind XML.
Look at the namespaces within the .NET Framework class library that deal with XML and XML-related technologies.
Take a close look at some of the classes contained within these namespaces.
Gain an overview of some of the other Microsoft technologies that utilize XML, particularly SQL Server and ADO.NET.
At the end of this chapter, you will be able to generate, manipulate, and transform XML using Visual Basic.
This book also covers LINQ to XML and the new XML objects found in the System.Xml.Linq namespace. These items are covered in Chapter 11.
XML is a tagged markup language similar to HTML. In fact, XML and HTML are distant cousins and have their roots in the Standard Generalized Markup Language (SGML). This means that XML leverages one of the most useful features of HTML — readability. However, XML differs from HTML in that XML represents data, whereas HTML is a mechanism for displaying data. The tags in XML describe the data, as shown in the following example:
    <?xml version="1.0" encoding="utf-8" ?>
    <Movies>
      <FilmOrder name="Grease" filmId="1" quantity="21"></FilmOrder>
      <FilmOrder name="Lawrence of Arabia" filmId="2" quantity="10"></FilmOrder>
      <FilmOrder name="Star Wars" filmId="3" quantity="12"></FilmOrder>
      <FilmOrder name="Shrek" filmId="4" quantity="14"></FilmOrder>
    </Movies>
This XML document represents a store order for a collection of movies. The standard used to represent an order of films would be useful to movie rental firms, collectors, and others. This information can be shared using XML for the following reasons:
The data tags in XML are self-describing.
XML is an open standard and supported on most platforms today.
XML supports the parsing of data by applications not familiar with the contents of the XML document. XML documents can also be associated with a description (a schema) that informs an application as to the structure of the data within the XML document.
At this stage, XML looks simple — it is just a human-readable way to exchange data in a universally accepted format. The essential points that you should understand about XML are as follows:
XML data can be stored in a plain text file.
A document is said to be well formed if it adheres to the XML standard.
Tags are used to specify the contents of a document, for example, <FilmOrder>.
XML elements (also called nodes) can be thought of as the objects within a document.
Elements are the basic building blocks of the document. Each element contains a start tag and end tag. A tag can be both a start tag and an end tag in one, for example, <FilmOrder />. In this case, the tag specifies that there is no content (or inner text) to the element (there isn't a closing tag because none is required due to the lack of inner-text content). Such a tag is said to be empty.
Data can be contained in the element (the element content) or within attributes contained in the element.
XML is hierarchical. One document can contain multiple elements, which can themselves contain child elements, and so on. However, an XML document can only have one root element.
This last point means that the XML document hierarchy can be thought of as a tree containing nodes:
The example document has a root node, <Movies>.
The branches of the root node are elements of type <FilmOrder>.
The leaves of the XML element, <FilmOrder>, are its attributes: name, quantity, and filmId.
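The tree view described above can be made concrete with a short sketch. It assumes the DOM class XmlDocument from System.Xml and embeds a shortened copy of the sample order as a string; the module name TreeWalkSketch is hypothetical and exists only for this illustration.

```vbnet
Imports System
Imports System.Xml

Module TreeWalkSketch
    Sub Main()
        ' A shortened copy of the sample order document, embedded as a string.
        Dim xml As String = _
            "<Movies>" & _
            "<FilmOrder name=""Grease"" filmId=""1"" quantity=""21""></FilmOrder>" & _
            "<FilmOrder name=""Shrek"" filmId=""4"" quantity=""14""></FilmOrder>" & _
            "</Movies>"

        Dim doc As New XmlDocument()
        doc.LoadXml(xml)

        ' DocumentElement is the single root node, <Movies>.
        Dim root As XmlElement = doc.DocumentElement
        Console.WriteLine("Root: " & root.Name)

        ' Each child of the root is a <FilmOrder> branch.
        For Each branch As XmlNode In root.ChildNodes
            Console.WriteLine("  Branch: " & branch.Name)

            ' The attributes are the leaves of each branch.
            For Each leaf As XmlAttribute In branch.Attributes
                Console.WriteLine("    " & leaf.Name & " = " & leaf.Value)
            Next
        Next
    End Sub
End Module
```

Walking the document this way mirrors the root, branch, and leaf vocabulary used in the list: one root, its child elements, and the attributes of each child.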
Of course, we are interested in the practical use of XML by Visual Basic. A practical manipulation of the example XML, for example, is to display (for the staff of a movie supplier) a particular movie order in an application so that the supplier can fill the order and then save the information to a database. This chapter explains how you can perform such tasks using the functionality provided by the .NET Framework class library.
The simplest way to demonstrate Visual Basic's support for XML is not with a complicated technology, such as SQL Server or ADO.NET, but with a practical use of XML: serializing a class.
The serialization of an object means that it is written out to a stream, such as a file or a socket (this is also known as dehydrating an object). The reverse process can also be performed: An object can be deserialized (or rehydrated) by reading it from a stream.
The type of serialization described in this chapter is XML serialization, whereby XML is used to represent a class in serialized form.
To help you understand XML serialization, let's examine a class named FilmOrder (which can be found in the code download from www.wrox.com). This class is implemented in Visual Basic and is used by the company for processing a movie order. The class could be instantiated on a firm's PDA, laptop, or even mobile phone (as long as the device had the .NET Framework installed).
An instance of FilmOrder corresponding to each order could be serialized to XML and sent over a socket using the PDA's cellular modem. (If the person making the order had a PDA that did not have a cellular modem, then the instance of FilmOrder could be serialized to a file.) The order could then be processed when the PDA was dropped into a docking cradle and synced. We are talking about data in a proprietary form here, an instance of FilmOrder, being converted into a generic form, XML, that can be universally understood.
The System.Xml.Serialization namespace contains classes and interfaces that support the serialization of objects to XML, and the deserialization of objects from XML. Objects are serialized to documents or streams using the XmlSerializer class.
Let's look at how you can use XmlSerializer. First, you need to define an object that implements a default constructor, such as FilmOrder:
    Public Class FilmOrder
        ' These are Public because we have yet to implement
        ' properties to provide program access.
        Public name As String
        Public filmId As Integer
        Public quantity As Integer

        Public Sub New()
        End Sub

        Public Sub New(ByVal name As String, _
                       ByVal filmId As Integer, _
                       ByVal quantity As Integer)
            Me.name = name
            Me.filmId = filmId
            Me.quantity = quantity
        End Sub
    End Class
This class should be created in a console application. From there, we can move on to the module. Within the module's Sub Main, create an instance of XmlSerializer, specifying the type of the object to serialize in the constructor (you need to make a reference to System.Xml.Serialization for this to work):
    Dim serialize As XmlSerializer = _
        New XmlSerializer(GetType(FilmOrder))
Create an instance of the same type passed as a parameter to the constructor of XmlSerializer:
    Dim MyFilmOrder As FilmOrder = _
        New FilmOrder("Grease", 101, 10)
Call the Serialize method of the XmlSerializer instance and specify the stream to which the serialized object is written (parameter one, Console.Out) and the object to be serialized (parameter two, MyFilmOrder):
    serialize.Serialize(Console.Out, MyFilmOrder)
    Console.ReadLine()
To refer to the XmlSerializer class by its short name, import the System.Xml.Serialization namespace at the top of the file:
    Imports System.Xml.Serialization
Running the module, the following output is generated by the preceding code:
    <?xml version="1.0" encoding="IBM437"?>
    <FilmOrder xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <name>Grease</name>
      <filmId>101</filmId>
      <quantity>10</quantity>
    </FilmOrder>
This output demonstrates the default way in which the Serialize method serializes an object:
Each object serialized is represented as an element with the same name as the class, in this case FilmOrder.
The individual data members of the class serialized are contained in elements named for each data member, in this case name, filmId, and quantity.
Also generated are the following:
The XML declaration, which states the XML version (1.0) and the encoding of the output (IBM437 here, because the document was written to the console).
The xmlns:xsd and xmlns:xsi namespace declarations, which reference the XML Schema and XML Schema instance namespaces.
A schema can be associated with an XML document and describe the data it contains (name, type, scale, precision, length, and so on). Either the actual schema or a reference to where the schema resides can be contained in the XML document. In either case, an XML schema is a standard representation that can be used by all applications that consume XML. This means that applications can use the supplied schema to validate the contents of an XML document generated by the Serialize method of the XmlSerializer object.
The code snippet that demonstrated the Serialize method of XmlSerializer displayed the XML generated to Console.Out. Clearly, we do not expect an application to use Console.Out when it would like to access a FilmOrder object in XML form. The point was to show how serialization can be performed in just two lines of code (one call to a constructor and one call to a method). The entire section of code responsible for serializing the instance of FilmOrder is presented here:
    Try
        Dim serialize As XmlSerializer = _
            New XmlSerializer(GetType(FilmOrder))
        Dim MyMovieOrder As FilmOrder = _
            New FilmOrder("Grease", 101, 10)
        serialize.Serialize(Console.Out, MyMovieOrder)
        Console.Out.WriteLine()
        Console.ReadLine()
    Catch ex As Exception
        Console.Error.WriteLine(ex.ToString())
    End Try
The Serialize method is overloaded so that its first parameter can be a Stream, a TextWriter, or an XmlWriter. To serialize to a file, wrap the file in one of these types (for example, a FileStream or a StreamWriter). With any of these three targets, adding a third parameter to the Serialize method is permissible. This third parameter is of type XmlSerializerNamespaces and is used to specify a list of namespaces that qualify the names in the XML-generated document. The permissible overloads of the Serialize method are as follows:
    Public Sub Serialize(Stream, Object)
    Public Sub Serialize(TextWriter, Object)
    Public Sub Serialize(XmlWriter, Object)
    Public Sub Serialize(Stream, Object, XmlSerializerNamespaces)
    Public Sub Serialize(TextWriter, Object, XmlSerializerNamespaces)
    Public Sub Serialize(XmlWriter, Object, XmlSerializerNamespaces)
    Public Sub Serialize(XmlWriter, Object, XmlSerializerNamespaces, String)
    Public Sub Serialize(XmlWriter, Object, XmlSerializerNamespaces, String, _
        String)
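As an illustration of the three-parameter form, the following minimal sketch serializes a FilmOrder to a file through a StreamWriter (a TextWriter) and passes an XmlSerializerNamespaces instance. The file name GreaseOrder.xml and the module name are hypothetical, and adding an empty prefix/namespace pair is a commonly used idiom for suppressing the default xsd and xsi declarations rather than something this chapter prescribes.

```vbnet
Imports System.IO
Imports System.Xml.Serialization

Public Class FilmOrder
    Public name As String
    Public filmId As Integer
    Public quantity As Integer

    Public Sub New()
    End Sub

    Public Sub New(ByVal name As String, _
                   ByVal filmId As Integer, _
                   ByVal quantity As Integer)
        Me.name = name
        Me.filmId = filmId
        Me.quantity = quantity
    End Sub
End Class

Module SerializeToFileSketch
    Sub Main()
        Dim serializer As New XmlSerializer(GetType(FilmOrder))
        Dim order As New FilmOrder("Grease", 101, 10)

        ' An empty prefix/namespace pair replaces the default
        ' xmlns:xsd and xmlns:xsi declarations on the root element.
        Dim namespaces As New XmlSerializerNamespaces()
        namespaces.Add("", "")

        ' GreaseOrder.xml is a hypothetical file name; Using closes the
        ' writer (and the file) when the block ends.
        Using writer As New StreamWriter("GreaseOrder.xml")
            serializer.Serialize(writer, order, namespaces)
        End Using
    End Sub
End Module
```

Because the target is a TextWriter, the same call works unchanged with Console.Out or a StringWriter in place of the StreamWriter.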
An object is reconstituted using the Deserialize method of XmlSerializer. This method is overloaded and can deserialize XML presented as a Stream, a TextReader, or an XmlReader. The overloads for Deserialize are as follows:
    Public Function Deserialize(Stream) As Object
    Public Function Deserialize(TextReader) As Object
    Public Function Deserialize(XmlReader) As Object
    Public Function Deserialize(XmlReader, XmlDeserializationEvents) As Object
    Public Function Deserialize(XmlReader, String) As Object
    Public Function Deserialize(XmlReader, String, XmlDeserializationEvents) _
        As Object
Before demonstrating the Deserialize method, we will introduce a new class, FilmOrder_Multiple. This class contains an array of film orders (actually an array of FilmOrder objects). FilmOrder_Multiple is defined as follows:
    Public Class FilmOrder_Multiple
        Public multiFilmOrders() As FilmOrder

        Public Sub New()
        End Sub

        Public Sub New(ByVal multiFilmOrders() As FilmOrder)
            Me.multiFilmOrders = multiFilmOrders
        End Sub
    End Class
The FilmOrder_Multiple class contains a fairly complicated object, an array of FilmOrder objects. The underlying serialization and deserialization of this class is more complicated than that of a single instance of a class that contains several simple types, but the programming effort involved on your part is just as simple as before. This is one of the great ways in which the .NET Framework makes it easy for you to work with XML data, no matter how it is formed.
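To show that the programming effort really is unchanged, here is a minimal sketch that serializes a FilmOrder_Multiple instance to the console. The class definitions repeat those given in the text so that the sketch is self-contained; the module name is hypothetical.

```vbnet
Imports System
Imports System.Xml.Serialization

Public Class FilmOrder
    Public name As String
    Public filmId As Integer
    Public quantity As Integer

    Public Sub New()
    End Sub

    Public Sub New(ByVal name As String, _
                   ByVal filmId As Integer, _
                   ByVal quantity As Integer)
        Me.name = name
        Me.filmId = filmId
        Me.quantity = quantity
    End Sub
End Class

Public Class FilmOrder_Multiple
    Public multiFilmOrders() As FilmOrder

    Public Sub New()
    End Sub

    Public Sub New(ByVal multiFilmOrders() As FilmOrder)
        Me.multiFilmOrders = multiFilmOrders
    End Sub
End Class

Module SerializeArraySketch
    Sub Main()
        ' Build the array of orders to wrap in FilmOrder_Multiple.
        Dim orders() As FilmOrder = { _
            New FilmOrder("Grease", 101, 10), _
            New FilmOrder("Lawrence of Arabia", 102, 10), _
            New FilmOrder("Star Wars", 103, 10)}

        ' The same two calls as for a single FilmOrder:
        ' construct the serializer, then call Serialize.
        Dim serializer As New XmlSerializer(GetType(FilmOrder_Multiple))
        serializer.Serialize(Console.Out, New FilmOrder_Multiple(orders))
    End Sub
End Module
```

The serializer walks the array on its own; no per-element code is needed.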
To work through an example of the deserialization process, first create a sample order stored as an XML file called Filmorama.xml:
    <?xml version="1.0" encoding="utf-8" ?>
    <FilmOrder_Multiple xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <multiFilmOrders>
        <FilmOrder>
          <name>Grease</name>
          <filmId>101</filmId>
          <quantity>10</quantity>
        </FilmOrder>
        <FilmOrder>
          <name>Lawrence of Arabia</name>
          <filmId>102</filmId>
          <quantity>10</quantity>
        </FilmOrder>
        <FilmOrder>
          <name>Star Wars</name>
          <filmId>103</filmId>
          <quantity>10</quantity>
        </FilmOrder>
      </multiFilmOrders>
    </FilmOrder_Multiple>
In order for this to run, you should either have the .xml file in the location of the executable or define the full path of the file within the code example.
Once the XML file is in place, the next step is to change your console application so that it deserializes the contents of this file. Ensure that your console application has made the proper namespace references:
    Imports System.Xml
    Imports System.Xml.Serialization
    Imports System.IO
The following code demonstrates an object of type FilmOrder_Multiple being deserialized (or rehydrated) from a file, Filmorama.xml. This object is deserialized using this file in conjunction with the Deserialize method of XmlSerializer:
    ' Open the file Filmorama.xml
    Dim dehydrated As FileStream = _
        New FileStream("Filmorama.xml", FileMode.Open)
    ' Create an XmlSerializer instance to handle deserializing
    ' FilmOrder_Multiple
    Dim serialize As XmlSerializer = _
        New XmlSerializer(GetType(FilmOrder_Multiple))
    ' Deserialize the object. Deserialize returns Object, so the
    ' result is cast to FilmOrder_Multiple.
    Dim myFilmOrder As FilmOrder_Multiple = _
        CType(serialize.Deserialize(dehydrated), FilmOrder_Multiple)
    ' Release the file once the object has been rehydrated
    dehydrated.Close()
Once deserialized, the array of film orders can be displayed:
    Dim SingleFilmOrder As FilmOrder
    For Each SingleFilmOrder In myFilmOrder.multiFilmOrders
        Console.Out.WriteLine("{0}, {1}, {2}", _
            SingleFilmOrder.name, _
            SingleFilmOrder.filmId, _
            SingleFilmOrder.quantity)
    Next
    Console.ReadLine()
This example is just code that deserializes an instance of type FilmOrder_Multiple. The output generated by displaying the deserialized object containing an array of film orders is as follows:
    Grease, 101, 10
    Lawrence of Arabia, 102, 10
    Star Wars, 103, 10
XmlSerializer also implements a CanDeserialize method. The prototype for this method is as follows:
    Public Overridable Function CanDeserialize(ByVal xmlReader As XmlReader) _
        As Boolean
If CanDeserialize returns True, then the XML document specified by the xmlReader parameter can be deserialized. If the return value of this method is False, then the specified XML document cannot be deserialized.
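A minimal sketch of guarding a Deserialize call with CanDeserialize follows. It assumes the XML arrives as a string and is read through XmlReader.Create over a StringReader; the module name is hypothetical. As far as we are aware, the check looks at the element on which the reader is positioned rather than validating the whole document, so treat it as a quick sanity test, not as full validation.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Serialization

Public Class FilmOrder
    Public name As String
    Public filmId As Integer
    Public quantity As Integer
End Class

Module CanDeserializeSketch
    Sub Main()
        ' A serialized FilmOrder held in a string for illustration.
        Dim xml As String = _
            "<FilmOrder><name>Grease</name>" & _
            "<filmId>101</filmId><quantity>10</quantity></FilmOrder>"

        Dim serializer As New XmlSerializer(GetType(FilmOrder))

        Using reader As XmlReader = XmlReader.Create(New StringReader(xml))
            ' Ask the serializer whether it recognizes the document
            ' before attempting to rehydrate the object.
            If serializer.CanDeserialize(reader) Then
                Dim order As FilmOrder = _
                    CType(serializer.Deserialize(reader), FilmOrder)
                Console.WriteLine("{0}, {1}, {2}", _
                    order.name, order.filmId, order.quantity)
            Else
                Console.WriteLine("The document is not a FilmOrder.")
            End If
        End Using
    End Sub
End Module
```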
The FromTypes method of XmlSerializer facilitates the creation of arrays that contain XmlSerializer objects. This array of XmlSerializer objects can be used in turn to process arrays of the type to be serialized. The prototype for FromTypes is shown here:
    Public Shared Function FromTypes(ByVal types() As Type) As XmlSerializer()
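The following sketch shows one way FromTypes might be used: it builds serializers for two types in a single call and then uses the first of them. The class definitions are trimmed copies of those in the text, and the module name is hypothetical.

```vbnet
Imports System
Imports System.Xml.Serialization

Public Class FilmOrder
    Public name As String
    Public filmId As Integer
    Public quantity As Integer
End Class

Public Class FilmOrder_Multiple
    Public multiFilmOrders() As FilmOrder
End Class

Module FromTypesSketch
    Sub Main()
        ' One call creates a serializer for each type in the array;
        ' the result is in the same order as the input types.
        Dim types() As Type = _
            {GetType(FilmOrder), GetType(FilmOrder_Multiple)}
        Dim serializers() As XmlSerializer = XmlSerializer.FromTypes(types)

        Dim order As New FilmOrder()
        order.name = "Grease"
        order.filmId = 101
        order.quantity = 10

        ' serializers(0) corresponds to GetType(FilmOrder).
        serializers(0).Serialize(Console.Out, order)
    End Sub
End Module
```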
Before exploring the System.Xml.Serialization namespace, take a moment to consider the various uses of the term "attribute."
Thus far, you have seen attributes applied to a specific portion of an XML document. Visual Basic has its own flavor of attributes, as do C# and each of the other .NET languages. These attributes refer to annotations to the source code that specify information, or metadata, that can be used by other applications without the need for the original source code. We will call such attributes Source Code Style attributes.
In the context of the System.Xml.Serialization namespace, Source Code Style attributes can be used to change the names of the elements generated for the data members of a class or to generate XML attributes instead of XML elements for the data members of a class. To demonstrate this, we will use a class called ElokuvaTilaus, which contains data members named name, filmId, and quantity. It just so happens that the default XML generated when serializing this class is not in a form that can be readily consumed by an external application.
For example, assume that a Finnish development team has written this external application; hence, the XML element and attribute names are in Finnish (minus the umlauts), rather than English. To rename the XML generated for a data member, name, a Source Code Style attribute will be used. This Source Code Style attribute specifies that when ElokuvaTilaus is serialized, the name data member is represented as an XML element, <Nimi>. The actual Source Code Style attribute that specifies this is as follows:
<XmlElementAttribute("Nimi")> Public name As String
ElokuvaTilaus, which means MovieOrder in Finnish, also contains other Source Code Style attributes:
<XmlAttributeAttribute("ElokuvaId")> specifies that filmId is to be serialized as an XML attribute named ElokuvaId.
<XmlAttributeAttribute("Maara")> specifies that quantity is to be serialized as an XML attribute named Maara.
ElokuvaTilaus is defined as follows:
    Imports System.Xml.Serialization

    Public Class ElokuvaTilaus
        ' These are Public because we have yet to implement
        ' properties to provide program access.
        <XmlElementAttribute("Nimi")> Public name As String
        <XmlAttributeAttribute("ElokuvaId")> Public filmId As Integer
        <XmlAttributeAttribute("Maara")> Public quantity As Integer

        Public Sub New()
        End Sub

        Public Sub New(ByVal name As String, _
                       ByVal filmId As Integer, _
                       ByVal quantity As Integer)
            Me.name = name
            Me.filmId = filmId
            Me.quantity = quantity
        End Sub
    End Class
ElokuvaTilaus can be serialized as follows:
    Dim serialize As XmlSerializer = _
        New XmlSerializer(GetType(ElokuvaTilaus))
    Dim MyMovieOrder As ElokuvaTilaus = _
        New ElokuvaTilaus("Grease", 101, 10)
    serialize.Serialize(Console.Out, MyMovieOrder)
    Console.ReadLine()
The output generated by this code reflects the Source Code Style attributes associated with the class ElokuvaTilaus:
    <?xml version="1.0" encoding="IBM437"?>
    <ElokuvaTilaus xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      ElokuvaId="101" Maara="10">
      <Nimi>Grease</Nimi>
    </ElokuvaTilaus>
The value of filmId is contained in an XML attribute, ElokuvaId, and the value of quantity is contained in an XML attribute, Maara. The value of name is contained in an XML element, Nimi.
The example only demonstrates the Source Code Style attributes exposed by the XmlAttributeAttribute and XmlElementAttribute classes in the System.Xml.Serialization namespace. A variety of other Source Code Style attributes exist in this namespace that also control the form of XML generated by serialization. The classes associated with such Source Code Style attributes include XmlTypeAttribute, XmlTextAttribute, XmlRootAttribute, XmlIncludeAttribute, XmlIgnoreAttribute, and XmlEnumAttribute.
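Two of the attributes just listed can be sketched briefly. The class below is a hypothetical variant of ElokuvaTilaus created only for this illustration: XmlRootAttribute renames the root element (the name Tilaus is an assumption, not part of the Finnish team's format), and XmlIgnoreAttribute keeps a member out of the generated XML. The internalNote member is likewise invented for the example.

```vbnet
Imports System
Imports System.Xml.Serialization

' Hypothetical variant of ElokuvaTilaus used only for this sketch.
' XmlRoot changes the name of the root element from the class name.
<XmlRoot("Tilaus")> _
Public Class ElokuvaTilausSketch
    <XmlElement("Nimi")> Public name As String
    <XmlAttribute("ElokuvaId")> Public filmId As Integer

    ' XmlIgnore excludes this member from serialization entirely.
    <XmlIgnore()> Public internalNote As String
End Class

Module AttributeSketch
    Sub Main()
        Dim order As New ElokuvaTilausSketch()
        order.name = "Grease"
        order.filmId = 101
        order.internalNote = "not serialized"

        Dim serializer As New XmlSerializer(GetType(ElokuvaTilausSketch))
        serializer.Serialize(Console.Out, order)
    End Sub
End Module
```

The short names XmlRoot, XmlElement, XmlAttribute, and XmlIgnore resolve to the corresponding classes whose names end in Attribute.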
The System.Xml namespace implements a variety of objects that support standards-based XML processing. The XML-specific standards facilitated by this namespace include XML 1.0, Document Type Definition (DTD) support, XML namespaces, XML schemas, XPath, XQuery, XSLT, DOM Level 1 and DOM Level 2 (Core implementations), as well as SOAP 1.1, SOAP 1.2, SOAP Contract Language, and SOAP Discovery. The System.Xml namespace exposes over 30 separate classes in order to facilitate this level of XML standards compliance.
To generate and navigate XML documents, there are two styles of access:
Stream-based: System.Xml exposes a variety of classes that read XML from and write XML to a stream. This approach tends to be a fast way to consume or generate an XML document because it represents a set of serial reads or writes. The limitation of this approach is that it does not view the XML data as a document composed of tangible entities, such as nodes, elements, and attributes. An example of where a stream could be used is when receiving XML documents from a socket or a file.
Document Object Model (DOM)-based: System.Xml exposes a set of objects that access XML documents as data. The data is accessed using entities from the XML document tree (nodes, elements, and attributes). This style of XML generation and navigation is flexible but may not yield the same performance as stream-based XML generation and navigation. DOM is an excellent technology for editing and manipulating documents. For example, the functionality exposed by DOM could simplify merging your checking, savings, and brokerage accounts.
When demonstrating XML serialization, XML stream-style parsers were mentioned. After all, when an instance of an object is serialized to XML, it has to be written to a stream, and when it is deserialized, it is read from a stream. When an XML document is parsed using a stream parser, the parser always points to the current node in the document. The basic architecture of stream parsers is shown in Figure 10-1.
The following classes that access a stream of XML (read XML) and generate a stream of XML (write XML) are contained in the System.Xml namespace:
The diagram of the classes associated with the XML stream-style parser referred to one other class, XslTransform. This class is found in the System.Xml.Xsl namespace and is not an XML stream-style parser. Rather, it is used in conjunction with XmlWriter and XmlReader. This class is covered in detail later.
The System.Xml namespace exposes a plethora of XML manipulation classes in addition to those shown in the architecture diagram. The classes shown in the diagram include the following:
An XML document can be created programmatically in .NET. One way to perform this task is by writing the individual components of an XML document (schema, attributes, elements, and so on) to an XML stream. Using a unidirectional write-stream means that each element and its attributes must be written in order; the idea is that data is always written at the head of the stream. To accomplish this, you use a writable XML stream class (a class derived from XmlWriter). Such a class ensures that the XML document you generate correctly implements the W3C Extensible Markup Language (XML) 1.0 specification and the Namespaces in XML specification.
Why is this necessary when you have XML serialization? You need to be very careful here to separate interface from implementation. XML serialization works for a specific class, such as the ElokuvaTilaus class. This class is a proprietary implementation and not the format in which data is exchanged. For this one specific case, the XML document generated when ElokuvaTilaus is serialized just so happens to be the XML format used when placing an order for some movies. ElokuvaTilaus was given a little help from Source Code Style attributes so that it would conform to a standard XML representation of a film order summary.
In a different application, if the software used to manage an entire movie distribution business wants to generate movie orders, then it must generate a document of the appropriate form. The movie distribution management software achieves this using the XmlWriter object.
Before reviewing the subtleties of XmlWriter, note that this class exposes over 40 methods and properties. The example in this section provides an overview that touches on a subset of these methods and properties. This subset enables the generation of an XML document that corresponds to a movie order.
This example builds the module that generates the XML document corresponding to a movie order. It uses an instance of XmlWriter, called FilmOrdersWriter, which is backed by a file on disk. This means that the XML document generated is streamed to this file. Because the FilmOrdersWriter variable represents a file, you have to take a few actions against the file. For instance, you have to ensure that the file is:
Created: The instance of XmlWriter, FilmOrdersWriter, is created by using the Create method as well as by assigning all the properties of this object with the XmlWriterSettings object.
Opened: The file the XML is streamed to, FilmOrdersProgrammatic.xml, is opened by passing the filename to the Create method of XmlWriter (XmlWriter is abstract and is not instantiated through a constructor).
Generated: The process of generating the XML document is described in detail at the end of this section.
Closed: The file (the XML stream) is closed using the Close method of XmlWriter or by simply making use of the Using keyword, which ensures that the object is closed at the end of the Using statement.
Before you create the XmlWriter object, you first need to customize how the object operates by using the XmlWriterSettings object. This object, introduced in .NET 2.0, enables you to configure the behavior of the XmlWriter object before you instantiate it:
    Dim myXmlSettings As New XmlWriterSettings()
    myXmlSettings.Indent = True
    myXmlSettings.NewLineOnAttributes = True
You can specify a few settings for the XmlWriterSettings object that define how XML creation will be handled by the XmlWriter object. The following table details the properties of the XmlWriterSettings class:
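Property | Initial Value | Description |
---|---|---|
CheckCharacters | True | This property, if set to True, performs a character check on the content written by the XmlWriter object. |
CloseOutput | False | Specifies whether the XmlWriter should also close the underlying stream or TextWriter when the XmlWriter.Close method is called. |
ConformanceLevel | ConformanceLevel.Document | Allows the XML to be checked to ensure that it follows certain specified rules. Possible conformance-level settings include Document, Fragment, and Auto. |
Encoding | Encoding.UTF8 | Defines the encoding of the XML generated. |
Indent | False | Defines whether the XML generated should be indented or not. Setting this value to True indents child nodes from parent nodes. |
IndentChars | Two spaces | Specifies the character string by which child nodes are indented from parent nodes. This setting only works when the Indent property is set to True. |
NewLineChars | \r\n | Assigns the characters that are used to define line breaks. |
NewLineHandling | NewLineHandling.Replace | Defines whether to normalize line breaks in the output. Possible values include Entitize, None, and Replace. |
NewLineOnAttributes | False | Defines whether a node's attributes should be written to a new line in the construction. This will occur if set to True. |
OmitXmlDeclaration | False | Defines whether an XML declaration should be generated in the output. This omission only occurs if set to True. |
OutputMethod | XmlOutputMethod.Xml | Defines the method to serialize the output. Possible values include Xml, Html, Text, and AutoDetect. |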
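As a small illustration of how these settings change the output, the sketch below combines four XmlWriterSettings properties: Indent, IndentChars, OmitXmlDeclaration, and ConformanceLevel. It writes to an in-memory StringWriter instead of a file so that the result can be printed directly; the module name and the choice of a tab for indentation are assumptions made for the example.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module WriterSettingsSketch
    Sub Main()
        Dim settings As New XmlWriterSettings()
        settings.Indent = True
        ' Indent with a tab instead of the default indentation string.
        settings.IndentChars = vbTab
        ' Suppress the <?xml ...?> declaration.
        settings.OmitXmlDeclaration = True
        ' Fragment conformance permits more than one top-level element.
        settings.ConformanceLevel = ConformanceLevel.Fragment

        Dim output As New StringWriter()
        Using writer As XmlWriter = XmlWriter.Create(output, settings)
            writer.WriteElementString("Nimi", "Grease")
            writer.WriteElementString("Nimi", "Shrek")
        End Using

        Console.WriteLine(output.ToString())
    End Sub
End Module
```

With the default ConformanceLevel.Document, the second WriteElementString call would be rejected because a document may have only one root element.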
Once the XmlWriterSettings object has been instantiated and assigned the values you deem necessary, the next steps are to invoke the XmlWriter object and make the association between the XmlWriterSettings object and the XmlWriter object.
The basic infrastructure for managing the file (the XML text stream) and applying the settings class is either
    Dim FilmOrdersWriter As XmlWriter = _
        XmlWriter.Create("..FilmOrdersProgrammatic.xml", myXmlSettings)
    FilmOrdersWriter.Close()
or the following, if you are utilizing the Using keyword, which is the recommended approach:
    Using FilmOrdersWriter As XmlWriter = _
        XmlWriter.Create("..FilmOrdersProgrammatic.xml", myXmlSettings)
    End Using
With the preliminaries completed (file created and formatting configured), the process of writing the actual attributes and elements of your XML document can begin. The sequence of steps used to generate your XML document is as follows:
Write an XML comment using the WriteComment method. This comment notes where the concept for this XML document originated and generates the following markup:
    <!-- Same as generated by serializing, ElokuvaTilaus -->
Begin writing the XML element, <ElokuvaTilaus>, by calling the WriteStartElement method. You can only begin writing this element because its attributes and child elements must be written before the element can be ended with a corresponding </ElokuvaTilaus>. The XML generated by the WriteStartElement method is as follows:
    <ElokuvaTilaus>
Write the attributes associated with <ElokuvaTilaus> by calling the WriteAttributeString method twice. These two calls extend the <ElokuvaTilaus> start tag that is currently being written to the following:
    <ElokuvaTilaus ElokuvaId="101" Maara="10">
Using the WriteElementString method, write the child XML element <Nimi> contained in the XML element, <ElokuvaTilaus>. The XML generated by calling this method is as follows:
    <Nimi>Grease</Nimi>
Complete writing the <ElokuvaTilaus> parent XML element by calling the WriteEndElement method. The XML generated by calling this method is as follows:
    </ElokuvaTilaus>
Let's now put all this together in the Module1.vb file shown here:
    Imports System.Xml
    Imports System.Xml.Serialization
    Imports System.IO

    Module Module1
        Sub Main()
            Dim myXmlSettings As New XmlWriterSettings
            myXmlSettings.Indent = True
            myXmlSettings.NewLineOnAttributes = True
            Using FilmOrdersWriter As XmlWriter = _
                XmlWriter.Create("..FilmOrdersProgrammatic.xml", myXmlSettings)
                FilmOrdersWriter.WriteComment(" Same as generated " & _
                    "by serializing, ElokuvaTilaus ")
                FilmOrdersWriter.WriteStartElement("ElokuvaTilaus")
                FilmOrdersWriter.WriteAttributeString("ElokuvaId", "101")
                FilmOrdersWriter.WriteAttributeString("Maara", "10")
                FilmOrdersWriter.WriteElementString("Nimi", "Grease")
                FilmOrdersWriter.WriteEndElement() ' End ElokuvaTilaus
            End Using
        End Sub
    End Module
Once this is run, you will find the XML file FilmOrdersProgrammatic.xml
created in the same folder as the Module1.vb
file or in the bin
directory. The content of this file is as follows:
<?xml version="1.0" encoding="utf-8"?>
<!-- Same as generated by serializing, ElokuvaTilaus -->
<ElokuvaTilaus
  ElokuvaId="101"
  Maara="10">
  <Nimi>Grease</Nimi>
</ElokuvaTilaus>
The previous XML document is the same in form as the XML document generated by serializing the ElokuvaTilaus
class. Notice that in the previous XML document, the <Nimi>
element is indented two characters and that each attribute is on a different line in the document. This was achieved using the XmlWriterSettings
class.
The sample application covered only a small portion of the methods and properties exposed by the XML stream-writing class, XmlWriter
. Other methods implemented by this class manipulate the underlying file, such as the Flush
method; and some methods allow XML text to be written directly to the stream, such as the WriteRaw
method.
The XmlWriter
class also exposes a variety of methods that write a specific type of XML data to the stream. These methods include WriteBinHex, WriteCData, WriteString
, and WriteWhitespace
.
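A few of these methods can be tried without touching the file system. The following is a minimal sketch, assuming the writer targets an in-memory StringBuilder instead of a file; the element names Note and Raw are hypothetical and are not part of the movie order format used in this chapter.

```vb
Imports System
Imports System.Text
Imports System.Xml

Module WriterMethodsSketch
    ' Builds a small document in memory and returns it as a string.
    Function BuildOrderXml() As String
        Dim output As New StringBuilder()
        Dim settings As New XmlWriterSettings()
        settings.OmitXmlDeclaration = True ' keep the result short

        Using writer As XmlWriter = XmlWriter.Create(output, settings)
            writer.WriteStartElement("ElokuvaTilaus")
            writer.WriteStartElement("Nimi")
            writer.WriteString("Grease & Co")   ' the ampersand is escaped
            writer.WriteEndElement()
            writer.WriteStartElement("Note")    ' hypothetical element
            writer.WriteCData("<b>rush</b>")    ' emitted as a CDATA section
            writer.WriteEndElement()
            writer.WriteRaw("<Raw/>")           ' copied verbatim, not checked
            writer.WriteEndElement()
            writer.Flush()                      ' push buffered text to output
        End Using

        Return output.ToString()
    End Function

    Sub Main()
        Console.WriteLine(BuildOrderXml())
    End Sub
End Module
```

WriteString escapes markup characters, WriteCData wraps the text in a CDATA section, and WriteRaw passes its argument through unchanged, so only the last of the three can produce a document that is not well formed.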
You can now generate the same XML document in two different ways. You have used two different applications that took two different approaches to generating a document that represents a standardized movie order. However, there are even more ways to generate XML, depending on the circumstances. Using the previous scenario, you could receive a movie order from a store, and this order would have to be transformed from the XML format used by the supplier to your own order format.
In .NET, XML documents can be read from a stream as well. Data is traversed in the stream in order (first XML element, second XML element, and so on). This traversal is very quick because the data is processed in one direction, and features such as writing and moving backward in the traversal are not supported. At any given instant, only data at the current position in the stream can be accessed.
Before exploring how an XML stream can be read, you need to understand why it should be read in the first place. Returning to our movie supplier example, imagine that the application managing the movie orders can generate a variety of XML documents corresponding to current orders, preorders, and returns. All the documents (current orders, preorders, and returns) can be extracted in stream form and processed by a report-generating application. This application prints the orders for a given day, the preorders that are going to be due, and the returns that are coming back to the supplier. The report-generating application processes the data by reading in and parsing a stream of XML.
One class that can be used to read and parse such an XML stream is XmlReader
. Other classes in the .NET Framework are derived from XmlReader
, such as XmlTextReader
, which can read XML from a file (specified by a string corresponding to the file's name), a Stream
, or an XmlReader
. This example uses an XmlReader
to read an XML document contained in a file. Reading XML from a file and writing it to a file is not the norm when it comes to XML processing, but a file is the simplest way to access XML data. This simplified access enables you to focus on XML-specific issues.
In creating a sample, the first step is to make the proper imports into the Module1.vb
file:
Imports System.Xml
Imports System.Xml.Serialization
Imports System.IO
From there, the next step in accessing a stream of XML data is to create an instance of the object that will open the stream (the readMovieInfo
variable of type XmlReader
) and then open the stream itself. Your application performs this as follows (where fileName holds the name of the file containing the XML document, such as MovieManage.xml):
Dim myXmlSettings As New XmlReaderSettings()
Using readMovieInfo As XmlReader = XmlReader.Create(fileName, myXmlSettings)
Note that just as the XmlWriter
has a settings class, the XmlReader
also has a settings class. Though you can make assignments to the XmlReaderSettings
object, in this case you do not. Later, this chapter covers the XmlReaderSettings
object.
The basic mechanism for traversing each stream is to traverse from node to node using the Read
method. Node types in XML include Element
and Whitespace
. Numerous other node types are defined, but this example focuses on traversing XML elements and the white space that is used to make the elements more readable (carriage returns, linefeeds, and indentation spaces). Once the stream is positioned at a node, the MoveToNextAttribute
method can be called to read each attribute contained in an element. The MoveToNextAttribute
method only traverses attributes for nodes that contain attributes (nodes of type Element
). An example of an XmlReader
traversing each node and then traversing the attributes of each node follows:
While readMovieInfo.Read()
    ' Process node here.
    While readMovieInfo.MoveToNextAttribute()
        ' Process attribute here.
    End While
End While
This code, which reads the contents of the XML stream, does not utilize any knowledge of the stream's contents. However, a great many applications know exactly how the stream they are going to traverse is structured. Such applications can use XmlReader
in a more deliberate manner and not simply traverse the stream without foreknowledge.
Once the example stream has been read, it can be cleaned up using the End Using
statement:
End Using
This ReadMovieXml
subroutine takes the filename containing the XML to read as a parameter. The code for the subroutine is as follows (and is basically the code just outlined):
Private Sub ReadMovieXml(ByVal fileName As String)
    Dim myXmlSettings As New XmlReaderSettings()
    Using readMovieInfo As XmlReader = XmlReader.Create(fileName, _
        myXmlSettings)
        While readMovieInfo.Read()
            ShowXmlNode(readMovieInfo)
            While readMovieInfo.MoveToNextAttribute()
                ShowXmlNode(readMovieInfo)
            End While
        End While
    End Using
    Console.ReadLine()
End Sub
For each node encountered after a call to the Read
method, ReadMovieXml
calls the ShowXmlNode
subroutine. Similarly, for each attribute traversed, the ShowXmlNode
subroutine is called. This subroutine breaks down each node into its sub-entities:
Depth — This property of XmlReader
determines the level at which a node resides in the XML document tree. To understand depth, consider the following XML document composed solely of elements: <A><B></B><C><D></D></C></A>
.
Element <A>
is the root element, and when parsed would return a Depth
of 0
. Elements <B>
and <C>
are contained in <A>
and hence reflect a Depth
value of 1
. Element <D>
is contained in <C>
. The Depth
property value associated with <D>
(depth of 2
) should, therefore, be one more than the Depth
property associated with <C>
(depth of 1
).
Type — The type of each node is determined using the NodeType
property of XmlReader
. The node returned is of enumeration type, XmlNodeType
. Permissible node types include Attribute, Element
, and Whitespace
. (Numerous other node types can also be returned, including CDATA, Comment, Document, Entity
, and DocumentType
.)
Name — The name of each node is retrieved using the Name
property of XmlReader
. The name of the node could be an element name, such as <ElokuvaTilaus>
, or an attribute name, such as ElokuvaId
.
Attribute Count — The number of attributes associated with a node is retrieved using the AttributeCount property of XmlReader.
Value — The value of a node is retrieved using the Value property of XmlReader. For example, the text node inside the <Nimi> element has a value of Grease, while the element node itself reports an empty value.
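The Depth description above can be checked directly against the sample document. The following is a minimal sketch, assuming the XML is supplied as a string and read through a StringReader; the function name DescribeDepths is hypothetical.

```vb
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module DepthSketch
    ' Returns one "name:depth" entry per start element, separated by spaces.
    Function DescribeDepths(ByVal xml As String) As String
        Dim result As New StringBuilder()
        Using reader As XmlReader = XmlReader.Create(New StringReader(xml))
            While reader.Read()
                ' End tags and other node types are skipped.
                If reader.NodeType = XmlNodeType.Element Then
                    If result.Length > 0 Then result.Append(" ")
                    result.Append(reader.Name & ":" & reader.Depth.ToString())
                End If
            End While
        End Using
        Return result.ToString()
    End Function

    Sub Main()
        ' Prints: A:0 B:1 C:1 D:2
        Console.WriteLine(DescribeDepths("<A><B></B><C><D></D></C></A>"))
    End Sub
End Module
```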
The subroutine ShowXmlNode
is implemented as follows:
Private Sub ShowXmlNode(ByVal reader As XmlReader)
    If reader.Depth > 0 Then
        For depthCount As Integer = 1 To reader.Depth
            Console.Write("  ")
        Next
    End If
    If reader.NodeType = XmlNodeType.Whitespace Then
        Console.Out.WriteLine("Type: {0} ", reader.NodeType)
    ElseIf reader.NodeType = XmlNodeType.Text Then
        Console.Out.WriteLine("Type: {0}, Value: {1} ", _
            reader.NodeType, _
            reader.Value)
    Else
        Console.Out.WriteLine("Name: {0}, Type: {1}, " & _
            "AttributeCount: {2}, Value: {3} ", _
            reader.Name, _
            reader.NodeType, _
            reader.AttributeCount, _
            reader.Value)
    End If
End Sub
Within the ShowXmlNode
subroutine, each level of node depth adds two spaces to the output generated:
If reader.Depth > 0 Then
    For depthCount As Integer = 1 To reader.Depth
        Console.Write("  ")
    Next
End If
You add these spaces in order to create human-readable output (so you can easily determine the depth of each node displayed). For each type of node, ShowXmlNode
displays the value of the NodeType
property. The ShowXmlNode
subroutine makes a distinction between nodes of type Whitespace
and other types of nodes. The reason for this is simple: A node of type Whitespace
does not contain a name or attribute count. The value of such a node is any combination of white-space characters (space, tab, carriage return, and so on). Therefore, it doesn't make sense to display the properties if the NodeType
is XmlNodeType.Whitespace
. Nodes of type Text
have no name associated with them, so for this type, subroutine ShowXmlNode
only displays the properties NodeType
and Value
. For all other node types, the Name, AttributeCount, Value
, and NodeType
properties are displayed.
To finalize this module, add a Sub Main
as follows:
Sub Main(ByVal args() As String)
    ReadMovieXml("..MovieManage.xml")
End Sub
Here is an example construction of the MovieManage.xml
file:
<?xml version="1.0" encoding="utf-8" ?>
<MovieOrderDump>
  <FilmOrder_Multiple>
    <multiFilmOrders>
      <FilmOrder>
        <name>Grease</name>
        <filmId>101</filmId>
        <quantity>10</quantity>
      </FilmOrder>
      <FilmOrder>
        <name>Lawrence of Arabia</name>
        <filmId>102</filmId>
        <quantity>10</quantity>
      </FilmOrder>
      <FilmOrder>
        <name>Star Wars</name>
        <filmId>103</filmId>
        <quantity>10</quantity>
      </FilmOrder>
    </multiFilmOrders>
  </FilmOrder_Multiple>
  <PreOrder>
    <FilmOrder>
      <name>Shrek III - Shrek Becomes a Programmer</name>
      <filmId>104</filmId>
      <quantity>10</quantity>
    </FilmOrder>
  </PreOrder>
  <Returns>
    <FilmOrder>
      <name>Star Wars</name>
      <filmId>103</filmId>
      <quantity>2</quantity>
    </FilmOrder>
  </Returns>
</MovieOrderDump>
Running this module produces the following output (a partial display, as it would be rather lengthy):
Name: xml, Type: XmlDeclaration, AttributeCount: 2, Value: version="1.0" encoding="utf-8"
  Name: version, Type: Attribute, AttributeCount: 2, Value: 1.0
  Name: encoding, Type: Attribute, AttributeCount: 2, Value: utf-8
Type: Whitespace
Name: MovieOrderDump, Type: Element, AttributeCount: 0, Value:
  Type: Whitespace
  Name: FilmOrder_Multiple, Type: Element, AttributeCount: 0, Value:
    Type: Whitespace
    Name: multiFilmOrders, Type: Element, AttributeCount: 0, Value:
      Type: Whitespace
      Name: FilmOrder, Type: Element, AttributeCount: 0, Value:
        Type: Whitespace
        Name: name, Type: Element, AttributeCount: 0, Value:
          Type: Text, Value: Grease
This example managed to use three methods and five properties of XmlReader
. The output generated was informative but far from practical. XmlReader
exposes over 50 methods and properties, which means that we have only scratched the surface of this highly versatile class. The remainder of this section looks at the XmlReaderSettings
class, introduces a more realistic use of XmlReader
, and demonstrates how the classes of System.Xml
handle errors.
Just like the XmlWriter
object, the XmlReader
object requires settings to be applied for instantiation of the object. This means that you can apply settings specifying how the XmlReader
object behaves when it is reading whatever XML you might have for it. This includes settings for dealing with white space, schemas, and more:
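Property | Initial Value | Description |
---|---|---|
CheckCharacters | True | This property, if set to True, checks that the characters in the XML fall within the range of legal XML characters |
CloseInput | False | Specifies whether the underlying stream or TextReader is closed when the reader is closed |
ConformanceLevel | ConformanceLevel.Document | Allows the XML to be checked to ensure that it follows certain specified rules. Possible conformance-level settings include Auto, Document, and Fragment |
IgnoreComments | False | Defines whether comments should be ignored or not |
IgnoreProcessingInstructions | False | Defines whether processing instructions contained within the XML should be ignored |
IgnoreWhitespace | False | Defines whether the XmlReader should ignore insignificant white space |
LineNumberOffset | 0 | Defines the line number at which the LineNumber property starts counting |
LinePositionOffset | 0 | Defines the position in the line number at which the LinePosition property starts counting |
NameTable | An empty XmlNameTable object | Enables the XmlReader to work with a specific XmlNameTable for atomized string comparisons |
ProhibitDtd | True | Defines whether the XmlReader prohibits DTD processing |
Schemas | An empty XmlSchemaSet object | Enables the XmlReader to work with an XmlSchemaSet when performing schema validation |
ValidationFlags | | Enables you to apply validation schema settings. Possible values include AllowXmlAttributes, ProcessIdentityConstraints, ProcessInlineSchema, ProcessSchemaLocation, ReportValidationWarnings, and None |
ValidationType | ValidationType.None | Specifies whether the XmlReader performs validation or type assignment when reading |
XmlResolver | | A write-only property that enables you to access external documents |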
An example of using this settings class to modify the behavior of the XmlReader
class is as follows:
Dim myXmlSettings As New XmlReaderSettings()
myXmlSettings.IgnoreWhitespace = True
myXmlSettings.IgnoreComments = True
Using readMovieInfo As XmlReader = XmlReader.Create(fileName, myXmlSettings)
    ' Use XmlReader object here.
End Using
In this case, the XmlReader
object that is created ignores the white space that it encounters, as well as any of the XML comments. These settings, once established with the XmlReaderSettings
object, are then associated with the XmlReader
object through its Create
method.
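One way to see what these two settings change is to count the nodes the reader reports with and without them. The following is a minimal sketch, assuming the XML comes from a string rather than a file; the function name CountNodes and the sample text are hypothetical.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module ReaderSettingsSketch
    ' Counts every node the reader reports for the given XML text.
    Function CountNodes(ByVal xml As String, _
                        ByVal settings As XmlReaderSettings) As Integer
        Dim count As Integer = 0
        Using reader As XmlReader = _
            XmlReader.Create(New StringReader(xml), settings)
            While reader.Read()
                count += 1
            End While
        End Using
        Return count
    End Function

    Sub Main()
        Dim xml As String = "<FilmOrder>" & Environment.NewLine & _
            "  <!-- note -->" & Environment.NewLine & _
            "  <name>Grease</name>" & Environment.NewLine & _
            "</FilmOrder>"

        Dim strict As New XmlReaderSettings()
        strict.IgnoreWhitespace = True
        strict.IgnoreComments = True

        ' Default settings report white space and comment nodes as well.
        Console.WriteLine(CountNodes(xml, New XmlReaderSettings()))
        ' With both settings enabled, fewer nodes are reported.
        Console.WriteLine(CountNodes(xml, strict))
    End Sub
End Module
```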
An application can easily use XmlReader
to traverse a document that is received in a known format. The document can thus be traversed in a deliberate manner. You just implemented a class that serialized arrays of movie orders. The next example takes an XML document containing multiple XML documents of that type and traverses them. Each movie order is forwarded to the movie supplier via fax. The document is traversed as follows:
Read root element: <MovieOrderDump>
    Process each <FilmOrder_Multiple> element
        Read <multiFilmOrders> element
            Process each <FilmOrder>
                Send fax for each movie order here
The basic outline for the program's implementation is to open a file containing the XML document to parse and to traverse it from element to element:
Dim myXmlSettings As New XmlReaderSettings()
Using readMovieInfo As XmlReader = XmlReader.Create(fileName, myXmlSettings)
    readMovieInfo.Read()
    readMovieInfo.ReadStartElement("MovieOrderDump")
    Do While (True)
        '****************************************************
        '* Process FilmOrder elements here                  *
        '****************************************************
    Loop
    readMovieInfo.ReadEndElement() ' </MovieOrderDump>
End Using
The preceding code opened the file using the shared Create method of XmlReader
, and the End Using
statement takes care of shutting everything down for you. The code also introduced two methods of the XmlReader
class:
ReadStartElement(String)
— This verifies that the current node in the stream is an element and that the element's name matches the string passed to ReadStartElement
. If the verification is successful, then the stream is advanced to the next node.
ReadEndElement()
— This verifies that the current node is an end tag; and if the verification is successful, then the stream is advanced to the next node.
The application knows that an element, <MovieOrderDump>
, will be found at a specific point in the document. The ReadStartElement
method verifies this foreknowledge of the document format. After all the elements contained in element <MovieOrderDump>
have been traversed, the stream should point to the end tag </MovieOrderDump>
. The ReadEndElement
method verifies this.
The code that traverses each element of type <FilmOrder>
similarly uses the ReadStartElement
and ReadEndElement
methods to indicate the start and end of the <FilmOrder>
and <multiFilmOrders>
elements. The code that ultimately parses the list of movie orders and faxes the movie supplier (using the FranticallyFaxTheMovieSupplier
subroutine) is as follows:
Dim movieName As String
Dim movieId As String
Dim quantity As String
Dim myXmlSettings As New XmlReaderSettings()
Using readMovieInfo As XmlReader = XmlReader.Create(fileName, myXmlSettings)
    readMovieInfo.Read()
    readMovieInfo.ReadStartElement("MovieOrderDump")
    Do While (True)
        readMovieInfo.ReadStartElement("FilmOrder_Multiple")
        readMovieInfo.ReadStartElement("multiFilmOrders")
        Do While (True)
            readMovieInfo.ReadStartElement("FilmOrder")
            movieName = readMovieInfo.ReadElementString()
            movieId = readMovieInfo.ReadElementString()
            quantity = readMovieInfo.ReadElementString()
            readMovieInfo.ReadEndElement() ' clear </FilmOrder>
            FranticallyFaxTheMovieSupplier(movieName, movieId, quantity)
            ' Should read next FilmOrder node
            ' else quits
            readMovieInfo.Read()
            If ("FilmOrder" <> readMovieInfo.Name) Then
                Exit Do
            End If
        Loop
        readMovieInfo.ReadEndElement() ' clear </multiFilmOrders>
        readMovieInfo.ReadEndElement() ' clear </FilmOrder_Multiple>
        ' Should read next FilmOrder_Multiple node
        ' else you quit
        readMovieInfo.Read() ' move past white space to the next node
        If ("FilmOrder_Multiple" <> readMovieInfo.Name) Then
            Exit Do
        End If
    Loop
    readMovieInfo.ReadEndElement() ' </MovieOrderDump>
End Using
Three lines within the preceding code contain a call to the ReadElementString
method:
movieName = readMovieInfo.ReadElementString()
movieId = readMovieInfo.ReadElementString()
quantity = readMovieInfo.ReadElementString()
While parsing the stream, it was known that an element named <name>
existed and that this element contained the name of the movie. Rather than parse the start tag, get the value, and parse the end tag, it was easier to get the data using the ReadElementString
method. This method retrieves the data string associated with an element and advances the stream to the next element. The ReadElementString
method was also used to retrieve the data associated with the XML elements <filmId>
and <quantity>
.
The output of this example is a fax (not shown here because the point of this example is to demonstrate that it is simpler to traverse a document when its form is known). The format of the document is still verified by XmlReader
as it is parsed.
The XmlReader
class also exposes properties that provide more insight into the data contained in the XML document and the state of parsing: IsEmptyElement, EOF
, and IsStartElement
.
.NET CLR-compliant types are not 100 percent in line with XML types, so the .NET Framework 2.0 added methods to XmlReader that make the process of casting from one of these XML types to .NET types easier.
Using the ReadElementContentAs
method, you can easily perform the necessary casting required:
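' The second argument is an IXmlNamespaceResolver; pass Nothing when no
' namespace prefixes need to be resolved during the conversion.
Dim username As String = _
    CType(myXmlReader.ReadElementContentAs(GetType(String), Nothing), String)
Dim myDate As DateTime = _
    CType(myXmlReader.ReadElementContentAs(GetType(DateTime), Nothing), DateTime)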
Also available is a series of direct casts through new methods such as the following:
ReadElementContentAsBase64()
ReadElementContentAsBinHex()
ReadElementContentAsBoolean()
ReadElementContentAsDateTime()
ReadElementContentAsDecimal()
ReadElementContentAsDouble()
ReadElementContentAsFloat()
ReadElementContentAsInt()
ReadElementContentAsLong()
ReadElementContentAsObject()
ReadElementContentAsString()
In addition to these methods, the raw XML associated with the document can also be retrieved, using ReadInnerXml
and ReadOuterXml
. Again, this only scratches the surface of the XmlReader
class, a class quite rich in functionality.
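The typed read methods combine naturally with ReadStartElement and ReadEndElement when the document layout is known. The following is a minimal sketch, assuming a single <FilmOrder> element supplied as a string; the function name ReadOrderSummary is hypothetical. IgnoreWhitespace is switched on because the ReadElementContentAs methods expect the reader to be positioned on an element, not on a white space node.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module TypedReadSketch
    ' Reads one <FilmOrder> and returns "name/filmId/quantity".
    Function ReadOrderSummary(ByVal xml As String) As String
        Dim settings As New XmlReaderSettings()
        settings.IgnoreWhitespace = True ' keep the reader on element nodes

        Using reader As XmlReader = _
            XmlReader.Create(New StringReader(xml), settings)
            reader.ReadStartElement("FilmOrder")
            Dim movieName As String = reader.ReadElementContentAsString()
            Dim filmId As Integer = reader.ReadElementContentAsInt()
            Dim quantity As Integer = reader.ReadElementContentAsInt()
            reader.ReadEndElement() ' </FilmOrder>
            Return String.Format("{0}/{1}/{2}", movieName, filmId, quantity)
        End Using
    End Function

    Sub Main()
        Dim xml As String = "<FilmOrder><name>Grease</name>" & _
            "<filmId>101</filmId><quantity>10</quantity></FilmOrder>"
        Console.WriteLine(ReadOrderSummary(xml)) ' Grease/101/10
    End Sub
End Module
```

Because filmId and quantity arrive as Integer values, a non-numeric element raises an exception at the point of reading instead of producing a string that fails later.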
XML is text and could easily be read using mundane methods such as Read
and ReadLine
. A key feature of each class that reads and traverses XML is inherent support for error detection and handling. To demonstrate this, consider the following malformed XML document found in the file named Malformed.xml
:
<?xml version="1.0" encoding="IBM437" ?>
<ElokuvaTilaus ElokuvaId="101", Maara="10">
  <Nimi>Grease</Nimi>
<ElokuvaTilaus>
This document may not immediately appear to be malformed. By wrapping a call to the method you developed (ReadMovieXml
), you can see what type of exception is raised when XmlReader
detects the malformed XML within this document:
Try
    ReadMovieXml("Malformed.xml")
Catch xmlEx As XmlException
    Console.Error.WriteLine("XML Error: " + xmlEx.ToString())
Catch ex As Exception
    Console.Error.WriteLine("Some other error: " + ex.ToString())
End Try
The methods and properties exposed by the XmlReader
class raise exceptions of type System.Xml.XmlException
. In fact, every class in the System.Xml
namespace raises exceptions of type XmlException
. Although this is a discussion of errors using an instance of type XmlReader
, the concepts reviewed apply to all errors generated by classes found in the System.Xml
namespace.
Properties exposed by XmlException
include the following:
Data
— A set of key-value pairs that enable you to display user-defined information about the exception
HelpLink
— The link to the help page that deals with the exception
InnerException
— The System.Exception
instance indicating what caused the current exception
LineNumber
— The number of the line within an XML document where the error occurred
LinePosition
— The position within the line specified by LineNumber
where the error occurred
Message
— The error message that corresponds to the error that occurred. This error took place at the line in the XML document specified by LineNumber
and within the line at the position specified by LinePosition
.
Source
— Provides the name of the application or object that triggered the error
SourceUri
— Provides the URI of the element or document in which the error occurred
StackTrace
— Provides a string representation of the frames on the call stack when the error was triggered
The error displayed when subroutine ReadMovieXml
processes Malformed.xml
is as follows:
XML Error: System.Xml.XmlException: The ',' character, hexadecimal value 0x2C, cannot begin a name. Line 2, position 49.
The preceding snippet indicates that a comma separates the attributes in element <ElokuvaTilaus> (ElokuvaId="101", Maara="10")
. This comma is invalid. Removing it and running the code again results in the following output:
XML Error: System.Xml.XmlException: This is an unexpected token. Expected 'EndElement'. Line 5, position 27.
Again, you can recognize the precise error. In this case, you do not have an end element, </ElokuvaTilaus>
, but you do have an opening element, <ElokuvaTilaus>
.
The properties provided by the XmlException
class (such as LineNumber, LinePosition
, and Message
) provide a useful level of precision when tracking down errors. The XmlReader
class also exposes a level of precision with respect to the parsing of the XML document. This precision is exposed by the XmlReader
through properties such as LineNumber
and LinePosition
.
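The LineNumber and LinePosition properties of XmlException can also be read in code rather than parsed out of the message text. The following is a minimal sketch, assuming the XML is passed in as a string; the function name FindFirstError is hypothetical.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module XmlErrorSketch
    ' Parses the text and returns "ok", or "line,position" of the first error.
    Function FindFirstError(ByVal xml As String) As String
        Try
            Using reader As XmlReader = XmlReader.Create(New StringReader(xml))
                While reader.Read()
                    ' Nothing to do; reading alone triggers the checks.
                End While
            End Using
            Return "ok"
        Catch ex As XmlException
            Return ex.LineNumber.ToString() & "," & ex.LinePosition.ToString()
        End Try
    End Function

    Sub Main()
        Console.WriteLine(FindFirstError("<a><b>text</b></a>"))
        ' A comma between attributes, as in Malformed.xml.
        Console.WriteLine(FindFirstError("<a x=""1"", y=""2""/>"))
    End Sub
End Module
```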
A very useful class that can greatly help you when working with XML is System.IO.MemoryStream
. Rather than need a network or disk resource backing the stream (as in System.Net.Sockets.NetworkStream
and System.IO.FileStream
), MemoryStream
is backed by a block of memory. Imagine that you want to generate an XML document and e-mail it. The built-in classes for sending e-mail rely on having a System.String
containing a block of text for the message body, but if you want to generate an XML document, then you need a stream.
If the document is reasonably sized, then write the document directly to memory and copy that block of memory to the e-mail. This is good from a performance and reliability perspective because you don't have to open a file, write it, rewind it, and read the data back in again. However, you must consider scalability in this situation because if the file is very large, or if you have a great number of smaller files, then you could run out of memory (in which case you have to go the "file" route).
This section describes how to generate an XML document to a MemoryStream
object, reading the document back out again as a System.String
value and e-mailing it. What you will do is create a new class called EmailStream
that extends MemoryStream
. This new class contains an extra method called CloseAndSend
that, as its name implies, closes the stream and sends the e-mail message.
First, create a new console application project called "EmailStream." The first task is to create a basic Customer
object that contains a few basic members and can be automatically serialized by .NET through use of the SerializableAttribute
attribute:
<Serializable()> Public Class Customer
    ' members...
    Public Id As Integer
    Public FirstName As String
    Public LastName As String
    Public Email As String
End Class
The fun part is the EmailStream
class itself. This needs access to the System.Net.Mail
namespace, so import this namespace into your code for your class. The new class should also extend System.IO.MemoryStream
, as shown here:
Imports System.IO
Imports System.Net.Mail

Public Class EmailStream
    Inherits MemoryStream
The first job of CloseAndSend
is to start putting together the mail message. This is done by creating a new System.Net.Mail.MailMessage
object and configuring the sender, recipient, and subject:
    ' CloseAndSend - close the stream and send the email...
    Public Sub CloseAndSend(ByVal fromAddress As String, _
                            ByVal toAddress As String, _
                            ByVal subject As String)
        ' Create the new message...
        Dim message As New MailMessage()
        message.From = New MailAddress(fromAddress)
        message.To.Add(New MailAddress(toAddress))
        message.Subject = subject
This method will be called after the XML document has been written to the stream, so you can assume at this point that the stream contains a block of data. To read the data back out again, you have to rewind the stream and use a System.IO.StreamReader
. Before you do this, however, call Flush
. Traditionally, streams have always been buffered — that is, the data is not sent to the final destination (the memory block in this case, but a file in the case of a FileStream
, and so on) each time the stream is written. Instead, the data is written in (mostly) a nondeterministic way. Because you need all the data to be written, you call Flush
to ensure that all the data has been sent to the destination and that the buffer is empty.
In a way, EmailStream
is a great example of buffering. All the data is held in a memory "buffer" until you finally send the data on to its destination in a response to an explicit call to this method:
        ' Flush and rewind the stream...
        Flush()
        Seek(0, SeekOrigin.Begin)
Once you have flushed and rewound the stream, you can create a StreamReader
and dredge all the data out into the Body
property of the MailMessage
object:
        ' Read out the data...
        Dim reader As New StreamReader(Me)
        message.Body = reader.ReadToEnd()
After you have done that, close the stream by calling the base class method:
        ' Close the stream...
        Close()
Finally, send the message:
        ' Send the message...
        Dim SmtpMail As New SmtpClient()
        SmtpMail.Send(message)
    End Sub
End Class
To call this method, you need to add some code to the Main
method. First, create a new Customer
object and populate it with some test data:
Imports System.Xml.Serialization

Module Module1
    Sub Main()
        ' Create a new customer...
        Dim customer As New Customer
        customer.Id = 27
        customer.FirstName = "Bill"
        customer.LastName = "Gates"
        customer.Email = "[email protected]"
After you have done that, you can create a new EmailStream
object. You then use XmlSerializer
to write an XML document representing the newly created Customer
instance to the block of memory that EmailStream
is backing to:
        ' Create a new email stream...
        Dim stream As New EmailStream
        ' Serialize...
        Dim serializer As New XmlSerializer(customer.GetType())
        serializer.Serialize(stream, customer)
At this point, the stream will be filled with data; and after all the data has been flushed, the block of memory that EmailStream
backs on to will contain the complete document. Now you can call CloseAndSend
to e-mail the document:
        ' Send the email...
        stream.CloseAndSend("[email protected]", _
            "[email protected]", "XML Customer Document")
    End Sub
End Module
You probably already have the Microsoft SMTP service properly configured — this service is necessary to send e-mail. You also need to ensure that the e-mail addresses used in your code go to your e-mail address! Run the project and check your e-mail; you should see something similar to what is shown in Figure 10-2.
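The write, rewind, and read-back pattern at the heart of EmailStream can be tried without an SMTP server. The following is a minimal sketch, assuming an XmlWriter is used in place of XmlSerializer and that the result is simply returned as a string; the function name WriteCustomerToString is hypothetical.

```vb
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module MemoryStreamSketch
    ' Writes a tiny document to memory and reads it back as a String.
    Function WriteCustomerToString() As String
        Using memory As New MemoryStream()
            Dim settings As New XmlWriterSettings()
            settings.Encoding = New UTF8Encoding(False) ' no byte order mark

            Using writer As XmlWriter = XmlWriter.Create(memory, settings)
                writer.WriteStartElement("Customer")
                writer.WriteElementString("FirstName", "Bill")
                writer.WriteEndElement()
            End Using ' flushes the writer; the stream itself stays open

            ' Rewind before reading, as CloseAndSend does.
            memory.Seek(0, SeekOrigin.Begin)
            Dim reader As New StreamReader(memory, Encoding.UTF8)
            Return reader.ReadToEnd()
        End Using
    End Function

    Sub Main()
        Console.WriteLine(WriteCustomerToString())
    End Sub
End Module
```

Skipping the Seek call would leave the stream positioned at its end, and ReadToEnd would return an empty string.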
The classes of the System.Xml
namespace that support the Document Object Model (DOM) interact as illustrated in Figure 10-3.
Within this diagram, an XML document is contained in a class named XmlDocument
. Each node within this document is accessible and managed using XmlNode
. Nodes can also be accessed and managed using a class specifically designed to process a specific node's type (XmlElement, XmlAttribute
, and so on). XML documents are extracted from XmlDocument
using a variety of mechanisms exposed through such classes as XmlWriter, TextWriter, Stream
, and a file (specified by filename of type String
). XML documents are consumed by an XmlDocument
using a variety of load mechanisms exposed through the same classes.
A DOM-style parser differs from a stream-style parser with respect to movement. Using the DOM, the nodes can be traversed forward and backward. Nodes can be added to the document, removed from the document, and updated. However, this flexibility comes at a performance cost. It is faster to read or write XML using a stream-style parser.
The DOM-specific classes exposed by System.Xml
include the following:
XmlDocument
— Corresponds to an entire XML document. A document is loaded using the Load
method. XML documents are loaded from a file (the filename specified as type String
), TextReader
, or XmlReader
. A document can be loaded using LoadXml
in conjunction with a string containing the XML document. The Save
method is used to save XML documents. The methods exposed by XmlDocument
reflect the intricate manipulation of an XML document. For example, the following self-documenting creation methods are implemented by this class: CreateAttribute, CreateCDataSection, CreateComment, CreateDocumentFragment, CreateDocumentType, CreateElement, CreateEntityReference, CreateNavigator, CreateNode, CreateProcessingInstruction, CreateSignificantWhitespace, CreateTextNode, CreateWhitespace
, and CreateXmlDeclaration
. The elements contained in the document can be retrieved. Other methods support the retrieving, importing, cloning, loading, and writing of nodes.
XmlNode
— Corresponds to a node within the DOM tree. This class supports data types, namespaces, and DTDs. A robust set of methods and properties is provided to create, delete, and replace nodes: AppendChild, Clone, CloneNode, CreateNavigator, InsertAfter, InsertBefore, Normalize, PrependChild, RemoveAll, RemoveChild, ReplaceChild, SelectNodes, SelectSingleNode, Supports, WriteContentTo
, and WriteTo
. The contents of a node can similarly be traversed in a variety of ways: FirstChild, LastChild, NextSibling, ParentNode
, and PreviousSibling
.
XmlElement
— Corresponds to an element within the DOM tree. The functionality exposed by this class contains a variety of methods used to manipulate an element's attributes: AppendChild, Clone, CloneNode, CreateNavigator, GetAttribute, GetAttributeNode, GetElementsByTagName, GetNamespaceOfPrefix, GetPrefixOfNamespace, InsertAfter, InsertBefore, Normalize, PrependChild, RemoveAll, RemoveAllAttributes, RemoveAttribute, RemoveAttributeAt, RemoveAttributeNode, RemoveChild, ReplaceChild, SelectNodes, SelectSingleNode, SetAttribute, SetAttributeNode, Supports, WriteContentTo
, and WriteTo
.
XmlAttribute
— Corresponds to an attribute of an element (XmlElement
) within the DOM tree. An attribute contains data rather than lists of subordinate data, so it is a less complicated object than an XmlNode
or an XmlElement
. An XmlAttribute
can retrieve its owner document (property, OwnerDocument
), retrieve its owner element (property, OwnerElement
), retrieve its parent node (property, ParentNode
), and retrieve its name (property, Name
). The value of an XmlAttribute
is available via a read-write property named Value
. Methods available to XmlAttribute
include AppendChild, Clone, CloneNode, CreateNavigator, GetNamespaceOfPrefix, GetPrefixOfNamespace, InsertAfter, InsertBefore, Normalize, PrependChild, RemoveAll, RemoveChild, ReplaceChild, SelectNodes, SelectSingleNode, WriteContentTo
, and WriteTo
.
Given the diverse number of methods and properties exposed by XmlDocument, XmlNode, XmlElement
, and XmlAttribute
(and there are many more than those listed here), it's clear that any XML 1.0-compliant document can be generated and manipulated using these classes. In comparison to their XML stream counterparts, these classes offer more flexible movement within the XML document and allow the document to be edited in place.
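To make the class descriptions above more concrete, here is a minimal sketch that builds a small document in memory and then walks it. The module name DomSketch and the sample values are assumptions made for illustration; the members called (CreateElement, AppendChild, InsertAfter, SetAttribute, GetAttributeNode, FirstChild, NextSibling) are the ones listed above.

```vbnet
Imports System
Imports System.Xml

Module DomSketch
    Sub Main()
        Dim doc As New XmlDocument()

        ' Create the root element and attach it to the document.
        Dim root As XmlElement = doc.CreateElement("multiFilmOrders")
        doc.AppendChild(root)

        ' Create a <FilmOrder> element that carries one attribute.
        Dim order As XmlElement = doc.CreateElement("FilmOrder")
        order.SetAttribute("filmId", "101")
        root.AppendChild(order)

        Dim nameElement As XmlElement = doc.CreateElement("name")
        nameElement.InnerText = "Grease"
        order.AppendChild(nameElement)

        ' InsertAfter places <quantity> directly after <name>.
        Dim quantityElement As XmlElement = doc.CreateElement("quantity")
        quantityElement.InnerText = "10"
        order.InsertAfter(quantityElement, nameElement)

        ' Walk the children of <FilmOrder> with FirstChild and NextSibling.
        Dim child As XmlNode = order.FirstChild
        While child IsNot Nothing
            Console.WriteLine(child.Name & ": " & child.InnerText)
            child = child.NextSibling
        End While

        ' Read the attribute back through its XmlAttribute object.
        Dim idAttribute As XmlAttribute = order.GetAttributeNode("filmId")
        Console.WriteLine(idAttribute.Name & " = " & idAttribute.Value)

        ' OuterXml returns the markup for the whole document.
        Console.WriteLine(doc.OuterXml)
    End Sub
End Module
```

Because the whole tree stays in memory, nodes can be revisited or rearranged at any point, which is the flexibility the stream-based classes do not offer.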
A similar comparison could be made between DOM and data serialized and deserialized using XML. Using serialization, the type of node (for example, attribute or element) and the node name are specified at compile time. There is no on-the-fly modification of the XML generated by the serialization process.
Other technologies that generate and consume XML are not as flexible as the DOM. This includes ADO.NET and ADO, which generate XML of a particular form. The default install of SQL Server 2000 does expose a certain amount of flexibility when it comes to the generation (FOR XML
queries) and consumption (OPENXML
) of XML. SQL Server 2005 has more support for XML and even supports an XML data type. SQL Server 2005 also expands upon the FOR XML
query with FOR XML TYPE
. The choice between using classes within the DOM and a version of SQL Server is a choice between using a language such as Visual Basic to manipulate objects or installing SQL Server and performing most of the XML manipulation in SQL.
The first DOM example loads an XML document into an XmlDocument
object using a string that contains the actual XML document. This scenario is typical of an application that uses ADO.NET to generate XML, but then uses the objects of the DOM to traverse and manipulate this XML. ADO.NET's DataSet
object contains the results of ADO.NET data access operations. The DataSet
class exposes a GetXml
method that retrieves the underlying XML associated with the DataSet
. The following code demonstrates how the contents of the DataSet
are loaded into the XmlDocument
:
    Dim xmlDoc As New XmlDocument
    Dim ds As New DataSet()

    ' Set up ADO.NET DataSet() here
    xmlDoc.LoadXml(ds.GetXml())
This example over the next few pages simply traverses each XML element (XmlNode
) in the document (XmlDocument
) and displays the data accordingly. The data associated with this example is not retrieved from a DataSet
but is instead contained in a string, rawData
, which is initialized as follows:
    Dim rawData As String = _
        "<multiFilmOrders>" & _
        "  <FilmOrder>" & _
        "    <name>Grease</name>" & _
        "    <filmId>101</filmId>" & _
        "    <quantity>10</quantity>" & _
        "  </FilmOrder>" & _
        "  <FilmOrder>" & _
        "    <name>Lawrence of Arabia</name>" & _
        "    <filmId>102</filmId>" & _
        "    <quantity>10</quantity>" & _
        "  </FilmOrder>" & _
        "</multiFilmOrders>"
The XML document in rawData
is a portion of the XML hierarchy associated with a movie order. The preceding example is what you would do if you were using any of the .NET Framework versions before version 3.5. If you are working with the .NET Framework 3.5, then you can use the new XML literal capability. This means that you can put XML directly in your code as XML and not as a string. An XML literal produces an XElement rather than a String, so ToString is called on the literal to obtain the markup that LoadXml expects. This approach is presented here:
    Dim rawData As String = _
        <multiFilmOrders>
            <FilmOrder>
                <name>Grease</name>
                <filmId>101</filmId>
                <quantity>10</quantity>
            </FilmOrder>
            <FilmOrder>
                <name>Lawrence of Arabia</name>
                <filmId>102</filmId>
                <quantity>10</quantity>
            </FilmOrder>
        </multiFilmOrders>.ToString()
The basic idea in processing this data is to traverse each <FilmOrder>
element in order to display the data it contains. Each node corresponding to a <FilmOrder>
element can be retrieved from your XmlDocument
using the GetElementsByTagName
method (specifying a tag name of FilmOrder
). The GetElementsByTagName
method returns a list of XmlNode
objects in the form of a collection of type XmlNodeList
. Using the For Each
statement to iterate over this list, the XmlNodeList
(movieOrderNodes
) can be traversed as individual XmlNode
elements (movieOrderNode
). The code for handling this is as follows:
    Dim xmlDoc As New XmlDocument
    Dim movieOrderNodes As XmlNodeList
    Dim movieOrderNode As XmlNode

    xmlDoc.LoadXml(rawData)
    ' Traverse each <FilmOrder>
    movieOrderNodes = xmlDoc.GetElementsByTagName("FilmOrder")
    For Each movieOrderNode In movieOrderNodes
        '**********************************************************
        ' Process <name>, <filmId> and <quantity> here
        '**********************************************************
    Next
Each XmlNode
can then have its contents displayed by traversing the children of this node using the ChildNodes
property. This property returns an XmlNodeList
(baseDataNodes
) that can be traversed one XmlNode
list element at a time:
    Dim baseDataNodes As XmlNodeList
    Dim bFirstInRow As Boolean

    baseDataNodes = movieOrderNode.ChildNodes
    bFirstInRow = True
    For Each baseDataNode As XmlNode In baseDataNodes
        If (bFirstInRow) Then
            bFirstInRow = False
        Else
            Console.Out.Write(", ")
        End If
        Console.Out.Write(baseDataNode.Name & ": " & baseDataNode.InnerText)
    Next
    Console.Out.WriteLine()
The bulk of the preceding code retrieves the name of the node using the Name
property and the InnerText
property of the node. The InnerText
property of each XmlNode
retrieved contains the data associated with the XML elements (nodes) <name>, <filmId>
, and <quantity>
. The example displays the contents of the XML elements using Console.Out
. The XML document is displayed as follows:
    name: Grease, filmId: 101, quantity: 10
    name: Lawrence of Arabia, filmId: 102, quantity: 10
Other, more practical, methods for using this data could have been implemented, including the following:
The contents could have been directed to an ASP.NET Response
object, and the data retrieved could have been used to create an HTML table (<table>
table, <tr>
row, and <td>
data) that would be written to the Response
object.
The data traversed could have been directed to a ListBox
or ComboBox
Windows Forms control. This would enable the data returned to be selected as part of a GUI application.
The data could have been edited as part of your application's business rules. For example, you could have used the traversal to verify that the <filmId>
matched the <name>
. Something like this could be done if you wanted to validate the data entered into the XML document in any manner.
Here is the example in its entirety:
    ' The XML literal yields an XElement; ToString returns its markup.
    Dim rawData As String = _
        <multiFilmOrders>
            <FilmOrder>
                <name>Grease</name>
                <filmId>101</filmId>
                <quantity>10</quantity>
            </FilmOrder>
            <FilmOrder>
                <name>Lawrence of Arabia</name>
                <filmId>102</filmId>
                <quantity>10</quantity>
            </FilmOrder>
        </multiFilmOrders>.ToString()

    Dim xmlDoc As New XmlDocument
    Dim movieOrderNodes As XmlNodeList
    Dim movieOrderNode As XmlNode
    Dim baseDataNodes As XmlNodeList
    Dim bFirstInRow As Boolean

    xmlDoc.LoadXml(rawData)
    ' Traverse each <FilmOrder>
    movieOrderNodes = xmlDoc.GetElementsByTagName("FilmOrder")
    For Each movieOrderNode In movieOrderNodes
        baseDataNodes = movieOrderNode.ChildNodes
        bFirstInRow = True
        For Each baseDataNode As XmlNode In baseDataNodes
            If (bFirstInRow) Then
                bFirstInRow = False
            Else
                Console.Out.Write(", ")
            End If
            Console.Out.Write(baseDataNode.Name & ": " & baseDataNode.InnerText)
        Next
        Console.Out.WriteLine()
    Next
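The SelectNodes and SelectSingleNode methods listed earlier for XmlNode offer another way to reach the same data, using XPath expressions instead of walking ChildNodes. The following is only a sketch of that alternative, not a replacement for the example above; the module name XPathSketch and the output format are assumptions.

```vbnet
Imports System
Imports System.Xml

Module XPathSketch
    Sub Main()
        Dim rawData As String = _
            "<multiFilmOrders>" & _
            "<FilmOrder><name>Grease</name>" & _
            "<filmId>101</filmId><quantity>10</quantity></FilmOrder>" & _
            "<FilmOrder><name>Lawrence of Arabia</name>" & _
            "<filmId>102</filmId><quantity>10</quantity></FilmOrder>" & _
            "</multiFilmOrders>"

        Dim xmlDoc As New XmlDocument()
        xmlDoc.LoadXml(rawData)

        ' Select every <FilmOrder> under the root with an XPath expression.
        For Each order As XmlNode In xmlDoc.SelectNodes("/multiFilmOrders/FilmOrder")
            ' Relative XPath expressions pick out individual child elements.
            Dim nameNode As XmlNode = order.SelectSingleNode("name")
            Dim quantityNode As XmlNode = order.SelectSingleNode("quantity")
            Console.WriteLine(nameNode.InnerText & " x " & quantityNode.InnerText)
        Next
    End Sub
End Module
```

This style is convenient when only a few named children are needed, because the code does not depend on the order in which the child elements appear.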
This next example demonstrates how to traverse data contained in attributes and how to update the attributes based on a set of business rules. In this example, the XmlDocument
object is populated by retrieving an XML document from a file. After the business rules edit the object, the data is persisted back to the file:
    Dim xmlDoc As New XmlDocument

    xmlDoc.Load("..\MovieSupplierShippingListV2.xml")
    '*******************************************
    ' Business rules process document here
    '*******************************************
    xmlDoc.Save("..\MovieSupplierShippingListV2.xml")
The data contained in the file, MovieSupplierShippingListV2.xml
, is a variation of the movie order. You have altered your rigid standard (for the sake of example) so that the data associated with individual movie orders is contained in XML attributes instead of XML elements. An example of this movie order data is as follows:
<FilmOrder name="Grease" filmId="101" quantity="10" />
You already know how to traverse the XML elements associated with a document, so let's assume that you have successfully retrieved the XmlNode
associated with the <FilmOrder>
element:
    Dim attributes As XmlAttributeCollection
    Dim filmId As Integer
    Dim quantity As Integer

    attributes = node.Attributes
    For Each attribute As XmlAttribute In attributes
        If 0 = String.Compare(attribute.Name, "filmId") Then
            ' Convert the attribute text to a number explicitly.
            filmId = CInt(attribute.Value)
        ElseIf 0 = String.Compare(attribute.Name, "quantity") Then
            quantity = CInt(attribute.Value)
        End If
    Next
The preceding code traverses the attributes of an XmlNode
by retrieving a list of attributes using the Attributes
property. The value of this property is assigned to the attributes variable (data type, XmlAttributeCollection
). The individual XmlAttribute
objects (variable, attribute
) contained in attributes are traversed using a For Each
loop. Within the loop, the contents of the filmId
and the quantity
attribute are saved for processing by your business rules.
Your business rules execute an algorithm that ensures that the movies in the company's order are provided in the correct quantity. This rule specifies that the movie associated with filmId=101
must be sent to the customer in batches of six at a time due to packaging. In the event of an invalid quantity, the code for enforcing this business rule will remove a single order from the quantity
value until the number is divisible by six. Then this number is assigned to the quantity
attribute. The Value
property of the XmlAttribute
object is used to set the correct value of the order's quantity. The code performing this business rule is as follows:
    If filmId = 101 Then
        ' This film comes packaged in batches of six.
        ' Decrement until the quantity is evenly divisible by six.
        Do Until (quantity Mod 6) = 0
            quantity -= 1
        Loop
        attributes.ItemOf("quantity").Value = quantity.ToString()
    End If
What is elegant about this example is that the list of attributes was traversed using For Each
. Then ItemOf
was used to look up a specific attribute that had already been traversed. This would not have been possible by reading an XML stream with an object derived from the XML stream reader class, XmlReader
.
You can use this code as follows:
    Sub TraverseAttributes(ByRef node As XmlNode)
        Dim attributes As XmlAttributeCollection
        Dim filmId As Integer
        Dim quantity As Integer

        attributes = node.Attributes
        For Each attribute As XmlAttribute In attributes
            If 0 = String.Compare(attribute.Name, "filmId") Then
                filmId = CInt(attribute.Value)
            ElseIf 0 = String.Compare(attribute.Name, "quantity") Then
                quantity = CInt(attribute.Value)
            End If
        Next

        If filmId = 101 Then
            ' This film comes packaged in batches of six.
            ' Decrement until the quantity is evenly divisible by six.
            Do Until (quantity Mod 6) = 0
                quantity -= 1
            Loop
            attributes.ItemOf("quantity").Value = quantity.ToString()
        End If
    End Sub

    Sub WXReadMovieDOM()
        Dim xmlDoc As New XmlDocument
        Dim movieOrderNodes As XmlNodeList

        xmlDoc.Load("..\MovieSupplierShippingListV2.xml")
        ' Traverse each <FilmOrder>
        movieOrderNodes = xmlDoc.GetElementsByTagName("FilmOrder")
        For Each movieOrderNode As XmlNode In movieOrderNodes
            TraverseAttributes(movieOrderNode)
        Next
        xmlDoc.Save("..\MovieSupplierShippingListV2.xml")
    End Sub
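The same business rule can also be sketched with the GetAttribute and SetAttribute methods of XmlElement, which read and write an attribute by name without looping over the attribute collection. The version below is an illustration under stated assumptions: it runs against an in-memory document rather than the shipping list file, the <orders> wrapper element is invented, and RoundDownToBatch is a hypothetical helper that replaces the decrementing loop with a single Mod calculation.

```vbnet
Option Strict On
Imports System
Imports System.Xml

Module BatchRuleSketch
    ' Hypothetical helper: rounds quantity down to the nearest
    ' multiple of batchSize (10 becomes 6 when batchSize is 6).
    Function RoundDownToBatch(ByVal quantity As Integer, _
                              ByVal batchSize As Integer) As Integer
        Return quantity - (quantity Mod batchSize)
    End Function

    Sub Main()
        Dim xmlDoc As New XmlDocument()
        ' The <orders> root element is made up for this sketch.
        xmlDoc.LoadXml("<orders>" & _
            "<FilmOrder name=""Grease"" filmId=""101"" quantity=""10"" />" & _
            "</orders>")

        For Each node As XmlNode In xmlDoc.GetElementsByTagName("FilmOrder")
            Dim order As XmlElement = DirectCast(node, XmlElement)
            Dim filmId As Integer = Integer.Parse(order.GetAttribute("filmId"))
            Dim quantity As Integer = Integer.Parse(order.GetAttribute("quantity"))

            If filmId = 101 Then
                ' This film ships in batches of six.
                order.SetAttribute("quantity", _
                                   RoundDownToBatch(quantity, 6).ToString())
            End If
        Next

        ' The quantity attribute of the Grease order is now 6.
        Console.WriteLine(xmlDoc.OuterXml)
    End Sub
End Module
```

Parsing the attribute text explicitly keeps the code valid under Option Strict On, at the cost of an exception if an attribute is missing or not numeric.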
XSLT is a language that is used to transform XML documents into another format altogether. One popular use of XSLT is to transform XML into HTML so that XML documents can be presented visually. You have performed a similar task before. When working with XML serialization, you rewrote the FilmOrder
class. This class was used to serialize a movie order object to XML using nodes that contained English-language names. The rewritten version of this class, ElokuvaTilaus
, serialized XML nodes containing Finnish names. Source Code Style attributes were used in conjunction with the XmlSerializer
class to accomplish this transformation. Two words in this paragraph send chills down the spine of any experienced developer: rewrote and rewritten. The point of an XSL transform is to use an alternate language (XSLT) to transform the XML, rather than rewrite the source code, SQL commands, or some other mechanism used to generate XML.
Conceptually, XSLT is straightforward. A file with an .xslt
extension describes the changes (transformations) that will be applied to a particular XML file. Once this is completed, an XSLT processor is provided with the source XML file and the XSLT file, and performs the transformation. The System.Xml.Xsl.XslTransform
class is such an XSLT processor. Another processor you will find (introduced in the .NET Framework 2.0) is the XslCompiledTransform
object found at System.Xml.Xsl.XslCompiledTransform
. This section looks at using both of these processors.
There are also some new features to be found in Visual Studio 2008 that deal with XSLT. The new version of the IDE supports items such as XSLT data breakpoints and better support in the editor for loading large documents. Additionally, XSLT stylesheets can be compiled into assemblies even more easily with the new command-line stylesheet compiler, XSLTC.exe
.
The XSLT file is itself an XML document, although certain elements within this document are XSLT-specific commands. Dozens of XSLT commands can be used in writing an XSLT file. The first example explores the following XSLT elements (commands):
stylesheet
— This element indicates the start of the style sheet (XSL) in the XSLT file.
template
— This element denotes a reusable template for producing specific output. This output is generated using a specific node type within the source document under a specific context. For example, the text <xsl:template match="/">
matches the root node ("/"
) for the specific transform template.
for-each
— This element applies the same template to each node in the specified set. Recall the example class (FilmOrder_Multiple
) that could be serialized. This class contained an array of movie orders. Given the XML document generated when a FilmOrder_Multiple
is serialized, each movie order serialized could be processed using <xsl:for-each select = "FilmOrder_Multiple/multiFilmOrders/FilmOrder">
.
value-of
— This element retrieves the value of the specified node and inserts it into the document in text form. For example, <xsl:value-of select="name" />
would take the value of the XML element <name>
and insert it into the transformed document.
When serialized, the FilmOrder_Multiple
class generates XML such as the following (where...indicates where additional <FilmOrder>
elements may reside):
<?xml version="1.0" encoding="UTF-8" ?> <FilmOrder_Multiple> <multiFilmOrders> <FilmOrder> <name>Grease</name> <filmId>101</filmId> <quantity>10</quantity> </FilmOrder> ... </multiFilmOrders> </FilmOrder_Multiple>
The preceding XML document is used to generate a report that is viewed by the manager of the movie supplier. This report is in HTML form, so that it can be viewed via the Web. The XSLT elements you previously reviewed (stylesheet, template
, for-each, and value-of
) are the only XSLT elements required to transform the XML document (in which data is stored) into an HTML file (data that can be displayed). An XSLT file DisplayThatPuppy.xslt
contains the following text, which is used to transform a serialized version, FilmOrder_Multiple
:
    <?xml version="1.0" encoding="UTF-8" ?>
    <xsl:stylesheet
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:template match="/">
        <HTML>
          <TITLE>What people are ordering</TITLE>
          <BODY>
            <TABLE BORDER="1">
              <TR>
                <TD><B>Film Name</B></TD>
                <TD><B>Film ID</B></TD>
                <TD><B>Quantity</B></TD>
              </TR>
              <xsl:for-each select=
                  "FilmOrder_Multiple/multiFilmOrders/FilmOrder">
                <TR>
                  <TD><xsl:value-of select="name" /></TD>
                  <TD><xsl:value-of select="filmId" /></TD>
                  <TD><xsl:value-of select="quantity" /></TD>
                </TR>
              </xsl:for-each>
            </TABLE>
          </BODY>
        </HTML>
      </xsl:template>
    </xsl:stylesheet>
In the preceding XSLT file, the XSLT elements are the ones that carry the xsl: prefix. These elements perform operations on the source XML file containing a serialized FilmOrder_Multiple
object and generate the appropriate HTML file. Your file contains a table (marked by the table tag, <TABLE>
) that contains a set of rows (each row marked by a table row tag, <TR>
). The columns of the table are contained in table data tags, <TD>
. The XSLT file contains the header row for the table:
<TR> <TD><B>Film Name</B></TD> <TD><B>Film ID</B></TD> <TD><B>Quantity</B></TD> </TR>
Each row containing data (an individual movie order from the serialized object, FilmOrder_Multiple
) is generated using the XSLT element, for-each
, to traverse each <FilmOrder>
element within the source XML document:
<xsl:for-each select= "FilmOrder_Multiple/multiFilmOrders/FilmOrder">
The individual columns of data are generated using the value-of
XSLT element, in order to query the elements contained within each <FilmOrder>
element (<name>, <filmId>
, and <quantity>
):
<TR> <TD><xsl:value-of select="name" /></TD> <TD><xsl:value-of select="filmId" /></TD> <TD><xsl:value-of select="quantity" /></TD> </TR>
The code to create a displayable HTML file using the XslCompiledTransform
object is as follows:
    Dim myXslTransform As New XslCompiledTransform()
    Dim destFileName As String = "..\ShowIt.html"

    myXslTransform.Load("..\DisplayThatPuppy.xslt")
    myXslTransform.Transform("..\FilmOrders.xml", destFileName)
    System.Diagnostics.Process.Start(destFileName)
This consists of only five lines of code, with the bulk of the coding taking place in the XSLT file. The previous code snippet created an instance of a System.Xml.Xsl.XslCompiledTransform
object named myXslTransform
. The Load
method of this class is used to load the XSLT file you previously reviewed, DisplayThatPuppy.xslt
. The Transform
method takes a source XML file as the first parameter, which in this case was a file containing a serialized FilmOrder_Multiple
object. The second parameter is the destination file created by the transform (ShowIt.html
). The Start
method of the Process
class is used to display the HTML file. This method launches a process that is best suited for displaying the file provided. Basically, the extension of the file dictates which application will be used to display the file. On a typical Windows machine, the program used to display this file is Internet Explorer, as shown in Figure 10-4.
Don't confuse displaying this HTML file with ASP.NET. Displaying an HTML file in this manner takes place on a single machine without the involvement of a Web server. Using ASP.NET is more complex than displaying an HTML page in the default browser.
As demonstrated, the backbone of the System.Xml.Xsl
namespace is the XslCompiledTransform
class. This class uses XSLT files to transform XML documents. The older XslTransform
exposes the following methods and properties:
XmlResolver
— This get/set property is used to specify a class (abstract base class, XmlResolver
) that is used to handle external references (import and include elements within the style sheet). These external references are encountered when a document is transformed (the method, Transform
, is executed). The System.Xml
namespace contains a class, XmlUrlResolver
, which is derived from XmlResolver
. The XmlUrlResolver
class resolves the external resource based on a URI.
Load
— This overloaded method loads an XSLT style sheet to be used in transforming XML documents. It is permissible to specify the XSLT style sheet as a parameter of type XPathNavigator
, filename of XSLT file (specified as parameter type String
), XmlReader
, or IXPathNavigable
. For each type of XSLT supported, an overloaded member is provided that enables an XmlResolver
to also be specified. For example, it is possible to call Load(String, XmlResolver)
, where String
corresponds to a filename and XmlResolver
is an object that handles references in the style sheet of type xsl:import
and xsl:include
. It would also be permissible to pass in a value of Nothing
for the second parameter of the Load
method (so that no XmlResolver
would be specified).
Transform
— This overloaded method transforms a specified XML document using the previously specified XSLT style sheet and an XmlResolver
. The location where the transformed XML is to be output is specified as a parameter to this method. The first parameter of each overloaded method is the XML document to be transformed. This parameter can be represented as an IXPathNavigable
, XML filename (specified as parameter type String
), or XPathNavigator
.
The most straightforward variant of the Transform
method is Transform(String, String, XmlResolver)
. In this case, a file containing an XML document is specified as the first parameter, a filename that receives the transformed XML document is specified as the second parameter, and the XmlResolver
is used as the third parameter. The first XSLT example called the corresponding two-parameter overload (source file and destination file, with no XmlResolver) of the Transform
method:
    myXslTransform.Transform("..\FilmOrders.xml", destFileName)
The first parameter to the Transform
method can also be specified as IXPathNavigable
or XPathNavigator
. Either of these parameter types allows the XML output to be sent to an object of type Stream, TextWriter
, or XmlWriter
. When these two flavors of input are specified, a parameter containing an object of type XsltArgumentList
can be specified. An XsltArgumentList
object contains a list of arguments that are used as input to the transform.
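As a hedged illustration of how an XsltArgumentList feeds a value into a transform, the sketch below passes one parameter to a small in-memory style sheet and writes the result to a TextWriter. It uses XslCompiledTransform with XmlReader input. The style sheet, the minQuantity parameter name, and the second order's quantity are all invented for the demonstration.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module ArgumentListSketch
    Sub Main()
        ' Hypothetical style sheet: lists the names of the orders whose
        ' quantity is at least the value of the minQuantity parameter.
        Dim styleSheet As String = _
            "<xsl:stylesheet version=""1.0""" & _
            " xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">" & _
            "<xsl:output method=""text"" />" & _
            "<xsl:param name=""minQuantity"" select=""0"" />" & _
            "<xsl:template match=""/"">" & _
            "<xsl:for-each select=" & _
            """multiFilmOrders/FilmOrder[quantity &gt;= $minQuantity]"">" & _
            "<xsl:value-of select=""name"" />" & _
            "<xsl:text>;</xsl:text>" & _
            "</xsl:for-each>" & _
            "</xsl:template>" & _
            "</xsl:stylesheet>"

        ' Sample data; the quantity of 4 is made up so the filter has
        ' something to exclude.
        Dim sourceXml As String = _
            "<multiFilmOrders>" & _
            "<FilmOrder><name>Grease</name>" & _
            "<quantity>10</quantity></FilmOrder>" & _
            "<FilmOrder><name>Lawrence of Arabia</name>" & _
            "<quantity>4</quantity></FilmOrder>" & _
            "</multiFilmOrders>"

        Dim transform As New XslCompiledTransform()
        transform.Load(XmlReader.Create(New StringReader(styleSheet)))

        ' AddParam takes the parameter name, its namespace URI, and a value.
        Dim arguments As New XsltArgumentList()
        arguments.AddParam("minQuantity", "", 5.0)

        ' Send the output to a TextWriter instead of a file.
        Dim output As New StringWriter()
        transform.Transform(XmlReader.Create(New StringReader(sourceXml)), _
                            arguments, output)
        Console.WriteLine(output.ToString())
    End Sub
End Module
```

Supplying the threshold through the argument list means the same style sheet can be reused with different values without editing the XSLT.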
When working with a .NET 2.0/3.5 project, it is preferable to use the XslCompiledTransform
object instead of the XslTransform
object, because the XslTransform
object is considered obsolete.
The XslCompiledTransform
object uses the same Load
and Transform
methods to pull the data. The Transform
method provides the following signatures:
    XslCompiledTransform.Transform(IXPathNavigable, XmlWriter)
    XslCompiledTransform.Transform(IXPathNavigable, XsltArgumentList, XmlWriter)
    XslCompiledTransform.Transform(IXPathNavigable, XsltArgumentList, TextWriter)
    XslCompiledTransform.Transform(IXPathNavigable, XsltArgumentList, Stream)
    XslCompiledTransform.Transform(XmlReader, XmlWriter)
    XslCompiledTransform.Transform(XmlReader, XsltArgumentList, XmlWriter)
    XslCompiledTransform.Transform(XmlReader, XsltArgumentList, TextWriter)
    XslCompiledTransform.Transform(XmlReader, XsltArgumentList, Stream)
    XslCompiledTransform.Transform(XmlReader, XsltArgumentList, XmlWriter, XmlResolver)
    XslCompiledTransform.Transform(String, String)
    XslCompiledTransform.Transform(String, XmlWriter)
    XslCompiledTransform.Transform(String, XsltArgumentList, XmlWriter)
    XslCompiledTransform.Transform(String, XsltArgumentList, TextWriter)
    XslCompiledTransform.Transform(String, XsltArgumentList, Stream)
In these signatures, String
represents the location of a file, whether that is the source XML document or the output file. The .xslt file itself is supplied beforehand through the Load method. Some of the signatures also allow for output to XmlWriter
objects, streams, and TextWriter
objects. Most of those overloads also accept additional arguments through an XsltArgumentList
object.
The preceding example used the signature XslCompiledTransform.Transform(String, String)
, which asked for the source file and the destination file (both string representations of the location of said files):
    myXslTransform.Transform("..\FilmOrders.xml", destFileName)
The first example used four XSLT elements to transform an XML file into an HTML file. Such an example has merit, but it doesn't demonstrate an important use of XSLT: transforming XML from one standard into another standard. This may involve renaming elements/attributes, excluding elements/attributes, changing data types, altering the node hierarchy, and representing elements as attributes, and vice versa.
Returning to the example, a case of differing XML standards could easily affect your software that automates movie orders coming into a supplier. Imagine that the software, including its XML representation of a movie order, is so successful that you sell 100,000 copies. However, just as you are celebrating, a consortium of the largest movie supplier chains announces that they are no longer accepting faxed orders and that they are introducing their own standard for the exchange of movie orders between movie sellers and buyers.
Rather than panic, you simply ship an upgrade that includes an XSLT file. This upgrade (a bit of extra code plus the XSLT file) transforms your XML representation of a movie order into the XML representation dictated by the consortium of movie suppliers. Using an XSLT file enables you to ship the upgrade immediately. If the consortium of movie suppliers revises their XML representation, then you are not obliged to change your source code. Instead, you can simply ship the upgraded XSLT file that ensures each movie order document is compliant.
The specific source code that executes the transform is as follows:
    Dim myXslCompiledTransform As XslCompiledTransform = New XslCompiledTransform

    myXslCompiledTransform.Load("..\ConvertLegacyToNewStandard.xslt")
    myXslCompiledTransform.Transform("..\MovieOrdersOriginal.xml", _
                                     "..\MovieOrdersModified.xml")
Those three lines of code accomplish the following:
Create an XslCompiledTransform
object
Use the Load
method to load an XSLT file (ConvertLegacyToNewStandard.xslt
)
Use the Transform
method to transform a source XML file (MovieOrdersOriginal.xml
) into a destination XML file (MovieOrdersModified.xml
)
Recall that the input XML document (MovieOrdersOriginal.xml
) does not match the format required by your consortium of movie supplier chains. The content of this source XML file is as follows:
    <?xml version="1.0" encoding="utf-8" ?>
    <FilmOrder_Multiple>
      <multiFilmOrders>
        <FilmOrder>
          <name>Grease</name>
          <filmId>101</filmId>
          <quantity>10</quantity>
        </FilmOrder>
        ...
      </multiFilmOrders>
    </FilmOrder_Multiple>
The format exhibited in the preceding XML document does not match the format of the consortium of movie supplier chains. To be accepted by the collective of suppliers, you must transform the document as follows:
Remove element <FilmOrder_Multiple>
.
Remove element <multiFilmOrders>
.
Rename element <FilmOrder>
to <DvdOrder>
.
Remove element <name>
(the film's name is not to be contained in the document).
Rename element <quantity>
to HowMuch
and make HowMuch
an attribute of <DvdOrder>
.
Rename element <filmId>
to FilmOrderNumber
and make FilmOrderNumber
an attribute of <DvdOrder>
.
Display attribute HowMuch
before attribute FilmOrderNumber
.
Many of the steps performed by the transform could have been achieved using an alternative technology. For example, you could have used Source Code Style attributes with your serialization to generate the correct XML attribute and XML element name. Had you known in advance that a consortium of suppliers was going to develop a standard, you could have written your classes to be serialized based on the standard. The point is that you did not know and now one standard (your legacy standard) has to be converted into a newly adopted standard of the movie suppliers' consortium. The worst thing you could do would be to change your working code and then force all users working with the application to upgrade. It is vastly simpler to add an extra transformation step to address the new standard.
The XSLT file that facilitates the transform is named ConvertLegacyToNewStandard.xslt
. A portion of this file is implemented as follows:
    <?xml version="1.0" encoding="UTF-8" ?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="FilmOrder">
        <!-- rename <FilmOrder> to <DvdOrder> -->
        <xsl:element name="DvdOrder">
          <!-- Make element 'quantity' attribute HowMuch
               Notice attribute HowMuch comes before attribute FilmOrderNumber -->
          <xsl:attribute name="HowMuch">
            <xsl:value-of select="quantity"></xsl:value-of>
          </xsl:attribute>
          <!-- Make element filmId attribute FilmOrderNumber -->
          <xsl:attribute name="FilmOrderNumber">
            <xsl:value-of select="filmId"></xsl:value-of>
          </xsl:attribute>
        </xsl:element>
        <!-- end of DvdOrder element -->
      </xsl:template>
    </xsl:stylesheet>
In the previous snippet of XSLT, the following XSLT elements are used to facilitate the transformation:
<xsl:template match="FilmOrder">
— All operations in this template
XSLT element take place on the original document's FilmOrder
node.
<xsl:element name="DvdOrder">
— The element corresponding to the source document's FilmOrder
element will be called DvdOrder
in the destination document.
<xsl:attribute name="HowMuch">
— An attribute named HowMuch
will be contained in the previously specified element, <DvdOrder>
. This attribute
XSLT element for HowMuch
comes before the attribute
XSLT element for FilmOrderNumber
. This order was specified as part of your transform to adhere to the new standard.
<xsl:value-of select='quantity'>
— Retrieve the value of the source document's <quantity>
element and place it in the destination document. This instance of XSLT element value-of
provides the value associated with the attribute HowMuch
.
Two new XSLT elements have crept into your vocabulary: element
and attribute
. Both of these XSLT elements live up to their names. Specifying the XSLT element named element
places an element in the destination XML document. Specifying the XSLT element named attribute
places an attribute in the destination XML document. The XSLT transform found in ConvertLegacyToNewStandard.xslt
is too long to review here. When reading this file in its entirety, remember that this XSLT file contains inline documentation to specify precisely what aspect of the transformation is being performed at which location in the XSLT document. For example, the following XML code comments indicate what the XSLT element attribute
is about to do:
    <!-- Make element 'quantity' attribute HowMuch
         Notice attribute HowMuch comes before attribute FilmOrderNumber -->
    <xsl:attribute name="HowMuch">
      <xsl:value-of select='quantity'></xsl:value-of>
    </xsl:attribute>
The preceding example spans several pages but contains just three lines of code. This demonstrates that there is more to XML than learning how to use it in Visual Basic and the .NET Framework. Among other things, you also need a good understanding of XSLT, XPath, and XQuery.
We just took a good look at XSLT and the System.Xml.Xsl
namespace, but there is a lot more to it than that. Other classes and interfaces exposed by the System.Xml.Xsl
namespace include the following:
IXsltContextFunction
— This interface accesses at runtime a given function defined in the XSLT style sheet.
IXsltContextVariable
— This interface accesses at runtime a given variable defined in the XSLT style sheet.
XsltArgumentList
— This class contains a list of arguments. These arguments are XSLT parameters or XSLT extension objects. The XsltArgumentList
object is used in conjunction with the Transform
method of XslTransform
.
XsltContext
— This class contains the state of the XSLT processor. This context information enables XPath expressions to have their various components resolved (functions, parameters, and namespaces).
XsltException, XsltCompileException
— These classes contain the information pertaining to an exception raised while transforming data. XsltCompileException
is derived from XsltException
.
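The exception classes above come into play when a style sheet cannot be loaded or applied. The following sketch shows one way to report such a failure. The broken style sheet is invented for the demonstration, and the handler catches the base class XsltException so that it also covers XsltCompileException.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module XsltErrorSketch
    Sub Main()
        ' Deliberately invalid style sheet: xsl:value-of is missing
        ' its required select attribute.
        Dim badStyleSheet As String = _
            "<xsl:stylesheet version=""1.0""" & _
            " xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">" & _
            "<xsl:template match=""/""><xsl:value-of /></xsl:template>" & _
            "</xsl:stylesheet>"

        Dim transform As New XslCompiledTransform()
        Try
            transform.Load(XmlReader.Create(New StringReader(badStyleSheet)))
            Console.WriteLine("Style sheet loaded.")
        Catch ex As XsltException
            ' XsltCompileException derives from XsltException, so this
            ' single handler covers both classes.
            Console.WriteLine("XSLT error at line " & ex.LineNumber & _
                              ", position " & ex.LinePosition & ": " & ex.Message)
        End Try
    End Sub
End Module
```

The LineNumber and LinePosition properties point into the style sheet, which is usually enough to locate the offending XSLT element.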
ADO.NET enables Visual Basic applications to generate XML documents and use such documents to update persisted data. ADO.NET natively represents its DataSet
's underlying data store in XML. ADO.NET also enables SQL Server-specific XML support to be accessed. This chapter focuses on those features of ADO.NET that enable the XML generated and consumed to be customized. ADO.NET is covered in detail in Chapter 9.
The DataSet
properties and methods that are pertinent to XML include Namespace, Prefix, GetXml, GetXmlSchema, InferXmlSchema, ReadXml, ReadXmlSchema, WriteXml
, and WriteXmlSchema
. An example of code that uses the GetXml
method is shown here:
    Dim adapter As New _
        SqlClient.SqlDataAdapter("SELECT ShipperID, CompanyName, Phone " & _
                                 "FROM Shippers", _
                                 "SERVER=localhost;UID=sa;PWD=sa;Database=Northwind;")
    Dim ds As New DataSet()

    adapter.Fill(ds)
    Console.Out.WriteLine(ds.GetXml())
The preceding code uses the sample Northwind
database, retrieving all rows from the Shippers
table. This table was selected because it contains only three rows of data.
The examples in this section use the Northwind.mdf SQL Server Express database file. To get this database, search for "Northwind and pubs Sample Databases for SQL Server 2000." You can find it at www.microsoft.com/downloads/details.aspx?familyid=06616212-0356-46a0-8da2-eebc53a68034&displaylang=en. Once you've installed it, you'll find the Northwind.mdf file in the C:\SQL Server 2000 Sample Databases directory. To add this database to your application, right-click on the solution you are working with and select Add Existing Item. From the dialog that appears, browse to the location of the Northwind.mdf file that you just installed. If you have trouble getting permissions to work with the database, make a data connection to the file from the Visual Studio Server Explorer. You will be asked whether you want to be made an appropriate user of the database, and Visual Studio will make the necessary changes on your behalf. When the file is added, the Data Source Configuration Wizard appears. For the purposes of this chapter, simply click the Cancel button when you encounter this dialog.
The XML returned by GetXml
is as follows (where ...
signifies that <Table>
elements were removed for the sake of brevity):
<NewDataSet>
  <Table>
    <ShipperID>1</ShipperID>
    <CompanyName>Speedy Express</CompanyName>
    <Phone>(503) 555-9831</Phone>
  </Table>
  ...
</NewDataSet>
What you are trying to determine from this XML document is how to customize the XML generated. The more customization you can perform at the ADO.NET level, the less will be needed later. With this in mind, note that the root element is <NewDataSet>
and that each row of the DataSet
is returned as an XML element, <Table>
. The data returned is contained in an XML element named for the column in which the data resides (<ShipperID>, <CompanyName>
, and <Phone>
, respectively).
The root element, <NewDataSet>
, is just the default name of the DataSet
. This name could have been changed when the DataSet
was constructed by specifying the name as a parameter to the constructor:
Dim ds As New DataSet("WeNameTheDataSet")
If the previous version of the constructor were executed, then the <NewDataSet>
element would be renamed <WeNameTheDataSet>
. After the DataSet
has been constructed, you can still set the property DataSetName
, thus changing <NewDataSet>
to a name such as <WeNameTheDataSetAgain>
:
ds.DataSetName = "WeNameTheDataSetAgain"
The <Table>
element is actually the name of a table in the DataSet
's Tables
property. Programmatically, you can change <Table>
to <WeNameTheTable>
:
ds.Tables("Table").TableName = "WeNameTheTable"
You can customize the names of the data columns returned by modifying the SQL to use alias names. For example, you could retrieve the same data but generate different elements using the following SQL code:
SELECT ShipperID As TheID, CompanyName As CName, Phone As TelephoneNumber FROM Shippers
Using the preceding SQL statement, the <ShipperID>
element would become the <TheID>
element. The <CompanyName>
element would become <CName>
, and <Phone>
would become <TelephoneNumber>
. The column names can also be changed programmatically by using the Columns
property associated with the table in which the column resides. An example of this follows, where the XML element <TheID>
is changed to <AnotherNewName>
:
ds.Tables("WeNameTheTable").Columns("TheID").ColumnName = "AnotherNewName"
This XML could be transformed using System.Xml.Xsl
. It could be read as a stream (XmlTextReader
) or written as a stream (XmlTextWriter
). The XML returned by ADO.NET could even be deserialized and used to create an object or objects using XmlSerializer
. The point is to recognize what ADO.NET-generated XML looks like. If you know its format, then you can transform it into whatever you like.
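The naming customizations described above can be tried without a database connection. The following sketch builds a small in-memory DataSet in place of the Northwind query (the single row is copied from the Shippers output shown earlier), renames the table and one column, and prints the resulting XML. The module name is an assumption made for illustration.

```vb
Imports System
Imports System.Data

Module DataSetNamingSketch
    Sub Main()
        ' Name the DataSet in the constructor; this becomes the root element.
        Dim ds As New DataSet("WeNameTheDataSet")

        ' Build a small in-memory table instead of querying Northwind.
        Dim shippers As DataTable = ds.Tables.Add("Table")
        shippers.Columns.Add("ShipperID", GetType(Integer))
        shippers.Columns.Add("CompanyName", GetType(String))
        shippers.Columns.Add("Phone", GetType(String))
        shippers.Rows.Add(1, "Speedy Express", "(503) 555-9831")

        ' Rename the table and one column; GetXml reflects the new names.
        ds.Tables("Table").TableName = "WeNameTheTable"
        ds.Tables("WeNameTheTable").Columns("ShipperID").ColumnName = "TheID"

        Console.Out.WriteLine(ds.GetXml())
    End Sub
End Module
```

Because every name is set on the DataSet itself, no post-processing of the XML is needed to obtain the customized element names.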
Those interested in fully exploring the XML-specific features of SQL Server should take a look at Professional SQL Server 2000 Programming by Robert Vieira (Wrox Press, 2000). However, because the content of that book is not .NET-specific, the next example forms a bridge between Professional SQL Server 2000 Programming and the .NET Framework.
Two of the major XML-related features exposed by SQL Server are as follows:
FOR XML
— The FOR XML
clause of an SQL SELECT
statement enables a rowset to be returned as an XML document. The XML document generated by a FOR XML
clause is highly customizable with respect to the document hierarchy generated, per-column data transforms, representation of binary data, XML schema generated, and a variety of other XML nuances.
OPENXML
— The OPENXML
extension to Transact-SQL enables a stored procedure call to manipulate an XML document as a rowset. Subsequently, this rowset can be used to perform a variety of tasks, such as SELECT, INSERT INTO, DELETE
, and UPDATE
.
SQL Server's support for OPENXML
is a matter of calling a stored procedure. A developer who can execute a stored procedure call using Visual Basic in conjunction with ADO.NET can take full advantage of SQL Server's support for OPENXML. FOR XML
queries have a certain caveat when it comes to ADO.NET. To understand this caveat, consider the following FOR XML
query:
SELECT ShipperID, CompanyName, Phone FROM Shippers FOR XML RAW
Using SQL Server's Query Analyzer, this FOR XML RAW
query generated the following XML:
<row ShipperID="1" CompanyName="Speedy Express" Phone="(503) 555-9831" />
<row ShipperID="2" CompanyName="United Package" Phone="(503) 555-3199" />
<row ShipperID="3" CompanyName="Federal Shipping" Phone="(503) 555-9931" />
The same FOR XML RAW
query can be executed from ADO.NET as follows:
Dim adapter As New _
    SqlDataAdapter("SELECT ShipperID, CompanyName, Phone " & _
                   "FROM Shippers FOR XML RAW", _
                   "SERVER=localhost;UID=sa;PWD=sa;Database=Northwind;")
Dim ds As New DataSet
adapter.Fill(ds)
Console.Out.WriteLine(ds.GetXml())
The caveat with respect to a FOR XML
query is that all data (the XML text) must be returned via a result set containing a single row and a single column named XML_F52E2B61-18A1-11d1-B105-00805F49916B
.
The output from the preceding code snippet demonstrates this caveat (where ... represents similar data not shown for reasons of brevity):
<NewDataSet>
  <Table>
    <XML_F52E2B61-18A1-11d1-B105-00805F49916B>&lt;row ShipperID="1" CompanyName="Speedy Express" Phone="(503) 555-9831"/&gt; ... </XML_F52E2B61-18A1-11d1-B105-00805F49916B>
  </Table>
</NewDataSet>
The value of the single row and single column returned contains what looks like XML, but it contains &lt; instead of the less-than character, and &gt; instead of the greater-than character. Markup characters such as < cannot appear literally inside XML character data, so the symbols < and > are entity-encoded; that is, represented as &lt; and &gt;. The data returned in element <XML_F52E2B61-18A1-11d1-B105-00805F49916B> is not XML, but data contained in an XML document.
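The encoding behavior can be reproduced without SQL Server. In this sketch, a string that merely looks like markup is stored in a DataSet column, and GetXml writes it out as character data. The column name XmlText is hypothetical, chosen only for illustration.

```vb
Imports System
Imports System.Data

Module EntityEncodingSketch
    Sub Main()
        Dim ds As New DataSet()
        Dim table As DataTable = ds.Tables.Add("Table")

        ' Hypothetical column holding text that looks like XML.
        table.Columns.Add("XmlText", GetType(String))
        table.Rows.Add("<row ShipperID=""1"" />")

        ' The value is character data, not markup, so the angle brackets
        ' should appear entity-encoded in the generated XML.
        Console.Out.WriteLine(ds.GetXml())
    End Sub
End Module
```

The DataSet has no way of knowing that the string is meant to be markup, which is why a different API is needed to get the FOR XML results back as real XML.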
To fully utilize FOR XML
queries, the data must be accessible as XML. The solution to this quandary is the ExecuteXmlReader method of the SqlCommand class. When this method is called, the SqlCommand object assumes that its command text is a FOR XML query and returns the results of this query as an XmlReader object. An example of this follows:
Dim connection As New _
    SqlConnection("SERVER=localhost;UID=sa;PWD=sa;Database=Northwind;")
Dim command As New _
    SqlCommand("SELECT ShipperID, CompanyName, Phone " & _
               "FROM Shippers FOR XML RAW")
connection.Open()
command.Connection = connection
' ExecuteXmlReader returns the FOR XML results as an XmlReader
Dim reader As XmlReader = command.ExecuteXmlReader()
' Extract results from the XmlReader, then close it and the connection
You will need to import the System.Data.SqlClient and System.Xml namespaces for this example to work.
The XmlReader returned by ExecuteXmlReader streams the XML from the open connection, so the connection stays busy until the reader is closed. The XML can be traversed using the methods and properties exposed by XmlReader. Streaming XML generation and retrieval was discussed earlier.
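To illustrate the step marked "Extract results" in the preceding code, here is a minimal sketch of walking FOR XML RAW output with an XmlReader. It does not connect to SQL Server: a hard-coded string stands in for the reader returned by ExecuteXmlReader, and because FOR XML RAW output has no single root element, the stand-in reader is created with ConformanceLevel.Fragment.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module ForXmlRawReaderSketch
    Sub Main()
        ' Stand-in for the FOR XML RAW result: a fragment with no root element.
        Dim fragment As String = _
            "<row ShipperID=""1"" CompanyName=""Speedy Express"" />" & _
            "<row ShipperID=""2"" CompanyName=""United Package"" />"

        ' Allow multiple top-level elements.
        Dim settings As New XmlReaderSettings()
        settings.ConformanceLevel = ConformanceLevel.Fragment

        Using reader As XmlReader = XmlReader.Create(New StringReader(fragment), settings)
            While reader.Read()
                ' Each row arrives as one <row> element with attribute values.
                If reader.NodeType = XmlNodeType.Element AndAlso reader.Name = "row" Then
                    Console.WriteLine("{0}: {1}", _
                        reader.GetAttribute("ShipperID"), _
                        reader.GetAttribute("CompanyName"))
                End If
            End While
        End Using
    End Sub
End Module
```

The same While loop should apply unchanged to a reader obtained from ExecuteXmlReader, provided the connection remains open while the loop runs.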
Using the ExecuteXmlReader
method of the SqlCommand
class, it is possible to retrieve the result of FOR XML
queries. What makes the FOR XML
style of queries so powerful is that it can configure the data retrieved. The three types of FOR XML
queries support the following forms of XML customization:
FOR XML RAW
— This type of query returns each row of a result set inside an XML element named <row>
. The data retrieved is contained as attributes of the <row>
element. The attributes are named for the column name or column alias in the FOR XML RAW
query.
FOR XML AUTO
— By default, this type of query returns each row of a result set inside an XML element named for the table or table alias contained in the FOR XML AUTO
query. The data retrieved is contained as attributes of this element. The attributes are named for the column name or column alias in the FOR XML AUTO
query. By specifying FOR XML AUTO, ELEMENTS
, it is possible to retrieve all data inside elements, rather than inside attributes. All data retrieved must be in attribute or element form. There is no mix-and-match capability.
FOR XML EXPLICIT
— This form of the FOR XML
query enables the precise XML type of each column returned to be specified. The data associated with a column can be returned as an attribute or an element. Specific XML types, such as CDATA
and ID
, can be associated with a column returned. Even the level in the XML hierarchy in which data resides can be specified using a FOR XML EXPLICIT
query. This style of query is fairly complicated to implement.
FOR XML
queries are flexible. Using FOR XML EXPLICIT
and the movie rental database, it would be possible to generate any form of XML movie order standard. The decision that needs to be made is where XML configuration takes place. Using Visual Basic, a developer could use XmlTextReader
and XmlTextWriter
to create any style of XML document. Using the XSLT language and an XSLT file, the same level of configuration can be achieved. SQL Server and, in particular, FOR XML EXPLICIT
, enable the same level of XML customization, but this customization takes place at the SQL level and may even be encapsulated in stored procedure calls.
As a representation for data, XML is ideal in that it is a self-describing data format that enables you to provide your data sets as complex data types. It also provides order to your data. SQL Server 2005 embraces this direction.
More and more developers are turning to XML as a means of data storage. For instance, Microsoft Office enables documents to be saved and stored as XML documents. As an increasing number of products and solutions turn toward XML as a means of storage, this allows for a separation between the underlying data and the presentation aspect of what is being viewed. XML is also being used as a means of communicating data sets across platforms and the enterprise. The entire XML Web Services story is a result of this new capability. Simply said, XML is a powerful alternative to your data storage solutions.
Just remember that the power of using XML isn't only about storing data as XML somewhere (whether that is XML files or not); it is also about the capability to quickly access this XML data and to be able to query the data that is retrieved.
SQL Server 2005 makes a big leap toward XML in adding an XML data type as an option. This enables you to unify the relational aspects of the database and the current desires to work with XML data.
FOR XML
has also been expanded from within this latest edition of SQL Server. This includes a new TYPE
directive that returns an XML data type instance. In addition, the .NET Framework 2.0 introduced a new class, SqlXml, in the System.Data.SqlTypes namespace, that enables you to easily work with the XML data that comes from SQL Server 2005. The SqlXml object exposes its content as an XmlReader through its CreateReader method. Another addition is the SqlDataReader object's GetSqlXml method.
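The following sketch shows one way to read a SqlXml value. SqlXml lives in the System.Data.SqlTypes namespace and hands back its content through CreateReader. To keep the example self-contained, the value is constructed from a hard-coded, hypothetical XML string rather than read from an xml column in SQL Server.

```vb
Imports System
Imports System.Data.SqlTypes
Imports System.IO
Imports System.Xml

Module SqlXmlSketch
    Sub Main()
        ' Stand-in for a value read from a SQL Server xml column.
        Dim source As XmlReader = XmlReader.Create( _
            New StringReader("<Shipper id=""1"">Speedy Express</Shipper>"))
        Dim xmlValue As New SqlXml(source)

        ' CreateReader exposes the stored XML as an XmlReader.
        Using reader As XmlReader = xmlValue.CreateReader()
            reader.MoveToContent()
            Console.WriteLine(reader.GetAttribute("id"))
            Console.WriteLine(reader.ReadElementContentAsString())
        End Using
    End Sub
End Module
```

When only the raw text is needed, the Value property of SqlXml returns the XML as a string instead.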
SQL Server 2008 continues on this path and introduces some new XML features. First, it supports lax validation using XSD schemas. This wasn't possible prior to this release. Another big change is related to how SQL Server handles the storage of dateTime
values. In SQL Server 2005, when you stored dateTime
values, the database would first normalize everything to UTC time, regardless of whether or not you wanted to store the information in a specific time zone. In addition, if you excluded the time in your dateTime
declaration, SQL Server 2005 would add it back for you so that there was a full dateTime
stored within the database. SQL Server 2008, conversely, enables you to store the dateTime
value exactly as you declared it. No modifications or alterations are made to your value as it is stored in the database.
Another new feature of SQL Server 2008 is support of union types that contain list types. This means that you can now work with elements such as the following:
<Stocks>INTC MSFT CSCO IBM RTRSY</Stocks>
Union types enable you to define multiple items within a single element with a space between the elements, rather than define each as separate elements, as shown here:
<Stocks>
  <Item>INTC</Item>
  <Item>MSFT</Item>
  <Item>CSCO</Item>
  <Item>IBM</Item>
  <Item>RTRSY</Item>
</Stocks>
Most Microsoft-focused Web developers have usually concentrated on either Microsoft SQL Server or Microsoft Access for their data storage needs. Today, however, a considerable amount of data is stored in XML format, so considerable inroads have been made in improving Microsoft's core Web technology to work easily with this format.
ASP.NET 3.5 contains a series of data source controls designed to bridge the gap between your data stores (such as XML) and the data-bound controls at your disposal. These new data controls not only enable you to retrieve data from various data stores, they also enable you to easily manipulate the data (using paging, sorting, editing, and filtering) before the data is bound to an ASP.NET server control.
With XML being as important as it is, a specific data source control is available in ASP.NET just for retrieving and working with XML data: XmlDataSource
. This control enables you to connect to your XML data and use this data with any of the ASP.NET data-bound controls. Just like the SqlDataSource
and the ObjectDataSource
controls (which are two of the other data source controls), the XmlDataSource
control enables you to not only retrieve data, but also insert, delete, and update data items. With increasing numbers of users turning to XML data formats, such as Web services, RSS feeds, and more, this control is a valuable resource for your Web applications.
To show the XmlDataSource
control in action, first create a simple XML file and include this file in your application. The following code reflects a simple XML file of Russian painters:
<?xml version="1.0" encoding="utf-8" ?>
<Artists>
  <Painter name="Vasily Kandinsky">
    <Painting>
      <Title>Composition No. 218</Title>
      <Year>1919</Year>
    </Painting>
  </Painter>
  <Painter name="Pavel Filonov">
    <Painting>
      <Title>Formula of Spring</Title>
      <Year>1929</Year>
    </Painting>
  </Painter>
  <Painter name="Pyotr Konchalovsky">
    <Painting>
      <Title>Sorrento Garden</Title>
      <Year>1924</Year>
    </Painting>
  </Painter>
</Artists>
Now that the Painters.xml
file is in place, the next step is to use an ASP.NET DataList
control and connect this DataList
control to an <asp:XmlDataSource>
control, as shown here:
<%@ Page Language="VB"%>
<html xmlns="http://www.w3.org/1999/xhtml" >
<head runat="server">
    <title>XmlDataSource</title>
</head>
<body>
    <form id="form1" runat="server">
        <asp:DataList ID="DataList1" Runat="server" DataSourceID="XmlDataSource1">
            <ItemTemplate>
                <p><b><%# XPath("@name") %></b><br />
                <i><%# XPath("Painting/Title") %></i><br />
                <%# XPath("Painting/Year") %></p>
            </ItemTemplate>
        </asp:DataList>
        <asp:XmlDataSource ID="XmlDataSource1" Runat="server"
            DataFile="~/Painters.xml" XPath="Artists/Painter">
        </asp:XmlDataSource>
    </form>
</body>
</html>
This is a simple example, but it shows you the power and ease of using the XmlDataSource
control. Pay attention to two attributes in this example. The first is the DataFile
attribute. This attribute points to the location of the XML file. Because the file resides in the root directory of the application, it is simply ~/Painters.xml
. The next attribute included in the XmlDataSource
control is the XPath
attribute. The XmlDataSource
control uses XPath for the filtering of XML data. In this case, the XmlDataSource
control is taking everything within the <Painter>
set of elements. The value Artists/Painter
means that the XmlDataSource
control navigates to the <Artists>
element and then to the <Painter>
element within the specified XML file.
The DataList
control next must specify the DataSourceID
as the XmlDataSource
control. In the <ItemTemplate>
section of the DataList
control, you can retrieve specific values from the XML file by using XPath commands. The XPath commands filter the data from the XML file. The first value retrieved is an element attribute (name
) contained in the <Painter>
element. When you retrieve an attribute of an element, you preface the name of the attribute with an @
symbol. In this case, you simply specify @name
to get the painter's name. The next two XPath commands go deeper into the XML file, getting the specific painting and the year of the painting. Remember to separate nodes with a /
. When run in the browser, this code produces the results shown in Figure 10-5.
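The XPath expressions used by the page can also be tried outside ASP.NET. This sketch evaluates the same expressions with XmlDocument in a console module, using one painter from Painters.xml as inline data; it illustrates the XPath expressions themselves and is not a description of how the XmlDataSource control is implemented.

```vb
Imports System
Imports System.Xml

Module PainterXPathSketch
    Sub Main()
        Dim doc As New XmlDocument()
        doc.LoadXml( _
            "<Artists>" & _
            "<Painter name=""Vasily Kandinsky"">" & _
            "<Painting><Title>Composition No. 218</Title><Year>1919</Year></Painting>" & _
            "</Painter>" & _
            "</Artists>")

        ' Same expression as the XPath attribute on the XmlDataSource control.
        For Each painter As XmlNode In doc.SelectNodes("Artists/Painter")
            ' Same expressions as the XPath() calls in the ItemTemplate,
            ' each evaluated relative to the current <Painter> element.
            Console.WriteLine(painter.SelectSingleNode("@name").Value)
            Console.WriteLine(painter.SelectSingleNode("Painting/Title").InnerText)
            Console.WriteLine(painter.SelectSingleNode("Painting/Year").InnerText)
        Next
    End Sub
End Module
```

Testing expressions this way can be quicker than reloading a page when a data-bound control renders nothing.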
Besides working from static XML files such as the Painters.xml
file, the XmlDataSource
control can also work from dynamic, URL-accessible XML files. One popular source of XML on the Internet today is the RSS feed published by blogs, or weblogs. Blogs, or personal diaries, can be viewed in the browser, through an RSS aggregator, or as pure XML.
Figure 10-6 shows blog entries directly in the browser (if you are using IE7). Behind this blog is an actual XML document that can be worked with by your code. You can find a lot of blogs to play with for this example at weblogs.asp.net
. This screen shot uses the blog found at www.geekswithblogs.net/evjen
.
Now that you know the location of the XML from the blog, you can use this XML with the XmlDataSource
control and display some of the results in a DataList
control. The code for this example is shown here:
<%@ Page Language="VB"%>
<html xmlns="http://www.w3.org/1999/xhtml" >
<head runat="server">
    <title>XmlDataSource</title>
</head>
<body>
    <form id="form1" runat="server">
        <asp:DataList ID="DataList1" Runat="server" DataSourceID="XmlDataSource1">
            <HeaderTemplate>
                <table border="1" cellpadding="3">
            </HeaderTemplate>
            <ItemTemplate>
                <tr><td><b><%# XPath("title") %></b><br />
                <i><%# XPath("pubDate") %></i><br />
                <%# XPath("description") %></td></tr>
            </ItemTemplate>
            <AlternatingItemTemplate>
                <tr bgcolor="LightGrey"><td><b><%# XPath("title") %></b><br />
                <i><%# XPath("pubDate") %></i><br />
                <%# XPath("description") %></td></tr>
            </AlternatingItemTemplate>
            <FooterTemplate>
                </table>
            </FooterTemplate>
        </asp:DataList>
        <asp:XmlDataSource ID="XmlDataSource1" Runat="server"
            DataFile="http://geekswithblogs.net/evjen/Rss.aspx"
            XPath="rss/channel/item">
        </asp:XmlDataSource>
    </form>
</body>
</html>
This example shows that the DataFile
points to a URL where the XML is retrieved. The XPath
property selects all the <item>
elements from the RSS feed. The DataList
control creates an HTML table and pulls out specific data elements from the RSS feed, such as the <title>, <pubDate>
, and <description>
elements.
Running this page in the browser results in something similar to what is shown in Figure 10-7.
This approach also works with XML Web Services, even those for which you can pass in parameters using HTTP-GET
. You just set up the DataFile
value in the following manner:
DataFile="http://www.someserver.com/GetWeather.asmx/ZipWeather?zipcode=63301"
One big issue with using the XmlDataSource
control is that when using the XPath capabilities of the control, it is unable to understand namespace-qualified XML. The XmlDataSource
control chokes on any XML data that contains namespaces, so it is important to yank out any prefixes and namespaces contained in the XML.
To make this a bit easier, the XmlDataSource
control includes the TransformFile
attribute. This attribute takes your XSLT transform file, which can be applied to the XML pulled from the XmlDataSource
control. That means you can use an XSLT file, which will transform your XML in such a way that the prefixes and namespaces are completely removed from the overall XML document. An example of this XSLT document is illustrated here:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:template match="*">
        <!-- Remove any prefixes -->
        <xsl:element name="{local-name()}">
            <!-- Work through attributes -->
            <xsl:for-each select="@*">
                <!-- Remove any attribute prefixes -->
                <xsl:attribute name="{local-name()}">
                    <xsl:value-of select="."/>
                </xsl:attribute>
            </xsl:for-each>
            <xsl:apply-templates/>
        </xsl:element>
    </xsl:template>
</xsl:stylesheet>
Now, with this XSLT document in place within your application, you can use the XmlDataSource
control to pull XML data and strip that data of any prefixes and namespaces:
<asp:XmlDataSource ID="XmlDataSource1" runat="server"
    DataFile="NamespaceFilled.xml"
    TransformFile="~/RemoveNamespace.xsl"
    XPath="ItemLookupResponse/Items/Item"></asp:XmlDataSource>
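To see what the namespace-stripping style sheet does to a document, it can be run directly with XslCompiledTransform. The sketch below uses a condensed copy of the style sheet and a hypothetical namespace-qualified input (the urn:example namespace and the element names are invented for illustration); the output should contain the same elements and attributes without the a: prefix.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module RemoveNamespaceSketch
    Sub Main()
        ' Condensed copy of the prefix-removing style sheet.
        Dim styleSheet As String = _
            "<xsl:stylesheet version='1.0' " & _
            "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" & _
            "<xsl:output method='xml' omit-xml-declaration='yes'/>" & _
            "<xsl:template match='*'>" & _
            "<xsl:element name='{local-name()}'>" & _
            "<xsl:for-each select='@*'>" & _
            "<xsl:attribute name='{local-name()}'>" & _
            "<xsl:value-of select='.'/></xsl:attribute>" & _
            "</xsl:for-each>" & _
            "<xsl:apply-templates/>" & _
            "</xsl:element></xsl:template></xsl:stylesheet>"

        ' Hypothetical namespace-qualified input document.
        Dim xmlText As String = _
            "<a:Items xmlns:a='urn:example'>" & _
            "<a:Item a:id='1'>Book</a:Item></a:Items>"

        Dim transform As New XslCompiledTransform()
        Using sheetReader As XmlReader = XmlReader.Create(New StringReader(styleSheet))
            transform.Load(sheetReader)
        End Using

        ' No parameters are needed, so an empty argument list is passed.
        Using source As XmlReader = XmlReader.Create(New StringReader(xmlText))
            transform.Transform(source, New XsltArgumentList(), Console.Out)
        End Using
    End Sub
End Module
```

Once the prefixes are gone, a plain XPath value such as Items/Item can address the data without any namespace handling.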
Since the very beginning of ASP.NET, there has always been a server control called the Xml
server control. This control performs the simple operation of XSLT transformation upon an XML document. The control is easy to use: All you do is point to the XML file you wish to transform using the DocumentSource
attribute, and the XSLT transform file using the TransformSource
attribute.
To see this in action, use the Painters.xml
file shown earlier. Create your XSLT transform file, as shown in the following example:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
        <html>
            <body>
                <h3>List of Painters &amp; Paintings</h3>
                <table border="1">
                    <tr bgcolor="LightGrey">
                        <th>Name</th>
                        <th>Painting</th>
                        <th>Year</th>
                    </tr>
                    <xsl:apply-templates select="//Painter"/>
                </table>
            </body>
        </html>
    </xsl:template>
    <xsl:template match="Painter">
        <tr>
            <td>
                <xsl:value-of select="@name"/>
            </td>
            <td>
                <xsl:value-of select="Painting/Title"/>
            </td>
            <td>
                <xsl:value-of select="Painting/Year"/>
            </td>
        </tr>
    </xsl:template>
</xsl:stylesheet>
With the XML document and the XSLT document in place, the final step is to combine the two using the Xml
server control provided by ASP.NET:
<%@ Page Language="VB" %>
<html xmlns="http://www.w3.org/1999/xhtml" >
<head id="Head1" runat="server">
    <title>XmlDataSource</title>
</head>
<body>
    <form id="form1" runat="server">
        <asp:Xml ID="Xml1" runat="server"
            DocumentSource="~/Painters.xml"
            TransformSource="~/PaintersTransform.xsl"></asp:Xml>
    </form>
</body>
</html>
The result is shown in Figure 10-8.
Ultimately, XML could be the underpinning of electronic commerce, banking transactions, and data exchange of almost every conceivable kind. The beauty of XML is that it isolates data representation from data display. Technologies such as HTML contain data that is tightly bound to its display format. XML does not suffer this limitation, and at the same time it has the readability of HTML. Accordingly, the XML facilities available to a Visual Basic application are vast, and a large number of XML-related features, classes, and interfaces are exposed by the .NET Framework.
This chapter showed you how to use System.Xml.Serialization.XmlSerializer
to serialize classes. Source Code Style attributes were introduced in conjunction with serialization. This style of attributes enables the customization of the XML serialized to be extended to the source code associated with a class. What is important to remember about the direction of serialization classes is that a required change in the XML format becomes a change in the underlying source code. Developers should resist the temptation to rewrite serialized classes in order to conform to some new XML data standard (such as the example movie order format endorsed by your consortium of movie rental establishments). Technologies such as XSLT, exposed via the System.Xml.Xsl
namespace, should be examined first as alternatives. This chapter demonstrated how to use XSLT style sheets to transform XML data using the classes found in the System.Xml.Xsl
namespace.
The most useful classes and interfaces in the System.Xml
namespace were reviewed, including those that support document-style XML access: XmlDocument, XmlNode, XmlElement
, and XmlAttribute
. The System.Xml
namespace also contains classes and interfaces that support stream-style XML access: XmlReader
and XmlWriter
.
Finally, you looked at Microsoft's SQL Server 2005, 2008, and XQuery, as well as how to use XML with ASP.NET 3.5. The next chapter takes a look at LINQ, one of the biggest new features related to how the .NET Framework 3.5 works with XML. LINQ provides a new means of querying your data; LINQ to SQL, for example, is a lightweight façade over ADO.NET. You will likely find that the new LINQ to XML is a great way to work with XML.