Significant API Design Enhancements

After a few years of experience with its W3C XML DOM API, Microsoft identified several key areas as inconveniences, annoyances, or weaknesses in the original API. To combat these issues, the following areas have been addressed:

  • XML tree construction

  • Document centricity

  • Namespaces and prefixes

  • Node value extraction

Each of these problem domains has been a stumbling block to working with XML. Not only have these issues made XML code bloated and often unintentionally obfuscated, but they also had to be addressed for XML to work seamlessly with LINQ queries. For example, if you want to use projection to return XML from a LINQ query, it's a bit of a problem if you can't instantiate an element with the new operator. This limitation of the existing XML API had to be addressed in order for LINQ to be practical with XML. Let's take a look at each of these problem areas and how they have been addressed in the new LINQ to XML API.

XML Tree Construction Simplified with Functional Construction

When reading the first sample code of the previous chapter, Listing 6-1, it becomes clear that it is very difficult to determine the XML schema from looking at the code that creates the XML tree. The code is also very verbose. After creating the XML document, we must create some type of XML node such as an element, set its value, and append it to its parent element. However, each of those three steps must be performed individually using the W3C DOM API. This leads to an obfuscated schema and a lot of code. The API just doesn't support creating an element, or any other type of node, in place in the XML tree with respect to its parent, and initializing it, all in a single operation.

The LINQ to XML API not only provides the same ability to create the XML tree as the W3C DOM does, but it also provides a new technique known as functional construction to create an XML tree. Functional construction allows the schema to be dictated as the XML objects are constructed and the values initialized all at the same time in a single statement. The API accomplishes this by providing constructors for the new API's XML objects that accept either a single object or multiple objects, which specify its value. The type of object, or objects, being added determines where in the schema the added object belongs. The pattern looks like this:

XMLOBJECT o =
  new XMLOBJECT(OBJECTNAME,
                XMLOBJECT1,
                XMLOBJECT2,
                ...
                XMLOBJECTN);

NOTE

The preceding code is merely pseudocode meant to illustrate a pattern. None of the classes referenced in the pseudocode actually exists; each just represents some conceptually abstract XML class.

If you add an XML attribute, which is implemented with the LINQ to XML XAttribute class, to an element, implemented with the XElement class, the attribute becomes an attribute of the element. For example, if XMLOBJECT1 in the previous pseudocode is added to the newly created XMLOBJECT named o, and o is an XElement, and XMLOBJECT1 is an XAttribute, XMLOBJECT1 becomes an attribute of XElement o.

If you add an XElement to an XElement, the added XElement becomes a child element of the element to which it is added. So for example, if XMLOBJECT1 is an element and o is an element, XMLOBJECT1 becomes a child element of o.

When we instantiate an XMLOBJECT, as indicated in the previous pseudocode, we can specify its contents by specifying 1 to N XMLOBJECTs. As you will learn later in the section titled "Creating Text with XText," you can even specify its contents to include a string, because that string will be automatically converted to an XMLOBJECT for you.

This makes complete sense and is at the heart of functional construction. Listing 7-1 shows an example.

Listing 7-1. Using Functional Construction to Create an XML Schema
XElement xBookParticipant =
  new XElement("BookParticipant",
    new XElement("FirstName", "Joe"),
    new XElement("LastName", "Rattz"));

Console.WriteLine(xBookParticipant.ToString());

Notice that when I constructed the element named BookParticipant, I passed two XElement objects as its value, each of which becomes a child element. Also notice that when I constructed the FirstName and LastName elements, instead of specifying multiple child objects, as I did when constructing the BookParticipant element, I provided the element's text value. Here are the results of this code:

<BookParticipant>
  <FirstName>Joe</FirstName>
  <LastName>Rattz</LastName>
</BookParticipant>

Notice how much easier it is now to visualize the XML schema from the code. Also notice how much less verbose that code is than the first code sample of the previous chapter, Listing 6-1. The LINQ to XML API code necessary to replace the code in Listing 6-1 that actually creates the XML tree is significantly shorter, as shown in Listing 7-2.

Listing 7-2. Creates the Same XML Tree as Listing 6-1 but with Far Less Code
XElement xBookParticipants =
  new XElement("BookParticipants",
    new XElement("BookParticipant",
      new XAttribute("type", "Author"),
      new XElement("FirstName", "Joe"),
      new XElement("LastName", "Rattz")),
    new XElement("BookParticipant",
      new XAttribute("type", "Editor"),
      new XElement("FirstName", "Ewan"),
      new XElement("LastName", "Buckingham")));

Console.WriteLine(xBookParticipants.ToString());

That is far less code to create and maintain. Also, the schema is fairly ascertainable just reading the code. Here is the output:

<BookParticipants>
  <BookParticipant type="Author">
    <FirstName>Joe</FirstName>
    <LastName>Rattz</LastName>
  </BookParticipant>
  <BookParticipant type="Editor">
    <FirstName>Ewan</FirstName>
    <LastName>Buckingham</LastName>
  </BookParticipant>
</BookParticipants>

There is one additional benefit to the new API that is apparent in the example's results. Notice that the output is formatted to look like a tree of XML. If I output the XML tree created in Listing 6-1, it actually looks like this:

<BookParticipants><BookParticipant type="Author"><FirstName>Joe</FirstName>...

Which would you rather read? In the next chapter, when I get to the section on performing LINQ queries that produce XML output, you will see the necessity of functional construction.
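As a small preview of why functional construction matters for queries, here is a minimal sketch that passes a LINQ projection directly to an XElement constructor. The participants array and its property names are made up for illustration; only XElement, XAttribute, and Select come from the framework. It should print the same tree as Listing 7-2.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Written with top-level statements (C# 9 or later); with an older
// compiler, place the statements inside a Main method instead.

// Hypothetical in-memory data standing in for a real data source.
var participants = new[]
{
    new { Type = "Author", First = "Joe",  Last = "Rattz" },
    new { Type = "Editor", First = "Ewan", Last = "Buckingham" }
};

// The query result (a sequence of XElement objects) is passed as content;
// each projected element becomes a child of BookParticipants.
XElement xBookParticipants =
    new XElement("BookParticipants",
        participants.Select(p =>
            new XElement("BookParticipant",
                new XAttribute("type", p.Type),
                new XElement("FirstName", p.First),
                new XElement("LastName", p.Last))));

Console.WriteLine(xBookParticipants);
```

The projection only works because each XElement can be created, named, and populated in a single expression, which is exactly what the W3C DOM API could not do.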

Document Centricity Eliminated in Favor of Element Centricity

With the original W3C DOM API, you cannot simply create an XML element, XmlElement; you must have an XML document, XmlDocument, from which to create it. If you try to instantiate an XmlElement like this

XmlElement xmlBookParticipant = new XmlElement("BookParticipant");

you will be greeted with the following compiler error:

'System.Xml.XmlElement.XmlElement(string, string, string, System.Xml.XmlDocument)'
is inaccessible due to its protection level

With the W3C DOM API, you can only create an XmlElement by calling an XmlDocument object's CreateElement method like this:

XmlDocument xmlDoc = new XmlDocument();
XmlElement xmlBookParticipant = xmlDoc.CreateElement("BookParticipant");

This code compiles just fine. But it is often inconvenient to be forced to create an XML document when you just want to create an XML element. The new LINQ-enabled XML API allows you to instantiate an element itself without creating an XML document:

XElement xeBookParticipant = new XElement("BookParticipant");

XML elements are not the only type of XML node impacted by this W3C DOM restriction. Attributes, comments, CDATA sections, processing instructions, and entity references all must be created from an XML document. Thankfully, the LINQ to XML API makes it possible to directly instantiate nodes such as attributes, comments, CDATA sections, and processing instructions on the fly.
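To make that concrete, the following sketch creates several node types on their own and only afterward attaches them to an element. The node names and values are invented for illustration, and entity references are left out of the sketch.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Written with top-level statements (C# 9 or later); with an older
// compiler, place the statements inside a Main method instead.

// Each node is created by itself; no XDocument is involved.
XAttribute type = new XAttribute("type", "Author");
XComment comment = new XComment("Begin of list");
XCData cdata = new XCData("<b>raw & unescaped</b>");
XProcessingInstruction instruction =
    new XProcessingInstruction("BookCataloger", "out-of-print");

// The free-standing nodes can be attached to an element later.
XElement xBookParticipant =
    new XElement("BookParticipant", type, comment, instruction,
        new XElement("Notes", cdata));

Console.WriteLine(xBookParticipant);
```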

Of course, nothing prevents you from creating an XML document with the new API. For example, you could create an XML document and add the BookParticipants element and one BookParticipant to it, as shown in Listing 7-3.

Listing 7-3. Using the LINQ to XML API to Create an XML Document and Add Some Structure to It
XDocument xDocument =
  new XDocument(
    new XElement("BookParticipants",
      new XElement("BookParticipant",
        new XAttribute("type", "Author"),
        new XElement("FirstName", "Joe"),
        new XElement("LastName", "Rattz"))));

Console.WriteLine(xDocument.ToString());

Pressing Ctrl+F5 yields the following results:

<BookParticipants>
  <BookParticipant type="Author">
    <FirstName>Joe</FirstName>
    <LastName>Rattz</LastName>
  </BookParticipant>
</BookParticipants>

The XML produced by the previous code is very similar to the XML I created in Listing 6-1, with the exception that I only added one BookParticipant instead of two. This code is much more readable, though, than Listing 6-1, thanks to our new functional construction capabilities. And it is feasible to determine the schema from looking at the code. However, now that XML documents are no longer necessary, I could just leave the XML document out and obtain the same results, as shown in Listing 7-4.

Listing 7-4. Same Example as the Previous but Without the XML Document
XElement xElement =
  new XElement("BookParticipants",
    new XElement("BookParticipant",
      new XAttribute("type", "Author"),
      new XElement("FirstName", "Joe"),
      new XElement("LastName", "Rattz")));

Console.WriteLine(xElement.ToString());

Running the code produces the exact same results as the previous example:

<BookParticipants>
  <BookParticipant type="Author">
    <FirstName>Joe</FirstName>
    <LastName>Rattz</LastName>
  </BookParticipant>
</BookParticipants>

In addition to creating XML trees without an XML document, you can do most of the other things that a document requires as well, such as reading XML from a file and saving it to a file.
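As a rough sketch of what that looks like, the code below saves an element to a file and reads it back without ever creating an XDocument. The file name and the temp-directory location are assumptions chosen for the example; any writable path would do.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

// Written with top-level statements (C# 9 or later); with an older
// compiler, place the statements inside a Main method instead.

XElement xBookParticipants =
    new XElement("BookParticipants",
        new XElement("BookParticipant",
            new XAttribute("type", "Author"),
            new XElement("FirstName", "Joe"),
            new XElement("LastName", "Rattz")));

// Hypothetical file location used only for this demonstration.
string path = Path.Combine(Path.GetTempPath(), "bookparticipants.xml");

// Save the element directly to disk.
xBookParticipants.Save(path);

// Load the file back into a new XElement.
XElement loaded = XElement.Load(path);
Console.WriteLine(loaded);

// Parse does the same for XML held in a string.
XElement parsed = XElement.Parse("<Name>Joe</Name>");
Console.WriteLine((string)parsed);

// Clean up the demonstration file.
File.Delete(path);
```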

Names, Namespaces, and Prefixes

To eliminate some of the confusion stemming from names, namespaces, and namespace prefixes, namespace prefixes are out: out of the API, that is. With the LINQ to XML API, namespace prefixes get expanded on input and honored on output. On the inside, they no longer exist.

A namespace is used in XML to uniquely identify the XML schema for some portion of the XML tree. A URI is used for an XML namespace because an organization's URIs are already unique to that organization. In several of my code samples, I have created an XML tree that looks like this:

<BookParticipants>
  <BookParticipant type="Author">
    <FirstName>Joe</FirstName>
    <LastName>Rattz</LastName>
  </BookParticipant>
</BookParticipants>

Any code that is processing that XML data will be written to expect the BookParticipants node to contain multiple BookParticipant nodes, each of which has a type attribute and a FirstName and LastName node. But what if this code also needs to be able to process XML from another source, and it too has a BookParticipants node but the schema within that node is different from the previous? A namespace will alert the code as to what the schema should look like, thereby allowing the code to handle the XML appropriately.

With XML, every element needs a name. When an element gets created, if its name is specified in the constructor as a string, that string is implicitly converted to an XName object. An XName object consists of a namespace, held in an XNamespace object, and a local name, which is the name you provided. So, for example, you can create the BookParticipants element like this:

XElement xBookParticipants = new XElement("BookParticipants");

When you create the element, an XName object gets created with an empty namespace, and a local name of BookParticipants. If you debug that line of code and examine the xBookParticipants variable in the watch window, you will see that its Name member is set to {BookParticipants}. If you expand the Name member, it contains a member named LocalName that will be set to BookParticipants, and a member named Namespace that is empty, {}. In this case, there is no namespace.

To specify a namespace, you need merely create an XNamespace object and prepend it to the local name you specify like this:

XNamespace nameSpace = "http://www.linqdev.com";
XElement xBookParticipants = new XElement(nameSpace + "BookParticipants");

Now when you examine the xBookParticipants element in the debugger's watch window, the Name is set to {{http://www.linqdev.com}BookParticipants}. Expanding the Name member reveals that the LocalName member is still BookParticipants, but now the Namespace member is set to {http://www.linqdev.com}.

It is not necessary to actually use an XNamespace object to specify the namespace. I could have specified it as a hard-coded string literal like this:

XElement xBookParticipants = new XElement("{http://www.linqdev.com}" +
  "BookParticipants");

Notice that I enclose the namespace in braces. This clues the XElement constructor in to the fact that this portion is the namespace. If you examine the BookParticipants element's Name member in the watch window again, you will see that the Name member and its embedded LocalName and Namespace members are set to the same values as in the previous example, where I used an XNamespace object to create the element.
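If you would rather verify that in code than in the watch window, a short sketch like the following compares the names produced by each approach. XName.Get is a third route to the same name that is not used elsewhere in this section.

```csharp
using System;
using System.Xml.Linq;

// Written with top-level statements (C# 9 or later); with an older
// compiler, place the statements inside a Main method instead.

XNamespace nameSpace = "http://www.linqdev.com";

// Three ways to arrive at the same expanded name.
XName viaOperator = nameSpace + "BookParticipants";
XName viaString = "{http://www.linqdev.com}BookParticipants";
XName viaGet = XName.Get("BookParticipants", "http://www.linqdev.com");

Console.WriteLine(viaOperator == viaString);   // True
Console.WriteLine(viaOperator == viaGet);      // True
Console.WriteLine(viaOperator.LocalName);      // BookParticipants
Console.WriteLine(viaOperator.Namespace);      // http://www.linqdev.com
```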

Keep in mind that when setting the namespace, merely specifying the URI to your company or organization domain may not be enough to guarantee its uniqueness. It only guarantees you won't have any collisions with any other (well-meaning) organization that also plays by the namespace naming convention rules. However, once inside your organization, any other department could have a collision if you provide nothing more than the organization URI. This is where your knowledge of your organization's divisions, departments, and so on, can be quite useful. It would be best if your namespace could extend all the way to some level you have control over. For example, if you work at LINQDev.com and you are creating a schema for the human resources department that will contain information for the pension plan, your namespace might be the following:

XNamespace nameSpace = "http://www.linqdev.com/humanresources/pension";

So for a final example showing how namespaces are used, I will modify the code from Listing 7-2 to use a namespace, as shown in Listing 7-5.

Listing 7-5. Modified Version of Listing 7-2 with a Namespace Specified
XNamespace nameSpace = "http://www.linqdev.com";

XElement xBookParticipants =
  new XElement(nameSpace + "BookParticipants",
    new XElement(nameSpace + "BookParticipant",
      new XAttribute("type", "Author"),
      new XElement(nameSpace + "FirstName", "Joe"),
      new XElement(nameSpace + "LastName", "Rattz")),
    new XElement(nameSpace + "BookParticipant",
      new XAttribute("type", "Editor"),
      new XElement(nameSpace + "FirstName", "Ewan"),
      new XElement(nameSpace + "LastName", "Buckingham")));

Console.WriteLine(xBookParticipants.ToString());

Pressing Ctrl+F5 reveals the following results:

<BookParticipants xmlns="http://www.linqdev.com">
  <BookParticipant type="Author">
    <FirstName>Joe</FirstName>
    <LastName>Rattz</LastName>
  </BookParticipant>
  <BookParticipant type="Editor">
    <FirstName>Ewan</FirstName>
    <LastName>Buckingham</LastName>
  </BookParticipant>
</BookParticipants>

Now any code could read that and know that the schema should match the schema provided by LINQDev.com.
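One practical consequence is that code reading such a tree must use the namespace-qualified names too. The sketch below, a cut-down version of the tree above, shows that an unqualified name no longer matches the child elements; the variable names unqualified and qualified are mine.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Written with top-level statements (C# 9 or later); with an older
// compiler, place the statements inside a Main method instead.

XNamespace nameSpace = "http://www.linqdev.com";

XElement xBookParticipants =
    new XElement(nameSpace + "BookParticipants",
        new XElement(nameSpace + "BookParticipant",
            new XAttribute("type", "Author"),
            new XElement(nameSpace + "FirstName", "Joe")));

// An unqualified name refers to the empty namespace, so nothing matches.
int unqualified = xBookParticipants.Elements("BookParticipant").Count();

// The namespace-qualified name finds the child element.
int qualified =
    xBookParticipants.Elements(nameSpace + "BookParticipant").Count();

Console.WriteLine(unqualified);   // 0
Console.WriteLine(qualified);     // 1
```

Note that the type attribute was created without a namespace, so it is still retrieved with the plain name "type".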

To have control over the namespace prefixes going out, use the XAttribute object to create a prefix as in Listing 7-6.

Listing 7-6. Specifying a Namespace Prefix
XNamespace nameSpace = "http://www.linqdev.com";

XElement xBookParticipants =
  new XElement(nameSpace + "BookParticipants",
    new XAttribute(XNamespace.Xmlns + "linqdev", nameSpace),
    new XElement(nameSpace + "BookParticipant"));

Console.WriteLine(xBookParticipants.ToString());

In the previous code, I am specifying linqdev as the namespace prefix, and I am utilizing the XAttribute object to get the prefix specification into the schema. Here is the output from this code:

<linqdev:BookParticipants xmlns:linqdev="http://www.linqdev.com">
  <linqdev:BookParticipant />
</linqdev:BookParticipants>
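The claim made at the start of this section, that prefixes are expanded on input and honored on output, can be checked in the other direction as well. The following is a minimal sketch that parses prefixed XML and inspects the resulting names; the input string is made up to mirror the output above.

```csharp
using System;
using System.Xml.Linq;

// Written with top-level statements (C# 9 or later); with an older
// compiler, place the statements inside a Main method instead.

// Input that uses a prefix rather than a default namespace.
XElement root = XElement.Parse(
    "<linqdev:BookParticipants xmlns:linqdev='http://www.linqdev.com'>" +
    "<linqdev:BookParticipant /></linqdev:BookParticipants>");

// Inside the tree, the name holds the expanded namespace, not the prefix.
Console.WriteLine(root.Name.LocalName);       // BookParticipants
Console.WriteLine(root.Name.NamespaceName);   // http://www.linqdev.com

// The prefix survives only as an xmlns attribute on the element...
XNamespace nameSpace = "http://www.linqdev.com";
Console.WriteLine(root.GetPrefixOfNamespace(nameSpace));   // linqdev

// ...which is why serializing the tree writes the prefix back out.
Console.WriteLine(root);
```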

Node Value Extraction

If you read the first code sample of the previous chapter, Listing 6-1, and laughed at my results, which I hope you did, you no doubt have experienced the same issue that prevented me from getting the results I was after—getting the actual value from a node is a bit of a nuisance. If I haven't been working with any XML DOM code for a while, I inevitably end up with an error like the one in Listing 6-1. I just about always forget I have to take the extra step to get the value of the node.

The LINQ to XML API fixes that problem very nicely. First, calling the ToString method of an element outputs the XML string itself, not the object type as it does with the W3C DOM API. This is very handy when you want an XML fragment from a certain point in the tree and makes far more sense than outputting the object type. Listing 7-7 shows an example.

Listing 7-7. Calling the ToString Method on an Element Produces the XML Tree
XElement name = new XElement("Name", "Joe");
Console.WriteLine(name.ToString());

Pressing Ctrl+F5 gives me the following:

<Name>Joe</Name>

Wow, that's a nice change. But wait, it gets better. Of course, child nodes are included in the output, and since the WriteLine method doesn't have an explicit overload accepting an XElement, it calls the ToString method for you, as shown in Listing 7-8.

Listing 7-8. Console.WriteLine Implicitly Calling the ToString Method on an Element to Produce an XML Tree
XElement name = new XElement("Person",
  new XElement("FirstName", "Joe"),
  new XElement("LastName", "Rattz"));
Console.WriteLine(name);

And the following is the output:

<Person>
  <FirstName>Joe</FirstName>
  <LastName>Rattz</LastName>
</Person>

Even more important, if you cast a node to a data type that its value can be converted to, the value itself will be output. Listing 7-9 shows another example, but I will also print out the node cast to a string.

Listing 7-9. Casting an Element to Its Value's Data Type Outputs the Value
XElement name = new XElement("Name", "Joe");
Console.WriteLine(name);
Console.WriteLine((string)name);

Here are the results of this code:

<Name>Joe</Name>
Joe

How slick is that? Now how much would you pay? And there are cast operators provided for string, int, int?, uint, uint?, long, long?, ulong, ulong?, bool, bool?, float, float?, double, double?, decimal, decimal?, TimeSpan, TimeSpan?, DateTime, DateTime?, Guid, and Guid?.
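The nullable variants in that list are handy when an element may be absent. Here is a small sketch, with made-up element names (Person, Age, Id, ShoeSize), showing that casting a missing element to a nullable type yields null rather than an exception.

```csharp
using System;
using System.Xml.Linq;

// Written with top-level statements (C# 9 or later); with an older
// compiler, place the statements inside a Main method instead.

XElement person =
    new XElement("Person",
        new XElement("Age", 42),
        new XElement("Id", "6f9619ff-8b86-d011-b42d-00c04fc964ff"));

int age = (int)person.Element("Age");
Guid id = (Guid)person.Element("Id");

// Element("ShoeSize") returns null because no such child exists;
// the nullable cast passes that null through instead of throwing.
int? shoeSize = (int?)person.Element("ShoeSize");

Console.WriteLine(age);                 // 42
Console.WriteLine(id);
Console.WriteLine(shoeSize.HasValue);   // False
```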

Listing 7-10 shows an example of a few different node value types.

Listing 7-10. Different Node Value Types Retrieved via Casting to the Node Value's Type
XElement count = new XElement("Count", 12);
Console.WriteLine(count);
Console.WriteLine((int)count);

XElement smoker = new XElement("Smoker", false);
Console.WriteLine(smoker);
Console.WriteLine((bool)smoker);

XElement pi = new XElement("Pi", 3.1415926535);
Console.WriteLine(pi);
Console.WriteLine((double)pi);

And the envelope please!

<Count>12</Count>
12
<Smoker>false</Smoker>
False
<Pi>3.1415926535</Pi>
3.1415926535

That seems very simple and intuitive. It looks like if I use the LINQ to XML API instead of the W3C DOM API, errors like the one in Listing 6-1 of the previous chapter will be a thing of the past.

While all of those examples make obtaining an element's value simple, they are all cases of casting the element to the same data type that its value initially was. This is not necessary. All that is necessary is for the element's value to be able to be converted to the specified data type. Listing 7-11 shows an example where the initial data type is string, but I will obtain its value as a bool.

Listing 7-11. Casting a Node to a Different Data Type Than Its Value's Original Data Type
XElement smoker = new XElement("Smoker", "true");
Console.WriteLine(smoker);
Console.WriteLine((bool)smoker);

Since I have specified the value of the element to be "true", and since the string "true" can be successfully converted to a bool, the code works:

<Smoker>true</Smoker>
True

Unfortunately, exactly how the values get converted is not specified, but it appears that the conversion methods in the System.Xml.XmlConvert class are used for this purpose. Listing 7-12 demonstrates that this is the case when casting as a bool.

Listing 7-12. Casting to a Bool Calls the System.Xml.XmlConvert.ToBoolean Method
try
{
  XElement smoker = new XElement("Smoker", "Tue");
  Console.WriteLine(smoker);
  Console.WriteLine((bool)smoker);
}
catch (Exception ex)
{
  Console.WriteLine(ex);
}

Notice that I intentionally misspell "True" above to force an exception in the conversion hoping for a clue to be revealed in the exception that is thrown. Will I be so lucky? Let's press Ctrl+F5 to find out.

<Smoker>Tue</Smoker>
System.FormatException: The string 'tue' is not a valid Boolean value.
   at System.Xml.XmlConvert.ToBoolean(String s)
...

As you can see, the exception occurred in the call to the System.Xml.XmlConvert.ToBoolean method.
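Knowing that XmlConvert is involved tells you which strings to expect to work. The sketch below relies on the documented behavior of XmlConvert.ToBoolean, which accepts "1" and "0" in addition to "true" and "false"; the "yes" value is simply an arbitrary string that is not one of those forms.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// Written with top-level statements (C# 9 or later); with an older
// compiler, place the statements inside a Main method instead.

// The numeric forms convert just like "true" and "false".
bool one = (bool)new XElement("Smoker", "1");
bool zero = (bool)new XElement("Smoker", "0");
Console.WriteLine(one);                          // True
Console.WriteLine(zero);                         // False
Console.WriteLine(XmlConvert.ToBoolean("true")); // True

// Anything else, such as "yes", is rejected with a FormatException.
bool rejected = false;
try
{
    _ = (bool)new XElement("Smoker", "yes");
}
catch (FormatException)
{
    rejected = true;
}
Console.WriteLine(rejected);                     // True
```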
