DTD versus XDR Schemas

DTDs validate XML data formats as do XML schemas, which include XDR and XSD. This section covers DTD versus XDR schemas in detail.

DTD

First look at the DTD, as shown in Listing A.4.

Listing A.4. An XML DTD
<?xml version='1.0' encoding='UTF-8' ?>
<!ELEMENT Transactions (Transaction+)>
<!ATTLIST Transactions  location CDATA  #REQUIRED
                         type     CDATA  #REQUIRED >
<!ELEMENT Transaction (Amount , Type , Facility , Location)>
<!ATTLIST Transaction  date CDATA  #REQUIRED
                        id   CDATA  #REQUIRED >
<!ELEMENT Amount (#PCDATA)*>
<!ATTLIST Amount  currency  (usd )  #REQUIRED >
<!ELEMENT Type (#PCDATA)*>
<!ELEMENT Facility (#PCDATA)*>
<!ELEMENT Location (Name , Address)>
<!ELEMENT Name (#PCDATA)*>
<!ELEMENT Address (#PCDATA)*>

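The first line in the DTD is an XML declaration (strictly speaking, a text declaration when it appears in an external DTD). It looks like a processing instruction, and it tells the application that it is dealing with an XML 1.0 DTD, rather than an SGML DTD, and that the characters are encoded in UTF-8, a Unicode encoding built from 8-bit units.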

The second line declares the <Transactions> element and dictates that it must contain one or more <Transaction> elements. (Which element serves as the document element is fixed by the document's DOCTYPE declaration, not by the element declarations in the DTD.) The plus (+) sign in the second line is the occurrence operator: it requires the preceding element to appear one or more times. This is easiest to picture in terms of a node list, which is the collection of elements meeting certain criteria. For instance, the length of the node list for <Transaction> elements in our XML instance is 2, because there are two <Transaction> nodes in the document. On a zero-based scale the last node has index 1, but the length is still 2. If we add another <Transaction> element to the document, the length will be three.

The third and fourth lines define attributes that belong to the <Transactions> element. In this case, you have two attributes, named location and type. Both are required, and each is of type CDATA, or Character Data, which means that the parser treats the attribute value as ordinary text rather than as one of the tokenized types such as ID or NMTOKEN, although it still expands entity and character references inside it.

The fifth through the fourteenth lines convey much the same information as the second, third, and fourth lines. The application should expect each <Transaction> element to carry date and id attributes and to contain an <Amount>, a <Type>, a <Facility>, and a <Location> child node, in that order. <Amount> has a required currency attribute whose only permitted value is usd, and <Location> in turn contains a <Name> and an <Address>. The leaf elements (<Amount>, <Type>, <Facility>, <Name>, and <Address>) contain PCDATA, or Parsed Character Data, which means that the XML parser scans the text for markup and expands any entity references it finds. Child elements are not permitted inside them.
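
To make the walkthrough concrete, here is a minimal sketch of an instance document that should validate against the DTD in Listing A.4. The file name transactions.dtd and every attribute and text value are assumptions made up for illustration; only the element and attribute names come from the listing.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumes Listing A.4 is saved next to this file as transactions.dtd -->
<!DOCTYPE Transactions SYSTEM "transactions.dtd">
<Transactions location="Seattle" type="deposit">
  <!-- First of two Transaction elements, so the node list length is 2 -->
  <Transaction date="2001-03-15" id="T1001">
    <!-- currency must be "usd", the only value the DTD enumerates -->
    <Amount currency="usd">250.00</Amount>
    <Type>deposit</Type>
    <Facility>ATM</Facility>
    <Location>
      <Name>Main Street Branch</Name>
      <Address>100 Main Street</Address>
    </Location>
  </Transaction>
  <Transaction date="2001-03-16" id="T1002">
    <Amount currency="usd">75.50</Amount>
    <Type>deposit</Type>
    <Facility>Teller</Facility>
    <Location>
      <Name>Main Street Branch</Name>
      <Address>100 Main Street</Address>
    </Location>
  </Transaction>
</Transactions>
```

Removing both <Transaction> elements, or swapping <Type> and <Facility>, would make the document invalid, because the DTD requires at least one <Transaction> and fixes the order of its children.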

XDR

At this point, you should be grasping how a DTD defines the element and attribute names in an XML document, as well as the markup's structure. Now look at Listing A.5, which shows how the same XML is defined using XDR.

Listing A.5. An XML Schema Using the XDR Language—Transactions.xdr
<?xml version = "1.0" encoding = "UTF-8"?>
<Schema name = "Transaction.xdr"
xmlns = "urn:schemas-microsoft-com:xml-data"
xmlns:dt = "urn:schemas-microsoft-com:datatypes">
<ElementType name = "Transactions" content = "eltOnly" order = "seq" model = "closed">
   <AttributeType name = "location" dt:type = "string" required = "yes"/>
   <AttributeType name = "type" dt:type = "string" required = "yes"/>
    <attribute type = "location"/>
    <attribute type = "type"/>
   <element type = "Transaction" minOccurs = "1" maxOccurs = "*"/>
 </ElementType>
 <ElementType name = "Transaction" content = "eltOnly" order = "seq" model = "closed">
  <AttributeType name = "date" dt:type = "string" required = "yes"/>
  <AttributeType name = "id" dt:type = "string" required = "yes"/>
   <attribute type = "date"/>
   <attribute type = "id"/>
   <element type = "Amount"/>
   <element type = "Type"/>
   <element type = "Facility"/>
   <element type = "Location"/>
 </ElementType>
 <ElementType name = "Amount" content = "textOnly" dt:type = "fixed.14.4" model = "closed">
  <AttributeType name = "currency" dt:type = "enumeration" dt:values = "usd" required = "yes"/>
   <attribute type = "currency"/>
 </ElementType>
 <ElementType name = "Type" order = "many" model = "closed"/>
 <ElementType name = "Facility" order = "many" model = "closed"/>
 <ElementType name = "Location" content = "eltOnly" order = "seq" model = "closed">
  <element type = "Name"/>
   <element type = "Address"/>
</ElementType>
 <ElementType name = "Name" order = "many" model = "closed"/>
 <ElementType name = "Address" order = "many" model = "closed"/>
</Schema>

The first thing you might notice is that the XDR version is longer than the DTD and that it is written using XML syntax. You will soon see that the added expressiveness, along with the decision to use XML as the syntax, brings enough benefits to justify the extra verbosity.

The first line of Listing A.5 is the same XML declaration that opens the DTD version (only the quoting and spacing differ), and it serves a similar purpose.

The second line of Listing A.5 is where the similarity ends. Like the DTD, the XDR schema defines the structure of the XML documents, but the XDR language also allows us to be much more specific about the data that will be encapsulated in the XML.

The following code from Listing A.5 represents the document element of the schema and the namespaces that will be used:

<Schema name = "Transaction.xdr" 
  xmlns = "urn:schemas-microsoft-com:xml-data"
  xmlns:dt = "urn:schemas-microsoft-com:datatypes">

Every XML document requires a document element, and in this case, the schema's document element is the <Schema> element. The name attribute is of little importance, although it can help an application identify the schema. The xmlns attributes are namespace declarations; they let the processor know how to interpret particular pieces of the XML document. In this case, the default namespace is identified by a Uniform Resource Name (URN), urn:schemas-microsoft-com:xml-data, a unique identifier that the parser recognizes as the XDR schema language.

Note

It is probably important to pause for a moment and discuss what namespaces actually represent. A namespace provides a simple method for qualifying element and attribute names used in [XML] documents by associating them with namespaces identified by URI references. This means that a namespace is a way to ensure that the elements and attributes that you are using in your XML are unique.


In xmlns = "urn:schemas-microsoft-com:xml-data", the default namespace is set to the namespace for XDR. Therefore, each of the unprefixed elements in the schema belongs to XDR, unless otherwise directed to use another namespace. The fourth line of Listing A.5 declares another namespace, and this one is for the specification that defines how the XML (within the scope of the XDR language) uses data types. As you work with XDR, you will learn that any element or attribute prefixed with dt: belongs to the data types namespace and is used to assign a data type.
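
The split between the two namespaces can be sketched with a cut-down schema. The schema name NamespaceSketch.xdr is a made-up placeholder; the namespace URNs and the <Amount> declaration are taken from Listing A.5.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- No prefix: Schema and ElementType fall in the default (XDR) namespace. -->
<Schema name="NamespaceSketch.xdr"
        xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <!-- dt: prefix: the type attribute falls in the data types namespace. -->
  <ElementType name="Amount" content="textOnly" dt:type="fixed.14.4"/>
</Schema>
```

The prefix itself is only a local shorthand. What ties dt:type to the data types specification is the URN that the prefix is bound to, not the letters dt.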

Before diving much deeper into the XDR schema language, you can use Table A.1 as a guide through the rest of this section.

Table A.1. XML Schema Elements
Schema Element   Description
attribute        Refers to a declared attribute type that can appear within the scope of the named ElementType element
AttributeType    Defines an attribute type for use within the Schema element
datatype         Specifies the data type for the ElementType or AttributeType element
description      Provides documentation about an ElementType or AttributeType element
element          Refers to a declared element type that can appear within the scope of the named ElementType element
ElementType      Defines an element type for use within the Schema element
group            Organizes content into a group to specify a sequence
Schema           Identifies the start of a schema definition

The following lines of code from Listing A.5 define the Transactions element—that is, the <Transactions> tag.

 <ElementType name = "Transactions" content = "eltOnly" order = "seq" model = "closed"> 
   <AttributeType name = "location" dt:type = "string" required = "yes"/>
   <AttributeType name = "type" dt:type = "string" required = "yes"/>
    <attribute type = "location"/>
    <attribute type = "type"/>
   <element type = "Transaction" minOccurs = "1" maxOccurs = "*"/>
</ElementType>

The <ElementType> Element

In the lines

<ElementType name = "Transactions" 
 content = "eltOnly" order = "seq" model = "closed">

the properties of an XML element are defined by the <ElementType>. The element type is used to describe the specific details of a particular tag in an XML document, which in this case happens to be the <Transactions> tag. The name of the tag is defined by the name attribute, so the tag name in this case will be "Transactions". Listing A.6 shows the attributes that an <ElementType> can carry.

Note

The words “tag” and “element” are used synonymously throughout this appendix.


Listing A.6. Constructs of an <ElementType>
<ElementType
  name="idref"
  content="{empty | textOnly | eltOnly | mixed}"
  dt:type="datatype"
  model="{open | closed}"
  order="{one | seq | many}">

Next, because the content attribute is set to eltOnly, the inclusion of text is disallowed between the opening and closing <Transactions> </Transactions> tags. In other words, according to the content model, putting the words “Here is some text” between the <Transactions> tags is invalid, but adding declared child elements (for example, the <Transaction> tag) is completely valid. Table A.2 lists the possible values of the content attribute.

Table A.2. Content Attribute Values in <ElementType>
Value      Description
empty      The element cannot contain content.
textOnly   The element can contain only text, not elements. If the model attribute is set to “open,” the element can contain text and other unnamed elements.
eltOnly    The element can contain only the specified elements. It cannot contain any free text.
mixed      The element can contain a mix of named elements and text.
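
As a rough illustration of the four content values, the following schema sketch declares one element type for each. <Name>, <Address>, and <Location> come from Listing A.5; the element names Void and Memo are hypothetical and exist only for this example. The comments show what a matching instance might look like.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <!-- empty: <Void/> holds neither text nor child elements -->
  <ElementType name="Void" content="empty" model="closed"/>

  <!-- textOnly: <Name>Main Street Branch</Name> -->
  <ElementType name="Name" content="textOnly" model="closed"/>
  <ElementType name="Address" content="textOnly" model="closed"/>

  <!-- eltOnly: child elements only, no free text around them -->
  <ElementType name="Location" content="eltOnly" order="seq" model="closed">
    <element type="Name"/>
    <element type="Address"/>
  </ElementType>

  <!-- mixed: <Memo>Paid at <Name>Main Street Branch</Name> in full</Memo> -->
  <ElementType name="Memo" content="mixed" model="closed">
    <element type="Name"/>
  </ElementType>
</Schema>
```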

The order and model attributes are also vital to the classification of the element type. The order attribute specifies how the child elements will be sequenced within the context of this element. For instance, think of HTML. The order is important within the context of the <HTML> element because <HEAD> must go before <BODY>. Within the context of the <BODY> element, however, the order is not as pressing. For example, a <DIV> tag might come before a <P> tag, which might be followed by another <P> tag. There is no set order in which they must occur under the <BODY> element; in XDR terms, the order is set to “many.” Table A.3 shows the possible values of the order attribute in <ElementType>.

Table A.3. Order Attribute Values in <ElementType>
Value   Description
one     Permits only one of a set of elements. For a document to validate correctly when order is set to one, the model attribute for the ElementType must be specified as “closed.”
seq     Requires the elements to appear in the specified sequence.
many    Permits the elements to appear (or not appear) in any order. If you specify many for the order attribute, maxOccurs values are no longer relevant during validation.
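
The three order values can be sketched side by side. <Location>, <Name>, and <Address> come from Listing A.5; Payment, Cash, Check, Contact, Phone, and Email are hypothetical names invented for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <ElementType name="Cash" content="textOnly"/>
  <ElementType name="Check" content="textOnly"/>
  <ElementType name="Name" content="textOnly"/>
  <ElementType name="Address" content="textOnly"/>
  <ElementType name="Phone" content="textOnly"/>
  <ElementType name="Email" content="textOnly"/>

  <!-- one: a Payment holds either a Cash or a Check, never both -->
  <ElementType name="Payment" content="eltOnly" order="one" model="closed">
    <element type="Cash"/>
    <element type="Check"/>
  </ElementType>

  <!-- seq: Name must come first, then Address -->
  <ElementType name="Location" content="eltOnly" order="seq" model="closed">
    <element type="Name"/>
    <element type="Address"/>
  </ElementType>

  <!-- many: Phone and Email may appear in any order, or not at all -->
  <ElementType name="Contact" content="eltOnly" order="many" model="closed">
    <element type="Phone"/>
    <element type="Email"/>
  </ElementType>
</Schema>
```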

The model attribute controls the extensibility of the document. Remember, we are working with XML here, which means that extensibility is the key. A document can be extended at the whim of the developer because XDR content models are “open” by default. However, if the model is “closed,” then a developer is not allowed to add elements or attributes that the content model does not declare. In the case of our transactions document, a developer would not be allowed to add anything under the <Transactions> tag other than the declared <Transaction> elements and the location and type attributes, because the model is closed.
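
The difference shows up in an instance fragment like the following sketch. The ext:Memo element and its namespace URN are hypothetical. As far as I understand the XDR rules, the extra element would be tolerated if <Location> were declared with model = "open", but it is rejected under the closed model used in Listing A.5.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Location xmlns:ext="urn:example:extensions">
  <Name>Main Street Branch</Name>
  <Address>100 Main Street</Address>
  <!-- Not declared in the content model of Location:
       acceptable only when the model is open -->
  <ext:Memo>Drive-through window closed on Sundays</ext:Memo>
</Location>
```

The extra element is placed in its own namespace so that the example does not depend on how a particular parser treats undeclared names from the schema's own namespace.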

The <AttributeType> Element

The <AttributeType> element defines the properties of attributes, such as the location and type attributes of the <Transactions> element. Listing A.7 shows its constructs.

Listing A.7. Constructs of an <AttributeType>
<AttributeType
    default="default-value"
    dt:type="primitive-type"
    dt:values="enumerated-values"
    name="idref"
    required="{yes | no}">

In the attribute type, you can identify the default value of the attribute. For example, if you wanted to create a schema for an XML document describing apple trees, you could set the default value for the color attribute to “red.” However, it is important to remember that the default value must be legal for that attribute instance, according to its data type.

The attribute's data type may be specified using dt:type (remember that the dt: prefix corresponds to Microsoft's implementation of XML data types). As you can see in Listing A.5, the data types of the <Transactions> attributes have been set to "string". Here, we could set a default value for the attribute to either "bank", or "checking", or whatever else would make sense in light of the application. Note that if you use the "enumeration" data type, then the enumerated values must go into the dt:values attribute, with the default value being listed first.

The final property to remember is the required attribute. Fairly self-explanatory, this attribute specifies whether the attribute must exist in the selected element.
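
Returning to the apple tree example, a declaration along these lines would combine a data type, an enumerated value list, a default, and the required flag. The AppleTree element and the list of colors are hypothetical. The sketch assumes the XDR spelling dt:type = "enumeration" with a space-separated dt:values list.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <ElementType name="AppleTree" content="empty" model="closed">
    <!-- color is optional; "red" is the default and is one of
         the enumerated values, so it is legal for this type -->
    <AttributeType name="color"
                   dt:type="enumeration"
                   dt:values="red green yellow"
                   default="red"
                   required="no"/>
    <attribute type="color"/>
  </ElementType>
</Schema>
```

With this schema, <AppleTree/> and <AppleTree color="green"/> should both validate, while <AppleTree color="blue"/> should not, because blue is not among the enumerated values.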

The <attribute> Element

The following code lines contain the <attribute> element:

<attribute type = "location"/> 
<attribute type = "type"/>

This element refers to a declared attribute type that can appear within the scope of the named <ElementType> element. In other words, you build an attribute definition with the <AttributeType> and effectively “place” it at one or more locations throughout the schema by using the <attribute> element.

You reference an attribute type within the attribute element by using the type property. For instance

<attribute type = "location"/> 

references the attribute type in

<AttributeType name = "location" dt:type = "string" required = "yes"/> 

because the value of its type property is "location". In a sense, the <AttributeType> element is the real representation of the actual attribute, and the <attribute> element is the pointer.

There are two other properties for the <attribute> element, the default property and the required property. The default property represents a default value for the attribute that supersedes the default value set in the <AttributeType> element.

The required property simply states whether the attribute is required in the XML document. Keep two things in mind with regard to the required property. First, when the required attribute is set to "yes" and the default attribute specifies a default value, the supplied default is effectively a fixed value: it must always be the value, and documents containing other attribute values are invalid. Second, when the required attribute is set to "yes" and no default is specified, each element whose type is declared to have the attribute must supply its value.
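
These interactions can be sketched with one shared attribute type that is referenced three ways. The sketch uses default as the name of the overriding property on <attribute>, which is the spelling I recall from the XDR reference. The Fee and Tax element types are hypothetical, and declaring the <AttributeType> directly under <Schema> so that several element types can share it is an assumption of this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <!-- Declared once, then pointed to from each element type below -->
  <AttributeType name="currency" dt:type="string" default="usd"/>

  <!-- Inherits the default "usd" from the AttributeType -->
  <ElementType name="Amount" content="textOnly" model="closed">
    <attribute type="currency"/>
  </ElementType>

  <!-- Overrides the inherited default with "cad" -->
  <ElementType name="Fee" content="textOnly" model="closed">
    <attribute type="currency" default="cad"/>
  </ElementType>

  <!-- required plus default: the value is fixed at "usd",
       so <Tax currency="cad"> would be invalid -->
  <ElementType name="Tax" content="textOnly" model="closed">
    <attribute type="currency" default="usd" required="yes"/>
  </ElementType>
</Schema>
```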

The <element> Element

The following code line introduces the <element> element:

<element type = "Transaction" minOccurs = "1" maxOccurs = "*"/> 

Much like the <attribute> element you just learned about, the <element> is simply a pointer to the <ElementType> and tells the parser where a particular element needs to be within the XML document. In the preceding code line, the parser knows that the <Transaction> element is required under the <Transactions> node. This is also referred to as being nested within a particular node. The value of the type property is a pointer to the element type. Therefore, when you see

<element type = "Transaction" minOccurs = "1" maxOccurs = "*"/> 

it is pointing to the following <ElementType>:

<ElementType name = "Transaction" content = "eltOnly" order = "seq" model = "closed"> 

So, all you need is one <ElementType> for all your <element> declarations, as long as you want them to inherit their properties from that particular <ElementType>.

Only two other properties belong to the <element>: minOccurs and maxOccurs. The value of minOccurs can be "0" or "1", and the value of maxOccurs can be "1" or "*". The asterisk means that there is no upper limit on the number of occurrences. This means that an element can be used in a number of different ways, as you can see in Table A.4.

Table A.4. MinOccurs and MaxOccurs Attribute Values in <element>
minOccurs : maxOccurs   Description
1 : *                   1 or more times
0 : *                   0 or more times
0 : 1                   0 or 1 time
1 : 1                   Only 1 time
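
The four combinations in Table A.4 might be written as follows. The Batch, Header, Note, Entry, and Attachment element names are hypothetical and chosen only to give each combination a plausible role.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <ElementType name="Header" content="textOnly"/>
  <ElementType name="Note" content="textOnly"/>
  <ElementType name="Entry" content="textOnly"/>
  <ElementType name="Attachment" content="textOnly"/>

  <ElementType name="Batch" content="eltOnly" order="seq" model="closed">
    <!-- 1 : 1, exactly one Header -->
    <element type="Header" minOccurs="1" maxOccurs="1"/>
    <!-- 0 : 1, an optional Note -->
    <element type="Note" minOccurs="0" maxOccurs="1"/>
    <!-- 1 : *, at least one Entry -->
    <element type="Entry" minOccurs="1" maxOccurs="*"/>
    <!-- 0 : *, any number of Attachments, including none -->
    <element type="Attachment" minOccurs="0" maxOccurs="*"/>
  </ElementType>
</Schema>
```

The 1 : * combination is the XDR counterpart of the plus (+) operator on <Transaction> in the DTD of Listing A.4.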

If you have been able to stick with me so far, you are going to have a much better understanding of how the BizTalk Editor works because many of the values you fill in on the Editor will correspond directly to the elements that we've just reviewed. However, we have covered only a portion of what a complete BizTalk specification represents. After we discuss a few other variations of XML schema languages, we will move on to a thorough discussion of the BizTalk specification and how it uses XDR.

Note

If you aren't completely satisfied with this introduction to XDR, visit the Microsoft MSDN site at http://msdn.microsoft.com/library/psdk/xmlsdk/xmls5gkl.htm. You will find it helpful in learning more about XML-Data Reduced and other XML-related technologies.


Although we recommend keeping the focus on XDR and the W3C XML Schema language (XSD), we want to share a few other languages with you to illustrate the diverse origins from which schema languages have evolved.
