Word Markup Language (WordML)
Mixing WordML with other vocabularies
Creating WordML with stylesheets
In the previous chapter we learned how to use Word to create and edit XML documents as unrendered abstractions. We also learned how to convert a rendered Word document to XML. We did these things using Word’s default schema-independent presentation of XML documents: pink icons that represent tags.
In this chapter, we do the opposite. We learn how to transform an abstract XML document into a WordML rendition, both manually and with XSLT stylesheets. Doing so allows us to use Word’s WYSIWYG interface to view and print the documents, and even to edit the abstract XML data.
These presentations can include any formatting available in Word, such as styles and page numbers. The transformations can also filter out unwanted information, or summarize or reorganize it.
In addition to general Word end user skills, it is helpful to understand the basics of XSLT, which can be found in Chapter 18, “XSL Transformations (XSLT)”, on page 392.
The Word Markup Language (WordML) is the native XML representation for Microsoft Word. It captures everything that might be known about a Word document. It covers not just the text of the document itself, but also all the formatting, all the styles associated with that document (whether they are used or not), and all of the various settings (such as page margins and tabs). Since it covers so many things, it is very verbose, and it is somewhat difficult to understand just by reading it.
Nevertheless, WordML has a significant benefit over the equivalent .doc binary format of Word documents: any tool that can parse XML can make use of the Word document. This includes tools that transform, display, search, validate, store, index and query XML documents.
As Office 2003 increases in popularity, we expect third-party tools to be released that will use WordML to process Word documents in new ways and to generate Word documents from other data sources.
Because WordML is a native Word document representation, Word treats it quite differently from other uses of XML. To avoid the constant interjection of “except for WordML”, we normally do not include WordML when we discuss Word’s treatment of XML documents. If we do mean to include it, that will be clear from the context.
WordML is a large, complex vocabulary with over 400 different element types. Fortunately, in order to create, or even parse, WordML documents, you only need to be familiar with a small fraction of the vocabulary.[1] In fact, the first WordML document you write can be quite small and simple. It is shown in Example 5-1.
Example 5-1. Your first WordML document (minimal WordML.xml)
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:body>
    <w:p>
      <w:r><w:t>hello, Word</w:t></w:r>
    </w:p>
  </w:body>
</w:wordDocument>
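Because a minimal WordML document is so small, it is also easy to generate from a program. The following is a sketch only, using Python's standard xml.etree.ElementTree module; the function name minimal_wordml is our own invention, and the two prolog lines are written out by hand because ElementTree does not emit the mso-application processing instruction itself.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
ET.register_namespace("w", W)  # serialize with the conventional w prefix


def minimal_wordml(text):
    """Return a WordML document (as a string) with one paragraph holding `text`."""
    root = ET.Element(f"{{{W}}}wordDocument")
    body = ET.SubElement(root, f"{{{W}}}body")
    p = ET.SubElement(body, f"{{{W}}}p")
    r = ET.SubElement(p, f"{{{W}}}r")
    ET.SubElement(r, f"{{{W}}}t").text = text
    # XML declaration plus the processing instruction that identifies Word.
    prolog = ('<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
              '<?mso-application progid="Word.Document"?>\n')
    return prolog + ET.tostring(root, encoding="unicode")


doc = minimal_wordml("hello, Word")
print(doc)
```

Saving the returned string in a file with an .xml extension should give a document equivalent to Example 5-1, apart from whitespace.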
Recall Doug’s article for Worldwide Widget’s newsletter. It started life as an ordinary Word document. We repeat it here in Figure 5-1 for your convenience.
The default format when you save a Word document is still the binary .doc format. However, if you choose to save a document as XML and Word cannot associate that document with a schema, it will be saved as WordML.[2]
Let’s save Doug’s article as WordML and see what we get. To do so:
We’ll look at the actual WordML representation, as a Word rendition would be identical to Figure 5-1. Because the WordML document is extremely long, we will excerpt pieces as examples as we go along.
The basic structure of a WordML document is shown in Model 5-1.
Model 5-1. WordML document structure
Document (wordDocument)
  [0..1] Document Properties -- General (DocumentProperties)
  [0..1] Lists (lists)
  [0..1] Styles (styles)
  [0..1] Document Properties -- Word-specific (docPr)
  [1..1] Body (body)
The root of a WordML document is always a wordDocument element. The most commonly used children of a wordDocument element are:
an optional DocumentProperties element, which contains general information about the document such as the date it was created and last updated, the author name, and the revision number
an optional lists element, which contains information about the formatting of lists, such as the type of bullet or number, and the indentation used
an optional styles element, which contains information about the styles used in the document, such as the font and size, language, and paragraph formatting
an optional docPr element, which contains Word-specific information on the settings for the document, such as margins and header and footer properties
a required body element, which contains the bulk of the document
As you can see, most of these elements can be left out. If you omit an optional element, it defaults to the settings for new documents in Word.
Example 5-2 shows the very beginning of the WordML document.[3]
Example 5-2. Beginning of WordML document (article WordML.xml)
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument
  xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
  xmlns:v="urn:schemas-microsoft-com:vml"
  xmlns:w10="urn:schemas-microsoft-com:office:word"
  xmlns:SL="http://schemas.microsoft.com/schemaLibrary/2003/core"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:aml="http://schemas.microsoft.com/aml/2001/core"
  xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint"
  xmlns:o="urn:schemas-microsoft-com:office:office"
  xml:space="preserve">
  <o:DocumentProperties>
    <o:Title>Heading 1</o:Title>
    <o:Author>Priscilla Walmsley</o:Author>
    <o:LastAuthor>Priscilla Walmsley</o:LastAuthor>
    <o:Revision>2</o:Revision>
line 1
The document starts out on line 1 with an XML declaration, which identifies the document as XML and indicates the encoding used in the document.
line 2
On line 2, a processing instruction appears which identifies the document as a Word document. The purpose of this processing instruction is to tell Windows to open this file in Word, rather than in Internet Explorer, which is often the application associated with the .xml extension.
line 3
The root element is w:wordDocument, whose start-tag has a number of namespace declarations.
line 4
The namespace of the WordML vocabulary is:
http://schemas.microsoft.com/office/word/2003/wordml
This namespace is commonly mapped to the w prefix, although there is no requirement that this prefix be used.
line 13
The first child of w:wordDocument is an o:DocumentProperties element that contains general information about the document. It is followed by a huge number of elements representing style information, which is not shown.
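A program can make the same check that Windows makes when it decides which application should open the file. The sketch below uses Python's standard xml.dom.minidom module, which keeps processing instructions in the parsed tree; the function name word_progid and the tiny SAMPLE document are ours, used only for illustration.

```python
from xml.dom import minidom

# A cut-down document with the same prolog as Example 5-2 (illustrative only).
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<?mso-application progid="Word.Document"?>'
    '<w:wordDocument '
    'xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">'
    '<w:body/></w:wordDocument>'
)


def word_progid(xml_text):
    """Return the data of the mso-application processing instruction, or None."""
    dom = minidom.parseString(xml_text.encode("utf-8"))
    for node in dom.childNodes:  # nodes at document level, including the prolog
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "mso-application"):
            return node.data
    return None


print(word_progid(SAMPLE))
```

A document without the processing instruction yields None, which is one way to tell a plain XML file from one intended for Word.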
The body of the WordML document, represented by the body element, contains all the text of the document. Its structure is shown in Model 5-2.
The body can contain sections that contain paragraphs, or it can contain paragraphs directly. Paragraphs, in turn, contain text runs, which contain text elements, which contain data characters. There is a separate text run for every data character string that has a distinct style or other properties. A paragraph can also contain images, hyperlinks and other components.
Model 5-2. WordML body structure
Body (body)
  [0..*] Section (sect)
    [0..*] Paragraph (p)
      [0..*] Text Run (r)
        [0..*] Text (t)
  [0..*] Paragraph (p)
    ...
Each paragraph is represented by a p element. The paragraph has a style (and possibly other settings) associated with it in its properties child, pPr. If no style is associated with the paragraph, it defaults to “Normal” style.
A text run (r) can contain multiple text elements, as well as pictures, footnotes, fields and other Word objects. A text element (t), on the other hand, can only contain data characters, with no child elements. Every data character in the document text is contained directly in a t element.
An excerpt from the body of the WordML representation of Figure 5-1 is shown in Example 5-3. It contains two paragraphs (p elements). The first paragraph has a pPr child that identifies properties of the paragraph, namely that the style is “Heading2”. It then contains a text run (r element), which contains a single text element (t).
Example 5-3. WordML paragraphs (article WordML.xml)
<w:p>
  <w:pPr>
    <w:pStyle w:val="Heading2"/>
  </w:pPr>
  <w:r><w:t>A great month!</w:t></w:r>
</w:p>
<w:p>
  <w:r><w:t>This month's figures are a </w:t></w:r>
  <w:r>
    <w:rPr>
      <w:i/>
    </w:rPr>
    <w:t>huge</w:t>
  </w:r>
  <w:r>
    <w:t> improvement over this month last year. We sold 1,342 widgets for a total revenue of $14,327.</w:t>
  </w:r>
</w:p>
The second paragraph contains three text runs (w:r elements). As the word “huge” is in italics, it must have its own text run with its own properties (the w:rPr element) that specify the italics (the w:i element).
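Because the text of a paragraph is split across runs, a program that wants the plain text has to stitch the runs back together. The following sketch, which assumes Python's standard xml.etree.ElementTree module and a shortened copy of the second paragraph, shows one way to do that and to pick out the italic runs. The helper names paragraph_text and italic_runs are ours.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}

# Shortened copy of the second paragraph of Example 5-3.
SAMPLE = f"""<w:body xmlns:w="{W}">
  <w:p>
    <w:r><w:t>This month's figures are a </w:t></w:r>
    <w:r><w:rPr><w:i/></w:rPr><w:t>huge</w:t></w:r>
    <w:r><w:t> improvement over this month last year.</w:t></w:r>
  </w:p>
</w:body>"""


def paragraph_text(p):
    """Concatenate the text elements of every run that is a child of p."""
    return "".join(t.text or "" for t in p.findall("w:r/w:t", NS))


def italic_runs(p):
    """Return the text of each run whose properties contain a w:i element."""
    return ["".join(t.text or "" for t in r.findall("w:t", NS))
            for r in p.findall("w:r", NS)
            if r.find("w:rPr/w:i", NS) is not None]


body = ET.fromstring(SAMPLE)
for p in body.findall("w:p", NS):
    print(paragraph_text(p), italic_runs(p))
```

Note that the sketch looks only at runs that are direct children of the paragraph; runs nested in other elements, such as hyperlinks, would need extra handling.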
Bulleted and numbered lists are common in Word documents. In WordML, list items are simply paragraphs that refer to a list ID in their properties. The list ID corresponds to a list defined in the lists section of the document.
For example, suppose Doug wanted to list the identifying elements of his article in a bulleted list, as shown in Figure 5-2. The corresponding WordML would look like Example 5-4.
Example 5-4. WordML list
<w:p>
  <w:pPr>
    <w:listPr><w:ilvl w:val="0"/><w:ilfo w:val="2"/></w:listPr>
  </w:pPr>
  <w:r><w:t>Title: Sales Update</w:t></w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:listPr><w:ilvl w:val="0"/><w:ilfo w:val="2"/></w:listPr>
  </w:pPr>
  <w:r><w:t>Author: Doug Jones</w:t></w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:listPr><w:ilvl w:val="0"/><w:ilfo w:val="2"/></w:listPr>
  </w:pPr>
  <w:r><w:t>Date: February 3, 2004</w:t></w:r>
</w:p>
Each paragraph properties (pPr) element contains a list properties (listPr) element, which in turn has two children:
The ilvl element indicates the level of the item in the list, starting with zero. If a list contains items at different outline levels, this element distinguishes them.
The ilfo element associates the paragraph with a specific list. The number specified in its val attribute is an ID that corresponds to the ilfo attribute of a list element in the lists section.
The lists element of the same document appears in Example 5-5. Notice that it has two types of children. The listDef element defines various properties of the list, such as the style used and a unique identifier. The list element has only a unique identifier and the link to a listDef element through its ilst child. The many levels of definitions for lists are due to the complexity of starting and stopping the numbering for numbered lists.
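Since list items are ordinary paragraphs, a program recognizes them only by their listPr properties. As a rough sketch (Python standard library; the function name list_items and the trimmed sample are ours), the paragraphs of a body can be grouped by list ID like this:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}
VAL = f"{{{W}}}val"  # the namespace-qualified w:val attribute name

# Two list items from Example 5-4 plus one ordinary paragraph (illustrative).
SAMPLE = f"""<w:body xmlns:w="{W}">
  <w:p><w:pPr><w:listPr><w:ilvl w:val="0"/><w:ilfo w:val="2"/></w:listPr></w:pPr>
    <w:r><w:t>Title: Sales Update</w:t></w:r></w:p>
  <w:p><w:pPr><w:listPr><w:ilvl w:val="0"/><w:ilfo w:val="2"/></w:listPr></w:pPr>
    <w:r><w:t>Author: Doug Jones</w:t></w:r></w:p>
  <w:p><w:r><w:t>Not a list item</w:t></w:r></w:p>
</w:body>"""


def list_items(body):
    """Map each list ID (ilfo value) to its (level, text) pairs, in order."""
    lists = defaultdict(list)
    for p in body.findall("w:p", NS):
        list_pr = p.find("w:pPr/w:listPr", NS)
        if list_pr is None:
            continue  # an ordinary paragraph, not a list item
        ilfo = list_pr.find("w:ilfo", NS).get(VAL)
        level = int(list_pr.find("w:ilvl", NS).get(VAL))
        text = "".join(t.text or "" for t in p.findall("w:r/w:t", NS))
        lists[ilfo].append((level, text))
    return dict(lists)


print(list_items(ET.fromstring(SAMPLE)))
```

The sketch assumes every listPr has both an ilvl and an ilfo child, as in Example 5-4; resolving the ID against the lists section is left out.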
The structure of WordML tables (Model 5-3) is very similar to XHTML tables, so if you are familiar with HTML you have a head start. A table element (tbl) can appear anywhere a paragraph can appear, namely as a child of body.
Model 5-3. WordML table structure
Table (tbl)
  [1..1] Table Properties (tblPr)
  [1..1] Table Grid (tblGrid)
    [1..*] Table Grid Column (gridCol)
  [0..*] Row (tr)
    [0..1] Row Properties (trPr)
    [1..*] Cell (tc)
      [1..1] Cell Properties (tcPr)
      [0..*] Tables (tbl)
      [1..*] Paragraphs (p)
The table properties element (tblPr) is used to specify the properties of the table, such as the style used, the cell spacing, and the borders. The element is required, but none of its children (which set the individual properties) is required, so it is possible to have an empty tblPr element. All of the settings have defaults, which are used in case they are not specified.
The table grid element (tblGrid) is used to set the column widths. For each column in the table it contains a gridCol element with a w attribute that specifies the column width in twips (twentieths of a point). The tblGrid element and its gridCol children are required.
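Twips are easy to convert once you recall that a point is 1/72 of an inch, so that 1440 twips make one inch. A small sketch of the arithmetic in Python (the function names are ours):

```python
TWIPS_PER_POINT = 20   # a twip is one twentieth of a point
POINTS_PER_INCH = 72   # the usual typographic point


def twips_to_points(twips):
    return twips / TWIPS_PER_POINT


def twips_to_inches(twips):
    return twips / (TWIPS_PER_POINT * POINTS_PER_INCH)


# The three column widths that appear in Example 5-6.
for width in (828, 1620, 1440):
    print(width, "twips =", twips_to_points(width), "pt =",
          twips_to_inches(width), "in")
```

The third column of Example 5-6, at 1440 twips, is therefore exactly one inch wide.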
Each row in the table is represented by a tr element. Each tr element has an optional properties child, trPr, and one or more cells, represented by tc elements. Each tc may itself have a properties child, tcPr, and must contain one or more tables (tbl) or paragraphs (p). The last child of the tc must always be a paragraph rather than another table.
Suppose that Doug wants to display sales data in a table. The table shown in Example 5-6 will look like Figure 5-3 when shown in Word.
Example 5-6. WordML table
<w:tbl>
  <w:tblGrid>
    <w:gridCol w:w="828"/>
    <w:gridCol w:w="1620"/>
    <w:gridCol w:w="1440"/>
  </w:tblGrid>
  <w:tr>
    <w:tc>
      <w:p>
        <w:pPr><w:pStyle w:val="Heading3"/></w:pPr>
        <w:r><w:t>Q</w:t></w:r>
      </w:p>
    </w:tc>
    <w:tc>
      <w:p>
        <w:pPr><w:pStyle w:val="Heading3"/></w:pPr>
        <w:r><w:t>Revenue</w:t></w:r>
      </w:p>
    </w:tc>
    <w:tc>
      <w:p>
        <w:pPr><w:pStyle w:val="Heading3"/></w:pPr>
        <w:r><w:t>Profit</w:t></w:r>
      </w:p>
    </w:tc>
  </w:tr>
  <w:tr>
    <w:tc><w:p><w:r><w:t>1</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>$14,332.35</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>$2,115.12</w:t></w:r></w:p></w:tc>
  </w:tr>
  <w:tr>
    <w:tc><w:p><w:r><w:t>2</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>$13,224.22</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>$1,655.51</w:t></w:r></w:p></w:tc>
  </w:tr>
  <w:tr>
    <w:tc><w:p><w:r><w:t>3</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>$14,778.26</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>$2,243.98</w:t></w:r></w:p></w:tc>
  </w:tr>
  <w:tr>
    <w:tc><w:p><w:r><w:t>4</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>$17,455.15</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>$2,988.22</w:t></w:r></w:p></w:tc>
  </w:tr>
</w:tbl>
For more complex tables, you can use the many table formatting features of Word, such as vertical and horizontal merge, and borders and shading. You can even include tables within other tables, as we saw.
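The regular row and cell structure makes a simple WordML table straightforward to read back into a program. The sketch below (Python standard library; the helper names cell_text and table_rows are ours, and the sample is a two-column cut of Example 5-6) returns the table as a list of rows. It does not attempt to handle merged cells or nested tables.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}

# The first two columns and rows of Example 5-6, without styles.
SAMPLE = f"""<w:tbl xmlns:w="{W}">
  <w:tblGrid><w:gridCol w:w="828"/><w:gridCol w:w="1620"/></w:tblGrid>
  <w:tr>
    <w:tc><w:p><w:r><w:t>Q</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>Revenue</w:t></w:r></w:p></w:tc>
  </w:tr>
  <w:tr>
    <w:tc><w:p><w:r><w:t>1</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>$14,332.35</w:t></w:r></w:p></w:tc>
  </w:tr>
</w:tbl>"""


def cell_text(tc):
    """Join the paragraphs directly inside a cell, one line per paragraph."""
    return "\n".join(
        "".join(t.text or "" for t in p.findall("w:r/w:t", NS))
        for p in tc.findall("w:p", NS))


def table_rows(tbl):
    """Return the table as a list of rows, each a list of cell strings."""
    return [[cell_text(tc) for tc in tr.findall("w:tc", NS)]
            for tr in tbl.findall("w:tr", NS)]


for row in table_rows(ET.fromstring(SAMPLE)):
    print(row)
```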
An image embedded in a Word document is represented in WordML by a pict element. Each pict element contains a Vector Markup Language (VML) description of the shape, location and size of the image, and the image data itself in base64Binary datatype format.
As with other Word components, the best way to include an image in a generated WordML document is to create a Word document that contains the image in the desired location and size, and save it as WordML. You can then copy the pict element from the saved WordML document and place it in your XSLT stylesheet.
A hyperlink is represented in WordML by an hlink element. Example 5-7 shows a paragraph that has an embedded hyperlink.
Example 5-7. Hyperlink in WordML
<w:p>
  <w:r>
    <w:t>More information on the new marketing plan can be found at </w:t>
  </w:r>
  <w:hlink w:dest="http://www.xmlinoffice.com/mkplan">
    <w:r>
      <w:rPr><w:rStyle w:val="Hyperlink"/></w:rPr>
      <w:t>http://www.xmlinoffice.com/mkplan</w:t>
    </w:r>
  </w:hlink>
  <w:r>
    <w:t>. </w:t>
  </w:r>
</w:p>
The hlink element is contained directly within the p element, rather than within a text run. In fact, it contains its own text run for the hyperlink text that appears when the document is presented, as in Figure 5-4. The dest attribute of the hlink element specifies the linked URL.
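Because the link target is an attribute and the link text sits in a nested run, both can be collected with a few lines of code. This is a sketch under the same assumptions as before (Python standard library, a helper name of our own choosing, and a shortened copy of Example 5-7):

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}
DEST = f"{{{W}}}dest"  # the namespace-qualified w:dest attribute name

# Shortened copy of Example 5-7.
SAMPLE = f"""<w:p xmlns:w="{W}">
  <w:r><w:t>More information can be found at </w:t></w:r>
  <w:hlink w:dest="http://www.xmlinoffice.com/mkplan">
    <w:r><w:rPr><w:rStyle w:val="Hyperlink"/></w:rPr>
      <w:t>http://www.xmlinoffice.com/mkplan</w:t></w:r>
  </w:hlink>
</w:p>"""


def hyperlinks(p):
    """Return (destination, link text) for each hlink child of a paragraph."""
    return [(h.get(DEST),
             "".join(t.text or "" for t in h.findall("w:r/w:t", NS)))
            for h in p.findall("w:hlink", NS)]


print(hyperlinks(ET.fromstring(SAMPLE)))
```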
There are four kinds of style in Word:
A character style applies to a data character string within a paragraph.
A paragraph style applies to an entire paragraph.
A table style has special settings relating to tables, such as background color and justification.
A list style has special settings related to lists, such as the bullet or numbering used.
There are quite a few different properties of a style, ranging from character properties, such as font and size, to paragraph properties, such as indentation and tab settings. Any style setting that can be specified in Word can also be expressed in WordML.
The styles element that appears before the body contains all the information about the styles used in the document. Each style element has a unique name that is specified in its styleId attribute. The text in the body of the document then refers to these styles by name.
In Example 5-3, the first paragraph refers to the style whose name is “Heading2”. The style element for Heading2 is shown in Example 5-8.
Example 5-8. WordML style (article WordML.xml)
<w:style w:type="paragraph" w:styleId="Heading2">
  <w:name w:val="heading 2"/>
  <w:basedOn w:val="Normal"/>
  <w:next w:val="Normal"/>
  <w:rsid w:val="CF4316"/>
  <w:pPr>
    <w:pStyle w:val="Heading2"/>
    <w:spacing w:before="240" w:after="60"/>
  </w:pPr>
  <w:rPr>
    <w:rFonts w:ascii="Arial" w:h-ansi="Arial" w:cs="Arial"/>
    <w:b/>
    <w:b-cs/>
    <w:kern w:val="48"/>
    <w:sz w:val="48"/>
    <w:sz-cs w:val="48"/>
  </w:rPr>
</w:style>
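A program that processes WordML will often want to look styles up by the styleId that paragraphs refer to. The following sketch builds such an index with Python's standard library. The function name style_index is ours, and the Normal style in the sample is an assumed stand-in, since its real definition is not shown in this chapter.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}


def q(local):
    """Return the namespace-qualified form of a WordML attribute name."""
    return f"{{{W}}}{local}"


# Heading2 is cut down from Example 5-8; the Normal entry is an assumption.
SAMPLE = f"""<w:styles xmlns:w="{W}">
  <w:style w:type="paragraph" w:styleId="Normal">
    <w:name w:val="Normal"/>
  </w:style>
  <w:style w:type="paragraph" w:styleId="Heading2">
    <w:name w:val="heading 2"/>
    <w:basedOn w:val="Normal"/>
  </w:style>
</w:styles>"""


def style_index(styles):
    """Map each styleId to its type, display name and parent style (if any)."""
    index = {}
    for s in styles.findall("w:style", NS):
        name = s.find("w:name", NS)
        based_on = s.find("w:basedOn", NS)
        index[s.get(q("styleId"))] = {
            "type": s.get(q("type")),
            "name": name.get(q("val")) if name is not None else None,
            "basedOn": based_on.get(q("val")) if based_on is not None else None,
        }
    return index


print(style_index(ET.fromstring(SAMPLE))["Heading2"])
```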
Fortunately, there is no need to learn all the WordML elements for the style settings you need. Attempting to construct WordML style definitions by hand would be a tedious, trial-and-error process. Because Word already provides a user-friendly front-end for defining styles, you should use Word itself to create a document that has all the styles you want to use.
You can save that document as WordML using the procedure described in 5.1.2, “Saving a Word document as WordML”, on page 89. The result is a WordML document that contains all the styles you need. You can then copy the styles section of that document (and the lists section if needed).
This is a good approach not just for paragraph styles, but also for character styles. For example, if you wish to italicize a word in the middle of a sentence, you could do this using the i property for the text run, as shown in Example 5-3. However, it is sometimes difficult to remember the names of all the different properties that can be applied to text.
Using Word, you can create a character style for italics named, for example, “emphasis”. Any text that should be italicized because it should be emphasized can then refer to that style, rather than using the i property. In effect you are using the principles of generalized markup for style names, just as you do for XML element-type names.
As with XML, this approach to style definitions has the added benefit of making it easy to apply a change to all text of that type. For example, if you use italics for both emphasized words and citations, you can create two styles: “emphasis” and “citation”. If later, you decide you want to put citations in a different font, you can simply change the “citation” style rather than having to change the font of some but not all of the italicized text.
As we saw in 4.6, “Saving a document”, on page 79, WordML can be interspersed with other vocabularies. When a Word document associated with a schema is saved as XML, by default the saved file contains both WordML elements and elements of the associated schema.
For example, saving an article document as XML results in a document that contains elements from the article schema interspersed with WordML elements, as shown in Example 5-9.
Example 5-9. article/WordML mixture (article data and WordML.xml)
<ns0:section>
  <ns0:header>
    <w:p>
      <w:pPr>
        <w:pStyle w:val="Heading2"/>
      </w:pPr>
      <w:r>
        <w:t>A great month!</w:t>
      </w:r>
    </w:p>
  </ns0:header>
  <ns0:para>
    <w:p>
      <w:r>
        <w:t>This month's figures are a </w:t>
      </w:r>
      <ns0:em>
        <w:r>
          <w:rPr>
            <w:i/>
          </w:rPr>
          <w:t>huge</w:t>
        </w:r>
      </ns0:em>
      <w:r>
        <w:t> improvement over this month last year. We sold 1,342 widgets for a total revenue of $14,327.</w:t>
      </w:r>
    </w:p>
  </ns0:para>
</ns0:section>
Namespaces are used to distinguish between the two vocabularies. The WordML elements use the w prefix, as in w:p and w:t. The article vocabulary uses the ns0 prefix, as in ns0:section and ns0:header.
Because of the hierarchical structure of XML, an element from the article schema must always contain one or more entire WordML paragraphs, or be contained in a WordML paragraph. It is not possible for it to span part of one WordML paragraph plus part of the next paragraph.
In addition, each element from the article schema must contain its own text run. It cannot be included as a child of a text run element (r), nor as a child of a text element (t). For example, the following text run is illegal:
<w:r><w:t><ns0:title>Sales Update</ns0:title></w:t></w:r>
Instead, the ns0:title element should be moved out to contain the w:r element, as in:
<ns0:title><w:r><w:t>Sales Update</w:t></w:r></ns0:title>
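The rule that foreign elements may not appear inside a text run or text element can be checked mechanically. The sketch below is one hedged way to do so in Python. The function name misplaced_elements is ours, and we assume that the ns0 prefix is bound to http://xmlinoffice.com/article, the namespace used for the article vocabulary in the stylesheet later in this chapter. It treats every element outside the WordML namespace as foreign, so auxiliary Word namespaces would need to be allowed explicitly in real use.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
ART = "http://xmlinoffice.com/article"  # assumed binding for the ns0 prefix
W_PREFIX = "{%s}" % W


def misplaced_elements(root):
    """Return the tags of non-WordML elements that are children of w:r or w:t."""
    run, text = W_PREFIX + "r", W_PREFIX + "t"
    bad = []
    for parent in root.iter():  # the root itself and all its descendants
        if parent.tag in (run, text):
            bad.extend(child.tag for child in parent
                       if not child.tag.startswith(W_PREFIX))
    return bad


ILLEGAL = (f'<w:r xmlns:w="{W}" xmlns:ns0="{ART}">'
           '<w:t><ns0:title>Sales Update</ns0:title></w:t></w:r>')
LEGAL = (f'<ns0:title xmlns:w="{W}" xmlns:ns0="{ART}">'
         '<w:r><w:t>Sales Update</w:t></w:r></ns0:title>')

print(misplaced_elements(ET.fromstring(ILLEGAL)))
print(misplaced_elements(ET.fromstring(LEGAL)))
```

The first call reports the title element, while the second returns an empty list.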
Combining WordML with other vocabularies allows all of the Word formatting and other information to be retained, so that the document can be reopened in Word and the styles and settings will be intact. This is useful if you or other users will continue to edit the document in Word.
However, a mixed document is not valid according to the article schema. If you need an article document to pass to an application that is expecting it, just save the document as XML with the Save data only box checked.
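To see how the data relates to the mixed document, it can help to strip the WordML wrappers away by program. The following Python sketch is only an illustration of that relationship. It is not how Word implements Save data only, the function names are ours, and the ns0 binding is again an assumption. It keeps the non-WordML elements and the characters found in t elements, and discards everything else.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
ART = "http://xmlinoffice.com/article"  # assumed binding for the ns0 prefix
W_PREFIX = "{%s}" % W


def _content(elem):
    """Yield strings and foreign elements found beneath elem, dropping WordML."""
    for child in elem:
        if child.tag == W_PREFIX + "t":
            yield child.text or ""            # data characters
        elif child.tag in (W_PREFIX + "pPr", W_PREFIX + "rPr"):
            continue                          # formatting only, no data
        elif child.tag.startswith(W_PREFIX):
            yield from _content(child)        # unwrap w:p, w:r and so on
        else:
            yield data_only(child)            # keep the foreign element


def data_only(elem):
    """Copy a non-WordML element, keeping only text and non-WordML descendants."""
    copy = ET.Element(elem.tag, dict(elem.attrib))
    last = None
    for item in _content(elem):
        if isinstance(item, str):
            if last is None:
                copy.text = (copy.text or "") + item
            else:
                last.tail = (last.tail or "") + item
        else:
            copy.append(item)
            last = item
    return copy


MIXED = (f'<ns0:para xmlns:w="{W}" xmlns:ns0="{ART}"><w:p>'
         '<w:r><w:t>This month\'s figures are a </w:t></w:r>'
         '<ns0:em><w:r><w:rPr><w:i/></w:rPr><w:t>huge</w:t></w:r></ns0:em>'
         '<w:r><w:t> improvement.</w:t></w:r>'
         '</w:p></ns0:para>')

result = data_only(ET.fromstring(MIXED))
print(ET.tostring(result, encoding="unicode"))
```

The result is a para element with mixed content: the leading text, an em child containing “huge”, and the trailing text.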
WordML is very rarely created by hand, because it is much easier for a person to format a document with Word than to compose the equivalent WordML representation. But formatting each document manually in Word is practical only for one-off tasks.
If you want to create Word renditions of multiple XML documents of the same document type, it is more efficient to generate WordML using XSLT. This technique also allows data from other sources to be incorporated into the Word documents. Moreover, multiple views of the same data (e.g. for different classes of user) can be created using several different transformations.
This section provides an overview of the creation and use of stylesheets for Microsoft Word. For more information on XSLT stylesheets, please see Chapter 18, “XSL Transformations (XSLT)”, on page 392.
Each desired rendition is expressed as an XSLT stylesheet that is associated with a particular schema in the Word Schema Library. An XSLT stylesheet contains XSLT instruction elements (usually prefixed with xsl:) that select elements and attributes from the source document. The instructions are interleaved with elements from other namespaces that are to appear in the result document.
For use with Word, a stylesheet must transform documents from your source vocabulary (e.g. article) to WordML. As we have seen, this can be a challenging task, since WordML is quite complex and not entirely intuitive. But fortunately, as we have also seen, you can copy most of the complex parts from an existing WordML document.
If you have (or create) a Word document that has the settings you want to use – for example the styles and page margins – you can simply save that document as WordML. You can then paste the beginning of the document, which contains all the settings, into your stylesheet.
Worldwide Widget maintains an archive of its newsletter articles in XML so they can easily be reused. Authors frequently access the archive to rework old articles for new issues of the newsletter. The company has implemented a stylesheet that will transform an XML article (the source document) into a WordML/article combination (the result document).
Example 5-10 shows the general structure of the stylesheet (named article_view.xsl) that accomplishes this.
Example 5-10. Article stylesheet general structure (article_view.xsl)
<xsl:stylesheet version="1.0" xml:space="preserve"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    xmlns:art="http://xmlinoffice.com/article"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <w:wordDocument xml:space="preserve">
      <w:lists>
        <!--taken from Word-->
      </w:lists>
      <w:styles>
        <!--taken from Word-->
      </w:styles>
      <w:docPr>
        <!--taken from Word-->
      </w:docPr>
      <w:body>
        <xsl:apply-templates select="/art:article"/>
      </w:body>
    </w:wordDocument>
  </xsl:template>
  <!--rest of template rules here-->
</xsl:stylesheet>
Rather than start from scratch with the styles, we took the lists and styles elements from a Word document that had styles defined to our liking, as was recommended in 5.1.6.2, “Generating WordML style definitions”, on page 101.[4]
The body element contains the xsl:apply-templates instruction to apply the correct template rule to the source elements whose element-type name is article. (Only one such element is allowed by the article schema.)[5]
The article_view.xsl stylesheet has a template rule for every article schema element type. The template rule that matches the root element, article, is shown in Example 5-11. It inserts WordML paragraphs (p elements) for the title, author and date. It also applies other template rules that transform the article element’s children.
Example 5-11. Template rule for article schema element type (article_view.xsl)
<xsl:template match="art:article">
  <xsl:copy>
    <w:p>
      <w:pPr>
        <w:pStyle w:val="Heading1"/>
      </w:pPr>
      <xsl:apply-templates select="art:title"/>
    </w:p>
    <w:p>
      <xsl:apply-templates select="art:author"/>
    </w:p>
    <w:p>
      <w:pPr>
        <w:pBdr>
          <w:bottom w:val="single" w:sz="6" w:space="22" w:color="auto"/>
        </w:pBdr>
      </w:pPr>
      <xsl:apply-templates select="art:date"/>
    </w:p>
    <xsl:apply-templates select="art:body"/>
  </xsl:copy>
</xsl:template>
The xsl:copy instruction copies a source element so that it also appears in the resulting WordML; it does not copy child elements or attributes. If you want a pure WordML document with no elements from the article vocabulary, you can leave out the xsl:copy instructions in the templates.
The template rule shown in Example 5-12 is used to transform both author and date elements, since they are processed similarly. Their contents are simply included unchanged in a text element contained within a text run.
Example 5-12. Template rule for author and date element types (article_view.xsl)
<xsl:template match="art:author|art:date">
  <xsl:copy>
    <w:r>
      <w:rPr>
        <w:b/>
      </w:rPr>
      <w:t><xsl:value-of select="."/></w:t>
    </w:r>
  </xsl:copy>
</xsl:template>
A third template rule, for the section element type, is shown in Example 5-13. It includes a paragraph (p) for the header, then uses an XSLT for-each element to loop through the child paragraphs and process them individually.
Example 5-13. Template rule for section element type (article_view.xsl)
<xsl:template match="art:section">
  <xsl:copy>
    <w:p>
      <w:pPr>
        <w:pStyle w:val="Heading2"/>
      </w:pPr>
      <xsl:apply-templates select="art:header"/>
    </w:p>
    <xsl:for-each select="art:para">
      <xsl:apply-templates select="."/>
    </xsl:for-each>
  </xsl:copy>
</xsl:template>
The result of applying the stylesheet is a WordML/article combination that can be displayed in Word with or without the article tag icons, as shown in Figure 5-5.
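For readers who think more readily in a programming language than in XSLT, the logic of the template rules can be approximated in ordinary code. The sketch below is a loose Python analogue, not a replacement for article_view.xsl. It produces pure WordML (as if the xsl:copy instructions were left out), omits the lists, styles and docPr sections and the border under the date, and flattens em elements to plain text. The helper names are ours, and the sample assumes that sections are children of the article's body element.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
ART = "http://xmlinoffice.com/article"
A = {"art": ART}
ET.register_namespace("w", W)


def w(tag):
    """Return the namespace-qualified form of a WordML name."""
    return f"{{{W}}}{tag}"


def paragraph(parent, text, style=None, bold=False):
    """Append to parent a w:p that holds a single run containing text."""
    p = ET.SubElement(parent, w("p"))
    if style:
        ppr = ET.SubElement(p, w("pPr"))
        ET.SubElement(ppr, w("pStyle"), {w("val"): style})
    r = ET.SubElement(p, w("r"))
    if bold:
        ET.SubElement(ET.SubElement(r, w("rPr")), w("b"))
    ET.SubElement(r, w("t")).text = text
    return p


def article_to_wordml(article):
    """Build a WordML tree from an article element (styles not included)."""
    root = ET.Element(w("wordDocument"))
    body = ET.SubElement(root, w("body"))
    paragraph(body, article.findtext("art:title", "", A), style="Heading1")
    paragraph(body, article.findtext("art:author", "", A), bold=True)
    paragraph(body, article.findtext("art:date", "", A), bold=True)
    for section in article.findall("art:body/art:section", A):
        paragraph(body, section.findtext("art:header", "", A), style="Heading2")
        for para in section.findall("art:para", A):
            paragraph(body, "".join(para.itertext()))  # em becomes plain text
    return root


# A small article instance; its exact structure is an assumption.
SAMPLE = f"""<article xmlns="{ART}">
  <title>Sales Update</title><author>Doug Jones</author>
  <date>February 3, 2004</date>
  <body><section><header>A great month!</header>
    <para>This month's figures are a <em>huge</em> improvement.</para>
  </section></body>
</article>"""

result = article_to_wordml(ET.fromstring(SAMPLE))
print(ET.tostring(result, encoding="unicode"))
```

The XSLT version remains the better fit for Word, because only a stylesheet can be registered as a solution in the Schema Library.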
This section explains how to use our newly-created stylesheet (article_view.xsl) with Word.
Stylesheets are associated with schemas using the Schema Library, where they are known as solutions. Multiple stylesheets can be associated with the same schema. First, let’s associate the article_view.xsl stylesheet with the article schema. To do this:
On the Tools menu, click Templates and Add-Ins.
Click the XML Schema tab.
Click Schema Library.
Select the article schema.
Click Add Solution.
Browse to the location of the article_view.xsl document and select it.
This will bring up the Solution Settings dialog shown in Figure 5-6.
The default type is XSL Transformation, which is what we want in this case.
In the Alias box, enter a nickname for the solution, such as “Article View”, and click OK.
The solution (stylesheet) now appears in the bottom half of the Schema Library dialog, as shown in Figure 5-7.
It is possible in the Schema Library dialog to specify a particular solution as the default. To do this, select a solution from the Default solution list. The default stylesheet is applied whenever a document associated with that schema is opened.
There are three ways to choose the stylesheet that is applied to a document: by default, while opening the document, and after opening the document but before editing it.
Opening a document whose schema is associated with a default stylesheet will result in that stylesheet being applied automatically. For example, now, when we open article.xml, the article_view.xsl stylesheet is automatically applied, as shown in Figure 5-8.
If you previously had the Show XML tags in document box checked, you will have to uncheck it in order to see the document rendered according to the stylesheet. You can do this on the XML Structure task pane.
If there is no default stylesheet for a schema, or if you wish to use a different stylesheet, you can choose one while opening a document. To do this:
The XML Document task pane lists all the associated stylesheets, as well as the generic Data only rendition that is used as a default when no other stylesheet is available. You can check out the different renditions of the document simply by selecting them from this list.
You can also click Browse to look for an additional XSLT stylesheet to apply.
You can choose between stylesheets only until you have begun editing the document. Once you change the document in any way, the XML Document task pane closes and you lose the ability to change to a different rendition of the document.
Styles are not necessarily applied to newly added elements. For example, if you insert a new section in your article, its header element will not automatically be given the Heading2 style like the original section headers. However, if you save the document using Save data only, then reopen it, the stylesheet will format it appropriately.
You can apply a stylesheet when you are saving a document. This may seem confusing, since styles are generally associated with the way a document is formatted and presented rather than the way it is stored.
However, XSLT stylesheets are not just for formatting; you can also use them to transform XML documents from one vocabulary to another. This feature is useful if, for example, you want to allow the user to work with a small, simple vocabulary, but the documents need to be transformed into a more complex vocabulary to be used by another process.
To specify a stylesheet when saving a document:
[1] A reference guide that covers the entire WordML vocabulary is included with the Microsoft Word XML Content Development Kit that can be downloaded from the MSDN library at: http://msdn.microsoft.com
[2] We saw in 4.6, “Saving a document”, on page 79 how to save a document that is associated with a schema, using that schema alone. Later we will see how to save it using a combination of its own schema and WordML.
[3] Some whitespace was added to all examples to make them more readable.
[4] The comment <!--taken from Word--> appears instead of the actual definitions to reduce the size of the example; for a full listing, see article_view.xsl.
[5] The xsl:template instruction element is actually a template rule; its content is the template. The xsl:apply-templates instruction actually applies template rules, which include match patterns as well as templates.