Chapter 8. XML to HTML

That was a surprise to me—that people were prepared to painstakingly write HTML.

Tim Berners-Lee

If I had to hazard a guess, I would say that at least 60% of the HTML delivered over the Internet today is at least partially generated. This is not because HTML is painstakingly hard to write, as Tim Berners-Lee states in the opening quotation (it is, but now we have fancy HTML editors), but because dynamically generated HTML allows you to do so much more.

There are many open and proprietary technologies for delivering HTML content from data stored in other forms. However, when the data is in XML, XSLT is one of the most important tools of which web authors should be aware.

You can use XSLT to generate HTML in three basic ways.

First, XSLT can transform XML into HTML and statically store the generated HTML on a web server or hard drive for delivery to a browser. This is also a good way to test such transformations.

Second, you can use XSLT as a server-side scripting solution in which XML extracted from flat files or databases is dynamically transformed by the web server as requested by the client browser. This solution is necessary when the underlying data changes frequently. However, sometimes a hybrid solution is used in which HTML is constructed on demand, but then cached on the server to avoid the need for subsequent transformations as long as the underlying data does not change.

Third, you can use XSLT as a client-side stylesheet, provided the browser supports XSLT processing. At this time, only the latest versions of Microsoft Internet Explorer (Version 6.0) and Netscape Navigator (6.1) support XSLT right out of the box. Older versions of IE require installation of MSXML 3.0 in replacement mode. In addition, X-Smiles (http://www.x-smiles.org/) and the Antenna House XSL Formatter (http://www.antennahouse.com/) perform client-side XSLT processing and display the result. X-Smiles can handle all sorts of results, including SVG and XSL-FO, although its HTML handling is not perfect. The Antenna House XSL Formatter handles XSL-FO. Because the state of the world changes rapidly in this area, you should check the latest online documentation for your favorite browser or browser add-in.

Using XSLT as a Styling Language

Problem

You want the browser to dynamically stylize an XML document into HTML.

Solution

Here is an example for publishing a snippet of a DocBook document in HTML using an XSLT stylesheet. The document source is a portion of this chapter:

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="application/xml" href="chapter.xsl"?>
<chapter label="8">
  <chapterinfo>
    <author>
      <surname>Mangano</surname>
      <firstname>Sal</firstname>
    </author>
    <copyright>
      <year>2002</year>
      <holder>O'Reilly</holder>
    </copyright>
  </chapterinfo>
  <title>XML to HTML</title>
  <epigraph>
    <para>That was a surprise to me - that people were prepared to painstakingly 
write HTML</para>
    <attribution>Tim Berners-Lee</attribution>
  </epigraph>
  <sect1>
    <title>Using XSLT as a Styling Language</title>
    <sect2>
      <title>Problem</title>
      <para>You want to use XSLT to stylize a XML document for dissemination via 
HTML.</para>
    </sect2>
    <sect2>
      <title>Solution</title>
      <para>Here we show an example for publishing a snippet of a DocBook document in 
HTML using a XSLT stylesheet. The document source is a portion of this 
chapter.</para>
    </sect2>
    <sect2>
      <title>Discussion</title>
      <para>DocBook is an example of a document centric DTD that enables you to 
author and store document content in a presentation-neutral form that captures the 
logical structure of the content. The beauty of authoring documents (especially 
technical ones) in a this form is that one can use XSLT to transform a single content 
specification into multiple delivery vehicles such as HTML, PDF, Microsoft Help files 
and Unix man pages. Although we present this recipe in terms of DocBook, the 
techniques are applicable to other public domain document schema or documents of your 
own creation. </para>
    </sect2>
  </sect1>
</chapter>

Notice that the second line of this document includes a processing instruction, xml-stylesheet. This instructs the browser to apply the following stylesheet to the XML and render the stylesheet’s output rather than the XML itself (remember, this instruction works only in the most recent browser versions):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  
  <xsl:template match="/">
    <html>
      <head>
        <xsl:apply-templates mode="head"/>
      </head>
      <!-- You may want to use styles in a CSS style element rather -->
      <!-- than hardcoding as I do here -->
      <body style="margin-left:100;margin-right:100;margin-top:50;margin-bottom:50">
        <xsl:apply-templates/> 
        <xsl:apply-templates select="chapter/chapterinfo/*" mode="copyright"/> 
      </body>
    </html>
  
  </xsl:template>
   
  <!-- Head -->
   
  <xsl:template match="chapter" mode="head">
    <xsl:apply-templates select="chapterinfo" mode="head" />
    <xsl:apply-templates select="title" mode="head" />
  </xsl:template>
   
  
  <xsl:template match="chapter/title" mode="head">
        <title><xsl:value-of select="."/></title>
  </xsl:template>
   
  <xsl:template match="author" mode="head">
        <meta name="author" content="{concat(firstname,' ', surname)}"/>
  </xsl:template>
   
  <xsl:template match="copyright" mode="head">
        <meta name="copyright" content="{concat(holder,' ',year)}"/>
  </xsl:template>
   
  <xsl:template match="text()" mode="head"/>
   
<!-- Body -->
  
  <xsl:template match="chapter">
    <div align="right"
         style="font-size : 48pt; font-family: Times serif; font-weight : bold; padding-bottom:10; color:red"><xsl:value-of select="@label"/></div>
    <xsl:apply-templates/>
  </xsl:template>  
   
  <xsl:template match="chapter/title">
    <div align="right"
         style="font-size : 24pt; font-family: Times serif; padding-bottom:150; color:red"><xsl:value-of select="."/></div>
  </xsl:template>
   
  <xsl:template match="epigraph/para">
    <div align="right"
         style="font-size : 10pt; font-family: Times serif; font-style : italic; padding-top:4; padding-bottom:4"><xsl:value-of select="."/></div>
  </xsl:template>
   
  <xsl:template match="epigraph/attribution">
    <div align="right"
         style="font-size : 10pt; font-family: Times serif; padding-top:4; padding-bottom:4"><xsl:value-of select="."/></div>
  </xsl:template>
  
  
  <xsl:template match="sect1">
    <h1 style="font-size : 18pt; font-family: Times serif; font-weight : bold">
      <xsl:value-of select="title"/>
    </h1>
    <xsl:apply-templates/>
  </xsl:template>
   
  <xsl:template match="sect2">
    <h2 style="font-size : 14pt; font-family: Times serif; font-weight : bold">
    <xsl:value-of select="title"/>
    </h2>
     <xsl:apply-templates/>
  </xsl:template>
  
  <xsl:template match="para">
    <p style="font-size : 12pt; font-family: Times serif">
      <xsl:value-of select="."/>
    </p>
  </xsl:template>
   
  <xsl:template match="text()"/>
   
<xsl:template match="copyright" mode="copyright">
  <div style="font-size : 10pt; font-family: Times serif; padding-top : 100">
    <xsl:text>Copyright </xsl:text>
    <xsl:value-of select="holder"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="year"/>
    <xsl:text>. All rights reserved.</xsl:text>
  </div>
</xsl:template>   
   
<xsl:template match="*" mode="copyright"/>
   
</xsl:stylesheet>

Ultimately, the browser sees the following HTML:

<html>
   <head>
      <meta name="author" content="Sal Mangano">
      <meta name="copyright" content="O'Reilly 2002">
      <title>XML to HTML</title>
   </head>
   <body style="margin-left:100;margin-right:100;margin-top:50;margin-bottom:50">
      <div align="right" style="font-size : 48pt; font-family: Times serif; font-weight : bold; padding-bottom:10; color:red">8</div>
      <div align="right" style="font-size : 24pt; font-family: Times serif; padding-bottom:150; color:red">XML to HTML</div>
      <div align="right" style="font-size : 10pt; font-family: Times serif; font-style : italic; padding-top:4; padding-bottom:4">That was a surprise to me - that people were prepared to painstakingly write HTML</div>
      <div align="right" style="font-size : 10pt; font-family: Times serif; padding-top:4; padding-bottom:4">Tim Berners-Lee</div>
      <h1 style="font-size : 18pt; font-family: Times serif; font-weight : bold">
Using XSLT as a Styling Language</h1>
      <h2 style="font-size : 14pt; font-family: Times serif; font-weight : bold">
Problem</h2>
      <p style="font-size : 12pt; font-family: Times serif">You want to use XSLT to 
stylize a XML document for dissemination via HTML.</p>
      <h2 style="font-size : 14pt; font-family: Times serif; font-weight : bold">
Solution</h2>
      <p style="font-size : 12pt; font-family: Times serif">Here we show an example 
for publishing a snippet of a DocBook document in HTML using a XSLT stylesheet. The 
document source
         is a portion of this chapter.
      </p>
      <h2 style="font-size : 14pt; font-family: Times serif; font-weight : bold">
Discussion</h2>
      <p style="font-size : 12pt; font-family: Times serif">DocBook is an example of 
a document centric DTD that enables you to author and store document content in a 
presentation-neutral
         form that captures the logical structure of the content. The beauty of 
authoring documents (especially technical ones) in
         a this form is that one can use XSLT to transform a single content 
specification into multiple delivery vehicles such as HTML,
         PDF, Microsoft Help files and Unix man pages. Although we present this 
recipe in terms of DocBook, the techniques are applicable
         to other public domain document schema or documents of your own creation. 
      </p>
      <div style="font-size : 10pt; font-family: Times serif; padding-top : 100">
Copyright O'Reilly 2002. All rights reserved.</div>
   </body>
</html>
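
The comment in the stylesheet's root template suggests moving the hardcoded styles into a CSS style element. The following is a minimal sketch of that idea, not the recipe's own code: it handles only the sect2 and para elements, and the class names (sect2, para) and the px units on the margins are assumptions introduced for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <head>
        <title><xsl:value-of select="chapter/title"/></title>
        <!-- Class names and px units are illustrative assumptions -->
        <style type="text/css">
          body { margin: 50px 100px; font-family: Times, serif; }
          h2.sect2 { font-size: 14pt; font-weight: bold; }
          p.para { font-size: 12pt; }
        </style>
      </head>
      <body>
        <!-- Only second-level sections are rendered in this sketch -->
        <xsl:apply-templates select="chapter/sect1/sect2"/>
      </body>
    </html>
  </xsl:template>

  <!-- Each sect2 becomes a heading followed by its paragraphs -->
  <xsl:template match="sect2">
    <h2 class="sect2"><xsl:value-of select="title"/></h2>
    <xsl:apply-templates select="para"/>
  </xsl:template>

  <xsl:template match="para">
    <p class="para"><xsl:value-of select="."/></p>
  </xsl:template>
</xsl:stylesheet>
```

Because curly braces are treated as attribute value templates only inside attribute values, the CSS rules in the style element pass through to the output unchanged. Keeping the presentation in one place also means that a change of font or spacing touches a single rule rather than every template.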

Discussion

DocBook is a document-centric DTD that enables you to author and store document content in a presentation-neutral form that captures the content’s logical structure. The beauty of authoring documents (especially technical ones) in this form is that you can use XSLT to transform a single content specification into multiple delivery vehicles such as HTML, PDF, Microsoft Help files, and Unix manpages. Although this recipe is presented in terms of DocBook, the techniques apply to other public-domain document schemas or to documents of your own creation.
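
To make the idea of multiple delivery vehicles concrete, here is a hedged sketch of a second stylesheet that turns the same chapter document into a plain-text outline instead of HTML. It is an illustration only; the indentation scheme and the "Chapter N." prefix are choices made for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Emit plain text rather than HTML -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:apply-templates select="chapter"/>
  </xsl:template>

  <!-- Chapter line: label, title, then a newline (&#10;) -->
  <xsl:template match="chapter">
    <xsl:value-of select="concat('Chapter ', @label, '. ', title, '&#10;')"/>
    <xsl:apply-templates select="sect1"/>
  </xsl:template>

  <!-- First-level sections are indented two spaces -->
  <xsl:template match="sect1">
    <xsl:value-of select="concat('  ', title, '&#10;')"/>
    <xsl:apply-templates select="sect2"/>
  </xsl:template>

  <!-- Second-level sections are indented four spaces -->
  <xsl:template match="sect2">
    <xsl:value-of select="concat('    ', title, '&#10;')"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the chapter document shown in the Solution, this stylesheet would emit the chapter title followed by an indented list of section titles. The source XML is untouched; only the stylesheet changes from one output format to the next.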

Because this example uses only a subset of the DocBook DTD, a simple monolithic stylesheet was convenient. However, an industrial-strength solution would modularize the handling of the many DocBook elements by using separate stylesheets and moded templates.
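
One possible shape for such a modular layout is sketched below. The file names (head.xsl, sections.xsl, copyright.xsl) are hypothetical and are assumed to hold the head-mode templates, the body templates, and the copyright-mode templates from the monolithic stylesheet, so this driver is a fragment that will not run until those modules exist.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical driver stylesheet that assembles separate modules -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- xsl:import elements must precede all other top-level elements -->
  <xsl:import href="head.xsl"/>       <!-- assumed: templates in mode="head" -->
  <xsl:import href="sections.xsl"/>   <!-- assumed: chapter, sect1, sect2, para -->
  <xsl:import href="copyright.xsl"/>  <!-- assumed: templates in mode="copyright" -->

  <xsl:output method="html"/>

  <!-- The driver keeps only the page skeleton and delegates by mode -->
  <xsl:template match="/">
    <html>
      <head>
        <xsl:apply-templates mode="head"/>
      </head>
      <body>
        <xsl:apply-templates/>
        <xsl:apply-templates select="chapter/chapterinfo/*" mode="copyright"/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Using xsl:import rather than xsl:include gives the driver's own templates higher import precedence, so a customization layer can override any imported template without editing the module that defines it.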

See Also

The best source for information about DocBook is http://www.docbook.org/. Norman Walsh has developed a set of open source stylesheets to convert DocBook into various publishing formats. These stylesheets are located at http://docbook.sourceforge.net/projects/xsl/.

Recipe 14.2 demonstrates several techniques for creating more modular and extensible stylesheets.
