Chapter 10. Code Generation

Good programmers write good code. Great programmers write programs to generate it.

Unknown

Automation is the holy grail of software development. In fact, much of the progress in software development is driven by the notion of code generation from some higher-level specification. After all, isn’t that what assemblers and compilers do? However, in another form of code generation, the target language is not executable machine code, but a high-level language such as Java or C++. Why would you want to generate code in this way, and what does XML have to do with it?

When you write programs, you essentially encode many kinds of knowledge into a very specific syntax that is optimized for one particular development life-cycle phase. It is difficult to leverage the work done in coding for other important development tasks because programming languages are difficult to parse and much of the interesting information is encoded in ad-hoc comments. Representing application knowledge in XML provides the opportunity for much greater leverage. From XML specifications, you can generate application code, test programs, documentation, and possibly even test data. This is not to say that XML gives you this for free. As with all software-development tasks, a great deal of planning and infrastructure building is required to reap the benefits.

This chapter is different from most other chapters in this book because most examples are components of a solution within the context of a particular application. The reason for this structure is two-fold.

First, it is unlikely that you would encode information in XML to generate code just because XML is cool. In most cases, a larger problem must be solved in which XML can be further leveraged. The examples in this section will make more sense if they are presented in the context of a larger problem.

Second, the particular problem is common in large-scale application development, so readers might find it interesting in its own right. However, even if this is not the case, the larger problem will not take away from the application of the concepts to other development tasks.

So what is this large problem?

Imagine a complex client-server application. Complex means that it consists of many types of server and client processes. These processes communicate via messages using message-oriented middleware (point-to-point, publish/subscribe, or both). IBM MQSeries, Microsoft Message Queuing (MSMQ), BEA Systems Tuxedo, and TIBCO Rendezvous are just a few of the many products in this space. In this example, the particular middleware product is not important. What is relevant is that all significant work performed by the system is triggered by the receipt of a message and the subsequent response involving one or more messages.[22] The message may contain XML (SOAP), non-XML text, or binary data. Chapter 12 covers SOAP in the context of WSDL. This chapter is primarily interested in server-to-server communication in which XML is used less often.

What is particularly daunting about such complex systems is that you cannot simply understand them by viewing the source code of any one particular type of process. You must begin by first understanding the conversations or inter-process messaging protocols spoken by these processes. This chapter goes even further and states that, at a first level of approximation, the details of each individual process are irrelevant. You can simply treat each process as a black box. Then, rather than understand the hundreds of thousands of lines of code that make up the entire system, you can start by understanding the smaller set of messages that these processes exchange.

Thus the question becomes, how do you go about understanding the interprocess language of a complex application? Can you go to a single place to get this information? Sadly, this is often not the case. I find that you can rarely find an up-to-date and complete specification of an application’s messaging protocols. You can usually find pieces of the puzzle in various shared header files and other pieces in design documents developed over the system’s life cycle, but rarely will you find a one-stop source for such vital information. And in many cases, the only truly reliable method of obtaining such information is to reverse-engineer it from the applications’ source code, which is exactly what I claimed you should not have to do!

Okay, so what does this problem have to do with XML, XSLT, and, in particular, code generation? You can describe the solution to this problem in terms of the need for a document that describes in complete detail an application’s interprocess messaging structure. What kind of document should this be? Maybe the developers should maintain an MS Word document describing all the messages or, better still, a messaging web site that can be browsed and searched. Or, maybe (and you should have guessed the answer already) the information should be kept in XML! Perhaps you should generate the web site from this XML. While you’re at it, maybe you should generate some of the code needed by the applications that process these messages. This is, in fact, exactly what you shall do in this chapter. I call the set of XML files an interprocess message repository. Many recipes in this chapter demonstrate how to generate code using this repository.

Before moving to the actual recipes, this chapter presents the repository’s design in terms of its schema. It uses W3C XML Schema (XSD) for this purpose but shows only an intuitive graphical view for those unfamiliar with XML Schema.

Figure 10-1 was produced using Altova’s XML Spy 4.0 (http://www.xmlspy.com). The icons with three dots (...) represent an ordered sequence. The icon that looks like a multiway switch represents a choice.

Figure 10-1. Graphical representation of XSD schema for repository

Although this schema is sufficient to illustrate interesting code-generation recipes, it is probably inadequate for an industrial-strength message repository. Additional data might be stored in a message repository, as shown in the following list:

  • Symbolic constants used in array and string sizes, as well as enumerated values

  • Information about more complex data representations, such as unions and type-name aliases (C typedefs)

  • Information about message protocols (complex sequences of messages exchanged by sets of processes to achieve a specific functionality)

  • Historical information such as authors, last changed by, change dates, etc.

  • Delivery and transport information related to publishers and subscribers or queue names

As sample repository data, imagine a simple client-server application that submits orders and cancelations for common stock. The repository for such an application might look like this:

<MessageRepository xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="C:\MyProjects\XSLT Cookbook\code gen\MessageRepository.xsd">
  <DataTypes>
    <Primitive>
      <Name>Real</Name>
      <Size>8</Size>
      <Category>real</Category>
    </Primitive>
    <Primitive>
      <Name>Integer</Name>
      <Size>4</Size>
      <Category>signed integer</Category>
    </Primitive>
    <Primitive>
      <Name>StkSymbol</Name>
      <Size>10</Size>
      <Category>string</Category>
    </Primitive>
    <Primitive>
      <Name>Message</Name>
      <Size>100</Size>
      <Category>string</Category>
    </Primitive>
    <Primitive>
      <Name>Shares</Name>
      <Size>4</Size>
      <Category>signed integer</Category>
    </Primitive>
    <Enumeration>
      <Name>BuyOrSell</Name>
      <Enumerators>
        <Enumerator>
          <Name>BUY</Name>
          <Value>0</Value>
        </Enumerator>
        <Enumerator>
          <Name>SELL</Name>
          <Value>1</Value>
        </Enumerator>
      </Enumerators>
    </Enumeration>
    <Enumeration>
      <Name>OrderType</Name>
      <Enumerators>
        <Enumerator>
          <Name>MARKET</Name>
          <Value>0</Value>
        </Enumerator>
        <Enumerator>
          <Name>LIMIT</Name>
          <Value>1</Value>
        </Enumerator>
      </Enumerators>
    </Enumeration>
    <Structure>
      <Name>TestData</Name>
      <Members>
        <Member>
          <Name>order</Name>
          <DataTypeName>AddStockOrderData</DataTypeName>
        </Member>
        <Member>
          <Name>cancel</Name>
          <DataTypeName>CancelStockOrderData</DataTypeName>
        </Member>
      </Members>
    </Structure>
    <Structure>
      <Name>AddStockOrderData</Name>
      <Documentation>A request to add a new order.</Documentation>
      <Members>
        <Member>
          <Name>symbol</Name>
          <DataTypeName>StkSymbol</DataTypeName>
        </Member>
        <Member>
          <Name>quantity</Name>
          <DataTypeName>Shares</DataTypeName>
        </Member>
        <Member>
          <Name>side</Name>
          <DataTypeName>BuyOrSell</DataTypeName>
        </Member>
        <Member>
          <Name>type</Name>
          <DataTypeName>OrderType</DataTypeName>
        </Member>
        <Member>
          <Name>price</Name>
          <DataTypeName>Real</DataTypeName>
        </Member>
      </Members>
    </Structure>
    <Structure>
      <Name>AddStockOrderAckData</Name>
      <Documentation>A positive acknowledgment that order was added successfully.
      </Documentation>
      <Members>
        <Member>
          <Name>orderId</Name>
          <DataTypeName>Integer</DataTypeName>
        </Member>
      </Members>
    </Structure>
    <Structure>
      <Name>AddStockOrderNackData</Name>
      <Documentation>A negative acknowledgment that the order add was unsuccessful.
      </Documentation>
      <Members>
        <Member>
          <Name>reason</Name>
          <DataTypeName>Message</DataTypeName>
        </Member>
      </Members>
    </Structure>
    <Structure>
      <Name>CancelStockOrderData</Name>
      <Documentation>A request to cancel all or part of an order</Documentation>
      <Members>
        <Member>
          <Name>orderId</Name>
          <DataTypeName>Integer</DataTypeName>
        </Member>
        <Member>
          <Name>quantity</Name>
          <DataTypeName>Shares</DataTypeName>
        </Member>
      </Members>
    </Structure>
    <Structure>
      <Name>CancelStockOrderAckData</Name>
      <Documentation>A positive acknowledgment that order was canceled successfully.
      </Documentation>
      <Members>
        <Member>
          <Name>orderId</Name>
          <DataTypeName>Integer</DataTypeName>
        </Member>
        <Member>
          <Name>quantityRemaining</Name>
          <DataTypeName>Shares</DataTypeName>
        </Member>
      </Members>
    </Structure>
    <Structure>
      <Name>CancelStockOrderNackData</Name>
      <Documentation>A negative acknowledgment that the order cancel was
      unsuccessful.</Documentation>
      <Members>
        <Member>
          <Name>orderId</Name>
          <DataTypeName>Integer</DataTypeName>
        </Member>
        <Member>
          <Name>reason</Name>
          <DataTypeName>Message</DataTypeName>
        </Member>
      </Members>
    </Structure>
  </DataTypes>
  <Messages>
    <Message>
      <Name>ADD_STOCK_ORDER</Name>
      <MsgId>1</MsgId>
      <DataTypeName>AddStockOrderData</DataTypeName>
      <Senders>
        <ProcessRef>StockClient</ProcessRef>
      </Senders>
      <Receivers>
        <ProcessRef>StockServer</ProcessRef>
      </Receivers>
    </Message>
    <Message>
      <Name>ADD_STOCK_ORDER_ACK</Name>
      <MsgId>2</MsgId>
      <DataTypeName>AddStockOrderAckData</DataTypeName>
      <Senders>
        <ProcessRef>StockServer</ProcessRef>
      </Senders>
      <Receivers>
        <ProcessRef>StockClient</ProcessRef>
      </Receivers>
    </Message>
    <Message>
      <Name>ADD_STOCK_ORDER_NACK</Name>
      <MsgId>3</MsgId>
      <DataTypeName>AddStockOrderNackData</DataTypeName>
      <Senders>
        <ProcessRef>StockServer</ProcessRef>
      </Senders>
      <Receivers>
        <ProcessRef>StockClient</ProcessRef>
      </Receivers>
    </Message>
    <Message>
      <Name>CANCEL_STOCK_ORDER</Name>
      <MsgId>4</MsgId>
      <DataTypeName>CancelStockOrderData</DataTypeName>
      <Senders>
        <ProcessRef>StockClient</ProcessRef>
      </Senders>
      <Receivers>
        <ProcessRef>StockServer</ProcessRef>
      </Receivers>
    </Message>
    <Message>
      <Name>CANCEL_STOCK_ORDER_ACK</Name>
      <MsgId>5</MsgId>
      <DataTypeName>CancelStockOrderAckData</DataTypeName>
      <Senders>
        <ProcessRef>StockServer</ProcessRef>
      </Senders>
      <Receivers>
        <ProcessRef>StockClient</ProcessRef>
      </Receivers>
    </Message>
    <Message>
      <Name>CANCEL_STOCK_ORDER_NACK</Name>
      <MsgId>6</MsgId>
      <DataTypeName>CancelStockOrderNackData</DataTypeName>
      <Senders>
        <ProcessRef>StockServer</ProcessRef>
      </Senders>
      <Receivers>
        <ProcessRef>StockClient</ProcessRef>
      </Receivers>
    </Message>
    <Message>
      <Name>TEST</Name>
      <MsgId>7</MsgId>
      <DataTypeName>TestData</DataTypeName>
      <Senders>
        <ProcessRef>StockServer</ProcessRef>
      </Senders>
      <Receivers>
        <ProcessRef>StockClient</ProcessRef>
      </Receivers>
    </Message>
  </Messages>
  <Processes>
    <Process>
      <Name>StockClient</Name>
    </Process>
    <Process>
      <Name>StockServer</Name>
    </Process>
  </Processes>
</MessageRepository>

This repository describes the messages that are sent between a client (called StockClient) and a server (called StockServer) as the application performs its various duties. Readers familiar with WSDL will see a similarity; however, WSDL is specific to web-service specifications and is most often used in the context of SOAP services, even though the WSDL specification is technically protocol-neutral (http://www.w3.org/TR/wsdl). Note that WSDL is a W3C note, not a recommendation. The official Web Services Description Working Group (http://www.w3.org/2002/ws/desc/) is working on what will eventually be a W3C-sanctioned standard.
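
To make the repository's data model concrete, here is a minimal sketch of how a few of its entries might look in C++. The mapping from each Primitive's Size and Category to a C++ type (double, std::int32_t, fixed-size char arrays) and the makeOrder helper are assumptions made for illustration. They are not part of the repository, and code generated by a real transformation may well differ.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical C++ rendering of part of the sample repository.
// The type choices are assumptions based on each Primitive's Size and Category.
typedef double       Real;           // Category: real, Size: 8
typedef std::int32_t Integer;        // Category: signed integer, Size: 4
typedef std::int32_t Shares;         // Category: signed integer, Size: 4
typedef char         StkSymbol[10];  // Category: string, Size: 10

// Enumerations carry the same names and values as the repository entries.
enum BuyOrSell { BUY = 0, SELL = 1 };
enum OrderType { MARKET = 0, LIMIT = 1 };

// Mirrors the AddStockOrderData Structure; members appear in repository order.
struct AddStockOrderData
{
    StkSymbol symbol;
    Shares    quantity;
    BuyOrSell side;
    OrderType type;
    Real      price;
};

// Illustrative helper (not in the repository) that fills in an order.
inline AddStockOrderData makeOrder(const char* symbol, Shares quantity,
                                   BuyOrSell side, OrderType type, Real price)
{
    // Value-initialization zeroes every member, so the symbol stays
    // NUL-terminated even when the source string is truncated.
    AddStockOrderData order = AddStockOrderData();
    std::strncpy(order.symbol, symbol, sizeof(order.symbol) - 1);
    order.quantity = quantity;
    order.side = side;
    order.type = type;
    order.price = price;
    return order;
}
```

A call such as makeOrder("IBM", 100, BUY, LIMIT, 85.5) would then yield a value suitable as the payload of an ADD_STOCK_ORDER message.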

The last two examples in this chapter are independent of the messaging problem. The first focuses on generating C++ code from Unified Modeling Language (UML) models exported from a UML modeling tool via XML Metadata Interchange (XMI). The second discusses using XSLT to generate XSLT.

Before proceeding with the actual examples, I apologize for using C++ for most of the examples. I did this only because it is the language with which I am most familiar; it is the language for which I have actually written generators; and the conceptual framework is transferable to other languages, even if the literal XSLT is not.[23]

Generating Constant Definitions

Problem

You want to generate a source file containing all message names as constants equivalent to their message IDs.

Solution

You can construct a single transformation that uses C++ as the default target but is easily customized for C, C#, or Java:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>
   
  <!--The name of the output source code file. --> 
  <xsl:param name="file" select=" 'MESSAGE_IDS.h' "/>
  
  <!-- The default behavior is to generate C++ style constants -->
  <xsl:variable name="constants-type" select=" 'const int' "/>
   
  <!-- The default C++ assignment operator -->
  <xsl:variable name="assignment" select=" ' = ' "/>
   
  <!-- The default C++ statement terminator -->
  <xsl:variable name="terminator" select=" ';' "/>
   
   
  <!--Transform repository into a sequence of message constant 
      definitions -->  
  <xsl:template match="MessageRepository">
    <xsl:call-template name="constants-start"/>
    <xsl:apply-templates select="Messages/Message"/>
    <xsl:call-template name="constants-end"/>
  </xsl:template>  
   
  <!--Each message becomes a comment and a constant definition -->
  <xsl:template match="Message">
    <xsl:apply-templates select="." mode="doc" />
    <xsl:apply-templates select="." mode="constant" />
  </xsl:template>
   
  <!-- C++ header files start with an inclusion guard -->
  <xsl:template name="constants-start">
    <xsl:variable name="guard" select="translate($file,'.','_')"/>
    <xsl:text>#ifndef </xsl:text>
    <xsl:value-of select="$guard"/>
    <xsl:text>&#xa;</xsl:text> 
    <xsl:text>#define </xsl:text>
    <xsl:value-of select="$guard"/>
    <xsl:text>&#xa;&#xa;&#xa;</xsl:text>
  </xsl:template>
   
  <!-- C++ header files end with the closure of the top level inclusion 
       guard -->
  <xsl:template name="constants-end">
    <xsl:variable name="guard" select="translate($file,'.','_')"/>
    <xsl:text>&#xa;&#xa;&#xa;#endif /* </xsl:text>
    <xsl:value-of select="$guard"/>
    <xsl:text> */&#xa;</xsl:text> 
  </xsl:template>
   
  <!-- Each constant definition is preceded by a comment describing the 
       associated message -->
  <xsl:template match="Message" mode="doc">
  /*
  * Purpose:     <xsl:call-template name="format-comment">
                   <xsl:with-param name="text" select="Documentation"/>
                 </xsl:call-template>
  * Data Format: <xsl:value-of select="DataTypeName"/>
  * From:        <xsl:apply-templates select="Senders" mode="doc"/>
  * To:          <xsl:apply-templates select="Receivers" mode="doc"/>
  */
  </xsl:template>
   
  <!-- Used in the generation of message documentation. Lists sender or
       receiver processes -->
  <xsl:template match="Senders|Receivers" mode="doc">
    <xsl:for-each select="ProcessRef">
      <xsl:value-of select="."/>
      <xsl:if test="position(  ) != last(  )">
       <xsl:text>, </xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
   
  <!-- This utility wraps comments at 40 characters wide -->
  <xsl:template name="format-comment">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="string-length($text)&lt;40">
        <xsl:value-of select="$text"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="substring($text,1,39)"/>
        <!-- Start each continuation line with the comment's leading asterisk -->
        <xsl:text>&#xa;  *              </xsl:text>
        <xsl:call-template name="format-comment">
          <xsl:with-param name="text" select="substring($text,40)"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
   
  <!-- Each message name, suffixed with _ID, becomes a constant whose 
       value is the message id -->
  <xsl:template match="Message" mode="constant">
    <xsl:value-of select="$constants-type"/><xsl:text> </xsl:text>
    <xsl:value-of select="Name"/><xsl:text>_ID</xsl:text>
    <xsl:value-of select="$assignment"/>
    <xsl:value-of select="MsgId"/>
    <xsl:value-of select="$terminator"/>
    <xsl:text>&#xa;</xsl:text>
  </xsl:template>
  
  <!-- Ignore text nodes not explicitly handled by above templates -->
  <xsl:template match="text(  )"/>
  
</xsl:stylesheet>

When run against your repository, this transform generates the following code:

#ifndef MESSAGE_IDS_h
#define MESSAGE_IDS_h
   
   
  /*
  * Purpose:     Add a new order.
  * Data Format: AddStockOrderData
  * From:        StockClient
  * To:          StockServer
  */
  const int ADD_STOCK_ORDER_ID = 1;
   
  /*
  * Purpose:     Acknowledge the order has been added.
  * Data Format: AddStockOrderAckData
  * From:        StockServer
  * To:          StockClient
  */
  const int ADD_STOCK_ORDER_ACK_ID = 2;
   
  /*
  * Purpose:     Error adding the order. Perhaps it violates
  *              a rule.
  * Data Format: AddStockOrderNackData
  * From:        StockServer
  * To:          StockClient
  */
  const int ADD_STOCK_ORDER_NACK_ID = 3;
   
//Etc ...
   
#endif /* MESSAGE_IDS_h */
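
As a hedged illustration of how a receiving process might consume these constants, the sketch below dispatches on a message ID with a switch statement. The handlerFor function and the handler names it returns are hypothetical and exist only for this example. The three constants are repeated so that the snippet stands alone; real code would include the generated header instead.

```cpp
#include <string>

// Constants as they appear in the generated header shown above. Real code
// would #include "MESSAGE_IDS.h" rather than repeat them.
const int ADD_STOCK_ORDER_ID = 1;
const int ADD_STOCK_ORDER_ACK_ID = 2;
const int ADD_STOCK_ORDER_NACK_ID = 3;

// Hypothetical dispatcher: maps a received message ID to the name of the
// handler that would process it. A const int initialized with a literal is
// a constant expression in C++, so it can be used as a case label.
inline std::string handlerFor(int msgId)
{
    switch (msgId)
    {
    case ADD_STOCK_ORDER_ID:
        return "onAddStockOrder";
    case ADD_STOCK_ORDER_ACK_ID:
        return "onAddStockOrderAck";
    case ADD_STOCK_ORDER_NACK_ID:
        return "onAddStockOrderNack";
    default:
        return "onUnknownMessage";
    }
}
```

Because the IDs are defined in one generated file, adding a message to the repository and regenerating the header is enough to make the new constant available to every process that dispatches this way.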

Discussion

To make the code-generation transformation customizable for several languages, I use a stylesheet that is more complex than necessary for any single language. Still, I did not generalize it completely. For example, the commenting conventions assume the target language descends from C. The content of the comments also may not suit your particular style or taste. However, as you create your own code-generation templates, you should apply these customization techniques:

  1. Encode language-specific constructs in top-level parameters or variables so they can be overridden by importing stylesheets or (if you use parameters) by passing in parameter values when the stylesheet is run.

  2. Break the various generated components into separate templates that can be overridden individually by importing stylesheets.

Designing the transformation this way allows C-style #define constants to be generated with only minor changes:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   
  <xsl:import href="msgIds.xslt"/>
  
  <xsl:variable name="constants-type" select=" '#define ' "/>
  <xsl:variable name="assignment" select=" '   ' "/>
  <xsl:variable name="terminator" select=" '' "/>
  
</xsl:stylesheet>

Java requires everything to live inside a class, but you can accommodate that too:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   
  <xsl:import href="msgIds.xslt"/>
   
  <xsl:variable name="constants-type" select=" 'public static final int' "/>
   
  <xsl:template name="constants-start">
    <xsl:text>public final class MESSAGE_IDS&#xa;</xsl:text>
    <xsl:text>{&#xa;</xsl:text>
  </xsl:template>
   
  <xsl:template name="constants-end">
    <xsl:text>&#xa;&#xa;}&#xa;</xsl:text>
  </xsl:template>
   
</xsl:stylesheet>


[22] Obviously, user input and output are also relevant. However, you can think of I/O in terms of messages. These user I/O messages are normally sent and received over different channels than interprocess messages, though.

[23] I am tempted to add, only half in jest, that C++ is such a horrendously complex language that its developers are the most motivated to generate rather than code it!
