Parser Design

In this section, we will design a custom parser that we will implement in the next section. To this point, we have covered the architecture of the BizTalk channel in which custom functoids, parsers, and serializers have a role. The preceding section implemented a custom functoid that assumes the data in the channel is already in an XML format. The next two sections will produce that XML format.

A parser generally operates as follows. It accepts from BizTalk a stream containing the raw data posted to the channel and inspects the stream to determine whether the data is in a format it can handle. If so, the parser determines whether the stream contains groups of documents or a flat sequence of documents. It then converts the raw data into documents, imposing the grouping and the document order.

A parser design requires the following specifications:

  • Format of the non-XML data entering the channel

  • Format of the XML data output by the parser

  • Support for batching

  • Support for correlation receipts

  • Integration with document tracking

Format of Non-XML Data Entering the Channel

Data entering the channel is in a non-XML format. For example, the FIX format for financial data mentioned earlier uses a tag=value(delimiter)tag=value(delimiter)tag=value layout. For simplicity of demonstration, we will use an HTML form post, which has the same structure. Here, the delimiter is the & symbol. The parser for this chapter will recognize four tag-value pairs.

broker=Alpha%20Financials&stock=ABC&quantity=10&min-price=5.00
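
As a rough illustration of how such a post breaks down into tag-value pairs, the following Python sketch decodes one. It is not the parser built in this chapter. The function name parse_form_post is hypothetical, and the sample is a shortened form of the post above.

```python
from urllib.parse import parse_qsl


def parse_form_post(raw):
    """Split an &-delimited form post into a dict of decoded tag-value pairs.

    strict_parsing=True makes parse_qsl raise ValueError on a field that
    is not of the form tag=value, rather than silently skipping it.
    """
    pairs = parse_qsl(raw.strip(), keep_blank_values=True, strict_parsing=True)
    return dict(pairs)


# Shortened sample post; %20 is decoded to a space.
fields = parse_form_post("broker=Alpha%20Financials&stock=ABC&quantity=10")
print(fields["broker"])
```

Every value comes back as a string. Converting quantity or the price to a numeric type is left to whatever consumes the dictionary.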

To easily construct a document of this format, a simple HTML form suffices. The HTML form will submit a document directly into the BizTalk channel using a BizTalk HTTP receive function.

The non-XML format must have some identifiable signature, because BizTalk will ask our parser to decide whether a block of data is in a format it recognizes. In practice, we would have to examine the data itself to find a suitable signature. To keep our example simple and the check efficient, we identify the format by a prefix. Our parser will expect the following prefix on the data:

schema=biztalk-unleashed-custom-parser-ch18 
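
The prefix check itself can be very small. The sketch below is only an illustration in Python. The function name recognizes is hypothetical, and the assumption that the data arrives as ASCII bytes with the prefix at offset zero is ours, not something BizTalk guarantees.

```python
# Signature taken from the text above; the byte encoding is an assumption.
SIGNATURE = b"schema=biztalk-unleashed-custom-parser-ch18"


def recognizes(data):
    """Return True when the raw bytes begin with the expected prefix."""
    return data.startswith(SIGNATURE)


print(recognizes(SIGNATURE + b"\nbroker=Delta&stock=ABC"))
print(recognizes(b"<broker-item/>"))
```

Checking a fixed prefix means the parser can accept or reject the data after reading only its first few bytes.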

In general, BizTalk maintains an ordered sequence of registered parsers. Stock parsers ship for XML, EDIFACT, X12, and flat files. Whenever BizTalk receives a document, it passes an IStream for the document to each parser in turn until one of them accepts it. BizTalk then asks that parser to parse the document.
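
The probing sequence can be pictured with a short Python sketch. This models the idea only and is not BizTalk's COM interface. The class PrefixParser, the method probe, and the function select_parser are hypothetical names, and rewinding the stream before each probe is our assumption.

```python
import io


class PrefixParser:
    """Toy parser that accepts a stream when it starts with a given prefix."""

    def __init__(self, name, prefix):
        self.name = name
        self.prefix = prefix

    def probe(self, stream):
        # Read only as many bytes as the prefix needs.
        return stream.read(len(self.prefix)) == self.prefix


def select_parser(parsers, stream):
    """Offer the stream to each parser in order; return the first that accepts."""
    for parser in parsers:
        stream.seek(0)  # each parser sees the data from the start
        if parser.probe(stream):
            stream.seek(0)  # leave the stream rewound for the real parse
            return parser
    return None


registered = [
    PrefixParser("xml", b"<"),
    PrefixParser("custom", b"schema=biztalk-unleashed-custom-parser-ch18"),
]
data = io.BytesIO(b"schema=biztalk-unleashed-custom-parser-ch18\nbroker=Delta")
chosen = select_parser(registered, data)
print(chosen.name)
```

Because the list is ordered, a parser placed earlier wins whenever two parsers would accept the same data.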

Format of XML Data Output by the Parser

The output of the parser is XML. Recall that a BizTalk channel has a document definition for inbound XML data. The parser output must match this definition.

The document definition for the custom parser in this chapter appears in Listing 18.4.

Listing 18.4. XML Schema for Our Custom Parser (BrokerItemSchema.xml)
<xdr:Schema name="broker-item"
            xmlns:xdr="urn:schemas-microsoft-com:xml-data"
            xmlns:dt="urn:schemas-microsoft-com:datatypes">
   <xdr:ElementType name="broker"    content="textOnly" model="closed" dt:type="string"/>
   <xdr:ElementType name="stock"     content="textOnly" model="closed" dt:type="string"/>
   <xdr:ElementType name="quantity"  content="textOnly" model="closed" dt:type="int"/>
   <xdr:ElementType name="min-price" content="textOnly" model="closed" dt:type="number"/>
   <xdr:ElementType name="broker-item" content="eltOnly" model="closed">
      <xdr:element type="broker"    maxOccurs="1" minOccurs="0" />
      <xdr:element type="stock"     maxOccurs="1" minOccurs="0" />
      <xdr:element type="quantity"  maxOccurs="1" minOccurs="0" />
      <xdr:element type="min-price" maxOccurs="1" minOccurs="0" />
   </xdr:ElementType>
</xdr:Schema>

The non-XML data given earlier is repeated here for reference.

broker=Alpha%20Financials&stock=ABC&quantity=10&min-price=5.00

The corresponding XML document meeting the document definition in Listing 18.4 appears in Listing 18.5.

Listing 18.5. Sample XML for Our Custom Parser (SampleBrokerItem.xml)
<broker-item>
   <broker>Alpha Financials</broker>
   <stock>ABC</stock>
   <quantity>10</quantity>
   <min-price>5.00</min-price>
</broker-item>
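
To make the mapping concrete, here is a minimal Python sketch that builds the broker-item element from tag-value pairs. It assumes the pairs have already been decoded into a dictionary keyed by the element names in Listing 18.4. The function name to_broker_item is hypothetical, and the sketch does not validate the data types declared in the schema.

```python
import xml.etree.ElementTree as ET

# Child elements in the order declared in Listing 18.4.
FIELDS = ("broker", "stock", "quantity", "min-price")


def to_broker_item(values):
    """Build a <broker-item> document from a dict of tag-value pairs.

    Every child is optional in the schema (minOccurs="0"), so a missing
    tag is simply skipped. Tags outside FIELDS are ignored.
    """
    root = ET.Element("broker-item")
    for name in FIELDS:
        if name in values:
            ET.SubElement(root, name).text = values[name]
    return ET.tostring(root, encoding="unicode")


xml_text = to_broker_item({
    "broker": "Alpha Financials",
    "stock": "ABC",
    "quantity": "10",
    "min-price": "5.00",
})
print(xml_text)
```

The result is the same document as Listing 18.5, serialized on a single line without indentation.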

An important design consideration is whether to make the document definition public. Making it public has an immediate benefit: the channel can accept both XML and non-XML versions of the document. For example, there is an emerging XML standard for the FIX financial data format called FIXML. With a parser that outputs FIXML, the BizTalk channel can accept both FIX and FIXML documents. This setup allows old clients to submit FIX and newer clients to submit FIXML with no additional work.

Support for Batching

Batching is a common method for submitting data, and the parser design must accommodate it. In general, the parser receives an interchange. An interchange is the BizTalk term for a sequence of documents or a sequence of document groups. A document group is itself a sequence of documents related by some grouping criterion, such as the source of the document. Groups cannot contain groups. If the parser does not support batching, then it handles a document sequence of length one. The two kinds of interchange look like this:

Document Sequence: D1, D2, D3
Group Sequence: G1 (D11, D12, D13), G2 (D21, D22), G3 (D31, D32, D33, D34)

BizTalk asks the parser whether it supports groups. If so, BizTalk asks the parser for group boundaries and allows group-level information to be identified.

Documents can appear in the input in any order, and documents from different groups can interleave. However, the unparsed form of a document must be contiguous in the input. Note that empty groups and empty document sequences are not allowed.

The parser in this chapter supports batching with document groups. Recall that each document is a single form post. Multiple form posts will be submitted at one time. Listing 18.6 has an example batch submission.

Listing 18.6. Sample Batch Submission (SampleBatchSubmit.txt)
schema=biztalk-unleashed-custom-parser-ch18
broker=Delta&stock=ABC&quantity=1000&min-price=25.00
broker=Beta&stock=JKL&quantity=1500&min-price=35.00
broker=Alpha&stock=DEF&quantity=2500&min-price=20.00
broker=Beta&stock=WXYZ&quantity=2500&min-price=100.00
broker=Delta&stock=STUV&quantity=50&min-price=35.00

Grouping is done on the broker tag. This example has three groups: Delta, Beta, and Alpha. The Delta and Beta groups have two documents, and the Alpha group has one document.
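
The grouping can be checked with a short Python sketch over the lines of Listing 18.6. This is an illustration under our own assumptions, not the chapter's implementation. The function name group_by_broker is hypothetical, and we assume one document per line with the signature on the first line.

```python
from urllib.parse import parse_qsl

SIGNATURE = "schema=biztalk-unleashed-custom-parser-ch18"

BATCH = """schema=biztalk-unleashed-custom-parser-ch18
broker=Delta&stock=ABC&quantity=1000&min-price=25.00
broker=Beta&stock=JKL&quantity=1500&min-price=35.00
broker=Alpha&stock=DEF&quantity=2500&min-price=20.00
broker=Beta&stock=WXYZ&quantity=2500&min-price=100.00
broker=Delta&stock=STUV&quantity=50&min-price=35.00
"""


def group_by_broker(batch):
    """Return {broker: [document, ...]}, keeping first-seen group order."""
    lines = [line.strip() for line in batch.splitlines() if line.strip()]
    if not lines or lines[0] != SIGNATURE:
        raise ValueError("batch does not start with the expected signature")
    groups = {}
    for line in lines[1:]:
        doc = dict(parse_qsl(line, strict_parsing=True))
        # Documents of one group may be interleaved with other groups.
        groups.setdefault(doc["broker"], []).append(doc)
    return groups


groups = group_by_broker(BATCH)
for broker, docs in groups.items():
    print(broker, len(docs))
```

A document without a broker tag raises KeyError here. A real parser would need to decide how to report such a document.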

Support for Correlation Receipts

The parser has the option of participating in BizTalk's mechanism for issuing receipts for documents received. For XML-based documents, BizTalk can automatically generate receipts using its guaranteed messaging feature. For other documents, such as those received by a channel with a custom parser, a separate BizTalk channel is required for sending receipts.

To set up a receipt channel, create a new channel and provide document definitions and a map between them. BizTalk has a standard definition for the inbound receipt, but the document definition for the outbound receipt must match the format expected by the receiver of the receipt, such as a trading partner. The final step is to create a COM component that implements the interface IBizTalkCorrelation. BizTalk asks the custom parser for the ProgID of this COM component.

The parser in this chapter will not issue receipts.

Integration with Document Tracking

The parser has the option of integrating with BizTalk's Document Tracking feature. BizTalk provides an HTML application for this feature at http://localhost/BizTalkTracking. Replace localhost with the name of the appropriate machine on your network.

Document Tracking can log all inbound and outbound documents for a channel in both the XML and non-XML formats. User-defined fields can be tracked explicitly. Tracking setup is done on a per-channel basis in the BizTalk Messaging Manager.

The parser in this chapter will not integrate with document tracking.
