FlatSourceConverter Class (Extends SourceConverter)

Overview

The FlatSourceConverter is the main driver for the actual conversion. It inherits all the attributes and methods of its base classes, the SourceConverter and Converter classes (see Chapter 6).

Attributes:

  • FlatRecordReader Object

  • Array of Strings Partner Array

  • Integer Partner Break Field Offset

  • Integer Partner Break Field Length

  • String Saved Partner ID

Methods:

  • Constructor

  • processFile

  • processGroup (added to SourceConverter base class)

  • testPartnerBreak

Methods

Constructor

The constructor method for our FlatSourceConverter object sets up that object as well as the FlatRecordReader object.

Logic for the FlatSourceConverter Constructor Method
Arguments:
  Boolean Validation option
  String Output Directory
  String File Description Document Name

Call base class constructor
Initialize Partner Array
Call loadFileDescriptionDocument from passed File Description
    Document Name
Schema Location URL <- Call File Description Document's
    getElementsByTagName on "SchemaLocationURL", then
    getAttribute on "value"
IF Schema Location URL is null and Validation is true
  Throw Exception
ENDIF
Partner Break Field Offset <- 0
Partner Break Field Length <- 0
NodeList Temp <- call File Description Document's
    getElementsByTagName for "PartnerBreak"
IF (Temp length = 1)
  Partner Break Element <- Temp NodeList item(0)
  Partner Break Field Offset <- call Partner Break Element's
      getAttribute for "Offset", and convert to integer
  Partner Break Field Length <- call Partner Break Element's
      getAttribute for "Length", and convert to integer
ENDIF
Initialize Saved Partner ID
Create FlatRecordReader object, passing:
    File Description Document
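
The partner break portion of this constructor logic can be sketched in Java with the standard DOM API. This is only a minimal sketch under stated assumptions: the PartnerBreakSettings class, its parse helper, and the sample root Element name are hypothetical and exist purely for illustration. The "PartnerBreak" tag and its "Offset" and "Length" Attributes come from the logic above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical holder for the two partner break attributes.
class PartnerBreakSettings {
    final int offset;
    final int length;

    PartnerBreakSettings(int offset, int length) {
        this.offset = offset;
        this.length = length;
    }

    // Reads the optional PartnerBreak Element. Both values stay at zero
    // unless exactly one PartnerBreak Element is present.
    static PartnerBreakSettings fromDocument(Document fileDescription) {
        NodeList temp = fileDescription.getElementsByTagName("PartnerBreak");
        if (temp.getLength() != 1) {
            return new PartnerBreakSettings(0, 0);
        }
        Element partnerBreak = (Element) temp.item(0);
        int offset = Integer.parseInt(partnerBreak.getAttribute("Offset"));
        int length = Integer.parseInt(partnerBreak.getAttribute("Length"));
        return new PartnerBreakSettings(offset, length);
    }

    // Example-only helper: parses an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new RuntimeException("Could not parse file description", e);
        }
    }

    public static void main(String[] args) {
        // "FileDescription" is an assumed root Element name for this example.
        Document doc = parse(
                "<FileDescription><PartnerBreak Offset=\"3\" Length=\"10\"/></FileDescription>");
        PartnerBreakSettings settings = fromDocument(doc);
        System.out.println(settings.offset + ", " + settings.length);
    }
}
```

Leaving both values at zero when the Element is absent matches the logic above, and it lets testPartnerBreak treat a zero length as "no partner break defined."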

processFile

The main processing is driven by the FlatSourceConverter's processFile method. This method converts one input flat file into one or more output XML documents based on the input parameters.

Logic for the FlatSourceConverter processFile Method
Arguments:
  String Input File Name

Returns:
  Status or throws exception

Output Directory Path <- Base Directory + directory separator
Initialize Output Document to null
Initialize Sequence Number
Header Record Tag <- Get Grammar Element's "TagValue" Attribute
Open input file
Call FlatRecordReader's setInputStream method
Record Length <- Call FlatRecordReader's readRecord method
Record Tag <- Call FlatRecordReader's getRecordType method
IF (Record Tag != Header Record Tag)
  Return error or throw exception
ENDIF
DO while Record Length >= 0
  Partner Break <- Call testPartnerBreak
  IF Partner Break = true
    Output Directory Path <- Base Directory +
        directory separator + Partner ID + directory separator
    Lookup Partner in Partner Array
    IF Partner is not in Array
      Create output directory from Output Directory Path
      Partner Array <- Add Partner ID
    ENDIF
  ENDIF
  Create new Output Document
  Increment Sequence Number, and pad with leading zeroes
      to three digits
  Output File Path <- Output directory path + Root Element
      Name + Sequence Number + ".xml"
  Call FlatRecordReader's setOutputDocument method for new
      Output Document
  Record Length <- Call processGroup to process the document,
      passing the Root Element and the Grammar Element
  IF (Record Length > 0)
    Record Tag <- Call FlatRecordReader's getRecordType method
    IF (Record Tag != Header Record Tag)
      Return error or throw exception
    ENDIF
  ENDIF
  Call saveDocument
ENDDO
Close input file
Display completion message with number of documents processed

There are several similarities between this processFile method and the CSVSourceConverter's processFile method. We do very similar processing for partner lookup, directory and file management, and saving documents. However, instead of processing records individually, we process a document as a whole using the processGroup method.
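
The directory and file naming steps in processFile can be sketched as follows. This is a minimal illustration, not the utility's actual code: the OutputPathBuilder class and its method names are hypothetical, and we assume that the base directory is supplied without a trailing separator.

```java
import java.io.File;
import java.util.Locale;

// Hypothetical helper that mirrors the path-building steps of processFile.
class OutputPathBuilder {

    // Pads the sequence number with leading zeroes to three digits.
    static String padSequence(int sequenceNumber) {
        return String.format(Locale.ROOT, "%03d", sequenceNumber);
    }

    // Builds the output directory path. The partner ID is appended only
    // when a partner break has supplied one.
    static String outputDirectoryPath(String baseDirectory, String partnerId) {
        String path = baseDirectory + File.separator;
        if (partnerId != null && !partnerId.isEmpty()) {
            path = path + partnerId + File.separator;
        }
        return path;
    }

    // Output file path = directory path + root Element name + sequence + ".xml".
    static String outputFilePath(String directoryPath, String rootElementName,
                                 int sequenceNumber) {
        return directoryPath + rootElementName + padSequence(sequenceNumber) + ".xml";
    }

    public static void main(String[] args) {
        String dir = outputDirectoryPath("out", "ACME");
        System.out.println(outputFilePath(dir, "Invoice", 7));
    }
}
```

Using File.separator keeps the sketch portable across platforms. The sequence number would be incremented once per output document before the path is built.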

processGroup (Base Class SourceConverter Method)

We noted in the discussion of flat file grammars that the recursive definition for the group production lends itself to processing by a recursive algorithm. I also noted near the end of Chapter 2 that while we can often process XML using recursive algorithms, doing so doesn't always offer any advantages over nonrecursive approaches. However, for our purposes in this utility a recursive approach, implemented with the processGroup method, is quite appropriate and very powerful. It processes the first record in a group, then all the other records. If one of the records starts another logical group, the processGroup method calls itself.

Since both flat files and EDI files have the same logical group structures (at least when processing their XML representations in our architecture) we can use the same method for processing them. So, although we introduce the processGroup method in this chapter, we're actually adding it to the SourceConverter base class.

Logic for the SourceConverter processGroup Method
Arguments:
  DOM Node Parent
  DOM Element Group Grammar

Returns:
  RecordLength of last input record read, or EOF

Group Element Name <- Group Grammar getAttributeValue
    for "ElementName"
Group Element <- createElement using Group Element Name
Append Group Element to Parent
Grammar Element Name <- Group Grammar getNodeName
IF Grammar Element Name = "Grammar"
    and Schema Location URL is not NULL
  Create namespace Attribute for SchemaInstance and append
      to Group Element
  Create noNamespaceSchemaLocation Attribute and append
      to Group Element
ENDIF
Child Node <- Get Group Grammar's firstChild
DO while Child Node is not an Element Node
  Child Node <- nextSibling
ENDDO
Record Grammar Element <- Child Node
Record Element Name <- Record Grammar Element's
    getAttributeValue For "ElementName"
//  Process the first record in the group
Call RecordReader's parseRecord
Call RecordReader's toXML
Call RecordReader's writeRecord
// This advance in the grammar makes sure that we don't repeat
//  the starting record of the group
Record Grammar Element <- Get next Record Element from
    Group Grammar
Grammar Record Tag <- Record Grammar getAttribute for
    "TagValue"
Record Length <- call RecordReader's readRecord
DO until end of file
  Record Tag <- call RecordReader's getRecordTag
  DO until Grammar Record Tag = Record Tag
    Record Grammar Element <- Get next Record Element from
        Group Grammar
    IF Record Grammar Element is NULL
      return Record Length //  This record is not part of the group
    ENDIF
    Grammar Record Tag <- Record Grammar getAttribute for
        "TagValue"
  ENDDO
  Grammar Element Name <- Record Grammar getNodeName
  IF Grammar Element Name = "GroupDescription"
    Record Length <- Do recursive call of processGroup
  ELSE
    Record Element Name <- Record Grammar getAttributeValue for
        "ElementName"
    Call RecordReader's parseRecord
    Call RecordReader's toXML
    Call RecordReader's writeRecord
    Record Length <- call RecordReader's readRecord
  ENDIF
ENDDO
Return Record Length

The logic is mostly straightforward, but it may help to review the recursive and termination cases. The first time processGroup is executed it is called from processFile after we have read the header record for a logical document. We pass processGroup the Document as the parent Node to which to append all the record Elements we create in processGroup. We also pass the complete Grammar Element from the file description document as the grammar for the group. After we have processed the header record, we advance to the next Element in the grammar. We then read the next record from the file. We advance Elements in the grammar until we match the record identifier of the record that we read from the file. If it is a normal record (that is, it doesn't start another group), we process it. However, if the matching grammar Element indicates that it starts a group (indicated by an Element name of “GroupDescription” instead of “RecordDescription”), we execute the recursive call. In this circumstance we pass the Document's root Element as the parent Element and the group's grammar Element (the GroupDescription Element) as the grammar. We thus start a new instance of processGroup and proceed as before.

We reach the termination case when we have read a record that is not part of the grammar of the current group. This case is recognized when we advance through all the grammar Elements that are children of the current group grammar and don't find a TagValue Attribute that matches the record's identifier. If we're processing a lower-level group in the logical document hierarchy (and a higher execution point on the stack), we exit processGroup and return to the previous iteration of processGroup. If the record is part of the group being processed by that iteration, we resume processing. However, if it isn't part of that group, we again exit. We also exit the current iteration when the record we have read starts another iteration of the same type of record group; the calling iteration then matches the record and starts a new group. This continues until we finally exit back to processFile. If the current record is a header record, we save the current Document to disk and begin a new iteration of the while loop. However, if the record is not a header record, we have encountered a record that is either not defined for this type of document or is not where we have said it would be in the grammar. In that case we force an abnormal termination.
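
To make the recursive and termination cases concrete, here is a simplified Java sketch of the same control flow. It rests on several assumptions that are not part of the utility: the GrammarNode, GroupProcessor, and ProcessGroupDemo classes are hypothetical, records are plain strings that begin with a three-character tag, a group's tag value is taken to be the tag of its first record, and output is built as a string of empty Elements rather than a DOM Document. A list position stands in for the record length that the real method returns.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical grammar node: a record description, or a group description
// when it has children.
class GrammarNode {
    final String elementName;
    final String tagValue; // for a group, the tag of its first record
    final List<GrammarNode> children;

    GrammarNode(String elementName, String tagValue, GrammarNode... children) {
        this.elementName = elementName;
        this.tagValue = tagValue;
        this.children = Arrays.asList(children);
    }

    boolean isGroup() { return !children.isEmpty(); }
}

class GroupProcessor {
    private final List<String> records;
    private int position = 0; // index of the current (already read) record
    private final StringBuilder out = new StringBuilder();

    GroupProcessor(List<String> records) { this.records = records; }

    private boolean atEnd() { return position >= records.size(); }

    private String currentTag() { return records.get(position).substring(0, 3); }

    // Stands in for parseRecord, toXML, writeRecord, then readRecord.
    private void writeRecordAndReadNext(String elementName) {
        out.append('<').append(elementName).append("/>");
        position++;
    }

    // Called with the current record being the first record of the group.
    void processGroup(GrammarNode group) {
        out.append('<').append(group.elementName).append('>');
        writeRecordAndReadNext(group.children.get(0).elementName);
        int g = 1; // advance in the grammar so the starting record can't repeat
        while (!atEnd()) {
            String tag = currentTag();
            while (g < group.children.size()
                    && !group.children.get(g).tagValue.equals(tag)) {
                g++;
            }
            if (g == group.children.size()) {
                break; // termination case: record is not part of this group
            }
            GrammarNode match = group.children.get(g);
            if (match.isGroup()) {
                processGroup(match); // recursive case
            } else {
                writeRecordAndReadNext(match.elementName);
            }
        }
        out.append("</").append(group.elementName).append('>');
    }

    String result() { return out.toString(); }
}

class ProcessGroupDemo {
    // Invoice: Header, repeating LineGroup (Line, optional Note), Summary.
    static GrammarNode sampleGrammar() {
        GrammarNode lineGroup = new GrammarNode("LineGroup", "LIN",
                new GrammarNode("Line", "LIN"), new GrammarNode("Note", "NTE"));
        return new GrammarNode("Invoice", "HDR",
                new GrammarNode("Header", "HDR"), lineGroup,
                new GrammarNode("Summary", "SUM"));
    }

    public static void main(String[] args) {
        GroupProcessor p = new GroupProcessor(
                Arrays.asList("HDR001", "LIN A", "NTE x", "LIN B", "SUM 2"));
        p.processGroup(sampleGrammar());
        System.out.println(p.result());
    }
}
```

In this sample the second LIN record is not matched inside the first LineGroup, because that instance has already advanced past its starting record. The instance exits, and the Invoice-level instance, whose grammar position is still on the LineGroup description, matches the record and makes a second recursive call.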

testPartnerBreak

This testPartnerBreak method serves the same purpose as the one in the CSVSourceConverter class, but the processing is a bit different. In addition to a few other minor differences, we trim trailing whitespace through the call to the FlatRecordReader's getFieldValue method.

Logic for the FlatSourceConverter testPartnerBreak Method
Arguments:
  None

Returns:
  Boolean - true if new partner and false if not

IF Partner Break Length is zero
  return false
ENDIF
Partner ID <- Call FlatRecordReader's getFieldValue method
    passing the Partner Break Offset and Partner Break Length
IF Partner ID = Saved Partner ID
  return false
ENDIF
Saved Partner ID <- Partner ID
Return true
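
The same test can be sketched in Java. This is a hedged illustration rather than the utility's code: the PartnerBreakTester class is hypothetical, its static getFieldValue is only a stand-in for the FlatRecordReader method of the same name, we assume a zero-based offset, and the record is passed in as an argument so that the example is self-contained.

```java
// Hypothetical stand-alone version of the partner break test.
class PartnerBreakTester {
    private final int partnerBreakOffset; // assumed to be zero-based
    private final int partnerBreakLength;
    private String savedPartnerId = "";

    PartnerBreakTester(int partnerBreakOffset, int partnerBreakLength) {
        this.partnerBreakOffset = partnerBreakOffset;
        this.partnerBreakLength = partnerBreakLength;
    }

    // Stand-in for FlatRecordReader's getFieldValue: extracts a fixed-width
    // field and trims trailing whitespace. Short records yield what is there.
    static String getFieldValue(String record, int offset, int length) {
        int end = Math.min(record.length(), offset + length);
        if (offset >= end) {
            return "";
        }
        return record.substring(offset, end).replaceAll("\\s+$", "");
    }

    // Returns true if the record belongs to a new partner, false if not.
    boolean testPartnerBreak(String record) {
        if (partnerBreakLength == 0) {
            return false; // no partner break defined for this file
        }
        String partnerId = getFieldValue(record, partnerBreakOffset, partnerBreakLength);
        if (partnerId.equals(savedPartnerId)) {
            return false;
        }
        savedPartnerId = partnerId;
        return true;
    }

    public static void main(String[] args) {
        PartnerBreakTester tester = new PartnerBreakTester(3, 6);
        System.out.println(tester.testPartnerBreak("HDRACME  001"));
        System.out.println(tester.testPartnerBreak("HDRACME  002"));
        System.out.println(tester.testPartnerBreak("HDRBOLT  003"));
    }
}
```

Trimming before the comparison means that a partner ID padded with blanks to the field width compares equal to the saved, unpadded ID.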
