There's more...

If the TransformProcess returns sequential data, use the executeSequence() method instead:

List<List<List<Writable>>> transformed = LocalTransformExecutor.executeSequence(sequenceRecordReader, transformProcess);

To join two record readers based on joinCondition, use the executeJoin() method:

List<List<Writable>> transformed = LocalTransformExecutor.executeJoin(joinCondition, leftReader, rightReader);
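
To make the idea of a join concrete, here is a minimal plain-Java sketch of an inner join over two lists of records. It is not the DataVec implementation: the class JoinSketch, the method innerJoin, and the use of String values in place of Writable are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of DataVec.
public class JoinSketch {

    // Inner join of two record lists. Each record is a list of column values;
    // leftKey and rightKey are the indexes of the key column on each side.
    public static List<List<String>> innerJoin(List<List<String>> left, int leftKey,
                                               List<List<String>> right, int rightKey) {
        // Index the right-hand records by their key value.
        Map<String, List<List<String>>> rightByKey = new HashMap<>();
        for (List<String> r : right) {
            rightByKey.computeIfAbsent(r.get(rightKey), k -> new ArrayList<>()).add(r);
        }

        List<List<String>> out = new ArrayList<>();
        for (List<String> l : left) {
            List<List<String>> matches =
                    rightByKey.getOrDefault(l.get(leftKey), Collections.emptyList());
            for (List<String> r : matches) {
                // Keep all left columns, then the right columns except the duplicate key.
                List<String> joined = new ArrayList<>(l);
                for (int i = 0; i < r.size(); i++) {
                    if (i != rightKey) {
                        joined.add(r.get(i));
                    }
                }
                out.add(joined);
            }
        }
        return out;
    }
}
```

Records whose key appears on only one side are dropped, which is the behavior of an inner join. Other join types would keep those records and pad the missing columns.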

The following is an overview of local/Spark executor methods:

  • execute(): This applies the transformation to the record reader. LocalTransformExecutor takes the record reader as input, while SparkTransformExecutor needs the input data to be loaded into a JavaRDD object. This cannot be used for sequential data.
  • executeSequence(): This applies the transformation to a sequence reader. The transform process is expected to start with non-sequential data and convert it into sequential data.
  • executeJoin(): This method is used for joining two different input readers based on joinCondition.
  • executeSequenceToSeparate(): This applies the transformation to a sequence reader. The transform process is expected to start with sequential data and return non-sequential data.
  • executeSequenceToSequence(): This applies the transformation to a sequence reader. The transform process is expected to start with sequential data and return sequential data.
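
The methods above differ mainly in the shape of the data they accept and return: non-sequential data is a list of records (List<List<Writable>>), while sequential data is a list of sequences, each of which is itself a list of records (List<List<List<Writable>>>). The following plain-Java sketch shows one way to move between the two shapes. It is an illustration only, not DataVec code: the class SequenceShapes, its methods, and the use of String in place of Writable are all assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of DataVec.
public class SequenceShapes {

    // Non-sequential -> sequential: records sharing the same key value
    // are grouped into one sequence, in order of first appearance.
    public static List<List<List<String>>> toSequences(List<List<String>> records, int keyColumn) {
        Map<String, List<List<String>>> groups = new LinkedHashMap<>();
        for (List<String> record : records) {
            groups.computeIfAbsent(record.get(keyColumn), k -> new ArrayList<>()).add(record);
        }
        return new ArrayList<>(groups.values());
    }

    // Sequential -> non-sequential: every step of every sequence
    // becomes an independent record.
    public static List<List<String>> toSeparate(List<List<List<String>>> sequences) {
        List<List<String>> out = new ArrayList<>();
        for (List<List<String>> sequence : sequences) {
            out.addAll(sequence);
        }
        return out;
    }
}
```

Grouping by a key column is only one possible way to build sequences; it is used here because it keeps the example short.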
