If the TransformProcess returns sequential data, use the executeToSequence() method instead:
List<List<List<Writable>>> transformed = LocalTransformExecutor.executeToSequence(recordReader, transformProcess);
To join two record readers based on a joinCondition, use the executeJoin() method:
List<List<Writable>> transformed = LocalTransformExecutor.executeJoin(joinCondition, leftReader, rightReader);
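To make the idea of joining two inputs on a condition concrete, here is a minimal plain-Java sketch of an inner join on a key column. It is not the DataVec implementation: the class JoinSketch, the method innerJoin, and the use of List<String> in place of List<Writable> are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: records are modeled as List<String>, not DataVec Writables.
class JoinSketch {

    // Inner-joins two record lists on the given key column indices.
    // Each output record holds the left columns followed by the
    // right columns with the right-hand key column left out.
    static List<List<String>> innerJoin(List<List<String>> left, int leftKey,
                                        List<List<String>> right, int rightKey) {
        // Index the right-hand records by key so each left record is matched quickly.
        Map<String, List<List<String>>> rightByKey = new HashMap<>();
        for (List<String> r : right) {
            rightByKey.computeIfAbsent(r.get(rightKey), k -> new ArrayList<>()).add(r);
        }

        List<List<String>> joined = new ArrayList<>();
        for (List<String> l : left) {
            // Left records without a matching key produce no output (inner join).
            for (List<String> r : rightByKey.getOrDefault(l.get(leftKey), List.of())) {
                List<String> row = new ArrayList<>(l);
                for (int i = 0; i < r.size(); i++) {
                    if (i != rightKey) {
                        row.add(r.get(i));
                    }
                }
                joined.add(row);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<List<String>> customers = List.of(List.of("1", "Alice"), List.of("2", "Bob"));
        List<List<String>> orders = List.of(
                List.of("1", "book"), List.of("1", "pen"), List.of("3", "lamp"));
        System.out.println(innerJoin(customers, 0, orders, 0));
    }
}
```

In this sketch, a left record with no matching key on the right (and vice versa) is simply dropped, which is the behavior of an inner join; other join types would keep such records and pad the missing columns.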
The following is an overview of local/Spark executor methods:
- execute(): This applies the transformation to non-sequential data. LocalTransformExecutor takes the record reader as input, while SparkTransformExecutor needs the input data to be loaded into a JavaRDD object. This method cannot be used when the transform process returns sequential data.
- executeToSequence(): This applies the transformation to non-sequential input and returns sequential data. The transform process should start with non-sequential data and then convert it into sequential data.
- executeJoin(): This joins two input readers based on joinCondition.
- executeSequenceToSeparate(): This applies the transformation to a sequence reader. The transform process should start with sequential data and return non-sequential data.
- executeSequenceToSequence(): This applies the transformation to a sequence reader. The transform process should start with sequential data and return sequential data.
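The methods above differ mainly in the shape of their input and output: non-sequential data is a list of records (List<List<Writable>>), while sequential data is a list of sequences, each of which is itself a list of records (List<List<List<Writable>>>). The following plain-Java sketch illustrates the two conversions between those shapes. It is not DataVec code: the class SequenceShapes, its methods, and the use of List<String> in place of List<Writable> are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: records are modeled as List<String>, not DataVec Writables.
class SequenceShapes {

    // Non-sequential -> sequential: group records by a key column, then
    // order each group by an integer time column to form one sequence per key.
    static List<List<List<String>>> toSequences(List<List<String>> records,
                                                int keyCol, int timeCol) {
        // LinkedHashMap keeps the keys in first-seen order.
        Map<String, List<List<String>>> groups = new LinkedHashMap<>();
        for (List<String> rec : records) {
            groups.computeIfAbsent(rec.get(keyCol), k -> new ArrayList<>()).add(rec);
        }

        List<List<List<String>>> sequences = new ArrayList<>();
        for (List<List<String>> group : groups.values()) {
            group.sort(Comparator.comparingInt(rec -> Integer.parseInt(rec.get(timeCol))));
            sequences.add(group);
        }
        return sequences;
    }

    // Sequential -> non-sequential: flatten every sequence back into
    // a single list of individual records.
    static List<List<String>> toSeparate(List<List<List<String>>> sequences) {
        List<List<String>> records = new ArrayList<>();
        for (List<List<String>> sequence : sequences) {
            records.addAll(sequence);
        }
        return records;
    }

    public static void main(String[] args) {
        List<List<String>> records = List.of(
                List.of("a", "2", "x"), List.of("b", "1", "y"), List.of("a", "1", "z"));
        List<List<List<String>>> sequences = toSequences(records, 0, 1);
        System.out.println(sequences);
        System.out.println(toSeparate(sequences));
    }
}
```

Here toSequences mirrors the non-sequential to sequential case, and toSeparate mirrors the sequential to non-sequential case; a sequence-to-sequence transform would take and return the nested three-level list.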