How it works...

In step 1, we added all the transformations that are needed for the dataset. TransformProcess defines an ordered list of the transformations that we want to apply to the dataset; they are executed in the order in which they were added to the builder. We removed any unnecessary features by calling removeColumns(). During schema creation, we marked the categorical features in the Schema, so now we can decide which kind of transformation each categorical variable requires. Calling categoricalToInteger() converts a categorical variable into an integer, while calling categoricalToOneHot() applies one-hot encoding to it. Note that the schema must be created before the transformation process, because we need the schema to create a TransformProcess.
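To make the difference between the two encodings concrete, here is a minimal, library-independent sketch in plain Java. It does not use DataVec; the class name, the method names, and the sample Geography categories are hypothetical and chosen only for illustration. It assumes that the integer assigned to a category is its position in the list of states declared in the schema.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical names, not part of the DataVec API.
final class CategoricalEncodingSketch {

    // Integer encoding: replace a category with its index in the declared state list.
    static int toInteger(List<String> states, String value) {
        int index = states.indexOf(value);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown category: " + value);
        }
        return index;
    }

    // One-hot encoding: one output column per state, with a single 1 at the category's index.
    static int[] toOneHot(List<String> states, String value) {
        int[] encoded = new int[states.size()];
        encoded[toInteger(states, value)] = 1;
        return encoded;
    }

    public static void main(String[] args) {
        // Assumed example categories for a "Geography" column.
        List<String> geography = Arrays.asList("France", "Germany", "Spain");
        System.out.println(toInteger(geography, "Germany"));                 // prints 1
        System.out.println(Arrays.toString(toOneHot(geography, "Germany"))); // prints [0, 1, 0]
    }
}
```

Integer encoding keeps a single column but implies an ordering between categories, whereas one-hot encoding avoids that implied ordering at the cost of one extra column per category.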
In step 2, we apply the transformations that we defined in step 1 with the help of TransformProcessRecordReader. All we need to do is create a basic record reader over the raw data and pass it to TransformProcessRecordReader, along with the TransformProcess that we defined.
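The wrapping idea in step 2 can be pictured with a small plain-Java analogy. This is not how DataVec is implemented and it does not use the DataVec API; the class TransformingReaderSketch and the helper removeColumn are hypothetical names. The sketch assumes that the wrapper pulls one raw record at a time from the underlying reader and runs the configured steps on it, in order, before handing it back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.UnaryOperator;

// Illustration only: a reader that wraps another reader and transforms each record on the way out.
final class TransformingReaderSketch implements Iterator<List<String>> {
    private final Iterator<List<String>> rawReader;
    private final List<UnaryOperator<List<String>>> steps;

    TransformingReaderSketch(Iterator<List<String>> rawReader,
                             List<UnaryOperator<List<String>>> steps) {
        this.rawReader = rawReader;
        this.steps = steps;
    }

    @Override
    public boolean hasNext() {
        return rawReader.hasNext();
    }

    @Override
    public List<String> next() {
        // Read one raw record, then apply every step in the order it was added.
        List<String> record = rawReader.next();
        for (UnaryOperator<List<String>> step : steps) {
            record = step.apply(record);
        }
        return record;
    }

    // Hypothetical step: return a copy of the record without the column at the given index.
    static UnaryOperator<List<String>> removeColumn(int index) {
        return record -> {
            List<String> copy = new ArrayList<>(record);
            copy.remove(index);
            return copy;
        };
    }

    public static void main(String[] args) {
        // Assumed sample rows: row number, geography, gender.
        Iterator<List<String>> raw = Arrays.asList(
                Arrays.asList("1", "France", "Female"),
                Arrays.asList("2", "Spain", "Male")).iterator();
        TransformingReaderSketch reader =
                new TransformingReaderSketch(raw, Arrays.asList(removeColumn(0)));
        while (reader.hasNext()) {
            System.out.println(reader.next());
        }
    }
}
```

Because the wrapper exposes the same reading interface as the reader it wraps, downstream code can consume transformed records without knowing that a transformation took place.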
