How it works...

In step 1, we added all the transformations that the dataset needs. TransformProcess defines an ordered list of the transformations that we want to apply to the dataset; they are applied in the order in which they were added. We removed unnecessary features by calling removeColumns(). During schema creation, we marked the categorical features in the Schema, so we can now decide which transformation each categorical variable requires: categoricalToInteger() converts a categorical variable into integers, and categoricalToOneHot() applies one-hot encoding to it. Note that the schema must be created before the transformation process, because a TransformProcess is built from a Schema.
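To make the difference between the two encodings concrete, here is a minimal plain-Java sketch. It is not DataVec code and does not show how the library implements these transforms; it only illustrates the idea, assuming that a category is encoded by its position in the list of states declared for the column. The class name, method names, and the three example states are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class CategoricalEncodingSketch {

    // Integer encoding: a category becomes its index in the list of known states.
    public static int toInteger(List<String> states, String value) {
        int index = states.indexOf(value);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown category: " + value);
        }
        return index;
    }

    // One-hot encoding: one slot per state, with a 1 only in the slot of the given category.
    public static int[] toOneHot(List<String> states, String value) {
        int[] encoded = new int[states.size()];
        encoded[toInteger(states, value)] = 1;
        return encoded;
    }

    public static void main(String[] args) {
        // Hypothetical states for a single categorical column.
        List<String> states = Arrays.asList("France", "Germany", "Spain");

        System.out.println(toInteger(states, "Germany"));                 // prints 1
        System.out.println(Arrays.toString(toOneHot(states, "Germany"))); // prints [0, 1, 0]
    }
}
```

Integer encoding keeps a single column but implies an ordering between categories, whereas one-hot encoding replaces the column with one column per state and implies no ordering.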
In step 2, we apply the transformations defined in step 1 with the help of TransformProcessRecordReader. All we need to do is create a basic record reader over the raw data and pass it to TransformProcessRecordReader, along with the transform process that we defined.
