- Drop noise features, such as row indices, IDs, and names, before training the neural network. Remove them at the schema transformation stage:
TransformProcess transformProcess = new TransformProcess.Builder(schema)
    .removeColumns("RowNumber", "CustomerId", "Surname")
    .build();
- Identify the missing values using the DataVec analysis API:
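// Scans every record and collects per-column quality statistics.
DataQualityAnalysis analysis = AnalyzeLocal.analyzeQuality(schema, recordReader);
// Prints the quality report, including counts of missing and invalid values.
System.out.println(analysis);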
- Replace null values with a default (here, 0) using a schema transformation:
// Matches records whose "columnName" entry is a null writable.
Condition condition = new NullWritableColumnCondition("columnName");
TransformProcess transformProcess = new TransformProcess.Builder(schema)
    .conditionalReplaceValueTransform("columnName", new IntWritable(0), condition)
    .build();
- Replace NaN values with a default (here, 0) using a schema transformation:
// Matches records whose "columnName" entry is NaN (only possible in floating-point columns).
Condition condition = new NaNColumnCondition("columnName");
TransformProcess transformProcess = new TransformProcess.Builder(schema)
    .conditionalReplaceValueTransform("columnName", new IntWritable(0), condition)
    .build();
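To make the effect of the two conditional replacements concrete, here is a minimal plain-Java sketch of the same idea. It is not DataVec code: the class `MissingValueSketch`, the method `replaceMissing`, and the sample rows are hypothetical names and values chosen only for illustration. It assumes each record is an array of boxed doubles and that a missing entry is either `null` or `NaN`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: none of these names belong to the DataVec API.
final class MissingValueSketch {

    // Returns copies of the rows in which every null or NaN entry of the
    // given column is replaced by the fallback value. Other entries and the
    // input rows themselves are left untouched.
    static List<Double[]> replaceMissing(List<Double[]> rows, int column, double fallback) {
        List<Double[]> cleaned = new ArrayList<>();
        for (Double[] row : rows) {
            Double[] copy = Arrays.copyOf(row, row.length);
            if (copy[column] == null || copy[column].isNaN()) {
                copy[column] = fallback;
            }
            cleaned.add(copy);
        }
        return cleaned;
    }

    public static void main(String[] args) {
        // Made-up sample records with two numeric columns.
        List<Double[]> rows = Arrays.asList(
                new Double[] {619.0, 42.0},
                new Double[] {null, 41.0},
                new Double[] {Double.NaN, 39.0});

        // Prints:
        // [619.0, 42.0]
        // [0.0, 41.0]
        // [0.0, 39.0]
        for (Double[] row : replaceMissing(rows, 0, 0.0)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Note that the records are kept and only the offending entries change, which is also what a conditional replace in a transform process does. Whether 0 is a sensible default depends on the column; a mean or median may distort the data less.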