- Manage a range of records using FileSplit:
String[] allowedFormats=new String[]{".JPEG"};
FileSplit fileSplit = new FileSplit(new File("temp"), allowedFormats, true);
You can find the FileSplit example at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data%20Extraction%2C%20Transform%20and%20Loading/sourceCode/cookbook-app/src/main/java/com/javadeeplearningcookbook/app/FileSplitExample.java.
- Manage the URI collection from a file using CollectionInputSplit:
FileSplit fileSplit = new FileSplit(new File("temp"));
CollectionInputSplit collectionInputSplit = new CollectionInputSplit(fileSplit.locations());
You can find the CollectionInputSplit example at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data%20Extraction%2C%20Transform%20and%20Loading/sourceCode/cookbook-app/src/main/java/com/javadeeplearningcookbook/app/CollectionInputSplitExample.java.
- Use NumberedFileInputSplit to manage data with numbered file formats:
NumberedFileInputSplit numberedFileInputSplit = new NumberedFileInputSplit("numberedfiles/file%d.txt",1,4);
numberedFileInputSplit.locationsIterator().forEachRemaining(System.out::println);
You can find the NumberedFileInputSplit example at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data%20Extraction%2C%20Transform%20and%20Loading/sourceCode/cookbook-app/src/main/java/com/javadeeplearningcookbook/app/NumberedFileInputSplitExample.java.
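As a rough illustration of what the numbered pattern above stands for, the following plain-Java sketch expands a `%d` pattern over an inclusive index range. It does not use DataVec; the class and method names (`NumberedPathSketch`, `expand`) are hypothetical, and the real split yields URIs rather than relative strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of DataVec: mimics how a numbered
// pattern such as "numberedfiles/file%d.txt" maps to concrete paths.
final class NumberedPathSketch {

    // Returns one path per index in [minIdx, maxIdx], both ends inclusive.
    static List<String> expand(String pattern, int minIdx, int maxIdx) {
        List<String> paths = new ArrayList<>();
        for (int i = minIdx; i <= maxIdx; i++) {
            paths.add(String.format(pattern, i));
        }
        return paths;
    }

    public static void main(String[] args) {
        // Same pattern and range as the snippet above.
        expand("numberedfiles/file%d.txt", 1, 4).forEach(System.out::println);
    }
}
```

With the range 1 to 4, the sketch produces four paths, one for each of `file1.txt` through `file4.txt`.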
- Use TransformSplit to map the input URIs to the different output URIs:
TransformSplit.URITransform uriTransform = URI::normalize;
List<URI> uriList = Arrays.asList(new URI("file://storage/examples/./cats.txt"),
new URI("file://storage/examples//dogs.txt"),
new URI("file://storage/./examples/bear.txt"));
TransformSplit transformSplit = new TransformSplit(new CollectionInputSplit(uriList),uriTransform);
You can find the TransformSplit example at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data%20Extraction%2C%20Transform%20and%20Loading/sourceCode/cookbook-app/src/main/java/com/javadeeplearningcookbook/app/TransformSplitExample.java.
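To see what a `URI::normalize` transform does to inputs like these, here is a minimal sketch that relies only on `java.net.URI`. The `UriTransformSketch` class and its `transformAll` method are hypothetical stand-ins for the mapping step; they are not the DataVec implementation.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the "map each input URI to an output URI" step.
final class UriTransformSketch {

    // Applies the transform to every input URI, preserving order.
    static List<URI> transformAll(List<URI> inputs, UnaryOperator<URI> transform) {
        List<URI> outputs = new ArrayList<>();
        for (URI uri : inputs) {
            outputs.add(transform.apply(uri));
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<URI> inputs = Arrays.asList(
                URI.create("file://storage/examples/./cats.txt"),
                URI.create("file://storage/./examples/bear.txt"));
        // normalize() removes the redundant "." path segments.
        transformAll(inputs, URI::normalize).forEach(System.out::println);
    }
}
```

Both inputs lose their `.` segment, so `cats.txt` and `bear.txt` end up directly under `file://storage/examples/`.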
- Perform URI string replacement using TransformSplit:
InputSplit transformSplit = TransformSplit.ofSearchReplace(new CollectionInputSplit(inputFiles),"-in.csv","-out.csv");
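The search-and-replace idea can be sketched in plain Java as a literal substring replacement on the URI string. This is an assumption about the behaviour, not the library's code; the `SearchReplaceSketch` class and the sample path `file:///data/sample-in.csv` are hypothetical.

```java
import java.net.URI;

// Hypothetical illustration of URI string replacement.
final class SearchReplaceSketch {

    // Replaces every literal occurrence of `search` in the URI text with `replace`.
    static URI searchReplace(URI input, String search, String replace) {
        return URI.create(input.toString().replace(search, replace));
    }

    public static void main(String[] args) {
        URI input = URI.create("file:///data/sample-in.csv"); // made-up sample path
        System.out.println(searchReplace(input, "-in.csv", "-out.csv"));
    }
}
```

A mapping like this is handy when each input file has a counterpart whose name differs only by a fixed suffix.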
- Extract the CSV data for the neural network using CSVRecordReader:
RecordReader recordReader = new CSVRecordReader(numOfRowsToSkip, delimiter);
recordReader.initialize(new FileSplit(file));
You can find the CSVRecordReader example at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data%20Extraction%2C%20Transform%20and%20Loading/sourceCode/cookbook-app/src/main/java/com/javadeeplearningcookbook/app/recordreaderexamples/CSVRecordReaderExample.java.
The dataset for this can be found at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data_Extraction_Transform_and_Loading/titanic.csv.
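The two constructor arguments, the number of rows to skip and the delimiter, can be pictured with a small plain-Java sketch. It is deliberately simplified (no quoted-field handling) and does not reflect how CSVRecordReader is implemented; the `CsvSkipSketch` class and the sample rows are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical, simplified CSV parsing: skip leading rows, split on a delimiter.
final class CsvSkipSketch {

    static List<List<String>> parse(List<String> lines, int numOfRowsToSkip, char delimiter) {
        // Quote the delimiter so characters such as '|' are treated literally.
        String splitOn = Pattern.quote(String.valueOf(delimiter));
        List<List<String>> records = new ArrayList<>();
        for (int i = numOfRowsToSkip; i < lines.size(); i++) {
            // Limit -1 keeps trailing empty fields.
            records.add(Arrays.asList(lines.get(i).split(splitOn, -1)));
        }
        return records;
    }

    public static void main(String[] args) {
        // Made-up sample data with one header row.
        List<String> lines = Arrays.asList("id,age,fare", "1,22,7.25", "2,38,71.28");
        parse(lines, 1, ',').forEach(System.out::println);
    }
}
```

Skipping one row drops the header, so only the two data rows are returned as records.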
- Extract image data for the neural network using ImageRecordReader:
ImageRecordReader imageRecordReader = new ImageRecordReader(imageHeight,imageWidth,channels,parentPathLabelGenerator);
imageRecordReader.initialize(trainData,transform);
You can find the ImageRecordReader example at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data%20Extraction%2C%20Transform%20and%20Loading/sourceCode/cookbook-app/src/main/java/com/javadeeplearningcookbook/app/recordreaderexamples/ImageRecordReaderExample.java.
- Transform and extract the data using TransformProcessRecordReader:
RecordReader transformProcessRecordReader = new TransformProcessRecordReader(recordReader, transformProcess);
You can find the TransformProcessRecordReader example at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data_Extraction_Transform_and_Loading/sourceCode/cookbook-app/src/main/java/com/javadeeplearningcookbook/app/recordreaderexamples/TransformProcessRecordReaderExample.java.
The dataset for this can be found at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data_Extraction_Transform_and_Loading/transform-data.csv.
- Extract the sequence data using SequenceRecordReader and CodecRecordReader:
RecordReader codecReader = new CodecRecordReader();
codecReader.initialize(conf,split);
You can find the CodecRecordReader example at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data%20Extraction%2C%20Transform%20and%20Loading/sourceCode/cookbook-app/src/main/java/com/javadeeplearningcookbook/app/recordreaderexamples/CodecReaderExample.java.
The following code shows how to use RegexSequenceRecordReader:
RecordReader recordReader = new RegexSequenceRecordReader("(\\d{2}/\\d{2}/\\d{2}) (\\d{2}:\\d{2}:\\d{2}) ([A-Z]) (.*)", skipNumLines);
recordReader.initialize(new NumberedFileInputSplit("path/log%d.txt", minIndex, maxIndex));
You can find the RegexSequenceRecordReader example at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data_Extraction_Transform_and_Loading/sourceCode/cookbook-app/src/main/java/com/javadeeplearningcookbook/app/recordreaderexamples/RegexSequenceRecordReaderExample.java.
The dataset for this can be found at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data_Extraction_Transform_and_Loading/logdata.zip.
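To make the regular expression easier to read, the following sketch applies it to a single log line with `java.util.regex`, assuming the pattern is meant to use the `\d` digit class. Each capture group becomes one field: date, time, level, and message. The `LogRegexSketch` class and the sample log line are hypothetical, and the sketch does not use DataVec.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of splitting a log line into fields by capture group.
final class LogRegexSketch {

    // Groups: (date) (time) (single-letter level) (rest of the line).
    static final Pattern LOG_LINE = Pattern.compile(
            "(\\d{2}/\\d{2}/\\d{2}) (\\d{2}:\\d{2}:\\d{2}) ([A-Z]) (.*)");

    static List<String> parseLine(String line) {
        Matcher matcher = LOG_LINE.matcher(line);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Line does not match pattern: " + line);
        }
        List<String> fields = new ArrayList<>();
        for (int group = 1; group <= matcher.groupCount(); group++) {
            fields.add(matcher.group(group));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up log line in the assumed format.
        System.out.println(parseLine("08/06/18 10:15:30 E Connection timed out"));
    }
}
```

A line that does not fit the pattern raises an exception here; that choice belongs to this sketch and is not a statement about the reader's behaviour.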
The following code shows how to use CSVSequenceRecordReader:
CSVSequenceRecordReader seqReader = new CSVSequenceRecordReader(skipNumLines, delimiter);
seqReader.initialize(new FileSplit(file));
You can find the CSVSequenceRecordReader example at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data%20Extraction%2C%20Transform%20and%20Loading/sourceCode/cookbook-app/src/main/java/com/javadeeplearningcookbook/app/recordreaderexamples/SequenceRecordReaderExample.java.
The dataset for this can be found at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data_Extraction_Transform_and_Loading/dataset.zip.
- Extract the JSON/XML/YAML data using JacksonLineRecordReader:
RecordReader recordReader = new JacksonLineRecordReader(fieldSelection, new ObjectMapper(new JsonFactory()));
recordReader.initialize(new FileSplit(new File("json_file.txt")));
You can find the JacksonLineRecordReader example at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data_Extraction_Transform_and_Loading/sourceCode/cookbook-app/src/main/java/com/javadeeplearningcookbook/app/recordreaderexamples/JacksonLineRecordReaderExample.java.
The dataset for this can be found at https://github.com/PacktPublishing/Java-Deep-Learning-Cookbook/blob/master/02_Data_Extraction_Transform_and_Loading/irisdata.txt.