The Scala object containing the main() method has the following workflow:
1. We read all the business labels from the train.csv file
2. We read and create a map from image ID to business ID of the form imageID → busID
3. We get a list of images from the photoDir directory to load and process and, finally, get the image IDs of a subset of the images (10,000, for example; feel free to set the range)
4. We then read and process the images into a photoID → vector map
5. We chain the outputs of steps 1, 2, and 4 to align the business features, image IDs, and label IDs, which gives us the features for the CNNs
6. We construct nine CNNs, one per business class
7. We train all the CNNs and specify where each model is saved
8. We then repeat steps 2 to 5 to extract the features from the test set
9. Finally, we evaluate the models and save the predictions in a CSV file
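The nine CNNs in the workflow above can be read as nine separate yes/no problems, one per business class. The following is a minimal sketch of that idea, assuming each business carries a set of label IDs between 0 and 8; the object name `OneVsRestSketch`, the helper `binaryTargets`, and the sample business IDs are hypothetical and not part of the project code:

```scala
object OneVsRestSketch {
  // Hypothetical helper: for one business class, turn each business's set of
  // label IDs into a 0/1 target (1 if the business has that label, 0 otherwise).
  def binaryTargets(labels: Map[String, Set[Int]], businessClass: Int): Map[String, Int] =
    labels.map { case (busId, ls) =>
      busId -> (if (ls.contains(businessClass)) 1 else 0)
    }

  def main(args: Array[String]): Unit = {
    // Made-up labels for two businesses, for illustration only.
    val labels = Map("biz1" -> Set(0, 3), "biz2" -> Set(3, 8))
    // One binary target map per class, mirroring the nine models.
    (0 until 9).foreach { c =>
      println(s"class $c -> ${binaryTargets(labels, c)}")
    }
  }
}
```

Under this reading, a model trained for one class only has to decide whether that single label applies to a business.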
Now let's see how the preceding steps would look in a high-level diagram:
Figure 8: DL4J image processing pipeline for image classification
Programmatically, the preceding steps can be represented as follows:
val labelMap = readBusinessLabels("data/labels/train.csv")
val businessMap = readBusinessToImageLabels("data/labels/train_photo_to_biz_ids.csv")
val imgs = getImageIds("data/images/train/", businessMap, businessMap.map(_._2).toSet.toList).slice(0,100) // first 100 images; widen the range to use more
println("Image ID retrieval done!")
val dataMap = processImages(imgs, resizeImgDim = 128)
println("Image processing done!")
val alignedData = new featureAndDataAligner(dataMap, businessMap, Option(labelMap))()
println("Feature extraction done!")
val cnn0 = trainModelEpochs(alignedData, businessClass = 0, saveNN = "models/model0")
val cnn1 = trainModelEpochs(alignedData, businessClass = 1, saveNN = "models/model1")
val cnn2 = trainModelEpochs(alignedData, businessClass = 2, saveNN = "models/model2")
val cnn3 = trainModelEpochs(alignedData, businessClass = 3, saveNN = "models/model3")
val cnn4 = trainModelEpochs(alignedData, businessClass = 4, saveNN = "models/model4")
val cnn5 = trainModelEpochs(alignedData, businessClass = 5, saveNN = "models/model5")
val cnn6 = trainModelEpochs(alignedData, businessClass = 6, saveNN = "models/model6")
val cnn7 = trainModelEpochs(alignedData, businessClass = 7, saveNN = "models/model7")
val cnn8 = trainModelEpochs(alignedData, businessClass = 8, saveNN = "models/model8")
val businessMapTE = readBusinessToImageLabels("data/labels/test_photo_to_biz.csv")
val imgsTE = getImageIds("data/images/test/", businessMapTE, businessMapTE.map(_._2).toSet.toList)
val dataMapTE = processImages(imgsTE, resizeImgDim = 128) // make them 128*128
val alignedDataTE = new featureAndDataAligner(dataMapTE, businessMapTE, None)()
val Results = SubmitObj(alignedDataTE, "results/ModelsV0/")
val SubmitResults = writeSubmissionFile("kaggleSubmitFile.csv", Results, thresh = 0.9)
Too much of a mouthful? Don't worry, we will now go through each step in detail. If you look at the preceding steps carefully, you'll see that steps 1 to 5 are essentially image processing and feature construction.
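To make the alignment idea concrete before going into the details, here is a minimal sketch of joining the three lookups into one row per image. It is not the project's featureAndDataAligner class: the object `AlignSketch`, the `Row` type, and the collection types chosen for the inputs are all assumptions made for illustration.

```scala
object AlignSketch {
  // Hypothetical row type: one example per image.
  final case class Row(
      imgId: Int,
      busId: String,
      features: Vector[Double],
      labels: Option[Set[Int]] // None when no label map is given (test data)
  )

  def align(
      dataMap: Map[Int, Vector[Double]],      // image ID -> feature vector
      bizMap: List[(Int, String)],            // (image ID, business ID) pairs
      labelMap: Option[Map[String, Set[Int]]] // business ID -> label IDs
  ): List[Row] =
    for {
      (imgId, busId) <- bizMap
      // Images without a feature vector are skipped.
      features <- dataMap.get(imgId).toList
    } yield Row(
      imgId,
      busId,
      features,
      // A business missing from the label map gets an empty label set.
      labelMap.map(_.getOrElse(busId, Set.empty[Int]))
    )
}
```

Passing `None` as the label map mirrors the test-set call in the listing above, where no labels are available.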