Let's start building our image categorization classifier. Our approach will be to build models on our training dataset and validate them on our validation dataset. Finally, we will test the performance of all our models on the test dataset. Before we jump into modeling, let's load and prepare our datasets. To start with, we load some basic dependencies:
import glob
import numpy as np
import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array, array_to_img

%matplotlib inline
Let's now load our datasets, using the following code snippet:
IMG_DIM = (150, 150)

train_files = glob.glob('training_data/*')
train_imgs = [img_to_array(load_img(img, target_size=IMG_DIM)) for img in train_files]
train_imgs = np.array(train_imgs)
train_labels = [fn.split('/')[1].split('.')[0].strip() for fn in train_files]

validation_files = glob.glob('validation_data/*')
validation_imgs = [img_to_array(load_img(img, target_size=IMG_DIM)) for img in validation_files]
validation_imgs = np.array(validation_imgs)
validation_labels = [fn.split('/')[1].split('.')[0].strip() for fn in validation_files]

print('Train dataset shape:', train_imgs.shape,
      '\tValidation dataset shape:', validation_imgs.shape)

Train dataset shape: (3000, 150, 150, 3)  Validation dataset shape: (1000, 150, 150, 3)
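Note that the label-extraction expression above assumes POSIX-style ('/') path separators, which glob does not guarantee on every platform. A more portable sketch, using hypothetical filenames that mimic the dataset's cat.<n>.jpg / dog.<n>.jpg naming convention, lets os.path.basename handle the separator for the current OS:

```python
import os

# Hypothetical filenames mimicking the dataset layout
files = ['training_data/cat.0.jpg', 'training_data/dog.1.jpg']

# os.path.basename strips the directory regardless of the OS separator,
# so the label is always the first dot-separated token of the filename
labels = [os.path.basename(fn).split('.')[0].strip() for fn in files]
print(labels)  # ['cat', 'dog']
```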
We can clearly see that we have 3,000 training images and 1,000 validation images. Each image is of size 150 x 150 and has three channels for red, green, and blue (RGB), giving each image dimensions of (150, 150, 3). We will now scale each image's pixel values from the (0, 255) range down to (0, 1), because deep learning models work really well with small input values:
train_imgs_scaled = train_imgs.astype('float32')
validation_imgs_scaled = validation_imgs.astype('float32')
train_imgs_scaled /= 255
validation_imgs_scaled /= 255

# visualize a sample image
print(train_imgs[0].shape)
array_to_img(train_imgs[0])

(150, 150, 3)
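To eyeball several images at once rather than one at a time, we could plot a small grid with matplotlib; imshow accepts float images with values in [0, 1], so the scaled arrays can be displayed directly. The sketch below uses synthetic random arrays and made-up labels as stand-ins for train_imgs_scaled and train_labels:

```python
import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-ins for train_imgs_scaled (float32, values in [0, 1])
rng = np.random.default_rng(42)
sample_imgs = rng.random((5, 150, 150, 3)).astype('float32')
sample_labels = ['cat', 'cat', 'dog', 'dog', 'cat']  # hypothetical labels

fig, axes = plt.subplots(1, 5, figsize=(15, 3))
for ax, img, label in zip(axes, sample_imgs, sample_labels):
    ax.imshow(img)       # imshow handles float arrays in [0, 1] directly
    ax.set_title(label)
    ax.axis('off')
plt.tight_layout()
```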
The preceding code displays one of the sample images from our training dataset. Let's now set up some basic configuration parameters and also encode our text class labels into numeric values (otherwise, Keras will throw an error):
batch_size = 30
num_classes = 2
epochs = 30
input_shape = (150, 150, 3)

# encode text category labels
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
le.fit(train_labels)
train_labels_enc = le.transform(train_labels)
validation_labels_enc = le.transform(validation_labels)

print(train_labels[1495:1505], train_labels_enc[1495:1505])

['cat', 'cat', 'cat', 'cat', 'cat', 'dog', 'dog', 'dog', 'dog', 'dog']
[0 0 0 0 0 1 1 1 1 1]
We can see that our encoding scheme assigns the number 0 to the cat labels and 1 to the dog labels. We are now ready to build our first CNN-based deep learning model.
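It is worth noting how LabelEncoder arrives at this mapping: it sorts the unique class names, so 'cat' maps to 0 and 'dog' maps to 1, and inverse_transform recovers the original string labels from the encoded integers. A minimal, self-contained sketch:

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
le.fit(['cat', 'dog'])

# classes_ holds the labels in sorted order; the index is the encoded value
print(list(le.classes_))                # ['cat', 'dog']
print(le.transform(['dog', 'cat']))     # [1 0]
print(le.inverse_transform([0, 1, 1]))  # ['cat' 'dog' 'dog']
```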