Let's start building our image categorization classifier. Our approach will be to build models on our training dataset and validate them on our validation dataset. Finally, we will test the performance of all our models on the test dataset. Before we jump into modeling, let's load and prepare our datasets. To start with, we load some basic dependencies:
import glob
import numpy as np
import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array, array_to_img

%matplotlib inline
Let's now load our datasets, using the following code snippet:
IMG_DIM = (150, 150)

train_files = glob.glob('training_data/*')
train_imgs = [img_to_array(load_img(img, target_size=IMG_DIM)) for img in train_files]
train_imgs = np.array(train_imgs)
train_labels = [fn.split('/')[1].split('.')[0].strip() for fn in train_files]

validation_files = glob.glob('validation_data/*')
validation_imgs = [img_to_array(load_img(img, target_size=IMG_DIM)) for img in validation_files]
validation_imgs = np.array(validation_imgs)
validation_labels = [fn.split('/')[1].split('.')[0].strip() for fn in validation_files]

print('Train dataset shape:', train_imgs.shape,
      '\tValidation dataset shape:', validation_imgs.shape)

Train dataset shape: (3000, 150, 150, 3)  Validation dataset shape: (1000, 150, 150, 3)
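Note that the label-extraction expression above assumes POSIX-style ('/') path separators, which glob does not guarantee on every platform. A more portable sketch, using hypothetical filenames that mimic the dataset's cat.<n>.jpg / dog.<n>.jpg naming convention, lets os.path.basename handle the separator for the current OS:

```python
import os

# Hypothetical filenames mimicking the dataset layout
files = ['training_data/cat.0.jpg', 'training_data/dog.1.jpg']

# os.path.basename strips the directory regardless of the OS separator,
# so the label is always the first dot-separated token of the filename
labels = [os.path.basename(fn).split('.')[0].strip() for fn in files]
print(labels)  # ['cat', 'dog']
```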
We can clearly see that we have 3,000 training images and 1,000 validation images. Each image is of size 150 x 150 and has three channels for red, green, and blue (RGB), giving each image dimensions of (150, 150, 3). We will now scale each image's pixel values from the (0, 255) range down to (0, 1), because deep learning models work really well with small input values:
train_imgs_scaled = train_imgs.astype('float32')
validation_imgs_scaled = validation_imgs.astype('float32')
train_imgs_scaled /= 255
validation_imgs_scaled /= 255

# visualize a sample image
print(train_imgs[0].shape)
array_to_img(train_imgs[0])

(150, 150, 3)
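To eyeball several images at once rather than one at a time, we could plot a small grid with matplotlib; imshow accepts float images with values in [0, 1], so the scaled arrays can be displayed directly. The sketch below uses synthetic random arrays and made-up labels as stand-ins for train_imgs_scaled and train_labels:

```python
import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-ins for train_imgs_scaled (float32, values in [0, 1])
rng = np.random.default_rng(42)
sample_imgs = rng.random((5, 150, 150, 3)).astype('float32')
sample_labels = ['cat', 'cat', 'dog', 'dog', 'cat']  # hypothetical labels

fig, axes = plt.subplots(1, 5, figsize=(15, 3))
for ax, img, label in zip(axes, sample_imgs, sample_labels):
    ax.imshow(img)       # imshow handles float arrays in [0, 1] directly
    ax.set_title(label)
    ax.axis('off')
plt.tight_layout()
```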
The preceding code displays one of the sample images from our training dataset. Let's now set up some basic configuration parameters and also encode our text class labels into numeric values (otherwise, Keras will throw an error):
batch_size = 30
num_classes = 2
epochs = 30
input_shape = (150, 150, 3)

# encode text category labels
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
le.fit(train_labels)
train_labels_enc = le.transform(train_labels)
validation_labels_enc = le.transform(validation_labels)

print(train_labels[1495:1505], train_labels_enc[1495:1505])

['cat', 'cat', 'cat', 'cat', 'cat', 'dog', 'dog', 'dog', 'dog', 'dog']
[0 0 0 0 0 1 1 1 1 1]
We can see that our encoding scheme assigns the number 0 to the cat labels and 1 to the dog labels. We are now ready to build our first CNN-based deep learning model.
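It is worth noting how LabelEncoder arrives at this mapping: it sorts the unique class names, so 'cat' maps to 0 and 'dog' maps to 1, and inverse_transform recovers the original string labels from the encoded integers. A minimal, self-contained sketch:

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
le.fit(['cat', 'dog'])

# classes_ holds the labels in sorted order; the index is the encoded value
print(list(le.classes_))                # ['cat', 'dog']
print(le.transform(['dog', 'cat']))     # [1 0]
print(le.inverse_transform([0, 1, 1]))  # ['cat' 'dog' 'dog']
```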