Model architecture

We will load the ResNet50 model without its top (classification) layer. This allows us to customize the pretrained model for use with the CIFAR10 data. Because ResNet50 was trained on over 1 million images, it has learned general-purpose image features and representations that can be reused on new but similar, smaller datasets. This reusability of pretrained models not only reduces the time and cost of developing an image classification model from scratch, but is especially valuable when the training data is relatively small.

The code that's used for developing the model is as follows:

# ResNet50 network without the top layer
pretrained <- application_resnet50(weights = "imagenet",
                                   include_top = FALSE,
                                   input_shape = c(224, 224, 3))

model <- keras_model_sequential() %>%
  pretrained %>%
  layer_flatten() %>%
  layer_dense(units = 256, activation = "relu") %>%
  layer_dense(units = 10, activation = "softmax")
summary(model)
________________________________________________________________________
Layer (type)                    Output Shape                   Param #
========================================================================
resnet50 (Model)                (None, 7, 7, 2048)             23587712
________________________________________________________________________
flatten_6 (Flatten)             (None, 100352)                 0
________________________________________________________________________
dense_12 (Dense)                (None, 256)                    25690368
________________________________________________________________________
dense_13 (Dense)                (None, 10)                     2570
========================================================================
Total params: 49,280,650
Trainable params: 49,227,530
Non-trainable params: 53,120
________________________________________________________________________

When loading the ResNet50 model, we specify input dimensions of 224 x 224 x 3 for the color images. Smaller dimensions also work, but the input cannot be smaller than 32 x 32 x 3. Images in the CIFAR10 dataset have dimensions of 32 x 32 x 3, but we resize them to 224 x 224 x 3 because this gives us better image classification accuracy.
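The resizing step could be sketched as follows. This is an illustrative example, not code from the text: it assumes the CIFAR10 images sit in an array `x_train` of shape (n, 32, 32, 3), and it uses the keras package's `image_array_resize()` helper on one image at a time.

```r
library(keras)

# Sketch: resize a batch of 32 x 32 x 3 images to 224 x 224 x 3 so they
# match the input shape expected by the ResNet50 model above.
# `x` is assumed to be an array of shape (n, 32, 32, 3).
resize_images <- function(x, height = 224, width = 224) {
  n <- dim(x)[1]
  out <- array(0, dim = c(n, height, width, 3))
  for (i in seq_len(n)) {
    out[i, , , ] <- image_array_resize(x[i, , , ],
                                       height = height,
                                       width = width)
  }
  out
}

# x_train_resized <- resize_images(x_train)
```

Note that resizing all images up front multiplies memory use by roughly 49x (224 x 224 versus 32 x 32), so for larger datasets resizing per batch inside a generator is a common alternative.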

From the preceding summary, we can observe the following:

  • The output dimensions from the ResNet50 network are 7 x 7 x 2,048.
  • We use a flatten layer to change the output shape to a single column with 7 x 7 x 2,048 = 100,352 elements.
  • A dense layer with 256 units and a relu activation function is added.
  • This dense layer contributes (100,352 x 256) + 256 = 25,690,368 parameters (weights plus biases).
  • The last dense layer has 10 units, one for each of the 10 image categories, and a softmax activation function. The network has a total of 49,280,650 parameters.
  • Of these, 49,227,530 are trainable parameters.
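The parameter counts in the preceding summary can be verified with a little base R arithmetic:

```r
# Reproduce the parameter counts from the model summary.
flatten_out <- 7 * 7 * 2048        # elements after layer_flatten()
dense1 <- flatten_out * 256 + 256  # weights + biases, 256-unit dense layer
dense2 <- 256 * 10 + 10            # weights + biases, 10-unit output layer
resnet <- 23587712                 # parameters reported for resnet50
total  <- resnet + dense1 + dense2

flatten_out  # 100352
dense1       # 25690368
dense2       # 2570
total        # 49280650
```

Each dense layer's count follows the same pattern: inputs x units for the weights, plus one bias per unit.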

Although we could train the network with all of these parameters, this is not advisable. Updating the parameters of the ResNet50 network during training would destroy the benefits of the features it has learned from over 1 million images. We are training on data from only 2,000 images spread across 10 categories, so we have only about 200 images per category. It is therefore important to freeze the weights in the ResNet50 network, which preserves the benefits of using a pretrained network.
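A minimal sketch of this freezing step, using the keras package's `freeze_weights()` function, is shown below. The compile settings are illustrative assumptions, not taken from the text:

```r
# Freeze the pretrained ResNet50 weights so that only the newly added
# dense layers are updated during training.
freeze_weights(pretrained)

# Rechecking the summary should now show the ResNet50 parameters
# counted as non-trainable.
summary(model)

# Illustrative compile step (the optimizer and loss are assumptions).
model %>% compile(
  loss = "categorical_crossentropy",
  optimizer = optimizer_adam(),
  metrics = "accuracy"
)
```

Because freezing changes which parameters are trainable, it must be done before the model is compiled; if the model has already been compiled, compile it again afterwards.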
