Input layer

This is the first layer in any CNN architecture. All the subsequent convolution and pooling layers expect the input to be in a specific format. The input variables will be tensors with the following shape:

[batch_size, image_height, image_width, channels]

Here:

  • batch_size is the number of examples in each random subset (mini-batch) drawn from the original training set when applying stochastic gradient descent.
  • image_width is the width of the input images to the network.
  • image_height is the height of the input images to the network.
  • channels is the number of color channels in the input images. This number is 3 for RGB images and 1 for monochrome (grayscale) images.

For example, consider the well-known MNIST dataset. Let's say we want to use a CNN to classify its handwritten digits.

If the dataset is composed of monochrome 28 x 28 pixel images like the MNIST dataset, then the desired shape for our input layer is as follows:

[batch_size, 28, 28, 1].

To change the shape of our input features, we can do the following reshaping operation:

input_layer = tf.reshape(features["x"], [-1, 28, 28, 1])

As you can see, we have specified the batch size as -1, which means that this dimension is computed dynamically from the number of input values in features["x"], with all the other dimensions held fixed. This lets us treat the batch size as a hyperparameter that we can tune.
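
To make the -1 rule concrete, the following is a minimal sketch in plain Python rather than TensorFlow. The helper name infer_dynamic_dim is hypothetical and is not part of any library; it only mimics the arithmetic that a reshape performs when one dimension is given as -1.

```python
from math import prod


def infer_dynamic_dim(total_values, shape):
    """Return `shape` with its single -1 entry replaced by the inferred size.

    Hypothetical helper that mimics the rule a reshape applies to -1;
    it is not part of TensorFlow.
    """
    if shape.count(-1) != 1:
        raise ValueError("expected exactly one -1 entry in shape")
    # Product of the dimensions that are already fixed.
    known = prod(d for d in shape if d != -1)
    if known == 0 or total_values % known != 0:
        raise ValueError(
            f"cannot reshape {total_values} values into shape {shape}"
        )
    # The dynamic dimension is whatever is left after the fixed ones.
    return [total_values // known if d == -1 else d for d in shape]


# One flattened MNIST image holds 28 * 28 * 1 = 784 values.
print(infer_dynamic_dim(784, [-1, 28, 28, 1]))        # [1, 28, 28, 1]
print(infer_dynamic_dim(100 * 784, [-1, 28, 28, 1]))  # [100, 28, 28, 1]
```

If the number of values is not a multiple of 784, no valid batch size exists, which is why the sketch raises an error in that case.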

As an example of the reshape operation, suppose that we divide our input samples into batches of five. The features["x"] array will then hold 3,920 values (5 x 28 x 28 x 1), where each value corresponds to a pixel in an image. In this case, the input layer will have the following shape:

[5, 28, 28, 1]
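
The five-image case can be checked with a small pure-Python stand-in for the reshape. This is only an illustrative sketch: reshape_batch and shape_of are hypothetical helpers, the pixel values are placeholders rather than real MNIST data, and the values are assumed to be laid out in row-major order.

```python
def reshape_batch(flat, height, width, channels):
    """Split a flat list of pixel values into [batch, height, width, channels].

    Illustrative stand-in for a reshape; values are consumed in
    row-major order.
    """
    per_image = height * width * channels
    if len(flat) % per_image != 0:
        raise ValueError("flat length is not a multiple of one image's size")
    batch = []
    for start in range(0, len(flat), per_image):
        image = flat[start:start + per_image]
        rows = []
        for r in range(height):
            row = []
            for c in range(width):
                offset = (r * width + c) * channels
                row.append(image[offset:offset + channels])
            rows.append(row)
        batch.append(rows)
    return batch


def shape_of(nested):
    """Return the shape of a regularly nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return shape


flat_pixels = list(range(3920))  # placeholder values, not real image data
batch = reshape_batch(flat_pixels, 28, 28, 1)
print(shape_of(batch))           # [5, 28, 28, 1]
```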