The pooling step

One of the important steps in our learning process is the pooling step, which is sometimes called the subsampling or downsampling step. Its main purpose is to reduce the dimensionality of the output of the convolution step (the feature map) while keeping the important information in the newly reduced version.

The following figure shows this step: the feature map is scanned with a 2 x 2 window and a stride of 2, and the max operation is applied to each window. This kind of pooling operation is called max pooling:

Figure 9.8: An example of a max pooling operation on a rectified feature map (obtained after convolution and ReLU operation) by using a 2 x 2 window (source: http://textminingonline.com/wp-content/uploads/2016/10/max_polling-300x256.png)
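To see the numbers behind this operation, the following is a minimal sketch (assuming TensorFlow 1.x, as in the rest of this chapter) that applies the same 2 x 2, stride-2 max pooling to a made-up 4 x 4 feature map:

import numpy as np
import tensorflow as tf

# A single made-up 4 x 4 feature map (batch of 1, one channel)
feature_map = np.array([[1, 3, 2, 1],
                        [4, 6, 5, 2],
                        [7, 2, 1, 0],
                        [3, 8, 4, 9]], dtype=np.float32).reshape(1, 4, 4, 1)

pooled = tf.layers.max_pooling2d(inputs=tf.constant(feature_map),
                                 pool_size=[2, 2], strides=2)

with tf.Session() as sess:
    # Each non-overlapping 2 x 2 window is replaced by its maximum value
    print(sess.run(pooled).reshape(2, 2))  # [[6. 5.], [8. 9.]]

Each of the four 2 x 2 windows is reduced to its largest value, so the 4 x 4 map becomes a 2 x 2 map.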

We can connect the output of the convolution step to the pooling layer by using the following line of code:

pool_layer1 = tf.layers.max_pooling2d(inputs=conv_layer1, pool_size=[2, 2], strides=2)

The pooling layer receives the input from the convolution step with the following shape:

[batch_size, image_width, image_height, channels]

For example, in our digit classification task, the input to the pooling layer will have the following shape:

[batch_size, 28, 28, 20]

The output of the pooling operation will have the following shape:

[batch_size, 14, 14, 20]

In this example, we have halved the width and height of the convolution output, so the pooled feature map contains only a quarter of the original values. This step is very useful because it keeps only the important information while reducing the model's complexity, which helps to avoid overfitting.
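If you want to verify these shapes yourself, the following sketch builds a tiny graph and prints the static shapes. The kernel_size and padding values here are assumptions, chosen only so that the convolution output matches the [batch_size, 28, 28, 20] shape discussed above:

import tensorflow as tf

# Placeholder for a batch of 28 x 28 grayscale digit images
input_layer = tf.placeholder(tf.float32, shape=[None, 28, 28, 1])

# Assumed convolution settings: 20 filters, 5 x 5 kernels, 'same' padding
conv_layer1 = tf.layers.conv2d(inputs=input_layer, filters=20,
                               kernel_size=[5, 5], padding='same',
                               activation=tf.nn.relu)
pool_layer1 = tf.layers.max_pooling2d(inputs=conv_layer1,
                                      pool_size=[2, 2], strides=2)

print(conv_layer1.shape)  # (?, 28, 28, 20)
print(pool_layer1.shape)  # (?, 14, 14, 20)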
