Chapter 6 - Demystifying Convolutional Networks

The layers of a CNN include convolutional, pooling, and fully connected layers.
We slide the filter matrix over the input matrix one pixel at a time and perform the convolution operation at each position. We are not limited to one pixel, though: the filter can move by any number of pixels. The number of pixels by which the filter moves over the input matrix at each step is called the stride.
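To make the effect of the stride concrete, here is a minimal pure-Python sketch. The function name `conv2d`, the 5 x 5 input, and the 2 x 2 filter are illustrative assumptions, not taken from the chapter. As is usual in CNNs, the filter is applied without flipping.

```python
def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (both 2D lists) with the given stride, no padding."""
    in_h, in_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    # Number of positions at which the filter fits completely inside the input.
    out_h = (in_h - k_h) // stride + 1
    out_w = (in_w - k_w) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            # Element-wise multiply the window by the filter and sum the products.
            total = sum(
                image[top + a][left + b] * kernel[a][b]
                for a in range(k_h)
                for b in range(k_w)
            )
            row.append(total)
        output.append(row)
    return output


# Hypothetical 5 x 5 input holding the values 0..24, and a 2 x 2 filter.
image = [[r * 5 + c for c in range(5)] for r in range(5)]
kernel = [[1, 0], [0, 1]]

stride_1 = conv2d(image, kernel, stride=1)  # 4 x 4 output
stride_2 = conv2d(image, kernel, stride=2)  # 2 x 2 output
```

A larger stride visits fewer positions, so the output shrinks. With a stride of 2 the last row and column of this input are never covered, which is the situation that padding addresses.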
With the convolution operation, we slide a filter matrix over the input matrix. In some cases, the filter does not fit the input matrix perfectly. For example, when we move the filter by two pixels, it may reach the border with part of the filter lying outside the input matrix. In this case, we perform padding: we either pad the input with zeros so that the filter fits (same padding) or discard the region where the filter does not fit (valid padding).
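As a rough illustration of padding, the sketch below pads a small matrix with zeros and computes the output size along one dimension. The helper names `zero_pad` and `output_size`, and the 4 x 4 input with a 3 x 3 filter, are assumptions made for this example.

```python
def zero_pad(image, pad):
    """Return a copy of the 2D list `image` with `pad` rows/columns of zeros on every side."""
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]             # zero rows on top
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)   # zeros left and right
    padded.extend([0] * width for _ in range(pad))         # zero rows at the bottom
    return padded


def output_size(n, f, stride, pad):
    """Output length along one dimension for input size n and filter size f."""
    return (n + 2 * pad - f) // stride + 1


padded = zero_pad([[1, 2], [3, 4]], pad=1)

# Hypothetical case: 4 x 4 input, 3 x 3 filter, stride 2.
without_padding = output_size(4, 3, stride=2, pad=0)  # filter fits at one position only
with_padding = output_size(4, 3, stride=2, pad=1)     # zero border lets it fit at two
```

Without padding, the second filter position would extend past the border and is dropped. With one pixel of zeros on each side, that position becomes valid.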
The pooling layer reduces the spatial dimensions of the feature maps while keeping only the important features. The types of pooling operations include max pooling, average pooling, and sum pooling.
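The three pooling operations differ only in how each window is reduced to a single value. The following sketch assumes a 2 x 2 window with a stride of 2. The function `pool2d` and the sample feature map are made up for illustration.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Pool a 2D list with a size x size window; mode is 'max', 'average', or 'sum'."""
    reducers = {
        "max": max,
        "average": lambda window: sum(window) / len(window),
        "sum": sum,
    }
    reduce_window = reducers[mode]
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the values covered by the current window.
            window = [
                feature_map[i * stride + a][j * stride + b]
                for a in range(size)
                for b in range(size)
            ]
            row.append(reduce_window(window))
        output.append(row)
    return output


# Hypothetical 4 x 4 feature map.
feature_map = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 2, 1, 0],
    [3, 4, 5, 6],
]

max_pooled = pool2d(feature_map, mode="max")          # [[7, 8], [9, 6]]
avg_pooled = pool2d(feature_map, mode="average")      # [[4.0, 5.0], [4.5, 3.0]]
sum_pooled = pool2d(feature_map, mode="sum")          # [[16, 20], [18, 12]]
```

In every mode the 4 x 4 feature map is reduced to 2 x 2, which is the reduction in spatial dimensions described above.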
VGGNet is one of the most widely used CNN architectures. It was developed by the Visual Geometry Group (VGG) at the University of Oxford. The architecture of the VGG network consists of stacks of convolutional layers, each followed by a pooling layer. It uses 3 x 3 convolutions and 2 x 2 pooling throughout the network.
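To see how this pattern shapes the feature maps, the sketch below traces spatial sizes through a VGG-16 style stack. The block configuration, the 224 x 224 input, the padding of 1 for convolutions, and the stride of 2 for pooling are assumptions based on the commonly cited VGG-16 layout. The chapter itself states only the 3 x 3 and 2 x 2 sizes.

```python
def conv_out(n, f=3, stride=1, pad=1):
    """Spatial size after a 3 x 3 convolution (assumed stride 1, padding 1)."""
    return (n + 2 * pad - f) // stride + 1


def pool_out(n, f=2, stride=2):
    """Spatial size after 2 x 2 pooling (assumed stride 2)."""
    return (n - f) // stride + 1


# (number of conv layers, output channels) per block; assumed VGG-16 style layout.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def trace_shapes(size=224, blocks=VGG16_BLOCKS):
    """Return (height, width, channels) after each conv block and its pooling layer."""
    shapes = []
    for num_convs, channels in blocks:
        for _ in range(num_convs):
            size = conv_out(size)   # padded 3 x 3 convolutions keep the size
        size = pool_out(size)       # 2 x 2 pooling halves it
        shapes.append((size, size, channels))
    return shapes


shapes = trace_shapes()
```

Under these assumptions the convolutions leave the spatial size unchanged and each pooling layer halves it, so a 224 x 224 input ends up as a 7 x 7 feature map after five blocks.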
With factorized convolution, we break down a convolutional layer with a larger filter size into a stack of convolutional layers with smaller filter sizes. So, in the inception block, a convolutional layer with a 5 x 5 filter can be broken down into two convolutional layers with 3 x 3 filters.
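A quick calculation shows why this factorization helps. The sketch below compares weight counts and receptive fields, assuming stride-1 convolutions, no biases, and an arbitrary 64 input and output channels. The helper names are hypothetical.

```python
def conv_weights(filter_size, in_channels, out_channels):
    """Number of weights in one convolutional layer (biases ignored)."""
    return filter_size * filter_size * in_channels * out_channels


def stacked_receptive_field(filter_sizes):
    """Receptive field of a stack of stride-1 convolutions."""
    field = 1
    for f in filter_sizes:
        field += f - 1  # each layer widens the field by (filter size - 1)
    return field


channels = 64  # assumed channel count, chosen only for illustration

single_5x5 = conv_weights(5, channels, channels)        # one 5 x 5 layer
stacked_3x3 = 2 * conv_weights(3, channels, channels)   # two 3 x 3 layers

field_5x5 = stacked_receptive_field([5])
field_3x3_twice = stacked_receptive_field([3, 3])
```

Two stacked 3 x 3 layers cover the same 5 x 5 region of the input as a single 5 x 5 layer, but use 18 weights per channel pair instead of 25.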
Like a CNN, a capsule network checks for the presence of certain features to classify the image. Apart from detecting the features, it also checks the spatial relationship between them; that is, it learns the hierarchy of the features.
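In capsule networks, apart from calculating probabilities, we also need to preserve the direction of the vectors, so we use a different activation function, called the squash function. It is given as follows:

$$v_j = \frac{\|s_j\|^2}{1 + \|s_j\|^2} \, \frac{s_j}{\|s_j\|}$$

Here, $s_j$ is the input vector of capsule $j$ and $v_j$ is its output vector. The length of $v_j$ lies between 0 and 1, and its direction is the same as that of $s_j$.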
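A minimal Python sketch of the squash function follows. It scales a vector s to length ||s||^2 / (1 + ||s||^2) and leaves its direction unchanged. The small `eps` term is an assumption added here to avoid division by zero and is not part of the definition.

```python
import math


def squash(s, eps=1e-9):
    """Shrink vector `s` to a length in [0, 1) while keeping its direction."""
    squared_norm = sum(x * x for x in s)
    norm = math.sqrt(squared_norm)
    # ||s||^2 / (1 + ||s||^2) sets the new length; dividing by ||s|| normalizes s.
    scale = squared_norm / (1.0 + squared_norm) / (norm + eps)
    return [scale * x for x in s]


s = [3.0, 4.0]                                # length 5
v = squash(s)
v_length = math.sqrt(sum(x * x for x in v))   # close to 25 / 26
```

Long vectors are squashed to a length just below 1 and short vectors to a length near 0, so the length can be read as a probability while the orientation still encodes the properties of the feature.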
