Architecture of the Capsule network

Let's suppose our network is trying to predict handwritten digits. We know that capsules in the earlier layers detect basic features, and those in the later layers detect the digit. So, let's call the capsules in the earlier layers primary capsules and those in the later layers digit capsules.

The architecture of a Capsule network is shown here:

In the preceding diagram, we can observe the following:

First, we take the input image and feed it to a standard convolution layer, and we call the result convolutional inputs.
Then, we feed the convolutional inputs to the primary capsules layer and get the primary capsules.
Next, we compute digit capsules with primary capsules as input using the dynamic-routing algorithm.
The digit capsules consist of 10 rows, and each of the rows represents the probability of the predicted digit. That is, row 1 represents the probability of the input digit to be 0, row 2 represents the probability of the digit 1, and so on.
Since the input image is digit 3 in the preceding image, row 4, which represents the probability of digit 3, will be high in the digit capsules.

Table of Contents for Architecture of the Capsule network

Create new playlist

Sign In

Sign Up

Table of Contents for
Architecture of the Capsule network