Computing prediction vectors

In the previous figure, u_1, u_2, and u_3 represent the output vectors from the capsules in the previous layer. First, we multiply these vectors by a weight matrix and compute a prediction vector:

û_j|i = W_ij u_i

Okay, so what exactly are we doing here, and what are prediction vectors? Let's consider a simple example. Say that capsule j is trying to predict whether an image contains a face. We have learned that capsules in the earlier layers detect basic features and send their results to the capsules in the higher layer. So, the capsules in the earlier layer, whose outputs are u_1, u_2, and u_3, detect basic low-level features, such as eyes, a nose, and a mouth, and send their results to the capsule in the high-level layer, that is, capsule j, which detects the face.

Thus, capsule j takes the outputs of the previous capsules, u_1, u_2, and u_3, as inputs and multiplies them by a weight matrix, W_ij.

The weight matrix W_ij represents the spatial and other relationships between the low-level features and the high-level feature. For instance, the weight W_1j tells us that the eyes should be at the top, W_2j tells us that the nose should be in the middle, and W_3j tells us that the mouth should be at the bottom. Note that the weight matrix captures not only the position (that is, the spatial relationship), but other relationships as well.

So, by multiplying the inputs by weights, we can predict the position of the face:

  • û_j|1 implies the predicted position of the face based on the eyes
  • û_j|2 implies the predicted position of the face based on the nose
  • û_j|3 implies the predicted position of the face based on the mouth

When all the predicted positions of the face are the same, that is, in agreement with each other, then we can say that the image contains a face. These weights are learned using backpropagation.
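The computation above can be sketched in NumPy. The capsule dimensions (8-D lower-level outputs, 16-D predictions), the random values, and the cosine-similarity agreement check are illustrative assumptions for this sketch, not the actual routing algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Outputs of three lower-level capsules (eyes, nose, mouth).
# An 8-dimensional pose vector per capsule is an illustrative choice.
u = rng.normal(size=(3, 8))       # u_1, u_2, u_3

# One weight matrix W_ij per lower-level capsule, mapping its 8-D
# output into the 16-D space of the higher-level (face) capsule j.
W = rng.normal(size=(3, 16, 8))   # W_1j, W_2j, W_3j

# Prediction vectors: u_hat[i] = W[i] @ u[i], i.e. û_j|i = W_ij u_i.
u_hat = np.einsum('ijk,ik->ij', W, u)
print(u_hat.shape)                # (3, 16)

# Agreement between two predictions can be measured with cosine
# similarity; values near 1 mean the capsules agree on the face's pose.
def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(u_hat[0], u_hat[1]))
```

When the pairwise similarities of û_j|1, û_j|2, and û_j|3 are all high, the predictions agree and the face capsule should activate.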
