Prototypical networks

Prototypical networks are another simple, efficient, and popular few-shot learning algorithm. Like siamese networks, they learn a metric space in which to perform classification.

The basic idea of the prototypical network is to create a prototypical representation of each class and classify a query point (new point) based on the distance between the class prototype and the query point.

Let's say we have a support set comprising images of lions, elephants, and dogs, as shown in the following diagram:

We have three classes (lion, elephant, and dog), and we need to create a prototypical representation for each of them. How can we build these prototypes? First, we learn the embedding of each data point using an embedding function. The embedding function can be any function that extracts features. Since our inputs are images, we can use a convolutional network as the embedding function, which will extract features from the input images, shown as follows:

Once we have learned the embedding of each data point, we take the mean of the embeddings of the data points in each class to form the class prototype, shown as follows. So, a class prototype is simply the mean embedding of the data points in a class:
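The averaging step can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name class_prototypes is hypothetical, and the embeddings are assumed to be precomputed by some embedding function (made-up 2-D vectors stand in for convolutional features here).

```python
import numpy as np

def class_prototypes(embeddings, labels):
    """Return (classes, prototypes): one mean embedding per class.

    embeddings: array of shape (n_points, dim), assumed to be precomputed
    labels: array of shape (n_points,) holding a class id for each point
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)  # sorted unique class ids
    # Average the embeddings that belong to each class.
    prototypes = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, prototypes

# Toy support set: two made-up 2-D embeddings per class.
support_embeddings = [[1.0, 0.0], [3.0, 0.0],   # lion
                      [0.0, 2.0], [0.0, 4.0],   # elephant
                      [5.0, 5.0], [7.0, 7.0]]   # dog
support_labels = ["lion", "lion", "elephant", "elephant", "dog", "dog"]

classes, prototypes = class_prototypes(support_embeddings, support_labels)
```

Each row of prototypes is the mean embedding of the class at the same position in classes, which np.unique returns in sorted order.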

Similarly, when a new data point comes in, that is, a query point whose label we want to predict, we generate its embedding using the same embedding function that we used to create the class prototypes; in our case, the convolutional network:

Once we have the embedding of our query point, we compare its distance to each class prototype to find which class the query point belongs to. We can use Euclidean distance to measure the distance between the class prototypes and the query point embedding, as shown below:

After finding the distances between the class prototypes and the query point embedding, we apply softmax to the negative distances to get the class probabilities, so that a closer prototype receives a higher probability. Since we have three classes, that is, lion, elephant, and dog, we will get three probabilities. The class with the highest probability is the predicted class of our query point.
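The distance and softmax steps can be sketched as follows. This is an illustrative sketch only: classify_query is a hypothetical helper, the prototype and query values are made up, and plain Euclidean distance is assumed (some formulations use the squared distance instead). The softmax is taken over negative distances so that the nearest prototype gets the largest probability.

```python
import numpy as np

def classify_query(query_embedding, prototypes):
    """Return class probabilities for a single query embedding."""
    query_embedding = np.asarray(query_embedding, dtype=float)
    prototypes = np.asarray(prototypes, dtype=float)
    # Euclidean distance from the query to each class prototype.
    distances = np.linalg.norm(prototypes - query_embedding, axis=1)
    # Softmax over negative distances, shifted by the max for numerical stability.
    logits = -distances
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

# Made-up prototypes for the three classes, in this order.
class_names = ["lion", "elephant", "dog"]
prototypes = np.array([[2.0, 0.0], [0.0, 3.0], [6.0, 6.0]])

probs = classify_query([1.5, 0.5], prototypes)
predicted = class_names[int(np.argmax(probs))]
```

Here the query lies closest to the first prototype, so predicted is "lion".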

Since we want our network to learn from just a few data points, that is, to perform few-shot learning, we train it in the same way it will be used. We use episodic training: for each episode, we randomly sample a few data points from each of the classes in our dataset and call that the support set, and we train the network using only the support set instead of the whole dataset. Similarly, we randomly sample a point from the dataset as a query point and try to predict its class. In this way, our network learns how to learn from a few data points.
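Sampling one episode might look like the sketch below. The function sample_episode and its parameters n_way, k_shot, and n_query are hypothetical names chosen for illustration, and the dataset is assumed to be a dictionary mapping each class name to a list of examples. As a further assumption, query points are drawn per class and kept disjoint from the support set.

```python
import random

def sample_episode(dataset, n_way, k_shot, n_query, rng=random):
    """Sample one episode from a dict mapping class name -> list of examples.

    Returns (support, query), each a list of (example, class_name) pairs.
    """
    # Pick the classes that take part in this episode.
    classes = rng.sample(sorted(dataset), n_way)
    support, query = [], []
    for c in classes:
        # Draw support and query examples together so they do not overlap.
        examples = rng.sample(dataset[c], k_shot + n_query)
        support += [(x, c) for x in examples[:k_shot]]
        query += [(x, c) for x in examples[k_shot:]]
    return support, query

# Toy dataset: ten placeholder examples for each of four classes.
toy_dataset = {name: [f"{name}_{i}" for i in range(10)]
               for name in ["lion", "elephant", "dog", "cat"]}

support, query = sample_episode(toy_dataset, n_way=3, k_shot=2, n_query=1,
                                rng=random.Random(0))
```

Passing a seeded random.Random instance makes the episode reproducible; in training, a fresh episode would be sampled at every step.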

The overall flow of the prototypical network is shown in the following figure. As you can see, we first generate the embeddings for all the data points in our support set and build each class prototype by taking the mean embedding of the data points in that class. We also generate the embedding for our query point. Then we compute the Euclidean distance between each class prototype and the query point embedding. Finally, we apply softmax to the negative distances to get the class probabilities.

As you can see in the following diagram, since our query point is a lion, the probability for lion is the highest, which is 0.9:

Prototypical networks are used not only for one-shot and few-shot learning, but also for zero-shot learning. Consider a case where we have no data points for a class, but we do have meta-information containing a high-level description of each class.

In such cases, we learn an embedding of the meta-information of each class to form the class prototype, and then perform classification with the class prototypes as before.
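A rough sketch of the zero-shot case follows. Everything in it is an assumption made for illustration: the attribute vectors in class_meta are invented, the random matrix W merely stands in for a learned meta-information embedding, and the query embedding is faked by nudging one prototype rather than being produced by an image embedding network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical meta-information: one attribute vector per class,
# e.g. [has_mane, has_trunk, barks].
class_meta = {
    "lion": [1.0, 0.0, 0.0],
    "elephant": [0.0, 1.0, 0.0],
    "dog": [0.0, 0.0, 1.0],
}

# Stand-in for a learned embedding of the meta-information (3 -> 4 dims).
W = rng.normal(size=(3, 4))

def meta_prototypes(class_meta, W):
    """Embed each class's meta-information vector to obtain its prototype."""
    names = list(class_meta)
    V = np.array([class_meta[n] for n in names], dtype=float)
    return names, V @ W

names, prototypes = meta_prototypes(class_meta, W)

# Fake query embedding placed very close to the lion prototype.
query = prototypes[0] + 0.01
distances = np.linalg.norm(prototypes - query, axis=1)
predicted = names[int(np.argmin(distances))]
```

The only change from the few-shot case is where the prototypes come from: they are embeddings of class descriptions rather than means of support embeddings, and classification by nearest prototype proceeds as before.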
