Intuition

When building a GAN for generating images, we trained both the generator and the discriminator at the same time. After training, we can discard the discriminator because we only used it for training the generator.

Figure 6: Semi-supervised learning GAN architecture for an 11-class classification problem

In semi-supervised learning, we need to transform the discriminator into a multi-class classifier. This new model has to generalize well on the test set, even though we do not have many labeled examples for training. Additionally, this time, by the end of training we can actually throw away the generator. Note that the roles have changed: now, the generator is only used to help the discriminator during training. Put differently, the generator acts as a different source of information from which the discriminator gets raw, unlabeled training data. As we will see, this unlabeled data is key to improving the discriminator's performance. Also, note that in a regular image generation GAN, the discriminator has only one role: computing the probability that its inputs are real. Let's call this the GAN problem.

However, to turn the discriminator into a semi-supervised classifier, besides the GAN problem, the discriminator also has to learn the probability of each of the original dataset's classes. In other words, for each input image, the discriminator has to learn the probability of it being a one, a two, a three, and so on.

Recall that for an image generation GAN discriminator, we have a single sigmoid unit output. This value represents the probability of an input image being real (value close to 1) or fake (value near 0). In other words, from the discriminator's point of view, values close to 1 mean that the samples are likely to come from the training set. Likewise, values near 0 mean a higher chance that the samples come from the generator network. By using this probability, the discriminator is able to send a signal back to the generator. This signal allows the generator to adapt its parameters during training, improving its ability to create realistic images.
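
As a minimal sketch of that single sigmoid output (in PyTorch, with a placeholder feature size and batch size rather than the chapter's actual architecture):

```python
import torch
import torch.nn as nn

# A minimal sketch of the regular GAN discriminator head described above.
# The 128-dim features and batch size of 16 are illustrative placeholders.
features = torch.randn(16, 128)          # features for a batch of 16 images
head = nn.Linear(128, 1)                 # a single output unit
p_real = torch.sigmoid(head(features))   # near 1: likely real; near 0: likely fake
```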

We have to convert the discriminator (from the previous GAN) into an 11-class classifier. To do that, we can turn its sigmoid output into a softmax with 11 outputs: the first 10 for the individual class probabilities of the SVHN dataset (zero to nine), and the 11th for all the fake images that come from the generator.

Note that because the 11 softmax probabilities sum to 1, the sum of the first 10 probabilities equals 1 minus the fake-class probability. That sum is the probability of the image being real, which is exactly the quantity the sigmoid output used to represent.
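
The following sketch makes that relationship concrete. It assumes the discriminator body has already produced 11 raw logits per image; the random logits and the batch size of 16 are placeholders:

```python
import torch
import torch.nn.functional as F

# Hypothetical discriminator outputs: indices 0-9 are the SVHN digits,
# index 10 is the fake class.
logits = torch.randn(16, 11)

probs = F.softmax(logits, dim=1)     # 11 probabilities that sum to 1
p_fake = probs[:, 10]                # probability of the 11th (fake) class
p_real = probs[:, :10].sum(dim=1)    # sum of the 10 digit-class probabilities

# p_real plays the same role the old sigmoid output did: it equals
# 1 - p_fake for every image in the batch.
assert torch.allclose(p_real, 1.0 - p_fake, atol=1e-6)
```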

Finally, we need to set up the losses in such a way that the discriminator can do both of the following (one common formulation is sketched after this list):

  • Help the generator learn to produce realistic images. To do that, we have to instruct the discriminator to distinguish between real and fake samples.
  • Use the generator’s images, along with the labeled and unlabeled training data, to help classify the dataset.
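
One common way to implement this double duty is to split the discriminator loss into a supervised term (cross-entropy over the 10 digit classes) and an unsupervised GAN term (real versus fake). The sketch below is one such formulation, not necessarily the chapter's exact code; the function names and the 1e-8 stabilizer are illustrative:

```python
import torch
import torch.nn.functional as F

FAKE = 10  # index of the 11th ("fake") class

def discriminator_losses(logits_lab, labels, logits_unl, logits_gen):
    # Supervised term: ordinary cross-entropy for the labeled real images,
    # whose labels are digit classes 0-9.
    supervised = F.cross_entropy(logits_lab, labels)

    # Unsupervised (GAN) term: real images should not land in the fake
    # class, while generated images should.
    p_fake_unl = F.softmax(logits_unl, dim=1)[:, FAKE]
    p_fake_gen = F.softmax(logits_gen, dim=1)[:, FAKE]
    unsupervised = (-torch.log(1.0 - p_fake_unl + 1e-8).mean()
                    - torch.log(p_fake_gen + 1e-8).mean())
    return supervised + unsupervised

def generator_loss(logits_gen):
    # One simple choice: push the generator's images away from the fake class.
    p_fake_gen = F.softmax(logits_gen, dim=1)[:, FAKE]
    return -torch.log(1.0 - p_fake_gen + 1e-8).mean()
```

The generator loss shown here is a simple choice for illustration; other variants, such as feature matching, are also commonly used for this setup.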

To summarize, the discriminator has three different sources of training data (the sketch after this list ties them together):

  • Real images with labels. These are image-label pairs, as in any regular supervised classification problem.
  • Real images without labels. For those, the classifier only learns that the images are real.
  • Images from the generator. The discriminator learns to classify these as fake.
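
To make the three roles concrete, here is a hypothetical training step that feeds each source through the same discriminator and combines them with the loss sketch above; D, G, and the batch tensors are placeholders:

```python
# A hypothetical discriminator update over the three data sources.
# discriminator_losses is the sketch from the previous listing.
def discriminator_step(D, G, x_labeled, y_labeled, x_unlabeled, z):
    logits_lab = D(x_labeled)       # real images with labels
    logits_unl = D(x_unlabeled)     # real images without labels
    logits_gen = D(G(z).detach())   # generator images, detached so only D updates
    return discriminator_losses(logits_lab, y_labeled, logits_unl, logits_gen)
```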

The combination of these different sources of data makes the classifier able to learn from a broader perspective. That, in turn, allows the model to perform inference much more precisely than it would if it used only the 1,000 labeled examples for training.
