Leaky ReLU

We mentioned that we will be using a variant of the ReLU activation function called leaky ReLU. The traditional ReLU activation function simply takes the maximum of the input value and zero, in other words truncating negative values to zero. Leaky ReLU, which is the version that we will be using, allows small negative values to pass through, hence the name leaky ReLU.
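To make the difference concrete, here is a minimal NumPy sketch that applies both activations to a handful of inputs. The function names and the alpha value of 0.01 are illustrative choices, not values taken from the text:

```python
import numpy as np

def relu(x):
    # Traditional ReLU: negative inputs are truncated to zero
    return np.maximum(x, 0.0)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: negative inputs are scaled by alpha instead of zeroed
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))        # roughly [0.  0.  0.  1.5]
print(leaky_relu(x))  # roughly [-0.02  -0.005  0.  1.5]
```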

Sometimes, if we use the traditional ReLU activation function, the network gets stuck in a common state known as the dying ReLU state, where the network produces nothing but zeros for all of its outputs.

The idea of using leaky ReLU is to prevent this dying state by allowing some negative values to pass through.

The whole idea behind making the generator work is that it receives gradient values from the discriminator; if the network is stuck in the dying state, the learning process won't happen.

The following figures illustrate the difference between traditional ReLU and its leaky version:

Figure 4: ReLU function
Figure 5: Leaky ReLU activation functions

The leaky ReLU activation function is not implemented in TensorFlow, so we need to implement it ourselves. The output of this activation function is the input itself if the input is positive, and a controlled negative value if the input is negative. We control that negative value with a parameter called alpha, which adds some tolerance to the network by allowing small negative values to pass through.

The following equation represents the leaky ReLU that we will be implementing:

f(x) = max(αx, x)

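Following this equation, a minimal sketch of such a helper might look like the following. The function name leaky_relu and the default alpha of 0.2 are illustrative choices rather than values fixed by the text:

```python
import tensorflow as tf

def leaky_relu(x, alpha=0.2):
    # f(x) = max(alpha * x, x): identity for positive inputs,
    # a small negative slope (alpha) for negative inputs
    return tf.maximum(alpha * x, x)
```

Note that more recent TensorFlow releases ship tf.nn.leaky_relu, so defining your own helper like this is only necessary on older versions.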