The Swish function

The Swish function is an activation function recently introduced by researchers at Google. Unlike most common activation functions, which are monotonic, Swish is a non-monotonic function, meaning it is neither always non-increasing nor always non-decreasing. It has been reported to outperform ReLU in deep networks. It is simple and can be expressed as follows:

f(x) = x sigmoid(x)

Here, sigmoid(x) = 1 / (1 + e^(-x)) is the sigmoid function. The Swish function is shown in the following diagram:

We can also reparametrize the Swish function with a parameter β and express it as follows:

f(x) = 2x sigmoid(βx)

When the value of β is 0, sigmoid(0) = 1/2, so we get the identity function f(x) = x.

In that case, Swish becomes a linear function. When the value of β tends to infinity, sigmoid(βx) approaches the unit step function, so f(x) becomes 2 max(0, x), which is basically the ReLU function multiplied by the constant 2. So, the value of β acts as a good interpolation between a linear function and a nonlinear function. The Swish function can be implemented as shown:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def swish(x, beta):
    return 2 * x * sigmoid(beta * x)
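To see the two limiting cases numerically, we can check the β = 0 and large-β behavior directly. This is a minimal sketch assuming NumPy; the sigmoid and swish helpers below simply mirror the implementation above:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def swish(x, beta):
    return 2 * x * sigmoid(beta * x)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

# beta = 0: sigmoid(0) = 0.5, so swish(x, 0) = x, the identity function
print(np.allclose(swish(x, 0.0), x))                        # True

# large beta: sigmoid(beta * x) approaches the unit step,
# so swish approaches 2 * max(0, x), i.e. ReLU scaled by 2
print(np.allclose(swish(x, 50.0), 2 * np.maximum(0.0, x)))  # True
```

A moderate β such as 50 is enough here: sigmoid(±50·x) is already numerically indistinguishable from the unit step for these inputs, without the overflow warnings a very large β would trigger.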