Non-linearities

First, we need to identify why we need non-linearities. Suppose we have two affine maps, f(x) = Ax + b and g(x) = Cx + d. Then f(g(x)) is given by the following equation:

f(g(x)) = A(Cx + d) + b = ACx + (Ad + b)

Here, we can see that when affine maps are composed, the result is again an affine map, where AC is a matrix and Ad + b is a vector.
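
To make this concrete, the following is a minimal sketch (with hypothetical matrices A, C and vectors b, d standing in for the ones in the equation above) that verifies numerically that f(g(x)) equals a single affine map with matrix AC and vector Ad + b:

import torch

# Hypothetical example shapes; any compatible shapes would do.
A = torch.randn(3, 3)
b = torch.randn(3)
C = torch.randn(3, 3)
d = torch.randn(3)
x = torch.randn(3)

f_of_g = A @ (C @ x + d) + b              # f(g(x))
composed = (A @ C) @ x + (A @ d + b)      # (AC)x + (Ad + b)
print(torch.allclose(f_of_g, composed))   # prints True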

From this, we can see that a neural network built as a long chain of affine maps, no matter how long, has no more power than a single affine map. However, if we introduce non-linearities in between the affine layers, this is no longer the case, and we can build much more powerful models.

There are a few core non-linearities; tanh(x), σ(x), and ReLU(x) are the most common. Let's apply one of them, as shown in the following code block:

# Let's see more about non-linearities.
# Most of the non-linearities in PyTorch live in torch.nn.functional,
# which we import as F.
# Note that, unlike affine maps, non-linearities usually have no parameters;
# that is, they have no weights that are updated during training.
import torch
import torch.nn.functional as F

data = torch.randn(2, 2)
print(data)
print(F.relu(data))

The output of the preceding code is as follows:

tensor([[ 0.5848,  0.2149],
        [-0.4090, -0.1663]])
tensor([[0.5848, 0.2149],
        [0.0000, 0.0000]])
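
For completeness, here is a similar minimal sketch (not from the original text) applying the other two common non-linearities, tanh and sigmoid, via their torch entry points:

import torch

data = torch.randn(2, 2)
# tanh squashes each element into (-1, 1); sigmoid squashes into (0, 1).
print(torch.tanh(data))
print(torch.sigmoid(data))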