The loss function is the mean squared error, J = (1/N) Σᵢ (yᵢ − (m·xᵢ + b))². The gradient of the loss function, J, with respect to the parameter m is given as follows:

∂J/∂m = −(2/N) Σᵢ xᵢ (yᵢ − (m·xᵢ + b))    — (4)

The gradient of the loss function, J, with respect to the parameter b is given as follows:

∂J/∂b = −(2/N) Σᵢ (yᵢ − (m·xᵢ + b))    — (5)
We define a function called compute_gradients, which takes the data and the model parameters, theta, as input and returns the computed gradients:
def compute_gradients(data, theta):
Now, we need to initialize the gradients:
gradients = np.zeros(2)
Then, we store the total number of data points in N:
N = float(len(data))
Now, we can get the values of m and b:
m = theta[0]
b = theta[1]
Next, we iterate over every data point:
for i in range(len(data)):
Then, we get the values of x and y:
x = data[i, 0]
y = data[i, 1]
Now, we compute the gradient of the loss with respect to m, as given in equation (4). Note that we only accumulate the sum here; the division by N happens once, at the end (the original snippet applied the 2/N factor inside the loop and then divided by N again, which divides by N twice):

gradients[0] += -2 * x * (y - ((m * x) + b))

Then, we compute the gradient of the loss with respect to b, as given in equation (5):

gradients[1] += -2 * (y - ((m * x) + b))

Finally, we divide the accumulated sums by the number of data points, adding a small epsilon to avoid a division by zero error:

epsilon = 1e-6
gradients = np.divide(gradients, N + epsilon)
return gradients
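Putting the snippets together, the complete function looks as follows. For comparison, a vectorized NumPy equivalent is also sketched below; the vectorized form (compute_gradients_vectorized) is not part of the original text, but it computes the same quantities without the Python loop:

```python
import numpy as np

def compute_gradients(data, theta):
    # Gradients of the MSE loss with respect to m and b (equations 4 and 5)
    gradients = np.zeros(2)
    N = float(len(data))
    m = theta[0]
    b = theta[1]
    for i in range(len(data)):
        x = data[i, 0]
        y = data[i, 1]
        # Accumulate the sums; division by N is done once, at the end
        gradients[0] += -2 * x * (y - ((m * x) + b))  # dJ/dm, equation (4)
        gradients[1] += -2 * (y - ((m * x) + b))      # dJ/db, equation (5)
    # epsilon avoids a division by zero error for an empty dataset
    epsilon = 1e-6
    gradients = np.divide(gradients, N + epsilon)
    return gradients

def compute_gradients_vectorized(data, theta):
    # Same computation expressed with NumPy array operations (a sketch,
    # not from the original text)
    x, y = data[:, 0], data[:, 1]
    residual = y - (theta[0] * x + theta[1])
    N = float(len(data))
    epsilon = 1e-6
    dm = (-2 * x * residual).sum() / (N + epsilon)
    db = (-2 * residual).sum() / (N + epsilon)
    return np.array([dm, db])
```

Both versions return the same two-element array, so either can be dropped into a gradient descent loop.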
When we feed our randomly initialized data and the theta model parameters, the compute_gradients function returns the gradient with respect to m, that is, ∂J/∂m, and the gradient with respect to b, that is, ∂J/∂b, as follows (the exact values depend on the randomly initialized data):
compute_gradients(data,theta)
array([-9.08423989e-05, 1.05174511e-04])
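A useful way to verify analytic gradients like these is a numerical gradient check: perturb each parameter slightly and approximate the derivative with central finite differences. The sketch below is not part of the original text; the loss function it uses is the mean squared error that equations (4) and (5) were derived from:

```python
import numpy as np

def loss(data, theta):
    # Mean squared error of the line y = m*x + b over the dataset
    x, y = data[:, 0], data[:, 1]
    return np.mean((y - (theta[0] * x + theta[1])) ** 2)

def numerical_gradients(data, theta, h=1e-5):
    # Central differences: dJ/dtheta_j ~ (J(theta + h) - J(theta - h)) / (2h)
    grads = np.zeros_like(theta, dtype=float)
    for j in range(len(theta)):
        plus, minus = theta.copy(), theta.copy()
        plus[j] += h
        minus[j] -= h
        grads[j] = (loss(data, plus) - loss(data, minus)) / (2 * h)
    return grads
```

If the numerical gradients agree with the analytic ones to several decimal places, the implementation of equations (4) and (5) is almost certainly correct.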