Updating the model parameters

Now that we've computed the gradients, we need to update our model parameters according to our update rule, as follows:

$$\theta = \theta - \alpha \, \nabla_\theta J(\theta)$$

Here, $\alpha$ is the learning rate (set as lr in the code) and $\nabla_\theta J(\theta)$ is the gradient of the cost function with respect to the parameters.

Since we stored the two model parameters in theta[0] and theta[1], we can write the update equation for each parameter $\theta_j$ as follows:

$$\theta_j = \theta_j - \alpha \, \frac{\partial J(\theta)}{\partial \theta_j}$$

As we learned in the previous section, a single parameter update will not lead us to convergence, that is, to the minimum of the cost function, so we need to compute the gradients and update the model parameters for several iterations.
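Written with an explicit iteration index $t$, every iteration applies the same rule to the current parameter estimate:

$$\theta^{(t+1)} = \theta^{(t)} - \alpha \, \nabla_\theta J\bigl(\theta^{(t)}\bigr)$$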

First, we need to set the number of iterations:

num_iterations = 50000

Now, we need to define the learning rate:

lr = 1e-2

Next, we will define a list called loss for storing the loss on every iteration:

loss = []

On each iteration, we will compute the gradients and update the parameters according to our parameter update rule from equation (8):

theta = np.zeros(2)

for t in range(num_iterations):

    # compute the gradients
    gradients = compute_gradients(data, theta)

    # update the parameters
    theta = theta - (lr * gradients)

    # store the loss
    loss.append(loss_function(data, theta))
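Note that this loop relies on the data array and on the compute_gradients and loss_function helpers defined in the previous section. If you want to run this section on its own, the following minimal sketch provides illustrative stand-ins for them, assuming a simple linear model and an N x 2 data array with x in the first column and y in the second; these definitions are assumptions for illustration, not the exact code from the earlier section.

import numpy as np

# Hypothetical stand-ins for the objects defined in the previous section.
# Synthetic data drawn around the line y = 2x + 0.5 (an assumed example).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100)
y = 2 * x + 0.5 + rng.normal(0, 0.1, 100)
data = np.column_stack((x, y))

def loss_function(data, theta):
    # mean squared error of the line y = theta[0] * x + theta[1]
    predictions = theta[0] * data[:, 0] + theta[1]
    return np.mean((data[:, 1] - predictions) ** 2)

def compute_gradients(data, theta):
    # gradient of the mean squared error with respect to theta[0] and theta[1]
    errors = data[:, 1] - (theta[0] * data[:, 0] + theta[1])
    grad_0 = -2 * np.mean(errors * data[:, 0])
    grad_1 = -2 * np.mean(errors)
    return np.array([grad_0, grad_1])

With these stand-ins, running the loop above drives theta toward the parameters of the line used to generate the data (roughly [2.0, 0.5] in this example).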

Now, let's plot the loss (cost) over the training iterations:

plt.plot(loss)
plt.grid()
plt.xlabel('Training Iterations')
plt.ylabel('Cost')
plt.title('Gradient Descent')
plt.show()
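In addition to the plot, it can be helpful to print the learned parameters and the final loss value as a quick numerical check (theta and loss are the variables defined above):

print("learned parameters:", theta)
print("final loss:", loss[-1])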

The following plot shows how the loss (Cost) decreases over the training iterations:

Thus, we learned how gradient descent can be used to find the model parameters that minimize the loss. In the next section, we will learn about several variants of the gradient descent algorithm.
