Updating the model parameters

Now that we've computed the gradients, we need to update our model parameters according to our update rule, as follows:

$$\theta = \theta - \alpha \, \nabla_\theta J(\theta)$$

Here, $\alpha$ is the learning rate (set as lr in the code) and $\nabla_\theta J(\theta)$ is the gradient of the cost function with respect to the parameters.

Since we stored the two model parameters in theta[0] and theta[1], we can write the update equation for each parameter $\theta_j$ as follows:

$$\theta_j = \theta_j - \alpha \, \frac{\partial J(\theta)}{\partial \theta_j}$$

As we learned in the previous section, a single parameter update will not lead us to convergence, that is, to the minimum of the cost function, so we need to compute the gradients and update the model parameters for several iterations.
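Written with an explicit iteration index $t$, every iteration applies the same rule to the current parameter estimate:

$$\theta^{(t+1)} = \theta^{(t)} - \alpha \, \nabla_\theta J\bigl(\theta^{(t)}\bigr)$$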

First, we need to set the number of iterations:

num_iterations = 50000

Now, we need to define the learning rate:

lr = 1e-2

Next, we will define a list called loss for storing the loss on every iteration:

loss = []

On each iteration, we will compute the gradients and update the parameters according to our parameter update rule from equation (8):

theta = np.zeros(2)

for t in range(num_iterations):

    # compute the gradients
    gradients = compute_gradients(data, theta)

    # update the parameters
    theta = theta - (lr * gradients)

    # store the loss
    loss.append(loss_function(data, theta))
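Note that this loop relies on the data array and on the compute_gradients and loss_function helpers defined in the previous section. If you want to run this section on its own, the following minimal sketch provides illustrative stand-ins for them, assuming a simple linear model and an N x 2 data array with x in the first column and y in the second; these definitions are assumptions for illustration, not the exact code from the earlier section.

import numpy as np

# Hypothetical stand-ins for the objects defined in the previous section.
# Synthetic data drawn around the line y = 2x + 0.5 (an assumed example).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100)
y = 2 * x + 0.5 + rng.normal(0, 0.1, 100)
data = np.column_stack((x, y))

def loss_function(data, theta):
    # mean squared error of the line y = theta[0] * x + theta[1]
    predictions = theta[0] * data[:, 0] + theta[1]
    return np.mean((data[:, 1] - predictions) ** 2)

def compute_gradients(data, theta):
    # gradient of the mean squared error with respect to theta[0] and theta[1]
    errors = data[:, 1] - (theta[0] * data[:, 0] + theta[1])
    grad_0 = -2 * np.mean(errors * data[:, 0])
    grad_1 = -2 * np.mean(errors)
    return np.array([grad_0, grad_1])

With these stand-ins, running the loop above drives theta toward the parameters of the line used to generate the data (roughly [2.0, 0.5] in this example).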

Now, let's plot the loss (cost) over the training iterations:

plt.plot(loss)
plt.grid()
plt.xlabel('Training Iterations')
plt.ylabel('Cost')
plt.title('Gradient Descent')
plt.show()
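In addition to the plot, it can be helpful to print the learned parameters and the final loss value as a quick numerical check (theta and loss are the variables defined above):

print("learned parameters:", theta)
print("final loss:", loss[-1])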

The following plot shows how the loss (Cost) decreases over the training iterations:

Thus, we learned how gradient descent can be used to find the model parameters that minimize the loss. In the next section, we will learn about several variants of the gradient descent algorithm.
