Gradients with respect to the input-to-hidden layer weight, U

Computing the gradient of the loss function with respect to U works the same way as it did for W, since here too we take the sequential derivative of the hidden state h_j. Similar to W, to compute the derivative of any loss L_j with respect to U, we need to traverse all the way back to the initial hidden state, h_0.

The final equation for computing the gradient of the loss with respect to U is given as follows. As you may notice, it is basically the same as equation (15), except that it has the term ∂h_k/∂U instead of ∂h_k/∂W:

∂L_j/∂U = Σ_{k=1}^{j} (∂L_j/∂ŷ_j) (∂ŷ_j/∂h_j) (∂h_j/∂h_k) (∂h_k/∂U)

We have already seen how to compute the first two terms, ∂L_j/∂ŷ_j and ∂ŷ_j/∂h_j, in the previous section.

Let's look at the final term, ∂h_k/∂U. We know that the hidden state h_k is computed as h_k = tanh(U x_k + W h_{k-1}). Thus, the derivative of h_k with respect to U becomes:

∂h_k/∂U = (1 − tanh²(U x_k + W h_{k-1})) x_k
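This derivative of the tanh hidden state with respect to U can be verified numerically. The following is a minimal NumPy sketch; the matrix shapes and random values are illustrative assumptions, not taken from the text. Since entry U[i, m] only affects the pre-activation of unit i, the analytic derivative ∂h_k[i]/∂U[i, m] = (1 − tanh²(z_i)) x_k[m] is the outer product of (1 − h_k²) and x_k:

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, inp = 4, 3                     # assumed layer sizes for illustration
U = rng.normal(size=(hidden, inp))     # input-to-hidden weights
W = rng.normal(size=(hidden, hidden))  # hidden-to-hidden weights
x_k = rng.normal(size=inp)             # input at step k
h_prev = rng.normal(size=hidden)       # previous hidden state h_{k-1}

z = U @ x_k + W @ h_prev
h_k = np.tanh(z)

# Analytic derivative: dh_k[i]/dU[i, m] = (1 - tanh^2(z_i)) * x_k[m]
analytic = np.outer(1.0 - h_k**2, x_k)

# Finite-difference check: perturb each entry of U and measure the change in h_k
eps = 1e-6
numeric = np.zeros_like(U)
for i in range(hidden):
    for m in range(inp):
        U_p = U.copy()
        U_p[i, m] += eps
        h_p = np.tanh(U_p @ x_k + W @ h_prev)
        numeric[i, m] = (h_p[i] - h_k[i]) / eps

assert np.allclose(analytic, numeric, atol=1e-4)
```

Gradient checking like this is a useful sanity test whenever you derive a backward pass by hand.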

So, our final equation for the gradient of the loss L_j, with respect to U, can be written as follows:

∂L_j/∂U = Σ_{k=1}^{j} (∂L_j/∂ŷ_j) (∂ŷ_j/∂h_j) (∂h_j/∂h_k) (1 − tanh²(U x_k + W h_{k-1})) x_k
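To see the whole traversal back through time in code, here is a minimal NumPy sketch of BPTT for U. To keep the focus on ∂L/∂U, it assumes a squared-error loss applied directly to the hidden state h_t rather than the softmax output layer used in the full model; the sequence length, shapes, and targets are illustrative assumptions. The backward loop accumulates the per-timestep contributions, each one an outer product of the error flowing through the tanh with the input x_t, and a finite-difference check confirms the result:

```python
import numpy as np

rng = np.random.default_rng(1)
T, inp, hidden = 5, 3, 4                      # assumed sequence length and sizes
xs = rng.normal(size=(T, inp))                # inputs x_1 .. x_T
targets = rng.normal(size=(T, hidden))        # hypothetical per-step targets
U = 0.1 * rng.normal(size=(hidden, inp))      # input-to-hidden weights
W = 0.1 * rng.normal(size=(hidden, hidden))   # hidden-to-hidden weights

# Forward pass: h_t = tanh(U x_t + W h_{t-1}), starting from h_0 = 0
hs = [np.zeros(hidden)]
for t in range(T):
    hs.append(np.tanh(U @ xs[t] + W @ hs[-1]))

def total_loss(U_, W_):
    """Sum of squared errors on the hidden states (assumed loss)."""
    h, total = np.zeros(hidden), 0.0
    for t in range(T):
        h = np.tanh(U_ @ xs[t] + W_ @ h)
        total += 0.5 * np.sum((h - targets[t]) ** 2)
    return total

# Backward pass (BPTT): traverse from t = T back to t = 1, accumulating dL/dU
dU = np.zeros_like(U)
dh_next = np.zeros(hidden)                    # gradient arriving from h_{t+1}
for t in reversed(range(T)):
    dh = (hs[t + 1] - targets[t]) + dh_next   # direct loss term + recurrent term
    dz = (1.0 - hs[t + 1] ** 2) * dh          # back through tanh
    dU += np.outer(dz, xs[t])                 # dz_t/dU contributes x_t
    dh_next = W.T @ dz                        # propagate gradient to h_{t-1}

# Finite-difference check of the accumulated gradient
eps = 1e-6
num = np.zeros_like(U)
for i in range(hidden):
    for m in range(inp):
        U_p = U.copy()
        U_p[i, m] += eps
        num[i, m] = (total_loss(U_p, W) - total_loss(U, W)) / eps

assert np.allclose(dU, num, atol=1e-4)
```

Note that each timestep's contribution to dU involves the current input x_t, whereas the corresponding update for W (from the previous section) would use h_{t-1} in its place; that is exactly the ∂h_k/∂U versus ∂h_k/∂W swap in the equation above.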
