Gradients with respect to U

Now we will see how to calculate the gradients of the loss with respect to the input-to-hidden weights, $U$, for all the gates and the content state. Computing the gradients with respect to $U$ is exactly the same as computing those with respect to $W$, except that the last term will be $x_t$ instead of $h_{t-1}$, similar to what we learned when we covered the LSTM cell.
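To make the "last term" remark concrete, here is a minimal sketch for a single gate at a single time step. It assumes a pre-activation of the form a = U x_t + W h_{t-1} in column-vector convention, and it assumes the upstream gradient dL/da (called `delta` here) has already been computed. The names `outer`, `gate_weight_grads`, and `delta`, and all the numbers, are hypothetical and chosen only for illustration.

```python
# Sketch only: one gate with pre-activation a = U x_t + W h_prev (assumed form).
# delta stands for dL/da, assumed to be already computed upstream.

def outer(u, v):
    """Outer product of two vectors given as plain lists."""
    return [[ui * vj for vj in v] for ui in u]

def gate_weight_grads(delta, x_t, h_prev):
    """Return (dL/dU, dL/dW) for one time step of one gate."""
    grad_U = outer(delta, x_t)     # last factor is the input x_t
    grad_W = outer(delta, h_prev)  # last factor is the previous hidden state
    return grad_U, grad_W

delta = [0.5, -1.0]      # hypothetical dL/da, hidden size 2
x_t = [1.0, 2.0, 3.0]    # hypothetical input, input size 3
h_prev = [0.1, 0.2]      # hypothetical previous hidden state

grad_U, grad_W = gate_weight_grads(delta, x_t, h_prev)
print(grad_U)  # same shape as U: hidden size x input size
print(grad_W)  # same shape as W: hidden size x hidden size
```

Both gradients share the same upstream factor `delta`. Only the final factor differs, which is why the two derivations look alike. Over a full sequence, these per-step terms would be summed across time steps, since the weights are shared.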

We can write the gradients of loss with respect to as:

The gradients of loss with respect to are represented as follows:

The gradients of loss with respect to are represented as follows:
