Monte Carlo prediction

As we know, Monte Carlo methods predict the state-value function for a given policy. The value of any state is the expected return, that is, the expected cumulative future discounted reward, starting from that state. MC methods estimate these values simply by averaging the returns observed after visits to that state. As more and more returns are observed, the average converges to the expected value by the law of large numbers. This is the principle behind all Monte Carlo methods. The Monte Carlo policy evaluation algorithm consists of the following steps:

Initialize:

Repeat forever:
- Generate an episode using π
- For each state s appearing in the episode:
  - G ← return following the first occurrence of s
  - Append G to Returns(s)
  - V(s) ← average(Returns(s))
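
The steps above can be sketched in Python as follows. This is a minimal illustration, not the book's implementation: the function names `first_visit_mc_prediction` and `random_walk_episode`, the five-state random-walk environment, and the `gamma` parameter are all assumptions made for the example. The initialization shown (an empty `Returns(s)` list for every state and an empty value table) is also an assumption. The policy π is implicit in the episode generator, which here moves left or right with equal probability.

```python
import random
from collections import defaultdict


def first_visit_mc_prediction(generate_episode, num_episodes, gamma=1.0):
    """Estimate V(s) by averaging the returns that follow first visits to s.

    generate_episode() is a hypothetical callback that returns one episode
    as a list of (state, reward) pairs, where reward is the reward received
    on leaving that state.
    """
    returns = defaultdict(list)  # Returns(s): assumed to start empty
    V = {}                       # value estimates: assumed to start empty

    for _ in range(num_episodes):  # "repeat forever", truncated for the sketch
        episode = generate_episode()

        # Work backwards to get the discounted return following each step.
        G = 0.0
        returns_at = [0.0] * len(episode)
        for t in range(len(episode) - 1, -1, -1):
            _, reward = episode[t]
            G = reward + gamma * G
            returns_at[t] = G

        # Use only the first occurrence of each state in the episode.
        seen = set()
        for t, (state, _) in enumerate(episode):
            if state in seen:
                continue
            seen.add(state)
            returns[state].append(returns_at[t])
            V[state] = sum(returns[state]) / len(returns[state])

    return V


def random_walk_episode(rng, start=2):
    """Hypothetical environment: states 0..4, with 0 and 4 terminal.

    The policy steps left or right with equal probability; reaching
    state 4 yields reward 1, every other transition yields reward 0.
    """
    state = start
    episode = []
    while state not in (0, 4):
        next_state = state + rng.choice((-1, 1))
        reward = 1.0 if next_state == 4 else 0.0
        episode.append((state, reward))
        state = next_state
    return episode


rng = random.Random(0)  # fixed seed so the run is repeatable
V = first_visit_mc_prediction(lambda: random_walk_episode(rng), num_episodes=5000)
for s in sorted(V):
    print(s, round(V[s], 3))
```

For this toy walk with `gamma=1.0`, the true values of states 1, 2, and 3 are 0.25, 0.5, and 0.75, so the printed averages should approach those numbers as the number of episodes grows. Storing every return in a list mirrors the pseudocode directly; an incremental running mean would give the same estimates with less memory.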
