Setting up a Markov Decision Process

The Markov Decision Process (MDP) forms the basis for setting up RL problems in which the outcome of a decision is semi-controlled; that is, it is partly random and partly controlled (by the decision-maker). An MDP is defined by a set of possible states (S), a set of possible actions (A), a real-valued reward function (R), and a set of transition probabilities from one state to another for a given action (T). In addition, the effect of an action performed in a state depends only on that state and not on any of its previous states.
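
The following is a minimal sketch of these four components as plain Python data structures. The two-state example, the action names, and all numeric values are illustrative assumptions, not taken from the text; they simply show how S, A, T, and R fit together and how the Markov property enters a single step.

```python
import random

# Illustrative two-state MDP (all names and numbers are assumed, not from the source).
states = ["s0", "s1"]              # S: set of possible states
actions = ["stay", "move"]         # A: set of possible actions

# T: transition probabilities, T[s][a][s'] = P(s' | s, a)
transition_probs = {
    "s0": {"stay": {"s0": 0.9, "s1": 0.1},
           "move": {"s0": 0.2, "s1": 0.8}},
    "s1": {"stay": {"s0": 0.1, "s1": 0.9},
           "move": {"s0": 0.7, "s1": 0.3}},
}

# R: real-valued reward R[s][a] for taking action a in state s
rewards = {
    "s0": {"stay": 0.0, "move": 1.0},
    "s1": {"stay": 0.5, "move": -1.0},
}

def step(state, action):
    """Sample the next state and reward.

    Only the current state and action matter (Markov property):
    the outcome is partly controlled (the chosen action) and
    partly random (the sampled next state).
    """
    next_dist = transition_probs[state][action]
    next_state = random.choices(list(next_dist), weights=list(next_dist.values()))[0]
    return next_state, rewards[state][action]

# Example: take one semi-controlled step from state "s0"
print(step("s0", "move"))
```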
