
Book Description

Deep Reinforcement Learning in Action teaches you how to program AI agents that adapt and improve based on direct feedback from their environment. In this example-rich tutorial, you’ll master foundational and advanced DRL techniques by taking on interesting challenges like navigating a maze and playing video games. Along the way, you’ll work with core algorithms such as deep Q-networks and policy gradients, and with industry-standard tools like PyTorch and OpenAI Gym.
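
To give a flavor of what that looks like in practice, here is a minimal sketch of the agent-environment feedback loop the book builds on: an (untrained) PyTorch network choosing actions in Gym's CartPole task. The environment id, network sizes, and the classic Gym API shown are illustrative assumptions for this sketch, not code from the book.

```python
import gym
import torch

# Build the CartPole environment. (Shown with the classic Gym API;
# newer Gym/Gymnasium releases return (obs, info) from reset() and
# a 5-tuple from step().)
env = gym.make("CartPole-v1")

# A throwaway Q-network: 4 observation values in, 2 action values out.
qnet = torch.nn.Sequential(
    torch.nn.Linear(4, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 2),
)

state = env.reset()
for _ in range(200):
    with torch.no_grad():
        qvals = qnet(torch.from_numpy(state).float())  # predict a value per action
    action = int(torch.argmax(qvals))   # act greedily on those predictions
    state, reward, done, _ = env.step(action)  # environment returns feedback
    if done:
        state = env.reset()             # pole fell; start a new episode
env.close()
```

From this bare loop, the chapters below add the pieces that make the agent actually learn: Q-learning updates, experience replay, target networks, policy gradients, and more.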

Table of Contents

  1. Copyright
  2. Brief Table of Contents
  3. Table of Contents
  4. Preface
  5. Acknowledgments
  6. About This Book
  7. About the Authors
  8. About the Cover Illustration
  9. Part 1. Foundations
    1. Chapter 1. What is reinforcement learning?
      1. 1.1. The “deep” in deep reinforcement learning
      2. 1.2. Reinforcement learning
      3. 1.3. Dynamic programming versus Monte Carlo
      4. 1.4. The reinforcement learning framework
      5. 1.5. What can I do with reinforcement learning?
      6. 1.6. Why deep reinforcement learning?
      7. 1.7. Our didactic tool: String diagrams
      8. 1.8. What’s next?
      9. Summary
    2. Chapter 2. Modeling reinforcement learning problems: Markov decision processes
      1. 2.1. String diagrams and our teaching methods
      2. 2.2. Solving the multi-arm bandit
      3. 2.3. Applying bandits to optimize ad placements
      4. 2.4. Building networks with PyTorch
      5. 2.5. Solving contextual bandits
      6. 2.6. The Markov property
      7. 2.7. Predicting future rewards: Value and policy functions
      8. Summary
    3. Chapter 3. Predicting the best states and actions: Deep Q-networks
      1. 3.1. The Q function
      2. 3.2. Navigating with Q-learning
      3. 3.3. Preventing catastrophic forgetting: Experience replay
      4. 3.4. Improving stability with a target network
      5. 3.5. Review
      6. Summary
    4. Chapter 4. Learning to pick the best policy: Policy gradient methods
      1. 4.1. Policy function using neural networks
      2. 4.2. Reinforcing good actions: The policy gradient algorithm
      3. 4.3. Working with OpenAI Gym
      4. 4.4. The REINFORCE algorithm
      5. Summary
    5. Chapter 5. Tackling more complex problems with actor-critic methods
      1. 5.1. Combining the value and policy function
      2. 5.2. Distributed training
      3. 5.3. Advantage actor-critic
      4. 5.4. N-step actor-critic
      5. Summary
  10. Part 2. Above and beyond
    1. Chapter 6. Alternative optimization methods: Evolutionary algorithms
      1. 6.1. A different approach to reinforcement learning
      2. 6.2. Reinforcement learning with evolution strategies
      3. 6.3. A genetic algorithm for CartPole
      4. 6.4. Pros and cons of evolutionary algorithms
      5. 6.5. Evolutionary algorithms as a scalable alternative
      6. Summary
    2. Chapter 7. Distributional DQN: Getting the full story
      1. 7.1. What’s wrong with Q-learning?
      2. 7.2. Probability and statistics revisited
      3. 7.3. The Bellman equation
      4. 7.4. Distributional Q-learning
      5. 7.5. Comparing probability distributions
      6. 7.6. Dist-DQN on simulated data
      7. 7.7. Using distributional Q-learning to play Freeway
      8. Summary
    3. Chapter 8. Curiosity-driven exploration
      1. 8.1. Tackling sparse rewards with predictive coding
      2. 8.2. Inverse dynamics prediction
      3. 8.3. Setting up Super Mario Bros.
      4. 8.4. Preprocessing and the Q-network
      5. 8.5. Setting up the Q-network and policy function
      6. 8.6. Intrinsic curiosity module
      7. 8.7. Alternative intrinsic reward mechanisms
      8. Summary
    4. Chapter 9. Multi-agent reinforcement learning
      1. 9.1. From one to many agents
      2. 9.2. Neighborhood Q-learning
      3. 9.3. The 1D Ising model
      4. 9.4. Mean field Q-learning and the 2D Ising model
      5. 9.5. Mixed cooperative-competitive games
      6. Summary
    5. Chapter 10. Interpretable reinforcement learning: Attention and relational models
      1. 10.1. Machine learning interpretability with attention and relational biases
      2. 10.2. Relational reasoning with attention
      3. 10.3. Implementing self-attention for MNIST
      4. 10.4. Multi-head attention and relational DQN
      5. 10.5. Double Q-learning
      6. 10.6. Training and attention visualization
      7. Summary
    6. Chapter 11. In conclusion: A review and roadmap
      1. 11.1. What did we learn?
      2. 11.2. The uncharted topics in deep reinforcement learning
      3. 11.3. The end
  11. Appendix. Mathematics, deep learning, PyTorch
    1. A.1. Linear algebra
    2. A.2. Calculus
    3. A.3. Deep learning
    4. A.4. PyTorch
  12. Reference list
  13. Index
  14. List of Figures
  15. List of Tables
  16. List of Listings