About this book

Deep Learning and the Game of Go is intended to introduce modern machine learning by walking through a practical and fun example: building an AI that plays Go. By the end of chapter 3, you can make a working Go-playing program, although it will be laughably weak at that point. From there, each chapter introduces a new way to improve your bot’s AI; you can learn about the strengths and limitations of each technique by experimenting. It all culminates in the final chapters, where we show how AlphaGo and AlphaGo Zero integrate all the techniques into incredibly powerful AIs.

Who should read this book

This book is for software developers who want to start experimenting with machine learning, and who prefer a practical approach over a mathematical approach. We assume you have a working knowledge of Python, although you could implement the same algorithms in any modern language. We don’t assume you know anything about Go; if you prefer chess or some similar game, you can adapt most of the techniques to your favorite game. If you are a Go player, you should have a blast watching your bot learn to play. We certainly did!

Roadmap

The book has three parts that cover 14 chapters and 5 appendices. Part I: Foundations introduces the major concepts for the rest of the book.

  • Chapter 1, Towards deep learning, gives a lightweight, high-level overview of the disciplines of artificial intelligence, machine learning, and deep learning. We explain how they interrelate and what you can and cannot do with techniques from these fields.
  • Chapter 2, Go as a machine learning problem, introduces the rules of Go and explains what we can hope to teach a computer that plays the game.
  • Chapter 3, Implementing your first Go bot, is the chapter in which we implement the Go board, stone placement, and full game play in Python. At the end of this chapter, you'll be able to program the weakest Go AI possible.

Part II: Machine learning and game AI presents the technical and methodological foundations for creating a strong Go AI. In particular, we introduce the three pillars, or techniques, that AlphaGo uses so effectively: tree search, neural networks, and reinforcement learning.

Tree search

  • Chapter 4, Playing games with tree search, gives an overview of algorithms that search and evaluate sequences of game play. We start with simple brute-force minimax search, then build up to advanced algorithms such as alpha-beta pruning and Monte Carlo tree search.

Neural networks

  • Chapter 5, Getting started with neural networks, gives a practical introduction to artificial neural networks. You'll learn to predict handwritten digits by implementing a neural network from scratch in Python.
  • Chapter 6, Designing a neural network for Go data, explains how Go data shares traits with image data and introduces convolutional neural networks for move prediction. In this chapter, we start using the popular deep learning library Keras to build our models.
  • Chapter 7, Learning from data: a deep learning bot, applies the practical knowledge acquired in the preceding two chapters to build a Go bot powered by deep neural networks. We train this bot on actual game records from strong amateur games and point out the limitations of this approach.
  • Chapter 8, Deploying bots in the wild, gets you started with serving your bot so that human opponents can play against it through a user interface. You'll also learn how to let your bot play against other bots, both locally and on a Go server.

Reinforcement learning

  • Chapter 9, Learning by practice: reinforcement learning, covers the very basics of reinforcement learning and how we can use it for self-play in Go.
  • Chapter 10, Reinforcement learning with policy gradients, carefully introduces policy gradients, a vital method for improving the move predictions from chapter 7.
  • Chapter 11, Reinforcement learning with value methods, shows how to evaluate board positions with so-called value methods, a powerful tool when combined with tree search from chapter 4.
  • Chapter 12, Reinforcement learning with actor-critic methods, introduces techniques to predict the long-term value of a given board position and a given next move, which will help us choose next moves efficiently.

Part III: Greater than the sum of its parts is the final part, in which all building blocks developed earlier culminate in an application that is close to what AlphaGo does.

  • Chapter 13, AlphaGo: Bringing it all together, is both technically and mathematically the pinnacle of this book. We discuss how first training a neural network on Go data (chapters 5–7), then proceeding with self-play (chapters 8–11), combined with a clever tree-search approach (chapter 4), can create a superhuman-level Go bot.
  • Chapter 14, AlphaGo Zero: Integrating tree search with reinforcement learning, the last chapter of this book, describes the current state of the art in board game AI. We take a deep dive into the innovative combination of tree search and reinforcement learning that powers AlphaGo Zero.

In the appendices, we cover the following topics:

  • Appendix A, Mathematical foundations, recaps some basics of linear algebra and calculus, and shows how to represent some linear algebra structures in the Python library NumPy.
  • Appendix B, The backpropagation algorithm, explains the more math-heavy details of the learning procedure of most neural networks, which we use from chapter 5 onwards.
  • Appendix C, Go programs and servers, provides some resources for readers who want to learn more about Go.
  • Appendix D, Training and deploying bots using Amazon Web Services, is a quick guide to running your bot on an Amazon cloud server.
  • Appendix E, Submitting a bot to the Online Go Server (OGS), shows how to connect your bot to a popular Go server, where you can test it against players around the world.

The figure on the following page summarizes the chapter dependencies.

About the code

This book contains many examples of source code both in numbered listings and in line with normal text. In both cases, source code is formatted in a fixed-width font like this to separate it from ordinary text. Sometimes code is also in bold to highlight code that has changed from previous steps in the chapter, such as when a new feature adds to an existing line of code.

In many cases, the original source code has been reformatted; we’ve added line breaks and reworked indentation to accommodate the available page space in the book. In rare cases, even this was not enough, and listings include line-continuation markers (➥). Additionally, comments in the source code have often been removed from the listings when the code is described in the text. Code annotations accompany many of the listings, highlighting important concepts.

All code samples, along with some additional glue code, are available on GitHub at: https://github.com/maxpumperla/deep_learning_and_the_game_of_go.

Book forum

Purchase of Deep Learning and the Game of Go includes free access to a private web forum run by Manning Publications, where you can make comments about the book, ask technical questions, and receive help from the author and from other users. To access the forum, go to https://forums.manning.com/forums/deep-learning-and-the-game-of-go. You can also learn more about Manning’s forums and the rules of conduct at https://forums.manning.com/forums/about.

Manning’s commitment to our readers is to provide a venue where a meaningful dialogue between individual readers and between readers and the authors can take place. It is not a commitment to any specific amount of participation on the part of the authors, whose contribution to the forum remains voluntary (and unpaid). We suggest you try asking the authors some challenging questions lest their interest stray! The forum and the archives of previous discussions will be accessible from the publisher’s website as long as the book is in print.
