About this book
Deep Learning and the Game of Go is intended to introduce modern machine learning by walking through a practical and fun example: building an AI that plays
Go. By the end of chapter 3, you can make a working Go-playing program, although it will be laughably weak at that point. From there, each chapter introduces
a new way to improve your bot’s AI; you can learn about the strengths and limitations of each technique by experimenting.
It all culminates in the final chapters, where we show how AlphaGo and AlphaGo Zero integrate all the techniques into incredibly
powerful AIs.
Who should read this book
This book is for software developers who want to start experimenting with machine learning, and who prefer a practical approach
over a mathematical approach. We assume you have a working knowledge of Python, although you could implement the same algorithms
in any modern language. We don’t assume you know anything about Go; if you prefer chess or some similar game, you can adapt
most of the techniques to your favorite game. If you are a Go player, you should have a blast watching your bot learn to play. We certainly did!
Roadmap
The book has three parts that cover 14 chapters and 5 appendices. Part I: Foundations introduces the major concepts for the rest of the book.
- Chapter 1, Towards deep learning, gives a lightweight, high-level overview of the disciplines of artificial intelligence, machine learning, and deep learning.
We explain how they interrelate and what you can and cannot do with techniques from these fields.
- Chapter 2, Go as a machine learning problem, introduces the rules of Go and explains what we can hope to teach a computer that plays the game.
- Chapter 3, Implementing your first Go bot, is the chapter in which we implement a Go board in Python, place stones, and play full games. By the end of this chapter,
you can program the weakest Go AI possible.
Part II: Machine learning and game AI presents the technical and methodological foundations for creating a strong Go AI. In particular, we introduce three pillars,
or techniques, that AlphaGo uses very effectively: tree search, neural networks, and reinforcement learning.
Tree search
- Chapter 4, Playing games with tree search, gives an overview of algorithms that search and evaluate sequences of game play. We start with the simple brute-force minimax
search, then build up to advanced algorithms such as alpha-beta pruning and Monte Carlo tree search.
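To give a flavor of the brute-force search that chapter 4 starts from, here is a minimal minimax sketch. The tiny `Nim` game and all method names are our illustration, not code from the book; the book applies the same idea to Go.

```python
class Nim:
    """Tiny two-player game: take 1 or 2 stones; taking the last stone wins."""
    def __init__(self, stones, player_one_to_move=True):
        self.stones = stones
        self.player_one_to_move = player_one_to_move

    def legal_moves(self):
        return [n for n in (1, 2) if n <= self.stones]

    def apply(self, move):
        return Nim(self.stones - move, not self.player_one_to_move)

    def is_over(self):
        return self.stones == 0

    def evaluate(self):
        # The player who just took the last stone wins; score from
        # player one's perspective: +1 for a win, -1 for a loss.
        return -1 if self.player_one_to_move else 1


def minimax(state):
    """Return the best score player one can force from this state."""
    if state.is_over():
        return state.evaluate()
    scores = [minimax(state.apply(m)) for m in state.legal_moves()]
    # Player one picks the maximum; player two picks the minimum.
    return max(scores) if state.player_one_to_move else min(scores)
```

Minimax visits every position reachable from the current one, which is why chapter 4 then introduces pruning and sampling techniques: for Go, exhaustive search is hopeless.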
Neural networks
- Chapter 5, Getting started with neural networks, gives a practical introduction to artificial neural networks. You will learn to predict handwritten digits
by implementing a neural network from scratch in Python.
- Chapter 6, Designing a neural network for Go data, explains how Go data shares traits with image data and introduces convolutional neural networks for move prediction.
In this chapter we start using the popular deep learning library Keras to build our models.
- Chapter 7, Learning from data: a deep learning bot, applies the practical knowledge acquired in the preceding two chapters to build a Go bot powered by deep neural networks.
We train this bot on actual game data from strong amateur games and discuss the limitations of this approach.
- Chapter 8, Deploying bots in the wild, gets you started with serving your bot so that human opponents can play against it through a user interface. You will
also learn how to let your bot play against other bots, both locally and on a Go server.
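As a small preview of the from-scratch approach in chapter 5, the sketch below trains a single sigmoid neuron by gradient descent. The toy OR dataset, learning rate, and variable names are our illustration, not code from the book, which builds a full multi-layer network for handwritten digits.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy dataset: the logical OR function.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

random.seed(0)
w1, w2, b = random.random(), random.random(), random.random()
lr = 1.0

for _ in range(2000):
    for (x1, x2), target in data:
        out = sigmoid(w1 * x1 + w2 * x2 + b)
        # Gradient of the squared error with respect to the pre-activation,
        # via the chain rule: (out - target) * sigmoid'(pre-activation).
        grad = (out - target) * out * (1 - out)
        w1 -= lr * grad * x1
        w2 -= lr * grad * x2
        b -= lr * grad

# Round the neuron's outputs to get hard 0/1 predictions.
predictions = [round(sigmoid(w1 * x1 + w2 * x2 + b)) for (x1, x2), _ in data]
```

A network is nothing more than many such neurons arranged in layers, with the same gradient-descent update propagated backward through them; that propagation is the subject of appendix B.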
Reinforcement learning
- Chapter 9, Learning by practice: reinforcement learning, covers the very basics of reinforcement learning and how we can use it for self-play in Go.
- Chapter 10, Reinforcement learning with policy gradients, carefully introduces policy gradients, a vital method for improving the move predictions from chapter 7.
- Chapter 11, Reinforcement learning with value methods, shows how to evaluate board positions with so-called value methods, a powerful tool when combined with the tree search from
chapter 4.
- Chapter 12, Reinforcement learning with actor-critic methods, introduces techniques to predict the long-term value of a given board position and a given next move, which will help us
choose next moves efficiently.
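The core idea behind the value methods of chapter 11, learning to estimate a position's value from experience, can be sketched in a few lines with a tabular TD(0) update on a toy random walk. Everything here (the chain environment, the parameters, the names) is our illustration; the book learns values for Go positions with neural networks instead of a table.

```python
import random

# Random-walk chain: states 0..4, with 0 and 4 absorbing; reaching state 4
# pays reward 1, reaching state 0 pays nothing. The true value of a state
# is the probability of eventually reaching state 4 (0.25, 0.5, 0.75 for
# states 1, 2, 3).
random.seed(1)
n_states = 5
values = [0.0] * n_states      # estimated V(s); terminal entries stay unused
alpha, gamma = 0.1, 1.0        # learning rate and discount factor

for _ in range(5000):          # 5,000 episodes of experience
    s = 2                      # every episode starts in the middle
    while s not in (0, n_states - 1):
        s_next = s + random.choice((-1, 1))
        reward = 1.0 if s_next == n_states - 1 else 0.0
        terminal = s_next in (0, n_states - 1)
        target = reward + gamma * (0.0 if terminal else values[s_next])
        values[s] += alpha * (target - values[s])   # TD(0) update
        s = s_next
```

After enough episodes the table settles near the true values. Chapters 11 and 12 replace the table with a network and the random walk with self-played games of Go.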
Part III: Greater than the sum of its parts is the final part of the book, in which all the building blocks developed earlier culminate in an application close to what AlphaGo
does.
- Chapter 13, AlphaGo: Bringing it all together, is both technically and mathematically the pinnacle of this book. We discuss how first training a neural network on Go data
(chapters 5–7) and then proceeding with self-play (chapters 8–11), combined with a clever tree-search approach (chapter 4), can create a superhuman-level Go bot.
- Chapter 14, AlphaGo Zero: Integrating tree search with reinforcement learning, the last chapter of this book, describes the current state of the art in board game AI. We take a deep dive into the innovative
combination of tree search and reinforcement learning that powers AlphaGo Zero.
In the appendices, we cover the following topics:
- Appendix A, Mathematical foundations, recaps some basics of linear algebra and calculus, and shows how to represent some linear algebra structures in the Python
library NumPy.
- Appendix B, The backpropagation algorithm, explains the more math-heavy details of the learning procedure of most neural networks, which we use from chapter 5 onwards.
- Appendix C, Go programs and servers, provides some resources for readers who want to learn more about Go.
- Appendix D, Training and deploying bots using Amazon Web Services, is a quick guide to running your bot on an Amazon cloud server.
- Appendix E, Submitting a bot to the Online Go Server (OGS), shows how to connect your bot to a popular Go server, where you can test it against players around the world.
The figure on the following page summarizes the chapter dependencies.
About the code
This book contains many examples of source code both in numbered listings and in line with normal text. In both cases, source
code is formatted in a fixed-width font like this to separate it from ordinary text. Sometimes code is also in bold to highlight changes from previous steps in the chapter, such as when a new feature is added to an existing line
of code.
In many cases, the original source code has been reformatted; we’ve added line breaks and reworked indentation to accommodate
the available page space in the book. In rare cases, even this was not enough, and listings include line-continuation markers
(➥). Additionally, comments in the source code have often been removed from the listings when the code is described in the text.
Code annotations accompany many of the listings, highlighting important concepts.
All code samples, along with some additional glue code, are available on GitHub at: https://github.com/maxpumperla/deep_learning_and_the_game_of_go.
Book forum
Purchase of Deep Learning and the Game of Go includes free access to a private web forum run by Manning Publications, where you can make comments about the book, ask
technical questions, and receive help from the author and from other users. To access the forum, go to https://forums.manning.com/forums/deep-learning-and-the-game-of-go. You can also learn more about Manning’s forums and the rules of conduct at https://forums.manning.com/forums/about.
Manning’s commitment to our readers is to provide a venue where a meaningful dialogue between individual readers and between
readers and the authors can take place. It is not a commitment to any specific amount of participation on the part of the
authors, whose contribution to the forum remains voluntary (and unpaid). We suggest you try asking the authors some challenging
questions lest their interest stray! The forum and the archives of previous discussions will be accessible from the publisher’s
website as long as the book is in print.