By now, you know how to build and train a strong deep-learning model for Go move prediction—but how do you integrate this into an application that plays games against opponents? Training a neural network is just one part of building an end-to-end application, whether you’re playing yourself or letting your bot compete against other bots. The trained model has to be integrated into an engine that can be played against.
In this chapter, you’ll build a simple Go model server and two frontends. First, we provide you with an HTTP frontend that you can use to play against your bot. Then, we introduce you to the Go Text Protocol (GTP), a widely used protocol that Go bots use to exchange information, so your bot can play against other bots like GNU Go or Pachi, two freely available Go programs based on GTP. Finally, we show you how to deploy your Go bot on Amazon Web Services (AWS) and connect it to the Online Go Server (OGS). Doing so will allow your bots to play ranked games in a real environment, compete against other bots and human players worldwide, and even enter tournaments. To do all this, we’ll show you how to tackle the following tasks:
Now that you have all the building blocks in place to build a strong neural network for Go data, let’s integrate such networks into an agent that will serve them. Recall from chapter 3 the concept of Agent. We defined it as a class that can select the next move for the current game state, by implementing a select_move method. Let’s write a DeepLearningAgent by using Keras models and our Go board Encoder concept (put this code into predict.py in the agent module in dlgo).
    import numpy as np

    from dlgo.agent.base import Agent
    from dlgo.agent.helpers import is_point_an_eye
    from dlgo import encoders
    from dlgo import goboard
    from dlgo import kerasutil


    class DeepLearningAgent(Agent):
        def __init__(self, model, encoder):
            Agent.__init__(self)
            self.model = model
            self.encoder = encoder
You’ll use the encoder to transform the board state into features, and you’ll use the model to predict the next move. In fact, you’ll use the model to compute a whole probability distribution of possible moves that you’ll later sample from.
        def predict(self, game_state):
            encoded_state = self.encoder.encode(game_state)
            input_tensor = np.array([encoded_state])
            return self.model.predict(input_tensor)[0]

        def select_move(self, game_state):
            num_moves = self.encoder.board_width * self.encoder.board_height
            move_probs = self.predict(game_state)
Next, you alter the probability distribution stored in move_probs a little. First, you compute the cube of all values to drastically increase the distance between more-likely and less-likely moves. You want the best possible moves to be picked much more often. Then you use a trick called clipping that prevents move probabilities from being too close to either 0 or 1. This is done by defining a small positive value, ϵ = 0.000001, and setting values smaller than ϵ to ϵ, and values larger than 1 – ϵ to 1 – ϵ. Afterward, you normalize the resulting values to end up with a probability distribution once again.
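            # Cube the values to widen the gap between likely and unlikely moves.
            move_probs = move_probs ** 3
            eps = 1e-6
            # Clip so that no probability sits too close to 0 or 1.
            move_probs = np.clip(move_probs, eps, 1 - eps)
            # Renormalize to get a probability distribution again.
            move_probs = move_probs / np.sum(move_probs)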
You do this transformation because you want to sample moves from this distribution, according to their probabilities. Instead of sampling moves, another viable strategy would be to always take the most likely move (taking the maximum over the distribution). The benefit of the way you’re doing it is that sometimes other moves get chosen, which might be especially useful when there isn’t one single move that sticks out from the rest.
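            # One candidate index per point on the board.
            candidates = np.arange(num_moves)
            # Sample all candidates without replacement to get a ranked list of moves.
            ranked_moves = np.random.choice(
                candidates, num_moves, replace=False, p=move_probs)
            for point_idx in ranked_moves:
                point = self.encoder.decode_point_index(point_idx)
                # Take the first move that is valid and doesn't fill one of your own eyes.
                if game_state.is_valid_move(goboard.Move.play(point)) and \
                        not is_point_an_eye(game_state.board, point, game_state.next_player):
                    return goboard.Move.play(point)
            # No acceptable move is left, so pass.
            return goboard.Move.pass_turn()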
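To get a feel for the transformation, here is a small, self-contained sketch on a made-up four-move distribution. The numbers and the helper name `sharpen` are illustrative and not part of dlgo. It's written in plain Python so it runs without NumPy, but it applies the same three steps as `select_move`: cube, clip, renormalize.

```python
def sharpen(probs, power=3, eps=1e-6):
    """Cube, clip, and renormalize a list of move probabilities."""
    powered = [p ** power for p in probs]
    clipped = [min(max(p, eps), 1 - eps) for p in powered]
    total = sum(clipped)
    return [p / total for p in clipped]


# Hypothetical network output for four candidate moves.
move_probs = [0.5, 0.3, 0.2, 0.0]
sharpened = sharpen(move_probs)

for before, after in zip(move_probs, sharpened):
    print("{:.2f} -> {:.6f}".format(before, after))
```

The top move's share rises from 0.5 to roughly 0.78, so it gets sampled far more often than before. The move that started at exactly 0 keeps a tiny nonzero weight after clipping, which matters when every point on the board has to be drawn once without replacement.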
For convenience, you also want to persist a DeepLearningAgent, so you can pick it up at a later point. The prototypical situation in practice is this: you train a deep-learning model and create an agent, which you then persist. At a later point, this agent gets deserialized and served, so human players or other bots can play against it. To do the serialization step, you hijack the serialization format of Keras. When you persist a Keras model, it gets stored in HDF5, an efficient serialization format. HDF5 files contain flexible groups that are used to store meta-information and data. For any Keras model, you can call model.save("model_path.h5") to persist the full model, meaning the neural network architecture and all weights, to the local file model_path.h5. The only thing you need to do before persisting a Keras model like this is to install the Python library h5py; for instance, with pip install h5py.
To store a complete agent, you can add an additional group for information about your Go board encoder.
        def serialize(self, h5file):
            h5file.create_group('encoder')
            h5file['encoder'].attrs['name'] = self.encoder.name()
            h5file['encoder'].attrs['board_width'] = self.encoder.board_width
            h5file['encoder'].attrs['board_height'] = self.encoder.board_height
            h5file.create_group('model')
            kerasutil.save_model_to_hdf5_group(self.model, h5file['model'])
Finally, after you serialize a model, you also need to know how to load it from an HDF5 file.
    def load_prediction_agent(h5file):
        model = kerasutil.load_model_from_hdf5_group(h5file['model'])
        encoder_name = h5file['encoder'].attrs['name']
        if not isinstance(encoder_name, str):
            encoder_name = encoder_name.decode('ascii')
        board_width = h5file['encoder'].attrs['board_width']
        board_height = h5file['encoder'].attrs['board_height']
        encoder = encoders.get_encoder_by_name(
            encoder_name, (board_width, board_height))
        return DeepLearningAgent(model, encoder)
This completes our definition of a deep-learning agent. As a next step, you have to make sure this agent connects and interacts with an environment. You do this by embedding DeepLearningAgent into a web application that human players can play against in their browser.
In chapters 6 and 7, you designed and trained a neural network that predicts what move a human would play in a Go game. In section 8.1, you turned that model for move prediction into a DeepLearningAgent that does move selection. The next step is to play your bot! Back in chapter 3, you built a bare-bones interface in which you could type in moves on your keyboard, and your benighted RandomBot would print its reply to the console. Now that you’ve built a more sophisticated bot, it deserves a nicer frontend to communicate moves with a human player.
In this section, you’ll connect the DeepLearningAgent to a Python web application, so you can play against it in your web browser. You’ll use the lightweight Flask library to serve such an agent via HTTP. On the browser side, you’ll use a JavaScript library called jgoboard to render a Go board that humans can use. The code can be found in our repository on GitHub, in the httpfrontend module in dlgo. We don’t explicitly discuss this code here, because we don’t want to distract from the main topic, building a Go AI, by digressing into web development techniques in other languages (such as HTML or JavaScript). Instead, we’ll give you an overview of what the application does and how to use it in an end-to-end example. Figure 8.1 provides an overview of the application you’re going to build in this chapter.
If you look into the structure of httpfrontend, you find a file called server.py that has a single, well-documented method, get_web_app, that you can use to return a web application to run. Here’s an example of how to use get_web_app to load a random bot and serve it.
    from dlgo.agent.naive import RandomBot
    from dlgo.httpfrontend.server import get_web_app

    random_agent = RandomBot()
    web_app = get_web_app({'random': random_agent})
    web_app.run()
When you run this example, a web application will start on localhost (127.0.0.1), listening on port 5000, which is the default port used in Flask applications. The RandomBot you just registered as random corresponds to an HTML file in the static folder in httpfrontend: play_random_99.html. This file renders a Go board, and it’s also where the rules of human-bot game play are defined. The human opponent starts with the black stones; the bot takes white. Whenever a human move has been played, the route /select-move/random is triggered to receive the next move from the bot. After the bot move has been received, it’s applied to the board, and it’s the human’s move once again. To play against this bot, navigate to http://127.0.0.1:5000/static/play_random_99.html in your browser. You should see a playable demo, as shown in figure 8.2.
You’ll add more and more bots in the next chapters, but for now note that another frontend is available under play_predict_19.html. This web frontend talks to a bot called predict and can be used to play 19 × 19 games. Therefore, if you train a Keras neural network model on Go data and use a Go board encoder, you can first create an instance agent = DeepLearningAgent(model, encoder) and then register it in a web application web_app = get_web_app({'predict': agent}) that you can then start with web_app.run().
Figure 8.3 shows an end-to-end example covering the whole process (the same flow we introduced in the beginning of chapter 7). You start with the imports you need and load Go data into features X and labels y by using an encoder and a Go data processor, as shown in listing 8.8.
    import h5py

    from keras.models import Sequential
    from keras.layers import Dense

    from dlgo.agent.predict import DeepLearningAgent, load_prediction_agent
    from dlgo.data.parallel_processor import GoDataProcessor
    from dlgo.encoders.sevenplane import SevenPlaneEncoder
    from dlgo.httpfrontend import get_web_app
    from dlgo.networks import large

    go_board_rows, go_board_cols = 19, 19
    nb_classes = go_board_rows * go_board_cols
    encoder = SevenPlaneEncoder((go_board_rows, go_board_cols))
    processor = GoDataProcessor(encoder=encoder.name())

    X, y = processor.load_go_data(num_samples=100)
Equipped with features and labels, you can build a deep convolutional neural network and train it on this data. This time, you choose the large network from dlgo.networks and use Adadelta as the optimizer.
    input_shape = (encoder.num_planes, go_board_rows, go_board_cols)
    model = Sequential()
    network_layers = large.layers(input_shape)
    for layer in network_layers:
        model.add(layer)
    model.add(Dense(nb_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy',
                  optimizer='adadelta',
                  metrics=['accuracy'])

    model.fit(X, y, batch_size=128, epochs=20, verbose=1)
After the model has finished training, you can create a Go bot from it and save this bot in HDF5 format.
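    deep_learning_bot = DeepLearningAgent(model, encoder)
    # serialize() writes into an open HDF5 file object, so open the file first.
    with h5py.File("../agents/deep_bot.h5", "w") as bot_file:
        deep_learning_bot.serialize(bot_file)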
Finally, you can load the bot from file and serve it in a web application.
    model_file = h5py.File("../agents/deep_bot.h5", "r")
    bot_from_file = load_prediction_agent(model_file)

    web_app = get_web_app({'predict': bot_from_file})
    web_app.run()
Of course, if you’ve already trained a strong bot, you can skip all but the last part. For instance, you could load one of the models stored in checkpoints in chapter 7 and see how they perform as opponents in action by changing the model_file accordingly.
Until this point, all development took place on your local machine at home. If you’re fortunate enough to have a modern GPU in your computer, training the deep neural networks we developed in chapters 5–7 isn’t a concern for you. If you don’t have a powerful GPU or can’t spare any compute time on it, renting compute time on a GPU in the cloud is usually a good option.
If you disregard training for now and assume you have a strong bot already, serving this bot is another situation in which cloud providers can come in handy. In section 8.2, you ran a bot via a web application hosted from localhost. If you want to share your bot with friends or make it public, that’s not exactly ideal. You neither want to ensure that your computer runs night and day, nor give the public access to your machine. By hosting your bot in the cloud, you separate development from deployment and can simply share a URL with anyone who’s interested in playing your bot.
Because this topic is important, but somewhat special and only indirectly related to machine learning, we entirely outsourced it to appendix D. Reading and applying the techniques from this appendix is entirely optional, but recommended. In appendix D, you’ll learn how to get started with one particular cloud provider, Amazon Web Services (AWS). You’ll learn the following skills in the appendix:
On top of learning these useful skills, appendix D is also a prerequisite for deploying a full-blown Go bot that connects to an online Go server, a topic we cover later in section 8.6.
In section 8.2, you saw how to integrate your bot framework into a web frontend. For this to work, you handled communication between the bot and human player with the Hypertext Transfer Protocol (HTTP), one of the core protocols running the web. To avoid distraction, we purposefully left out all the details, but having a standardized protocol in place is necessary to pull this off. Humans and bots don’t share a common language to exchange Go moves, but a protocol can act as a bridge.
The Go Text Protocol (GTP) is the de facto standard used by Go servers around the world to connect humans and bots on their platforms. Many offline Go programs are based on GTP as well. This section introduces you to GTP by example; you’ll implement part of the protocol in Python and use this implementation to let your bots play against other Go programs.
In appendix C, we explain how to install GNU Go and Pachi, two common Go programs available for practically all operating systems. We recommend installing both, so please make sure to have both programs on your system. You don’t need any frontends, just the plain command-line tools. If you have GNU Go installed, you can start it in GTP mode by running the following:
gnugo --mode gtp
Using this mode, you can now explore how GTP works. GTP is a text-based protocol, so you can type commands into your terminal and hit Enter. For instance, to set up a 9 × 9 board, you can type boardsize 9. This will trigger GNU Go to return a response and acknowledge that the command has been executed correctly. Every successful GTP command triggers a response starting with the symbol =, whereas failed commands lead to a ?. To check the current board state, you can issue the command showboard, which will print out an empty 9 × 9 board, as expected.
In actual game play, two commands are the most important: genmove and play. The first command, genmove, is used to ask a GTP bot to generate the next move. The GTP bot will usually also apply this move to its game state internally. All this command needs as arguments is the player color, either black or white. For instance, to generate a white move and place it on GNU Go’s board, type genmove white. This will lead to a response such as = C4, meaning GNU Go accepts this command (=) and places a white stone at C4. As you can see, GTP accepts standard coordinates as introduced in chapters 2 and 3.
The other command relevant to game play is play. This command tells a GTP bot that a move has been played and that it should place the stone on its board. For instance, you could tell GNU Go to record a black move on D4 by issuing play black D4, which will return an = to acknowledge this command. When two bots play against each other, they’ll take turns asking each other to genmove the next move, and then play the move from the response on their own board. This is all pretty straightforward, but we left out many details. A complete GTP client has a lot more commands to handle, ranging from handling handicap stones to managing time settings and counting rules. If you’re interested in the details of GTP, see http://mng.bz/MWNQ. Having said that, at a basic level genmove and play will be enough to let your deep-learning bots play against GNU Go and Pachi.
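The reply format described above can be taken apart with a few lines of Python. The following is a minimal sketch, not code from dlgo: the name `parse_response` and the returned tuple are our own choices. It assumes a reply starts with = or ?, optionally followed directly by a numeric ID echoed from the command, and then the response body.

```python
import re

# "=" (success) or "?" (failure), an optional numeric ID, then the body.
RESPONSE_PATTERN = re.compile(r'^([=?])(\d*)\s*(.*)$', re.DOTALL)


def parse_response(text):
    """Split a raw GTP reply into (succeeded, sequence, body)."""
    match = RESPONSE_PATTERN.match(text.strip())
    if match is None:
        raise ValueError("Not a GTP response: {!r}".format(text))
    status, sequence, body = match.groups()
    return status == '=', int(sequence) if sequence else None, body


print(parse_response("= C4\n\n"))
print(parse_response("=7 \n\n"))
print(parse_response("? unknown command\n\n"))
```

For the reply to genmove white from the text, this yields a success flag, no ID, and the body C4, which is the coordinate you would then apply to your own board.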
To handle GTP and wrap your Agent concept so it can exchange Go moves by using this protocol, you create a new dlgo module called gtp. You can still try to follow the implementation alongside this main text, but from this chapter on, we suggest directly following our implementation on GitHub at http://mng.bz/a4Wj.
To start, let’s formalize what a GTP command is. To do so, we have to note that on many Go servers, commands get a sequence number to make sure that we can match commands and responses. These sequence numbers are optional and can be None. For us, a GTP command consists of a sequence number, a command, and potentially multiple arguments to that command. You place this definition in command.py in the gtp module.
    class Command:
        def __init__(self, sequence, name, args):
            self.sequence = sequence
            self.name = name
            self.args = tuple(args)

        def __eq__(self, other):
            return (self.sequence == other.sequence and
                    self.name == other.name and
                    self.args == other.args)

        def __repr__(self):
            return 'Command(%r, %r, %r)' % (self.sequence, self.name, self.args)

        def __str__(self):
            return repr(self)
Next, you want to parse text input from the command line into Command. For instance, parsing “999 play white D4” should result in Command(999, 'play', ('white', 'D4')). The parse function used for this goes into command.py as well.
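    def parse(command_string):
        pieces = command_string.split()
        try:
            # A GTP command can start with an optional sequence number.
            sequence = int(pieces[0])
            pieces = pieces[1:]
        except ValueError:
            # The first piece isn't numeric, so there's no sequence number.
            sequence = None
        name, args = pieces[0], pieces[1:]
        return Command(sequence, name, args)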
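To try the parsing logic in isolation, here is a self-contained variant. It's a sketch under two assumptions of ours: a namedtuple stands in for the Command class, and blank input (or a line holding only a sequence number) returns None instead of raising an IndexError. The name `parse_line` is hypothetical.

```python
from collections import namedtuple

# Stand-in for the Command class; a namedtuple provides equality and repr.
Command = namedtuple('Command', ['sequence', 'name', 'args'])


def parse_line(command_string):
    pieces = command_string.split()
    sequence = None
    if pieces:
        try:
            sequence = int(pieces[0])
            pieces = pieces[1:]
        except ValueError:
            pass  # no sequence number, the first piece is the command name
    if not pieces:
        # Assumption: treat input without a command name as "no command".
        return None
    return Command(sequence, pieces[0], tuple(pieces[1:]))


print(parse_line("999 play white D4"))
print(parse_line("genmove black"))
print(parse_line("   "))
```

The first call produces the command from the example in the text, with sequence 999, name play, and arguments white and D4. The second shows that the sequence number really is optional.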
We’ve just argued that GTP coordinates come in standard notation, so parsing GTP coordinates into Board positions and vice versa is simple. You define two helper functions to convert between coordinates and positions in board.py within gtp.
    from dlgo.gotypes import Point
    from dlgo.goboard_fast import Move

    # GTP column letters; the letter I is skipped to avoid confusion with J.
    COLS = 'ABCDEFGHJKLMNOPQRST'


    def coords_to_gtp_position(move):
        point = move.point
        return COLS[point.col - 1] + str(point.row)


    def gtp_position_to_coords(gtp_position):
        col_str, row_str = gtp_position[0], gtp_position[1:]
        point = Point(int(row_str), COLS.find(col_str.upper()) + 1)
        return Move(point)
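The one detail that trips people up in this conversion is that GTP column letters skip I, so the ninth column is J. The following sketch shows the round trip without depending on dlgo. The namedtuple Point and the function names `point_to_gtp` and `gtp_to_point` are stand-ins we made up for illustration, and they work on points directly rather than on Move objects.

```python
from collections import namedtuple

# Stand-in for dlgo.gotypes.Point: 1-based row and column.
Point = namedtuple('Point', ['row', 'col'])

# GTP column letters; note the missing "I".
COLS = 'ABCDEFGHJKLMNOPQRST'


def point_to_gtp(point):
    return COLS[point.col - 1] + str(point.row)


def gtp_to_point(gtp_position):
    col_str, row_str = gtp_position[0], gtp_position[1:]
    return Point(int(row_str), COLS.index(col_str.upper()) + 1)


print(point_to_gtp(Point(row=4, col=4)))
print(point_to_gtp(Point(row=10, col=9)))
print(gtp_to_point('q16'))
```

Using `COLS.index` instead of `COLS.find` is a small design choice: an invalid letter such as I raises a ValueError right away rather than silently producing column 0.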
Now that you understand the basics of GTP, let’s dive right into an application and build a program that loads one of your bots and lets it compete against either GNU Go or Pachi. Before we present this program, we have just one technicality left to resolve—when our bot should resign a game or pass.
In their current state, your deep-learning bots have no means of knowing when to stop playing. The way you designed them so far, your bot will always pick a move to play as long as one is available. This can be detrimental toward the end of the game, when it might be better to pass, or even to resign when the situation looks too bad. For this reason, you’ll impose termination strategies: you’ll explicitly tell the bot when to stop. In chapters 13 and 14, you’ll learn powerful techniques that’ll render this entirely unnecessary (your bot will learn to judge the current board situation and thereby learn that sometimes it’s best to stop). But for now, this concept is useful and will help you on the way to deploying a bot against other opponents.
You build the following TerminationStrategy in a file called termination.py in the agent module of dlgo. All it does is decide when you should pass or resign—and by default, you never pass or resign.
    from dlgo import goboard
    from dlgo.agent.base import Agent
    from dlgo import scoring


    class TerminationStrategy:
        def __init__(self):
            pass

        def should_pass(self, game_state):
            return False

        def should_resign(self, game_state):
            return False
A simple heuristic for stopping game play is to pass when your opponent passes. You have to rely on the fact that your opponent knows when to pass, but it’s a start, and it works well against GNU Go and Pachi.
    class PassWhenOpponentPasses(TerminationStrategy):
        def should_pass(self, game_state):
            # Return False (rather than None) when no move has been played yet.
            last_move = game_state.last_move
            return last_move is not None and last_move.is_pass


    def get(termination):
        if termination == 'opponent_passes':
            return PassWhenOpponentPasses()
        else:
            raise ValueError("Unsupported termination strategy: {}"
                             .format(termination))
In termination.py, you also find another strategy called ResignLargeMargin that resigns whenever the estimated score of the game goes too much in favor of the opponent. You can cook up many other such strategies, but keep in mind that ultimately you can get rid of this crutch with machine learning.
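To illustrate what a margin-based strategy can look like, here is a sketch that is deliberately not the ResignLargeMargin class from the repository. The class name `ResignWhenFarBehind`, its parameters, and the `estimate_margin` callable are all hypothetical. The assumption is that you can supply some function that maps a game state to an estimated score from your bot's point of view, with positive values meaning your bot is ahead.

```python
class ResignWhenFarBehind:
    """Hypothetical strategy: resign once the estimated deficit is too large."""

    def __init__(self, estimate_margin, margin=30.0, min_moves=100):
        self.estimate_margin = estimate_margin  # callable: game_state -> float
        self.margin = margin                    # deficit that triggers resignation
        self.min_moves = min_moves              # don't resign in the opening
        self.moves_seen = 0

    def should_pass(self, game_state):
        return False

    def should_resign(self, game_state):
        self.moves_seen += 1
        if self.moves_seen < self.min_moves:
            return False
        return self.estimate_margin(game_state) <= -self.margin


# A fake estimator that always reports a 42.5-point deficit.
strategy = ResignWhenFarBehind(
    estimate_margin=lambda state: -42.5, margin=30.0, min_moves=3)
decisions = [strategy.should_resign(None) for _ in range(4)]
print(decisions)
```

The `min_moves` guard reflects a practical concern: score estimates early in a game are too crude to justify giving up, so the strategy only starts to consider resignation after a number of its own decisions.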
The last thing you need in order to let bots play against each other is to equip an Agent with a TerminationStrategy so as to pass and resign when appropriate. This TerminationAgent class goes into termination.py as well.
    class TerminationAgent(Agent):
        def __init__(self, agent, strategy=None):
            Agent.__init__(self)
            self.agent = agent
            self.strategy = strategy if strategy is not None \
                else TerminationStrategy()

        def select_move(self, game_state):
            if self.strategy.should_pass(game_state):
                return goboard.Move.pass_turn()
            elif self.strategy.should_resign(game_state):
                return goboard.Move.resign()
            else:
                return self.agent.select_move(game_state)
Having discussed termination strategies, you can now turn to pairing your Go bots with other programs. In play_local.py in the gtp module, you’ll find a script that sets up a game between one of your bots and either GNU Go or Pachi. Let’s go through this script step by step, starting with the necessary imports.
    import subprocess
    import re
    import h5py

    from dlgo.agent.predict import load_prediction_agent
    from dlgo.agent.termination import PassWhenOpponentPasses, TerminationAgent
    from dlgo.goboard_fast import GameState, Move
    from dlgo.gotypes import Player
    from dlgo.gtp.board import gtp_position_to_coords, coords_to_gtp_position
    from dlgo.gtp.utils import SGFWriter
    from dlgo.utils import print_board
    from dlgo.scoring import compute_game_result
You should recognize most of the imports, with the exception of SGFWriter. This is a little utility class from dlgo.gtp.utils that keeps track of the game and writes an SGF file at the end.
To initialize your game runner LocalGtpBot, you need to provide a deep-learning agent and optionally a termination strategy. Also, you can specify how many handicap stones should be used and which bot opponent should be played against. For the latter, you can choose between gnugo and pachi. LocalGtpBot will initialize either one of these programs as subprocesses, and both your bot and its opponent will communicate over GTP.
    class LocalGtpBot:
        def __init__(self, go_bot, termination=None, handicap=0,
                     opponent='gnugo', output_sgf="out.sgf",
                     our_color='b'):
            # Wrap the agent so it can pass or resign via the termination strategy.
            self.bot = TerminationAgent(go_bot, termination)
            self.handicap = handicap
            # The game runs until one of the players stops it.
            self._stopped = False
            self.game_state = GameState.new_game(19)
            # The finished game is written to this file in SGF format.
            self.sgf = SGFWriter(output_sgf)

            self.our_color = Player.black if our_color == 'b' else Player.white
            self.their_color = self.our_color.other

            # The opponent is either GNU Go or Pachi, started as a subprocess.
            cmd = self.opponent_cmd(opponent)
            pipe = subprocess.PIPE
            # GTP commands and responses travel over the subprocess's stdin and stdout.
            self.gtp_stream = subprocess.Popen(
                cmd, stdin=pipe, stdout=pipe
            )

        @staticmethod
        def opponent_cmd(opponent):
            if opponent == 'gnugo':
                return ["gnugo", "--mode", "gtp"]
            elif opponent == 'pachi':
                return ["pachi"]
            else:
                raise ValueError("Unknown bot name {}".format(opponent))
One of the main methods used in the tool we’re demonstrating here is command_and_response, which sends out a GTP command and reads back the response for this command.
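        def send_command(self, cmd):
            self.gtp_stream.stdin.write(cmd.encode('utf-8'))
            # Flush so the command reaches the engine instead of waiting in a buffer.
            self.gtp_stream.stdin.flush()

        def get_response(self):
            succeeded = False
            result = ''
            while not succeeded:
                # The pipe yields bytes, so decode before comparing with '='.
                line = self.gtp_stream.stdout.readline().decode('utf-8')
                if line == '':
                    raise RuntimeError("GTP engine closed its output stream")
                if line.startswith('='):
                    succeeded = True
                    line = line.strip()
                    result = re.sub('^= ?', '', line)
            return result

        def command_and_response(self, cmd):
            self.send_command(cmd)
            return self.get_response()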
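If you don't have GNU Go or Pachi at hand, you can still exercise this send-and-read pattern. The sketch below swaps the real engine for a stand-in: a tiny Python program, started as a subprocess, that acknowledges every command by echoing the command name. The stand-in engine and the helper names are our own invention for illustration; only the pipe handling mirrors what LocalGtpBot does.

```python
import subprocess
import sys

# Stand-in "engine": replies "= <command name>" plus the blank line that
# terminates a GTP response. A real engine would do actual work here.
FAKE_ENGINE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    pieces = line.split()\n"
    "    name = pieces[0] if pieces else ''\n"
    "    sys.stdout.write('= ' + name + '\\n\\n')\n"
    "    sys.stdout.flush()\n"
)


def send_command(engine, cmd):
    engine.stdin.write(cmd.encode('utf-8'))
    engine.stdin.flush()  # make sure the command actually leaves the buffer


def get_response(engine):
    while True:
        line = engine.stdout.readline().decode('utf-8')
        if line == '':
            raise EOFError("engine closed its output stream")
        if line[0] in '=?':
            engine.stdout.readline()  # consume the terminating blank line
            return line[0] == '=', line[1:].strip()


engine = subprocess.Popen(
    [sys.executable, '-c', FAKE_ENGINE],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE)

send_command(engine, "boardsize 19\n")
succeeded, body = get_response(engine)
print(succeeded, body)

engine.stdin.close()  # signals end of input so the stand-in exits
engine.wait()
```

Two details carry over to the real thing: commands must end with a newline, and both sides have to flush their output, or each process ends up waiting for the other.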
Playing a game works as follows:
        def run(self):
            self.command_and_response("boardsize 19\n")
            self.set_handicap()
            self.play()
            self.sgf.write_sgf()

        def set_handicap(self):
            if self.handicap == 0:
                self.command_and_response("komi 7.5\n")
                self.sgf.append("KM[7.5]\n")
            else:
                stones = self.command_and_response(
                    "fixed_handicap {}\n".format(self.handicap))
                sgf_handicap = "HA[{}]AB".format(self.handicap)
                for pos in stones.split(" "):
                    move = gtp_position_to_coords(pos)
                    self.game_state = self.game_state.apply_move(move)
                    sgf_handicap = sgf_handicap + "[" + \
                        self.sgf.coordinates(move) + "]"
                self.sgf.append(sgf_handicap + "\n")
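The handicap branch turns the engine's reply, a space-separated list of GTP vertices, into SGF's HA and AB properties. The sketch below shows that translation on its own. It's an illustration under stated assumptions: the helper names are made up, and it uses the standard SGF convention of lowercase letters for column then row, with rows counted from the top of the board. The SGFWriter.coordinates method in the repository may differ in detail.

```python
GTP_COLS = 'ABCDEFGHJKLMNOPQRST'     # GTP skips the letter "I"
SGF_LETTERS = 'abcdefghijklmnopqrs'  # SGF does not


def gtp_to_sgf(vertex, board_size=19):
    col = GTP_COLS.index(vertex[0].upper())  # 0-based, counted from the left
    row = int(vertex[1:])                    # 1-based, counted from the bottom
    # SGF counts rows from the top, so flip the row.
    return SGF_LETTERS[col] + SGF_LETTERS[board_size - row]


def handicap_property(response_body, board_size=19):
    stones = response_body.split()
    placed = "".join(
        "[{}]".format(gtp_to_sgf(stone, board_size)) for stone in stones)
    return "HA[{}]AB{}".format(len(stones), placed)


print(handicap_property("D4 Q16"))
```

For a two-stone handicap on D4 and Q16, this prints `HA[2]AB[dp][pd]`, the lower-left and upper-right star points in SGF notation.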
The game-play logic for your bot clash is simple: while none of the opponents stop, take turns and continue to play moves. The bots do that in methods called play_our_move and play_their_move, respectively. You also clear the screen, and print out the current board situation and a crude estimate of the outcome.
        def play(self):
            while not self._stopped:
                if self.game_state.next_player == self.our_color:
                    self.play_our_move()
                else:
                    self.play_their_move()
                print(chr(27) + "[2J")
                print_board(self.game_state.board)
                print("Estimated result: ")
                print(compute_game_result(self.game_state))
Playing moves for your bot means asking it to generate a move with select_move, applying it to your board, and then translating the move and sending it over GTP. This needs special treatment for passing and resigning.
        def play_our_move(self):
            move = self.bot.select_move(self.game_state)
            self.game_state = self.game_state.apply_move(move)

            our_name = self.our_color.name
            our_letter = our_name[0].upper()
            sgf_move = ""
            if move.is_pass:
                self.command_and_response("play {} pass\n".format(our_name))
            elif move.is_resign:
                self.command_and_response("play {} resign\n".format(our_name))
            else:
                pos = coords_to_gtp_position(move)
                self.command_and_response("play {} {}\n".format(our_name, pos))
                sgf_move = self.sgf.coordinates(move)
            self.sgf.append(";{}[{}]\n".format(our_letter, sgf_move))
Letting your opponent play a move is structurally similar to your move. You ask GNU Go or Pachi to genmove a move, and you have to take care of converting the GTP response into a move that your bot understands. The only other thing you have to do is stop the game when your opponent resigns or both players pass.
        def play_their_move(self):
            their_name = self.their_color.name
            their_letter = their_name[0].upper()

            pos = self.command_and_response("genmove {}\n".format(their_name))
            if pos.lower() == 'resign':
                self.game_state = self.game_state.apply_move(Move.resign())
                self._stopped = True
            elif pos.lower() == 'pass':
                # Look at the move before this pass: the game stops only
                # when both players have passed in a row.
                previous_move = self.game_state.last_move
                self.game_state = self.game_state.apply_move(Move.pass_turn())
                self.sgf.append(";{}[]\n".format(their_letter))
                if previous_move is not None and previous_move.is_pass:
                    self._stopped = True
            else:
                move = gtp_position_to_coords(pos)
                self.game_state = self.game_state.apply_move(move)
                self.sgf.append(";{}[{}]\n".format(
                    their_letter, self.sgf.coordinates(move)))
That concludes your play_local.py implementation, and you can now test it as follows.
    from dlgo.gtp.play_local import LocalGtpBot
    from dlgo.agent.termination import PassWhenOpponentPasses
    from dlgo.agent.predict import load_prediction_agent
    import h5py

    bot = load_prediction_agent(h5py.File("../agents/betago.hdf5", "r"))
    gtp_bot = LocalGtpBot(go_bot=bot, termination=PassWhenOpponentPasses(),
                          handicap=0, opponent='pachi')
    gtp_bot.run()
You should see the way the game between the bots unfolds, as shown in figure 8.4.
In the top part of the figure, you see the board printed by you, followed by your current estimate. In the lower half, you see Pachi’s game state (which is identical to yours) on the left, and on the right Pachi gives you an estimation of its current assessment of the game in terms of which part of the board it thinks belongs to which player.
This is a hopefully convincing and exciting demo of what your bot can do by now, but it’s not the end of the story. In the next section, we go one step further and show you how to connect your bot to a real-life Go server.
Note that play_local.py is really a tiny Go server for two bot opponents to play against each other. It accepts and sends GTP commands and knows when to start and finish a game. This produces overhead, because the program takes the role of a referee that controls how the opponents interact.
If you want to connect a bot to an actual Go server, this server will take care of all the game-play logic, and you can focus entirely on sending and receiving GTP commands. On the one hand, your life becomes easier because you have less to worry about. On the other hand, connecting to a proper Go server means that you have to support the full range of GTP commands that server sends, because otherwise your bot may crash.
To ensure that this doesn’t happen, let’s formalize the processing of GTP commands a little more. First, you implement a proper GTP response class for successful and failed commands.
    class Response:
        def __init__(self, status, body):
            self.success = status
            self.body = body


    def success(body=''):
        # Build a successful GTP response with the given body.
        return Response(status=True, body=body)


    def error(body=''):
        # Build an error GTP response.
        return Response(status=False, body=body)


    def bool_response(boolean):
        # Convert a Python Boolean into a GTP response.
        return success('true') if boolean is True else success('false')


    def serialize(gtp_command, gtp_response):
        # Serialize a GTP response as a string; a blank line ends the response.
        return '{}{} {}\n\n'.format(
            '=' if gtp_response.success else '?',
            '' if gtp_command.sequence is None else str(gtp_command.sequence),
            gtp_response.body
        )
This leaves you with implementing the main class for this section, GTPFrontend. You put this class into frontend.py in the gtp module. You need the following imports, including command and response from your gtp module.
    import sys

    from dlgo.gtp import command, response
    from dlgo.gtp.board import gtp_position_to_coords, coords_to_gtp_position
    from dlgo.goboard_fast import GameState, Move
    from dlgo.agent.termination import TerminationAgent
    from dlgo.utils import print_board
To initialize a GTP frontend, you need to specify an Agent instance and an optional termination strategy. GTPFrontend will then instantiate a dictionary of GTP events that you process. Each of these events, which includes common commands like play and others, will have to be implemented by you.
    HANDICAP_STONES = {
        2: ['D4', 'Q16'],
        3: ['D4', 'Q16', 'D16'],
        4: ['D4', 'Q16', 'D16', 'Q4'],
        5: ['D4', 'Q16', 'D16', 'Q4', 'K10'],
        6: ['D4', 'Q16', 'D16', 'Q4', 'D10', 'Q10'],
        7: ['D4', 'Q16', 'D16', 'Q4', 'D10', 'Q10', 'K10'],
        8: ['D4', 'Q16', 'D16', 'Q4', 'D10', 'Q10', 'K4', 'K16'],
        9: ['D4', 'Q16', 'D16', 'Q4', 'D10', 'Q10', 'K4', 'K16', 'K10'],
    }


    class GTPFrontend:
        def __init__(self, termination_agent, termination=None):
            self.agent = termination_agent
            self.game_state = GameState.new_game(19)
            self._input = sys.stdin
            self._output = sys.stdout
            self._stopped = False

            self.handlers = {
                'boardsize': self.handle_boardsize,
                'clear_board': self.handle_clear_board,
                'fixed_handicap': self.handle_fixed_handicap,
                'genmove': self.handle_genmove,
                'known_command': self.handle_known_command,
                'komi': self.ignore,
                'showboard': self.handle_showboard,
                'time_settings': self.ignore,
                'time_left': self.ignore,
                'play': self.handle_play,
                'protocol_version': self.handle_protocol_version,
                'quit': self.handle_quit,
            }
After you start a game with the following run method, you continually read GTP commands that are forwarded to the respective event handler, which is done by the process method.
        def run(self):
            while not self._stopped:
                input_line = self._input.readline().strip()
                cmd = command.parse(input_line)
                resp = self.process(cmd)
                self._output.write(response.serialize(cmd, resp))
                self._output.flush()

        def process(self, cmd):
            handler = self.handlers.get(cmd.name, self.handle_unknown)
            return handler(*cmd.args)
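The read, dispatch, and respond loop is easy to try out with in-memory streams in place of stdin and stdout. The following is a stripped-down sketch, not the GTPFrontend from the repository: the class `MiniFrontend`, its three handlers, and the convention that handlers return a (succeeded, body) pair are assumptions made to keep the example self-contained. It has no board and no agent, so it only answers administrative commands.

```python
import io


def serialize(sequence, succeeded, body=''):
    # Same layout as a GTP response: status, optional ID, body, blank line.
    return '{}{} {}\n\n'.format(
        '=' if succeeded else '?',
        '' if sequence is None else sequence,
        body)


class MiniFrontend:
    def __init__(self, input_stream, output_stream):
        self._input = input_stream
        self._output = output_stream
        self._stopped = False
        self.handlers = {
            'protocol_version': self.handle_protocol_version,
            'known_command': self.handle_known_command,
            'quit': self.handle_quit,
        }

    def run(self):
        while not self._stopped:
            line = self._input.readline()
            if line == '':  # end of input
                break
            pieces = line.split()
            sequence = None
            if pieces and pieces[0].isdigit():
                sequence, pieces = int(pieces[0]), pieces[1:]
            if not pieces:
                continue
            # Unknown command names fall back to handle_unknown.
            handler = self.handlers.get(pieces[0], self.handle_unknown)
            succeeded, body = handler(*pieces[1:])
            self._output.write(serialize(sequence, succeeded, body))
            self._output.flush()

    def handle_protocol_version(self):
        return True, '2'

    def handle_known_command(self, name):
        return True, 'true' if name in self.handlers else 'false'

    def handle_quit(self):
        self._stopped = True
        return True, ''

    def handle_unknown(self, *args):
        return False, 'unknown command'


commands = "1 protocol_version\nknown_command play\nfoo\nquit\nprotocol_version\n"
output = io.StringIO()
MiniFrontend(io.StringIO(commands), output).run()
print(output.getvalue())
```

The dictionary lookup with a fallback is what keeps the frontend from crashing on commands it doesn't implement: the server gets a ? reply instead. Note that the last protocol_version is never answered, because quit stops the loop first.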
What’s left to complete this GTPFrontend is the implementation of the individual GTP commands. The following listing shows the three most important ones; we refer you to the GitHub repository for the rest.
        def handle_play(self, color, move):
            if move.lower() == 'pass':
                self.game_state = self.game_state.apply_move(Move.pass_turn())
            elif move.lower() == 'resign':
                self.game_state = self.game_state.apply_move(Move.resign())
            else:
                self.game_state = self.game_state.apply_move(
                    gtp_position_to_coords(move))
            return response.success()

        def handle_genmove(self, color):
            move = self.agent.select_move(self.game_state)
            self.game_state = self.game_state.apply_move(move)
            if move.is_pass:
                return response.success('pass')
            if move.is_resign:
                return response.success('resign')
            return response.success(coords_to_gtp_position(move))

        def handle_fixed_handicap(self, nstones):
            nstones = int(nstones)
            for stone in HANDICAP_STONES[nstones]:
                self.game_state = self.game_state.apply_move(
                    gtp_position_to_coords(stone))
            return response.success()
You can now use this GTP frontend in a little script to start it from the command line.
    from dlgo.gtp import GTPFrontend
    from dlgo.agent.predict import load_prediction_agent
    from dlgo.agent import termination
    import h5py

    model_file = h5py.File("agents/betago.hdf5", "r")
    agent = load_prediction_agent(model_file)
    strategy = termination.get("opponent_passes")
    termination_agent = termination.TerminationAgent(agent, strategy)

    frontend = GTPFrontend(termination_agent)
    frontend.run()
After this program runs, you can use it in exactly the same way you tested GNU Go in section 8.4: you can throw GTP commands at it, and it’ll process them properly. Go ahead and test it by generating a move with genmove or printing out the board state with showboard. Any command covered in your event handler in GTPFrontend is feasible.
Now that your GTP frontend is complete and works in the same way as GNU Go and Pachi locally, you can register your bots at an online platform that uses GTP for communication. You’ll find that most popular Go servers are based on GTP, and appendix C covers three of them explicitly. One of the most popular servers in Europe and North America is the Online Go Server (OGS). We’ve chosen OGS as the platform to show you how to run a bot, but you could do the same thing with most other platforms as well.
Because the registration process for your bot at OGS is somewhat involved and the piece of software that connects your bot to OGS is a tool written in JavaScript, we’ve put this part into appendix E. You can either read this appendix now and come back here, or skip it if you’re not interested in running your own bot online. When you complete appendix E, you’ll have learned the following skills:
This will allow you to enter a (ranked) game against your own creation online. Also, everyone with an OGS account can play your bot at this point, which can be motivating to see. On top of that, your bot could even enter tournaments hosted on OGS!