In this appendix, you’re going to learn how to use the cloud service Amazon Web Services (AWS) to build and deploy your deep-learning models. Knowing how to use a cloud service and host models on it is a useful skill in general, not only for this Go bot use case. You’ll learn the following skills:

- Creating an AWS account and launching a virtual server (an EC2 instance)
- Connecting to your instance over SSH and running deep-learning experiments on it
- Retrieving trained models from the instance
- Hosting a bot over HTTP so others can play against it
Although as of this writing AWS is the largest cloud provider in the world and provides many benefits, we could’ve chosen many other cloud services for this appendix. Because the big cloud providers largely overlap in terms of their offerings, getting started with one will help you know the others as well.
To get started with AWS, head over to https://aws.amazon.com/ to see the large product range AWS has to offer. Amazon’s cloud service gives you access to an almost intimidatingly large number of products, but for this book you’ll get pretty far by using just a single service: Amazon Elastic Compute Cloud (EC2). EC2 gives you easy access to virtual servers in the cloud. Depending on your needs, you can equip these servers or instances with various hardware specifications. To train deep neural networks efficiently, you need access to strong GPUs. Although AWS may not always provide the latest generation of GPUs, flexibly buying compute time on a cloud GPU is a good way to get started without investing too much in hardware up front.
The first thing you need to do is register an account with AWS at https://portal.aws.amazon.com/billing/signup; fill out the form shown in figure D.1.
After signing up, at the top right of the page (https://aws.amazon.com/) you should click Sign in to the Console and enter your account credentials. This redirects you to the main dashboard for your account. From the top menu bar, click Services, which opens a panel displaying the AWS core products. Click the EC2 option in the Compute category, as shown in figure D.2.
This puts you into the EC2 dashboard, which gives you an overview of your currently running instances and their statuses. Given that you just signed up, you should see 0 running instances. To launch a new instance, click the Launch Instance button, as shown in figure D.3.
At this point, you’re asked to select an Amazon Machine Image (AMI), which is a blueprint for the software that will be available to you on your launched instance. To get started quickly, you’ll choose an AMI that’s specifically tailored for deep-learning applications. On the left sidebar, you’ll find the AWS Marketplace (see figure D.4), which has a lot of useful third-party AMIs.
In the Marketplace, search for Deep Learning AMI Ubuntu, as shown in figure D.5. As the name suggests, this instance runs on Ubuntu Linux and has many useful components already preinstalled. For instance, on this AMI, you’ll find TensorFlow and Keras available, plus all the necessary GPU drivers already installed for you. Therefore, when the instance is ready, you can get right into your deep-learning application, instead of spending time and effort installing software.
Choosing this particular AMI is cheap, but it doesn’t come entirely for free. If you want to play with a free AMI instead, look for the Free Tier Eligible tag. For example, most of the AMIs in the Quick Start section shown previously in figure D.4 are available for free.
After clicking Select for the AMI of your choice, a tab opens that shows you the pricing for this AMI, depending on which instance type you choose; see figure D.6.
Continuing, you can now choose your instance type. In figure D.7, you see all instance types optimized for GPU performance. Selecting p2.xlarge is a good option to get started, but keep in mind that all GPU instances are comparatively expensive. If you first want to get a feel for AWS and familiarize yourself with the features presented here, go for an inexpensive t2.small instance first. If you’re interested only in deploying and hosting models, a t2.small instance will be sufficient anyway; it’s only the model training that requires more-expensive GPU instances.
After you’ve chosen an instance type, you could click the Review and Launch button in the lower right to launch the instance immediately. But because you still need to configure a few things, you’ll instead opt for Next: Configure Instance Details. Steps 3 to 5 in the dialog box that follows can safely be skipped for now, but step 6 (Configure Security Group) requires some attention. A security group on AWS specifies access rights to the instance by defining rules. You want to grant the following access rights:

- SSH access on port 22, so you can connect to the instance from your own machine
- Custom TCP access on port 5000, so you can later reach the web frontend that serves your bot
If you configured the access rules as we just described, your settings should look like those in figure D.8.
After completing the security settings, you can click Review and Launch and then Launch. This opens a window that asks you to create a new key pair or select an existing one. Select Create a New Key Pair from the drop-down menu. All you need to do is enter a key pair name and then download the secret key by clicking Download Key Pair. The downloaded key will have the name you gave it, with a .pem file extension. Make sure to store this private key in a secure location. The public key belonging to your private key is managed by AWS and will be put on the instance you’re about to launch. With the private key, you can then connect to the instance. After you’ve created a key pair, you can reuse it in the future by selecting Choose an Existing Key Pair. In figure D.9, you see how we created a key pair called maxpumperla_aws.pem.
This was the final step, and you can now launch your instance by clicking Launch Instance. You’ll see an overview called Launch Status, and you can proceed by selecting View Instances in the lower right. This puts you back into the main EC2 dashboard from which you started. You should see your instance listed there. After a short wait, the instance state should read “running,” with a green dot next to it. This means your instance is ready, and you can connect to it. You do so by first selecting the check box to the left of the instance, which activates the Connect button at the top. Clicking this button opens a window that looks like the one shown in figure D.10.
This window contains a lot of useful information for connecting to your instance, so read it carefully. In particular, it gives you instructions on how to connect to your instance with ssh. If you open a terminal and then copy and paste the ssh command listed under Example, you should establish a connection to your AWS instance. This command is as follows:
ssh -i "<full-path-to-secret-key-pem>" <username>@<public-dns-of-your-instance>
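A common stumbling block with this command: OpenSSH refuses private keys whose file permissions are too open, failing with an “UNPROTECTED PRIVATE KEY FILE!” warning. If you hit this, restrict the key’s file mode first (the path below is just a placeholder for wherever you saved your key):

```shell
# Make the downloaded key readable by its owner only; ssh rejects
# private keys that other users on the machine could read.
chmod 400 ~/Downloads/maxpumperla_aws.pem
```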
This is a long command that can be a little inconvenient to work with, especially when you’re handling many instances or SSH connections to other machines as well. To make life easier, we’re going to work with an SSH configuration file. In UNIX environments, this configuration file is usually stored at ~/.ssh/config. On other systems, this path may vary. Create this file and the .ssh folder, if necessary, and put the following content into this file:
Host aws
    HostName <public-dns-of-your-instance>
    User ubuntu
    Port 22
    IdentityFile <full-path-to-secret-key-pem>
Having stored this file, you can now connect to your instance by typing ssh aws into your terminal. The first time you connect, you’re asked to confirm the authenticity of the host; type yes and press Enter. The host will be added permanently to your local ~/.ssh/known_hosts file, so you won’t be asked again. Note that AWS already placed your public key on the instance when it launched; you can verify this by running cat ~/.ssh/authorized_keys there.
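If you want to double-check how OpenSSH interprets your new config entry without actually opening a connection, the -G flag (available in OpenSSH 6.8 and later) prints the effective options for a host alias:

```shell
# Show the settings ssh would use for the "aws" alias, without connecting.
ssh -G aws | grep -E '^(hostname|user|port|identityfile) '
```

If the output lists the hostname, user, port, and identity file you put into ~/.ssh/config, the alias is wired up correctly.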
The first time you successfully log into an instance of the Deep Learning AMI Ubuntu (in case you went with this one), you’ll be offered a few Python environments to choose from. An option that gives you a full Keras and TensorFlow installation for Python 3.6 is source activate tensorflow_p36; if you prefer Python 2.7, use source activate tensorflow_p27 instead. For the rest of this appendix, we assume you skip this and work with the basic Python version already provided on the instance.
Before you proceed to running applications on your instance, let’s quickly discuss how to terminate an instance. This is important to know, because if you forget to shut down an expensive instance, you can easily run up a bill of a few hundred dollars a month. To terminate an instance, you select it (by clicking the check box next to it, as you did before) and then click the Actions button at the top of the page, followed by Instance State and Terminate. Terminating an instance deletes it, including everything you stored on it, so make sure to copy everything you need (for instance, the model you trained) before termination; we’ll show you how in just a bit. Another option is to Stop the instance, which allows you to Start it again later. Note, however, that depending on the storage your instance is equipped with, stopping might still lead to data loss. You’ll be prompted with a warning in this situation.
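If you prefer the command line over the web console, the same operations are available through the AWS CLI. This is a sketch under the assumption that you’ve installed the CLI and configured it with your credentials; the instance ID is a placeholder you’d replace with your own (shown in the EC2 dashboard):

```shell
# Stop an instance so it can be started again later (EBS volumes persist):
aws ec2 stop-instances --instance-ids i-0123456789abcdef0

# Terminate an instance for good; this deletes it and its instance storage:
aws ec2 terminate-instances --instance-ids i-0123456789abcdef0
```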
Running a deep-learning model on AWS works the same way as running it locally, after you have everything in place. You first need to make sure all the code and data you need is on the instance. An easy way to do that is to copy it there securely with scp. For example, from your local machine, you can run the following commands to run an end-to-end example:
git clone https://github.com/maxpumperla/deep_learning_and_the_game_of_go
cd deep_learning_and_the_game_of_go
scp -r ./code aws:~/code     # 1. Copy the code to your instance
ssh aws                      # 2. Connect to the instance
cd ~/code
python setup.py develop      # 3. Install the package on the instance
cd examples
python end_to_end.py         # 4. Run an end-to-end training example
In this example, we assume you start fresh by cloning our GitHub repository first. In practice, you’ll have done this already and want to build your own experiments instead. You do this by creating the deep neural networks you want to train and running the examples you want. The example end_to_end.py we just presented will produce a serialized deep-learning bot in the following path relative to the examples folder: ../agents/deep_bot.h5. After the example runs, you can either leave the model there (for example, to host it or continue working on it) or retrieve it from the AWS instance and copy it back to your machine. For instance, from a terminal on your local machine, you can copy a bot called deep_bot.h5 from AWS to local as follows:
cd deep_learning_and_the_game_of_go/code
scp aws:~/code/agents/deep_bot.h5 ./agents
This makes for a relatively lean model-training workflow that we can summarize as follows:

1. Copy your code and data to the instance with scp.
2. Connect to the instance with ssh and train your model there.
3. Copy the trained model back to your local machine with scp.
Chapter 8 showed you how to serve a bot over HTTP so you and your friends can play against it through a convenient web interface. The drawback was that you simply started a Python web server locally on your machine. So for others to test your bot, they’d have to have direct access to your computer. By deploying this web application on AWS and opening the necessary ports (as you did when setting up the instance), you can share your bot with others by sharing a URL.
Running your HTTP frontend works the same way as before. All you need to do is the following:
ssh aws
cd ~/code
python web_demo.py --bind-address 0.0.0.0 --pg-agent agents/9x9_from_nothing/round_007.hdf5 --predict-agent agents/betago.hdf5
This hosts a playable demo of your bot on AWS and makes it available under the following address:
http://<public-dns-of-your-instance>:5000/static/play_predict_19.html
That’s it! In appendix E, we’ll go one step further and show you how to use the AWS basics presented here to deploy a full-blown bot connecting to the Online Go Server (OGS) using the Go Text Protocol (GTP).