In this appendix, you’re going to learn how to use the cloud service Amazon Web Services (AWS) to build and deploy your deep-learning models. Knowing how to use a cloud service and host models on it is a useful skill in general, not only for this Go bot use case. You’ll learn the following skills:

- Creating an AWS account and launching a virtual server (an EC2 instance)
- Connecting to your instance over SSH and running deep-learning experiments on it
- Retrieving trained models from the instance
- Hosting a bot over HTTP so others can play against it
Although as of this writing AWS is the largest cloud provider in the world and provides many benefits, we could’ve chosen many other cloud services for this appendix. Because the big cloud providers largely overlap in terms of their offerings, getting started with one will help you know the others as well.
To get started with AWS, head over to https://aws.amazon.com/ to see the large product range AWS has to offer. Amazon’s cloud service gives you access to an almost intimidatingly large number of products, but for this book you’ll get pretty far by using just a single service: Amazon Elastic Compute Cloud (EC2). EC2 gives you easy access to virtual servers in the cloud. Depending on your needs, you can equip these servers or instances with various hardware specifications. To train deep neural networks efficiently, you need access to strong GPUs. Although AWS may not always provide the latest generation of GPUs, flexibly buying compute time on a cloud GPU is a good way to get started without investing too much in hardware up front.
The first thing you need to do is register an account with AWS at https://portal.aws.amazon.com/billing/signup; fill out the form shown in figure D.1.
After signing up, at the top right of the page (https://aws.amazon.com/) you should click Sign in to the Console and enter your account credentials. This redirects you to the main dashboard for your account. From the top menu bar, click Services, which opens a panel displaying the AWS core products. Click the EC2 option in the Compute category, as shown in figure D.2.
This puts you into the EC2 dashboard, which gives you an overview of your currently running instances and their statuses. Given that you just signed up, you should see 0 running instances. To launch a new instance, click the Launch Instance button, as shown in figure D.3.
At this point, you’re asked to select an Amazon Machine Image (AMI), which is a blueprint for the software that will be available to you on your launched instance. To get started quickly, you’ll choose an AMI that’s specifically tailored for deep-learning applications. On the left sidebar, you’ll find the AWS Marketplace (see figure D.4), which has a lot of useful third-party AMIs.
In the Marketplace, search for Deep Learning AMI Ubuntu, as shown in figure D.5. As the name suggests, this instance runs on Ubuntu Linux and has many useful components already preinstalled. For instance, on this AMI, you’ll find TensorFlow and Keras available, plus all the necessary GPU drivers already installed for you. Therefore, when the instance is ready, you can get right into your deep-learning application, instead of spending time and effort installing software.
Choosing this particular AMI is cheap, but it doesn’t come entirely for free. If you want to play with a free AMI instead, look for the Free Tier Eligible tag. For example, most of the AMIs in the Quick Start section shown previously in figure D.4 are available for free.
After clicking Select for the AMI of your choice, a tab opens that shows you the pricing for this AMI, depending on which instance type you choose; see figure D.6.
Continuing, you can now choose your instance type. In figure D.7, you see all instance types optimized for GPU performance. Selecting p2.xlarge is a good option to get started, but keep in mind that all GPU instances are comparatively expensive. If you first want to get a feel for AWS and familiarize yourself with the features presented here, go for an inexpensive t2.small instance first. If you’re interested only in deploying and hosting models, a t2.small instance will be sufficient anyway; it’s only the model training that requires more-expensive GPU instances.
After you’ve chosen an instance type, you could click the Review and Launch button in the lower right to launch the instance immediately. But because you still need to configure a few things, you’ll instead opt for Next: Configure Instance Details. Steps 3 to 5 in the dialog box that follows can safely be skipped for now, but step 6 (Configure Security Group) requires some attention. A security group on AWS specifies access rights to the instance by defining rules. You want to grant the following access rights:

- SSH access on port 22, so you can connect to the instance from your own machine
- Custom TCP access on port 5000, so you can later reach the web frontend that serves your bot
If you configured the access rules as we just described, your settings should look like those in figure D.8.
After completing the security settings, you can click Review and Launch and then Launch. This opens a window that asks you to create a new key pair or select an existing one. Select Create a New Key Pair from the drop-down menu. All you need to do is enter a key pair name and then download the secret key by clicking Download Key Pair. The downloaded key will have the name you gave it, with a .pem file extension. Make sure to store this private key in a secure location. The public key belonging to your private key is managed by AWS and will be put on the instance you’re about to launch. With the private key, you can then connect to the instance. After you’ve created a key pair, you can reuse it in the future by selecting Choose an Existing Key Pair. In figure D.9, you see how we created a key pair called maxpumperla_aws.pem.
This was the final step, and you can now launch your instance by clicking Launch Instance. You’ll see an overview called Launch Status, and you can proceed by selecting View Instances in the lower right. This puts you back into the main EC2 dashboard from which you started. You should see your instance listed there. After a short wait, the instance state should read “running,” with a green dot next to it. This means your instance is ready, and you can connect to it. You do so by first selecting the check box to the left of the instance, which activates the Connect button at the top. Clicking this button opens a window that looks like the one shown in figure D.10.
This window contains a lot of useful information for connecting to your instance, so read it carefully. In particular, it gives you instructions on how to connect to your instance with ssh. If you open a terminal and then copy and paste the ssh command listed under Example, you should establish a connection to your AWS instance. This command is as follows:
ssh -i "<full-path-to-secret-key-pem>" <username>@<public-dns-of-your-instance>
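A common stumbling block with this command: OpenSSH refuses private keys whose file permissions are too open, failing with an “UNPROTECTED PRIVATE KEY FILE!” warning. If you hit this, restrict the key’s file mode first (the path below is just a placeholder for wherever you saved your key):

```shell
# Make the downloaded key readable by its owner only; ssh rejects
# private keys that other users on the machine could read.
chmod 400 ~/Downloads/maxpumperla_aws.pem
```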
This is a long command that can be a little inconvenient to work with, especially when you’re handling many instances or SSH connections to other machines as well. To make life easier, we’re going to work with an SSH configuration file. In UNIX environments, this configuration file is usually stored at ~/.ssh/config. On other systems, this path may vary. Create this file and the .ssh folder, if necessary, and put the following content into this file:
Host aws
    HostName <public-dns-of-your-instance>
    User ubuntu
    Port 22
    IdentityFile <full-path-to-secret-key-pem>
Having stored this file, you can now connect to your instance by typing ssh aws into your terminal. The first time you connect, you’re asked to confirm the authenticity of the host; type yes and press Enter. The host will be added permanently to your local ~/.ssh/known_hosts file, so you won’t be asked again. Note that AWS already placed your public key on the instance when it launched; you can verify this by running cat ~/.ssh/authorized_keys there.
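If you want to double-check how OpenSSH interprets your new config entry without actually opening a connection, the -G flag (available in OpenSSH 6.8 and later) prints the effective options for a host alias:

```shell
# Show the settings ssh would use for the "aws" alias, without connecting.
ssh -G aws | grep -E '^(hostname|user|port|identityfile) '
```

If the output lists the hostname, user, port, and identity file you put into ~/.ssh/config, the alias is wired up correctly.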
The first time you successfully log into an instance of the Deep Learning AMI Ubuntu (in case you went with this one), you’ll be offered a few Python environments to choose from. An option that gives you a full Keras and TensorFlow installation for Python 3.6 is source activate tensorflow_p36; if you prefer Python 2.7, use source activate tensorflow_p27 instead. For the rest of this appendix, we assume you skip this and work with the basic Python version already provided on the instance.
Before you proceed to running applications on your instance, let’s quickly discuss how to terminate an instance. This is important to know, because if you forget to shut down an expensive instance, you can easily run up a bill of a few hundred dollars a month. To terminate an instance, you select it (by clicking the check box next to it, as you did before) and then click the Actions button at the top of the page, followed by Instance State and Terminate. Terminating an instance deletes it, including everything you stored on it, so make sure to copy everything you need (for instance, the model you trained) before termination; we’ll show you how in just a bit. Another option is to Stop the instance, which allows you to Start it again later. Note, however, that depending on the storage your instance is equipped with, stopping might still lead to data loss. You’ll be prompted with a warning in this situation.
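If you prefer the command line over the web console, the same operations are available through the AWS CLI. This is a sketch under the assumption that you’ve installed the CLI and configured it with your credentials; the instance ID is a placeholder you’d replace with your own (shown in the EC2 dashboard):

```shell
# Stop an instance so it can be started again later (EBS volumes persist):
aws ec2 stop-instances --instance-ids i-0123456789abcdef0

# Terminate an instance for good; this deletes it and its instance storage:
aws ec2 terminate-instances --instance-ids i-0123456789abcdef0
```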
Running a deep-learning model on AWS works the same way as running it locally, after you have everything in place. You first need to make sure all the code and data you need is on the instance. An easy way to do that is to copy it there securely with scp. For example, from your local machine, you can run the following commands to run an end-to-end example:
git clone https://github.com/maxpumperla/deep_learning_and_the_game_of_go
cd deep_learning_and_the_game_of_go
scp -r ./code aws:~/code     # 1. Copy the code to your instance
ssh aws                      # 2. Connect to the instance
cd ~/code
python setup.py develop      # 3. Install the package on the instance
cd examples
python end_to_end.py         # 4. Run an end-to-end training example
In this example, we assume you start fresh by cloning our GitHub repository first. In practice, you’ll have done this already and want to build your own experiments instead. You do this by creating the deep neural networks you want to train and running the examples you want. The example end_to_end.py we just presented will produce a serialized deep-learning bot in the following path relative to the examples folder: ../agents/deep_bot.h5. After the example runs, you can either leave the model there (for example, to host it or continue working on it) or retrieve it from the AWS instance and copy it back to your machine. For instance, from a terminal on your local machine, you can copy a bot called deep_bot.h5 from AWS to local as follows:
cd deep_learning_and_the_game_of_go/code
scp aws:~/code/agents/deep_bot.h5 ./agents
This makes for a relatively lean model-training workflow that we can summarize as follows:

1. Copy your code and data to the instance with scp.
2. Connect to the instance with ssh and train your model there.
3. Copy the trained model back to your local machine with scp.
Chapter 8 showed you how to serve a bot over HTTP so you and your friends can play against it through a convenient web interface. The drawback was that you simply started a Python web server locally on your machine. So for others to test your bot, they’d have to have direct access to your computer. By deploying this web application on AWS and opening the necessary ports (as you did when setting up the instance), you can share your bot with others by sharing a URL.
Running your HTTP frontend works the same way as before. All you need to do is the following:
ssh aws
cd ~/code
python web_demo.py --bind-address 0.0.0.0 --pg-agent agents/9x9_from_nothing/round_007.hdf5 --predict-agent agents/betago.hdf5
This hosts a playable demo of your bot on AWS and makes it available under the following address:
http://<public-dns-of-your-instance>:5000/static/play_predict_19.html
That’s it! In appendix E, we’ll go one step further and show you how to use the AWS basics presented here to deploy a full-blown bot connecting to the Online Go Server (OGS) using the Go Text Protocol (GTP).