So far, we have been using the GitHub API without authentication. This limits us to sixty requests per hour. Now that we can query the API in parallel, we could exceed this limit in seconds.
Fortunately, GitHub is much more generous if you authenticate when you query the API. The limit increases to 5,000 requests per hour. You must have a GitHub user account to authenticate, so go ahead and create one now if you need to. After creating an account, navigate to https://github.com/settings/tokens and click on the Generate new token button. Accept the default settings and enter a token description. A long hexadecimal number should then appear on the screen. Copy the token for now.
Before using our newly generated token, let's take a few minutes to review how HTTP works.
HTTP is a protocol for transferring information between different computers. It is the protocol that we have been using throughout the chapter, though Scala hid the details from us in the call to Source.fromURL. It is also the protocol that you use when you point your web browser to a website, for instance.
In HTTP, a computer will typically make a request to a remote server, and the server will send back a response. Requests contain a verb, which defines the type of request, and a URL identifying a resource. For instance, when we typed api.github.com/users/pbugnion in our browsers, this was translated into a GET (the verb) request for the users/pbugnion resource. All the calls that we have made so far have been GET requests. You might use a different type of request, for instance, a POST request, to modify (rather than just view) some content on GitHub.
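Besides the verb and resource, there are two more parts to an HTTP request:
The headers, which carry additional information about the request. This is where we will pass our OAuth token, using the Authorization header. This Wikipedia article lists commonly used header fields: en.wikipedia.org/wiki/List_of_HTTP_header_fields.
The request body, which is not used in GET requests but matters for requests that modify a resource. For instance, to create a new repository on GitHub programmatically, one would send a POST request to /pbugnion/repos. The POST body would then be a JSON object describing the new repository. We will not use the request body in this chapter.
We will pass the OAuth token as a header with our HTTP request. Unfortunately, the Source.fromURL method is not particularly suited to adding headers when creating a GET request. We will, instead, use a library, scalaj-http.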
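To make the parts of a request described above concrete, here is a minimal sketch that models a request as a plain Scala case class and renders it roughly as it would be sent over the wire. The names RawRequest and RawRequestExample are hypothetical and exist only for illustration; they are not part of scalaj-http, and the token value is a placeholder.

```scala
// Illustrative model only: these names are hypothetical, not part of scalaj-http.
final case class RawRequest(
  verb: String,                    // e.g. "GET" or "POST"
  resource: String,                // e.g. "/users/pbugnion"
  headers: Seq[(String, String)],  // metadata such as Authorization
  body: Option[String]             // payload; absent for GET requests
) {
  // Render the request roughly as it would appear in HTTP/1.1:
  // a request line, one line per header, a blank line, then the body.
  def render: String = {
    val requestLine = s"$verb $resource HTTP/1.1"
    val headerLines = headers.map { case (k, v) => s"$k: $v" }
    (requestLine +: headerLines).mkString("\r\n") + "\r\n\r\n" + body.getOrElse("")
  }
}

object RawRequestExample {
  // A GET request carrying a placeholder token in the Authorization header.
  val get = RawRequest(
    verb = "GET",
    resource = "/users/pbugnion",
    headers = Seq(
      "Host" -> "api.github.com",
      "Authorization" -> "token <your-token>"
    ),
    body = None
  )
}
```

Calling RawRequestExample.get.render shows the verb and resource on the first line, followed by the headers. In practice, an HTTP library builds this text for us.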
Let's add scalaj-http to the dependencies in our build.sbt:
libraryDependencies += "org.scalaj" %% "scalaj-http" % "1.1.6"
We can now import scalaj-http:
scala> import scalaj.http._
import scalaj.http._
We start by creating an HttpRequest object:
scala> val request = Http("https://api.github.com/users/pbugnion")
request: scalaj.http.HttpRequest = HttpRequest(api.github.com/users/pbugnion,GET,...
We can now add the authorization header to the request (add your own token string here):
scala> val authorizedRequest = request.header("Authorization", "token e836389ce ...")
authorizedRequest: scalaj.http.HttpRequest = HttpRequest(api.github.com/users/pbugnion,GET,...
Let's fire the request. We do this through the request's asString method, which queries the API, fetches the response, and parses it as a Scala String:
scala> val response = authorizedRequest.asString
response: scalaj.http.HttpResponse[String] = HttpResponse({"login":"pbugnion",...
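The response is made up of three components:
The status code, which should be 200 for a successful request:
scala> response.code
Int = 200
The response body, which is the part that we are interested in:
scala> response.body
String = {"login":"pbugnion","id":1392879,...
The response headers, which provide metadata about the response:
scala> response.headers
Map[String,String] = Map(Access-Control-Allow-Credentials -> true, ...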
To verify that the authorization was successful, query the X-RateLimit-Limit header:
scala> response.headers("X-RateLimit-Limit")
String = 5000
This value is the maximum number of requests per hour that you can make to the GitHub API as an authenticated user. (Unauthenticated requests are instead limited per IP address.)
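The header check above can be wrapped in a small helper. The following is a sketch that assumes the headers are available as a Map[String, String], as in the REPL output shown earlier; other versions of the library may expose headers with a different type. The names RateLimit, intHeader, and looksAuthenticated are hypothetical.

```scala
import scala.util.Try

// Hypothetical helper, not part of scalaj-http.
object RateLimit {
  // Look up a header by name, ignoring case, and parse its value as an Int.
  // Returns None if the header is absent or its value is not a number.
  def intHeader(headers: Map[String, String], name: String): Option[Int] =
    headers
      .collectFirst { case (k, v) if k.equalsIgnoreCase(name) => v }
      .flatMap(v => Try(v.trim.toInt).toOption)

  // True if the reported hourly limit exceeds the unauthenticated limit of 60.
  def looksAuthenticated(headers: Map[String, String]): Boolean =
    intHeader(headers, "X-RateLimit-Limit").exists(_ > 60)
}
```

With the response from the session above, RateLimit.looksAuthenticated(response.headers) should be true when the token was accepted. GitHub also reports the number of requests left in the current window in the X-RateLimit-Remaining header, which intHeader can read in the same way.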
Now that we have some understanding of how to add authentication to GET requests, let's modify our script for fetching users to use the OAuth token for authentication. We first need to import scalaj-http:
import scalaj.http._
Injecting the value of the token into the code can be somewhat tricky. You might be tempted to hardcode it, but this prevents you from sharing the code without also sharing the token. A better solution is to use an environment variable. Environment variables are a set of variables present in your terminal session that are accessible to all processes running in that session. To get a list of the current environment variables, type the following on Linux or Mac OS:
$ env
HOME=/Users/pascal
SHELL=/bin/zsh
...
On Windows, the equivalent command is SET. Let's add the GitHub token to the environment. Use the following command on Mac OS or Linux:
$ export GHTOKEN="e83638..." # enter your token here
On Windows, use the following command:
> SET GHTOKEN=e83638...
If you reuse this environment variable across many projects, entering export GHTOKEN=... in the shell for every session gets old quite quickly. A more permanent solution is to add export GHTOKEN="e83638..." to your shell configuration file (your .bashrc file if you are using Bash). This is safe provided your .bashrc is readable by the user only. Any new shell session will have access to the GHTOKEN environment variable.
We can access environment variables from a Scala program using sys.env, which returns a Map[String, String] of the variables. Let's add a lazy val token to our class, containing the token value:
lazy val token: Option[String] = sys.env.get("GHTOKEN") orElse {
  println("No token found: continuing without authentication")
  None
}
Now that we have the token, the only part of the code that must change, to add authentication, is the fetchUserFromUrl method:
def fetchUserFromUrl(url: String): Future[User] = {
  val baseRequest = Http(url)
  // Add the Authorization header only if a token was found.
  val request = token match {
    case Some(t) => baseRequest.header("Authorization", s"token $t")
    case None => baseRequest
  }
  // Run the blocking HTTP call in a future and keep only the body.
  val response = Future { request.asString.body }
  val parsedResponse = response.map { r => parse(r) }
  parsedResponse.map(extractUser)
}
Additionally, to get clearer error messages, we can check that the response's status code is 200. As this is straightforward, it is left as an exercise.
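As one possible starting point for this exercise, the following sketch separates the status check from the HTTP call so that it can be tested without a network connection. The names StatusCheck, HttpStatusException, and bodyIfOk are hypothetical and are not part of scalaj-http or of the code above.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical helper for illustration only.
object StatusCheck {
  // Exception type introduced for this sketch; the message includes the
  // status code and the first 200 characters of the body.
  final case class HttpStatusException(code: Int, body: String)
    extends Exception(s"Unexpected HTTP status $code: ${body.take(200)}")

  // Return the body for a 200 response, and a descriptive failure otherwise.
  def bodyIfOk(code: Int, body: String): Try[String] =
    if (code == 200) Success(body)
    else Failure(HttpStatusException(code, body))
}
```

Inside fetchUserFromUrl, the future could then be built as Future { val r = request.asString; StatusCheck.bodyIfOk(r.code, r.body).get }, so that any response other than 200 fails the future with a message containing the status code.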