© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2023
Y. Wilson, A. Hingnikar, Solving Identity Management in Modern Applications, https://doi.org/10.1007/978-1-4842-8261-8_10

10. Using Modern Identity to Build Applications

Yvonne Wilson (1) and Abhishek Hingnikar (2)
(1) San Francisco, CA, USA
(2) London, UK
 

It’s not just what it looks like and feels like. Design is how it works.

—Steve Jobs, founder of Apple Computers, as quoted in “The Guts of a New Machine,” New York Times Magazine

The past chapters covered how certain identity protocols provide a solution for authentication, API authorization, and application authorization. It may seem daunting at first to learn these identity protocols, but using them will reduce the work you have to do in the long run. In this chapter, we’ll describe how we created a sample application to demonstrate how OpenID Connect (OIDC) and OAuth 2 can easily be applied to identity challenges faced by modern applications.

The source code for this book is available on GitHub via the book’s product page, located at https://github.com/Apress/Solving-Identity-Management-in-Modern-Applications. This repository contains the sample application discussed in this chapter.

Sample Application: Collaborative Text Editor

We chose to demonstrate how modern identity protocols can be used to build an application by modeling a collaborative document editor. The application involves a stateless back-end API serving a single-page application (SPA). This architecture is very popular at the time of writing, and learnings from it can be carried over to mobile applications, which require a similar separation of concerns between the client and API Server.

Note

Discussing every detail of how to build and deploy modern applications is far beyond the scope of this chapter; thus, we will focus here on the identity-related aspects of the application.

The application offers the following features and services:
  • Allows the user to create an article using rich text via Markdown.[i]

  • Articles belong to the author (user).

  • Authors can share an article with others.

  • Authors can invite others to collaboratively edit an article.

  • Authors can invite others to view the contents of an article in read-only mode.

The demo doesn’t support multiple users editing the same article at the same time. Furthermore, a new version is created every time a document is saved. Thus, if Jon created an article, and Jessie were to edit it, Jessie would get their own copy of the version, which is a full copy of the document, and any subsequent edits by either will be in their own history and branch. We kept the user interface quite simple, with minimal validation and error checking, to keep the focus on identity management. Now that we’ve covered the obligatory caveats about what isn’t included, we can discuss what the application will do.

Discovery

In Chapter 1, we suggested several questions to ask about your project that would help you better understand your identity management requirements. Let us now revisit those questions for this sample application. We share the following answers to clarify the application requirements for the demo application.

Who Are Your Users: Employees or Consumers?

Our sample application provides all types of users the ability to create and share documents. Therefore, both consumers and employees should be able to log in.

How Will Users Authenticate?

Our sample application allows a user to sign up for an account using a username and password and optionally offers the ability to use a social authentication provider. If a user tries to use more than one way to authenticate, they are prompted to link their accounts, in order to reduce confusion and offer a better experience.

Can Your App Be Used Anonymously?

Users can start out anonymously and create an anonymous document. To keep the application simple, if a user subsequently signs in, their anonymous documents are not converted to named-owner documents and are not visible to the named account. If an anonymous document is edited, the application creates another copy of the document.

Web-Based or Native App Format or Both?

The sample application will provide a web-based single-page application. This question helps us understand the scope of the project and argues for creating an API that encapsulates the back-end business logic needed by front-end applications. Such an API would make it easier to add a native application in the future. If you are writing a native application, many of these paradigms will remain the same.

Does Your Application Call APIs?

Our application will call our first-party API, which provides access to users’ articles. There is no use of third-party APIs at this time. The front-end application and back-end API are logically part of the same entity.

Does Your Application Store Sensitive Data?

The sample application assumes that the data in a user’s article might be sensitive.

What Access Control Requirements Exist?

The sample application provides anonymous authors with the ability to write and share documents. These documents are visible to everyone and such documents can be edited by the document author or cloned by anyone else.

Registered users can view documents shared with them, create new documents, edit the documents they created, and share their documents with others by specifying an email address or domain name. Documents can be shared with an individual user via their email address or with a group of users via a domain name. Basing document sharing on an email address’s Internet domain name allows us to demonstrate a simple implementation of group-based sharing.
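
The sharing rule described above could be sketched as follows. This is a minimal illustration, not the sample application's actual data model: the document shape (`sharedEmails`, `sharedDomains`) and the function name `isSharedWith` are assumptions made for this example.

```javascript
// Hypothetical document shape: { owner, sharedEmails: [...], sharedDomains: [...] }
function isSharedWith(doc, email) {
    const normalized = email.trim().toLowerCase();
    // The domain is everything after the last "@" in the address.
    const domain = normalized.slice(normalized.lastIndexOf("@") + 1);
    return (
        doc.sharedEmails.includes(normalized) ||
        doc.sharedDomains.includes(domain)
    );
}

const article = {
    owner: "jon@example.com",
    sharedEmails: ["jessie@example.org"], // shared with one individual
    sharedDomains: ["example.com"],       // shared with a whole domain
};

console.log(isSharedWith(article, "jessie@example.org")); // individual share
console.log(isSharedWith(article, "sam@example.com"));    // domain-based share
console.log(isSharedWith(article, "sam@example.net"));    // not shared
```

A check like this is only as trustworthy as the email address it receives, so a real implementation would likely want to rely on addresses the identity provider has verified.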

How Long Should a User Session Last?

The application offers a three-day maximum session timeout. There is no inactivity timeout implemented on the application. Instead, the application uses ephemeral access tokens issued by the identity provider. This is discussed further later in this chapter. After session timeout occurs, a user must reauthenticate.

Will Users Need Single Sign-On (If More Than One Application)?

In our sample scenario, there is only one application. However, we have implemented single sign-on into Discourse,[ii] a popular tool that is used for community documentation and support forums. This allows us to demonstrate a common scenario where users have single sign-on between an application, documentation, a public community, and some kind of support center.

What Should Happen When a User Logs Out?

When a user logs out, their application session should be terminated, and the user should be returned to a home page from which they could log in again.

Are There Any Compliance Requirements?

We assume no compliance requirements. The sample does not include a privacy notice, nor does it support any privacy requirements such as the right to erasure. We can only do this because it is a sample and can only run on a developer’s machine. Real applications have to consider privacy requirements, and we encourage investigating them early in a project to understand the scope of work required.

Platform, Framework, and Identity Provider

To keep the application simple and easy to understand, it is implemented using the popular React framework[iii] for the front end and Express.js[iv] on the back end. All the code is written in JavaScript. We made this decision because the lessons learned from JavaScript translate easily to many frameworks in many different languages. The application components can be easily deployed to a hosting platform such as Heroku[v] or Vercel.[vi]

Design

Based on the answers to the requirements questionnaire, we made a list of identity-related and application-related tasks in our design. The categorization for your application might not be as simple as our sample, but we recommend doing this exercise as it helps identify identity management requirements. Our list of tasks is shown in Table 10-1.
Table 10-1. Identity-Related and Application-Related Tasks in Sample Application

| Identity-Related         | Application-Related            |
|--------------------------|--------------------------------|
| Authenticating users     | Reading/writing documents      |
| Issuing tokens           | Performing checks on documents |
| Revoking sessions/tokens | Access to documents            |
| Logout                   |                                |

From the preceding list, it’s clear that we have a lot of identity-related tasks to implement. We would like to abstract out the identity-related tasks as much as we can. There are different approaches for doing this, such as using a library and building individual components or using an identity provider server or service. Instead of building an identity provider from scratch, we chose to leverage a third-party identity provider service for the identity-related tasks. This allows us to focus more of our time on the core functionality of our application.

Buy vs. Build

A useful analogy for this buy vs. build decision is someone needing to write a paper. A computer buff with experience building computers as a hobby might enjoy designing and building a custom computer for document editing. It would take a lot of time, possibly more time than writing the paper itself. If it is their hobby, they might not mind the time required, and since they have significant experience, they would be able to design a reliable computer and fix anything that breaks down later.

Most people, however, would just want to focus on writing the paper. They would not have the time or expertise to build a reliable custom computer. It would make a lot more sense for them to buy a computer so they could immediately get to work on their writing. If the computer malfunctions, they could get help more readily for a standard computer than one they built themselves. For similar reasons, we’ll leverage a third-party identity provider rather than build our own identity provider. It enables us to focus on our application’s features and reduces our future support burden.

Several good third-party identity providers are available. We chose Auth0[1] because we are familiar with it. Third-party providers enable applications to integrate with them using common identity protocols.

Industry Standard Protocols

Industry standard identity protocols play a very important dual role here. While they provide features such as single sign-on, user federation from social or employer accounts, and authorization, they also facilitate the use of third-party identity providers. They have probably undergone more thorough security review than custom code, and it may be easier to hire engineers to work with industry standard protocols than custom code. We strongly recommend using open, industry standard identity protocols and a third-party identity provider rather than building your own identity provider from scratch. While it may seem daunting to learn the protocols at first, doing so reduces complexity, liability, and maintenance burden in the long run, as we will see in the next few pages. When picking an identity provider, choose one that supports these open protocols for integrating with your application.

Architecture

Choosing to use industry standard protocols and a third-party identity provider allows us to refactor our solution. Figure 10-1 shows our application architecture with a third-party identity provider.

[Figure: a three-block diagram of the application architecture. The identity provider is connected to the single-page application (identity and authorization), the single-page application is connected to the API server (authorization), and the identity provider has backchannel communication with the API server.]

Figure 10-1. High-Level Design of the Sample Application

With an identity provider in the picture, we can reshape our original problem space into the following questions from an identity perspective:
  • How is my application going to trigger a login and logout?

  • How is my application going to establish who the logged-in user is?

  • How is my application going to call the API?

  • How can the API ensure the request it received is valid/authorized?

  • How can my API ensure the requesting user has sufficient authorization to perform this task?

We will focus on implementing these details in the next section. From an architectural point of view, we recommend writing a thin “glue” layer to abstract these details out from your core application code into a set of convenience functions that handle the identity calls. This can simplify your application code and make it less tightly coupled to a specific identity provider. With an identity layer, our solution can be divided into three components, shown in Table 10-2.
Table 10-2. Architectural Components for Our Application

| Component | Responsible for |
|-----------|-----------------|
| Identity Provider | Authenticating users; managing service-wide session; providing logout; providing identity federation |
| Application and API | Performing application-related tasks |
| Application-specific identity layer (convenience functions) | Acting as a glue between the identity provider and the API/application |

Implementation: Front End

Our list of problems to solve can be further split into two groups: front end and back end. The problems on the front end are
  • How is my application going to trigger a login and logout?

  • How is my application going to establish who the logged-in user is?

  • How is my application going to call the API?

The answers are obvious: using modern identity and access management (IAM) protocols of course! We are going to be using OpenID Connect (OIDC) to communicate between our application and the identity provider. At a high level, the identity provider will issue an ID Token to our application to convey information about the user and an access token to grant our application access to the API.

To start, we should mention that our application needs to be registered as an OIDC client on the identity provider. Different identity providers have various means of doing this. The registration process assigns a client ID to an application and allows the application to specify a callback URL, among other things. The registration process is important to establish a trust relationship between the application and identity provider. The information exchanged is used during the protocol interaction to mitigate the risk of various types of attacks. Registering our application at the identity provider gives us information, such as the client ID, that we will need to include in calls to the identity provider.

You may recall from previous chapters that integrating OIDC in your application requires constructing a URL, redirecting the user to the identity provider, and handling the callback from the identity provider. Fortunately, there are many OIDC-compliant client SDKs that will do the heavy lifting of this detailed protocol interaction for you. We chose an SDK from our identity provider because it simplifies the OIDC interaction and we knew it would be supported for use with the provider.

We then created our application’s identity layer of convenience functions to call the identity provider SDK. This allowed us to abstract the protocol details into more use case–driven tasks, which map directly to our list of problems to solve on the front end. Our solution’s front-end identity layer needs five convenience functions, as shown in Table 10-3.
Table 10-3. Front-End Functions Needed

| Function | Purpose |
|----------|---------|
| login | To authenticate the user using the identity provider. |
| getToken | To get a token to call an API with specific scopes. |
| logout | To end the current authentication session via the identity provider. |
| getProfile | To get information about the current user. |
| handleCallback | To handle redirection back from the identity provider, mostly needed for redirect-based flows on browsers. |

Depending on your identity provider, the protocol in use, and the user/application/API model, the exact signatures for these functions might vary. The tasks they need to perform, however, will likely be roughly the same as ours. For instance, the login() function might handle the details of storing the current application state and redirecting to an identity provider. At the identity provider, the redirection is typically handled by a method called “authorize,” which in our case requires parameters for an “audience” and “scopes”. Our login convenience function therefore requires such parameters to call the identity provider SDK.
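
To make the role of the “audience” and “scopes” parameters concrete, here is a minimal sketch of the URL a login convenience function might build when no SDK does it for you. The endpoint, client ID, and API identifier are placeholders, and `buildAuthorizeUrl` is a name invented for this example.

```javascript
// All configuration values below are placeholders, not real endpoints or IDs.
const config = {
    authorizeEndpoint: "https://idp.example.com/authorize",
    clientId: "my-client-id",
    redirectUri: "https://app.example.com/user/auth/callback",
};

// Build the URL that login() would redirect the browser to.
function buildAuthorizeUrl({ audience, scopes, state }) {
    const url = new URL(config.authorizeEndpoint);
    url.search = new URLSearchParams({
        response_type: "code",       // authorization code flow
        client_id: config.clientId,  // assigned when the application is registered
        redirect_uri: config.redirectUri,
        scope: scopes.join(" "),     // scopes are space-delimited
        audience,                    // identifies the API the access token is for
        state,                       // opaque value returned on the callback
    }).toString();
    return url.toString();
}

const loginUrl = buildAuthorizeUrl({
    audience: "https://api.example.com",
    scopes: ["openid", "profile", "read:articles"],
    state: "opaque-random-key",
});
console.log(loginUrl);
// In a browser, login() would finish with: window.location.assign(loginUrl);
```

A production flow for a public client would normally also carry PKCE parameters, which is one more reason to let an SDK generate this request where possible.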

Beyond the functions shown in Table 10-3, we need some kind of in-memory persistence layer in which to store the data received from the identity provider, such as user profile data or access tokens. This can reduce the frequency with which our application has to call the identity provider and improve the responsiveness of our application. Some applications store the user profile data fetched from identity providers on client storage (e.g., LocalStorage). However, any sensitive data, such as access tokens, must only be placed in adequately secure storage. The storage options available vary based on the type of application and platform used. In your identity abstraction layer, you should try to reuse as much functionality from the identity SDK as possible. If your identity SDK offers some type of storage, you should prefer using it. However, if your identity SDK provides only the bare-bones methods for abstracting OIDC, your methods would need to be doing something like that described in the following sections.

login( ) and handleCallback( )

Login in OIDC usually involves a redirect-based flow, typically the “authorization code” flow: the application redirects the user to the identity provider and then handles the callback. Implementing this two-step process as one “logical” unit has advantages. For instance, consider a scenario where a user who has never logged in navigates to a document at /articles/foo/1 via a hyperlink shared with them. At this point, we’d like to redirect the user to the identity provider and then redirect them to /articles/foo/1 after they have logged in successfully.

To solve this, we can store state data, such as the user’s desired document URL and any additional metadata, in client-side storage and then “refer” to it via a string key, which we pass to the identity provider as the state parameter. Upon successful authentication of the user, our application receives the state parameter back and can use the stored data to redirect the user to the desired document URL.

One thing to stress here is that the state value must be an opaque, random string rather than the state data itself. One simple storage solution could be to use localStorage and JSON in the browser as shown in the following code snippet:
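```javascript
// Encoding: store the state data locally and return an opaque random key.
function encodeState(data) {
    // Generate 32 random bytes and hex-encode them into an opaque string.
    const bytes = crypto.getRandomValues(new Uint8Array(32));
    const state = Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
    localStorage.setItem("state_" + state, JSON.stringify(data));
    return state;
}

// Decoding: look up the stored data by key and remove it so it is single-use.
function decodeState(state) {
    const stateKey = "state_" + state;
    const serializedData = localStorage.getItem(stateKey);
    if (serializedData === null) {
        throw new Error("State not found");
    }
    localStorage.removeItem(stateKey);
    return JSON.parse(serializedData);
}

// Then use it when redirecting.
// identitySdk is a placeholder for your identity SDK client.
identitySdk.authorize({ state: encodeState({ returnTo: "/articles/foo/1" }) });

// On callback: read the state key from the query string of the callback URL.
const parsedResponseUrl = new URL(window.location.href);
const state = parsedResponseUrl.searchParams.get("state");
const storedState = decodeState(state);
window.history.pushState({}, "", storedState.returnTo); // replace with your library / router
```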
The key things to look for in an identity SDK for login() and handleCallback() are
  • How does the identity SDK create the authorization URL? This may be exposed as a method named “login,” “authorize,” or something similar.

  • How does the identity SDK receive the data that is returned by the identity provider?

  • Does the SDK handle the implementation details of redirecting the user and handling the response?

  • What information are you responsible for storing and providing to the SDK?

  • Does the SDK implement verification of the ID Token or not?

Once you have those details, then the flow to implement login can be
  • Serialize your application state and generate any parameters needed to call the SDK.

  • Call the SDK method that generates the /authorize URL and redirects to the OpenID Provider.

  • Resume flow when redirect occurs using .handleCallback.

  • Perform necessary validation steps, such as checking the ID Token is valid.

  • Perform any additional required steps, such as redirecting to the original page, prefilling a form, etc.

Invoking “.handleCallback” should occur on a dedicated route in your web application such as /user/auth/callback, which is registered on the identity provider’s configuration as the callback URL for the application. The callback handler should be written to process both successful and unsuccessful authentication cases to avoid unexpected behavior when an identity provider doesn’t return with a successful status.
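
As a rough sketch of handling both outcomes, the callback handler could first classify the redirect it received before doing anything else. The function name `parseCallback`, the `isKnownState` lookup, and the `unknown_state` and `missing_code` labels are assumptions made for this example; the `error` and `error_description` query parameters are the ones an identity provider returns for a failed authorization request.

```javascript
// Classify the URL the identity provider redirected back to.
// Returns { ok: true, code, state } or { ok: false, error, description }.
function parseCallback(callbackUrl, isKnownState) {
    const params = new URL(callbackUrl).searchParams;

    // Unsuccessful authentication: the provider reports an error code.
    if (params.has("error")) {
        return {
            ok: false,
            error: params.get("error"),
            description: params.get("error_description"),
        };
    }
    // Reject callbacks whose state was not issued by this application.
    if (!isKnownState(params.get("state"))) {
        return { ok: false, error: "unknown_state", description: "State was not issued by this application" };
    }
    if (!params.has("code")) {
        return { ok: false, error: "missing_code", description: "No authorization code returned" };
    }
    // Successful authentication: the code can now be exchanged for tokens.
    return { ok: true, code: params.get("code"), state: params.get("state") };
}

const result = parseCallback(
    "https://app.example.com/user/auth/callback?error=access_denied&error_description=User%20cancelled&state=abc",
    (state) => state === "abc"
);
console.log(result.ok, result.error);
```

Keeping this step separate from the token exchange makes it easier to show a useful message when the user cancels or the provider reports a problem.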

Note

There are some differences between authentication in a native app and a web-based app in terms of how authentication is delegated to the identity provider. On the Web, it is natural for a web-based application to redirect the user’s browser to the identity provider in order for the user to authenticate there. For native apps, iOS, macOS, and Android all offer system browser integrations for doing this securely, and such redirection has become a widely accepted user experience on such platforms.

getToken( ) and getProfile( )

As a part of the login method using OIDC, the application will receive an ID Token and an access token. These should ideally be stored in a JavaScript variable (in memory), effectively acting as a cache until they expire. The getToken and getProfile convenience functions would primarily work on top of this cache, fetching the user profile and access token from it as needed.

The access token received will expire at some point. When it does, you will need to communicate with the identity provider and obtain a new token to continue accessing the API, as described in Chapter 6. There are two strategies available for getting a new token after the original token has expired. If your application is using refresh tokens, it would be able to get a new access token using the refresh token and the refresh token grant. If your application is not using refresh tokens, it would need to redirect to the identity provider to get a new access token. As described previously in Chapter 9, this can be done in a hidden iframe to improve user experience. OIDC SDKs and identity providers may implement this using response_mode “web_message”.
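
For the refresh token strategy, the request to the token endpoint might look like the following sketch. The function names and the shape of the `tokens` object are assumptions for illustration; the `grant_type`, `refresh_token`, and `client_id` form fields are the standard OAuth 2.0 parameters for this grant.

```javascript
// Build the form body for an OAuth 2.0 refresh token grant.
function buildRefreshRequest(refreshToken, clientId) {
    return new URLSearchParams({
        grant_type: "refresh_token",
        refresh_token: refreshToken,
        client_id: clientId,
    });
}

// Exchange the refresh token for new tokens.
// tokenEndpoint is a placeholder for your provider's token endpoint URL;
// tokens is a hypothetical in-memory object: { accessToken, refreshToken }.
async function refreshAccessToken(tokenEndpoint, tokens, clientId) {
    const response = await fetch(tokenEndpoint, {
        method: "POST",
        headers: { "Content-Type": "application/x-www-form-urlencoded" },
        body: buildRefreshRequest(tokens.refreshToken, clientId),
    });
    if (!response.ok) {
        throw new Error(`Token request failed with status ${response.status}`);
    }
    const result = await response.json();
    tokens.accessToken = result.access_token;
    // With rotation, keep the newly issued refresh token for the next request.
    if (result.refresh_token) {
        tokens.refreshToken = result.refresh_token;
    }
    return tokens;
}

console.log(buildRefreshRequest("example-refresh-token", "my-client-id").toString());
```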

Things to look for in an identity SDK for getToken() and getProfile():
  • Does the SDK support the web_message response mode?

  • Does the SDK support refresh tokens and refresh token rotation?

Note

Long-lived, nonrotating refresh tokens are effectively sensitive credentials and should not be used on public, web-based clients due to the higher risk of exposure and compromise. If refresh token rotation is not available in an SDK, you’ll need to implement the logic to receive a new refresh token each time a refresh token is used and store this new refresh token for the next request. This logic can be added on top of your token management logic, which would be responsible for checking when you need new access tokens or ID Tokens.

Once you have chosen the strategy for renewing expired access tokens, the implementation for both getToken and getProfile includes the following:
  • Check if cached content is available.

  • If not, use the refresh token, or do a hidden redirect flow to get a new access token and any other necessary information from the identity provider.

  • If an error is raised, handle it. For example, if the error requires user interaction, handle it by redirecting the user to the identity provider.

  • Once all data is obtained, for getToken return the access token, and for getProfile return the contents of the ID Token.
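
The steps above can be sketched as a small factory that produces both convenience functions. This is an illustration under stated assumptions: `renew` and `redirectToLogin` are injected, hypothetical callbacks standing in for whatever your SDK provides, and reading the error code from `err.code` is a convention chosen for this example. The `login_required` and `interaction_required` values are standard OIDC error codes.

```javascript
// renew(): resolves to { accessToken, idTokenClaims, expiresAt } (milliseconds).
// redirectToLogin(): sends the user to the identity provider.
function createIdentityLayer({ renew, redirectToLogin, now = () => Date.now() }) {
    let cached = null;

    async function ensureTokens() {
        // Step 1: use cached content while it is still valid.
        if (cached && cached.expiresAt > now()) {
            return cached;
        }
        try {
            // Step 2: refresh token or hidden redirect, depending on strategy.
            cached = await renew();
            return cached;
        } catch (err) {
            // Step 3: errors that need user interaction trigger a full login.
            if (err.code === "login_required" || err.code === "interaction_required") {
                redirectToLogin();
            }
            throw err;
        }
    }

    // Step 4: return the access token or the ID Token contents.
    return {
        getToken: async () => (await ensureTokens()).accessToken,
        getProfile: async () => (await ensureTokens()).idTokenClaims,
    };
}
```

Because both functions share `ensureTokens`, a single renewal refreshes the data behind getToken and getProfile at the same time.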

A Detailed Note on Token Management in SPAs

We abstract token management in the getToken method for the front end. When the token is acquired, the application uses the “expires_in” element of the response to compute an expected timeout for the token. All this information, along with the audience, scope, and other metadata associated with the token, is stored in memory. Later, when the application needs an access token with specific scopes, the getToken method simply returns an access token from the in-memory cache, until the token expires, at which point the application needs to request a new access token.
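
A minimal sketch of such an in-memory cache is shown below. It is not the example class from the book's repository; the class name, the key format, and the injectable clock are assumptions made so the idea can be shown in a few lines.

```javascript
// In-memory cache of access tokens, keyed by audience and scope.
class TokenCache {
    constructor(now = () => Date.now()) {
        this.now = now;           // injectable clock, useful in tests
        this.entries = new Map();
    }

    static key(audience, scope) {
        return `${audience}::${scope}`;
    }

    // tokenResponse is the token endpoint response: { access_token, expires_in }
    // where expires_in is a lifetime in seconds.
    set(audience, scope, tokenResponse) {
        this.entries.set(TokenCache.key(audience, scope), {
            accessToken: tokenResponse.access_token,
            expiresAt: this.now() + tokenResponse.expires_in * 1000,
        });
    }

    // Returns the cached access token, or null if it is missing or expired.
    get(audience, scope) {
        const key = TokenCache.key(audience, scope);
        const entry = this.entries.get(key);
        if (!entry) {
            return null;
        }
        if (entry.expiresAt <= this.now()) {
            this.entries.delete(key);
            return null;
        }
        return entry.accessToken;
    }
}

const cache = new TokenCache();
cache.set("https://api.example.com", "read:articles", { access_token: "example-token", expires_in: 3600 });
console.log(cache.get("https://api.example.com", "read:articles") !== null);
```

In practice it can be worth treating a token as expired slightly early, so that a request does not start with a token that expires in transit.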

When using refresh token rotation, there are three possible flows to request a new access token (with our identity provider’s feature set), as shown in Table 10-4.
Table 10-4. Steps for SPAs to Obtain New Access Tokens

| Scenario | Steps |
|----------|-------|
| A refresh token is in memory. | Request a new access token using refresh token rotation. |
| A refresh token is not available (no longer in memory). | Try to retrieve a new access token from the identity provider using a hidden iframe. This may fail with browsers implementing tracking protections. |
| A refresh token is not available, and a redirect via hidden iframe fails. | Fully redirect the user’s browser to the identity provider with prompt=none. If a session exists, the browser will be redirected back to the application with new tokens. |

Implementing this can be challenging, especially since doing so in a browser where the user has never authenticated or is not authenticated will create unnecessary overhead for your identity provider. To simplify that, we recommend using some kind of a localStorage flag or variable, such as _lastSessionValidated_, which would represent the likelihood of a session being available on your identity provider. Based on this variable, you could execute the logic for the second and third flows, as appropriate.
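
One way to picture the three flows and the session flag together is the sketch below. Every strategy is an injected, hypothetical function, and `sessionLikely` stands for the value derived from a flag such as _lastSessionValidated_; none of these names come from a real SDK.

```javascript
// Try the three flows from Table 10-4 in order.
async function renewTokens({ refreshToken, sessionLikely, useRefreshToken, useHiddenIframe, useFullRedirect }) {
    // Flow 1: a refresh token is in memory.
    if (refreshToken) {
        return useRefreshToken(refreshToken);
    }
    // Skip the silent attempts when no provider session is likely to exist,
    // which avoids unnecessary calls to the identity provider.
    if (!sessionLikely) {
        const err = new Error("No session is likely; interactive login required");
        err.code = "login_required";
        throw err;
    }
    try {
        // Flow 2: silent renewal in a hidden iframe.
        return await useHiddenIframe();
    } catch (iframeError) {
        // Flow 3: full-page redirect with prompt=none.
        return useFullRedirect();
    }
}
```

In a browser, `sessionLikely` might be computed as `localStorage.getItem("_lastSessionValidated_") !== null`, with the flag set after each successful login and cleared at logout.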

In our case, we abstract getToken using the getToken method in our identity provider’s SDK, auth0-spa-js.[vii] The auth0-spa-js library offers a cache implementation as well, so we do not reimplement it in the application. However, we do provide an example class that represents what such a cache requires.

.logout( )

Logout is implemented by clearing any tokens available to the application from memory. This includes tokens received during the user’s session as well as any cookies and session state set by the application. In addition, when logging out, we redirect to the OpenID Provider’s logout endpoint. This terminates the identity provider session as the provider will log the user out when the logout endpoint is invoked. The implementation for logout and session termination is vendor specific, and we recommend checking your identity provider’s documentation for their specific implementation of any logout-related features.

Things to look for in an identity SDK for .logout():
  • How to invoke OIDC RP-Initiated (Relying Party-Initiated) Logout[viii]

  • Methods to clear any meta state the SDK might have stored for the user session

Implementing logout is pretty straightforward in our case and involves the following steps:
  • Clear all tokens in memory

  • Remove all other locally stored information about the user

  • Redirect the user to the identity provider’s OIDC Logout endpoint via the SDK
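
Without an SDK, these steps might look like the following sketch. The `session` and `settings` objects and both function names are assumptions for this example, and the endpoint URL is a placeholder; the query parameters are those defined by OIDC RP-Initiated Logout. As noted above, your provider's logout endpoint may differ.

```javascript
// Build the URL for the identity provider's logout (end session) endpoint.
function buildLogoutUrl({ endSessionEndpoint, idToken, clientId, postLogoutRedirectUri }) {
    const url = new URL(endSessionEndpoint);
    url.search = new URLSearchParams({
        id_token_hint: idToken,                          // identifies the session to end
        client_id: clientId,
        post_logout_redirect_uri: postLogoutRedirectUri, // where to return afterward
    }).toString();
    return url.toString();
}

// session is a hypothetical in-memory object holding tokens and profile data.
function logout(session, settings) {
    const logoutUrl = buildLogoutUrl({ ...settings, idToken: session.idToken });
    // Clear all tokens and locally held user information.
    session.accessToken = null;
    session.idToken = null;
    session.refreshToken = null;
    session.profile = null;
    // In a browser, finish with: window.location.assign(logoutUrl);
    return logoutUrl;
}

const exampleSession = { accessToken: "at", idToken: "idt", refreshToken: "rt", profile: { sub: "user-1" } };
const exampleLogoutUrl = logout(exampleSession, {
    endSessionEndpoint: "https://idp.example.com/logout",
    clientId: "my-client-id",
    postLogoutRedirectUri: "https://app.example.com/",
});
console.log(exampleLogoutUrl);
```

The logout URL is built before the tokens are cleared because the ID Token is needed as the `id_token_hint`.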

An additional consideration is how to handle access tokens at logout. Identity providers that issue opaque access tokens may provide a mechanism to revoke access tokens. If provided, this can be used within a logout function. With a JWT access token, however, it is not possible to revoke the access token unless the issuing identity provider supports a blocklist feature or provides an introspection endpoint to check the status of the current token. It is possible to maintain a blocklist at your resource server, but synchronizing these blocklists can be challenging. If not revoked or blocklisted, the access tokens will stay valid until they expire. In practice, it’s often more convenient to use a short token expiration than to call the provider for each token to check for blocklisting.

Our provider uses a JWT-format access token. The access tokens for our API are configured in the identity provider to have a sufficiently short expiration period so we can avoid the development work and performance impact of checking for blocklisting. We recommend checking for the recommendations from your chosen identity provider on how to terminate access associated with security tokens it has issued, as the process may vary by the provider.

Closing Note

In the preceding sections, we’ve described an approach for writing an identity layer of convenience functions that call an identity provider’s SDK. We did this to simplify our application code and isolate the details of a particular identity provider’s SDK in the identity layer. Even if an identity provider offers a higher-level SDK, we still recommend writing an identity layer as a simple wrapper on top of the SDK. This makes it easier to update your code if the SDK’s API changes in the future. It can also facilitate testing, debugging, and Continuous Integration/Continuous Deployment (CI/CD) workflows without having to call the identity provider during your tests because you can stub out the convenience functions with a dummy implementation.
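
As an illustration of the testing benefit, a dummy implementation of the identity layer could look like the sketch below. It mirrors the function names in Table 10-3, but the return values and the default profile are invented for this example.

```javascript
// A stub identity layer for tests: same function names, no network calls.
function createStubIdentity(profile = { sub: "test-user", name: "Test User" }) {
    let loggedIn = false;
    return {
        login: async () => { loggedIn = true; },
        handleCallback: async () => ({ returnTo: "/" }),
        getToken: async () => {
            if (!loggedIn) {
                throw new Error("Not logged in");
            }
            return "stub-access-token";
        },
        getProfile: async () => (loggedIn ? profile : null),
        logout: async () => { loggedIn = false; },
    };
}

// Application code receives the identity layer as a dependency,
// so tests can pass the stub instead of the real implementation.
async function describeCurrentUser(identity) {
    const user = await identity.getProfile();
    return user ? `Signed in as ${user.name}` : "Not signed in";
}
```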

In our sample application, Auth0’s auth0-spa-js SDK is doing most of the heavy lifting for the identity protocol interaction with the identity provider. For additional learning, we have included a sample using OAuth4WebAPI, which is a pure OpenID Connect client, to implement the capabilities mentioned earlier. You can find this sample in the code repository in examples/oidc-spa-js.js.

Implementation: Back-End API

So far, we’ve been happily building the front end and haven’t talked much about the back end. A strong separation of concerns between the front end and the back end allows them to be implemented with some independence. The advantage of developing with well-designed, industry standard identity protocols is that you can develop a front end and a back end largely independently, as long as they agree upon the identity protocol. The front ends, whether native, browser-based, or web applications, will be able to access the back end, or many back ends, as long as they have the appropriate access token for the specific back end.

With that, let’s start designing and implementing our back end. Just like the front end, there are two major problems to solve:
  • How can the API ensure the request it received is valid/authorized?

  • How can my API ensure the requesting user has sufficient authorization to perform this task?

The first task is primarily dependent on the client via the identity provider. The access token issued by the identity provider must be included in all requests from the client application to the back-end server, as a bearer token. Identity Providers have different ways of representing an API, but, in general, you’ll register a back-end server/API with an identity provider, which will issue it a unique identifier. A client application can then request an access token for a specific API using that identifier. In our case, the identifier for our registered API is used with the “audience” parameter in the client’s authorization request to Auth0, our identity provider. (This parameter may be called the “resource” parameter in some implementations.)
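
On the client side, sending the access token as a bearer token amounts to setting the Authorization header on each request. A minimal sketch follows; `withBearerToken` and `callApi` are hypothetical helper names, and `getToken` stands for the identity layer function described earlier.

```javascript
// Return fetch options with the access token attached as a bearer token.
function withBearerToken(accessToken, options = {}) {
    return {
        ...options,
        headers: {
            ...options.headers,
            Authorization: `Bearer ${accessToken}`,
        },
    };
}

// Call the API with a token obtained from the identity layer.
async function callApi(getToken, url, options) {
    const accessToken = await getToken();
    return fetch(url, withBearerToken(accessToken, options));
}

console.log(withBearerToken("example-token", { method: "GET" }).headers.Authorization);
```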

Depending on the identity provider, you may receive an opaque access token that is validated by checking with the identity provider via an API call. Alternatively, you may receive a JWT-format access token, which is a self-contained token that includes the proof of its legitimacy cryptographically attached to it. JWT-format tokens are common, and there is now a defined “JSON Web Token (JWT) Profile for OAuth 2.0 Access Tokens.”[ix] We use JWT access tokens that use the RS256 signature algorithm. Doing so allows us to validate the JWT using the public key hosted by the identity provider on a well-known URL. This avoids the requirement to store a secret symmetric key on our API servers to validate the signature of JWT access tokens as would be required with HS256. This reduces risk, because if the secret key were compromised, it could be used by a rogue element to issue unauthorized tokens.
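
To show what RS256 validation involves, here is a bare-bones sketch using only Node's built-in crypto module. It assumes the public key has already been obtained from the identity provider, and the function name is invented for this example. In practice a maintained JWT library is the safer choice; this only illustrates that the API needs nothing secret to check the signature.

```javascript
const nodeCrypto = require("node:crypto");

// Verify the RS256 signature of a JWT against an RSA public key.
// publicKey may be a PEM string or a KeyObject.
function verifyRs256Signature(token, publicKey) {
    const parts = token.split(".");
    if (parts.length !== 3) {
        return false;
    }
    const [headerB64, payloadB64, signatureB64] = parts;

    // Only accept tokens that declare the algorithm we expect.
    let header;
    try {
        header = JSON.parse(Buffer.from(headerB64, "base64url").toString("utf8"));
    } catch {
        return false;
    }
    if (header.alg !== "RS256") {
        return false;
    }

    // RS256 is RSASSA-PKCS1-v1_5 with SHA-256 over "<header>.<payload>".
    const verifier = nodeCrypto.createVerify("RSA-SHA256");
    verifier.update(`${headerB64}.${payloadB64}`);
    return verifier.verify(publicKey, Buffer.from(signatureB64, "base64url"));
}
```

A valid signature only proves who issued the token; the claims inside it still need to be checked before the request is trusted.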

Once a JWT is received by an API and extracted from the request header, the API server should decode it and perform a quick assessment on whether the JWT uses one of the approved algorithms and is issued by an approved identity provider for the API. The API back end should then fetch and cache the public key from the identity provider and validate the signature on the JWT access token before trusting the contents of the token.
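The quick assessment described above might look like the following sketch, which pulls the bearer token out of the Authorization header and inspects the unverified header and payload before any key is fetched. The approved algorithm and issuer lists, and the function names, are hypothetical; nothing decoded here should be trusted until the signature has been verified.

```javascript
// Hypothetical allow-lists configured on the API server.
const APPROVED_ALGORITHMS = ["RS256"];
const APPROVED_ISSUERS = ["https://idp.example.com/"];

// Pull the token out of an "Authorization: Bearer <token>" header value.
function extractBearerToken(authorizationHeader) {
    const match = /^Bearer +([A-Za-z0-9\-._~+\/]+=*)$/i.exec(authorizationHeader || "");
    return match ? match[1] : null;
}

// Decode one base64url JWT segment into an object (throws on bad JSON).
function decodeSegment(segment) {
    return JSON.parse(Buffer.from(segment, "base64url").toString("utf8"));
}

// Cheap checks on the UNVERIFIED token, done before fetching any key.
function precheckToken(token) {
    const parts = token.split(".");
    if (parts.length !== 3) return { ok: false, reason: "malformed" };
    let header;
    let payload;
    try {
        header = decodeSegment(parts[0]);
        payload = decodeSegment(parts[1]);
    } catch {
        return { ok: false, reason: "malformed" };
    }
    if (!APPROVED_ALGORITHMS.includes(header.alg)) return { ok: false, reason: "algorithm" };
    if (!APPROVED_ISSUERS.includes(payload.iss)) return { ok: false, reason: "issuer" };
    // Still untrusted: the signature has not been verified yet.
    return { ok: true, header, payload };
}
```

Rejecting early on algorithm and issuer avoids fetching keys on behalf of tokens the API would never accept.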

Most languages and development frameworks provide a library for JWT access token validation. The website https://jwt.io lists a large number of these libraries. Your identity provider or SDK vendor might provide a more specialized library for this task.

You will need to ensure the audience, issuer, and algorithms are valid, by having an approved list on your API Server. You should not trust the incoming information in the access token without validating it. For example, if the issuer ID on your identity provider is https://foo.bar, then any requests to your API using access tokens from another issuer must receive an HTTP 401 response (unauthorized). Another example is that tokens with “none” as the algorithm must be rejected.
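One way to express the approved list is a small validation function that runs after the signature check and maps any failure to an HTTP 401. The sketch below reuses the https://foo.bar issuer from the example above; the audience value, the expiry check, and the shape of the return value are our own assumptions.

```javascript
// Approved values configured on the API server (audience is a placeholder).
const EXPECTED = {
    issuer: "https://foo.bar",
    audience: "https://api.example.com/articles",
    algorithms: ["RS256"],
};

// Validate the decoded header and payload of a token whose signature
// has already been verified. Returns an HTTP status and a reason.
function validateClaims(header, payload, nowSeconds = Math.floor(Date.now() / 1000)) {
    // Rejects "none" and anything else not on the approved list.
    if (!EXPECTED.algorithms.includes(header.alg)) {
        return { status: 401, error: "unapproved algorithm" };
    }
    if (payload.iss !== EXPECTED.issuer) {
        return { status: 401, error: "unapproved issuer" };
    }
    // The "aud" claim may be a single string or an array of strings.
    const audiences = Array.isArray(payload.aud) ? payload.aud : [payload.aud];
    if (!audiences.includes(EXPECTED.audience)) {
        return { status: 401, error: "wrong audience" };
    }
    if (typeof payload.exp !== "number" || payload.exp <= nowSeconds) {
        return { status: 401, error: "token expired" };
    }
    return { status: 200 };
}
```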

The second problem to solve in the back end can be divided into two major tasks, namely, identifying who the user is and what they are allowed to do. We can convert these into methods, like we did on the front end, and then focus on implementing them. Table 10-5 shows the API Helper Functions for our application.
Table 10-5

API Helper Functions Needed

Function | Purpose
function getUserId(token) {} | Takes a token and extracts the user_id
function canPerform(token, resource, action) {} | Given a token, checks whether a specific action can be performed on a resource using the token

.getUserId( )

This function is rather simple to implement. The claim “sub” in a JSON Web Token is often used to represent the user. Since we are using the Express framework,x we can extract this claim and populate this as an additional property on the request object.
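A minimal sketch of this idea follows. It assumes an earlier middleware has already validated the token and placed its decoded payload on req.auth; that property name, the req.userId property, and the attachUserId middleware are hypothetical names chosen for illustration.

```javascript
// Extract the user identifier from an already validated, decoded token.
function getUserId(token) {
    if (!token || typeof token.sub !== "string" || token.sub === "") return null;
    return token.sub;
}

// Express-style middleware: copies the "sub" claim onto the request object.
// Assumes req.auth holds the decoded payload of a verified access token.
function attachUserId(req, res, next) {
    const userId = getUserId(req.auth);
    if (userId === null) {
        res.status(401).json({ error: "missing subject" });
        return;
    }
    req.userId = userId;
    next();
}
```

With Express, this could be registered with app.use(attachUserId) after the token validation middleware, so that route handlers read req.userId rather than parsing the token themselves.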

As recommended in Chapter 4, we use an internal, application-specific identifier for a user in all application and API logic and use separate identifier attributes for display and notification. This enables a user to change the value of attributes such as their display name or notification email address without impacting articles tied to their identity. To keep the program simple, we didn’t implement functionality to let them actually make such changes, but this may serve as a fun modeling problem for the future.

The things to look for in the identity SDK for getUserId are
  • How will it perform JWT validation?

  • What contents are returned?

.canPerform( )

This method is an abstraction of authorization and requires a lot more explanation. The attributes needed to answer the authorization question need to be available either in the incoming request to the API or via some form of secondary storage that is referenced by information in the incoming request.

In our case, however, we needed access control at a very granular level, namely, each individual document. In our implementation, permissions for a document are stored on the document itself. Each document is stored on disk with an additional field that encapsulates who has access to the document. As a result, we elected to handle access control enforcement within our API. This meant the API needed to receive information about the user in each request for a document. The canPerform method therefore accesses a document resource’s metadata and then returns true or false depending on whether the resource is accessible by the user.

To model permissioning, the metadata for document resources is an array of access control entries, each of which represents the following:
  • The type of access granted, whether based on a user or a domain (discussed later)

  • The beneficiary of this access

  • A list of the permissions that the beneficiary has (“owner,” “editor,” “reader”)

A simple implementation of canPerform, which ignores the domain-based access is
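/**
 * Simple implementation of canPerform (ignores domain-based access).
 * Returns true when the user identified by the token holds the
 * requested permission on the document.
 */
function canPerform(token, resource, action) {
    const userId = getUserId(token);
    // Only document resources are covered by this check.
    if (resource.type !== "Document") return false;
    const { acl } = Documents.read(resource.id);
    return acl.some(
        (access) => access.userId === userId && access.permission.includes(action)
    );
}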
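To try this function in isolation, one option is to back it with an in-memory document store, as in the sketch below. The Documents object, its sample data, and the simplified getUserId are stand-ins we invented for the demonstration; the real application reads documents from disk.

```javascript
// In-memory stand-in for the document store (sample data is invented).
const Documents = {
    store: new Map([
        ["doc-1", {
            acl: [
                { userId: "user-1", permission: ["owner", "editor", "reader"] },
                { userId: "user-2", permission: ["reader"] },
            ],
        }],
    ]),
    read(id) {
        // Unknown documents yield an empty access list, so nobody gets access.
        return this.store.get(id) || { acl: [] };
    },
};

// Simplified: treats the token as an already validated, decoded payload.
const getUserId = (token) => token.sub;

function canPerform(token, resource, action) {
    const userId = getUserId(token);
    if (resource.type !== "Document") return false;
    const { acl } = Documents.read(resource.id);
    return acl.some(
        (access) => access.userId === userId && access.permission.includes(action)
    );
}

const doc = { type: "Document", id: "doc-1" };
console.log(canPerform({ sub: "user-1" }, doc, "editor"));
console.log(canPerform({ sub: "user-2" }, doc, "editor"));
```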

Using OAuth 2 Scopes – for API Authorization

OAuth 2 defines scopes as a means for an application to indicate the specific privileges it requests for an API call. We defined access scopes for the applications around the API endpoints and functions the applications would perform. This resulted in the following scopes that the applications can request:
  • get:article

  • post:article

  • patch:article

  • patch:profile

  • get:author

Note that these are the privileges that the front-end applications will use with the API, and not privileges that will be granted for individual users. The advantage here is that we could now expose our API to third-party applications, or even other first-party applications, and customize the access granted to each one, based on the defined scopes, without having to modify our API implementation. To reiterate, the scopes are used to grant application clients the ability to make various requests to our API. We will discuss authorization of users later in this chapter.
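On the API side, enforcing these scopes can be sketched as a small Express-style middleware factory. This assumes the validated token payload is available on req.auth (a hypothetical property name) and that granted scopes arrive as a space-delimited string in the “scope” claim, which is common but depends on your identity provider.

```javascript
// True when the space-delimited "scope" claim contains the required scope.
function hasScope(token, requiredScope) {
    const granted = typeof token.scope === "string" ? token.scope.split(" ") : [];
    return granted.includes(requiredScope);
}

// Middleware factory: rejects requests whose token lacks the given scope.
function requireScope(requiredScope) {
    return function (req, res, next) {
        if (!hasScope(req.auth || {}, requiredScope)) {
            // The token is valid but the client was not granted this privilege.
            res.status(403).json({ error: "insufficient_scope" });
            return;
        }
        next();
    };
}
```

A route could then be protected with something like app.get("/article/:id", requireScope("get:article"), handler). Note that this checks what the client application may do; the per-user check still happens separately.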

As a final note, the access policy for our demo application is extremely simple and easily expressed using the scope parameter. If we had a different application with a significantly more complex access policy, we would have considered using a Rich Authorization Request (RAR), as described in Chapter 8. At the time of writing, this is still a draft specification, so timing and support from identity providers would be factors to keep in mind against the capabilities this specification offers.

Linking Accounts

The benefit of having abstracted the identity aspects of our application is that we can handle challenges like user duplication, where a user may inadvertently create more than one account, by using a different way to authenticate than the one they initially used.

One option is to avoid the issue by simply enforcing a constraint that they must register first with a username/password account. However, such an approach undermines the convenience offered by using a social provider. In our case, we used an extension feature in the identity provider to add an additional constraint in the login procedure. When a user is authenticating, if we determine that the email address is already registered, we prompt the user to link their existing identity. It is important that we verify the original identity and that the current user has access to the original identity before linking the accounts. Otherwise, we would open up a vulnerability whereby a compromised social account could propagate into potential account takeovers, even if the user had never used the social provider to log in to our app. Further discussion of account linking can be found in Chapter 18.

In the case of our identity provider, adding additional logic during the login process can be done by running additional JavaScript code after login. An extension is provided to accomplish this task. Most identity providers have some means of accomplishing this type of linking. As a last resort, you might be able to query the user record on the first login, bring it into your application, and handle this linking challenge in your application instead.
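The linking decision itself can be sketched independently of any particular identity provider. The function below is a hypothetical illustration, not the extension we used: the record shapes, the action names, and the provedOriginalIdentity flag (set only after the user has successfully authenticated to the original account) are all assumptions.

```javascript
// Decide what to do when a user logs in with a new identity.
// incoming:               { userId, email } for the identity just used
// existingUsers:          array of { userId, email } already registered
// provedOriginalIdentity: true only after the user has also authenticated
//                         to the original account during this flow
function decideLinking(incoming, existingUsers, provedOriginalIdentity) {
    const original = existingUsers.find(
        (user) =>
            user.userId !== incoming.userId &&
            user.email.toLowerCase() === incoming.email.toLowerCase()
    );
    // No other account shares this email: nothing to link.
    if (!original) return { action: "continue" };
    // A match alone is not enough; the user must prove access to the original.
    if (!provedOriginalIdentity) {
        return { action: "prompt_to_link", originalUserId: original.userId };
    }
    return {
        action: "link",
        primaryUserId: original.userId,
        secondaryUserId: incoming.userId,
    };
}
```

The key property is that a matching email address only triggers a prompt; the link happens after the user demonstrates control of the original account.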

Anonymous Access

In our API, the core of the data model is a document on which CRUD (Create, Read, Update, Delete) operations are performed. We modeled those as HTTP “verbs” that map to POST, GET, PATCH, and DELETE functions.

By decoupling the business logic from the user information and implementing just these operations, our application could be fully functional without the notion of a user. As a bonus feature, in our case it would allow anonymous access, which might entice users to try out the app who might otherwise balk at having to sign up for an account first.

However, if we enable a user to create documents anonymously and the user later signs in, documents created before logging in will stay anonymous and public. The trade-off we made is that allowing a user to start anonymously means that if they upgrade to a full user later, we miss out on the ability to integrate the information about the user with content created anonymously.

Applications may choose to offer anonymous access or require mandatory user authentication from the start. In our sample, we chose to allow anonymous use as it felt natural to encourage users to try out the application with the least amount of friction.

To make this work with our JWT validation strategy, the application sends the Bearer value of “anonym.” This is a special value that means the user is untrusted. The .canPerform method can then be adapted to grant “create only” access to all documents, with anonymous identifiers.
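A minimal sketch of that adaptation is shown below. The wrapper function and its canPerformAuthenticated parameter are hypothetical names; the parameter stands in for the normal path that validates the JWT and checks the document’s access control entries.

```javascript
// Special bearer value sent by the application for unauthenticated users.
const ANONYMOUS_BEARER = "anonym";

// Route anonymous requests to a "create only" policy and everything else
// to the normal, token-based check supplied by the caller.
function authorize(bearerValue, resource, action, canPerformAuthenticated) {
    if (bearerValue === ANONYMOUS_BEARER) {
        // Untrusted caller: may create documents, nothing more.
        return resource.type === "Document" && action === "create";
    }
    return canPerformAuthenticated(bearerValue, resource, action);
}
```

Handling the special value before JWT validation keeps the anonymous path from ever being mistaken for a validated token.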

Granting Access Based on Domains

One of the features that we find really convenient today is being able to grant access to an entire domain. This is a simple version of the feature to share a document with an entire team on Google Workspace. Solving this is outside the domain of OAuth 2 and OIDC. However, this allows us to highlight how easily solutions can be built on top of these standards.

In the previous sections, we loosely defined the access model for our application. In practice, we store permissions in the document metadata, similar to how files have permissions associated with them in Unix-like operating systems.

Files can be shared using a user’s full email address or using “@domain.com” identifiers, which must start with the @ symbol. A file permission of “@domain.com” provides a simple way to grant access to a team and allows all users with an @domain.com email to access the file. The creator of a file has full access to the file, and to keep things simple, only the creator of a file is able to grant “share” privilege to others.

To implement this, each file has an array in metadata with the following shape:
{
        type: "DOMAIN" | "INDIVIDUAL",
        identifier: String,
        permission: Permission[],
}

The “identifier” attribute is either an email address or a domain name, while “permission” is one or more of the following: “read,” “write,” “share,” and “owner.” In the current permission model, a complete email will be matched in its entirety with the user’s email. In the interest of privacy, instead of storing the full email or domain, only a salted hash of the information is stored.
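As an illustration of storing hashed identifiers, the sketch below builds an access control entry from a raw share target. The choice of SHA-256, a single application-wide salt, lowercasing before hashing, and hashing the domain without its leading @ are all assumptions made for this example, as are the function names.

```javascript
const crypto = require("node:crypto");

// Salted hash of an email address or domain (SHA-256 is an assumption).
// The same salt must be used wherever values are later compared.
function hashIdentifier(identifier, salt) {
    return crypto
        .createHash("sha256")
        .update(salt)
        .update(identifier.trim().toLowerCase())
        .digest("hex");
}

// Build one metadata entry from what the user typed in the share dialog.
function buildAclEntry(target, permission, salt) {
    const isDomain = target.startsWith("@");
    return {
        type: isDomain ? "DOMAIN" : "INDIVIDUAL",
        // Assumption: domains are hashed without the leading "@".
        identifier: hashIdentifier(isDomain ? target.slice(1) : target, salt),
        permission,
    };
}
```

Because only hashes are stored, the API can match a caller against an entry only if it receives the same hashed value about that caller.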

This brings us to another problem. So far, our application has no means of fetching the user’s hashed email or their hashed email domain in the access token. To work around this, we add custom claims in the JWT access token issued by the identity provider. To convey information about the authenticated user to an API, it is very useful to have an identity provider that offers some means of adding custom claims to the access tokens it issues.

A nonstandard claim, such as “https://dev.doc/team”, is used to indicate a team’s email domain, and “https://dev.doc/email” is used to indicate the salted email address of an individual user. An extensibility feature in our chosen identity provider allows us to use custom code logic to augment the claims in the access token. We used a snippet of code like the following to add a claim:
export default async function (user, context, callback) {
    user.app_metadata = user.app_metadata || {};
    // Compute the hashed email domain once and reuse it on later logins.
    user.app_metadata.teamId =
        user.app_metadata.teamId || (await hash(getDomain(user.email)));
    // Add the namespaced custom claim to the access token.
    context.accessToken["https://dev.doc/team"] = user.app_metadata.teamId;
    callback(null, user, context);
}
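On the API side, the custom claims can then be compared against the hashed identifiers in a document’s metadata. The sketch below assumes the token has already been validated and decoded, and that both claims carry values hashed the same way as the stored identifiers; the function name is ours.

```javascript
// Namespaced custom claims added by the identity provider.
const TEAM_CLAIM = "https://dev.doc/team";
const EMAIL_CLAIM = "https://dev.doc/email";

// Check an action against a document's access entries using the hashed
// values carried in the (already validated) access token.
function canPerformWithClaims(token, acl, action) {
    return acl.some((entry) => {
        const claimValue = entry.type === "DOMAIN" ? token[TEAM_CLAIM] : token[EMAIL_CLAIM];
        return (
            typeof claimValue === "string" &&
            entry.identifier === claimValue &&
            entry.permission.includes(action)
        );
    });
}
```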

We have found that access policies vary quite a bit, and it is very common for applications to have unique access requirements. For this reason, we recommend checking for extensibility features when selecting your identity provider, as well as carefully evaluating a provider’s support for your application’s access control requirements. If your identity provider doesn’t support either your requirements or such extensibility, you’ll need to handle more logic on your application back end or add this in the provisioning step. In our case, this is conveniently handled for us out of the box with extensibility features in our identity provider, giving us more time to focus on our business logic.

The same need for extensibility applies to front-end customization as well, as most applications will want to customize consent screens and need tailored consent management or approval logic. Having an identity provider with some form of front-end extensibility for functions like login, sign-up, and consent can reduce what you have to build in your application.

Other Applications

In addition to the application that we developed, we are also using Discourse as a second application to demonstrate single sign-on. The Discourse application is registered separately with the identity provider and uses the Authorization Code Flow to authenticate users. Instructions for this setup are available in the Discourse documentation. One of the advantages of using an industry standard protocol like OIDC, OAuth 2, or SAML 2 is the benefit of support for single sign-on with third-party services like this one.

If you have an application that has a native application counterpart, the native application should also be registered at the identity provider as a second application. There are many benefits of doing so. It reduces the chances of potential abuse by limiting the impact of a bug, vulnerability, or misconfiguration. It can also improve visibility of users’ usage patterns, enable implementing different access control for the two client applications, as well as provide flexibility for branding and customization.

Additional Note on Sessions

In the modern identity world, a session for a user may in turn be a series of interconnected sessions. Even for our simple user application, there are two layers of session: an application session for the user and an identity provider session for the user. There may be additional sessions, as shown in Figure 10-2, when a user’s account in one identity provider is federated to an account in a remote identity provider. Sessions A, B, and C are independent sessions, each managed by the system that created it.

A 3-block diagram for authenticated user sessions. The blocks include google, your identity provider, and your application, connected via unidirectional arrows.

Figure 10-2

Sessions Illustrated

The single-page app relies on the identity provider session for a user and, as such, does not store data locally, beyond storing the tokens it receives in the in-memory cache. This simplifies the application but comes with the disadvantage that every time the application is started in the browser, it has to check with the identity provider for the status of the user’s session.

Luckily, it is very simple to make this transparent and perform this activity in the background. The SDK we use abstracts this for us. However, it can be achieved simply by storing some information about the user, like name and picture URL, rendering the UI optimistically, and performing the authentication in the background. This can be done by redirecting to the identity provider and fetching the response from the identity provider in an iframe, at least for first-party applications. The ability to run the authentication in the background is subject to being supported by your identity provider, but has usually been made available via “web_message” response mode or via the OpenID Connect Session Management specification.xi
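For readers not using an SDK, the background check can be sketched as an authorization request that forbids user interaction. The example below relies on the standard OIDC prompt=none parameter together with the “web_message” response mode mentioned above; the domain, client ID, /authorize path, and function names are hypothetical, and support for this response mode varies by identity provider.

```javascript
// Keep only what is needed to render the UI optimistically on startup.
function profileHint(user) {
    return { name: user.name, picture: user.picture };
}

// Build a silent authentication request intended to be loaded in a hidden
// iframe. With prompt=none the identity provider must not show any UI: it
// either returns a result for the existing session or an error.
function buildSilentAuthUrl(options) {
    const params = new URLSearchParams({
        response_type: "code",
        client_id: options.clientId,
        redirect_uri: options.redirectUri,
        scope: options.scope,
        state: options.state,
        prompt: "none",
        response_mode: "web_message",
    });
    return `https://${options.domain}/authorize?${params.toString()}`;
}

const silentUrl = buildSilentAuthUrl({
    domain: "idp.example.com",
    clientId: "spa-client-id",
    redirectUri: "https://app.example.com",
    scope: "openid profile",
    state: "random-state-value",
});
```

If the identity provider reports that no session exists, the application would discard the cached profile hint and show the signed-out view.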

As you learned in Chapter 9, in a typical SSO deployment, a user may have multiple sessions including an application session, identity provider session, and an additional session in each of any other applications they’ve authenticated to via the same identity provider. It is desirable in some cases to have a binding between the session at an identity provider and all the relying party applications it serves, so that a given application can be aware of changes to the user’s session in the identity provider and vice versa. At the time of writing, there is no reliable, standard way of achieving this with OIDC. There is a recently finalized specification for session management which we elected not to use for reasons described in the following section.

Browsers, Trackers, and OAuth 2

Modern browsers deploy powerful defenses to prevent the monitoring of a user without their consent, despite being constantly challenged by tracking companies. In the past few years, in order to protect the user from being tracked by such third parties, many browsers have begun to remove cookies placed by third parties and limit access to cookies in iframes. Some browsers even go so far as to remove, after seven days, any cookies placed by websites that use CNAMEs to refer to other domains (e.g., a CDN service). All of this has had a cascading effect that limits the background detection of session status as described in the OpenID Connect Session Management specification.xi

In response to the additional restrictions added by browsers to thwart third-party trackers, current OAuth 2 best-practice guidance recommends the use of refresh token rotation in browsers. This involves using a short-lived refresh token. Each consumption of the refresh token results in a new refresh token being issued, such that each refresh token is only used once. The utility here is that since this refresh token is short-lived, the risk of exposure is limited, and it makes the programming model almost the same as for native applications. Additionally, the guidance recommends adding token reuse detection and revoking sessions when token reuse is detected, to further improve the user’s security.
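Rotation and reuse detection happen at the authorization server, but a small in-memory model can make the behavior easier to follow. The class below is a simplified sketch of the idea, not any provider’s implementation: the class name, the notion of a token “family,” and the return shapes are our own, and token expiry is omitted.

```javascript
const crypto = require("node:crypto");

// Toy model of refresh token rotation with reuse detection.
class RefreshTokenStore {
    constructor() {
        this.tokens = new Map();          // token -> { familyId, used }
        this.revokedFamilies = new Set(); // families revoked after reuse
    }

    // Issue a refresh token; tokens descended from one login share a family.
    issue(familyId = crypto.randomUUID()) {
        const token = crypto.randomBytes(32).toString("base64url");
        this.tokens.set(token, { familyId, used: false });
        return token;
    }

    // Exchange a refresh token for a new one. Each token works only once.
    rotate(presented) {
        const record = this.tokens.get(presented);
        if (!record || this.revokedFamilies.has(record.familyId)) {
            return { ok: false, reason: "invalid" };
        }
        if (record.used) {
            // A second use suggests the token leaked: revoke the whole family.
            this.revokedFamilies.add(record.familyId);
            return { ok: false, reason: "reuse_detected" };
        }
        record.used = true;
        return { ok: true, refreshToken: this.issue(record.familyId) };
    }
}
```

In this model, replaying an already used token invalidates every token descended from the same login, so both the attacker and the legitimate client must authenticate again.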

Summary

We’ve covered in this chapter how we designed and built a sample application that uses OIDC for user authentication and OAuth 2 for API authorization for our custom API. In this scenario, both functions are handled by our identity provider, which serves as an OpenID Provider and an OAuth 2 authorization server. We shared some key design decisions and implementation points involved in creating the application. The following chapters will discuss additional aspects of identity management that applications have to handle after the user has been initially authenticated, starting with single sign-on.

Key Points

  • OIDC is used to authenticate users and obtain an ID Token with claims about the authenticated user.

  • OAuth 2 is used to obtain an access token to authorize the application to call our custom API.

  • To obtain access tokens for our custom API, we need to obtain and configure our own identity provider to protect it.

  • We used customization features from the identity provider to add custom claims to the access token to provide additional information to the API about the user. The custom claims enable the API to enforce user-level access policy. Customization features vary by identity provider.

  • Both applications use the OIDC authorization code flow with PKCE for a user’s initial authentication.

  • Our application uses a library that uses the web_message response mode for renewing access tokens for a better user experience.

  • The native application uses a refresh token to obtain a new access token if the previous access token has expired.

  • Registering the single-page and native versions of our application separately at the identity provider allows us to distinguish between the two application versions for access control, branding, and logging.

Notes
