© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2023
Y. Wilson, A. Hingnikar, Solving Identity Management in Modern Applications, https://doi.org/10.1007/978-1-4842-8261-8_4

4. Identity Provisioning

Yvonne Wilson (1) and Abhishek Hingnikar (2)
(1) San Francisco, CA, USA
(2) London, UK

The more identities a man has, the more they express the person they conceal.

—John le Carré, from Tinker, Tailor, Soldier, Spy (1974)

The first step in the life of an identity is its creation. If Descartes had lived in the time of Internet identity, he might have quipped, “Ego signati sursum, ergo sum” (I signed up, therefore I am). Provisioning is the act of establishing identities and accounts for your application. As defined in Chapter 2, an identity includes at least one identifier and various additional user profile attributes. An online account is associated with an identity and can be used to access protected online resources. The objective of the provisioning phase is the creation or selection of a repository of user accounts and identity information that will be used in the authentication and authorization of users as they access protected resources.

Provisioning Options

For an application developer, the identity provisioning phase involves getting users and creating accounts and identity profiles for them. One obvious approach for this is to have users sign up for a local application account, but that isn’t the only possibility. A list of approaches to consider includes
  • A user creates a new identity by filling out a self-registration form.

  • A special case of self-registration is sending select users an invitation to sign up.

  • User identities are transferred from a previously existing user repository.

  • An identity service with an existing repository of user identities is leveraged.

  • An administrator or automated process creates identities.

These approaches are not mutually exclusive; in some cases, a combination of approaches might work best. We’ll describe each in more detail along with some advantages and disadvantages for each.

Self-Registration

One option is to have users create a new account for your application and specify their identity information via self-service sign-up. This requires enticing users to your site, having them fill out a registration form, and then storing the collected information. This is a common approach for consumer-facing sites and requires you to design and create the sign-up form(s). The sign-up form and process must be capable of scaling to the expected volume of user sign-ups, especially for a big, widely announced launch. Self-registration also requires presenting privacy notices about the information you are collecting and obtaining the user’s consent for the planned use of that information. You should keep the information requested to a minimum, as users may abandon the registration process if too much data is required.

With a self-registration form, you control the user sign-up experience. You can customize the information you collect and ask the user directly for information that may not be available from other sources. Self-registration is more scalable, at least compared to having administrators manually create accounts. On the flip side, there is work to implement and maintain a registration form, along with procedures for obtaining user consent for the data collection and processing. In addition, having to fill out a registration form may deter some users from signing up. Table 4-1 summarizes some of the advantages and disadvantages of using a registration form.
Table 4-1. Self-Registration

Advantages
• Ability to collect user attributes that don’t exist elsewhere.
• Control over user registration experience.
• Scalability through self-service.

Disadvantages
• May deter some prospective new users from signing up.
• Liability associated with storing login credentials.

Progressive Profiling

You can reduce the information a user has to enter upon sign-up by using progressive profiling, the practice of building up user profile attributes for an identity over time, instead of requesting them all at once. With progressive profiling, a user is asked to provide minimal attributes when they sign up. If the user later performs a transaction that requires more information, it is collected at that time. Alternatively, additional information can be gathered after a certain amount of time has passed or a set number of logins. Progressive profiling reduces the sign-up friction that a lengthy initial sign-up form would present. It is often used in conjunction with self-registration sign-up, but can be used with other provisioning options.
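The decision of what to ask for, and when, can be sketched in a few lines. The following is a minimal illustration, not a prescribed design: the attribute names, the `checkout` and `newsletter` actions, and the login threshold are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass, field

# Hypothetical requirements; a real application would define its own.
REQUIRED_AT_SIGNUP = ("email",)
REQUIRED_FOR_ACTION = {
    "checkout": ("shipping_address", "phone"),
    "newsletter": ("display_name",),
}
# Hypothetical rule: after this many logins, ask for optional attributes too.
LOGIN_THRESHOLD = 5
ASK_AFTER_THRESHOLD = ("display_name",)


@dataclass
class Profile:
    attributes: dict = field(default_factory=dict)
    login_count: int = 0


def missing_attributes(profile, action=None):
    """Return the attributes to request now, in a stable order."""
    wanted = list(REQUIRED_AT_SIGNUP)
    if action is not None:
        # A transaction may need more data than sign-up did.
        wanted += REQUIRED_FOR_ACTION.get(action, ())
    if profile.login_count >= LOGIN_THRESHOLD:
        # Enough logins have passed to ask for more.
        wanted += ASK_AFTER_THRESHOLD
    seen = set()
    result = []
    for name in wanted:
        if name not in profile.attributes and name not in seen:
            seen.add(name)
            result.append(name)
    return result
```

A sign-up form built on this sketch would prompt only for `missing_attributes(profile)`, while a checkout page would prompt for `missing_attributes(profile, "checkout")` just before the transaction.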

Invite-Only Registration

A variant of the self-service registration approach is the invite-only registration flow. In this scenario, specific users are invited to sign up. The invitation may be triggered by another user. Some social networking sites use this approach to have users invite their friends to join the site. The invited user gets a link which takes them to a sign-up form where they can register. The invited user should create the password for their account when they register, rather than having it included in the invitation, so that only the user knows the password.

An invitation may also be triggered by an administrator of a site. This case may involve a registration form for the user, or, if the administrator has already provided all account data needed, it might only involve email address validation and/or a password reset. This technique might be useful to invite specific users to test an early access (alpha) version of an application or release.

With an invite-only sign-up, access to the registration form is restricted to a select group of users who receive an invitation. The invitation can be delivered via channels such as email or text message and contains a link that allows the user to register. The registration page can lock the email address or phone number to that used in the invitation so it cannot be changed at the time of registration. This prevents an uninvited person from stealing someone else’s invitation and signing up as themselves. The link in the invitation can also have an expiration associated with it, if necessary, and each invitation is usually tracked so it can only be used once.
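The three safeguards just described (a locked email address, an expiration, and single use) can be sketched as follows. This is an illustrative in-memory version only; the `InvitationStore` class and its method names are assumptions made for the example, and a real system would persist invitations and deliver the token inside a link.

```python
import hashlib
import secrets
import time


class InvitationStore:
    """In-memory invitation tracking (hypothetical; persist this in practice)."""

    def __init__(self, ttl_seconds=7 * 24 * 3600):
        self.ttl_seconds = ttl_seconds
        self._invites = {}  # sha256(token) -> invitation record

    @staticmethod
    def _digest(token):
        # Store only a digest so a leaked table does not reveal usable tokens.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, email, now=None):
        """Create an invitation bound to one email address; return its token."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._invites[self._digest(token)] = {
            "email": email.lower(),
            "expires_at": now + self.ttl_seconds,
            "used": False,
        }
        return token

    def redeem(self, token, email, now=None):
        """Return True only for an unused, unexpired token and the invited email."""
        now = time.time() if now is None else now
        record = self._invites.get(self._digest(token))
        if record is None or record["used"]:
            return False
        if now > record["expires_at"]:
            return False
        if email.lower() != record["email"]:
            return False  # the address is locked to the one that was invited
        record["used"] = True
        return True
```

The registration page would call `redeem` with the token from the link and the email address being registered, and show the sign-up form only if it returns `True`.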

An invite-only flow can also be used for situations where you need to create an account in order to assign privileges to it before sending the invitation. This approach could be used to establish employee accounts for new hires or customer accounts for access to early access (alpha) application environments. An administrator or automated process can create the account, assign it privileges, and then trigger the sending of the invitation link to the new user. The user clicks the link and provides additional information in a registration form if needed. The information entered by the user can be associated with the previously created account. The account is then ready for the person to use and has the privileges previously assigned to the account by the administrator.

The invite-only flow has similar considerations to the self-registration option described previously. It can additionally protect against registrations by hackers and bots, unless, of course, they find a way to finagle an invitation. An invite-only registration flow obviously requires extra work to implement the invitation mechanism as well as access control to limit access to the invitation distribution. It may require work by an administrator to issue the invitations or to create an automated process to do this. As with open self-registration, the invite-only registration process should be capable of supporting the expected volume of invited sign-ups. Some advantages and disadvantages of the invite-only sign-up approach are shown in Table 4-2.
Table 4-2. Invite-Only Registration

Advantages
• Ability to collect user attributes not available from other sources.
• Control over user registration experience.
• Some protection against registration by hackers and bots.
• Scalability through self-service if users invite others.

Disadvantages
• The work to implement the invitation mechanism and control access to it.
• The work to issue invitations.
• May deter some prospective new users from signing up.
• Liability associated with storing login credentials.

Identity Migration

If identities already exist elsewhere, they can be moved from one repository, such as a legacy database, to another repository that can be used by the new application. The advantage is that users don’t have to provide information they already entered elsewhere, and the new repository can be quickly populated with users from the legacy repository.

While most user profile attributes can be extracted and moved, passwords represent a challenge. Passwords are typically stored in a hashed format. Hashing converts a password into a fixed-length string of seemingly random characters, and the conversion cannot be reversed to get the original “cleartext” value. Each time the user logs in and enters their password, it is hashed, and the resulting value is compared to the hash that was stored when the user registered or last reset their password. Storing passwords in hashed format allows validation of entered passwords but prevents administrators with access to password repositories from seeing cleartext passwords and makes it difficult to use the passwords if the storage repository is compromised or stolen.
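To make the store-then-compare cycle concrete, here is a minimal sketch using PBKDF2 from Python's standard library. The choice of PBKDF2, the record layout, and the iteration count are assumptions for illustration; they are not a recommendation of specific parameters.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative value only; follow current guidance


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Derive a hash record to store instead of the cleartext password."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    # The salt and iteration count are stored alongside the digest because
    # both are needed to recompute the hash at login time.
    return {"salt": salt, "iterations": iterations, "digest": digest}


def verify_password(password, stored):
    """Hash the entered password the same way and compare the results."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), stored["salt"], stored["iterations"]
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, stored["digest"])
```

Note that the same password hashed with a different salt or iteration count yields a different digest, so a stored record is only meaningful to a system that knows which algorithm and inputs produced it.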

There are different algorithms for hashing passwords and different inputs passed to the hashing algorithms such as salts and iteration counts. As a result, a password hashed in one system cannot necessarily be imported and used by another system. If two different systems use different hashing algorithms or different inputs to the same algorithm, it is not possible to move a hashed password from one system to the other and have it be usable by the new system. In such circumstances, there are a few solutions to consider for migrating identities to a new system.

Support Legacy Hashing Algorithm

One solution is to move the hashed passwords to the new system and update the new system to support the hashing algorithm(s) used by the legacy system. This requires implementing in the new system the legacy system’s hashing algorithm(s) and a means of determining which hashing algorithm to use with each account. This will enable moving all identity data and hashed passwords from the legacy system to the new system without requiring the users to reset a password. Table 4-3 summarizes some advantages and disadvantages of supporting legacy hashing algorithms.
Table 4-3. Supporting Legacy Hashing Algorithms

Advantages
• Avoids need for password reset.
• Transfers all accounts in a usable state.

Disadvantages
• Work to implement legacy hashing algorithm(s).
• Liability associated with storing login credentials.
• Inherits any weakness associated with legacy hashing algorithms.
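One way to determine which hashing algorithm to use with each account is to store a scheme label on every credential record and dispatch on it at login. The sketch below assumes a hypothetical legacy scheme (salted SHA-1) and a hypothetical new scheme (PBKDF2 with SHA-256); the labels and record fields are invented for the example.

```python
import hashlib
import hmac


def _legacy_salted_sha1(password, record):
    # Hypothetical legacy scheme: SHA-1 over salt + password.
    digest = hashlib.sha1(record["salt"] + password.encode("utf-8")).digest()
    return hmac.compare_digest(digest, record["digest"])


def _pbkdf2_sha256(password, record):
    # Hypothetical scheme used by the new system.
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), record["salt"], record["iterations"]
    )
    return hmac.compare_digest(digest, record["digest"])


# Scheme label stored with each account -> verification routine.
VERIFIERS = {
    "legacy-sha1": _legacy_salted_sha1,
    "pbkdf2-sha256": _pbkdf2_sha256,
}


def verify(password, record):
    """Check a password using whichever scheme the account was stored with."""
    verifier = VERIFIERS.get(record["scheme"])
    if verifier is None:
        raise ValueError(f"unsupported hash scheme: {record['scheme']!r}")
    return verifier(password, record)
```

Because the legacy routine stays in the new system for as long as any account uses it, this design carries the inherited-weakness disadvantage listed in the table.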

Bulk Identity Migration

If it is not possible for the new system to support the legacy hashed passwords, it may be possible to extract the users’ identity data, minus the hashed passwords, from the legacy system and import it into the new system. The new system would then need to send each user a unique password reset link to establish a new password for their account in the new system. This requires the identity information in the legacy system to include a validated email address, and that a password reset link sent via email is deemed adequately secure for the sensitivity of the information handled by the new system. If other forms of communication besides email are used, the same validation and security requirements apply. This solution may also be useful if the passwords in the legacy system were not stored in a hashed form and the new system requires newly reset, hashed passwords for improved security. Users should be notified in advance about the migration, so they will know to expect the password reset message and not view it as an attempted phishing attack.
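A bulk transfer of this kind might look like the following sketch. The record fields (`email`, `email_verified`, `password_hash`) and the dictionary-based repository are assumptions made for illustration; sending the reset messages is left out.

```python
import secrets


def bulk_migrate(legacy_users, new_repo):
    """Copy profiles without password hashes; return reset tokens to send.

    legacy_users: iterable of dicts exported from the legacy system.
    new_repo: dict standing in for the new repository, keyed by email.
    """
    pending_resets = {}
    skipped = []
    for user in legacy_users:
        if not user.get("email_verified"):
            # Without a validated address, a reset link cannot be trusted
            # to reach the right person; handle these accounts separately.
            skipped.append(user["email"])
            continue
        profile = {k: v for k, v in user.items() if k != "password_hash"}
        profile["password_reset_required"] = True
        new_repo[user["email"]] = profile
        # One unique, unguessable token per user for the reset link.
        pending_resets[user["email"]] = secrets.token_urlsafe(32)
    return pending_resets, skipped
```

The returned tokens would be embedded in the password reset links, ideally with an expiration and single-use tracking on the new system's side.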

When forcing users to reset their passwords is acceptable, a bulk transfer can be done all at once, making it possible to retire the legacy system soon after the transfer. Table 4-4 summarizes some advantages and disadvantages of a bulk transfer of users.
Table 4-4. Bulk Migration of Users

Advantages
• Transfers all users at once.
• Enables immediate shutdown of legacy user repository.
• No latency added at login time to check a legacy system for a user account.
• Code to transfer identities can be independent of application code.

Disadvantages
• Transfers all accounts, even inactive accounts, unless they are filtered out during the transfer.
• Requires all users to set new password via account recovery, unless the new system can support the legacy hashed passwords.
• Migrating all users at once may cause an outage or delay the migration if things go wrong with the migration and there is no backout plan.
• If multiple applications use the legacy repository, they must migrate at the same time if the legacy repository is to be shut off after migration.
• Liability associated with storing login credentials.

Gradual Migration of Users

Identities can also be transferred gradually as users log in. This requires a login mechanism that prompts users for credentials, validates the credentials against the legacy repository, and, if validated, retrieves identity information from the legacy repository and stores it along with the entered credentials in the new repository. The password entered by the user, after validation against the legacy system, is hashed by the new system using its hashing algorithm and stored in the user’s account in the new system. This option will only migrate users who log in and requires the new authentication system to have direct access to the legacy system to validate the entered password and retrieve user profile information. This is convenient for users because no password reset is required, but it means the legacy system must remain operational until the identities have been migrated. This solution will not transfer inactive accounts (users who don’t log in).
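The login-time flow described above can be sketched as follows. Everything here is a simplified stand-in: `LegacyRepo` imitates a legacy system (using unsalted SHA-1 purely for brevity), the new repository is a plain dictionary, and the PBKDF2 parameters are illustrative.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value only


class LegacyRepo:
    """Hypothetical stand-in for the legacy system."""

    def __init__(self, users):
        self._users = users  # username -> {"sha1": hex digest, "profile": dict}

    def validate(self, username, password):
        user = self._users.get(username)
        if user is None:
            return False
        candidate = hashlib.sha1(password.encode("utf-8")).hexdigest()
        return hmac.compare_digest(candidate, user["sha1"])

    def profile(self, username):
        return dict(self._users[username]["profile"])


def _derive(password, salt):
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def login(username, password, new_repo, legacy):
    """Authenticate, migrating the account on its first successful login."""
    account = new_repo.get(username)
    if account is not None:
        # Already migrated: only the new repository is consulted.
        return hmac.compare_digest(
            _derive(password, account["salt"]), account["digest"]
        )
    if not legacy.validate(username, password):
        return False
    # The password was just validated by the legacy system, so it can be
    # re-hashed with the new algorithm and stored with the copied profile.
    salt = os.urandom(16)
    new_repo[username] = {
        "profile": legacy.profile(username),
        "salt": salt,
        "digest": _derive(password, salt),
    }
    return True
```

In this sketch the cleartext password is handled only in memory during the login call, which is the point at which the new system sees it anyway.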

With the gradual migration approach, a subset of users may not log in and therefore not have their identity information migrated. You can set a cutoff date for the migration and decide what to do about any identities that have not been migrated by that date. One possibility is to declare the unmigrated accounts inactive and abandon them. A common approach is to use the bulk move option described previously on the inactive accounts so you can decommission the old system. You may want to migrate only a subset of identities that you have reason to believe will be active again in the future. If you do not migrate all remaining identities, you should consider reserving the account identifiers of unmigrated identities to prevent them from being used by new accounts in the future. Chapter 15 explains why.

Of course, a user whose identity has not yet been migrated might forget their password. The user could use the new system to enter their email and get a password reset token or link. The user would confirm receipt of the email and be prompted to enter a new password. The new system would create an account for the user in the new system, with information retrieved from the legacy system for an account with matching email address. This scheme is only appropriate if there is no possibility that an email address used in the old system could have been recycled and assigned to a different user.

The gradual migration of active identities, combined with bulk migration of remaining identities and credential reset, provides a nice user experience for active users while not abandoning infrequent users. If using this approach, care should be taken to minimize exposure of login credentials, ideally by using an identity provider service that implements such a migration. In this case, the credentials are only exposed to the identity provider service that will handle authentication with those credentials going forward.

A final consideration with the gradual migration option is that it may be confusing for users if other applications are using the legacy identity store and the user resets their password in the new application to which their account has been migrated. If the new password is not synchronized back to the legacy identity store, the user may have one password for the applications that continue to use the legacy identity store and a different password for the new application. Performing synchronization back to the legacy system would depend on technical feasibility, cost, and the security of the legacy system. Alternatively, clear differentiation of the legacy and new login screens could reduce confusion to some extent.

Table 4-5 summarizes some advantages and disadvantages of a gradual migration.
Table 4-5. Gradual Migration of Users

Advantages
• Inactive accounts can be weeded out.
• No password reset required (for users who log in during migration).
• Spreads out risk of outages by migrating identities gradually (no big bang risk).
• Can support continued use of previous sign-up mechanisms or applications that use the legacy identity repository during the gradual migration.

Disadvantages
• Requires that legacy identity store is accessible from new application’s authentication mechanism.
• Legacy identity store must remain accessible until enough identities are transferred.
• Transfer mechanism must be maintained throughout the gradual migration.
• A user’s first login after migration starts may have some latency as identity data is transferred from the legacy system.
• Potential for user confusion after password reset if other applications continue to use the legacy data store.
• Potential for user confusion if users can make user profile updates in both legacy and new systems after migration.
• Implementation work cannot be easily decoupled from the application team.
• Liability associated with storing login credentials.

Any time identity data is moved from one system to another, it is important to consider any changes that might occur during the transition. The easiest approach is to prohibit users from making changes to either the old or new system during the transition, but this may not be feasible in some cases. If users are allowed to make changes to their identity information in the old system during an identity data migration, a plan is needed for how to identify and transfer such interim changes from the old system to the new system. In the case of a gradual identity migration, the user’s account in the old system can be disabled when the user’s account is migrated, preventing further changes in the old system. This assumes there are no other applications which will continue to use the old system. Requirements for each environment can be unique, so creating a plan that takes into account all applications in an environment, the migration timing, and potential for user changes during the migration is essential.

Administrative Account Creation

Yet another solution to consider for creating accounts and identities is to have an administrator or automated process create them. The best approach for a situation should take into account
  • The size of an organization

  • The frequency with which new users need to be added

  • Whether provisioning needs to be done across domains

The following sections provide a few variants of this solution to consider.

Manual Account Creation

Having an administrator manually create accounts for new identities will only be practical for very small organizations (low tens of users) with an infrequent need to add new users and few applications. For very small organizations, the work to implement account provisioning automation may not be justified. In the absence of automation, written procedures and checklists can be used to ensure necessary account provisioning steps are consistently followed. If passwords are used as credentials, the account provisioning procedures should ensure that administrators do not know the user’s password. This can be done by sending a password reset link to the user and/or requiring a password reset upon initial login. If the organization grows or starts to need more than a handful of applications, some form of automation will be beneficial for consistency, accuracy, security, and trackability.

Automated Account Creation

This approach is often used for employee identities. When a new employee joins a company, the company can automatically create an account for the employee using identity information from a Human Resources (HR) system. If large volumes of accounts need to be created on an ongoing basis, workflow software or specialized account provisioning software can be used to automate account creation and provide identity attributes for accounts.

Cross-Domain Account Creation

In several situations, account provisioning may need to occur across domains. This can occur when
  • Maintaining employee accounts in external SaaS (Software-as-a-Service) applications

  • Maintaining partner accounts in corporate identity repositories or applications

  • Maintaining business customer user accounts in business-facing applications

  • Maintaining guest professor or student accounts in collaborating universities’ systems

  • Maintaining guest user accounts in collaborating government agencies’ systems

Ideally, modern authentication protocols would convey user profile attributes to applications in authentication tokens at the time of login, but provisioning or synchronizing identity information across domains may still be needed if
  • Applications are not designed to extract identity information from authentication tokens.

  • The identity profile information is too large to convey in authentication tokens.

  • User logins are not frequent enough to keep profile information sufficiently up to date.

When needed, the provisioning of accounts and identity information across domains is still commonly done using proprietary solutions, but an industry standard protocol, SCIM 2.0 (System for Cross-domain Identity Management) [i], was defined in 2015 to provide a standardized approach to sending and updating identity information from one domain to another. SCIM 2.0 provides a standard REST API for one system to send requests to another system for adding, modifying, or deleting user and group records. This can be used to keep identity data synchronized between different systems. A common use case is for a centralized identity repository to send user account and profile updates, as well as account deactivation requests, to other service provider systems. SCIM 2.0 also provides an optional common user schema, though user profile attributes vary widely across systems, so mapping user profile attributes between systems is usually required.
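As a rough illustration of what such requests look like, the sketch below builds (but does not send) a SCIM 2.0 request to create a user and another to deactivate one. The schema URNs, the `/Users` endpoint, and the `application/scim+json` media type come from the SCIM 2.0 specifications; the helper function names, the base URL, and the sample user are hypothetical, and authentication to the service provider is omitted.

```python
import json

SCIM_USER_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:User"
SCIM_PATCH_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"
SCIM_CONTENT_TYPE = "application/scim+json"


def create_user_request(base_url, user_name, given_name, family_name, email):
    """Describe a POST that adds a user record at the service provider."""
    body = {
        "schemas": [SCIM_USER_SCHEMA],
        "userName": user_name,
        "name": {"givenName": given_name, "familyName": family_name},
        "emails": [{"value": email, "primary": True}],
        "active": True,
    }
    return {
        "method": "POST",
        "url": f"{base_url}/Users",
        "headers": {"Content-Type": SCIM_CONTENT_TYPE},
        "body": json.dumps(body),
    }


def deactivate_user_request(base_url, user_id):
    """Describe a PATCH that marks an existing user record inactive."""
    body = {
        "schemas": [SCIM_PATCH_SCHEMA],
        "Operations": [{"op": "replace", "path": "active", "value": False}],
    }
    return {
        "method": "PATCH",
        "url": f"{base_url}/Users/{user_id}",
        "headers": {"Content-Type": SCIM_CONTENT_TYPE},
        "body": json.dumps(body),
    }
```

A centralized identity repository would pass these descriptions to its HTTP client of choice. Attributes outside the common user schema would need to be mapped to whatever the receiving system supports.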

Table 4-6 shows some advantages and disadvantages of administrative account provisioning.
Table 4-6. Administrative Account Creation

Advantages
• User doesn’t fill out registration form.
• Administrator can assign customized privileges for the account.
• Can incorporate manual identity validation step if required by the organization creating account.
• Can be automated via workflow or identity provisioning software.

Disadvantages
• Time-consuming if not automated.
• Requires care to ensure that only the user knows the password for the account created.
• Liability associated with storing login credentials if stored locally.

Leverage Existing Identity Service

It’s also possible to leverage an identity that already exists for a user in an identity provider service. This allows users to employ an account they already have such as at a social provider like Facebook or Google, a corporate identity provider service operated by their employer, or a government identity service. With this option, your application delegates responsibility for authenticating users to an identity provider and receives back a security token with information about the user’s authenticated session and, optionally, attributes about the user.

Leveraging accounts in an existing identity provider service may mean less work for users if it reduces the data they have to enter into a registration form. It also usually means users don’t need to set up another password. This may translate to less development work if you don’t have to implement a login form or account recovery mechanism because all users authenticate via an identity provider service. It may also reduce your risk somewhat if user passwords are not stored in your infrastructure. If an identity provider service does not contain all the attributes your application needs about the user, you can always collect additional data later. Of course, it’s a good idea to vet an external identity service before trusting it, and the use of an identity provider service requires collaborative troubleshooting as described in Chapter 16. Table 4-7 summarizes some advantages and disadvantages of using an external identity service.
Table 4-7. Leveraging an Existing Identity Service

Advantages
• Better user experience if it reduces the data required to sign up.
• Easier for user to remember password if identity provider account is used frequently.
• You may not have to implement a login form or account recovery mechanism if all users authenticate via the identity provider service.
• Less risk if you do not store user passwords.

Disadvantages
• You may have to collect additional profile information not available from the identity provider service.
• You need to evaluate the service and availability levels of the external identity service to ensure it meets your needs.
• May require additional development or configuration work for each identity provider service to be used.
• May require configuration work at each identity provider service for each application you have, unless you use an authentication broker service (described in Chapter 7).
• May require collaborative troubleshooting with another organization when issues occur.

In addition to existing identity provider services, you can of course set up your own, new identity provider service for use by your application. If you choose that route, many cloud services are available to facilitate the task, and any of the previous provisioning options could be used to populate the new identity provider service with users.

Selecting an External Identity Service

If you choose to leverage an external identity service, it’s important to consider the strength of the identity issued by a service as well as the suitability and availability of a provider for a particular environment. The strength of an identity is one factor in determining how much trust can be placed in the identity, and several factors influence the strength of an identity:
  • The validation of the information used to establish the identity

  • The identity’s implementation that prevents it from being forged or used by others

  • Recognition of certain issuers of identities as authoritative for a particular domain

Table 4-8 provides a comparison of characteristics of strong vs. weak identities.
Table 4-8. Characteristics of Strong vs. Weak Identities

Strong Identities
• Linked to a real person, who can be held accountable for actions taken with the identity and associated accounts.
• Identity attributes are validated during account issuance process.
• Issued by entity recognized as authoritative for a particular context.
• Contains mechanisms to protect against forgery or unauthorized use.

Weak Identities
• Anonymous, cannot be linked to a real person.
• Little validation of identity attributes.
• Issued by an entity with little recognized authority.
• Few protections against forgery or unauthorized use.

The strength of an identity is based on the trustworthiness of the issuer, the validation of identity data, the practices behind the issuing and distribution of the identity, and, in some cases, agreements, either implicit or explicit, between the issuer and any entities trusting identity information from the issuer. The next sections provide examples.

Self-Registered Identities

A self-registered identity, such as a basic Gmail or Yahoo email account, is an example of a weak identity. You can sign up for these accounts using any identifier that has not already been taken, such as [email protected] or [email protected]. You do not have to supply true information in the sign-up form, and the service provider does not validate most of the identity data. Several social providers have added security features to protect against unauthorized use of accounts, but self-registered accounts are typically not considered authoritative for identity information due to the lack of validation. Identity providers with self-registered accounts and little validation of attributes are most suitable for consumer-facing applications that do not require strongly validated identity data and would otherwise rely on self-registered information. Allowing users to authenticate via such providers gives users convenience and the ability to reuse a common profile.

Organization Identities

Many organizations, such as companies or universities, will issue an online identity for their members, such as employees or students, respectively. These identities meet some of the criteria for a strong identity. For example, in the United States, one must show government-issued identity when starting a new job. This enables validation of the identity attributes used to establish an online account within the company and ties the account to a real person. Most companies implement measures in their identity service such as minimum password length and possibly stronger forms of authentication, to protect an account against unauthorized use. The corporate identity service is authoritative for user login, at least within the domain of the issuing company. However, a user typically cannot log in via their corporate identity service and access services outside the organization and its contracted SaaS services. A user could not, for example, expect to log in via their corporate identity service and access a government site to buy stamps as the government site would not have any basis to trust the corporate identity service. Organization identity services are primarily suitable for use by applications selected by the organization to provide services to organization members.

Government Identities

A government-issued online identity, such as those issued by the United Kingdom’s EasyID [ii], Belgian eID [iii], or Estonian e-identity [iv], is an example of a stronger identity. These require supplying information that is checked by a validation process. Some require applying in person at a government office, and some can be done online. Required documentation includes government-issued identity documents and photos that clearly show one’s face and may include fingerprints and financial questions. The resulting identity contains validated information and employs several security mechanisms to prevent unauthorized use.

The EasyID service, for example, can be used within the United Kingdom to prove identity and access various services conducted via the Post Office. The Belgian eID program issues an electronic identity that can be used for identification, digitally signing documents, and logging in to public services. Estonia issues a mandatory, secure, national digital identity and card, which Estonians use to travel within the EU and to access e-services such as voting, logging in to bank accounts, accessing medical records, filing taxes, and signing documents with a digital signature. Government-issued identities provide more strongly validated identities, but may be limited to users from one country and may be limited to use at the issuing government’s services. Wider use would need international standards similar to those for passports as well as a model for funding the incremental service operation costs.

Industry Consortium Identities

The Belgian Mobile IDv project is a consortium of financial institutions and mobile network operators to provide a strongly validated identity for anyone with a Belgian-issued eID and a mobile phone. It’s used to register at services, digitally sign documents, and securely log in as well as confirm transactions. The service includes a mobile application, “itsme,” which is used to authenticate without the need for passwords. The service is used to access Belgian government services such as social security and tax services as well as telecom and ebanking applications.

Identity Provider Selection

If you are creating a consumer-facing application that does not require validated identity information, allowing users to authenticate via an existing self-registered identity, such as a social provider account, offers users convenience over signing up with the same self-registered information at multiple sites.

If you are creating an employee-facing application, however, relying on social identity provider accounts to access company applications can be problematic because the user owns their identity and account at these providers. The credential standards of the provider may not meet company needs, and when an employee leaves the company, you could not delete their account to terminate their access. If, on the other hand, a social provider account is linked to a local application account, to enable logging in to the application via the social provider identity, the link can be removed and the local account disabled if an employee leaves. In the absence of such account linking, access would often need to be removed within individual applications, and one or more applications might be missed. For employee-facing applications, therefore, it’s best to use an identity service where the employing organization owns the accounts. The same logic applies to other organizations, such as educational institutions.

An organization-controlled identity service provides a single place at which the organization can provision accounts as well as shut off accounts if an employee or member leaves the organization. It also gives a single point at which to enforce credential strength/policy, deploy multi-factor authentication, and log authentication activity. Several cloud vendors offer an identity service on a subscription basis, including Google Apps, Azure AD, Auth0,1 Amazon Cognito, and Okta. Organizations can provision employees or members into these services and have complete control over the accounts, including the ability to quickly terminate or disable the accounts of anyone who leaves the organization.

If you are creating an application where your customers are businesses, you will likely need to support a variety of different identity providers because each business may have its own preferred identity provider service and want their users to sign in to your application via their chosen identity provider. Your business-to-business (B2B) customers may ask you to support authentication against cloud identity providers, such as those mentioned in the previous sections, or private identity providers that they operate themselves on their corporate network. It is best to do this via standard identity protocols such as OIDC or SAML 2. Implementing authentication directly against a customer’s internally hosted database or directory service would involve custom work for each customer and may expose your staff to passwords or administrative access which significantly increases your potential liability. Table 4-9 summarizes the types of identity providers that are most common for different scenarios.

Table 4-9. Identity Providers for Different Customer Types

| Scenario | Common Type(s) of Identity Provider |
| --- | --- |
| B2C: Business to consumer | Social Identity Providers2; identity services such as Azure AD or Auth0; application-specific repository |
| B2E: Business to employee | Identity services such as Google Apps, Azure AD, Auth0; any OIDC or SAML 2–compliant identity provider |
| B2B: Business to business | Identity services such as Google Apps, Azure AD, Auth0; any OIDC or SAML 2–compliant provider controlled by the business customer |

To recap, you should consider the target audience and strength of identity needed by your application. If a strong identity is required, it must be issued by a process which validates the information used to establish the identity and includes protections, such as strong password requirements or multi-factor authentication, to prevent unauthorized use of the identity. It must also be issued by an entity recognized as authoritative for the application’s domain.

Identity Proofing

The need for validated identity has been increasing. One important driver is the need to combat fraud, identity theft, and money laundering, especially in industries such as the financial sector. In the United States, the USA Patriot Actvi requires financial institutions to validate the identity of account holders, maintain records of the information used in such validation, and check if an account holder is on a list of known or suspected terrorists or traffickers. These measures are designed to reduce funding for organizations involved in terrorism, narcotics, and human trafficking. Similar requirements have been enacted by governments around the world, and they are sometimes called Know Your Customer (KYC) and Anti-Money Laundering (AML) requirements.

There are additional drivers for identity validation. In the United States, the Immigration Reform and Control Act (IRCA) of 1986 requires employers to validate the identity and employment eligibility of new employees. Businesses with sensitive intellectual property may validate the identity of new employees to reduce the risk of espionage. Applications targeted for a specific group, such as members of a trade union, may need to validate identity as part of eligibility requirements. Background checks can validate that an identity meets certain requirements, but at some point, the person applying must prove that they are the person represented by the submitted identity information. In the past, identity validation often took place via in-person presentation of identity documents. When an online service has no storefront at which in-person identity verification can take place, or when businesses hire remote employees, previous validation approaches may no longer be feasible. Many businesses may need to validate the identity of online users, and this process is known as identity proofing. The National Institute of Standards and Technology (NIST) in the United States has published a document on Digital Identity Guidelinesvii that outlines different identity assurance levels and the type of identity validation required for each.

A variety of digital services have sprung up to assist with this need. Some services validate an identity by having an applicant answer a series of multiple-choice questions that only the legitimate owner of an identity is likely to know, such as questions about past financial transactions. Other services will have a user record a selfie video to prove liveness, match the face on the video with a government-issued identity document, validate the identity document is legitimate, and retrieve validated identity attributes about the person from the identity document. Some providers can additionally check an identity against government lists of people and organizations on global sanctions and watchlists.

At the time of writing, providers such as ID.me, Sumsub, Socure, and Trulioo are a few examples of vendors offering solutions to help validate the identities of self-registered users. They can help businesses automate the process of identity verification to comply with government regulations, combat money laundering, or validate a user is a member of a particular class such as a military veteran or credentialed teacher. If validated identity is a requirement for your project, using such services can help validate user identity and profile attributes, freeing up application developers to focus on differentiating innovation for your application. It is important to note, however, that online identity validation services may not yet meet requirements for certain cases, such as identity verification for employment in the United States with the I-9 form required by the IRCA.

Choosing and Validating Identity Attributes

A common question that arises during the design of provisioning processes is how to identify a user. Email addresses have been widely adopted as identifiers. Using an email address as an identifier has the advantage that it includes a domain name and thus provides built-in uniqueness across domains. This eliminates the need for a user to find a name on each site that hasn’t been taken already. An email address is probably easier for a user to remember because it is used frequently, and it can double as a communication attribute. Email address identifiers, however, create several issues. Users may need to change their email address for any number of reasons and still retain access to their account as well as transactions performed using the previous email address. In addition, an email provider may reassign a previously used email address to a new owner. For business-facing applications, some businesses do not provide their employees with email accounts which can be an issue if an application assumes the availability of an email address. Similarly, applications marketed to children should recognize that some children may not have an email address.

Using a user-selected username also has advantages and disadvantages. A username may make it easier for a person to set up multiple accounts if needed and is typically shorter and therefore easier to type on mobile devices. A user must choose a unique username, however, and if their favorite username is already taken on a site, they have to choose another. It may be hard for users to remember which username was used at each site, which may create a need for a forgotten username feature. When one company acquires another, it often requires the merging of user repositories which may involve eliminating duplicate usernames. Table 4-10 lists some common advantages and disadvantages of different identifiers.

Table 4-10. Advantages and Disadvantages of Account Identifiers

| Identifier | Advantages | Disadvantages |
| --- | --- | --- |
| Email | Globally unique. No need to hunt for a name that isn’t taken already. May be easier to remember than a username. Can double as a communication attribute, such as for password resets. | May need to be changed by a user. May be reassigned by an email provider to a new user. May be reassigned by a corporate provider to a new user. Terminated by the employer if a user leaves. Not all companies issue email addresses. Children may not have email addresses. Family members may share an email address. May expose personal information (user’s name). Exposure as display name may result in spam email. |
| Username | Easier to set up multiple accounts at a site. May be shorter to type on mobile devices. Can be used in searches, allowing other attributes with personal data to be encrypted. | Only unique within an application domain. Merging user repositories problematic after acquisitions. May be harder for a user to remember which username was used at each site. A user may want to change a username over time. May expose personal information if used for display and it contains personal information. |
| Phone number | Globally unique (with country code). No need to hunt for a free identifier. Can double as a communication attribute, such as for password resets. May be easier for a user to remember than a username. | Exposure as display name may cause spam calls. Might be reassigned to a new user over time. May involve a charge to obtain a phone number. More difficult for a person to set up multiple accounts at the same site. May be changed by a user for various reasons. May be terminated by a phone provider. |

Attribute Usage

Some of the disadvantages listed earlier stem from using the same attribute for multiple purposes. They can be avoided by decoupling and using a different attribute for each of the following purposes:
  • Identifier for logging in

  • Display name

  • Notification/communication/account recovery

  • Internal account implementation such as for
    • Linking an identity/account to application records

    • Capturing user activity in log files

    • Consistent identifier for a user over time for audit purposes

The three internal account implementation purposes at the end of the list should use a unique, internal account identifier that is not affected by a user’s need to change profile attributes such as their email address, phone number, or legal name. In addition, the following suggestions can avoid some of the other disadvantages outlined in Table 4-10:
  • Avoid exposing identifiers that may contain personal data.
    • Use an internal account identifier in log files to avoid directly exposing personal data in logs.

    • Use an internal account identifier in application records.

    • Allow users to specify a display name for use on screens/printouts to protect privacy.

  • Identifiers/attributes for logging in, display, and notification should be distinct and easily changeable by the user.

  • Allow setting multiple attributes for notification purposes, such as a primary and secondary email, in case one becomes inoperable.

  • Allow usernames that are long, that contain special characters, and that users can change. This gives users flexibility: some can use an email address as their username if that is easier for them to remember, while others can use other values. Use a separate profile attribute for display purposes and another for notification/contact information to decouple these different usages.
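
The decoupling suggested above can be sketched as a small data structure. This is a minimal illustration rather than a prescribed design: the `Account` class, its field names, and the `audit_line` helper are all hypothetical, and a UUID is assumed as one possible form of internal account identifier.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class Account:
    """Hypothetical account record that keeps each attribute's purpose separate."""
    login_identifier: str   # what the user types to log in; changeable
    display_name: str       # shown on screens/printouts; changeable
    notification_emails: List[str] = field(default_factory=list)  # primary first, then backups
    # Internal identifier (assumed here to be a UUID): generated once and never changed.
    account_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def audit_line(account: Account, action: str) -> str:
    """Build a log line from the internal identifier only, keeping personal data out of logs."""
    return f"account={account.account_id} action={action}"


account = Account("pat@example.com", "Pat", ["pat@example.com", "pat.backup@example.org"])
original_id = account.account_id

# The user changes their login identifier; anything keyed by account_id is unaffected.
account.login_identifier = "pat_2024"
print(audit_line(account, "login_identifier_changed"))
```

Because application records and log entries refer only to `account_id`, the user can change the login identifier, display name, or contact addresses without breaking links to their history.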

If your application will leverage an identity provider, and users will access multiple applications through that identity provider, the use of Pairwise Pseudonymous Identifiers (PPIDs) reduces the ability for someone to correlate the user’s activity across different applications. For each user, a unique identifier is used between the identity provider and each application. A given user might be identified with “a8h3” for one application site and “c37j” for another. (In practice, the identifiers would be long, opaque, unguessable strings.) Support for PPIDs may vary by identity provider.
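
As a rough illustration of how an identity provider might derive such identifiers, the sketch below computes a keyed hash over the application and the user’s internal account identifier. The use of HMAC-SHA-256, the function name `pairwise_identifier`, and the example key and application names are assumptions made for illustration; real identity providers may use a different derivation or store randomly generated values instead.

```python
import hashlib
import hmac


def pairwise_identifier(secret_key: bytes, internal_account_id: str, application_id: str) -> str:
    """Derive a stable, opaque identifier unique to one (user, application) pair."""
    # Length-prefix the application ID so that different (application, account)
    # pairs can never produce the same message bytes.
    message = f"{len(application_id)}:{application_id}:{internal_account_id}".encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


# Hypothetical key for illustration only; a real key would be random and kept secret.
key = b"example-only-secret-key"

ppid_app_a = pairwise_identifier(key, "account-123", "app-a.example.com")
ppid_app_b = pairwise_identifier(key, "account-123", "app-b.example.com")
```

The same user receives a different identifier at each application, yet each application sees the same value on every login. Without the provider’s secret key, the two values cannot be linked to each other or to the internal account identifier.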

Validating Critical Attributes

In addition to using different profile attributes for different functions, it is important to validate email addresses and other profile attributes if used in activities that impact security and privacy. This includes attributes used for
  • Authorization decisions

  • Account recovery

  • Delivery of sensitive information to the user

For example, if a user profile includes an email address, and the email address attribute is used in authorization decisions, you should implement email address validation. Similarly, email address attributes used for notification in account recovery or delivery of sensitive information should be validated. The same holds true if a phone number is used for such purposes. If you import identities from elsewhere, you should ensure email addresses or other critical attributes used for the listed functions have been validated before accepting them so that you can rely on the profile attributes.

Security and privacy-related issues can arise with unvalidated attributes. If users can sign up using a fictitious, unvalidated email address and this attribute is used for authorization, their fictitious email address may match authorization rules that grant access to resources they are not really entitled to access. Validating email addresses also prevents accidental entry of an incorrect address. Incorrect email addresses could enable account takeover via account recovery mechanisms or result in the delivery of sensitive information to the wrong recipient. For these reasons, it is critical to decouple attributes for different purposes and validate any email addresses or other profile attributes that are used in authorization decisions, account recovery mechanisms, or to deliver sensitive information to users.
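
One common way to validate an email address is to send it a link containing a one-time token and mark the address as validated only when the link is followed. The sketch below shows the token handling under simplifying assumptions: the in-memory `_pending` store, the function names, and the one-hour default lifetime are hypothetical, and actually sending the email is left out.

```python
import hashlib
import secrets
import time
from typing import Dict, Optional, Tuple

# Hypothetical in-memory store: token hash -> (account_id, email, expiry time).
_pending: Dict[str, Tuple[str, str, float]] = {}


def _digest(token: str) -> str:
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def start_email_validation(account_id: str, email: str, ttl_seconds: int = 3600) -> str:
    """Create a one-time token to embed in a link sent to the address."""
    token = secrets.token_urlsafe(32)
    # Store only a hash so that a leaked store does not reveal usable tokens.
    _pending[_digest(token)] = (account_id, email, time.time() + ttl_seconds)
    return token


def confirm_email(token: str) -> Optional[Tuple[str, str]]:
    """Return (account_id, email) if the token is known and unexpired, else None."""
    entry = _pending.pop(_digest(token), None)  # pop makes the token single-use
    if entry is None:
        return None
    account_id, email, expires_at = entry
    if time.time() > expires_at:
        return None
    return account_id, email


token = start_email_validation("account-123", "pat@example.com")
first_attempt = confirm_email(token)    # succeeds; the address can be marked validated
second_attempt = confirm_email(token)   # None: the token has already been used
```

Only after `confirm_email` succeeds would the address be treated as validated for authorization decisions, account recovery, or delivery of sensitive information.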

Consent Management

A less obvious requirement that must usually be addressed as part of provisioning processes is obtaining any necessary user consent for the collection, processing, and use of their personal data as well as notifying users about their rights related to such collection, processing, and use. Privacy legislation varies by jurisdiction but typically requires that a site provide privacy-related notification to users and have a legal basis for collecting and using data about individuals. Such legal bases include obtaining user consent, fulfilling a contract, satisfying a legal obligation, or performing a task that is vital to a data subject, in the public interest, or for a legitimate business purpose.

It is beyond the scope of this book to cover privacy requirements in detail, but we will provide some considerations of technical features related to consent management that may be useful for the account provisioning process. A need to obtain user consent is primarily driven by privacy legislation, but, if done well, can facilitate user trust, which in turn may make users more willing to engage with your site by sharing information, responding to surveys, and consenting to practices such as personalization. Consent management encompasses the processes and mechanisms for providing privacy-related notification to users, obtaining their consent when required, allowing users to set privacy-related preferences, and securely storing consent records to support compliance with relevant privacy legislation.

Consent management includes displaying privacy notice(s) that describes data collection and processing practices, including the use of cookies and tracking technologies. Rather than a single opt-in vs. opt-out, best practices have evolved to provide more granular choices that enable users to opt in to the use of various cookies and tracking technologies, as well as the use of their data for functions such as marketing, ongoing communications, and analytics. Progressive consent gathering can be used to reduce consent fatigue on the part of users, but consent must be obtained prior to collection and use of data about users unless another legal basis applies. Consent must also be obtained in a way that complies with applicable privacy legislation. For example, in some jurisdictions, a user’s ability to access content on your site cannot require the user to agree to the use of their data for purposes such as analytics, personalization, or cross-marketing.

You will need to keep a record of consent obtained from users. Consent records should include information such as
  • Who gave consent, in the form of an identifier, such as email address or other account identifier, or in more anonymous cases, a cookie or device ID

  • When the consent was given, in the form of a timestamp

  • The site for which consent was given

  • The purposes of processing for which the user has consented

  • The version of privacy/consent notice used at the time of consent

  • Any subsequent changes or withdrawal of consent
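
The fields listed above could be captured in an append-only log, in which every change or withdrawal is a new entry rather than an overwrite. The following is a minimal sketch under that assumption; the `ConsentEvent` class, its field names, and the helper functions are hypothetical, and a real system would use durable, access-controlled storage.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Set


@dataclass(frozen=True)
class ConsentEvent:
    """One immutable entry in a hypothetical append-only consent log."""
    subject_id: str        # who: an account identifier, or a cookie/device ID
    site: str              # the site for which consent was given
    purposes: frozenset    # purposes consented to as of this event (empty = withdrawn)
    notice_version: str    # version of the privacy/consent notice shown
    recorded_at: datetime  # when the choice was made (UTC)


def record(log: List[ConsentEvent], subject_id: str, site: str,
           purposes: Set[str], notice_version: str) -> ConsentEvent:
    """Append a new event; earlier events remain as the history of changes."""
    event = ConsentEvent(subject_id, site, frozenset(purposes), notice_version,
                         datetime.now(timezone.utc))
    log.append(event)
    return event


def current_purposes(log: List[ConsentEvent], subject_id: str, site: str) -> frozenset:
    """Return the purposes from the most recent event for this subject and site."""
    for event in reversed(log):
        if event.subject_id == subject_id and event.site == site:
            return event.purposes
    return frozenset()


log: List[ConsentEvent] = []
record(log, "account-123", "shop.example.com", {"analytics", "marketing"}, "v3")
record(log, "account-123", "shop.example.com", {"analytics"}, "v3")  # marketing withdrawn
```

Keeping earlier events means an auditor can see what the user had agreed to at any point in time, while applications consult only `current_purposes` before processing data.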

The data about users’ consent choices may be used in several ways, so the decision about how and where to store it should consider such needs. It should be possible for users to view and update their consent choices over time. User consent data may need to be accessible by applications to trigger the execution of code which gathers data on user behavior, for feedback and learning about users. Marketing applications may also need consent data to govern communications sent to users. Lastly, auditors may need to review records that show that user consent has been obtained. To support these requirements, user consent data should be centralized and accessible by different business functions and systems.

A site may collect different types of data about users, including what is known as zero-party data, first-party data, and third-party data. Zero-party data is a term coined by Forrester Research and refers to data that users provide themselves, such as preferences, survey responses, or other information they choose to share about themselves. First-party data is data collected by an application about a user. It can include observations of user behavior on a site and transactions the user submits. Third-party data is data collected or purchased from third parties to augment data collected by an application. The data you hold about users may be subject to Data Subject Access Requests (DSARs).

Privacy legislation typically gives users, also known as data subjects in this context, the right to access data held about them. Companies must respond to DSARs in a timely manner, with the exact time limit varying by jurisdiction. The data collected about a user during the provisioning process and beyond may be subject to such requests. Applications must support a user’s right of access, right to rectification, right to erasure, right to restrict processing, right to data portability, and right to object to processing. Users must be able to see data held about them, rectify it if needed, and erase data, as well as revoke or adjust any consent given earlier. You will need to implement processes and/or online mechanisms to support these rights.

Summary

We’ve covered several approaches that can be used to establish accounts for the users of your application, including self-registration, progressive profiling, transferring users from elsewhere, administrative processes, and leveraging identity provider services. In selecting a provisioning approach, you will want to consider the strength and suitability of the identity offered by each option against the sensitivity and target audience of your application. You may need to design consent management processes to provide notification and obtain user consent for the collection and processing of personal data, and you may additionally need to utilize identity proofing services to validate the identities of your users. Once you have an idea how your users will be created, you can start implementing authentication and access control. Modern applications are often designed starting with APIs, so we’ll start off in the next chapter with OAuth 2, which is designed for protecting APIs.

Key Points

  • Provisioning is the process of creating an account and associated identity information.

  • Applications can create new accounts for users or leverage identities in existing identity provider services.

  • Progressive profiling can be used to build up user profiles over time.

  • Email addresses and other attributes used for notifications to users must be validated.

  • Identities can be classified as weak or strong depending on a provider’s practices.

  • Weak identities are created with unvalidated information.

  • Strong identities are based on validated information and mechanisms to prevent forgery and unauthorized use. They must be issued via secure distribution mechanisms by authoritative providers.

  • In choosing identity providers, a service should match the strength of the identity offered by the provider with the identity validation and strength requirements of an application.

  • A variety of services are available to perform online identity verification which can help prevent fraud and meet legislative requirements for knowing your customer and combatting money laundering.

  • Application designers should decouple and designate appropriate user profile attributes for each of several purposes, including login, display, notification, and internal tracking.

  • To comply with privacy regulations, provisioning processes should include consent management to provide privacy notifications to users and obtain and securely store user consent for the collection and processing of their personal data.

Notes
