In this section, I will discuss additional administration features of the AWS Data Lake solution that are useful for managing the data lake.
User management for the AWS Data Lake
The AWS Data Lake solution includes user management capabilities that let administrators control who can access the data lake and what each user can do. Let's review the key aspects of user management.
Inviting a new user
Here are the steps to add a new user to our data lake:
From the navigation pane, select Users under the Administration section and then click on Invite user.
Enter the user's name, e-mail address, and role.
A Member role has the following permissions:
View and search for all packages in the data lake
Add, remove, and generate manifests for packages
Create, update, and delete packages created by them
Add and remove datasets from the packages created by them
Generate a secret access key if the Administrator has granted them API access
An Admin role has the following permissions in addition to those that a member has:
Create user invitations
Update, disable, enable, and delete data lake users
Create, revoke, enable, and disable a user's API access
Update data lake settings
Create, update, and delete governance settings
Finally, click on Create Invitation to send an invitation to the user. The user now has 7 days to sign in to the data lake, after which the invitation expires.
Figure 7.26: Add user
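The two roles and the invitation window described above can be sketched as a small data model. This is only an illustration of the rules in the text, not code from the data lake solution: the permission labels, the `Invitation` class, and the `can` helper are hypothetical names. It also assumes that an invitation counts as expired exactly seven days after it was created, and that generating a secret access key requires granted API access regardless of role.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical labels for the permission bullets above; these are not
# identifiers used by the data lake solution itself.
MEMBER_PERMISSIONS = frozenset({
    "search_packages",
    "manage_manifests",
    "manage_own_packages",
    "manage_own_datasets",
    "generate_secret_access_key",
})

# An Admin has every Member permission plus the administrative ones.
ADMIN_PERMISSIONS = MEMBER_PERMISSIONS | frozenset({
    "create_invitations",
    "manage_users",
    "manage_api_access",
    "update_settings",
    "manage_governance",
})

ROLE_PERMISSIONS = {"Member": MEMBER_PERMISSIONS, "Admin": ADMIN_PERMISSIONS}

# The text states that an invitation is valid for 7 days.
INVITATION_LIFETIME = timedelta(days=7)


def can(role: str, permission: str, api_access: bool = False) -> bool:
    """Return True if the role holds the permission.

    Assumption: generating a secret access key additionally requires
    that API access has been granted to the user.
    """
    if permission == "generate_secret_access_key" and not api_access:
        return False
    return permission in ROLE_PERMISSIONS[role]


@dataclass(frozen=True)
class Invitation:
    name: str
    email: str
    role: str
    created_at: datetime

    def expires_at(self) -> datetime:
        return self.created_at + INVITATION_LIFETIME

    def is_expired(self, now: datetime) -> bool:
        # Assumption: the invitation expires at exactly the 7-day mark.
        return now >= self.expires_at()


# Example values are made up for illustration.
invite = Invitation(
    name="Jane Doe",
    email="jane@example.com",
    role="Member",
    created_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```

Modeling the Admin set as the union of the Member set and the extra permissions mirrors how the text defines the Admin role "in addition to" the Member role, so a change to Member permissions carries over automatically.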
That completes the steps for adding a user; next, we will look at updating existing users.
Updating an existing user
To change the role or details of an existing user, follow these steps:
From the navigation pane, select Users under the Administration section and then click on the pencil icon next to the user you want to update.
Next, on the Details tab, you can change the user's role, or disable or enable the user.
You can also request an API access key from the API Access tab, as shown here:
Figure 7.27: User API access
General system settings for AWS Data Lake
To access the data lake settings, click on Settings under the Administration section in the left navigation pane. The key items shown on this page are as follows:
Application Url: The main URL for the data lake console
Default Amazon S3 Bucket: The S3 bucket used to store datasets and manifests that are uploaded to the data lake
Amazon Elasticsearch Index: The index created for searching packages in the data lake
Amazon Elasticsearch Kibana Url: The URL used for the Kibana application that comes packaged with the data lake
Audit Logging: Enables or disables audit logging. When audit logging is enabled, all user operations within the data lake are logged to the data lake/audit log in Amazon CloudWatch Logs
Default Search Results Limit: The maximum number of hits returned on the user interface when a search is performed. This is also shown here: