Getting started with Amazon Redshift

This section walks through a few simple steps that will give you a fully functioning Amazon Redshift cluster in a matter of minutes:

  1. First, a few prerequisite steps need to be completed before we begin the actual setup of the Redshift cluster. From the AWS Management Console, use the Filter option to search for IAM. Alternatively, you can launch the IAM dashboard directly from this URL: https://console.aws.amazon.com/iam/.
  2. Once logged in, we need to create and assign a role that grants our Redshift cluster read-only access to Amazon S3 buckets. This role will come in handy later in this chapter, when we load some sample data into an Amazon S3 bucket and use Amazon Redshift's COPY command to copy that data into the Redshift cluster for processing. To create the custom role, select the Roles option from the IAM dashboard's navigation pane.
  3. On the Roles page, select the Create role option. This brings up a simple wizard with which we will create the role and associate the required permissions with it.
  4. Select the Redshift option under the AWS Service group section and choose the Redshift - Customizable option under the Select your use case field. Click Next to proceed with the setup.
  5. On the Attach permissions policies page, filter for and select the AmazonS3ReadOnlyAccess policy. Once done, select Next: Review.
  6. On the final Review page, type in a suitable name for the role and select the Create Role option to complete the process. Make a note of the role's ARN, as we will need it in later steps. Here is a snippet of the role policy for your reference:
{ 
  "Version": "2012-10-17", 
  "Statement": [ 
    { 
      "Effect": "Allow", 
      "Action": [ 
        "s3:Get*", 
        "s3:List*" 
      ], 
      "Resource": "*" 
    } 
  ] 
} 
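The same role can also be created from a script rather than the console. What follows is only a minimal sketch, assuming the boto3 library is installed and AWS credentials are already configured. The function names and the role name passed in are hypothetical and not part of the original walkthrough, and the trust policy shown (allowing redshift.amazonaws.com to assume the role) is the one the console wizard is assumed to generate when the Redshift use case is selected.

```python
import json

# AWS-managed policy selected on the Attach permissions policies page.
S3_READ_ONLY_POLICY_ARN = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"


def redshift_trust_policy():
    """Return a trust policy that lets the Redshift service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "redshift.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_redshift_s3_role(role_name, iam_client=None):
    """Create the role, attach S3 read-only access, and return the role ARN.

    This calls AWS. `role_name` is a hypothetical name chosen by the caller.
    """
    if iam_client is None:
        import boto3  # imported lazily so the helper above works without it

        iam_client = boto3.client("iam")
    response = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(redshift_trust_policy()),
        Description="Read-only S3 access for Amazon Redshift",
    )
    iam_client.attach_role_policy(
        RoleName=role_name, PolicyArn=S3_READ_ONLY_POLICY_ARN
    )
    return response["Role"]["Arn"]
```

The returned ARN is the value to note down for the cluster setup. Passing in a client explicitly makes it easy to try the function against a stub before touching a real account.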

With the role created, we can now move on to creating the Redshift cluster.

  1. To do so, log in to the AWS Management Console and use the Filter option to search for Amazon Redshift. Alternatively, you can launch the Redshift dashboard directly from this URL: https://console.aws.amazon.com/redshift/.
  2. Select Launch Cluster to get started with the process.
  3. Next, on the CLUSTER DETAILS page, fill in the required information for your cluster, as described in the following list:
    • Cluster identifier: A suitable name for your new Redshift cluster. Note that this name must be lowercase.
    • Database name: A suitable name for your Redshift database. You can always create more databases within a single Redshift cluster at a later stage. If no value is provided, a database named dev is created by default.
    • Database port: The port number on which the database will accept connections. The default value is 5439; however, you can change it based on your security requirements.
    • Master user name: Provide a suitable username for accessing the database.
    • Master user password: Type in a strong password with at least one uppercase character, one lowercase character, and one numeric value. Confirm the password by retyping it in the Confirm password field.
  4. Once completed, hit Continue to move on to the next step of the wizard.
  5. On the NODE CONFIGURATION page, select the appropriate Node type for your cluster, as well as the Cluster type, based on your functional requirements. Since this particular cluster setup is for demonstration purposes, I've opted for dc2.large as the Node type and a Single Node deployment with 1 compute node. Click Continue to move on to the next page once done.
Note that the cluster you are about to launch will be live and will not run in a sandbox-like environment. As a result, you will incur the standard Amazon Redshift usage fees for the cluster until you delete it. You can read more about Redshift's pricing at: https://aws.amazon.com/redshift/pricing/.
  6. On the ADDITIONAL CONFIGURATION page, you can configure add-on settings, such as enabling encryption, selecting the default VPC for your cluster, deciding whether the cluster should have direct internet access, and stating any preference for a particular Availability Zone in which the cluster should operate. Most of these settings do not require any changes at the moment and can be left at their default values.
  7. The only change required on this page is associating the previously created IAM role with the cluster. To do so, from the Available Roles drop-down list, select the custom Redshift role that we created in the prerequisite steps. Once completed, click on Continue.
  8. Review the settings and changes on the Review page and select the Launch Cluster option when completed.
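The wizard settings described above can also be collected in a script. The following is a minimal sketch, assuming boto3 and configured AWS credentials. The function names, the example identifier, user name, and password are all hypothetical, and the checks only mirror the rules stated in this section (lowercase identifier, password with an uppercase character, a lowercase character, and a digit), not every constraint AWS enforces.

```python
def is_valid_master_password(password):
    """Check the rule from the wizard: one uppercase, one lowercase, one digit."""
    return (
        any(c.isupper() for c in password)
        and any(c.islower() for c in password)
        and any(c.isdigit() for c in password)
    )


def build_cluster_request(identifier, master_user, master_password, role_arn,
                          db_name="dev", port=5439, node_type="dc2.large"):
    """Build keyword arguments for a single-node cluster, as in the demo setup."""
    if identifier != identifier.lower():
        raise ValueError("cluster identifier must be lowercase")
    if not is_valid_master_password(master_password):
        raise ValueError("password needs an uppercase, a lowercase and a digit")
    return {
        "ClusterIdentifier": identifier,
        "DBName": db_name,
        "Port": port,
        "MasterUsername": master_user,
        "MasterUserPassword": master_password,
        "NodeType": node_type,
        "ClusterType": "single-node",
        "IamRoles": [role_arn],  # the role ARN noted in the prerequisite steps
    }


def launch_cluster(request, redshift_client=None):
    """Send the request to AWS. This creates a live, billable cluster."""
    if redshift_client is None:
        import boto3  # imported lazily so the builder above works without it

        redshift_client = boto3.client("redshift")
    return redshift_client.create_cluster(**request)
```

Keeping the request builder separate from the call that launches the cluster means the settings can be reviewed, much like the wizard's Review page, before anything is created.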

The cluster takes a few minutes to spin up, depending on whether you opted for a single-node or a multi-node deployment. Once it is ready, you should see your cluster listed on the Clusters page, as shown in the following screenshot. Ensure that the status of your cluster is shown as healthy under the DB Health column. You can also make a note of the cluster's endpoint, which you will need to access the cluster programmatically.
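The cluster's status and endpoint can also be read from a script instead of the console. The sketch below assumes boto3 and configured AWS credentials. The function names are hypothetical, and the code relies on the response shape of the Redshift describe_clusters call (a Clusters list whose entries carry ClusterStatus and, once the cluster is up, an Endpoint with Address and Port).

```python
def summarize_cluster(describe_response):
    """Extract status and endpoint from a describe_clusters response."""
    cluster = describe_response["Clusters"][0]
    # The Endpoint entry can be missing while the cluster is still being created.
    endpoint = cluster.get("Endpoint") or {}
    return {
        "status": cluster["ClusterStatus"],
        "host": endpoint.get("Address"),
        "port": endpoint.get("Port"),
    }


def wait_for_cluster(identifier, redshift_client=None):
    """Block until the cluster is available, then return its summary."""
    if redshift_client is None:
        import boto3  # imported lazily so the helper above works without it

        redshift_client = boto3.client("redshift")
    waiter = redshift_client.get_waiter("cluster_available")
    waiter.wait(ClusterIdentifier=identifier)
    response = redshift_client.describe_clusters(ClusterIdentifier=identifier)
    return summarize_cluster(response)
```

The host and port returned here are the same endpoint details shown on the Clusters page, and they are what a SQL client needs in order to connect.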

With the cluster all set up, the next thing to do is connect to it. In the next section, we will look at a few simple steps you can take to connect to your newly deployed Redshift cluster.
