Accessing AWS S3 from applications

AWS S3 is a highly scalable and durable object storage service, and you pay only for the storage you actually use. S3 replicates data across multiple data centers within a region, and cross-region replication can additionally replicate your data across AWS regions. In this recipe, we cover both uploading objects to and downloading objects from AWS S3.

How to do it…

  1. Install the AWS Java SDK.

    In the dependencies section of your Maven pom.xml, add the following dependency for version 1.9.28.1 of the AWS Java SDK:

    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk</artifactId>
        <version>1.9.28.1</version>
    </dependency>
  2. Create a bucket.

    The following sample Java program creates an S3 bucket called laurenceluckinbill in the Singapore (ap-southeast-1) region:

        // Create an S3 bucket.
        public static void createBucket() {

            // Create BasicAWSCredentials with your Access Key Id and Secret Access Key.
            BasicAWSCredentials credentials = new BasicAWSCredentials(
                    "Access Key Id",
                    "Secret Access Key");

            // Create the S3 client.
            AmazonS3Client s3Client = new AmazonS3Client(credentials);

            // Set the endpoint for the Singapore (ap-southeast-1) region.
            s3Client.setEndpoint("s3-ap-southeast-1.amazonaws.com");

            // Create the bucket.
            s3Client.createBucket("laurenceluckinbill");
        }
  3. Upload an object into the S3 bucket. The following sample Java program uploads the Readme.txt file into the bucket called laurenceluckinbill:
        // Upload an object.
        public static void uploadObject() {

            // Create BasicAWSCredentials with your Access Key Id and Secret Access Key.
            BasicAWSCredentials credentials = new BasicAWSCredentials(
                    "Access Key Id",
                    "Secret Access Key");

            // Create the S3 client.
            AmazonS3Client s3Client = new AmazonS3Client(credentials);

            // File to upload (note the escaped backslash in the Windows path).
            File file = new File("D:\\Readme.txt");

            // Upload the object into the bucket.
            s3Client.putObject("laurenceluckinbill", "Readme.txt", file);
        }
  4. Download an object.

    The following sample program downloads the Readme.txt object from a bucket called laurenceluckinbill into a local folder:

        // Download an object.
        public static void downloadObject() {

            // Create BasicAWSCredentials with your Access Key Id and Secret Access Key.
            BasicAWSCredentials credentials = new BasicAWSCredentials(
                    "Access Key Id",
                    "Secret Access Key");

            // Create the S3 client.
            AmazonS3Client s3Client = new AmazonS3Client(credentials);

            // Local file path (note the escaped backslash in the Windows path).
            String path = "D:\\Readme.txt";

            // Download the object.
            s3Client.getObject(new GetObjectRequest("laurenceluckinbill",
                    "Readme.txt"), new File(path));
        }

How it works…

Objects are stored in containers called buckets in S3. In our example, the object is a file called Readme.txt, stored in a bucket called laurenceluckinbill. This object can be accessed using http://laurenceluckinbill.s3.amazonaws.com/Readme.txt. S3 provides both REST and SOAP interfaces to store and retrieve objects, and AWS SDKs are also provided for building applications that use Amazon S3.

You can configure buckets to be created in a specific region to minimize costs or latency, or meet regulatory requirements. While a bucket is the container on S3, an object is the entity stored in Amazon S3. The objects are uniquely addressed by a combination of the endpoint, bucket, key, and version ID.
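The addressing described above can be sketched as plain string construction. This is an illustration only (the class and method names here are not part of the AWS SDK), showing the two common URL styles for the example bucket and object:

```java
// Illustration only: builds the two common S3 object URL styles from a
// bucket name and key. These helpers are not part of the AWS SDK.
public class S3UrlExample {

    // Virtual-hosted-style: the bucket name is part of the hostname.
    public static String virtualHostedUrl(String bucket, String key) {
        return "http://" + bucket + ".s3.amazonaws.com/" + key;
    }

    // Path-style: the bucket name is the first path segment.
    public static String pathStyleUrl(String bucket, String key) {
        return "http://s3.amazonaws.com/" + bucket + "/" + key;
    }

    public static void main(String[] args) {
        // Matches the URL used in this recipe's example.
        System.out.println(virtualHostedUrl("laurenceluckinbill", "Readme.txt"));
        System.out.println(pathStyleUrl("laurenceluckinbill", "Readme.txt"));
    }
}
```

Virtual-hosted-style addressing is what makes DNS-compliant bucket names important, as discussed below.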

First, we install the AWS SDK, as the AWS Java SDK is required to access the AWS services from Java applications.

Next, we create an S3 bucket. We must create a bucket in one of the AWS regions before uploading any data into Amazon S3; we can then upload any number of objects into the bucket. As a best practice, always use DNS-compliant bucket names, as they support virtual-hosted-style access to the buckets.
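A simplified check for DNS-compliant names can be written in a few lines. This is a sketch, not the SDK's own validation, and it covers only the core rules (3-63 characters; lowercase letters, digits, and hyphens; starting and ending with a letter or digit) — the full AWS rules also cover periods and IP-address-like names:

```java
// Simplified DNS-compliance check for bucket names (sketch only; the
// complete AWS naming rules are stricter than this).
public class BucketNameCheck {

    public static boolean isDnsCompliant(String name) {
        // Must be 3 to 63 characters long.
        if (name == null || name.length() < 3 || name.length() > 63) {
            return false;
        }
        // Lowercase letters, digits, and hyphens only;
        // must start and end with a letter or digit.
        return name.matches("[a-z0-9][a-z0-9-]*[a-z0-9]");
    }

    public static void main(String[] args) {
        System.out.println(isDnsCompliant("laurenceluckinbill")); // true
        System.out.println(isDnsCompliant("My_Bucket"));          // false
    }
}
```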

After creating the bucket, you can upload your objects into the bucket or access/download objects from the bucket. In order to upload an object to our bucket, we create an instance of AmazonS3Client, and then execute the putObject method. The putObject method is overloaded; you need to select the appropriate variant depending on whether you are uploading data from a file or a stream. In our example, we are uploading a file from our local drive to our bucket.

The GetObjectRequest object provides several options. In our example, we use it to retrieve the object and write it to a File object on our local drive. Alternatively, the data can be streamed directly from S3, and you can read from it. However, this should be done as quickly as possible, because the network connection remains open until you finish reading and close the input stream.
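The "read quickly and close" pattern can be sketched as follows. With the real SDK, the stream would come from getObject(...).getObjectContent(); here a ByteArrayInputStream stands in so that the example is self-contained, and try-with-resources guarantees the stream is closed promptly:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch of reading an object stream fully and closing it promptly.
// In real code, the InputStream would be an S3ObjectInputStream obtained
// from the SDK; a ByteArrayInputStream stands in here.
public class StreamRead {

    public static String readAll(InputStream in) throws IOException {
        // try-with-resources closes both streams as soon as reading is done.
        try (InputStream s = in;
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = s.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toString("UTF-8");
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream fake = new ByteArrayInputStream("hello from S3".getBytes("UTF-8"));
        System.out.println(readAll(fake));
    }
}
```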

There's more…

Amazon S3 replicates your data across multiple servers, so your object writes, replacements, and deletes may not be reflected immediately. S3 does not support object locking; the latest request wins. If such behavior is unacceptable for your application, then you will need to build this functionality yourself. You can also replicate objects to different AWS regions, filtering by key prefix to replicate specific objects to specific regions for regulatory reasons or to reduce latency.

Each object in Amazon S3 has a storage class associated with it. The two most commonly used storage classes are standard and reduced redundancy. You can reduce your costs by using the S3 Reduced Redundancy Storage option for storing non-critical data. This option replicates the data fewer times than standard S3 storage, and hence the associated costs are lower as well.

You can enable versioning on your buckets. If enabled, Amazon S3 assigns a unique version ID to each of your objects. This helps protect you from unintended overwrites and deletes, and lets you retrieve specific prior versions of your objects.

For uploading large objects, you can use the multipart upload API. When you initiate a multipart upload, S3 returns an upload ID; you include this ID and a part number in each part upload request, where the part number identifies the part and its position in the object being uploaded. After all the parts are uploaded, Amazon S3 assembles the object and makes it available to you. You can also retrieve the entire object or retrieve it in parts.
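The part arithmetic behind a multipart upload can be sketched as below. This is an illustration of the sizing math only (the actual transfer uses the SDK's InitiateMultipartUpload, UploadPart, and CompleteMultipartUpload calls, which are omitted here); S3 requires every part except the last to be at least 5 MB:

```java
// Illustrative arithmetic only: given an object size and a chosen part
// size, how many parts would a multipart upload use?
public class MultipartMath {

    // S3's minimum size for every part except the last.
    static final long MIN_PART_SIZE = 5L * 1024 * 1024; // 5 MB

    public static long partCount(long objectSize, long partSize) {
        if (partSize < MIN_PART_SIZE) {
            throw new IllegalArgumentException("part size below the 5 MB minimum");
        }
        // Ceiling division: the last part may be smaller than partSize.
        return (objectSize + partSize - 1) / partSize;
    }

    public static void main(String[] args) {
        long oneGb = 1024L * 1024 * 1024;
        // A 1 GB object uploaded in 64 MB parts needs 16 parts.
        System.out.println(partCount(oneGb, 64L * 1024 * 1024));
    }
}
```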

You can list your object keys by prefix. Hence, if you choose a common prefix for the names of your related keys, then you can use the list operation to select or browse the keys in a hierarchical manner using the bucket name and the prefix (similar to your local filesystem).
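The hierarchical listing described above can be simulated in plain Java. This sketch reproduces, in memory, how a prefix plus the "/" delimiter groups keys into folder-like "common prefixes"; with the real SDK you would instead call listObjects with a ListObjectsRequest carrying the prefix and delimiter:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// In-memory simulation of prefix/delimiter listing (sketch only).
public class PrefixListing {

    // Returns the distinct "common prefixes" (folder-like groups) found
    // under the given prefix, using the given delimiter.
    public static Set<String> commonPrefixes(List<String> keys,
                                             String prefix, String delimiter) {
        Set<String> result = new LinkedHashSet<>();
        for (String key : keys) {
            if (!key.startsWith(prefix)) {
                continue;
            }
            // Everything up to (and including) the next delimiter is one group.
            int i = key.indexOf(delimiter, prefix.length());
            if (i >= 0) {
                result.add(key.substring(0, i + delimiter.length()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "docs/2015/Readme.txt",
                "docs/2016/Notes.txt",
                "images/logo.png");
        // Groups the docs/ keys by their year "folder".
        System.out.println(commonPrefixes(keys, "docs/", "/"));
    }
}
```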

For deleting objects from Amazon S3, you can use the delete API or the Multi-Object Delete API, depending on whether you are deleting a single object or multiple objects. You need to create an instance of AmazonS3Client, and then execute the deleteObject method. If versioning is not enabled, then the object is deleted; otherwise, the operation puts a delete marker on the object, and the object disappears from the bucket. If the version ID you specify in a delete request maps to a delete marker for that object, then S3 deletes the marker and the object reappears in your bucket.
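The delete-marker behavior can be made concrete with a toy in-memory model. This is purely an illustration of the semantics (the class below is not the SDK; a real versioned bucket is managed by S3 itself): a plain delete pushes a delete marker on top of the version stack, and deleting the marker's version makes the object visible again:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Toy model of versioned-bucket delete semantics (illustration only).
public class VersionedDeleteModel {

    static final String DELETE_MARKER = "<delete-marker>";

    // Each key maps to a stack of versions, newest on top.
    private final Map<String, Deque<String>> versions = new HashMap<>();

    public void put(String key, String content) {
        versions.computeIfAbsent(key, k -> new ArrayDeque<>()).push(content);
    }

    // Delete without a version ID: pushes a delete marker.
    public void delete(String key) {
        versions.computeIfAbsent(key, k -> new ArrayDeque<>()).push(DELETE_MARKER);
    }

    // Delete the latest version by ID (e.g. the delete marker itself):
    // removing the marker makes the object reappear.
    public void deleteLatestVersion(String key) {
        Deque<String> stack = versions.get(key);
        if (stack != null && !stack.isEmpty()) {
            stack.pop();
        }
    }

    // An object is visible unless its latest version is a delete marker.
    public boolean isVisible(String key) {
        Deque<String> stack = versions.get(key);
        return stack != null && !stack.isEmpty()
                && !DELETE_MARKER.equals(stack.peek());
    }
}
```

For example, putting Readme.txt, deleting it (marker pushed, object hidden), and then deleting the marker's version makes the object visible again.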

You can create policies to control access to buckets and objects. The policies govern the creation and deletion of buckets and the listing of their contents. Every request to S3 can be authenticated or anonymous. You can use IAM user access keys or temporary security credentials to access the services. There are two options for protecting your data at rest in S3: server-side encryption and client-side encryption. With server-side encryption, you let Amazon S3 encrypt and decrypt your data. With client-side encryption, you encrypt the data on the client and upload it for storage in S3.

You can host static websites in S3 by configuring your bucket for website hosting. This can be done via the AWS Management Console or using the AWS SDKs. All requests to your registered domain are routed to the appropriate S3 website endpoint. You will also need to create appropriate policies to make your S3 content accessible to the public.

You can configure Amazon S3 notifications for certain S3 events such as object creation, object removal, delete marker created for a versioned object, and so on. These events can be published to an SNS topic, SQS queue, or AWS Lambda function. You can also configure notifications to be filtered by the key prefix.

In addition, Amazon S3 is integrated with CloudWatch, so you can collect and analyze metrics for your S3 buckets. You can also create alarms and send notifications if the threshold is exceeded for a specific S3 metric. API calls to S3 can be tracked via CloudTrail Logs.
