Chapter 16
Optimization

THE AWS CERTIFIED DEVELOPER – ASSOCIATE EXAM TOPICS COVERED IN THIS CHAPTER MAY INCLUDE, BUT ARE NOT LIMITED TO, THE FOLLOWING:

  • Domain 3: Development with AWS Services
  • ✓ 3.4 Write code that interacts with AWS services by using APIs, SDKs, and AWS CLI.

    Content may include the following:

    • Programming AWS APIs
  • Domain 4: Refactoring
  • ✓ 4.1 Optimize application to best use AWS services and features.

    Content may include the following:

    • Cost optimization
    • Performance optimization
    • Best practices for achieving optimization
  • Domain 5: Monitoring and Troubleshooting
  • ✓ 5.1 Write code that can be monitored.

    Content may include the following:

    • Tools for cost monitoring
    • Tools for performance monitoring

Introduction to Optimization

Creating a software system is a lot like constructing a building. If the foundation is not solid, structural problems can undermine the integrity and function of the building. The AWS Well-Architected Tool helps you understand the pros and cons of decisions that you make while building systems on AWS. By using the tool, you will learn architectural best practices for designing and operating reliable, secure, efficient, and cost-effective systems in the AWS Cloud. When architecting technology solutions, if you neglect the five pillars of operational excellence, security, reliability, performance efficiency, and cost optimization, it can become challenging to build a system that delivers on your expectations and requirements. Incorporating these pillars into your architecture helps you to produce stable and efficient systems.

This chapter covers some of the best practices and considerations in designing systems with the most effective use of services and resources to achieve business outcomes at a minimal cost and maintain the optimal performance efficiency.

Cost Optimization: Everyone’s Responsibility

All teams help manage cloud costs, and cost optimization is everyone’s responsibility. Make sure that costs are known from beginning to end, at every level, and from executives to engineers. Ensure that project owners and budget holders know what their upfront and ongoing costs are. Business decision makers must track costs against budgets and understand return on investment (ROI).

Encourage everyone to track their cost optimization daily so that they can establish a habit of efficiency and see the daily impact of their cost savings over time.

Developers’ and engineers’ contributions are a significant part of the organization’s success. Every engineer can be a cost engineer. Engineers should design the code to consume resources only when needed, control the utilization, build sizing into architecture, and tag the resources to optimize usage.

Tagging

By tagging your AWS resources, you can assign custom metadata to instances, images, and other resources. For example, you can categorize resources by owner, purpose, or environment, which helps you organize them and assign cost accountability. When you apply tags to your AWS resources and activate the tags, AWS adds this information to the Cost and Usage reports.

Follow Mandatory Cost Tagging

An effective tagging strategy gives you improved visibility and monitoring, helps you create accurate chargeback and showback models, and lets you extract more granular and precise insights into usage and spending by application and team. The following tag categories help you achieve these goals:

Environment Distinguishes among development, test, and production infrastructure. Specifying an environment tag reduces analysis time, post-processing, and the need to maintain a separate mapping file of production versus nonproduction accounts.

Application ID Identifies resources that are related to a specific application, making it easy to track spending changes and to turn the resources off at the end of a project.

Automation Opt-In/Opt-Out Indicates whether a resource should be included in an automated activity such as starting, stopping, or resizing instances.

Cost Center/Business Unit Identifies the cost center or business unit associated with a resource, typically for cost allocation and tracking.

Owner Used to identify who is responsible for the resource. This is typically the technical owner. If needed, you can add a separate business owner tag. You can specify the owner as an email address. Using email addresses supports automated notifications to both the technical and business owners as required (for example, if the resource is a candidate for elasticity or right sizing).

Tag on Creation

You can make tagging a part of your build process and automate it with AWS management tools, such as AWS Elastic Beanstalk and AWS OpsWorks.

The following AWS CLI sample adds two tags, CostCenter and environment, for an Amazon Machine Image (AMI) and an instance:

aws ec2 create-tags --resources ami-1a2b3c4d i-1234567890abcdef0 --tags Key=CostCenter,Value=123 Key=environment,Value=Production

You can execute management tasks at scale by listing resources with specific tags and then executing the appropriate actions. For example, you can list all the resources with the tag and value of environment:test; then, for each of the resources, delete or terminate the resource. This is useful for automating shutdown or removal of a test environment at the end of the working day. Running reports on tagged and, more importantly, untagged resources enables greater compliance with internal cost management policies.
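The select-then-act pattern described above can be sketched as plain filtering logic. This is an illustration only: the resource records below are a local stand-in for real DescribeInstances output, and the "terminate" step is left as a printed candidate list rather than an API call.

```python
# Sketch of tag-driven cleanup: select resources by tag, report untagged ones.
# The resource list is an in-memory stand-in for results of an AWS describe call.
resources = [
    {"id": "i-0aaa", "tags": {"environment": "test", "Owner": "dev@example.com"}},
    {"id": "i-0bbb", "tags": {"environment": "Production"}},
    {"id": "vol-0ccc", "tags": {}},  # untagged -- flag for compliance reporting
]

def select_by_tag(items, key, value):
    """Return resources whose tags include the given key/value pair."""
    return [r for r in items if r["tags"].get(key) == value]

def untagged(items):
    """Return resources with no tags at all, for compliance reports."""
    return [r for r in items if not r["tags"]]

to_terminate = select_by_tag(resources, "environment", "test")
print([r["id"] for r in to_terminate])        # candidates for end-of-day shutdown
print([r["id"] for r in untagged(resources)]) # resources violating the tag policy
```

In practice the same selection is done server-side with tag filters (for example, `Name=tag:environment,Values=test` in the AWS CLI), so only the act-on-results step runs in your automation.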

Enforce Tag Use

Using AWS Identity and Access Management (IAM) policies, you can enforce tag use to gain precise control over access to resources, ownership, and accurate cost allocation.

The following example policy allows a user to create an Amazon Elastic Block Store (Amazon EBS) volume only if the user applies the tags (Costcenter and environment) that are defined in the policy, using the ForAllValues qualifier. If the user applies any tag that is not included in the policy, the action is denied. To enforce case sensitivity, use the aws:TagKeys condition key as follows:

Effect: Allow
Action: 'ec2:CreateVolume'
Resource: 'arn:aws:ec2:us-east-1:123456789012:volume/*'
Condition:
    StringEquals:
        'aws:RequestTag/costcenter': '115'
        'aws:RequestTag/environment': prod
    'ForAllValues:StringEquals':
        'aws:TagKeys':
            - Costcenter
            - environment
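The same statement, expressed as an IAM JSON policy document (the format you attach through the console or CLI), might look like the following sketch:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ec2:CreateVolume",
      "Resource": "arn:aws:ec2:us-east-1:123456789012:volume/*",
      "Condition": {
        "StringEquals": {
          "aws:RequestTag/costcenter": "115",
          "aws:RequestTag/environment": "prod"
        },
        "ForAllValues:StringEquals": {
          "aws:TagKeys": ["Costcenter", "environment"]
        }
      }
    }
  ]
}
```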

Tagging Tools

The following tools help you manage your tags:

  • AWS Tag Editor—Finds resources with search criteria (including missing and misspelled tags) and enables you to edit tags from the AWS Management Console
  • AWS Config—Identifies resources that do not comply with tagging policies
  • Capital One’s Cloud Custodian (open source)—Ensures tagging compliance and remediation

Reduce AWS Usage

Establish a continuous practice of reviewing your consumption of AWS resources, and understand the factors that contribute to cost. Use the various AWS monitoring tools to provide visibility, control, and cost optimization. Implement oversight best practices to make sure that you are not overspending. As part of your DevOps practice, use dashboards to view your estimated AWS costs, the top services that you use most, and the proportion of your costs that each service contributes. If your monthly bill increases, make sure that it is for the right reason (business growth) and not the wrong reason (waste).

Delete Unnecessary EBS Volumes

Stopping an Amazon Elastic Compute Cloud (Amazon EC2) instance leaves any attached Amazon Elastic Block Store (Amazon EBS) volumes operational. You continue to incur charges for these volumes until you delete them.

Stop Unused Instances

Stop instances used in development and production during hours when these instances are not in use and then start them again when their capacity is needed. Assuming a 50-hour workweek, you can save 70 percent of costs by automatically stopping dev/test/production instances during nonbusiness hours.
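The 70 percent figure falls out of simple hour counting. A quick sanity check, assuming On-Demand billing in which a stopped instance accrues no compute charge:

```python
# Rough savings estimate for stopping instances outside business hours.
hours_per_week = 24 * 7   # 168 billable hours if the instance runs continuously
business_hours = 50       # e.g., 10 hours/day, 5 days/week

savings = 1 - business_hours / hours_per_week
print(f"{savings:.0%}")   # roughly 70% of the compute bill
```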

Delete Idle Resources

Consider the following best practices to reduce costs associated with AWS idle resources, such as unattached Amazon EBS volumes and unused Elastic IP addresses:

  • The easiest way to reduce operational costs is to turn off instances that are no longer being used. If you find instances that have been idle for more than two weeks, it’s safe to stop or even terminate them.
  • Terminating an instance automatically deletes any attached EBS volumes that are marked for deletion on termination (the default for the root volume), and re-provisioning takes effort if the instance is needed again. If you decide to delete an EBS volume, consider storing a snapshot of the volume so that it can be restored later if needed.
  • Spin up instances to test new ideas. If the ideas work, keep the instance for further refinement. If not, spin it down.
  • An Elastic IP address does not incur charges as long as it is associated with an Amazon EC2 instance. If an Elastic IP address is not used, you can avoid charges by releasing the IP address. After you release an IP address, you cannot provision that same Elastic IP address again.

Update Outdated Resources

As AWS releases new services and features, it is a best practice to review your existing architectural decisions to ensure that they remain cost effective and stay evergreen. As your requirements change, be aggressive in decommissioning resources, components, and workloads that you no longer require.

Delete Unused Keys

Each customer master key (CMK) that you create in AWS Key Management Service (AWS KMS), regardless of whether you use it with KMS-generated key material or key material that you import, incurs a cost until you delete it. Before deleting a CMK, you might want to know how many ciphertexts were encrypted under that key. You can review AWS CloudTrail usage logs to learn how a CMK was used in the past, which might help you decide whether you will need it in the future. After you are sure that you want to delete a CMK in AWS KMS, schedule the key deletion.

Delete Old Snapshots

If your architecture suggests a backup policy that takes EBS volume snapshots daily or weekly, then you will quickly accumulate snapshots. To reduce storage costs, check for “stale” snapshots—ones that are more than 30 days old—and delete them. Deleting a snapshot has no effect on the volume. You can use the AWS Management Console or AWS Command Line Interface (AWS CLI) for this purpose.
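The staleness check is just date arithmetic. A minimal sketch, using a local snapshot list as a stand-in for DescribeSnapshots output and a fixed "today" so the example is reproducible:

```python
# Flag snapshots older than 30 days as "stale" candidates for deletion.
from datetime import datetime, timedelta, timezone

now = datetime(2018, 6, 1, tzinfo=timezone.utc)   # fixed clock for the example
snapshots = [
    {"SnapshotId": "snap-0aaa", "StartTime": now - timedelta(days=45)},
    {"SnapshotId": "snap-0bbb", "StartTime": now - timedelta(days=3)},
]

def stale(items, cutoff_days=30):
    """Return IDs of snapshots started before the cutoff."""
    cutoff = now - timedelta(days=cutoff_days)
    return [s["SnapshotId"] for s in items if s["StartTime"] < cutoff]

print(stale(snapshots))   # only the 45-day-old snapshot is flagged
```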

Right Sizing

Right sizing is the process of matching instance types and sizes to performance and capacity requirements at the lowest possible cost. To achieve cost optimization, right sizing must become an ongoing process within your organization. Even if you right size workloads initially, performance and capacity requirements can change over time, which can result in underused or idle resources. New projects and workloads require additional cloud resources. Therefore, if there is no periodic check on right sizing, overprovisioning is the likely outcome.

AWS provides APIs, SDKs, and features that allow resources to be modified as demands change.

The following are examples of how you can change the instance type to match performance and capacity requirements:

  • On Amazon Elastic Compute Cloud (Amazon EC2), you can perform a stop-and-start to allow a change of instance size or instance type.
  • On Amazon EBS, you can increase volume size or adjust performance while volumes are still in use to improve performance through increased input/output operations per second (IOPS) or throughput or to reduce cost by changing the type of volume.

Select the Right Use Case

As you monitor current performance, identify the following usage needs and patterns so that you can take advantage of potential right-sizing options:

Steady state The load remains constant over time, making forecasting simple. Consider using Reserved Instances to gain significant savings.

Variable, but predictable The load changes on a predictable schedule. Consider using AWS Auto Scaling.

Dev, test, production Development, testing, and production environments can usually be turned off outside of work hours.

Temporary Temporary workloads that have flexible start times and can be interrupted are good candidates for Spot Instances instead of On-Demand Instances.

Select the Right Instance Family

When you launch an instance, the instance type that you specify determines the hardware of the host computer used for your instance. Each instance type offers different compute, memory, and storage capabilities, and they are grouped in instance families based on these capabilities. Depending on the AWS offering, you can determine the right instance family for your infrastructure.

Amazon Elastic Compute Cloud

Amazon Elastic Compute Cloud (Amazon EC2) provides a wide selection of instances, which gives you the flexibility to right size the CPU and memory of your compute resources to match capacity needs at the lowest cost. The following are the different options for CPU, memory, and network resources:

General purpose (includes A1, T2, M3, and M4 instance types) A1 instances deliver significant cost savings and are ideally suited for scale-out workloads, such as web servers, containerized microservices, caching fleets, and distributed data stores. T2 instances are a low-cost option that provides a small amount of CPU resources that can be increased in short bursts when additional cycles are available. They are well suited for lower throughput applications, such as administrative applications or low-traffic websites. M3 and M4 instances provide a balance of CPU, memory, and network resources, and they are ideal for running small and midsize databases, more memory-intensive data processing tasks, caching fleets, and backend servers.

Compute optimized (includes the C3 and C4 instance types) This family has a higher ratio of virtual CPUs to memory than the other families and the lowest cost per virtual CPU of all of the Amazon EC2 instance types. Consider compute-optimized instances first if you are running CPU-bound, scale-out applications, such as front-end fleets for high-traffic websites, on-demand batch processing, distributed analytics, web servers, video encoding, and high-performance science and engineering applications.

Memory optimized (includes the X1, R3, and R4 instance types) Designed for memory-intensive applications, these instances have the lowest cost per GiB of RAM of all Amazon EC2 instance types. Use these instances if your application is memory-bound.

Storage optimized (includes the I3 and D2 instance types) Optimized to deliver tens of thousands of low-latency, random input/output operations per second (IOPS) to applications. Storage-optimized instances are best for large deployments of NoSQL databases. I3 instances are designed for I/O-intensive workloads and equipped with super-efficient NVMe SSD storage. These instances can deliver up to 3.3 million IOPS in 4-KB blocks and up to 16 GB per second of sequential disk throughput. D2 or dense storage instances are designed for workloads that require high sequential read and write access to large datasets such as Hadoop, distributed computing, massively parallel processing data warehousing, and log-processing applications.

Accelerated computing (includes the P2, G3, and F1 instance types) Provides access to hardware-based compute accelerators, such as graphics processing units (GPUs) or field programmable gate arrays (FPGAs). Accelerated-computing instances enable more parallelism for higher throughput on compute-intensive workloads.

Amazon Relational Database Service

Similar to Amazon EC2 instances, Amazon Relational Database Service (Amazon RDS) provides options to choose from database instances that are optimized for memory, performance, and I/O.

Standard performance (includes the M3 and M4 instance types) Designed for general-purpose database workloads that do not run many in-memory functions. This family has the most options for provisioning increased IOPS.

Burstable performance (includes T2 instance types) For workloads that require burstable performance capacity.

Memory optimized (includes the R3 and R4 instance types) Optimized for in-memory functions and big data analysis.

Select the Right Instance Compatibility

You can right size an instance by migrating to a different model within the same instance family or by migrating to another instance family. When you're migrating within the same instance family, consider vCPU, memory, network throughput, and ephemeral storage. When you're migrating to a different instance family, make sure that the current and new instance types are compatible in the following ways:

Virtualization type The instances must have the same Linux Amazon Machine Image (AMI) virtualization type (PV AMI versus HVM) and platform (Amazon EC2-Classic versus Amazon EC2-VPC).

Network Instances unsupported in Amazon EC2-Classic must be launched in a virtual private cloud (VPC).

Platform If the current instance type supports 32-bit AMIs, make sure to select a new instance type that also supports 32-bit AMIs (not all Amazon EC2 instance types do).

Using Instance Reservations

Amazon EC2 provides several purchasing options to enable you to optimize your costs based on your needs.

AWS Pricing for Reserved Instances

Amazon EC2 Reserved Instances allow you to commit to usage parameters. To unlock an hourly rate that is up to 75 percent lower than On-Demand pricing, you can commit to a one-year or three-year duration at the time of purchase.

There are three payment options for Reserved Instances:

No Upfront No upfront payment is required, and Reserved Instances are billed monthly. This requires a good payment history with AWS.

Partial Upfront A portion of the cost is paid upfront, and the remaining hours in the term are billed at a discounted hourly rate, regardless of whether the RI is being used.

All Upfront Full payment is made at the start of the term, with no other costs or additional hourly charges incurred for the remainder of the term, regardless of hours used.

Amazon EC2 Reservations

Amazon EC2 Reserved Instances provide a reservation of resources and capacity when used in a specific Availability Zone within an AWS Region:

  • With Reserved Instances, you commit to a period of usage (one or three years) and save up to 75 percent over equivalent On-Demand hourly rates.
  • For applications that have steady state or predictable usage, Reserved Instances can provide significant savings compared to using On-Demand Instances, without requiring a change to your workload.

Convertible Reserved Instances

Convertible Reserved Instances are provided for a one-year or three-year term, and they enable conversion to different instance families, new pricing, different instance sizes, different platforms, or different tenancy during the period. Use Convertible Reserved Instances when you are uncertain about your future instance needs but are still committed to using Amazon EC2 instances for the full term in exchange for a significant discount.

Suppose that you own an Amazon EC2 Reserved Instance for a c4.8xlarge for three years. This Reserved Instance applies to any usage of a Linux/Unix c4 instance with shared tenancy in the same region as the Reserved Instance, such as 1 c4.8xlarge instance, 2 c4.4xlarge instances, or 16 c4.large instances, during this term. This adds flexibility to match the new needs of your workloads:

  • There are no limits to how many times you perform an exchange, as long as the target Convertible Reserved Instance is of an equal or higher value than the Convertible Reserved Instances that you are exchanging.
  • Exchanging Convertible Reserved Instances is free of charge, but you might need to pay a true-up cost if the value is lower than the value of the Reserved Instances for which you’re exchanging. For example, you can convert C3 Reserved Instances to C4 Reserved Instances to take advantage of a newer instance type, or you can convert C4 Reserved Instances to M4 Reserved Instances if your application requires more memory. You can also use Convertible Reserved Instances to take advantage of Amazon EC2 price reductions over time.

Reserved Instance Marketplace

Use the Reserved Instance Marketplace to sell your unused Reserved Instances and buy Reserved Instances from other AWS customers. As your needs change throughout the course of your term, the AWS Marketplace provides an option to buy Reserved Instances for shorter terms and with a wider selection of prices.

Amazon Relational Database Service Reservations

Reserved DB instances are not physical instances; they are a billing discount applied to the use of certain on-demand DB instances in your account. Discounts for reserved DB instances are tied to instance type and AWS Region.

All Reserved Instance types are available for Amazon Aurora, MySQL, MariaDB, PostgreSQL, Oracle, and SQL Server database engines.

  • Reserved Instances can also provide significant cost savings for mission-critical applications that run on Multi-AZ database deployments for higher availability and data durability. Reserved Instances can minimize your costs up to 69 percent over On-Demand rates when used in steady state.
  • Most production applications require database servers to be available 24/7. Consider using Reserved Instances to gain substantial savings if you are currently using On-Demand Instances.
  • Any usage of running DB instances that exceeds the number of applicable Reserved Instances you have purchased is charged at the On-Demand rate. For example, if you own three Reserved Instances with the same database engine and instance type (or instance family, if size flexibility applies) in a given region, the billing system checks each hour to determine how many total running instances match those parameters. If it is three or fewer, you are charged the Reserved Instance rate for each instance running that hour. If more than three are running, you are charged the On-Demand rate for the additional instances.
  • With size flexibility, your Reserved Instance’s discounted rate is automatically applied to usage of any size in the instance family (using the same database engine) for the MySQL, MariaDB, PostgreSQL, and Amazon Aurora database engines and the “Bring your own license” (BYOL) edition of the Oracle database engine. For example, suppose that you purchased a db.m4.2xlarge MySQL Reserved Instance in US East (N. Virginia). The discounted rate of this Reserved Instance can automatically apply to two db.m4.xlarge MySQL instances without you needing to do anything.
  • The Reserved Instance discounted rate also applies to usage of both Single-AZ and Multi-AZ configurations for the same database engine and instance family.

  • Suppose that you purchased a db.r3.large PostgreSQL Single-AZ Reserved Instance in EU (Frankfurt). The discounted rate of this Reserved Instance can automatically apply to 50 percent of the usage of a db.r3.large PostgreSQL Multi-AZ instance in the same region.
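The hourly matching behind size flexibility can be sketched with AWS's standard normalization factors (large = 4, xlarge = 8, 2xlarge = 16), which is how one db.m4.2xlarge Reserved Instance covers two db.m4.xlarge instances. The matching logic below is a simplified illustration, not the billing system itself:

```python
# Size-flexible RI coverage in normalized units.
NORMALIZATION = {"large": 4, "xlarge": 8, "2xlarge": 16}

def units(size, count=1):
    """Normalized units contributed by `count` instances of a size."""
    return NORMALIZATION[size] * count

reserved_units = units("2xlarge")          # one db.m4.2xlarge RI = 16 units
running_units = units("xlarge", count=2)   # two db.m4.xlarge instances = 16 units

covered = min(reserved_units, running_units)   # units billed at the RI rate
on_demand = running_units - covered            # remainder billed On-Demand
print(covered, on_demand)
```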

Using Spot Instances

Amazon EC2 Spot Instances offer spare compute capacity in the AWS Cloud at steep discounts compared to On-Demand Instances.

You can use Spot Instances to save up to 90 percent on stateless web applications, big data, containers, continuous integration/continuous delivery (CI/CD), high performance computing (HPC), and other fault-tolerant workloads. Or, scale your workload throughput by up to 10 times and stay within the existing budget.

Spot Fleets

Use Spot Fleets to request and manage multiple Spot Instances automatically, which provides the lowest price per unit of capacity for your cluster or application, such as a batch-processing job, a Hadoop workflow, or an HPC grid computing job. You can include the instance types that your application can use. You define a target capacity based on your application needs (in units, including instances, vCPUs, memory, storage, or network throughput) and update the target capacity after the fleet is launched. Spot Fleets enable you to launch and maintain the target capacity and to request resources automatically to replace any that are disrupted or manually terminated.

To ensure that you have instance capacity, you can include a request for On-Demand capacity in your Spot Fleet request. If there is capacity, the On-Demand request is fulfilled. If there is capacity and availability, the balance of the target capacity is fulfilled as Spot.

The following example specifies the desired target capacity as 10, of which 5 must be On-Demand capacity. Spot capacity is not specified; it is implied in the balance of the target capacity minus the On-Demand capacity. If there is available Amazon EC2 capacity and availability, Amazon EC2 launches 5 capacity units as On-Demand and 5 capacity units (10−5=5) as Spot.

{
  "IamFleetRole": "arn:aws:iam::1234567890:role/aws-ec2-spot-fleet-tagging-role",
  "AllocationStrategy": "lowestPrice",
  "TargetCapacity": 10,
  "SpotPrice": null,
  "ValidFrom": "2018-04-04T15:58:13Z",
  "ValidUntil": "2019-04-04T15:58:13Z",
  "TerminateInstancesWithExpiration": true,
  "LaunchSpecifications": [],
  "Type": "maintain",
  "OnDemandTargetCapacity": 5,
  "LaunchTemplateConfigs": [
    {
      "LaunchTemplateSpecification": {
        "LaunchTemplateId": "lt-0dbb04d4a6abcabcabc",
        "Version": "2"
      },
      "Overrides": [
        {
          "InstanceType": "t2.medium",
          "WeightedCapacity": 1,
          "SubnetId": "subnet-d0dc51fb"
        }
      ]
    }
  ]
}
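Because TargetCapacity is expressed in units and each override carries a WeightedCapacity, the fleet derives instance counts by dividing units by weight. A minimal sketch of that arithmetic (the weight-2 instance type is illustrative, not part of the request above):

```python
import math

def instances_needed(target_units, weighted_capacity):
    """Instances required to fulfill target_units at the given weight."""
    return math.ceil(target_units / weighted_capacity)

spot_units = 10 - 5   # target capacity minus the On-Demand portion, as above
print(instances_needed(spot_units, 1))   # t2.medium at weight 1: 5 instances
print(instances_needed(spot_units, 2))   # a weight-2 type would need only 3
```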

Amazon EC2 Fleets

With a single API call, Amazon EC2 Fleet enables you to provision compute capacity across different instance types, Availability Zones, and across On-Demand, Reserved Instances, and Spot Instances purchase models to help optimize scale, performance, and cost.

By default, Amazon EC2 Fleet launches the On-Demand option that is at the lowest price. For Spot Instances, Amazon EC2 Fleet provides two allocation strategies: lowest price and diversified. The lowest-price strategy provisions Spot Instances from the pools that provide the lowest price per unit of capacity at the time of the request. The diversified strategy provisions Spot Instances across multiple Spot pools, so you can maintain your fleet's target capacity and increase your application's availability.

Design for Continuity

With Spot Instances, you avoid paying more than the maximum price you specified. If the Spot price exceeds your maximum willingness to pay for a given instance or when capacity is no longer available, your instance is terminated automatically (or stopped or hibernated, if you opt for this behavior on a persistent request).

Spot offers features, such as termination notices, persistent requests, and spot block duration, to help you better track and control when Spot Instances can run and terminate (or stop or hibernate).

Using Termination Notices

If you need to save state, upload final log files, or remove Spot Instances from an Elastic Load Balancing load balancer before interruption, you can use termination notices, which are issued 2 minutes before interruption.

If your instance is marked for termination, the termination notice is stored in the instance’s metadata 2 minutes before its termination time. The notice is accessible at http://169.254.169.254/latest/meta-data/spot/termination-time. The notice includes the time when the shutdown signal will be sent to the instance’s operating system.

Relevant applications on Spot Instances should poll for the termination notice at 5-second intervals, which gives the application almost the entire 2 minutes to complete any needed processing before the instance is terminated and taken back by AWS.
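The polling loop can be sketched as follows. The `fetch` callable is a stand-in for an HTTP GET of the termination-time metadata URL, which returns a timestamp once the instance is marked for interruption (and a 404, modeled here as None, otherwise):

```python
import time

def wait_for_notice(fetch, poll_seconds=5, max_polls=24):
    """Poll for the termination notice; return the timestamp, or None on timeout."""
    for _ in range(max_polls):
        notice = fetch()
        if notice is not None:
            return notice      # start saving state, draining connections, etc.
        time.sleep(poll_seconds)
    return None

# Simulated endpoint: 404 (None) twice, then the termination timestamp.
responses = iter([None, None, "2018-04-04T15:58:13Z"])
notice = wait_for_notice(lambda: next(responses), poll_seconds=0)
print(notice)
```

In a real instance, `fetch` would request http://169.254.169.254/latest/meta-data/spot/termination-time and treat any non-200 response as "no notice yet."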

Using Persistent Requests

You can set your request to remain open so that a new instance is launched in its place when the instance is interrupted. You can also have your Amazon EBS–backed instance stopped upon interruption and restarted when Spot has capacity at your preferred price.

Using Block Durations

You can also launch Spot Instances with a fixed duration (Spot blocks, 1–6 hours), which are not interrupted as the result of changes in the Spot price. Spot blocks can provide savings of up to 50 percent.

You submit a Spot Instance request and use the new BlockDuration parameter to specify the number of hours that you want your instances to run, along with the maximum price that you are willing to pay.

You can submit a request of this type by running the following command:

$ aws ec2 request-spot-instances \
      --block-duration-minutes 360 \
      --instance-count 2 \
      --spot-price "0.25"

Alternatively, you can call the RequestSpotInstances function.

Minimizing the Impact of Interruptions

Because the Spot service can terminate Spot Instances without warning, it is important to build your applications in a way that allows you to make progress even if your application is interrupted. There are many ways to accomplish this, including the following:

Adding checkpoints Add checkpoints that save your work periodically, for example, to an Amazon EBS volume. Another approach is to launch your instances from Amazon EBS–backed AMI.

Splitting up the work By using Amazon Simple Queue Service (Amazon SQS), you can queue up work increments and track work that has already been done.
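The checkpoint-and-queue pattern above can be sketched as a small work loop. A local deque stands in for an Amazon SQS queue and a dict stands in for durable checkpoint storage (an EBS volume or Amazon S3); the interrupt flag simulates a Spot reclamation partway through:

```python
from collections import deque

queue = deque(range(6))        # work increments awaiting processing
checkpoint = {"done": []}      # durable record of completed increments

def process(interrupt_after=None):
    """Drain the queue, checkpointing each finished increment; an optional
    interrupt simulates a Spot termination mid-run."""
    handled = 0
    while queue:
        if interrupt_after is not None and handled == interrupt_after:
            return "interrupted"
        item = queue.popleft()
        checkpoint["done"].append(item)   # persist progress before moving on
        handled += 1
    return "complete"

process(interrupt_after=4)     # instance reclaimed after 4 increments
resumed = process()            # replacement instance picks up the remainder
print(resumed, checkpoint["done"])
```

Because finished work is recorded outside the instance, the replacement resumes from where the interrupted run stopped rather than starting over.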

Using AWS Auto Scaling

Using AWS Auto Scaling, you can scale the workloads in your architecture. It automatically increases the number of resources during demand spikes to maintain performance and decreases capacity during lulls to reduce cost. AWS Auto Scaling is well suited both for applications that have stable demand patterns and for those that experience hourly, daily, or weekly variability in usage.

Amazon EC2 Auto Scaling

Amazon EC2 Auto Scaling helps you scale your Amazon EC2 instances and Spot Fleet capacity up or down automatically according to conditions that you define. AWS Auto Scaling is generally used with Elastic Load Balancing to distribute incoming application traffic across multiple Amazon EC2 instances in an AWS Auto Scaling group. AWS Auto Scaling is triggered using scaling plans that include policies that define how to scale (manual, schedule, and demand spikes) and the metrics and alarms to monitor in Amazon CloudWatch.

CloudWatch metrics are used to trigger the scaling event. These metrics can be standard Amazon EC2 metrics, such as CPU utilization, network throughput, Elastic Load Balancing observed request and response latency, and even custom metrics that might originate from application code on your Amazon EC2 instances.

You can use Amazon EC2 Auto Scaling to increase the number of Amazon EC2 instances automatically during demand spikes to maintain performance and decrease capacity during lulls to reduce costs.

Dynamic Scaling

The dynamic scaling capability of Amazon EC2 Auto Scaling automatically increases or decreases capacity based on load or other metrics. For example, if your CPU utilization spikes above 80 percent (and you have an alarm set up), Amazon EC2 Auto Scaling can add a new instance dynamically, reducing the need to provision Amazon EC2 capacity manually in advance. Alternatively, you could set a target value by using the Request Count Per Target metric from Application Load Balancer, a load balancing option for the Elastic Load Balancing service. Amazon EC2 Auto Scaling then automatically adjusts the number of Amazon EC2 instances as needed to maintain your target.
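The target-tracking idea reduces to simple arithmetic: size the group so the per-instance value of the metric stays near the target. A sketch of that sizing rule, with illustrative numbers and bounds standing in for the group's minimum and maximum capacity:

```python
import math

def desired_capacity(total_requests, target_per_instance, minimum=1, maximum=20):
    """Instance count that keeps per-instance load near the target,
    clamped to the Auto Scaling group's min/max bounds."""
    needed = math.ceil(total_requests / target_per_instance)
    return max(minimum, min(maximum, needed))

print(desired_capacity(900, 100))   # load grew: 9 instances keep ~100 requests each
print(desired_capacity(250, 100))   # demand lull: 3 instances suffice
```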

Scheduled Scaling

Scaling based on a schedule allows you to scale your application ahead of known load changes, such as the start of business hours, thus ensuring that resources are available when users arrive, or in typical development or test environments that run only during defined business hours or periods of time.

You can use APIs to scale the size of resources within an environment (vertical scaling). For example, you could scale up a production system by changing the instance size or class. This can be achieved by stopping and starting the instance and selecting the different instance size or class. You can also apply this technique to other resources, such as EBS volumes, which can be modified to increase size, adjust performance (IOPS), or change the volume type while in use.

Fleet Management

Fleet management refers to the functionality that automatically replaces unhealthy instances in your application, maintains your fleet at the desired capacity, and balances instances across Availability Zones. Amazon EC2 Auto Scaling fleet management ensures that your application is able to receive traffic and that the instances themselves are working properly. When Amazon EC2 Auto Scaling detects a failed health check, it can replace the instance automatically.

Instances Purchasing Options

With Amazon EC2 Auto Scaling, you can provision and automatically scale instances across purchase options, Availability Zones, and instance families in a single application to optimize scale, performance, and cost. You can include Spot Instances with On-Demand and Reserved Instances in a single Auto Scaling group to save up to 90 percent on compute. You have the option to define the desired split between On-Demand and Spot capacity, select which instance types work for your application, and specify preferences for how Amazon EC2 Auto Scaling should distribute the Auto Scaling group capacity within each purchasing model.

Golden Images

A golden image is a snapshot of a particular state of a resource, such as an Amazon EC2 instance, an Amazon EBS volume, or an Amazon RDS DB instance. You can customize an Amazon EC2 instance and then save its configuration by creating an Amazon Machine Image (AMI). You can launch as many instances from the AMI as you need, and they will all include those customizations. A golden image results in faster start times and removes dependencies on configuration services or third-party repositories. This is important in auto-scaled environments in which you want to be able to launch additional resources in response to changes in demand quickly and reliably.

AWS Auto Scaling

AWS Auto Scaling monitors your applications and automatically adjusts capacity of all scalable resources to maintain steady, predictable performance at the lowest possible cost. Using AWS Auto Scaling, you can set up application scaling for multiple resources across multiple services in minutes.

AWS Auto Scaling automatically scales resources for other AWS services, including Amazon ECS, Amazon DynamoDB, Amazon Aurora, Amazon EC2 Spot Fleet requests, and Amazon EC2 Auto Scaling groups.

If you have an application that uses one or more scalable resources and experiences variable load, use AWS Auto Scaling. A good example would be an ecommerce web application that receives variable traffic throughout the day. It follows a standard three-tier architecture with Elastic Load Balancing for distributing incoming traffic, Amazon EC2 for the compute layer, and Amazon DynamoDB for the data layer. In this case, AWS Auto Scaling scales one or more Amazon EC2 Auto Scaling groups and DynamoDB tables that are powering the application in response to the demand curve.

AWS Auto Scaling continually monitors your applications to make sure that they are operating at your desired performance levels. When demand spikes, AWS Auto Scaling automatically increases the capacity of constrained resources so that you maintain a high quality of service.

AWS Auto Scaling bases its scaling recommendations on the most popular scaling metrics and thresholds used for AWS Auto Scaling. It also recommends safe guardrails for scaling by providing recommendations for the minimum and maximum sizes of the resources. This way, you can get started quickly and then fine-tune your scaling strategy over time, allowing you to optimize performance, costs, or balance between them.

The predictive scaling feature uses machine learning algorithms to detect changes in daily and weekly patterns and automatically adjusts its forecasts. This removes the need to adjust AWS Auto Scaling parameters manually as cyclicality changes over time, making AWS Auto Scaling simpler to configure and providing more accurate capacity provisioning. Predictive scaling results in lower cost and more responsive applications.

DynamoDB Auto Scaling

DynamoDB automatic scaling uses the AWS Auto Scaling service to adjust provisioned throughput capacity dynamically on your behalf in response to actual traffic patterns. This enables a table or a global secondary index to increase its provisioned read and write capacity to handle sudden increases in traffic without throttling. When the workload decreases, AWS Auto Scaling decreases the throughput so that you don’t pay for unused provisioned capacity.
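Under the hood this uses the Application Auto Scaling API. A boto3 sketch follows; the table name and capacity bounds are example values, and the 70 percent target is a common starting point rather than a required setting.

```python
def read_scaling_target(table_name, min_cap, max_cap):
    """Parameters for registering a table's read capacity as a scalable
    target with Application Auto Scaling."""
    return {
        "ServiceNamespace": "dynamodb",
        "ResourceId": f"table/{table_name}",
        "ScalableDimension": "dynamodb:table:ReadCapacityUnits",
        "MinCapacity": min_cap,
        "MaxCapacity": max_cap,
    }

def enable_read_autoscaling(table_name, min_cap=5, max_cap=500):
    import boto3  # requires AWS credentials and a region
    aas = boto3.client("application-autoscaling")
    aas.register_scalable_target(
        **read_scaling_target(table_name, min_cap, max_cap)
    )
    aas.put_scaling_policy(
        PolicyName=f"{table_name}-read-target-tracking",
        ServiceNamespace="dynamodb",
        ResourceId=f"table/{table_name}",
        ScalableDimension="dynamodb:table:ReadCapacityUnits",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": 70.0,  # aim for 70% consumed read capacity
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
            },
        },
    )
```

The same pattern applies to write capacity and to global secondary indexes, with the corresponding scalable dimensions.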

Amazon Aurora Auto Scaling

Amazon Aurora automatic scaling dynamically adjusts the number of Aurora Replicas provisioned for an Aurora DB cluster. Aurora automatic scaling is available for both Aurora MySQL and Aurora PostgreSQL. Aurora automatic scaling enables your Aurora DB cluster to handle sudden increases in connectivity or workload. When the connectivity or workload decreases, Aurora automatic scaling removes unnecessary Aurora Replicas so that you don’t pay for unused provisioned DB instances.

Amazon Aurora Serverless is an on-demand, automatic scaling configuration for the MySQL-compatible edition of Amazon Aurora. An Aurora Serverless DB cluster automatically starts up, shuts down, and scales capacity up or down based on your application’s needs. Aurora Serverless provides a relatively simple, cost-effective option for infrequent, intermittent, or unpredictable workloads.

Accessing AWS Auto Scaling

There are several ways to get started with AWS Auto Scaling. You can set up AWS Auto Scaling through the AWS Management Console, with the AWS CLI, or with AWS SDKs.

You can access the features of AWS Auto Scaling using the AWS CLI, which provides commands for use with Amazon EC2, Amazon CloudWatch, and Elastic Load Balancing.

To scale a resource other than Amazon EC2, you can use the Application Auto Scaling API, which allows you to define scaling policies to scale your AWS resources automatically or schedule one-time or recurring scaling actions.

Using Containers

Containers provide a standard way to package your application’s code, configurations, and dependencies into a single object. Containers share an operating system installed on the server and run as resource-isolated processes, ensuring quick, reliable, and consistent deployments, regardless of environment.

Containers provide process isolation that lets you granularly set CPU and memory utilization for better use of compute resources.

Containerize Everything

Containers are a powerful way for developers to package and deploy their applications. They are lightweight and provide a consistent, portable software environment for applications to run and scale effortlessly anywhere.

Use Amazon Elastic Container Service (Amazon ECS) to build all types of containerized applications easily, from long-running applications and microservices to batch jobs and machine learning applications. You can migrate legacy Linux or Windows applications from on-premises to the AWS Cloud and run them as containerized applications using Amazon ECS.

Amazon ECS enables you to use containers as building blocks for your applications by eliminating the need for you to install, operate, and scale your own cluster management infrastructure. You can schedule long-running applications, services, and batch processes using Docker containers. Amazon ECS maintains application availability and allows you to scale your containers up or down to meet your application’s capacity requirements. Amazon ECS is integrated with familiar features like Elastic Load Balancing, EBS volumes, virtual private cloud (VPC), and AWS Identity and Access Management (IAM). Use APIs to integrate and use your own schedulers or connect Amazon ECS into your existing software delivery process.

Containers without Servers

AWS Fargate technology is available with Amazon ECS. With Fargate, you no longer have to select Amazon EC2 instance types, provision and scale clusters, or patch and update each server. You do not have to worry about task placement strategies, such as binpacking or host spread, and tasks are automatically balanced across Availability Zones. Fargate manages the availability of containers for you. You define your application’s requirements, select Fargate as your launch type in the AWS Management Console or AWS CLI, and Fargate takes care of all of the scaling and infrastructure management required to run your containers.

For developers who require more granular, server-level control over the infrastructure, Amazon ECS EC2 launch type enables you to manage a cluster of servers and schedule placement of containers on the servers.

Using Serverless Approaches

Serverless approaches are ideal for applications whose load varies dynamically. Using a serverless approach means no compute costs are incurred when there is no user traffic, while still offering instant scale to meet high demand, such as a flash sale on an ecommerce site or a social media mention that drives a sudden wave of traffic. All of the actual hardware and server software are handled by AWS.

Benefits gained by using AWS Serverless services include the following:

  • No need to manage servers
  • No need to ensure application fault tolerance, availability, and explicit fleet management to scale to peak load
  • No charge for idle capacity

You can focus on product innovation and rapidly construct these applications:

  • Amazon S3 offers a simple hosting solution for static content.
  • AWS Lambda, with Amazon API Gateway, supports dynamic API requests using functions.
  • Amazon DynamoDB offers a simple storage solution for session and per-user state.
  • Amazon Cognito provides a way to handle user registration, authentication, and access control to resources.
  • AWS Serverless Application Model (AWS SAM) can be used by developers to describe the various elements of an application.
  • AWS CodeStar can set up a CI/CD toolchain with a few clicks.

Compared to traditional infrastructure approaches, an application is also often less expensive to develop, deliver, and operate when it has been architected in a serverless fashion. The serverless application model is generic, and it applies to almost any type of application from a startup to an enterprise.

Here are a few examples of application use cases:

  • Web applications and websites
  • Mobile backends
  • Media and log processing
  • IT automation
  • AWS IoT Core backends
  • Webhook systems
  • Chatbots
  • Clickstream and other near real-time streaming data processes

Optimize Lambda Usage

AWS Lambda provides the cloud-logic layer, and with Lambda you can run code for virtually any type of application or backend service, all with zero administration. A variety of events can trigger Lambda functions, enabling developers to build reactive, event-driven systems without managing infrastructure. When there are multiple, simultaneous events, Lambda scales by running more copies of the function in parallel, responding to each individual trigger. As a result, there is no possibility of an idle server or container. The problem of wasted infrastructure expenditures is eliminated by design in architectures that use Lambda functions.

Serverless applications are typically composed of one or more Lambda functions; therefore, monitor the execution duration and configuration of your functions closely.

Consider the following recommendations for optimizing Lambda functions:

Optimal memory size The memory usage for your function is determined per invocation and can be viewed in CloudWatch Logs. By analyzing the Max Memory Used: field in the Invocation report, you can determine whether your function needs more memory or whether you over-provisioned your function’s memory size.
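For example, a small helper (hypothetical, not part of any AWS SDK) can parse the Memory Size and Max Memory Used fields out of a REPORT log line to flag over-provisioned functions:

```python
import re

# A Lambda REPORT line includes fields such as:
# "... Memory Size: 512 MB  Max Memory Used: 85 MB"
_REPORT = re.compile(r"Memory Size: (\d+) MB\s+Max Memory Used: (\d+) MB")

def memory_headroom(report_line):
    """Return (configured_mb, used_mb) parsed from a REPORT line,
    or None if the line doesn't match."""
    m = _REPORT.search(report_line)
    if not m:
        return None
    return int(m.group(1)), int(m.group(2))

def over_provisioned(report_line, threshold=0.5):
    """True when the invocation used less than `threshold` of the
    configured memory (threshold is an arbitrary example cutoff)."""
    parsed = memory_headroom(report_line)
    if parsed is None:
        return False
    size, used = parsed
    return used < size * threshold
```

Run such a check over CloudWatch Logs output to decide whether to lower the function's memory setting (remembering that memory size also scales CPU allocation).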

Language runtime performance If your application use case is both latency-sensitive and susceptible to incurring the initial invocation cost frequently (spiky traffic or infrequent use), consider one of the interpreted languages, such as Node.js or Python.

Optimizing code Much of an application's performance depends on your logic and dependencies. Reuse objects by declaring them in global/static scope, keep HTTP and session connections alive for reuse across invocations, and use the default network environment as much as possible.

Optimizing Storage

AWS storage services are optimized for different storage scenarios—there is no single data storage option that is ideal for all workloads. When evaluating your storage requirements, consider data storage options for each workload separately.

To optimize the storage, you must first understand the performance levels of your workloads. Conduct a performance analysis to measure input/output operations per second, throughput, quick access to your data, durability, sensitivity, size, and budget.

Amazon offers three broad categories of storage services: object, block, and file storage. Each offering is designed to meet a different storage requirement, which gives you flexibility to find the solution that works best for your storage scenarios.

Object Storage

Amazon Simple Storage Service (Amazon S3) is highly durable, general-purpose object storage that works well for unstructured datasets such as media content.

There are multiple tiers of storage: hot, warm, or cold data. In terms of pricing, the colder the data, the cheaper it is to store, and the costlier it is to access when needed.

Standard (STANDARD) This is the best storage option for data that you frequently access. Amazon S3 delivers low latency and high throughput, and it is ideal for use cases such as cloud applications, dynamic websites, content distribution, gaming, and data analytics.

Amazon S3 Standard – Infrequent Access (STANDARD_IA) Use this storage option for data that you access less frequently, such as long-term backups and disaster recovery. It offers cheaper storage over time, but higher charges to retrieve or transfer data.

Amazon S3 Intelligent-Tiering (INTELLIGENT_TIERING) This storage class is designed to optimize the cost by moving data to the most cost-effective access tier automatically without degrading the performance of the application. If an object in the infrequent access tier is accessed, it is automatically moved back to the frequent access tier.

Amazon S3 One Zone-Infrequent Access (ONEZONE_IA) This storage class provides a lower-cost option for infrequently accessed data that requires rapid access. The data is stored in only one Availability Zone (AZ), and it saves up to 20 percent of storage costs as compared to STANDARD_IA. Use this option for storing secondary backups of on-premises data or data that can be easily recreated.

Amazon S3 Glacier (GLACIER) This option is designed for long-term storage of infrequently accessed data, such as end-of-lifecycle, compliance, or regulatory backups. Different methods of data retrieval are available at various speeds and cost. Retrieval can take from a few minutes to several hours.

Amazon S3 Glacier Deep Archive (DEEP_ARCHIVE) This is the lowest-cost class, designed for long-term retention of rarely accessed data. It suits data that is retained for 7 to 10 years and accessed about once or twice a year. When you need the data, you can retrieve it within 12 hours. This storage is ideal for maintaining backups of historical regulatory or compliance data and disaster recovery backups.

Block Storage

Amazon Elastic Block Store (Amazon EBS) volumes provide a durable block-storage option for use with Amazon EC2 instances. Use Amazon EBS for data that requires long-term persistence and quick access at assured levels of performance. There are two types of block storage: solid-state drive (SSD) storage and hard disk drive (HDD) storage.

SSD storage is optimized for transactional workloads wherein performance is closely tied to IOPS. Choose from two SSD volume options:

General Purpose SSD (gp2) Designed for general use and offers a balance between cost and performance.

Provisioned IOPS SSD (io1) Best for latency-sensitive workloads that require specific minimum-guaranteed IOPS. With io1 volumes, you pay separately for Provisioned IOPS, so unless you need high levels of Provisioned IOPS, gp2 volumes are a better match at lower cost.

HDD storage is designed for throughput-intensive workloads, such as data warehouses and log processing. There are two types of HDD volumes:

Throughput Optimized HDD (st1) Best for frequently accessed, throughput-intensive workloads.

Cold HDD (sc1) Designed for less frequently accessed, throughput-intensive workloads.

File Storage

Amazon Elastic File System (Amazon EFS) provides simple, scalable file storage for use with Amazon EC2 instances. Amazon EFS supports any number of instances at the same time. Amazon EFS is designed for workloads and applications such as big data, media-processing workflows, content management, and web serving.

Amazon S3 and Amazon EFS allocate storage based on your usage, and you pay for what you use. However, for EBS volumes, you are charged for provisioned (allocated) storage regardless of whether you use it or not. The key to keeping storage costs low without sacrificing required functionality is to maximize the use of Amazon S3 when possible and use more expensive EBS volumes with provisioned I/O only when application requirements demand it.

Optimize Amazon S3

Perform analysis on data access patterns, create inventory lists, and configure lifecycle policies. Identifying the right storage class and moving less frequently accessed Amazon S3 data to cheaper storage tiers yields considerable savings. For example, by moving data from the STANDARD to STANDARD_IA storage class, you can save up to 60 percent (on a per-gigabyte basis) of Amazon S3 pricing. By moving data that is at the end of its lifecycle and accessed only on rare occasions to Amazon S3 Glacier, you can save up to 80 percent of Amazon S3 pricing.

Storage Management Tools/Features

The following sections detail some of the tools that help to determine when to transition data to another storage class.

Cost Allocation S3 Bucket Tags

To track the storage cost or other criteria for individual projects or groups of projects, label your Amazon S3 buckets using cost allocation tags. A cost allocation tag is a key-value pair that you associate with an S3 bucket. To manage storage data most effectively, you can use these tags to categorize your S3 objects and filter on these tags in your data lifecycle policies.
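A boto3 sketch follows; the bucket, tag keys, and values are examples. Note that put_bucket_tagging replaces any existing tag set on the bucket, so include all tags in each call.

```python
def cost_allocation_tagset(project, team):
    """Key-value cost allocation tags for an S3 bucket; the key names
    here are examples, not required names."""
    return {"TagSet": [
        {"Key": "project", "Value": project},
        {"Key": "team", "Value": team},
    ]}

def tag_bucket(bucket, project, team):
    import boto3  # requires AWS credentials and a region
    boto3.client("s3").put_bucket_tagging(
        Bucket=bucket,
        Tagging=cost_allocation_tagset(project, team),
    )
```

After activating the tags as cost allocation tags in the Billing console, they appear as filterable dimensions in your cost reports.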

Amazon S3 Analytics: Storage Class Analysis

Use this feature to analyze storage access patterns to help you decide when to transition the right data to the right storage class. This feature observes data access patterns to help you determine when to transition less frequently accessed STANDARD storage to the STANDARD_IA storage class.

After storage class analysis observes the infrequent access patterns of a filtered set of data over a period of time, you can use the analysis results to help you improve your lifecycle policies. You can configure storage class analysis to analyze all the objects in a bucket. Alternatively, you can configure filters to group objects together for analysis by common prefix (that is, objects that have names that begin with a common string), by object tags, or by both prefix and tags. You’ll most likely find that filtering by object groups is the best way to benefit from storage class analysis.

You can use the Amazon S3 console, the s3:PutAnalyticsConfiguration REST API, or the equivalent from the AWS CLI or AWS SDKs to configure storage class analysis.
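With boto3, a sketch might look like this; the configuration ID and prefix are illustrative.

```python
def storage_class_analysis(prefix):
    """An analytics configuration that observes access patterns for
    objects under a given prefix."""
    return {
        "Id": f"analysis-{prefix.strip('/') or 'bucket'}",
        "Filter": {"Prefix": prefix},
        "StorageClassAnalysis": {},  # add DataExport here to ship results to S3
    }

def configure_analysis(bucket, prefix):
    import boto3  # requires AWS credentials and a region
    cfg = storage_class_analysis(prefix)
    boto3.client("s3").put_bucket_analytics_configuration(
        Bucket=bucket, Id=cfg["Id"], AnalyticsConfiguration=cfg
    )
```

Results accumulate over time; expect to wait at least 24 hours before the first observations appear.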

Amazon S3 Inventory

This tool audits and reports on the replication and encryption status of your S3 objects on a weekly or monthly basis. This feature provides CSV output files that list objects and their corresponding metadata, and it lets you configure multiple inventory lists for a single bucket, organized by different Amazon S3 metadata tags. You can also query Amazon S3 inventory through standard SQL by using Amazon Athena, Amazon Redshift Spectrum, and other tools, such as Presto, Apache Hive, and Apache Spark.

Amazon CloudWatch

Amazon S3 can also publish storage, request, and data transfer metrics to Amazon CloudWatch. Storage metrics are reported daily; request metrics are available at one-minute intervals for granular visibility. Both can be collected and reported for an entire bucket or a subset of objects (selected via prefix or tags).

Use Amazon S3 Select

Amazon S3 Select enables applications to retrieve only a subset of data from an object by using simple SQL expressions. By using Amazon S3 Select to retrieve only the data needed by your application, you can achieve drastic performance increases—in many cases, you can get as much as a 400 percent improvement.

Following is a Python sample code snippet that shows how to retrieve columns from an object containing data in CSV format. This code snippet retrieves the city and airport code where the country name is like “United States.” If the file has column headers and you set FileHeaderInfo to Use, you can identify columns by name in the SQL expression.

import boto3

s3 = boto3.client('s3')

result = s3.select_object_content(
    Bucket='example-bucket-us-west-2',
    Key='sample-data/airportCodes.csv',
    ExpressionType='SQL',
    Expression="select s.city, s.code from s3object s "
               "where s.\"Country (Name)\" like '%United States%'",
    InputSerialization={'CSV': {'FileHeaderInfo': 'Use'}},
    OutputSerialization={'CSV': {}},
)

Use Amazon Glacier Select

Amazon Glacier Select unlocks an opportunity to query your archived data easily. With Glacier Select, you can filter directly against an Amazon S3 Glacier object by using standard SQL statements.

It works like any other retrieval job, except for having an additional set of parameters (SelectParameters) that you can pass in an initiate job request.

The following is an example of a Python code snippet that shows how to pass an SQL expression under SelectParameters:

jobParameters = {
    "Type": "select",
    "ArchiveId": "ID",
    "Tier": "Expedited",
    "SelectParameters": {
        "InputSerialization": {"csv": {}},
        "ExpressionType": "SQL",
        "Expression": "SELECT * FROM archive WHERE _5='498960'",
        "OutputSerialization": {
            "csv": {}
        }
    }
}

With both Amazon S3 Select and Glacier Select, you can lower your costs and uncover more insights from your data, regardless of which storage tier it is in.

Optimize Amazon EBS

With Amazon EBS, you are paying for provisioned capacity and performance—even if the volume is unattached or has low write activity. To optimize storage performance and costs for Amazon EBS, monitor volumes periodically to identify ones that are unattached or appear to be underutilized or overutilized, and adjust provisioning to match actual usage.

Check Configuration

Follow these configuration guidelines:
  • To achieve the best performance consistently, launch instances as EBS-optimized. For instances that are not EBS-optimized by default, you can enable EBS optimization when you launch them or after they are running.

  • To enable this feature, you can use either the Amazon EC2 console or the AWS CLI. With the AWS CLI, use the --ebs-optimized option with the run-instances command to enable EBS optimization at launch and with the modify-instance-attribute command to enable it for a running instance.

  • Choose an EBS-optimized instance that provides more dedicated EBS throughput than your application needs; otherwise, the Amazon EBS to Amazon EC2 connection becomes a performance bottleneck.
  • New EBS volumes receive their maximum performance the moment that they are available and do not require initialization. However, storage blocks on volumes that were restored from snapshots must be initialized before you can access them. This preliminary action takes time and can significantly increase the latency of an I/O operation the first time each block is accessed.
  • To achieve a higher level of performance for a file system than you can provision on a single volume, create a RAID 0 (zero) array. Consider using RAID 0 when I/O performance is more important than fault tolerance. For example, you could use it with a heavily used database where data replication is already set up separately.

Use Monitoring Tools

AWS offers tools that help you optimize block storage.

Amazon CloudWatch

Amazon CloudWatch automatically collects a range of data points for EBS volumes, and you can then set alarms on volume behavior.

Consider the following important metrics:

BurstBalance When your burst bucket is depleted, volume I/O credits (for gp2 volumes) or volume throughput credits (for st1 and sc1 volumes) are throttled to the baseline. Check the BurstBalance value to determine whether your volume is being throttled for this reason.

VolumeQueueLength If your I/O latency is higher than you require, check VolumeQueueLength to make sure that your application is not trying to drive more IOPS than you have provisioned. If your application requires a greater number of IOPS than your volume can provide, consider using a larger gp2 volume with a higher base performance level or an io1 volume with more Provisioned IOPS to achieve faster latencies.

VolumeReadBytes, VolumeWriteBytes, VolumeReadOps, VolumeWriteOps HDD-backed st1 and sc1 volumes are designed to perform best with workloads that take advantage of the 1,024 KiB maximum I/O size. To determine your volume’s average I/O size, divide VolumeWriteBytes by VolumeWriteOps. The same calculation applies to read operations. If the average I/O size is below 64 KiB, increasing the size of the I/O operations sent to an st1 or sc1 volume should improve performance.
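This calculation is simple enough to sketch directly; the 64 KiB threshold comes from the guidance above, and the helper names are illustrative.

```python
def avg_io_kib(volume_bytes, volume_ops):
    """Average I/O size in KiB over a CloudWatch period, e.g.
    VolumeWriteBytes / VolumeWriteOps."""
    if volume_ops == 0:
        return 0.0
    return volume_bytes / volume_ops / 1024

def should_batch_io(volume_bytes, volume_ops, threshold_kib=64):
    """For st1/sc1 volumes, a small average I/O size suggests batching
    operations into larger ones to improve throughput."""
    return 0 < avg_io_kib(volume_bytes, volume_ops) < threshold_kib
```

Feed in period sums retrieved from CloudWatch (for example, via get_metric_statistics with the Sum statistic) for the write and read metric pairs separately.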

AWS Trusted Advisor

AWS Trusted Advisor is another way for you to analyze your infrastructure to identify unattached, underutilized, and overutilized EBS volumes.

Delete Unattached Amazon EBS Volumes

To find unattached EBS volumes, look for volumes that are available, which indicates that they are not attached to an Amazon EC2 instance. You can also look at network throughput and IOPS to determine whether there has been any volume activity over the previous two weeks, or you can look up the last time the EBS volume was attached. If the volume is in a nonproduction environment, hasn’t been used in weeks, or hasn’t been attached in a month, there is a good chance that you can delete it.
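A boto3 sketch that lists volumes in the available state; the pure helper lets you filter describe_volumes output without AWS access.

```python
def unattached(volumes):
    """Filter describe_volumes output down to volumes in the 'available'
    state, i.e., not attached to any instance."""
    return [v["VolumeId"] for v in volumes if v.get("State") == "available"]

def find_unattached_volumes():
    import boto3  # requires AWS credentials and a region
    ec2 = boto3.client("ec2")
    pages = ec2.get_paginator("describe_volumes").paginate(
        Filters=[{"Name": "status", "Values": ["available"]}]
    )
    return [v["VolumeId"] for page in pages for v in page["Volumes"]]
```

Cross-reference the resulting list with CloudWatch activity and attachment history before deciding what to delete.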

Before deleting a volume, store an Amazon EBS snapshot (a backup copy of an EBS volume) so that the volume can be quickly restored later if needed.

Resize or Change the EBS Volume Type

Identify volumes that are underutilized and downsize them or change the volume type. Monitor the read/write access of EBS volumes to determine whether throughput is low. If you have a current-generation EBS volume attached to a current-generation Amazon EC2 instance type, you can use the elastic volumes feature to change the size or volume type or (for an SSD io1 volume) adjust IOPS performance without detaching the volume.

Follow these tips:

  • For General Purpose SSD gp2 volumes, optimize for capacity so that you’re paying only for what you use.
  • With Provisioned IOPS SSD io1 volumes, pay close attention to IOPS utilization rather than throughput, since you pay for IOPS directly. Provision 10–20 percent above maximum IOPS utilization.
  • You can save by reducing Provisioned IOPS or by switching from a Provisioned IOPS SSD io1 volume type to a General Purpose SSD gp2 volume type.
  • If the volume is 500 GB or larger, consider converting to a Cold HDD sc1 volume to save on your storage rate.
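Two small sketches tie these tips together: a headroom calculator for Provisioned IOPS and an elastic-volumes type change. The 15 percent headroom and the volume ID are illustrative.

```python
def recommended_iops(max_observed_iops, headroom=0.15):
    """Provision 10-20 percent above observed peak IOPS; 15 percent is
    used here as a middle-of-the-range example."""
    return round(max_observed_iops * (1 + headroom))

def switch_to_gp2(volume_id):
    """Use the elastic volumes feature to change an io1 volume to gp2 in
    place, without detaching it (volume ID is a placeholder)."""
    import boto3  # requires AWS credentials and a region
    boto3.client("ec2").modify_volume(VolumeId=volume_id, VolumeType="gp2")
```

After calling modify_volume, the change proceeds through a modifying/optimizing state while the volume stays usable.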

Delete Stale Amazon EBS Snapshots

If you have a backup policy that takes EBS volume snapshots daily or weekly, you will quickly accumulate snapshots. Check for stale snapshots that are more than 30 days old and delete them to reduce storage costs. Deleting a snapshot has no effect on the volume.
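A cleanup sketch with boto3 follows; the 30-day threshold matches the guidance above, and in a real account you would review candidates before deleting.

```python
from datetime import datetime, timedelta, timezone

def is_stale(start_time, now=None, days=30):
    """A snapshot is considered stale once it is older than `days`."""
    now = now or datetime.now(timezone.utc)
    return now - start_time > timedelta(days=days)

def delete_stale_snapshots(owner_id, days=30):
    import boto3  # requires AWS credentials and a region
    ec2 = boto3.client("ec2")
    snapshots = ec2.describe_snapshots(OwnerIds=[owner_id])["Snapshots"]
    for snap in snapshots:
        if is_stale(snap["StartTime"], days=days):
            ec2.delete_snapshot(SnapshotId=snap["SnapshotId"])
```

Exclude snapshots referenced by AMIs or retention policies before running deletions like this in production.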

Optimizing Data Transfer

Optimizing data transfer helps you minimize data transfer costs. Review whether your users are global or local and where your data is located relative to them in order to reduce latency.

  • Use Amazon CloudFront, a global content delivery network (CDN), to locate data closer to users. It caches data at edge locations across the world, which reduces the load on your resources. By using CloudFront, you can reduce the administrative effort in delivering content automatically to large numbers of users globally, with minimum latency. Depending on your application types, distribute your entire website, including dynamic, static, streaming, and interactive content through CloudFront instead of scaling out your infrastructure.
  • Amazon S3 transfer acceleration enables fast transfer of files over long distances between your client and your S3 bucket. Transfer acceleration leverages Amazon CloudFront globally distributed edge locations to route data over an optimized network path. For a workload in an S3 bucket that has intensive GET requests, you should use Amazon S3 with CloudFront.
  • When uploading large files, use multipart uploads with multiple parts uploading at once to help maximize network throughput. Multipart uploads provide the following advantages:
    • Improved throughput—You can upload parts in parallel to improve throughput.
    • Quick recovery from any network issues—Smaller part size minimizes the impact of restarting a failed upload due to a network error.
    • Pause and resume object uploads—You can upload object parts over time. After you initiate a multipart upload, there is no expiry; you must explicitly complete or abort the multipart upload.
    • Begin an upload before you know the final object size—You can upload an object as you are creating it.
  • Using Amazon Route 53, you can reduce latency for your users by serving their requests from the AWS Region for which network latency is lowest. Amazon Route 53 latency-based routing lets you use Domain Name System (DNS) to route user requests to the AWS Region that will give your users the fastest response.
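The multipart-upload advantages above are handled automatically by boto3's managed transfers; a sketch with example threshold and chunk-size values follows.

```python
def part_count(object_size, part_size):
    """Number of parts a multipart upload will use (ceiling division)."""
    return -(-object_size // part_size)

def upload_large_file(filename, bucket, key):
    """Upload with parallel multipart parts; the thresholds below are
    example values, not required settings."""
    import boto3
    from boto3.s3.transfer import TransferConfig
    config = TransferConfig(
        multipart_threshold=8 * 1024 * 1024,  # use multipart above 8 MiB
        multipart_chunksize=8 * 1024 * 1024,  # 8 MiB parts
        max_concurrency=10,                   # parts uploaded in parallel
    )
    boto3.client("s3").upload_file(filename, bucket, key, Config=config)
```

The managed transfer also retries failed parts individually, which gives you the quick-recovery behavior described above without extra code.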

Caching

Caching improves application performance by storing frequently accessed data items in memory so that they can be retrieved without accessing the primary data store. Cached information might include the results of I/O-intensive database queries or the outcome of computationally intensive processing.

When the result set is not found in the cache, the application can calculate it, or retrieve it from a database or an expensive, slowly mutating third-party source, and then store it in the cache for subsequent requests.

Amazon ElastiCache

Amazon ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud. It supports two open-source, in-memory caching engines: Memcached and Redis.

  • The Memcached caching engine is popular for database query results caching, session caching, webpage caching, API caching, and caching of objects such as images, files, and metadata. Memcached is also a great choice to store and manage session data for internet-scale applications in cases wherein persistence is not critical.
  • Redis caching engine is a great choice for implementing a highly available in-memory cache to decrease data access latency, increase throughput, and ease the load off your relational or NoSQL database and application. Redis has disk persistence built in, and you can use it for long-lived data.

Lazy loading is a good caching strategy whereby you populate the cache only when an object is requested by the application, keeping the cache size manageable. Apply a lazy caching strategy anywhere in your application where you have data that is going to be read often but written infrequently. In a typical web or mobile app, for example, a user’s profile rarely changes but is accessed throughout the application.
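A minimal lazy-loading sketch follows, using an in-memory dictionary as the cache; in production the cache would be ElastiCache for Redis or Memcached, and the lookup function would be a real database query.

```python
cache = {}

def expensive_lookup(user_id):
    """Stand-in for a database query for the user's profile."""
    return {"user_id": user_id, "name": f"user-{user_id}"}

def get_profile(user_id):
    """Lazy loading: populate the cache only when a key is first
    requested; later reads are served from memory."""
    key = f"profile:{user_id}"
    if key not in cache:
        cache[key] = expensive_lookup(user_id)  # cache miss: load and store
    return cache[key]
```

With Redis or Memcached, you would also set a time to live (TTL) on each entry so that stale profiles eventually expire.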

Amazon DynamoDB Accelerator (DAX)

Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for Amazon DynamoDB. DAX can reduce read response times from milliseconds to microseconds, even at high throughput, and it adds in-memory acceleration to your DynamoDB tables without requiring you to manage cache invalidation, data population, or clusters.

DAX is ideal for applications that require the fastest possible response times for read operations but that are also cost-sensitive and require repeated reads against a large set of data. For example, consider an ecommerce system running a one-day sale on a popular product that sharply increases demand, or a long-running analysis of regional weather data that could temporarily consume all of the read capacity in a DynamoDB table. Either would negatively impact other applications that must access the same data.

Relational Databases and Amazon DynamoDB

Traditional relational database management system (RDBMS) platforms store data in a normalized relational structure that reduces hierarchical data structures to a set of common elements that are stored across multiple tables.

RDBMS platforms use an ad hoc query language (generally a flavor of SQL) to generate or materialize views of the normalized data to support application-layer access patterns.

A relational database system does not scale well for the following reasons:

  • It normalizes data and stores it on multiple tables that require multiple queries to write to disk.
  • It generally incurs the performance costs of an Atomicity, Consistency, Isolation, Durability (ACID)–compliant transaction system.
  • It uses expensive joins to reassemble required views of query results.

For this reason, when your business requires a low-latency response to high-traffic queries, taking advantage of a NoSQL system generally makes technical and economic sense. Amazon DynamoDB helps solve the problems that limit relational system scalability by avoiding them.

DynamoDB scales well for these reasons:

  • Schema flexibility lets Amazon DynamoDB store complex hierarchical data within a single item.
  • Composite key design lets it store related items close together on the same table.

The following are some recommendations for maximizing performance and minimizing throughput costs when working with Amazon DynamoDB.

Apply NoSQL Design

NoSQL design requires a different mind-set than RDBMS design. For an RDBMS, you can create a normalized data model without thinking about access patterns. You can then extend it later when new questions and query requirements arise.

NoSQL design is different:

  • For DynamoDB, by contrast, design your schema after you know the questions it needs to answer. Understanding the business problems and the application use cases up front is essential.
  • Maintain as few tables as possible in an Amazon DynamoDB application. Most well-designed applications require only one table.

Keep Related Data Together

Keeping related data in proximity has a major impact on cost and performance. Instead of distributing related data items across multiple tables, keep related items in your NoSQL system as close together as possible.

Keep Fewer Tables

In general, maintain as few tables as possible in an Amazon DynamoDB application. Most well-designed applications require only one table, unless there is a specific reason for using multiple tables.

Distribute Workloads Evenly

The optimal usage of a table’s provisioned throughput depends on the workload patterns of individual items and the partition key design.

Designing Partition Keys

The more distinct partition key values that your workload accesses, the more those requests are spread across the partitioned space. In general, you use your provisioned throughput more efficiently as the ratio of partition key values accessed to the total number of partition key values increases.

Table 16.1 provides a comparison of the provisioned throughput efficiency of some common partition key schemas.

Table 16.1 Samples of Partition Key Distributions

Partition Key Value | Uniformity
User ID, where the application has many users | Good
Status code, where there are only a few possible status codes | Bad
Item creation date, rounded to the nearest time period (for example, day, hour, or minute) | Bad
Device ID, where each device accesses data at relatively similar intervals | Good
Device ID, where even if there are many devices being tracked, one is by far more popular than all the others | Bad

If a single table has only a small number of partition key values, consider distributing your write operations across more distinct partition key values. Structure the primary key elements to avoid one “hot” (heavily requested) partition key value that slows the overall performance.

For example, consider a table with a composite primary key. The partition key represents the item’s creation date, rounded to the nearest day. The sort key is an item identifier. On a given day, say 2014-07-09, all of the new items are written to that single partition key value (and corresponding physical partition).

If the table fits entirely into a single partition (considering growth of your data over time) and if your application’s read and write throughput requirements don’t exceed the read and write capabilities of a single partition, your application won’t encounter any unexpected throttling because of partitioning.

Implementing Write Sharding

A better way to distribute writes across a partition key space in DynamoDB is to expand the space: add a random number or a calculated hash suffix to the end of the partition key values, and the writes are randomized across the larger space. A randomizing or hashing strategy can greatly improve write throughput.

For example, in the case of a partition key that represents today’s date in the Order table, suppose that each item has an accessible OrderId attribute and that you most often need to find items by OrderId in addition to date. Before your application writes the item to the table, it could calculate a hash suffix based on the OrderId (for example, (OrderId modulo 200) + 1) and append it to the partition key date. The calculation generates a number between 1 and 200 that is fairly evenly distributed, similar to what the random strategy produces.

With this strategy, the writes are spread evenly across the partition key values and thus across the physical partitions. You can easily perform a GetItem operation for a particular item and date because you can calculate the partition key value for a specific OrderId value.
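The suffix calculation described above can be sketched as follows. The function name and shard count variable are illustrative, using the 200-way split from the example.

```python
NUM_SHARDS = 200  # shard count from the example above

def sharded_partition_key(date, order_id):
    """Append a calculated suffix (1-200) derived from OrderId to the date."""
    suffix = (order_id % NUM_SHARDS) + 1
    return f"{date}.{suffix}"

# Writes for the same day spread across up to 200 partition key values,
# yet a read can recompute the exact key from the OrderId alone.
key = sharded_partition_key("2014-07-09", 17)
print(key)  # → 2014-07-09.18
```

Because the suffix is deterministic rather than random, a GetItem for a known OrderId can reconstruct the full partition key without querying every shard.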

Upload Data Efficiently

Amazon DynamoDB typically partitions your table data across multiple servers. When you load data from another data source, you get better performance if you upload data to all of the allocated servers simultaneously.

For example, suppose that you want to upload user messages to a DynamoDB table that uses a composite primary key with UserID as the partition key and MessageID as the sort key.

You can distribute your upload work by using the sort key to load one item from each partition key value, then another item from each partition key value, and so on.

Every upload in this sequence uses a different partition key value, keeping more DynamoDB servers busy simultaneously and improving your throughput performance.
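The round-robin upload order described above can be sketched with a small helper. This is a conceptual illustration (the function name is hypothetical); it only reorders items so that consecutive writes target different partition key values.

```python
from itertools import zip_longest

def interleave_by_partition_key(messages_by_user):
    """Yield one item per partition key value, then the next, and so on."""
    for row in zip_longest(*messages_by_user.values()):
        for item in row:
            if item is not None:   # some users have fewer messages
                yield item

messages = {
    "U1": [("U1", "M1"), ("U1", "M2")],
    "U2": [("U2", "M1")],
    "U3": [("U3", "M1"), ("U3", "M2")],
}
order = list(interleave_by_partition_key(messages))
print(order)
# Consecutive writes hit different partition key values:
# [('U1', 'M1'), ('U2', 'M1'), ('U3', 'M1'), ('U1', 'M2'), ('U3', 'M2')]
```

Writing in this order keeps more DynamoDB servers busy at once instead of saturating the single partition that holds one user's messages.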

Use Sort Keys for Version Control

Many applications need to maintain a history of item-level revisions for audit or compliance purposes and to be able to retrieve the most recent version easily.

For each new item, create two copies of the item. One copy should have a version-number prefix of zero (for example, v0_) at the beginning of the sort key, and one should have a version-number prefix of one (for example, v1_).

Every time the item is updated, use the next higher version prefix in the sort key of the updated version and copy the updated contents into the item with the version prefix of zero. This means that the latest version of any item can be located easily by using the zero prefix.
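The scheme above can be sketched with an in-memory stand-in for the table, keyed by (partition key, sort key). The helper name is hypothetical; in a real application these two writes would be PutItem calls against DynamoDB.

```python
def versioned_write(item_store, pk, data, current_version):
    """Write the new revision under vN_ and mirror it into the v0_ copy."""
    next_version = current_version + 1
    item_store[(pk, f"v{next_version}_")] = data   # immutable history entry
    item_store[(pk, "v0_")] = data                 # always the latest revision
    return next_version

store = {}
v = versioned_write(store, "doc-1", {"body": "draft"}, 0)
v = versioned_write(store, "doc-1", {"body": "final"}, v)
print(store[("doc-1", "v0_")])  # → {'body': 'final'}  (latest, one read away)
```

Retrieving the latest revision is then a single GetItem on the v0_ sort key, while the v1_, v2_, … items preserve the full audit history.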

Keep the Number of Indexes to a Minimum

Create secondary indexes on attributes that are queried often. Indexes that are seldom used contribute to increased storage and I/O costs without improving application performance.

Choose Projections Carefully

Because secondary indexes consume storage and provisioned throughput, keep the size of the index as small as possible. Also, the smaller the index, the greater the performance advantage compared to querying the full table. Project only the attributes that you regularly request. Every time you update an attribute that is projected in an index, you incur the extra cost of updating the index as well.

Optimize Frequent Queries to Avoid Fetches

To get the fastest queries with the lowest possible latency, project all of the attributes that you expect those queries to return. In particular, if you query a local secondary index for attributes that are not projected, Amazon DynamoDB automatically fetches those attributes from the table, which requires reading the entire item from the table. This introduces latency and additional I/O operations that you can avoid.

Use Sparse Indexes

For any item in a table, Amazon DynamoDB writes a corresponding index entry only if the index sort key value is present in the item. If the sort key doesn’t appear in every table item, the index is said to be sparse.

Sparse indexes are useful for queries over a small subsection of a table. It’s faster and less expensive to query that index than to scan the entire table.

For example, suppose that you have a table in which you store all of your customer orders with the following key attributes:

  • Partition key: CustomerId
  • Sort key: OrderId

To track open orders, you can insert the OrderOpenDate attribute set to the date on which each order was placed and then delete it after the order is fulfilled. If you then create an index on CustomerId (partition key) and OrderOpenDate (sort key), only those orders with OrderOpenDate defined appear in it. That way, when you query the sparse index, the items returned are the orders that are unfulfilled and sorted by the date on which each order was placed.
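The sparse-index behavior can be illustrated in miniature: only items that carry the index sort key attribute appear in the index, already ordered by it. The sample data below is hypothetical.

```python
orders = [
    {"CustomerId": "C1", "OrderId": "O1", "OrderOpenDate": "2020-01-05"},
    {"CustomerId": "C1", "OrderId": "O2"},  # fulfilled: attribute deleted
    {"CustomerId": "C2", "OrderId": "O3", "OrderOpenDate": "2020-01-03"},
]

# Only items with the index sort key are written to the sparse index,
# sorted by OrderOpenDate within each CustomerId partition.
sparse_index = sorted(
    (o for o in orders if "OrderOpenDate" in o),
    key=lambda o: (o["CustomerId"], o["OrderOpenDate"]),
)
print([o["OrderId"] for o in sparse_index])  # → ['O1', 'O3']
```

Deleting the OrderOpenDate attribute when an order is fulfilled removes it from the index automatically, so querying the index returns only open orders.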

Avoid Scans as Much as Possible

In general, Scan operations are less efficient than other operations in DynamoDB. A Scan operation scans the entire table or secondary index. It then filters out values to provide the result you want.

If possible, avoid using a Scan operation on a large table or index with a filter that removes many results. Also, as a table or index grows, the Scan operation slows down. The Scan operation examines every item for the requested values and can use up the provisioned throughput for a large table or index in a single operation.

This usage of capacity units by a scan prevents other, potentially more important, requests for the same table from using the available capacity units. As a result, you will likely get a ProvisionedThroughputExceededException for those requests.

For faster response times, design your tables and indexes so that your applications can use Query instead of Scan. (For tables, you can also consider using the GetItem and BatchGetItem APIs.) GetItem is highly efficient because it provides direct access to the physical location of the item.
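The cost difference between Scan and Query can be made concrete with a toy table keyed by (partition key, sort key). This is purely conceptual; in DynamoDB a Query is a direct key-range read rather than the dictionary filter shown here.

```python
table = {("C1", "O1"): {"total": 10}, ("C1", "O2"): {"total": 25},
         ("C2", "O3"): {"total": 5}}

def scan_with_filter(table, predicate):
    """Scan reads every item (consuming capacity for all), then filters."""
    examined, results = 0, []
    for key, item in table.items():
        examined += 1
        if predicate(key, item):
            results.append(item)
    return results, examined

def query(table, partition_key):
    """Query touches only items under the requested partition key."""
    return [item for (pk, sk), item in table.items() if pk == partition_key]

_, examined = scan_with_filter(table, lambda k, i: k[0] == "C1")
print(examined)                  # → 3 (the whole table is read)
print(len(query(table, "C1")))   # → 2 (only the matching partition)
```

As the table grows, the Scan's `examined` count (and hence consumed capacity) grows with total table size, while the Query's cost tracks only the size of the result.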

Monitoring Costs

When you measure and monitor your users and applications and combine the data you collect with data from AWS monitoring tools, you can perform a gap analysis that tells you how closely aligned your system utilization is to your requirements. By working continually to minimize this utilization gap, you can ensure that your systems are cost effective.

Over time, you can continue to reduce cost with continuous monitoring and tagging. Similar to application development, cost optimization is an iterative process. Because your application and its usage will evolve over time and because AWS iterates frequently and regularly releases new options, it is important to evaluate your solution continuously.

Cost Management Tools

AWS provides tools to help you identify those cost-saving opportunities and keep your resources right-sized. Use these tools to help you access, organize, understand, control, and optimize your costs.

AWS Trusted Advisor

AWS Trusted Advisor is an online tool that provides you with real-time guidance to help you provision your resources following AWS best practices.

Whether you’re establishing new workflows or developing applications, or as part of ongoing improvements, take advantage of the recommendations provided by Trusted Advisor on a regular basis. By reviewing the recommendations, you can look for opportunities to save money.

Here are some Trusted Advisor checks that help you determine how to reduce your bill:

  • Low utilization of Amazon EC2 instances
  • Idle resources, such as load balancers and Amazon RDS DB instances
  • Underutilized Amazon EBS volumes and Amazon Redshift clusters
  • Unassociated Elastic IP addresses
  • Amazon EC2 Reserved Instance optimization and lease expiration
  • Inefficiently configured Amazon Route 53 latency record sets

AWS Cost Explorer

Use the AWS Cost Explorer tool to dive deeper into your cost and usage data to identify trends, pinpoint cost drivers, and detect anomalies. It includes Amazon EC2 usage reports, which let you analyze the cost and usage of your Amazon EC2 instances over the last 13 months. You can analyze your cost and usage data in aggregate (such as total costs and usage across all accounts) down to granular details (for example, m2.2xlarge costs within the Dev account tagged “project: GuardDuty”).

AWS Cost Explorer built-in reports include the following:

Monthly Costs by AWS Service Allows you to visualize the costs and usage associated with your top five cost-accruing AWS services and gives you a detailed breakdown on all services in the table view. The reports let you adjust the time range to view historical data going back up to 12 months to gain an understanding of your cost trends.

Amazon EC2 Monthly Cost and Usage Lets you view all AWS costs over the past two months, in addition to your current month-to-date costs. From there, you can drill down into the costs and usage associated with particular linked accounts, regions, tags, and more.

Monthly Costs by Linked Account Allows you to view the distribution of costs across your organization.

Monthly Running Costs Provides an overview of all running costs over the past three months and provides forecasted numbers for the coming month with a corresponding confidence interval. This report gives you good insight into how your costs are trending and helps you plan ahead.

AWS Cost Explorer Reserved Instance Reports include the following:

RI Utilization Report Visualize the degree to which you are using your existing resources and identify opportunities to improve your Reserved Instance cost efficiencies. The report shows how much you saved by using Reserved Instances, how much you overspent on Reserved Instances, and your net savings from purchasing Reserved Instances during the selected time range. This helps you to determine whether you have purchased too many Reserved Instances.

RI Coverage Report Discover how much of your overall instance usage is covered by Reserved Instances so that you can make informed decisions about when to purchase or modify a Reserved Instance to ensure maximum coverage. These show how much you spent on On-Demand Instances and how much you might have saved had you purchased more reservations. The report enables you to determine whether you have under-purchased Reserved Instances.

AWS Cost Explorer API

Use AWS Cost Explorer API to query your cost and usage data programmatically (using AWS CLI or AWS SDKs). You can query for aggregated data such as total monthly costs or total daily usage. You can also query for granular data, such as the number of daily write operations for Amazon DynamoDB database tables in your production environment. All of the AWS SDKs greatly simplify the process of signing requests and save you a significant amount of time when compared with using the AWS Cost Explorer API.

You can access your Amazon EC2 Reserved Instance purchase recommendations programmatically through the AWS Cost Explorer API. Recommendations for Reserved Instance purchases are calculated based on your past usage and indicate opportunities for potential cost savings.

The following example retrieves recommendations for Partial Upfront Amazon Redshift reserved nodes with a three-year term, based on the last 60 days of Amazon Redshift usage.

Here’s the AWS CLI command:

aws ce get-reservation-purchase-recommendation --service "Amazon Redshift" --lookback-period-in-days SIXTY_DAYS --term-in-years THREE_YEARS --payment-option PARTIAL_UPFRONT

Here’s the output:

{
   "Recommendations": [],
   "Metadata": {
       "GenerationTimestamp": "2018-08-08T15:20:57Z",
       "RecommendationId": "00d59dde-a1ad-473f-8ff2-iexample3330b"
   }
}

AWS Budgets

With AWS Budgets, you can set custom budgets that alert you when your costs or usage exceed (or are forecasted to exceed) your budgeted amount. You can also use AWS Budgets to set Reserved Instance utilization or coverage targets and receive alerts when your utilization drops below the threshold you define. Reserved Instance alerts support Amazon EC2, Amazon RDS, Amazon Redshift, and Amazon ElastiCache reservations.

Budgets can be tracked at the monthly, quarterly, or yearly level, and you can customize the start and end dates. You can further refine your budget to track costs associated with multiple dimensions, such as AWS service, linked account, tag, and others. You can send budget alerts through email or Amazon Simple Notification Service (Amazon SNS) topic. For example, you can set notifications that alert you if you accrue 80, 90, and 100 percent of your actual budgeted costs in addition to a notification that alerts you if you are forecasted to exceed your budget.

AWS Cost and Usage Report

The AWS Cost and Usage Report tracks your AWS usage and provides estimated charges associated with that usage. You can configure this report to present the data hourly or daily. It is updated at least once a day until it is finalized at the end of the billing period. The AWS Cost and Usage report gives you the most granular insight possible into your costs and usage, and it is the source of truth for the billing pipeline. It can be used to develop advanced custom metrics using business intelligence, data analytics, and third-party cost optimization tools.

The AWS Cost and Usage report is delivered automatically to an S3 bucket that you specify, and it can be downloaded directly from there (standard Amazon S3 storage rates apply). It can also be ingested into Amazon Redshift or uploaded to Amazon QuickSight.

Amazon CloudWatch

Amazon CloudWatch is a monitoring service for AWS Cloud resources and the applications you run on AWS. You can use Amazon CloudWatch to collect and track metrics and log files, set alarms, and automatically react to changes in your AWS resources. You can create an alarm to perform one or more of the following actions based on the value of the metric:

  • Automatically stop or terminate Amazon EC2 instances that have gone unused or underutilized for too long
  • Stop your instance if it has an EBS volume as its root device

For example, you may run development or test instances and occasionally forget to shut them off. You can create an alarm that is triggered when the average CPU utilization percentage has been lower than 10 percent for 24 hours, signaling that it is idle and no longer in use. You can create a group of alarms that first sends an email notification to developers whose instance has been underutilized for 8 hours and then terminates that instance if its utilization has not improved after 24 hours.

Amazon CloudWatch Events deliver a near real-time stream of system events that describe changes in AWS resources. Using simple rules, you can route each type of event to one or more targets, such as Lambda functions, Amazon Kinesis streams, and Amazon SNS topics.

AWS Cost Optimization Monitor

AWS Cost Optimization Monitor is an automated reference deployment solution that processes detailed billing reports to provide granular metrics that you can search, analyze, and visualize in a customizable dashboard. The solution uploads detailed billing report data automatically to Amazon Elasticsearch Service (Amazon ES) for analysis and leverages its built-in support for Kibana, enabling you to visualize the first batch of data as soon as it’s processed.

The default dashboard is configured to show specific cost and usage metrics. All of these metrics, as listed here, were selected based on best practices observed across AWS customers:

  • Amazon EC2 Instances Running per Hour
  • Total Cost
  • Cost by Tag Key: Name
  • Cost by Amazon EC2 Instance Type
  • Amazon EC2 Elasticity
  • Amazon EC2 Hours per Dollar Invested

Cost Optimization: Amazon EC2 Right Sizing

Amazon EC2 Right Sizing is an automated AWS reference deployment solution that uses managed services to perform a right-sizing analysis and offer detailed recommendations for more cost-effective instances. The solution analyzes two weeks of utilization data to provide detailed recommendations for right sizing your Amazon EC2 instances.

Monitoring Performance

After you have implemented your architecture, monitor its performance so that you can remediate any issues before your customers are aware of them. Use monitoring metrics to raise alarms when thresholds are breached. The alarm can trigger automated action to work around any components with poor performance.

AWS provides tools that you can use to monitor the performance, reliability, and availability of your resources on the AWS Cloud.

Amazon CloudWatch

Amazon CloudWatch is essential to performance efficiency: it provides system-wide visibility into resource utilization, application performance, and operational health.

You can create an alarm to monitor any Amazon CloudWatch metric in your account. For example, you can create alarms on an Amazon EC2 instance CPU utilization, Elastic Load Balancing request latency, Amazon DynamoDB table throughput, or Amazon SQS queue length.

In the following example, AWS CLI is used to create an alarm to send an Amazon SNS email message when CPU utilization exceeds 70 percent:

aws cloudwatch put-metric-alarm --alarm-name cpu-mon --alarm-description "Alarm when CPU exceeds 70 percent" --metric-name CPUUtilization --namespace AWS/EC2 --statistic Average --period 300 --threshold 70 --comparison-operator GreaterThanThreshold --dimensions "Name=InstanceId,Value=i-12345678" --evaluation-periods 2 --alarm-actions arn:aws:sns:us-east-1:111122223333:MyTopic --unit Percent

Here are a few examples of when and how alarms are sent:

  • Sends an email message using Amazon SNS when the average CPU use of an Amazon EC2 instance exceeds a specified threshold for a specified number of consecutive periods
  • Sends an email when an instance exceeds 10 GB of outbound network traffic per day
  • Stops an instance and sends a text message (SMS) when outbound traffic exceeds 1 GB per hour
  • Stops an instance when memory utilization reaches or exceeds 90 percent so that application logs can be retrieved for troubleshooting

AWS Trusted Advisor

AWS Trusted Advisor inspects your AWS environment and makes recommendations that help to improve the speed and responsiveness of your applications.

The following are a few Trusted Advisor checks that can improve the performance of your service. Trusted Advisor checks your service limits, ensures that you take advantage of provisioned throughput, and monitors for overutilized instances:

  • Amazon EC2 instances that are consistently at high utilization can indicate optimized, steady performance, but this check can also indicate that an application does not have enough resources.
  • Provisioned IOPS (SSD) volumes that are attached to an Amazon EC2 instance that is not Amazon EBS–optimized. Amazon EBS volumes are designed to deliver the expected performance only when they are attached to an EBS-optimized instance.
  • Amazon EC2 security groups with a large number of rules.
  • Amazon EC2 instances that have a large number of security group rules.
  • Amazon EBS magnetic volumes (standard) that are potentially overutilized and might benefit from a more efficient configuration.
  • CloudFront distributions for alternate domain names with incorrectly configured DNS settings.
  • Some HTTP request headers, such as Date or User-Agent, significantly reduce the cache hit ratio. This increases the load on your origin and reduces performance because CloudFront must forward more requests to your origin.

Summary

In this chapter, you learned about the following:

  • Cost-optimizing practices
  • Right sizing your infrastructure
  • Optimizing using Reserved Instances, Spot Instances, and AWS Auto Scaling
  • Optimizing storage and data transfer
  • Optimizing using NoSQL database (Amazon DynamoDB)
  • Monitoring your costs and performance
  • Tools, such as AWS Trusted Advisor, Amazon CloudWatch, and AWS Budgets

Achieving an optimized system is a continual process. An optimized system uses all the provisioned resources efficiently and achieves your business goal at the lowest price point. Engineers must know the cost of deploying resources and how to architect for cost optimization. Practice eliminating waste, and bring accountability into every step of the build process. Use mandatory cost tags on all of your resources to gain precise insights into usage. Define IAM policies to enforce tag usage, and use tagging tools, such as AWS Config and AWS Tag Editor, to manage tags. Be cost-conscious: reduce usage by terminating unused instances, and delete old snapshots and unused keys.

Right size your infrastructure by matching instance types and sizes, and set periodic checks to ensure that the initial provision remains optimum as your business changes over time. With Amazon EC2, you can choose the combination of instance types and sizes most appropriate for your applications. Amazon RDS instances are also optimized for memory, performance, and I/O.

Amazon EC2 Reserved Instances provide you with a significant discount (up to 75 percent) compared to On-Demand Instance pricing. Using Convertible Reserved Instances, you can change instance families, OS types, and tenancies while benefitting from Reserved Instance pricing. Reserved Instance Marketplace allows you to sell the unused Reserved Instances or buy them from other AWS customers, usually at lower prices and shorter terms. With size flexibility, discounted rates for Amazon RDS Reserved Instances are automatically applied to the usage of any size within the instance family.

Spot Instances provide an additional option for obtaining compute capacity at a reduced cost and can be used along with On-Demand and Reserved Instances. Spot Fleets enable you to launch and maintain the target capacity and to request resources automatically to replace any that are disrupted or manually terminated. Using termination notices and persistent requests in your application design helps to maintain continuity in the event of interruptions.

AWS Auto Scaling automatically scales your application if it experiences variable load and uses one or more scalable resources, such as Amazon ECS services, Amazon DynamoDB tables, Amazon Aurora replicas, Amazon EC2 Spot Fleet requests, and Amazon EC2 Auto Scaling groups. Predictive Scaling uses machine learning models to forecast daily and weekly patterns. Amazon EC2 Auto Scaling enables you to scale in response to demand and known load schedules, and it supports provisioning instances across purchase options, Availability Zones, and instance families to optimize both performance and cost.

Containers provide process isolation and improve the resource utilization. Amazon ECS lets you easily build all types of containerized applications and launch thousands of containers in seconds with no additional complexity. With AWS Fargate technology, you can manage containers without having to provision or manage servers. It enables you to focus on building and running applications, not the underlying infrastructure.

AWS Lambda takes care of receiving events or client invocations and then instantiates and runs the code. That means there’s no need to manage servers. Serverless services have built-in automatic scaling, availability, and fault tolerance. These features allow you to focus on product innovation and rapidly construct applications, such as web applications, websites, web-hooked systems, chatbots, and clickstream.

AWS storage services are optimized to meet different storage requirements. Use the Amazon S3 analytics feature to analyze storage access patterns to help you decide when to transition the right data to the right storage class and to yield considerable savings. Monitor Amazon EBS volumes periodically to identify ones that are unattached or appear to be underutilized or overutilized, and adjust provisioning to match actual usage.

Optimizing data transfer ensures that you minimize data transfer costs. Use options such as Amazon CloudFront, Amazon S3 transfer acceleration, and Amazon Route 53 to let data reach Regions faster and reduce latency issues.

NoSQL database systems like Amazon DynamoDB use alternative models for data management, such as key-value pairs or document storage. DynamoDB enables you to offload the administrative burdens of operating and scaling a distributed database so that you don’t have to worry about hardware provisioning, setup and configuration, replication, software patching, or cluster scaling. Follow best practices, such as distributing data evenly, effective partition and sort keys usage, efficient data scanning, and using sparse indexes for maximizing performance and minimizing throughput costs, when working with Amazon DynamoDB.

AWS provides several tools to help you identify those cost-saving opportunities and keep your resources right-sized. AWS Trusted Advisor inspects your AWS environment to identify idle and underutilized resources and provides real-time insight into service usage to help you improve system performance and save money. Amazon CloudWatch collects and tracks metrics, monitors log files, sets alarms, and reacts to changes in AWS resources automatically. AWS Cost Explorer checks patterns in AWS spend over time, projects future costs, identifies areas that need further inquiry, and provides Reserved Instance recommendations.

Optimization is an ongoing process. Always stay current with the pace of AWS new releases, and assess your existing design solutions to ensure that they remain cost-effective.

Exam Essentials

Know the importance of tagging. By using tags, you can assign metadata to AWS resources. This tagging makes it easier to manage, search for, and filter resources in billing reports and automation activities and when setting up access controls.

Know about various tagging tools and how to enforce the tag rules. With AWS Tag Editor, you can add tags to multiple resources at once, search for the resources that you want to tag, and then add, remove, or edit tags for the resources in your search results. AWS Config identifies resources that do not comply with tagging policies. You can use IAM policy conditions to force the usage of tags while creating the resources.

Know the fundamental practices in reducing usage. Follow the best practices of cost optimization in every step of your build process, such as turning off unused resources, spinning up instances only when needed, and spinning them down when not in use. Use tagging to help with cost allocation. Use Amazon EC2 Spot Instances and Reserved Instances where appropriate, and use alerts, notifications, and cost-management tools to stay on track.

Know the various usage patterns for right sizing. By understanding your business use case and backing up the analysis with performance metrics, you can choose the most appropriate options, such as steady state; variable; predictable, but temporary; and development, test, and production usage.

Know the various instance families for right sizing and the corresponding use cases. Amazon EC2 provides a wide selection of instances to match capacity needs at the lowest cost and comes with different options for CPU, memory, and network resources. The families include General Purpose, Compute Optimized, Memory Optimized, Storage Optimized, and Accelerated Computing.

Know Amazon EC2 Auto Scaling benefits and how this feature can make your solutions more optimized and highly available. AWS Auto Scaling is a fast, easy way to optimize the performance and costs of your applications. It makes smart scaling decisions based on your preferences and automatically maintains performance even when your workloads are periodic, unpredictable, or continuously changing.

Know how to create a single AWS Auto Scaling group to scale instances across different purchase options. You can provision and automatically scale Amazon EC2 capacity across different Amazon EC2 instance types, Availability Zones, and On-Demand, Reserved Instances, and Spot purchase options in a single AWS Auto Scaling group. You can define the desired split between On-Demand and Spot capacity, select which instance types work for your application, and specify preferences for how Amazon EC2 Auto Scaling should distribute the AWS Auto Scaling group capacity within each purchasing model.
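The split between purchase options is expressed through the group's mixed instances policy. The following hedged sketch builds the `MixedInstancesPolicy` structure that `create_auto_scaling_group` accepts; the launch template name and instance types are illustrative placeholders, and no AWS call is made:

```python
# Hypothetical policy: keep 1 On-Demand instance as a base, run 20% of the
# remaining capacity On-Demand, and fill the rest with Spot Instances drawn
# from several interchangeable instance types.
def mixed_instances_policy(on_demand_base: int = 1, on_demand_pct: int = 20) -> dict:
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": "my-template",  # hypothetical name
                "Version": "$Latest",
            },
            # Several instance types deepen the Spot pools the group can draw from.
            "Overrides": [{"InstanceType": t} for t in ("m5.large", "m4.large", "c5.large")],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": on_demand_base,
            "OnDemandPercentageAboveBaseCapacity": on_demand_pct,
            "SpotAllocationStrategy": "capacity-optimized",
        },
    }

policy = mixed_instances_policy()
print(policy["InstancesDistribution"])
```

With boto3 you would pass this dictionary as the `MixedInstancesPolicy` argument of `create_auto_scaling_group`, alongside the group name, subnets, and capacity limits.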

Know how block, object, and file storages are different. Block storage is commonly dedicated, low-latency storage for each host, and it is provisioned with each instance. Object storage is developed for the cloud, has vast scalability, is accessed over the web, and is not directly attached to an instance. File storage enables accessing shared files as a file system.

Know key CloudWatch metrics available to measure the Amazon EBS efficiency and how to use them. CloudWatch metrics are statistical data that you can use to view, analyze, and set alarms on the operational behavior of your volumes. Depending on your needs, set alarms and response actions that correspond to each data point. For example, if your I/O latency is higher than you require, check the metric VolumeQueueLength to make sure that your application is not trying to drive more IOPS than you have provisioned. Review and learn more about the available metrics that help optimize the block storage.
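For example, per-operation read latency for a volume can be derived from two of the published EBS metrics, VolumeTotalReadTime (total seconds spent on reads in the period) and VolumeReadOps (completed reads in the same period). The arithmetic is simple enough to sketch:

```python
# Illustrative math only: derive average read latency per operation from the
# CloudWatch metrics VolumeTotalReadTime (seconds) and VolumeReadOps (count)
# collected over the same period.
def avg_read_latency_ms(volume_total_read_time_s: float, volume_read_ops: float) -> float:
    if volume_read_ops == 0:
        return 0.0  # no reads in the period; avoid dividing by zero
    return volume_total_read_time_s / volume_read_ops * 1000.0

# 2.5 s of cumulative read time across 5,000 reads -> 0.5 ms per read.
print(avg_read_latency_ms(2.5, 5000))
```

If the derived latency is higher than expected, that is the cue to check VolumeQueueLength, as described above.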

Know tools and features that help in efficient data transfer. Using Amazon CloudFront, you can locate data closer to users and reduce administrative efforts to minimize data transfer costs. Amazon S3 Transfer Acceleration enables fast data transfer over an optimized network path. Use the multipart upload file option while uploading a large file to improve network throughput.
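A multipart upload splits the object into parts that can be uploaded in parallel. S3 requires each part except the last to be at least 5 MB and allows at most 10,000 parts per upload; the following sketch (assumed part size of 8 MB, which is illustrative) plans a part layout within those limits:

```python
import math

MIN_PART_SIZE = 5 * 1024 * 1024   # S3 minimum part size (all parts but the last)
MAX_PARTS = 10_000                # S3 maximum number of parts per upload

def plan_parts(object_size: int, part_size: int = 8 * 1024 * 1024):
    """Return (part_count, last_part_size_bytes) for a multipart upload."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size below the S3 minimum of 5 MB")
    count = math.ceil(object_size / part_size)
    if count > MAX_PARTS:
        raise ValueError("increase the part size: S3 allows at most 10,000 parts")
    last = object_size - (count - 1) * part_size
    return count, last

# A 500 MB object with 8 MB parts splits into 63 parts; the last is smaller.
print(plan_parts(500 * 1024 * 1024))
```

In practice the AWS SDKs (for example, boto3's managed transfer utilities) perform this planning and the parallel part uploads for you once the object crosses the multipart threshold.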

Know key differences between RDBMS and NoSQL databases to design efficient solutions using Amazon DynamoDB. Unlike relational databases, DynamoDB offers schema flexibility and the ability to store related items together, which makes it well suited to changing business needs and to workloads that outgrow the scalability of a single relational database.

Know the importance of distributing the data evenly when designing DynamoDB tables. Use provisioned throughput more efficiently by making the partition key more distinct. That way, data spreads throughout the provisioned space. Use the sort key with the partition key to make a unique key to achieve better performance while uploading data simultaneously.

Know the different ways to read data from DynamoDB tables to avoid scans. DynamoDB provides Query and Scan actions to read data from a table and does not support table joins. DynamoDB provides the GetItem action for retrieving an item by its primary key. GetItem is highly efficient because it provides direct access to the physical location of the item. A Scan operation always reads the entire table and can consume large amounts of system resources.
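The cost difference is easy to see with a local analogy (plain Python, not the AWS SDK): GetItem behaves like a hash lookup that touches one item, while Scan must examine every item and consumes read capacity in proportion to table size:

```python
# A 10,000-item stand-in for a DynamoDB table keyed by primary key.
table = {f"order-{i}": {"id": f"order-{i}", "total": i * 10} for i in range(10_000)}

def get_item(pk: str):
    """Direct access by primary key, analogous to DynamoDB's GetItem."""
    return table.get(pk)

def scan(filter_fn):
    """Examines every item, analogous to Scan with a filter expression."""
    examined, matches = 0, []
    for item in table.values():
        examined += 1                 # every item is read, match or not
        if filter_fn(item):
            matches.append(item)
    return matches, examined

item = get_item("order-42")                         # touches one item
matches, examined = scan(lambda i: i["total"] == 420)
print(item["total"], len(matches), examined)        # the scan read all 10,000
```

In real DynamoDB the difference is billed: a Scan consumes read capacity for every item it examines, even the ones the filter expression discards.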

Know the AWS Cost Management tools and their features. AWS provides tools to help you manage, monitor, and, ultimately, optimize your costs. Use AWS Cost Explorer for deeper dives into the cost drivers. Use AWS Trusted Advisor to inspect your AWS infrastructure to identify overutilized or idle resources. AWS Budgets enables you to set custom cost and usage budgets and receive alerts when budgets approach or exceed the limits. There are a wide range of tools to explore, such as AWS Cost Optimization – Amazon EC2 Right Sizing, and monitoring tools to identify additional savings opportunities.

Know how the AWS Trusted Advisor features help in saving costs and improving the performance of your solutions. AWS Trusted Advisor scans your AWS environment, compares it to AWS best practices, and makes recommendations for saving money, improving system performance, and more. Cost Optimization recommendations highlight unused and underutilized resources. Performance recommendations help to improve the speed and responsiveness of your applications.

Know how to evaluate the reporting details in the AWS Cost Explorer default reports. Cost Explorer provides you with default reports: Cost and Usage reports and Reserved Instance reports. Cost and Usage reports include your daily costs and monthly costs by service, listing the top five services. The Reserved Instance Utilization reports show how much of your reservations you have actually used, which helps you determine whether you have purchased too many Reserved Instances. The Reserved Instance Coverage reports show how many of your instance hours are covered by Reserved Instances, how much you spent on On-Demand Instances, and how much you might have saved had you purchased more reservations. This enables you to determine whether you have under-purchased Reserved Instances.

Know how to extract recommendations using AWS Cost Explorer API. The Cost Explorer API allows you to use either AWS CLI or SDKs to query your cost and usage data. You can query for aggregated data, such as total monthly costs or total daily usage. You can also query for granular data, such as the number of daily write operations for DynamoDB database tables in your production environment.
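The shape of such a query can be sketched as follows. With boto3 you would pass these arguments to `client("ce").get_cost_and_usage()`; here the request is only built and inspected, so no AWS call (or credentials) is needed:

```python
from datetime import date, timedelta

def monthly_cost_request(days_back: int = 30) -> dict:
    """Build a GetCostAndUsage request grouping unblended cost by service."""
    end = date.today()
    start = end - timedelta(days=days_back)
    return {
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        "GroupBy": [{"Type": "DIMENSION", "Key": "SERVICE"}],
    }

req = monthly_cost_request()
print(req["Granularity"], req["TimePeriod"]["Start"])
```

Swapping `Granularity` to `DAILY` and the `GroupBy` dimension to, say, a cost allocation tag is how you would drill down to the granular examples the text mentions.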

Know the key Amazon CloudWatch metrics and how to set alarms. With Amazon CloudWatch, you can observe CPU utilization, network throughput, and disk I/O, and match the observed peak metrics to a new and cheaper instance type. You choose a CloudWatch metric and a threshold for the alarm to watch. The alarm goes into the ALARM state when the metric breaches the threshold for a specified number of evaluation periods. Use the Amazon CloudWatch console, AWS CLI, or AWS SDKs to create and manage alarms.
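The evaluation-period rule can be modeled in a few lines. This is a deliberately simplified sketch (real CloudWatch also supports "M out of N" datapoint rules and configurable missing-data treatment): the alarm trips only after the threshold is breached for N consecutive periods, here N = 2 to match the exercise later in this chapter:

```python
# Simplified model of CloudWatch alarm evaluation: transition to ALARM only
# after `evaluation_periods` consecutive datapoints breach the threshold.
def evaluate_alarm(datapoints, threshold=70.0, evaluation_periods=2):
    state, streak, history = "OK", 0, []
    for value in datapoints:
        streak = streak + 1 if value > threshold else 0
        state = "ALARM" if streak >= evaluation_periods else "OK"
        history.append(state)
    return history

# One isolated spike is not enough; two consecutive breaches trip the alarm.
cpu = [45.0, 80.0, 50.0, 85.0, 90.0, 60.0]
print(evaluate_alarm(cpu))
```

This is why a single CPU spike does not page anyone in Exercise 16.1: with `--evaluation-periods 2` and `--period 300`, the breach must persist for two consecutive five-minute periods.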

Know how AWS Lambda integrates with other AWS serverless services to build cost-effective solutions. AWS Lambda provides the cloud-logic layer and integrates seamlessly with the other serverless services to build virtually any type of application or backend service. For example, Amazon S3 automatically triggers Lambda functions when an object is created, copied, or deleted. Lambda functions can process Amazon SQS messages.
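A minimal handler for such an S3 trigger might look like the following sketch. The event shape follows the documented S3 notification format; the bucket and object names are made up for illustration, and the handler can be exercised locally with a sample event:

```python
import urllib.parse

def handler(event, context=None):
    """Collect s3:// URIs for every object in an S3 event notification."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in event payloads (spaces arrive as '+').
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        processed.append(f"s3://{bucket}/{key}")
    return processed

# Hypothetical sample event, trimmed to the fields the handler reads.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-example-bucket"},
                "object": {"key": "uploads/report+2024.csv"}}}
    ]
}
print(handler(sample_event))
```

An Amazon SQS-triggered function is structurally similar: each entry in `Records` carries a message `body` instead of an `s3` section.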

Resources to Review

Exercises

Symbol of Note Before you begin this task, you must first create an SNS topic (name: myHighCpuAlarm) and subscribe to it.

Exercise 16.1

Set Up a CPU Usage Alarm Using AWS CLI

In this exercise, you will use the AWS CLI to create a CPU usage alarm that sends an email message using Amazon SNS when the CPU usage exceeds 70 percent.

  1. Set up an SNS topic with the name myHighCpuAlarm and subscribe to it. For more information, see this article:

    https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/US_SetupSNS.html

  2. Create an alarm using the put-metric-alarm command as follows:
    aws cloudwatch put-metric-alarm \
        --alarm-name cpu-mon \
        --alarm-description "Alarm when CPU exceeds 70%" \
        --metric-name CPUUtilization \
        --namespace AWS/EC2 \
        --statistic Average \
        --period 300 \
        --threshold 70 \
        --comparison-operator GreaterThanThreshold \
        --dimensions Name=InstanceId,Value=i-12345678 \
        --evaluation-periods 2 \
        --alarm-actions arn:aws:sns:us-east-1:111122223333:myHighCpuAlarm \
        --unit Percent

Symbol of Note For Windows, replace the backslash (\) Unix continuation character at the end of each line with a caret (^).

  3. Test the alarm by forcing an alarm state change using the set-alarm-state command.
    1. Change the alarm state from INSUFFICIENT_DATA to OK.
      aws cloudwatch set-alarm-state --alarm-name cpu-mon --state-reason "initializing" --state-value OK
    2. Change the alarm state from OK to ALARM.
      aws cloudwatch set-alarm-state --alarm-name cpu-mon --state-reason "initializing" --state-value ALARM
    3. Check that you have received an email notification about the alarm.

Using AWS CLI, you created a CPU alarm that sends an email notification when CPU usage exceeds 70 percent. You tested it by manually changing its alarm state to ALARM.

Exercise 16.2

Modify Amazon EBS Optimization for a Running Instance

In this exercise, you will use the Amazon EC2 console to enable the optimization for a running instance by modifying its Amazon EBS optimized instance attribute.

  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.
  2. In the navigation pane, click Instances, and select the instance.
  3. Choose Actions ➢ Instance State ➢ Stop.
  4. In the confirmation dialog box, choose Yes, Stop.

    It can take a few minutes for the instance to stop.

  5. With the instance still selected, choose Actions ➢ Instance Settings, and then choose Change Instance Type.
  6. In the Change Instance Type dialog box, do one of the following:
    1. If the instance type of your instance is Amazon EBS–optimized, EBS-optimized is selected by default, and you cannot change it. Choose Cancel.
    2. If the instance type of your instance supports Amazon EBS optimization, select EBS-optimized, and then choose Apply.
    3. If the instance type of your instance does not support Amazon EBS optimization, select an instance type from Instance Type that supports Amazon EBS optimization, select EBS-optimized, and then choose Apply.
  7. Choose Actions ➢ Instance State ➢ Start.

You enabled the EBS optimization feature for a running Amazon EC2 instance using the AWS Management Console.

Symbol of Warning When you stop an instance, the data on any instance store volumes is erased. To keep data in instance store volumes, back it up to persistent storage.

Exercise 16.3

Create an AWS Config Rule

In this exercise, using the AWS Management Console, you will create an AWS Config rule to monitor whether Elastic IP addresses are attached to Amazon EC2 instances.

  1. Create an Elastic IP address to be used as part of this exercise, but do not attach it to any Amazon EC2 instance. See the following for instructions:

    https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/elastic-ip-addresses-eip.html#using-instance-addressing-eips-releasing

  2. Open the AWS Config console at https://console.aws.amazon.com/config/.
  3. Choose Get Started Now.
  4. On the Settings page, for Resource types to record, select All resources.
  5. For Amazon S3 Bucket, select the Amazon S3 bucket to which AWS Config sends configuration history and configuration snapshot files.
  6. For Amazon SNS Topic, select whether AWS Config streams information by selecting the Stream configuration changes and notifications to an Amazon SNS topic.
  7. For Topic Name, type a name for your SNS topic.
  8. For Bucket Name, type a name for your Amazon S3 bucket.
  9. For AWS Config role, choose the IAM role that grants AWS Config permission to record configuration information and send this information to Amazon S3 and Amazon SNS.
  10. Choose Create AWS Config service-linked role, and then Next.
  11. On the AWS Config Rules page, in the search bar, enter eip to find a specific rule in the list.
  12. Select the eip-attached rule.
  13. Choose Next and then Confirm.

    AWS Config will run this rule against your resources. The rule flags the unattached EIP as non-compliant.

  14. Delete the AWS Config rule.
  15. Release the Elastic IP address. See the following for instructions:

    https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/elastic-ip-addresses-eip.html#using-instance-addressing-eips-releasing

From the AWS Config console, you used AWS Config to create a rule to determine whether an Elastic IP address is attached to an Amazon EC2 instance.

Exercise 16.4

Create a Launch Configuration and an AWS Auto Scaling Group, and Schedule a Scaling Action

In this exercise, using AWS Management Console, you will create a launch configuration and AWS Auto Scaling policy, and verify the scheduled scaling action.

  1. To create a launch configuration, complete the following steps:
    1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.
    2. On the navigation pane, under AWS Auto Scaling, choose Launch Configurations. On the next page, choose Create launch configuration.
    3. On the Choose AMI page, select your custom AMI.
    4. On the Choose Instance Type page, select a hardware configuration for your instance and choose Next: Configure details. Configure the remaining details.
    5. On the Configure Details page, do the following:
      1. For Name, type a name for your launch configuration.
      2. For Advanced Details ➢ IP Address Type, select Assign a public IP address to every instance.
    6. Choose Skip to review.
    7. On the Review page, choose Edit security groups. Follow the instructions to choose an existing security group, and then choose Review.
    8. On the Review page, choose Create launch configuration.
    9. For Select an existing key pair or create a new key pair page, select one of the listed options.
    10. Select the acknowledgment check box, and then choose Create launch configuration.
  2. To create an AWS Auto Scaling group, complete the following steps:
    1. Select Create an AWS Auto Scaling group using this launch configuration.
    2. On the Create AWS Auto Scaling Group page, follow these steps:
      1. For Group name, enter a name for your AWS Auto Scaling group.
      2. For Group size, enter 1 as the initial number of instances for your AWS Auto Scaling group.
      3. For Network, select the default VPC.
      4. For Subnet, select one or more subnets from the listed subnets.
    3. Choose Next: Configure scaling policies.
    4. On the Configure scaling policies page, select Keep this group at its initial size and then choose Review.
    5. On the Review page, choose Create AWS Auto Scaling group.
    6. On the AWS Auto Scaling group creation status page, choose Close.
  3. To schedule an AWS Auto Scaling action and verify that it’s working, complete the following steps:
    1. Select your AWS Auto Scaling group.
    2. On the Scheduled Actions tab, choose Create Scheduled Action.
    3. On the Create Scheduled Action page, follow these steps:
      1. For Name, type a name for the action.
      2. For Max, type 2.
      3. For Desired Capacity, type 2.
      4. For Start Time, select the current day for Date (UTC), and enter the current UTC time plus 2 minutes.
    4. Select Save.
    5. Select the Instances tab, refresh the tab in the next two minutes, and observe that a new Amazon EC2 instance was created.

In this exercise, you created a launch configuration and an AWS Auto Scaling group that uses the launch configuration you just created. To test whether automatic scaling is working, you added a scheduled scaling action that launches a new Amazon EC2 instance by increasing capacity. You also verified that a new instance was added to the current capacity.

 

Review Questions

  1. You are developing an application that will run across dozens of instances. It uses some components from a legacy application that requires some configuration files to be copied from a central location and held on a volume local to each of the instances. You plan to modify your application with a new component in the future that will hold this configuration in Amazon DynamoDB. Which storage option should you use in the interim to provide the lowest cost and the lowest latency for your application to access the configuration files?

    1. Amazon S3
    2. Amazon EBS
    3. Amazon EFS
    4. Amazon EC2 instance store
  2. Similar to SQL, Amazon DynamoDB provides several operations for reading the data. Which operation is the most efficient way to retrieve a single item?

    1. Query
    2. Scan
    3. GetItem
    4. Join
  3. AWS Trusted Advisor offers a rich set of best practice checks and recommendations across five categories: cost optimization, security, fault tolerance, performance, and service limits. Which of the following checks is NOT under the Cost Optimization and Performance categories?

    1. Amazon EBS Provisioned IOPS (SSD) volume attachment configuration
    2. Amazon CloudFront header forwarding and cache hit ratio
    3. Amazon EC2 Availability Zone balance
    4. Unassociated Elastic IP address
  4. Which of the following common partition schemas includes a partition key design that distributes I/O requests evenly across partitions and uses provisioned I/O capacity of an Amazon DynamoDB table efficiently?

    1. Status code, where there are only a few possible status codes
    2. User ID, where the application has many users
    3. Item creation date, rounded to the nearest time period
    4. Device ID, where even if there are many devices tracked, one is by far more popular than all the others
  5. You are developing an application that consists of a set of Amazon EC2 instances hosting a web layer and a database hosting a MySQL instance. You are required to add a layer that can be used to ensure that the most frequently accessed data from the database is fetched in a faster and more efficient manner. Which of the following can be used to store the frequently accessed data?

    1. Amazon Simple Queue Service (Amazon SQS) queue
    2. Amazon Simple Notification Service (Amazon SNS) topic
    3. Amazon CloudFront distribution
    4. Amazon ElastiCache instance
  6. You have an application deployed to the AWS platform. The application makes requests to an Amazon Simple Storage Service (Amazon S3) bucket. After monitoring the Amazon CloudWatch metrics, you notice that the number of GET requests has suddenly spiked. Which of the following can be used to optimize Amazon S3 cost and performance?

    1. Add Amazon ElastiCache in front of the S3 bucket.
    2. Use Amazon DynamoDB instead of Amazon S3.
    3. Place an Amazon CloudFront distribution in front of the S3 bucket.
    4. Place an Elastic Load Balancing load balancer in front of the S3 bucket.
  7. You are writing an application that will store data in an Amazon DynamoDB table. The ratio of read operations to write operations will be 1,000 to 1, with the same data being accessed frequently. Which feature or service should you enable on the DynamoDB table to optimize performance and minimize costs?

    1. Amazon DynamoDB Auto Scaling
    2. Amazon DynamoDB cross-region replication
    3. Amazon DynamoDB Streams
    4. Amazon DynamoDB Accelerator
  8. A developer is migrating an on-premises web application to the AWS Cloud. The application currently runs on a 32-processor server and stores session state in memory. On Mondays, the server runs at 80 percent CPU utilization, but at only about 5 percent CPU utilization at other times. How should the developer change the code to optimize running in the AWS Cloud?

    1. Store session state on the Amazon EC2 instance store.
    2. Encrypt the session state in memory.
    3. Store session state in an Amazon ElastiCache cluster.
    4. Compress the session state in memory.
  9. A company is using an ElastiCache cluster in front of their Amazon RDS instance. The company would like you to implement logic into the code so that the cluster retrieves data from Amazon RDS only when there is a cache miss. Which strategy can you implement to achieve this?

    1. Error retries
    2. Lazy loading
    3. Exponential backoff
    4. Write-through
  10. Your application will be hosted on an Amazon EC2 instance, which will be part of an AWS Auto Scaling group. The application must fetch the private IP of the instance. Which of the following can achieve this?

    1. Query the instance metadata.
    2. Query the instance user data.
    3. Have the application run ifconfig.
    4. Have an administrator get the IP address from the Amazon EC2 console.
  11. You just developed code in AWS Lambda that uses recursive functions. You see some throttling errors in the metrics. Which of the following should you do to resolve the issue?

    1. Use API Gateway to call the recursive code.
    2. Use versioning for the recursive function.
    3. Place the recursive function in a separate package.
    4. Avoid using recursive code in your function.
  12. A production application is making calls to an Amazon Relational Database Service (Amazon RDS) instance. The application’s reporting module is experiencing heavy traffic, causing performance issues. How can the application be optimized to alleviate this issue?

    1. Move the database to Amazon DynamoDB, and point the reporting module to the new DynamoDB table.
    2. Enable Multi-AZ for the database, and point the reporting module to the secondary database.
    3. Enable read replicas for the database, and point the reporting module to the read replica.
    4. Place an Elastic Load Balancing load balancer in front of the reporting part of the application.
  13. Your application uses Amazon S3 buckets. You have users in other countries accessing objects in those buckets. What can you do to reduce latency for those users outside of your country?

    1. Host a static website.
    2. Change the storage class.
    3. Enable cross-region replication.
    4. Enable encryption.
  14. You have an application that uploads objects to Amazon S3 between 200–500 MB. The process takes longer than expected, and you want to improve the performance of the application. Which of the following would you consider?

    1. Enable versioning on the bucket.
    2. Use the multipart upload API.
    3. Write the items in batches for better performance.
    4. Create multiple threads to upload the objects.
  15. You must bootstrap your application script to instances that are launched inside an AWS Auto Scaling group. Which is the most optimal way to achieve this?

    1. Create a Lambda function to install the script.
    2. Place a scheduled task on the instance that starts on boot.
    3. Place the script in the instance user data.
    4. Place the script in the instance metadata.