Using Amazon ElastiCache

Amazon ElastiCache is a managed cache service that improves latency and throughput for read-intensive applications. Instead of querying the database each time, you place a caching layer in front of the database layer to get higher query performance.

Each node runs an instance of either Memcached or Redis; all nodes in a cluster are of the same instance type and run the same caching engine. You basically launch the cluster, get the node names, and then connect your client to them. No changes are required in your application code to access ElastiCache, and you can use your existing Redis and Memcached client libraries to connect to ElastiCache Redis and Memcached clusters, respectively.

Amazon is responsible for tasks such as provisioning hardware, installing the caching software, patch management, failure detection, and recovery. The supported in-memory caching engines are Memcached and Redis. ElastiCache clusters are only accessible from EC2 instances; the cache cluster and the EC2 instances that access it must be in the same VPC. If you want to access the cache cluster from outside its VPC, then you will need to set up an EC2 instance inside the cache cluster's VPC to act as a proxy for the outside world.

How to do it…

  1. Before creating the ElastiCache cluster, you need to create a cache subnet group. Execute the following command to create a cache subnet group named appcachesubnetgroup:
    $ aws elasticache create-cache-subnet-group \
    --cache-subnet-group-name appcachesubnetgroup \
    --cache-subnet-group-description "Application Cache Subnet Group" \
    --subnet-ids subnet-5314c936 subnet-49ca1b2c subnet-0240b575
    
  2. Create a Memcached cluster.

    Execute the following command to create an ElastiCache cluster with three nodes. This cluster uses Memcached as the cache engine.

    $ aws elasticache create-cache-cluster \
    --cache-cluster-id appcachecluster \
    --engine memcached \
    --cache-node-type cache.t2.small \
    --num-cache-nodes 3 \
    --engine-version 1.4.14 \
    --cache-subnet-group-name appcachesubnetgroup
    
  3. Add an ingress rule for the Memcached port.

    Execute the following command to add an ingress rule for the security group sg-f332ea96:

    $ aws ec2 authorize-security-group-ingress \
    --group-id sg-f332ea96 \
    --protocol tcp \
    --port 11211 \
    --cidr 0.0.0.0/0
    
  4. Get ElastiCache information to verify your cluster.
    $ aws elasticache describe-cache-clusters \
    --cache-cluster-id appcachecluster
    

Working with ElastiCache

The following steps illustrate how you can use ElastiCache from your code.

  1. Sign in to the AWS console at https://console.aws.amazon.com/elasticache/.
  2. Click on ElastiCache Cluster Client, and then click on Download.
  3. After downloading the client, extract the ZIP file and add AmazonElastiCacheClusterClient-1.0.jar to the Java application's build path.
  4. The following sample code connects to the ElastiCache cluster and calls the setter and getter methods. You can retrieve ElastiCache cluster information, such as the cache cluster's DNS name, using the describe-cache-clusters command.
    import java.net.InetSocketAddress;

    // MemcachedClient is provided by the ElastiCache Cluster Client JAR.
    import net.spy.memcached.MemcachedClient;

    // Create the Memcached client using the cluster's configuration endpoint.
    MemcachedClient client = new MemcachedClient(new InetSocketAddress(
        "appcachecluster.nzrwy7.cfg.apse1.cache.amazonaws.com", 11211));

    // Set the key "andrew" to 43 with a one-hour (3600 second) expiration.
    client.set("andrew", 3600, 43);

    // Get the value back.
    int value = (int) client.get("andrew");

How it works…

If you want to launch the cache cluster inside the VPC, then you have to create a cache subnet group by defining the appropriate subnet IDs. In the first step, we create the cache subnet group by specifying a name for it, a description, and the subnet IDs.

You can create the ElastiCache cluster with the Memcached engine or the Redis engine. Next, we execute the command for creating an ElastiCache cluster with the Memcached engine. For this, we supply the cache cluster ID, the cache engine to use, the compute and memory capacity of the nodes in the cache cluster (the cache node type), the initial number of cache nodes, the version of the cache engine, and the cache subnet group name. Note that you can retrieve the cache engine version information by executing the describe-cache-engine-versions command.

As a best practice, you should restrict ElastiCache node access to applications running on EC2 instances within specific subnets and VPC security groups. In the next step, we add an ingress rule for the Memcached port to allow access from EC2 instances to the cache cluster inside your VPC. We have to add the ingress rule to the cache cluster VPC's default security group. The ingress rule is added by specifying the security group ID, the protocol, the port, and the CIDR IP range. Note that the example rule uses 0.0.0.0/0 for brevity; in production, you should narrow the CIDR range to just your application instances, in line with the best practice above. After creating the ElastiCache cluster, we verify it by retrieving the cluster information.

Finally, we present some code to illustrate how you can use ElastiCache. You can use your existing Memcached libraries to access the Memcached engine in the ElastiCache cluster. If you have multiple cache nodes in your cluster, you normally have to define all of their endpoints in your client code. However, if you use the ElastiCache Auto Discovery library, you don't need to specify the endpoints for all these nodes; instead, you connect to the configuration endpoint, and the Auto Discovery library then connects to all the other nodes in the cache cluster.

There's more…

You can cache just about anything, including database records, full HTML pages, page fragments, and remote API calls. There are several factors that impact what you should cache. For example, join-based queries, and relatively static but frequently accessed data, are typically good candidates for caching.

Depending on the caching engine, the clustering configurations vary. For example, Memcached clusters can partition or shard your data across the nodes, whereas Redis supports single-node clusters and replication groups, and you cannot partition your data across multiple Redis clusters. With Redis, scaling is achieved by choosing a different node instance type; however, if your application is read-intensive, you can also create multiple read replicas to distribute the read load.

Typically, due to its support for sharding, Memcached clusters tend to use more, smaller nodes, while Redis deployments use fewer, larger node instance types. The total memory capacity of your cluster is the product of the number of cache nodes in the cluster and the RAM capacity of each node. You can reduce the impact of a failed node by spreading your caching capacity over a larger number of smaller-capacity cache nodes, rather than using fewer high-capacity nodes.
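The capacity arithmetic above can be sketched in a few lines of Java. The per-node RAM figure used in the usage note below is an illustrative assumption, not an official number for any instance type.

```java
// Sketch of the cluster sizing arithmetic described above: total capacity
// is the product of node count and per-node RAM, and the share of cache
// lost when one node fails shrinks as the node count grows.
public class ClusterCapacity {

    // Total cache capacity in MiB for a homogeneous cluster.
    public static long totalCapacityMiB(int numNodes, long perNodeMiB) {
        return (long) numNodes * perNodeMiB;
    }

    // Fraction of cached data lost if a single node fails.
    public static double capacityLostOnNodeFailure(int numNodes) {
        return 1.0 / numNodes;
    }
}
```

For example, three nodes of roughly 1,555 MiB each (an assumed figure) give about 4,665 MiB in total, and losing one of those three nodes costs a third of the cache, versus the entire cache for a single large node.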

You can use the describe-cache-clusters command to list the endpoints for a cluster. This command will return the configuration endpoint for a Memcached cluster and the cluster endpoint for a Redis cluster. Additionally, if you specify the show-cache-node-info parameter, then this command will also return the endpoints of the individual nodes in the cluster.

A replication group is a collection of Redis clusters, with one primary read-write cluster and several read-only clusters called read replicas. The read replicas are updated asynchronously to remain in sync with the primary cluster. To list the endpoints in a replication group, you can use the describe-replication-groups command. This command returns the replication group's primary endpoint and a list of all the clusters in the replication group with their endpoints.

With Auto Discovery, your application does not need to connect to individual nodes manually; instead, it connects to a configuration endpoint. There is no need to hardcode individual cache node endpoints in your application, because the configuration endpoint's DNS entry contains CNAME entries for each of the cache node endpoints. Node Auto Discovery for Memcached enables clients to discover cache nodes automatically when nodes are added to or removed from the cluster. You can also set up an Amazon SNS topic for ElastiCache and have an application listen for the add and remove cache node events.

Connecting to a Memcached cluster is done using the cluster's configuration endpoint, while connecting to a Redis cluster is done using its endpoint. Connecting to Redis clusters in a replication group is done using the primary endpoint for all write operations and the individual cluster endpoints for read operations.
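The read/write routing rule for a replication group can be sketched as a small helper: writes always target the primary endpoint, while reads rotate across the replica endpoints. The endpoint strings a caller would pass in are hypothetical; in practice you would obtain them from describe-replication-groups.

```java
import java.util.List;

// Sketch of endpoint selection for a Redis replication group: writes go to
// the primary endpoint, reads round-robin across the read replica endpoints.
public class EndpointRouter {
    private final String primaryEndpoint;
    private final List<String> readReplicaEndpoints;
    private int nextReplica = 0;

    public EndpointRouter(String primaryEndpoint, List<String> readReplicaEndpoints) {
        this.primaryEndpoint = primaryEndpoint;
        this.readReplicaEndpoints = readReplicaEndpoints;
    }

    // All write operations must use the primary endpoint.
    public String endpointForWrite() {
        return primaryEndpoint;
    }

    // Read operations rotate across the replicas to distribute the load.
    public String endpointForRead() {
        if (readReplicaEndpoints.isEmpty()) {
            return primaryEndpoint; // no replicas: read from the primary
        }
        String endpoint = readReplicaEndpoints.get(nextReplica);
        nextReplica = (nextReplica + 1) % readReplicaEndpoints.size();
        return endpoint;
    }
}
```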

Note

The ElastiCache user guide, available at https://aws.amazon.com/documentation/elasticache/, describes best practices for using Memcached and Redis clusters in an AWS environment.

Higher availability can be configured through replication across multiple availability zones.

ElastiCache monitors the health of the nodes in a Multi-AZ replication group. If the primary node fails, ElastiCache selects a read replica and promotes it to primary. This process can take several minutes, so the application should be designed to take this into consideration and continue to operate in the absence of the cache.

Lazy loading is a caching strategy that loads data into the cache only when necessary: only requested data is cached, and data that is never accessed never occupies cache space. If a cache node fails, the application continues to function, though with increased latency and database load. However, a cache miss results in a noticeable delay, and cached data can go stale because it is only refreshed on a cache miss. This staleness can be handled by also updating the cache whenever the database is updated.
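The lazy-loading (cache-aside) flow can be sketched as follows. A HashMap stands in for the cache node and a loader function for the database; a real application would call Memcached or Redis instead, and the key and value types here are illustrative.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Minimal sketch of the lazy-loading strategy: check the cache first, and
// only on a miss read from the backing store and populate the cache.
public class LazyLoadingCache {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> loadFromDatabase;

    public LazyLoadingCache(Function<String, String> loadFromDatabase) {
        this.loadFromDatabase = loadFromDatabase;
    }

    public String get(String key) {
        String value = cache.get(key);
        if (value == null) {                     // cache miss
            value = loadFromDatabase.apply(key); // fall through to the database
            cache.put(key, value);               // populate the cache on the way back
        }
        return value;                            // subsequent calls are cache hits
    }
}
```

Note that if the cache is cleared (for example, a node fails), the next get simply pays the miss penalty and repopulates the entry, which is why the application keeps working without the cache.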

Write-through, by contrast, keeps the cache current by updating it on every database write. But missing data after a scale-up can create an issue, because the data on new nodes is missing until it is next added or updated in the database. Implementing write-through in conjunction with a lazy loading strategy can minimize this effect. A write-through cache can also hold a lot of data that is never read; adding a TTL can help minimize the wasted space.
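A write-through cache with a TTL, falling back to the database for expired or missing entries, might be sketched like this. In-memory maps stand in for both the cache and the database, and time is passed in explicitly to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a write-through cache with a TTL: every database write also
// updates the cache, and entries expire so rarely read data is reclaimed.
public class WriteThroughCache {
    private static class Entry {
        final String value;
        final long expiresAtMillis;
        Entry(String value, long ttlMillis, long nowMillis) {
            this.value = value;
            this.expiresAtMillis = nowMillis + ttlMillis;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    private final Map<String, String> database = new HashMap<>();
    private final long ttlMillis;

    public WriteThroughCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Write-through: update the database and the cache together.
    public void put(String key, String value, long nowMillis) {
        database.put(key, value);
        cache.put(key, new Entry(value, ttlMillis, nowMillis));
    }

    // Expired or missing entries fall back to the database (lazy load).
    public String get(String key, long nowMillis) {
        Entry entry = cache.get(key);
        if (entry != null && nowMillis < entry.expiresAtMillis) {
            return entry.value;                  // fresh cache hit
        }
        String value = database.get(key);        // miss or expired: reload
        if (value != null) {
            cache.put(key, new Entry(value, ttlMillis, nowMillis));
        }
        return value;
    }
}
```

Combining the two strategies this way means writes never leave the cache stale, while the TTL bounds how long unread entries occupy memory.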
