Monitoring and alerting with AWS Health

While AWS is mostly stable and outages are rare, it is not exempt from occasional service degradation. To check the general health of AWS services, go to the main status dashboard at https://status.aws.amazon.com.

Note that this dashboard also provides an RSS feed, which can be integrated with most communication services, such as Slack.

In addition to that global status page, you can also access a personalized health dashboard in the AWS console by clicking on the bell icon.

You can also access the dashboard directly by opening https://phd.aws.amazon.com in your browser. The personalized health dashboard displays information affecting all customers in the region as well as notifications that are specific to your account, such as when one of your instances is scheduled for maintenance and reboot. The personalized health dashboard doesn't have an RSS feed; instead, it is integrated with CloudWatch Events.

We are going to create a new rule in CloudWatch Events to send us email notifications for the different alerts.

We will do that using the command-line interface:

  1. The first step is to create a rule that matches all events whose source is aws.health. We will do that with the following command:
$ aws events put-rule \
      --name AWSHealth \
      --event-pattern '{"source":["aws.health"]}' \
      --state ENABLED
{
    "RuleArn": "arn:aws:events:us-east-1:511912822958:rule/AWSHealth"
}
  2. Next, we will get the information about our target. In our case, the target is the SNS topic created earlier in the chapter. We need its TopicArn, which you can get with the following command:
$ aws sns list-topics | grep alert-email
            "TopicArn": "arn:aws:sns:us-east-1:511912822958:alert-email"
  3. Finally, we can tie the two together. The put-targets command expects a JSON entry, which we provide here using the following shorthand syntax:
$ aws events put-targets \
      --rule AWSHealth \
      --targets Id=1,Arn=arn:aws:sns:us-east-1:511912822958:alert-email
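
The three steps above can also be wrapped in small shell functions so that they are easy to rerun. This is only a sketch under a few assumptions: the function names are hypothetical helpers (they are not part of the AWS CLI), the rule and topic names are the ones used in this section, and the topic lookup uses a JMESPath --query filter instead of grep.

```shell
#!/bin/sh
# Sketch: the three steps above as reusable functions.
# RULE_NAME and TOPIC_NAME mirror the values used in this section;
# the function names are hypothetical and chosen for illustration only.

RULE_NAME="AWSHealth"
TOPIC_NAME="alert-email"

# Step 1: create (or update) a rule matching every event
# whose source is aws.health.
create_health_rule() {
    aws events put-rule \
        --name "$RULE_NAME" \
        --event-pattern '{"source":["aws.health"]}' \
        --state ENABLED
}

# Step 2: print the ARN of the SNS topic whose ARN ends with
# ":$TOPIC_NAME" (a JMESPath filter replaces the grep).
find_topic_arn() {
    aws sns list-topics \
        --query "Topics[?ends_with(TopicArn, ':${TOPIC_NAME}')].TopicArn" \
        --output text
}

# Step 3: attach the topic ARN passed as $1 to the rule as target Id 1.
attach_topic() {
    aws events put-targets \
        --rule "$RULE_NAME" \
        --targets "Id=1,Arn=$1"
}
```

With these definitions loaded, running create_health_rule followed by attach_topic "$(find_topic_arn)" should reproduce the commands shown in the steps. If no email arrives once events start flowing, one thing worth checking is whether the access policy of the SNS topic allows events.amazonaws.com to publish to it, since a rule created from the command line does not change the topic's policy.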

In this section, we explored how to create alerts and applied this method to a few of our key public indicators. If you wish, you can continue that exercise, reusing some of the techniques we explored to put more alarms in place and make sure you don't miss any important events.
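
As one possible way to continue the exercise, the rule can be narrowed so that it only matches a given service and category of health event. The following is a minimal sketch, not a definitive setup: health_pattern and create_scoped_rule are hypothetical helper names, and the rule name, service, and category you pass in are your own choices. It assumes the events carry the detail-type AWS Health Event and the detail fields service and eventTypeCategory.

```shell
#!/bin/sh
# Sketch: build a narrower event pattern for a follow-up rule.
# Both function names are hypothetical helpers for illustration.

# Print an event pattern matching AWS Health events for one service ($1)
# and one event type category ($2), for example issue,
# accountNotification, or scheduledChange.
health_pattern() {
    printf '{"source":["aws.health"],"detail-type":["AWS Health Event"],"detail":{"service":["%s"],"eventTypeCategory":["%s"]}}\n' "$1" "$2"
}

# Create an enabled rule named $1 that matches service $2 and category $3.
create_scoped_rule() {
    aws events put-rule \
        --name "$1" \
        --event-pattern "$(health_pattern "$2" "$3")" \
        --state ENABLED
}
```

For instance, create_scoped_rule AWSHealthEC2Maintenance EC2 scheduledChange would create a rule limited to scheduled EC2 changes (the rule name here is made up). The new rule still needs a target, which can be attached with put-targets in the same way as before.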

Documentation
All the work done so far will only be useful if you create good documentation to go with it. At the very least, your documentation should cover the different failure scenarios and how to recover from them.