Monitoring EMR clusters

The EMR dashboard lets you manage and monitor your EMR clusters from one place. You can also view logs and use Amazon CloudWatch to track the performance of your cluster.

In this section, we will look at a few simple ways to monitor your EMR clusters. To start off, let's look at how to monitor the status of your cluster using the EMR dashboard:

  1. From the EMR dashboard, select your cluster's name on the cluster list page. This brings up the cluster's details page. Here, select the Events tab, as shown in the following screenshot:

The Events tab allows you to view the events logged by your cluster, including events generated by the cluster itself, by running applications, by step execution, and more.
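The cluster's status can also be read programmatically rather than from the dashboard. The following is a minimal sketch, assuming the boto3 SDK and its EMR `describe_cluster` call; the helper name `summarize_cluster_status` and the cluster ID in the usage comment are illustrative placeholders, not something defined in this chapter.

```python
def summarize_cluster_status(emr_client, cluster_id):
    """Return the cluster's name, current state, and last state-change message.

    `emr_client` is assumed to be a boto3 EMR client (boto3.client("emr")).
    """
    response = emr_client.describe_cluster(ClusterId=cluster_id)
    cluster = response["Cluster"]
    status = cluster["Status"]
    # StateChangeReason may be empty while the cluster is still starting up.
    reason = status.get("StateChangeReason", {})
    return {
        "name": cluster.get("Name"),
        "state": status["State"],
        "reason": reason.get("Message", ""),
    }


# Example usage (requires AWS credentials; the cluster ID is a placeholder):
#   import boto3
#   emr = boto3.client("emr")
#   print(summarize_cluster_status(emr, "j-XXXXXXXXXXXXX"))
```

Passing the client in as an argument keeps the helper easy to exercise with a stub before pointing it at a real cluster.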

  1. The dashboard also provides an in-depth look into the performance of the cluster over a period of time. To view the performance indicators, select the Monitoring tab from the cluster's details page.

Here, you can view essential details about the status of your cluster and its running nodes, as well as the underlying I/O and data storage.

  1. Alternatively, you can use Amazon CloudWatch to view and monitor the cluster's metrics. To do so, open the Amazon CloudWatch dashboard at https://console.aws.amazon.com/cloudwatch/home.
  1. Next, from the navigation pane, select the Metrics option to view all the metrics associated with EMR. If you have multiple clusters running in the same environment, use the JobFlowId dimension to filter the metrics down to a single EMR cluster.

Here is a list of some important EMR metrics worth monitoring:

| Metric name | Metric description |
| --- | --- |
| AppsFailed | The number of applications submitted to the EMR cluster that have failed to complete. This application status is monitored internally and reported by YARN. |
| MRUnhealthyNodes | The number of nodes available to MapReduce jobs marked in an UNHEALTHY state. |
| MRLostNodes | The number of nodes allocated to MapReduce that have been marked in a LOST state. |
| CorruptBlocks | The number of blocks that HDFS reports as corrupted. |

You can view the complete list of monitored metrics at: https://docs.aws.amazon.com/emr/latest/ManagementGuide/UsingEMR_ViewingMetrics.html.
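The metrics listed above can also be fetched outside the console. The following is a minimal sketch, assuming the boto3 CloudWatch client and its `get_metric_statistics` call against the `AWS/ElasticMapReduce` namespace; the helper name `latest_metric_value`, the one-hour lookback, and the choice of the `Maximum` statistic are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

EMR_NAMESPACE = "AWS/ElasticMapReduce"


def latest_metric_value(cloudwatch_client, cluster_id, metric_name, hours=1):
    """Return the most recent datapoint for an EMR metric, or None if there is none.

    `cloudwatch_client` is assumed to be boto3.client("cloudwatch").
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    response = cloudwatch_client.get_metric_statistics(
        Namespace=EMR_NAMESPACE,
        MetricName=metric_name,
        # JobFlowId narrows the query to one cluster.
        Dimensions=[{"Name": "JobFlowId", "Value": cluster_id}],
        StartTime=start,
        EndTime=end,
        Period=300,  # seconds; EMR reports its metrics at five-minute intervals
        Statistics=["Maximum"],
    )
    # Datapoints are not guaranteed to arrive in time order, so sort them first.
    datapoints = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    return datapoints[-1]["Maximum"] if datapoints else None


# Example usage (requires AWS credentials; the cluster ID is a placeholder):
#   import boto3
#   cw = boto3.client("cloudwatch")
#   print(latest_metric_value(cw, "j-XXXXXXXXXXXXX", "AppsFailed"))
```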
  1. Once you have identified a metric, select it and click on the Graphed metrics tab. Here, select the Create alarm option provided under the Actions column to set an alarm threshold, as well as its corresponding action.
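An equivalent alarm can be created from code. The following is a minimal sketch, assuming the boto3 CloudWatch client and its `put_metric_alarm` call; the helper name `create_failed_apps_alarm`, the alarm naming scheme, the threshold, and the SNS topic ARN used as the alarm action are all hypothetical values chosen for illustration.

```python
def create_failed_apps_alarm(cloudwatch_client, cluster_id, sns_topic_arn, threshold=1):
    """Create an alarm that fires when AppsFailed reaches `threshold` for a cluster.

    `cloudwatch_client` is assumed to be boto3.client("cloudwatch").
    Returns the name of the alarm that was created.
    """
    alarm_name = f"emr-{cluster_id}-apps-failed"  # hypothetical naming scheme
    cloudwatch_client.put_metric_alarm(
        AlarmName=alarm_name,
        AlarmDescription=f"AppsFailed >= {threshold} on EMR cluster {cluster_id}",
        Namespace="AWS/ElasticMapReduce",
        MetricName="AppsFailed",
        Dimensions=[{"Name": "JobFlowId", "Value": cluster_id}],
        Statistic="Maximum",
        Period=300,            # evaluate five-minute windows
        EvaluationPeriods=1,   # a single breaching window triggers the alarm
        Threshold=float(threshold),
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        AlarmActions=[sns_topic_arn],  # for example, an SNS topic that sends email
    )
    return alarm_name


# Example usage (requires AWS credentials; both identifiers are placeholders):
#   import boto3
#   cw = boto3.client("cloudwatch")
#   create_failed_apps_alarm(
#       cw, "j-XXXXXXXXXXXXX", "arn:aws:sns:us-east-1:123456789012:emr-alerts"
#   )
```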

In a similar way, you can also leverage Amazon CloudWatch Events to periodically monitor the events generated by the cluster. Remember, however, that EMR tracks and records events only for a period of seven days. With this, we come to the end of this section, and of EMR as well. In the next section, we will explore yet another awesome analytics service called Amazon Redshift!
