Ganglia is a monitoring system designed for use with clusters and grids. Hadoop can be configured to send periodic metrics to the Ganglia monitoring daemon, which is useful for diagnosing and monitoring the health of the Hadoop cluster. This recipe will explain how to configure Hadoop to send metrics to the Ganglia monitoring daemon.
Ensure that you have Ganglia Version 3.1 or later installed on all of the nodes in the Hadoop cluster. The Ganglia monitoring daemon (gmond) should be running on every worker node in the cluster. You will also need the Ganglia meta daemon (gmetad) running on at least one node, and another node running the Ganglia web frontend.
The following is an example of a modified gmond.conf file that can be used by the gmond daemon:
cluster {
  name = "Hadoop Cluster"
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}

host {
  location = "my datacenter"
}

udp_send_channel {
  host = mynode.company.com
  port = 8649
  ttl = 1
}

udp_recv_channel {
  port = 8649
}

tcp_accept_channel {
  port = 8649
}
Also, ensure that the Ganglia meta daemon configuration file includes your cluster as a data source. For example, modify the gmetad.conf configuration file to add the Hadoop cluster as a data source:
data_source "Hadoop Cluster" mynode1.company.com:8649 mynode2.company.com:8649 mynode3.company.com:8649
Perform the following steps to use Ganglia to monitor cluster metrics:
Edit the hadoop-metrics.properties file found in the Hadoop configuration folder. If the hadoop-metrics.properties file does not exist, create it. This properties file will need to be updated on every node in the cluster:
$ vi /path/to/hadoop/hadoop-metrics.properties

dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
dfs.period=10
dfs.servers=mynode1.company.com:8649

mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
mapred.period=10
mapred.servers=mynode1.company.com:8649

jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=mynode1.company.com:8649

rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rpc.period=10
rpc.servers=mynode1.company.com:8649
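Because the same block of properties has to be present on every node, it may be easier to generate it than to type it by hand. The following is a minimal Python sketch, not part of Hadoop or Ganglia, that renders the properties for the four contexts used in this recipe and rejects a server value that is not in host:port form. The function name render_metrics_properties and the host name pattern are assumptions made for illustration.

```python
import re

# Metrics contexts configured in this recipe.
CONTEXTS = ("dfs", "mapred", "jvm", "rpc")
GANGLIA_CLASS = "org.apache.hadoop.metrics.ganglia.GangliaContext31"

# Assumed shape of a valid server entry: a host name, a colon, and a port.
_HOST_PORT = re.compile(r"^[A-Za-z0-9.-]+:\d+$")


def render_metrics_properties(server, period=10):
    """Return hadoop-metrics.properties text that sends every context to server."""
    if not _HOST_PORT.match(server):
        raise ValueError(f"expected host:port, got {server!r}")
    lines = []
    for ctx in CONTEXTS:
        lines.append(f"{ctx}.class={GANGLIA_CLASS}")
        lines.append(f"{ctx}.period={period}")
        lines.append(f"{ctx}.servers={server}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the properties so they can be redirected into the file on each node.
    print(render_metrics_properties("mynode1.company.com:8649"), end="")
```

The output can be redirected into hadoop-metrics.properties and copied to each node. The validation step is there because a server entry written with a space instead of a colon is an easy mistake to make.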
Restart the Hadoop cluster so that the daemons pick up the new metrics configuration:
$ cd /path/to/hadoop
$ bin/stop-all.sh
$ bin/start-all.sh
The Ganglia monitoring daemon (gmond) is responsible for collecting metric information from the nodes where it is installed. Next, all of the metrics collected by the gmond daemons are aggregated by the Ganglia meta daemon (gmetad). Finally, the Ganglia web frontend requests the aggregated metrics from the gmetad daemon in the form of XML and presents them to users via the web interface.
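The XML exchange can also be inspected by hand, which helps when checking whether Hadoop metrics are arriving. gmond writes its current state as XML to any client that connects to its tcp_accept_channel port (8649 in the gmond.conf example) and then closes the connection. The following Python sketch reads that dump and groups metric values by host. The function names are hypothetical, and the element and attribute names (HOST, METRIC, NAME, VAL) are assumed to match the XML layout produced by Ganglia 3.1.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump that gmond writes to a client on its tcp_accept_channel."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # the daemon closes the connection after the dump
                break
            chunks.append(data)
    # Keep the result as bytes so the parser honours the declared XML encoding.
    return b"".join(chunks)


def metrics_by_host(xml_bytes):
    """Map each HOST name to a {metric name: value string} dictionary."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for host in root.iter("HOST"):
        result[host.get("NAME")] = {
            metric.get("NAME"): metric.get("VAL") for metric in host.iter("METRIC")
        }
    return result
```

Calling metrics_by_host(fetch_gmond_xml("mynode1.company.com")) against a running gmond should return one entry per host known to that daemon. If the Hadoop contexts configured earlier are reporting, their metrics would be expected to appear alongside the standard system metrics.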