Hadoop uses a set of counters to aggregate metrics for MapReduce computations. Counters help us understand the behavior of our MapReduce programs and track the progress of our computations. We can also define custom counters to track application-specific metrics.
The following steps show how to define a custom counter that counts the number of bad or corrupted records in our log-processing application.
1. Define the custom counter as an enum inside the Mapper (or Reducer) class. The enum name becomes the counter group, and each constant becomes a counter in that group:

public static enum LOG_PROCESSOR_COUNTER { BAD_RECORDS };
2. In the map() method, increment the counter through the Context object whenever a bad record is encountered:

context.getCounter(LOG_PROCESSOR_COUNTER.BAD_RECORDS).increment(1);
3. After the job completes, retrieve the aggregated counter value in the driver program through the Job object:

Job job = new Job(getConf(), "log-analysis");
……
Counters counters = job.getCounters();
Counter badRecordsCounter = counters.findCounter(LOG_PROCESSOR_COUNTER.BAD_RECORDS);
System.out.println("# of Bad Records: " + badRecordsCounter.getValue());
4. Run the job. Hadoop reports the custom counter alongside its built-in counters in the job summary:

> bin/hadoop jar C4LogProcessor.jar demo.LogProcessor in out 1
………
12/07/29 23:59:01 INFO mapred.JobClient: Job complete: job_201207271742_0020
12/07/29 23:59:01 INFO mapred.JobClient: Counters: 30
12/07/29 23:59:01 INFO mapred.JobClient: demo.LogProcessorMap$LOG_PROCESSOR_COUNTER
12/07/29 23:59:01 INFO mapred.JobClient:   BAD_RECORDS=1406
12/07/29 23:59:01 INFO mapred.JobClient: Job Counters
………
12/07/29 23:59:01 INFO mapred.JobClient: Map output records=112349
# of Bad Records: 1406
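Conceptually, the flow above (declare an enum, increment per record, read the aggregated total at the end) behaves like a cluster-wide map from enum constants to long values. The following self-contained sketch illustrates that pattern locally with an EnumMap; it is an analogy, not Hadoop API, and the isBadRecord() check is a hypothetical validity rule invented for illustration:

```java
import java.util.EnumMap;
import java.util.List;

// Local analogy for Hadoop counters: each enum constant names one counter,
// and an EnumMap accumulates its value. In a real MapReduce job, Hadoop
// performs this aggregation across all map and reduce tasks for you.
public class CounterSketch {

    // Mirrors the LOG_PROCESSOR_COUNTER enum from the recipe.
    public enum LOG_PROCESSOR_COUNTER { BAD_RECORDS }

    private final EnumMap<LOG_PROCESSOR_COUNTER, Long> counters =
            new EnumMap<>(LOG_PROCESSOR_COUNTER.class);

    // Hypothetical validity rule: a well-formed log line has at least
    // three whitespace-separated fields.
    public static boolean isBadRecord(String line) {
        return line.trim().split("\\s+").length < 3;
    }

    // Plays the role of the map() method: increment the counter
    // for every bad record seen, then report the total.
    public long process(List<String> lines) {
        for (String line : lines) {
            if (isBadRecord(line)) {
                counters.merge(LOG_PROCESSOR_COUNTER.BAD_RECORDS, 1L, Long::sum);
            }
        }
        return counters.getOrDefault(LOG_PROCESSOR_COUNTER.BAD_RECORDS, 0L);
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "127.0.0.1 - GET /index.html",
                "garbage",
                "10.0.0.2 - GET /about.html");
        // Only the middle line fails the field-count check.
        System.out.println("# of Bad Records: " + new CounterSketch().process(lines));
    }
}
```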