Monitoring important remote system metrics

The Nagios plugin check_multi is a convenient tool for running multiple checks within a single check command and combining their results into one overall state and output. In this recipe, we will show you how to set it up and use it to quickly monitor a list of important system metrics on your clients.

Getting ready

It is assumed that you have worked through this chapter recipe by recipe, so by now you should have a running Nagios server and a client computer whose NRPE service the Nagios server can already reach. The client that you want to monitor needs an installation of the CentOS 7 operating system with root privileges, a console-based text editor of your choice, and a connection to the Internet in order to download additional packages. In this recipe, the client has the IP address 192.168.1.8.

How to do it...

The check_multi Nagios plugin is available from GitHub, so we begin this recipe by installing the git program:

Now, download and install the check_multi plugin by compiling it from the source:

```
cd /tmp && git clone https://github.com/flackem/check_multi && cd /tmp/check_multi
./configure --with-nagios-name=nagios --with-nagios-user=nagios --with-nagios-group=nagios --with-plugin-path=/usr/lib64/nagios/plugins --libexecdir=/usr/lib64/nagios/plugins/
make all && make install && make install-config
```

Next, we install another very useful plugin called check_mem, which is not available in the CentOS 7 Nagios plugin RPM packages:

```
cd /tmp && git clone https://github.com/justintime/nagios-plugins.git
cp /tmp/nagios-plugins/check_mem/check_mem.pl /usr/lib64/nagios/plugins/
```

Next, let's create a check_multi command file that contains all the client checks that you want to combine in a single run. Open the following file:
```
vi /usr/local/nagios/etc/check_multi/check_multi.cmd
```

Put in the following content:

```
command[ sys_load::check_load ] = check_load -w 5,4,3 -c 10,8,6
command[ sys_mem::check_mem ] = check_mem.pl -w 10 -c 5 -f -C
command[ sys_users::check_users ] = check_users -w 5 -c 10
command[ sys_disks::check_disk ] = check_disk -w 5% -c 2% -X nfs
command[ sys_procs::check_procs ] = check_procs
```

Next, test the command file that we created in the last step using the following command line:
```
/usr/lib64/nagios/plugins/check_multi -f /usr/local/nagios/etc/check_multi/check_multi.cmd
```
If everything is correct, it should print the results of your five plugin checks along with an overall result, for example, OK - 5 plugins checked. Next, we will add this new command to the NRPE service on our client so that the Nagios server can execute it remotely by name. Open the NRPE configuration file:
```
vi /etc/nagios/nrpe.cfg
```
Add the following line to the end of the file right below the last # command line to expose a new command called check_multicmd to our Nagios server:
```
command[check_multicmd]=/usr/lib64/nagios/plugins/check_multi -f /usr/local/nagios/etc/check_multi/check_multi.cmd
```
Now, restart the NRPE service so that the new command is picked up:
```
systemctl restart nrpe
```
Now, let's check whether the Nagios server can execute the new check_multicmd command that we defined in the last step. Log in to the Nagios server as root and type the following command (replace 192.168.1.8 with the IP address of your client):
```
/usr/lib64/nagios/plugins/check_nrpe -H 192.168.1.8 -c "check_multicmd"
```
If the output is the same as when running the command locally on the client (see the earlier step), remote NRPE commands can be executed on our client from the server. We can therefore define the command on the Nagios server itself so that we can start using it within Nagios. Open the following file:
```
vi /etc/nagios/objects/commands.cfg
```
Put in the following content at the end of the file to define a new command called check_nrpe_multi, which we can use in any service definition:
```
define command {
  command_name check_nrpe_multi
  command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c "check_multicmd"
}
```
Next, we will create a new host definition file for the client that we want to monitor on our Nagios server (give the config file an appropriate name, for example, the client's domain name or IP address):
```
vi /etc/nagios/servers/192.168.1.8.cfg
```

Put in the following content, which will define a new host with its service, using our new Nagios command that we just created:

```
define host {
       use                   linux-server
       host_name             host1
       address               192.168.1.8
       contact_groups        unix-admins
}
define service {
       use                   generic-service
       host_name             host1
       check_command         check_nrpe_multi
       normal_check_interval 15
       service_description   check_nrpe_multi service
}
```

Finally, we need to configure everyone who should receive notification e-mails for our new service in case of errors. Open the following file:
```
vi /etc/nagios/objects/contacts.cfg
```

Put in the following content at the end of the file:

```
define contactgroup {
        contactgroup_name               unix-admins
        alias                           Unix Administrators
}
define contact {
        contact_name                    pelz
        use                             generic-contact
        alias                           Oliver Pelz
        contactgroups                   unix-admins
        email                           [email protected]
}
```

Now, restart the Nagios service:
```
systemctl restart nagios
```
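
A broken object definition prevents Nagios from starting, so it can help to validate the configuration before restarting. The following is a minimal sketch rather than part of the recipe: the helper name restart_if_valid is hypothetical, and the path /etc/nagios/nagios.cfg is an assumption about where your main configuration file lives. The nagios -v option runs the built-in configuration check.

```shell
#!/bin/bash
# Hypothetical helper: run a validation command and restart only if it succeeds.
# Usage: restart_if_valid "<validation command>" "<restart command>"
restart_if_valid() {
    local validate_cmd=$1
    local restart_cmd=$2
    local output
    if output=$(bash -c "$validate_cmd" 2>&1); then
        bash -c "$restart_cmd"
    else
        # Show what the validation reported so the error can be fixed.
        echo "$output" >&2
        echo "Configuration check failed; not restarting." >&2
        return 1
    fi
}

# On the Nagios server (the config path is an assumption, adjust as needed):
# restart_if_valid "nagios -v /etc/nagios/nagios.cfg" "systemctl restart nagios"
```

Keeping the two commands as arguments makes the helper easy to try out with harmless commands before pointing it at the real service.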

How it works...

We started this recipe by installing the check_multi and check_mem plugins from their authors' GitHub repositories; they are plain command-line tools. Nagios performs checks by running such external commands, and it uses the return code along with the output of the command to decide whether the check was successful. Nagios has a very flexible architecture that can easily be extended using plugins, add-ons, and extensions. A central place to search for all kinds of extensions is https://exchange.nagios.org/.

Next, we added a new command file for check_multi, in which we put five different system check_ commands. These checks act as a starting point for customizing your monitoring needs and check system load, memory consumption, logged-in users, free disk space, and processes. All available check_ commands can be found at /usr/lib64/nagios/plugins/check_*. As you can see in our command file, the parameters of those check_ commands can be very different, and explaining them all is beyond the scope of this recipe. Most of them set threshold values at which a certain state, for example, the CRITICAL state, is reached. To get more information about a specific command, use its --help parameter. For example, to find out what the parameters in the check_load -w 5,4,3 -c 10,8,6 command do, run /usr/lib64/nagios/plugins/check_load --help. You can easily add any number of new check commands to our command file from existing plugins, or you can download and install new ones if you like. A number of example command files are also shipped with the check_multi plugin and are very useful for learning, so please have a look at /usr/local/nagios/etc/check_multi/*.cmd.
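
To make the return-code convention described above more concrete, here is a minimal sketch of a threshold check written as a shell function. It is not one of the shipped plugins: the name check_value and its argument order are hypothetical, and it only handles non-negative integers where higher values are worse. What it does follow is the standard plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL), and 3 (UNKNOWN) plus one line of status output.

```shell
#!/bin/bash
# Standard Nagios plugin exit codes.
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# Hypothetical example check: compare an integer value against a warning and
# a critical threshold, print one status line, return the matching exit code.
# Usage: check_value <label> <value> <warn> <crit>
check_value() {
    local label=$1 value=$2 warn=$3 crit=$4
    local arg
    # Anything that is not a non-negative integer cannot be evaluated.
    for arg in "$value" "$warn" "$crit"; do
        case $arg in
            ''|*[!0-9]*)
                echo "UNKNOWN - $label: invalid argument '$arg'"
                return "$STATE_UNKNOWN"
                ;;
        esac
    done
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - $label is $value (critical threshold $crit)"
        return "$STATE_CRITICAL"
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - $label is $value (warning threshold $warn)"
        return "$STATE_WARNING"
    fi
    echo "OK - $label is $value"
    return "$STATE_OK"
}

# Example: 7 logged-in users against thresholds like -w 5 -c 10.
rc=0
check_value "users" 7 5 10 || rc=$?
echo "exit code: $rc"
```

Nagios reads the exit code to set the service state and shows the printed line as the status information, which is why every plugin, however it is written, can be combined in the same way.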

Afterwards, we verified our new command file by running it locally on the client, passing it to the check_multi command with the -f parameter. In its output, you will find the individual outputs as if you had run the five commands separately. If a single check fails, the overall check_multi result fails as well. Next, we defined a new NRPE command called check_multicmd in the NRPE configuration file so that it can be executed from the Nagios server, which we tested in the following step. For the test to be successful, we expect the same results as when calling the command on the client itself. Afterwards, we defined this command in commands.cfg on the Nagios server so that we can reuse it in any service definition by referencing the command's name, check_nrpe_multi. Next, we created a new server file for the client we want to monitor and named it after the client's IP address: 192.168.1.8.cfg (you can name it anything you like as long as it has the .cfg extension and is located in that directory). It contains exactly one host definition and one or more service definitions, which are linked to the host by using the host's host_name value as the host_name value in the service definitions.
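
The idea that one failing check determines the overall result can be sketched as a small reduction over exit codes. This is a simplified illustration and not check_multi's actual implementation: the function name overall_state is hypothetical, and the severity order CRITICAL > WARNING > UNKNOWN > OK is an assumption made for this sketch.

```shell
#!/bin/bash
# Hypothetical helper: reduce the exit codes of several checks to one state.
# Assumed severity order (a simplification): CRITICAL > WARNING > UNKNOWN > OK.
# Usage: overall_state <exit code> [<exit code> ...]
overall_state() {
    local code rank name
    local worst_rank=0 worst_name=OK worst_code=0
    for code in "$@"; do
        case $code in
            0) rank=0; name=OK ;;
            1) rank=2; name=WARNING ;;
            2) rank=3; name=CRITICAL ;;
            *) rank=1; name=UNKNOWN; code=3 ;;   # anything else counts as UNKNOWN
        esac
        if [ "$rank" -gt "$worst_rank" ]; then
            worst_rank=$rank
            worst_name=$name
            worst_code=$code
        fi
    done
    echo "$worst_name - $# plugins checked"
    return "$worst_code"
}

# Example: five checks, one of which returned CRITICAL (2).
rc=0
overall_state 0 0 2 0 1 || rc=$?
echo "exit code: $rc"
```

The real plugin prints the individual check outputs as well; the sketch only shows how a single summary state can be derived from them.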

In the host definition, we set contact_groups to unix-admins, which refers to the contact group and contact entry defined in contacts.cfg. These contacts receive notification e-mails if the checked service has any errors. The most important value in the service definition is the check_command check_nrpe_multi line, which executes the command that we created before as our one and only check. The normal_check_interval value is also important because it defines how often the service is checked under normal conditions; here, it is checked every 15 minutes. You can add as many service definitions to a host as you like.
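
If you monitor more than one client, the link between a host and its service through host_name can be illustrated with a small generator. This is only a sketch under assumptions: the function name render_host_cfg is hypothetical, and it reuses the templates, contact group, and command name from this recipe without checking that they exist on your server.

```shell
#!/bin/bash
# Hypothetical helper: print a host definition and one service definition
# that are linked through the same host_name value.
# Usage: render_host_cfg <host_name> <address> [<check interval in minutes>]
render_host_cfg() {
    local host_name=$1 address=$2 interval=${3:-15}
    cat <<EOF
define host {
       use                   linux-server
       host_name             $host_name
       address               $address
       contact_groups        unix-admins
}
define service {
       use                   generic-service
       host_name             $host_name
       check_command         check_nrpe_multi
       normal_check_interval $interval
       service_description   check_nrpe_multi service
}
EOF
}

# Print the definitions for the client used in this recipe; redirect the
# output to a .cfg file in your server directory to use it.
render_host_cfg host1 192.168.1.8
```

Because both blocks are filled from the same variable, the host and service cannot drift apart, which is the most common mistake when such files are copied by hand.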

Now, go to your Nagios web frontend to inspect your new host and service. Here, go to the Hosts tab, where you will see the new host, host1, that you defined in this recipe, and it should give you information about its status. If you click on the Services tab, you will see the check_nrpe_multi service. It should show the Status as Pending, OK, or CRITICAL, depending on the success of the single checks. If you click on its check_nrpe_multi link, you will see details about the checks.

Here in this chapter, we could only show you the very basics of Nagios, and there is always more to learn, so please read the official Nagios Core documentation at https://www.nagios.org, or check out the book Learning Nagios 4, Packt Publishing, by Wojciech Kocjan.
