Chapter 10. Monitoring Remote Hosts

Nagios offers various ways of monitoring computers and services. The previous chapter talked about passive checks and how they can be used to submit results to Nagios. It also discussed NRDP, which can be used to send check results from other machines to the Nagios server.

This chapter talks about another approach to checking service status. It uses Nagios active checks that run the actual check commands on different hosts. This approach is most useful when resources local to a particular machine need to be checked, such as disk and memory usage or whether the operating system is up to date. This type of data cannot be checked without running commands on the target computer.

Remote checks are usually used in combination with the Nagios plugins package, with either SSH or NRPE running the plugins on the remote machine. This makes monitoring remote systems very similar to monitoring a local computer; the only difference is that the commands are actually run on the remote machine. In this chapter, we will cover the following topics:

Monitoring over SSH
Monitoring using NRPE
Comparing NRPE and SSH
Alternatives to SSH and NRPE

Monitoring over SSH

Nagios is often used to monitor computer resources such as CPU utilization, memory, and disk space. One way in which this can be done is to connect over SSH and run a Nagios check plugin.

Automating the authentication process requires setting up SSH to authenticate using public keys. This works because the Nagios server has an SSH private key and the target machine is configured to allow users with that particular key to connect without prompting for a password.

Nagios offers a check_by_ssh plugin that takes the hostname and the actual command to run on the remote server. Internally, it runs the SSH client to connect to the server and runs the given command, along with its arguments, on the target machine. After the check has been performed, the output along with the check command's exit code is returned to Nagios running on the local server.

This way any Nagios plugin can be run on the same machine as the Nagios daemon as well as remotely over SSH without any changes to the plugins. Using the SSH protocol also means that authentication can be automated using keys, so that each check is done without any user activity and Nagios is able to log in to remote machines automatically without using any passwords. Such a check is performed as follows:

Once Nagios schedules an active check to be performed, the check_by_ssh plugin runs the ssh command to connect to the remote host's SSH server. It then runs the actual plugin, which has to be present on the remote host, and waits for the result. The SSH client passes the standard output as well as exit code to the check_by_ssh plugin that also prints the output and exits with the same code as the plugin.
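The pass-through described above relies on the standard plugin exit code convention (0 for OK, 1 for WARNING, 2 for CRITICAL, and 3 for UNKNOWN). The following is a minimal sketch of that idea; the function names nagios_state_name and relay_check are hypothetical helpers written for illustration and are not part of Nagios or its plugins:

```shell
#!/bin/sh
# Hypothetical helper: translate a plugin exit code into a Nagios state name.
nagios_state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "INVALID" ;;  # codes outside 0-3 are not valid plugin states
    esac
}

# Hypothetical helper: run a command, print its output, and return its exit
# code unchanged, similar in spirit to what check_by_ssh does for a remote plugin.
relay_check() {
    relay_rc=0
    relay_output=$("$@") || relay_rc=$?
    printf '%s\n' "$relay_output"
    return "$relay_rc"
}
```

For instance, relay_check sh -c 'echo "DISK CRITICAL"; exit 2' prints DISK CRITICAL and returns 2, which nagios_state_name maps to CRITICAL.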

Even though the scenario might seem a bit complicated, it works quite efficiently and requires very little setup to work properly. It also works with various flavors of Unix systems, as the SSH protocol, the clients, and the shell syntax for commands used by the check_by_ssh plugin are the same on all Unix-based systems.

Configuring the SSH connection

SSH provides multiple ways for a user to authenticate. One of them is password-based authentication, which means that the user specifies a password; the SSH client sends it to the remote machine, and the remote machine checks if the password is correct.

Another form of verifying whether a user or program can access the remote machine is public key-based authentication. It uses asymmetric cryptography (visit http://en.wikipedia.org/wiki/Public-key_cryptography for more detail) to perform the authentication and provides a secure way to authenticate without specifying any credentials. It requires the user to generate an authentication key, which consists of a public and a private key. By default, the filename is ~/.ssh/id_rsa for the private key and ~/.ssh/id_rsa.pub for the public key. The public key is then put on the remote machines, which allows them to authenticate the user. The SSH protocol then takes care of the authentication. It only requires the client machine to have the private key and the remote machine to be configured to accept it; this is done by adding the public key to the remote user's SSH authorized keys file, which is located in ~/.ssh/authorized_keys in most cases.

Setting up remote checks over SSH requires a few steps. The first step is to create a dedicated user for performing checks on the machine on which the remote checks will be run. We will also need to set up directories for the user. The steps to create directory structure on the remote machine are very similar to the steps performed for the Nagios installation itself.

The first thing that needs to be performed on the Nagios server is the creation of a private and public key pair that will be used to log in to all the remote machines without using passwords. We will need to execute the ssh-keygen command to generate it. For example:

    root@nagiosserver:~# su -s /bin/bash nagios
    nagios@nagiosserver:~$ ssh-keygen
    Generating public/private rsa key pair.
    File in which to save the key (/opt/nagios/.ssh/id_rsa): <enter>
    Created directory '/opt/nagios/.ssh'.
    Enter passphrase (empty for no passphrase): <enter>
    Enter same passphrase again: <enter>
    Your identification has been saved in /opt/nagios/.ssh/id_rsa.
    Your public key has been saved in /opt/nagios/.ssh/id_rsa.pub.
    The key fingerprint is:
    c9:68:47:bd:cd:6e:12:d3:9b:e8:0d:cf:93:bd:33:98 nagios@nagiosserver
    nagios@nagiosserver:/root$

We used the su command to switch users along with the -s flag to force the shell to be /bin/bash; this is because in most setups the nagios user usually does not have shell access. The <enter> text means that the question was answered with the default reply. The private key is saved as /opt/nagios/.ssh/id_rsa, and the public key has been saved in the /opt/nagios/.ssh/id_rsa.pub file.

At this point our Nagios server is set up.

Next we need to set up the remote machines that we will monitor. All the following commands should be executed on the remote machine that is to be monitored, unless explicitly mentioned. First, let's create a user and group named nagios:

    root@remotehost:~# groupadd nagios
    root@remotehost:~# useradd -g nagios -d /opt/nagios nagios

We do not need the nagioscmd group as we will only need the account to log in to the machine. The computer that only performs checks does not have a full Nagios installation along with the external command pipe that needs a separate group.

The next thing that needs to be done is the compiling of the Nagios plugins. You will probably also need to install the prerequisites that are needed for Nagios. Detailed instructions on how to do this can be found in Chapter 2, Installing Nagios 4. For the rest of the section, we will assume that the Nagios plugins are installed in the /opt/nagios/plugins directory, similar to how they were installed on the Nagios server.

It is best to install plugins in the same directory on all the machines they will be running. In this case, we can use the $USER1$ macro definition when creating the actual check commands in the main Nagios configuration. The USER1 macro points to the location where Nagios plugins are installed in the default Nagios installations. This is described in more detail in Chapter 2, Installing Nagios 4.

Next, we will need to create the /opt/nagios directory and set its permissions:

    root@remotehost:~# mkdir /opt/nagios
    root@remotehost:~# chown nagios:nagios /opt/nagios
    root@remotehost:~# chmod 0700 /opt/nagios

You can make the /opt/nagios directory permissions less restrictive by setting the mode to 0755. However, it is recommended not to make the users' home directories readable for all users.

We will now need to add the public key of the nagios user from the machine that is running the Nagios daemon to the remote machine, as shown in the following command snippet:

    root@remotehost:~# mkdir /opt/nagios/.ssh
    root@remotehost:~# echo 'ssh-rsa ... nagios@nagiosserver' > \
        /opt/nagios/.ssh/authorized_keys

    root@remotehost:~# chown nagios:nagios \
        /opt/nagios/.ssh /opt/nagios/.ssh/authorized_keys
    root@remotehost:~# chmod 0700 \
        /opt/nagios/.ssh /opt/nagios/.ssh/authorized_keys

You need to replace the text ssh-rsa ... nagios@nagiosserver with the actual contents of the /opt/nagios/.ssh/id_rsa.pub file on the server that is running Nagios.

If your machine is maintained by more than one person, you might replace the nagios@nagiosserver string with a more readable comment such as Nagios on nagiosserver SSH check public key.

Make sure that you change the permissions for both the .ssh directory and the authorized_keys file, as many SSH server implementations ignore public key-based authorization if the files can be read or written to by other users on the system.
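The permission requirement above can be verified before testing the connection. The following is a small sketch; is_private is a hypothetical helper, and its rule (no group or other permissions at all) is an assumption that mirrors the 0700 mode used in this chapter rather than the exact test any particular SSH server applies:

```shell
#!/bin/sh
# Hypothetical helper: succeed only when neither group nor others have any
# permission bits set on the given path (for example mode 0700 or 0600).
is_private() {
    # ls -ld prints a mode string such as "drwx------"; characters 5-10
    # hold the group and other permission bits.
    perms=$(ls -ld "$1" 2>/dev/null | cut -c5-10)
    [ "$perms" = "------" ]
}
```

On the remote machine, it could be used as is_private /opt/nagios/.ssh && is_private /opt/nagios/.ssh/authorized_keys, with a non-zero exit code indicating that one of the paths is too widely accessible.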

In order to configure multiple remote machines to be accessible over ssh without a password, you will need to perform all the steps mentioned earlier, except the key generation at the computer running the Nagios server, as a single private key will be used to access multiple machines.

Assuming everything was done successfully, we can now move on to testing if the public key-based authorization actually works. In order to check that our connection can now be successfully established, we need to try to connect to the remote machine from the computer that has the Nagios daemon running. We will use the ssh client with the verbose flag to make sure that our connection works properly:

    nagios@nagiosserver:~$ ssh -v nagios@192.168.2.1
    OpenSSH_6.6.1, OpenSSL 1.0.1f 6 Jan 2014
    debug1: Reading configuration data /etc/ssh/ssh_config
    debug1: Applying options for *
    debug1: Connecting to 192.168.2.1 [192.168.2.1] port 22.
    debug1: Connection established.
    debug1: identity file /opt/nagios/.ssh/id_rsa type 1
    (...)
    debug1: SSH2_MSG_KEXINIT sent
    debug1: SSH2_MSG_KEXINIT received
    debug1: kex: server->client aes128-cbc hmac-md5 none
    debug1: kex: client->server aes128-cbc hmac-md5 none
    debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
    debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
    debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
    debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
    The authenticity of host '192.168.2.1 (192.168.2.1)' can't be established.
    RSA key fingerprint is cf:72:1e:40:03:a4:e0:9b:6c:84:4e:e1:2d:ea:56:fc.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added '192.168.2.1' (RSA) to the list of known hosts.
    debug1: ssh_rsa_verify: signature correct
    debug1: SSH2_MSG_NEWKEYS sent
    debug1: expecting SSH2_MSG_NEWKEYS
    debug1: SSH2_MSG_NEWKEYS received
    debug1: SSH2_MSG_SERVICE_REQUEST sent
    debug1: SSH2_MSG_SERVICE_ACCEPT received
    debug1: Authentications that can continue: publickey,password
    debug1: Next authentication method: publickey
    debug1: Offering public key: /opt/nagios/.ssh/id_rsa
    debug1: Server accepts key: pkalg ssh-rsa blen 277
    debug1: read PEM private key done: type RSA
    debug1: Authentication succeeded (publickey).
    debug1: channel 0: new [client-session]
    debug1: Entering interactive session.
    debug1: Sending environment.
    debug1: Sending env LANG = en_US.UTF-8
    $

As we were connecting to the remote machine for the first time, the ssh command asked whether to accept the connection so that SSH can continue and store the remote machine's key to a list of known hosts. This is only done once for each host.

Also, note that we need to test the connection from the Nagios account so that the keys that are used for authentication as well as the list of known hosts are the same ones that will be used by the Nagios daemon later.

Assuming that we have the Nagios plugins installed on the remote machine in the /opt/nagios/plugins directory, we can try to use the check_by_ssh plugin from the computer running Nagios to the remote machine by running the following command:

nagios@nagiosserver:~$ /opt/nagios/plugins/check_by_ssh \
        -H 192.168.2.1 -C "/opt/nagios/plugins/check_apt"
APT OK: 0 packages available for upgrade (0 critical updates).

We are now sure that the checking itself works fine, and we can move on to how check_by_ssh can be used and what its syntax is.

Using the check_by_ssh plugin

As mentioned earlier, Nagios uses a separate check command that connects to a remote machine over SSH and runs the actual check command on it.

The command has multiple features and can be used to query a single service status by using active checks. It can also be used to perform and report multiple checks at once as passive checks.

The syntax of the plugin is as follows:

check_by_ssh -H <host> -C <command> [-fqv] [-1|-2] [-4|-6]
             [-S [lines]] [-E [lines]] [-t timeout] [-i identity]
             [-l user] [-n name] [-s servicelist] [-O outputfile]
             [-p port] [-o ssh-option]

The following table describes the options accepted by the plugin. Required options are noted as such in their descriptions:

Option	Description
`-H`, `--hostname`	This provides the hostname or IP address of the machine to connect to; this option must be specified
`-C`, `--command`	This provides the full path of the command to be executed on the remote host along with any additional arguments; this option must be specified
`-l`, `--logname`	This lets you log in as a specific user; if omitted, it defaults to the current user (usually `nagios`) or any other user specified in the per-user SSH client configuration file
`-i`, `--identity`	This specifies the path to the SSH private key to be used for authentication; if omitted, then `~/.ssh/id_rsa` is used by default
`-o`, `--ssh-option`	This allows passing SSH-specific options that will be passed as the `-o` option to the `ssh` command
`-q`, `--quiet`	This stops SSH from printing warning and information messages
`-w`, `--warning`	This specifies the time in seconds after which the connection should be terminated and a warning should be issued to Nagios
`-c`, `--critical`	This specifies the time in seconds after which the connection should be terminated and a critical should be issued to Nagios
`-t`, `--timeout`	This specifies the time in seconds after which the connection should be terminated and checks should be stopped; defaults to 10 seconds
`-p`, `--port`	This specifies the port to connect over SSH; defaults to 22
`-1`, `--proto1`	This will let you use the SSH protocol Version 1
`-2`, `--proto2`	This will let you use the SSH protocol Version 2; this is the default
`-4`	This will let you use IPv4 protocol for SSH connectivity
`-6`	This will let you use IPv6 protocol for SSH connectivity
`-S`, `--skip-stdout`	This will let you ignore all or the provided number of lines from the standard output
`-E`, `--skip-stderr`	This will let you ignore all or the provided number of lines from the standard error
`-f`	This tells SSH to work in the background just after connecting, instead of using a terminal

The only required flags are -H to specify the IP address or hostname to connect and -C to specify the command to run. The remaining parameters are optional. If they are not passed, SSH defaults and the timeout of 10 seconds will be used.

The -S and -E options are used to skip messages that are written by the SSH client or the remote machine, regardless of the commands executed. For example, to properly check machines printing MOTD, even for non-interactive sessions, skipping it by using one of the options is required.

When specifying commands, they usually need to be enclosed in single or double quotation marks. This is because the entire command that should be run needs to be passed to check_by_ssh as a single argument. If one or more arguments of that command contain spaces, they have to be quoted as well, using single quotes inside the double-quoted command.

For example, when checking disk usage remotely, we quote the entire command, and inside it we also quote the path to the drive we're checking, as this is safer:

nagios@nagios1:~$ /opt/nagios/plugins/check_by_ssh -H 192.168.2.1 -C \
    "/opt/nagios/plugins/check_disk -w 15% -c 10% -p '/'"
DISK OK - free space: / 243 MB (17% inode=72%)
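The effect of the quoting can be seen without connecting anywhere. The following sketch uses a hypothetical count_args helper that only reports how many arguments the shell passed to it; it stands in for check_by_ssh purely for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print how many arguments the shell actually passed.
count_args() {
    echo "$#"
}

# Quoted: the whole remote command travels as one argument after -C.
count_args -C "/opt/nagios/plugins/check_disk -w 15% -c 10% -p '/'"   # prints 2

# Unquoted: the shell splits the remote command into separate arguments.
count_args -C /opt/nagios/plugins/check_disk -w 15% -c 10% -p /       # prints 8
```

In the unquoted form, options such as -w and -c would be interpreted by check_by_ssh itself instead of being handed to check_disk on the remote machine.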

The example above is a typical usage of the check_by_ssh plugin as an active check. It performs a single check and returns the status directly using the standard output and exit code.

If you want to use check_by_ssh to deploy checks locally on the same machine as the one on which Nagios is running, you will need to add the SSH key from id_rsa.pub to the authorized_keys file on that machine as well. In order to verify that it works correctly, try logging in to the local machine over SSH.

Now that the plugin works when invoked manually, we need to configure Nagios to make use of it.

Usually, for commands that will be performed both locally and remotely, the approach is to create a duplicate entry for each command with a suffix, for example, _by_ssh.

For example, a command that checks swap usage locally may be defined as follows:

  define command 
  { 
    command_name  check_swap 
    command_line  $USER1$/check_swap -w $ARG1$ -c $ARG2$ 
  }

Then, assuming that we will also check the swap usage on remote machines, we need to define the following remote counterpart:

  define command 
  { 
    command_name  check_swap_by_ssh 
    command_line  $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "$USER1$/check_swap -w $ARG1$ -c $ARG2$"
  }
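The relationship between the two definitions is mechanical: the local command line becomes the quoted -C argument of check_by_ssh. As a rough sketch, a hypothetical wrap_by_ssh helper could derive the remote command_line from the local one, assuming the local command line contains no double quotes of its own:

```shell
#!/bin/sh
# Hypothetical helper: given the command_line of a local check, print the
# command_line of its check_by_ssh counterpart. Single quotes keep the shell
# from expanding the Nagios macros, which are meant for Nagios itself.
wrap_by_ssh() {
    printf '%s -H %s -C "%s"\n' '$USER1$/check_by_ssh' '$HOSTADDRESS$' "$1"
}

wrap_by_ssh '$USER1$/check_swap -w $ARG1$ -c $ARG2$'
# prints: $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "$USER1$/check_swap -w $ARG1$ -c $ARG2$"
```

This is only a convenience for generating configuration text; Nagios itself still needs both command definitions to be present in its object configuration.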

Usually services are defined for groups of hosts. For example, a service to check swap space usage may be defined to be performed on all the Linux servers. It is more convenient to always use the check_swap_by_ssh command in this case—both for local Nagios as well as all remote machines. The overhead for performing checks over SSH is relatively small and can be ignored in most cases.

However, this requires that the server running Nagios accepts SSH connections, which is not always the case. It is also possible to define two services, one that is run over SSH and one that is run locally, and exclude localhost from the SSH-based check, as follows:

  define service 
  { 
    use                  generic-service 
    host_name            localhost 
    service_description  SWAP 
    check_command        check_swap 
  } 
 
  define service 
  { 
    use                  generic-service 
    host_name            !localhost 
    hostgroup_name       linux-servers 
    service_description  SWAP 
    check_command        check_swap_by_ssh 
  }

This way localhost will use the check_swap command and all the remaining machines that are part of the linux-servers host group will use the check_swap_by_ssh check command.

Performing multiple checks

The check_by_ssh plugin can also run multiple plugins at once and report their results to Nagios using the external command pipe. The reason for this approach is that the SSH protocol negotiations introduce a lot of overhead related to the protocol itself. For hosts with heavy load or for machines with connectivity issues, it is more efficient to run all the checks using a single SSH session instead of performing every check individually.

As the results are reported as passive checks, using this functionality requires that those services allow receiving passive check results over the command pipe.

One of the main issues with doing multiple checks is that it is not trivial to schedule these directly from Nagios. A typical approach to passive checks is to schedule checks from an external application such as cron (http://linux.die.net/man/8/cron).

An alternate approach is to create a dummy service in Nagios that will launch passive checks in the background. The result of this service itself then indicates whether running the tests was successful or not. An upside of this approach is that the checks will be performed even if the cron daemon is currently disabled, as Nagios will still take care of scheduling them.

When using check_by_ssh to report multiple results as passive checks, the following options need to be specified:

Option	Description
`-n`, `--name`	This provides the short name of the host that the tests refer to; this is the name of the host that will be used when sending the results over the external command pipe
`-s`, `--services`	These are the names of the services that the tests refer to, separated by colons; these are the names of services that will be used when sending results over the external command pipe
`-O`, `--output`	This is the path to the external command pipe to which the results of all the checks should be sent

The options above are specific to performing multiple checks; they are not the only options the plugin needs in this mode. The remaining options described earlier must also be specified, especially the -H and -C options.

The -C option needs to be specified multiple times, once for each check. The number of -C options must match the number of entries in the -s parameter so that each result can be mapped to a service name.
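Because a mismatch between the two is easy to introduce when editing a long command, a wrapper script may want to check the counts before calling the plugin. The following is a minimal sketch; count_services and services_match_commands are hypothetical helpers and not options of check_by_ssh:

```shell
#!/bin/sh
# Hypothetical helper: count the colon-separated entries of a -s service list.
count_services() {
    echo "$1" | awk -F: '{ print NF }'
}

# Hypothetical helper: succeed when the service list has exactly as many
# entries as the number of -C commands given as the second argument.
services_match_commands() {
    [ "$(count_services "$1")" -eq "$2" ]
}
```

For example, services_match_commands "DISK /:DISK /usr:DISK /opt" 3 succeeds, while passing 2 as the number of commands makes it fail.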

The following example runs a disk space check for three partitions:

/opt/nagios/plugins/check_by_ssh -H 192.168.2.1 -O /tmp/out1 \
     -n ubuntu1 -s "DISK /:DISK /usr:DISK /opt" \
     -C "/opt/nagios/plugins/check_disk -w 15% -c 10% -p /" \
     -C "/opt/nagios/plugins/check_disk -w 15% -c 10% -p /usr" \
     -C "/opt/nagios/plugins/check_disk -w 15% -c 10% -p /opt"

This command will put the output into /tmp/out1, similar to the following example:

[1462485600] PROCESS_SERVICE_CHECK_RESULT;ubuntu1;DISK /:DISK CRITICAL...
[1462485600] PROCESS_SERVICE_CHECK_RESULT;ubuntu1;DISK /usr:DISK OK   ...
[1462485600] PROCESS_SERVICE_CHECK_RESULT;ubuntu1;DISK /opt:DISK OK   ...
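Each line written to the output file is an external command. As a rough sketch, assuming the PROCESS_SERVICE_CHECK_RESULT layout from the Nagios external command documentation (timestamp, host name, service description, return code, and plugin output, separated by semicolons), such a line could also be produced by hand. The function name format_passive_result is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: format one passive service check result.
# $1 = UNIX timestamp, $2 = host name, $3 = service description,
# $4 = plugin return code (0-3), $5 = plugin output text
format_passive_result() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}
```

A call such as format_passive_result "$(date +%s)" ubuntu1 "DISK /" 2 "DISK CRITICAL" prints one line of this kind, which could then be appended to the file or pipe given with the -O option.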

As mentioned earlier in this section, it is very common to write a script that is run as an active check and will perform passive checks.

The following is a sample script that runs several tests and reports their results back to Nagios:

 #!/bin/sh

 COMMANDFILE=$1
 HOSTNAME=$2
 HOSTADDRESS=$3
 PLUGINPATH=$4

 # -O names the file (here, the external command pipe) that receives the results
 "$PLUGINPATH/check_by_ssh" -H "$HOSTADDRESS" -t 30 \
     -O "$COMMANDFILE" -n "$HOSTNAME" \
     -s "SWAP:Root Partition:Processes:System Load" \
     -C "$PLUGINPATH/check_swap -w 20% -c 10%" \
     -C "$PLUGINPATH/check_disk -w 20% -c 10% -p /" \
     -C "$PLUGINPATH/check_procs -w 100 -c 200" \
     -C "$PLUGINPATH/check_load -w 5,3,2 -c 10,8,7" \
     || {
         # braces instead of a subshell, so that exit ends the script itself
         echo "BYSSH CRITICAL problem while running SSH"
         exit 2
     }

 echo "BYSSH OK checks launched"
 exit 0

For the remaining part of the section we'll assume that the script is in the /opt/nagios/plugins directory and is called check_linux_services_by_ssh.

The script launches several checks in a single SSH session. If running them fails, it returns a critical result; otherwise, it returns an OK status, and the results of the individual checks are passed as passive check results. We will also need to configure Nagios: both the services that will receive their results as passive checks and the service that will actually schedule the checks.

All the services that are checked via the check_by_ssh command itself have a very similar definition: they accept passive checks and do not have any active checks scheduled.

The following is a sample definition for the SWAP service:

  define service 
  { 
    use                     generic-service 
    host_name               !localhost 
    hostgroup_name          linux-servers 
    service_description     SWAP 
    active_checks_enabled   0 
    passive_checks_enabled  1 
  }

All other services will also need to have a very similar definition.

We might also define a template for such services and only create services that use it. This will make the configuration more readable.

We'll need to define a command definition that will launch the passive check script written earlier:

  define command 
  { 
    command_name    check_linux_services_by_ssh 
    command_line    $USER1$/check_linux_services_by_ssh "$COMMANDFILE$" "$HOSTNAME$" "$HOSTADDRESS$" "$USER1$"
  }

All the parameters that are used by the script are passed directly from the Nagios configuration. This makes reconfiguring paths to Nagios plugins or command pipe easier.

The next step is to define an actual service that will run these checks:

  define service 
  { 
    use                            generic-service 
    host_name                      !localhost 
    hostgroup_name                 linux-servers 
    service_description            Check Services By SSH 
    active_checks_enabled          1 
    passive_checks_enabled         0 
    check_command                  check_linux_services_by_ssh 
    check_interval                 30 
    check_period                   24x7 
    max_check_attempts             1 
    notification_interval          30 
    notification_period            24x7 
    notification_options           c,u,r 
    contact_groups                 linux-admins    
  }

This will cause the checks to be scheduled every 30 minutes. It will also notify the Linux administrators if any problem occurs with the scheduling of the checks.

An alternative approach is to use the cron daemon to schedule the launch of the previous script. In such a case, the Check Services By SSH service is not needed. In this case, scheduling of the checks is not done in Nagios, but we will still need to have the services for which the status will be reported.

In such a case, we need to make sure that cron is running to have up-to-date results for the checks. Such verification can be done by monitoring the daemon using Nagios and the check_procs plugin.

The first thing that needs to be done is to adapt the script to not print out the results in case everything worked fine and hardcode paths to the Nagios files:

#!/bin/sh

COMMANDFILE=/var/nagios/rw/nagios.cmd
PLUGINPATH=/opt/nagios/plugins
HOSTNAME=$1
HOSTADDRESS=$2

# -O names the file (here, the external command pipe) that receives the results
"$PLUGINPATH/check_by_ssh" -H "$HOSTADDRESS" -t 30 \
    -O "$COMMANDFILE" -n "$HOSTNAME" \
    -s "SWAP:Root Partition:Processes:System Load" \
    -C "$PLUGINPATH/check_swap -w 20% -c 10%" \
    -C "$PLUGINPATH/check_disk -w 20% -c 10% -p /" \
    -C "$PLUGINPATH/check_procs -w 100 -c 200" \
    -C "$PLUGINPATH/check_load -w 5,3,2 -c 10,8,7" \
    || {
        # braces instead of a subshell, so that exit ends the script itself
        echo "BYSSH CRITICAL problem while running SSH"
        exit 2
    }
exit 0

The main changes are that the COMMANDFILE and PLUGINPATH variables are hardcoded, as they are no longer passed from Nagios. Also, the script does not print anything on standard output when everything works; this is because cron sends an e-mail with the script output if any is written.

The next step is to add an entry to the nagios user's crontab. This can be done by running the crontab -e command as the nagios user or the crontab -u nagios -e command as the administrator.

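Assuming that the check should be performed every 30 minutes for the host ubuntu1 at 192.168.2.1, as used in the earlier examples, the crontab entry should be as follows. The host name and address are the two arguments that the script expects:

 */30 * * * * /opt/nagios/plugins/check_linux_services_by_ssh ubuntu1 192.168.2.1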

For more details on how an entry in crontab should look, please consult its manual page available at http://linux.die.net/man/5/crontab.

Troubleshooting the SSH-based checks

If you have followed the steps from the previous sections carefully, then everything should be working properly. However, in some cases, performing checks over SSH might not be working properly and troubleshooting needs to be done to understand the root cause of the problem.

The first thing that you should start with is using the check_ssh plugin to make sure that SSH is accepting connections on the host that we are checking. For example, we can run the following command:

root@ubuntu1:~# /opt/nagios/plugins/check_ssh -H 192.168.2.51
SSH OK - OpenSSH_4.7p1 Debian-8ubuntu1.2 (protocol 2.0)

Here, 192.168.2.51 is the IP address of the remote machine we want to monitor. If no SSH server is listening on the remote host, the plugin will report Connection refused, and if the host cannot be reached at all, the result will state No route to host. In these cases, you need to make sure that the SSH server is working and that no routers or firewalls reject communications over SSH, which uses TCP port 22 by default.

Assuming that the SSH server is accepting connections, the next thing that can be checked is whether the SSH key-based authorization works correctly. To do this, switch to the user the Nagios process is running as. Next, try to connect to the remote machine. The following are sample commands to perform this check:

root@ubuntu1:~# su nagios -
$ ssh -v 192.168.2.51

This way you can check the connectivity as the same user that Nagios uses to run checks. You can also analyze the debug messages printed by the SSH client, as described earlier in this chapter.

If the SSH client prompts you for a password, then your keys are not set up properly. It is a common mistake to set up keys on the root account instead of setting them up on the nagios account. If this is the case, then create a new set of keys as the correct user and verify whether these keys are working correctly now.

Assuming this step worked fine, the next thing to be done is checking whether invoking an actual check command produces correct results. For example:

root@ubuntu1:~# su nagios -
$ ssh 192.168.2.51 /opt/nagios/plugins/check_procs
PROCS OK: 51 processes

This way, you verify that the plugin can be run on the remote machine by the same user that Nagios uses to run checks.

The last check is to make sure that the check_by_ssh plugin also returns correct information. For example:

root@ubuntu1:~# su nagios -
$ /opt/nagios/plugins/check_by_ssh -H 192.168.2.51 \
    -C /opt/nagios/plugins/check_procs
PROCS OK: 52 processes

If the last step also worked correctly, it means that all check commands are working correctly.

If you still have issues with the running of the checks, then the next thing you should investigate is if Nagios has been properly configured and whether all commands, hosts, and services are set up in the correct way.
