Remotely monitor server with Nagios using check_by_ssh

Nagios is a brilliant monitoring system that I’ve used for years. If you need to use Nagios to monitor remote servers, the usual method is to install a small agent, NRPE (Nagios Remote Plugin Executor), on the remote server and use it to execute checks and send the data back to the main Nagios server. However, I recently had a server on which I could not install NRPE.

Fortunately, you can execute checks via SSH instead. The first step to getting this working is to enable the Nagios server to access the remote server via SSH without a password. This requires creating an SSH keypair for the Nagios user on the Nagios server:

su nagios
ssh-keygen -t rsa

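Make sure you do not enter a passphrase when prompted. Now copy the id_rsa.pub file you just created to the remote server you need to log in to. Append it to authorized_keys rather than copying over that file, so any keys already there are preserved (replace user@remote.host with your own login and host):

ssh-copy-id -i ~/.ssh/id_rsa.pub user@remote.host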

You should now be able to SSH to the remote server without providing a password. Make sure your Nagios server is adequately protected, as it can now access your other server without needing a password.
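Before wiring anything into Nagios, it can help to confirm that the login really is non-interactive. The sketch below is a hypothetical helper (the function name and the target are placeholders, not part of Nagios). It relies on the standard OpenSSH BatchMode option, which makes ssh fail rather than prompt for a password.

```shell
# Hypothetical helper: succeeds only if key-based login works without a prompt.
# BatchMode=yes tells ssh to fail instead of asking for a password.
check_passwordless_ssh() {
    target="$1"   # placeholder, e.g. user@remote.host
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$target" true 2>/dev/null; then
        echo "OK: key-based login to $target works"
    else
        echo "FAIL: $target still requires a password or is unreachable"
        return 1
    fi
}

# Usage (run as the nagios user on the Nagios server):
#   check_passwordless_ssh user@remote.host
```

Run it as the same user that Nagios runs as, since that is the account whose key was copied across.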

On the central Nagios server, in the commands.cfg configuration file, define the new checks. The example below defines a new check_ssh_load command:

# 'check_ssh_load' command definition
define command {
        command_name    check_ssh_load
        command_line    $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "/home/user/bin/check_load -w $ARG1$ -c $ARG2$"
}

This command calls the check_by_ssh plugin to connect to the specified host (via the $HOSTADDRESS$ macro) and execute /home/user/bin/check_load, which is the check_load plugin, on the remote machine. You will need to adjust the path to match the location of that plugin on the remote server. Also, if paths or usernames differ between remote servers and you plan to monitor more than one, you may need to define multiple commands, one for each server (or use macros).
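To see what Nagios ends up running, it can be useful to expand the macros by hand and try the resulting command from a shell as the nagios user. The function below is only an illustrative sketch: its name is made up, and the plugin directory and address in the usage line are assumptions (the real value of $USER1$ is set in your resource.cfg).

```shell
# Print the command line that results from substituting the macros
# used in the check_ssh_load definition. Values are placeholders.
expand_check_ssh_load() {
    user1="$1"        # $USER1$: plugin directory on the Nagios server
    hostaddress="$2"  # $HOSTADDRESS$: address of the host being checked
    arg1="$3"         # $ARG1$: warning thresholds
    arg2="$4"         # $ARG2$: critical thresholds
    printf '%s/check_by_ssh -H %s -C "/home/user/bin/check_load -w %s -c %s"\n' \
        "$user1" "$hostaddress" "$arg1" "$arg2"
}

# Example (plugin directory and address are assumptions):
#   expand_check_ssh_load /usr/local/nagios/libexec 192.0.2.10 5.0,4.0,3.0 10.0,6.0,4.0
```

Pasting the printed line into a shell on the Nagios server is a quick way to check the remote plugin path before Nagios schedules the check.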

Next, edit services.cfg and add the following:

define service {
       use                             local-service
       hostgroup_name                  ssh-nagios-services
       service_description             Current Load
       check_command                   check_ssh_load!5.0,4.0,3.0!10.0,6.0,4.0
}

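This defines a new service to execute for hosts in the ssh-nagios-services hostgroup. It calls the check_ssh_load command defined above, passing 5.0,4.0,3.0 as $ARG1$ (warning) and 10.0,6.0,4.0 as $ARG2$ (critical). Each set holds thresholds for the 1, 5 and 15 minute load averages, so the service goes into a warning state when, for example, the 1 minute load average passes 5, and into a critical state when it passes 10 (adjust to suit, of course).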
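As a rough illustration of how the check_command value is interpreted, the sketch below splits it on "!" into the command name and its arguments, then compares a set of load averages against the two threshold triples. Both function names are hypothetical, and the comparison only approximates what check_load does (it assumes a threshold is breached when a load average exceeds it), so treat it as a thinking aid rather than the plugin's actual logic.

```shell
# Split a check_command value on "!" into the command name, ARG1 and ARG2.
split_check_command() {
    old_ifs=$IFS
    IFS='!'
    set -- $1          # unquoted on purpose: split on the "!" separator
    IFS=$old_ifs
    echo "command=$1 ARG1=$2 ARG2=$3"
}

# Approximate the threshold logic: compare the 1, 5 and 15 minute load
# averages pairwise against the warning and critical triples.
# $1 = current loads, $2 = warning triple, $3 = critical triple
classify_load() {
    awk -v l="$1" -v w="$2" -v c="$3" 'BEGIN {
        split(l, la, ","); split(w, wa, ","); split(c, ca, ",")
        state = "OK"
        for (i = 1; i <= 3; i++) {
            if (la[i] + 0 > ca[i] + 0) { state = "CRITICAL"; break }
            if (la[i] + 0 > wa[i] + 0) { state = "WARNING" }
        }
        print state
    }'
}

# Usage:
#   split_check_command 'check_ssh_load!5.0,4.0,3.0!10.0,6.0,4.0'
#   classify_load 2.0,4.5,2.0 5.0,4.0,3.0 10.0,6.0,4.0
```

In the second usage line the 5 minute average (4.5) is above its warning threshold (4.0), which is why a single triple position is enough to change the state.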

Finally, edit hostgroups.cfg to create the ssh-nagios-services hostgroup. Systems added to this hostgroup will automatically begin to use the defined service.

define hostgroup {
        hostgroup_name  ssh-nagios-services
        alias           Nagios over SSH
        members         remote1,remote2
}

Here we define that remote1 and remote2 both belong to this hostgroup. As a result, both will start using the check_ssh_load command.
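After editing the three files, it is worth validating the configuration before reloading Nagios. The sketch below wraps the standard nagios -v pre-flight check in a hypothetical helper. The default paths are assumptions based on a typical source install and will likely differ on packaged installs.

```shell
# Run the Nagios pre-flight check (-v) against the main config file.
# Both default paths are assumptions; adjust them for your system.
verify_nagios_config() {
    nagios_bin="${1:-/usr/local/nagios/bin/nagios}"
    main_cfg="${2:-/usr/local/nagios/etc/nagios.cfg}"
    if "$nagios_bin" -v "$main_cfg" >/dev/null; then
        echo "Config OK: safe to reload Nagios"
    else
        echo "Config errors found: re-run '$nagios_bin -v $main_cfg' for details"
        return 1
    fi
}

# Usage:
#   verify_nagios_config
#   verify_nagios_config /usr/sbin/nagios /etc/nagios/nagios.cfg
```

A typo in a hostgroup member name or a command definition shows up here rather than as a failed restart.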

Using check_by_ssh is a convenient and secure way to execute Nagios plugins on remote servers. When all you can see of the status of a remote server is HTTP or SMTP availability, your view of the server is quite restricted. Being able to see local resource usage can allow you to spot problems, and correct them, before they are visible to users.
