Nagios is a brilliant monitoring system that I’ve used for years. If you need to use Nagios to monitor remote servers, the usual method is to install a small client, NRPE (Nagios Remote Plugin Executor), on the remote server and use this to execute checks and send the data back to the main Nagios server. However, I recently had a server on which I could not install NRPE.
Fortunately you can execute checks via SSH instead. The first step to getting this working is to enable the Nagios server to access the remote server via SSH without a password. This requires creating an SSH keypair for the Nagios user on the Nagios server:
su nagios
ssh-keygen -t rsa
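Make sure you do not enter a passphrase when prompted. Now you need to copy the id_rsa.pub created to the remote server you need to log in to. Note that the command below overwrites any authorized_keys file that already exists for that user on the remote server, so append the key instead if one is already there:
scp ~/.ssh/id_rsa.pub user@remote1:~/.ssh/authorized_keys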
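For a remote account that already has keys in place, here is a minimal sketch of appending rather than overwriting. It assumes the public key was first copied to some other path on the remote server; the function name append_key and both file paths are hypothetical:

```shell
# Hypothetical helper, run on the remote server: add a public key to an
# authorized_keys file without overwriting keys that are already there.
append_key() {
    key_file=$1    # public key copied over from the Nagios server
    auth_file=$2   # usually ~/.ssh/authorized_keys
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # Append only if this exact key line is not present yet.
    if ! grep -qxF -- "$(cat "$key_file")" "$auth_file"; then
        cat "$key_file" >> "$auth_file"
    fi
    # sshd normally refuses key files that other users can write to.
    chmod 600 "$auth_file"
}
```

It could be called as append_key /tmp/id_rsa.pub ~/.ssh/authorized_keys, and running it twice should leave only one copy of the key.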
You should now be able to SSH to the remote server without providing a password. Obviously you should now make sure your Nagios server is adequately protected, as it can access your other server without needing a password.
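One way to confirm this without risking an interactive prompt is to run ssh in batch mode as the Nagios user. The sketch below is only an illustration: the function name and the SSH_BIN override are hypothetical, and user@remote1 stands in for your own login and host.

```shell
# Hypothetical helper: succeeds only if ssh can log in without prompting.
# SSH_BIN is an assumption added so the ssh binary can be overridden.
can_login_without_password() {
    # BatchMode=yes makes ssh fail instead of asking for a password.
    "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}
```

For example: can_login_without_password user@remote1 && echo "key login works". If this fails, Nagios checks over SSH will fail too, since Nagios cannot answer a password prompt.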
On the central Nagios server, in the commands.cfg configuration file, define the new checks. The example below defines a new check_ssh_load command:
# 'check_ssh_load' command definition
define command {
command_name check_ssh_load
command_line $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "/home/user/bin/check_load -w $ARG1$ -c $ARG2$"
}
This command calls the check_by_ssh plugin on the Nagios server to connect to the specified host (via the $HOSTADDRESS$ macro) and execute /home/user/bin/check_load, which is the check_load plugin, on the remote machine. You will need to adjust the path to match the location of that plugin on the remote server. Also, if you plan to monitor more than one server and the paths or usernames differ between them, you may need to define multiple commands, one for each server (or use macros).
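Before relying on the command definition, it can help to run the same command by hand as the Nagios user. A minimal sketch, assuming the plugins live in /usr/local/nagios/libexec (check what $USER1$ is set to on your system); the function name run_ssh_load_check is hypothetical:

```shell
# Assumption: the directory that $USER1$ points to on the Nagios server.
PLUGIN_DIR=${PLUGIN_DIR:-/usr/local/nagios/libexec}

# Hypothetical helper: run the same command Nagios would build for
# check_ssh_load.
# $1 = host address, $2 = warning thresholds, $3 = critical thresholds
run_ssh_load_check() {
    "$PLUGIN_DIR/check_by_ssh" -H "$1" \
        -C "/home/user/bin/check_load -w $2 -c $3"
}
```

For example: run_ssh_load_check remote1 5.0,4.0,3.0 10.0,6.0,4.0. If the remote path is wrong or the login fails, the error shows up here directly rather than in the Nagios web interface.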
Next, edit services.cfg and add the following:
define service {
use local-service
hostgroup_name ssh-nagios-services
service_description Current Load
check_command check_ssh_load!5.0,4.0,3.0!10.0,6.0,4.0
}
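This defines a new service to execute for hosts in the ssh-nagios-services hostgroup. It calls the defined check_ssh_load command, passing 5.0,4.0,3.0 as $ARG1$ (warning) and 10.0,6.0,4.0 as $ARG2$ (critical). Each set of three values holds the thresholds for the 1, 5 and 15 minute load averages, so the service goes into a warn state if, for example, the 1 minute load average exceeds 5, and into a critical state if it exceeds 10 (adjust to suit, of course).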
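To make the threshold arguments more concrete, here is a rough approximation of how three load averages could be compared against the warning and critical values. This is not check_load's actual source, and the exact comparison may differ between plugin versions; the function name classify_load is hypothetical.

```shell
# Hypothetical helper: compare "load1,load5,load15" against warning and
# critical thresholds given in the same comma-separated form.
# $1 = current loads, $2 = warning thresholds, $3 = critical thresholds
classify_load() {
    awk -v cur="$1" -v warn="$2" -v crit="$3" 'BEGIN {
        split(cur, l, ","); split(warn, w, ","); split(crit, c, ",")
        state = "OK"
        for (i = 1; i <= 3; i++) {
            # Any value over its critical threshold wins outright.
            if (l[i] + 0 > c[i] + 0) { state = "CRITICAL"; break }
            if (l[i] + 0 > w[i] + 0) state = "WARNING"
        }
        print state
    }'
}
```

With the thresholds from the service above, classify_load 5.5,1.0,1.0 5.0,4.0,3.0 10.0,6.0,4.0 prints WARNING, because only the 1 minute value is over its warning threshold.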
Finally, edit hostgroups.cfg to create the ssh-nagios-services hostgroup. Systems added to this hostgroup will automatically begin to use the defined service.
define hostgroup {
hostgroup_name ssh-nagios-services
alias Nagios over SSH
members remote1,remote2
}
Here we define that remote1 and remote2 both belong to this hostgroup. As a result, both will start using the check_ssh_load command.
Using check_by_ssh is a convenient and secure way to execute Nagios plugins on remote servers. When all you can see of the status of a remote server is HTTP or SMTP availability, your view of the server is quite restricted. Being able to see local resource usage can allow you to spot problems, and correct them, before they are visible to users.
