Tuesday, 5 August 2014

Monitoring XenServer using Nagios

This tutorial discusses the ways in which you can effectively monitor your XenServers using Nagios.

Before we begin, do note that the steps shown here continue my earlier series of tutorials on Nagios, so I am assuming the following are already in place:

  • Nagios Monitoring System: Installed and configured (Refer steps HERE)
  • XenServer: Installed, configured and located on the same network as that of the Nagios Server. Here, I am using a XenServer v6.2 ( as my client.

NOTE: We are going to monitor our XenServers using two methods:

1) Using an agent (NRPE)
The NRPE add-on is designed to allow you to execute Nagios plugins on remote Linux/Unix machines. The main reason for doing this is to allow Nagios to monitor "local" resources (like CPU load, memory usage, etc.) on remote machines.
2) Agentless Monitoring (SNMP)
Agentless technologies like SNMP let IT administrators deploy monitoring without installing agent software on each monitored system. This reduces deployment time and administrative overhead, and keeps administration and configuration centralized.

1) Using an agent (NRPE)
Just to understand what NRPE actually is and how it works, check out this small excerpt from NRPE's Design overview Doc:

NRPE addon consists of two pieces:
– The check_nrpe plugin, which resides on the local monitoring machine
– The NRPE daemon, which runs on the remote Linux/Unix machine

When Nagios needs to monitor a resource or service from a remote Linux/Unix machine:
– Nagios will execute the check_nrpe plugin and tell it what service needs to be checked
– The check_nrpe plugin contacts the NRPE daemon on the remote host over an (optionally) SSL-protected connection
– The NRPE daemon runs the appropriate Nagios plugin to check the service or resource
– The results from the service check are passed from the NRPE daemon back to the check_nrpe plugin, which then returns the check results to the Nagios process.
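
What travels back along this path is, by Nagios plugin convention, a line of text plus an exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). As a rough sketch, the mapping looks like this. Note that nagios_state_name is a helper name I made up for illustration; it is not part of Nagios or NRPE.

```shell
#!/bin/bash
# Map a Nagios plugin exit code to its state name.
# nagios_state_name is a hypothetical helper, not part of Nagios or NRPE.
nagios_state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Usage: run any plugin, then translate its exit code, for example:
#   ./check_nrpe -H <HOSTADDRESS> -c check_load; nagios_state_name $?
```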

NOTE: The NRPE daemon requires that Nagios plugins be installed on the remote Linux/Unix host. Without these, the daemon wouldn't be able to monitor anything. Hence we need to install NRPE as well as the Nagios plugins on both the Nagios server and the client that we will be monitoring.

a) Configuring Client
In the following steps, I'll be first installing Nagios Plugins as well as NRPE on my XenServer (, then configuring NRPE to accept connections with my Nagios Server (

To begin with, log in to the XenServer using SSH. Run the following command to download the EPEL repository package on your XenServer:

# wget http://dl.fedoraproject.org/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm

Once downloaded, install the EPEL RPM:

# rpm -ivh epel-release-5-4.noarch.rpm

Next, switch the EPEL repository off by default, so that it is only used when explicitly requested with --enablerepo (as in the install commands that follow):

# sed -i 's/enabled=1/enabled=0/g' /etc/yum.repos.d/epel.repo

Install NRPE:

# yum install --enablerepo=epel nrpe

Next, we install some Nagios plugins on our XenServer. Remember, NRPE requires the plugins to be present on both the Nagios Server and the client (in this case, the XenServer).

# yum install --enablerepo=epel nagios-plugins-users nagios-plugins-disk nagios-plugins-swap nagios-plugins-procs nagios-plugins-load

The downloaded plugins are placed under the /usr/lib/nagios/plugins/ folder.

Once the installation finishes, we need to edit the nrpe.cfg file on our client and add the Nagios Server IP to the allowed_hosts attribute:

# vi /etc/nagios/nrpe.cfg

# Add the Nagios Server IP address ( Remember that the IPs are comma separated.

Save the file and exit the editor.
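
If you would rather make this change without opening an editor, something along these lines should do it. This is only a sketch: add_allowed_host is a helper name of my own, 192.0.2.10 is a documentation-only placeholder address (use your real Nagios Server IP), and it assumes the file already has an allowed_hosts= line with at least one address in it.

```shell
#!/bin/bash
# add_allowed_host FILE IP
# Append IP to the comma-separated allowed_hosts line of an nrpe.cfg-style
# file, unless it is already listed. Hypothetical helper for illustration.
add_allowed_host() {
    local file="$1" ip="$2"
    # Split the current value on commas and look for an exact match.
    if grep '^allowed_hosts=' "$file" | cut -d= -f2 | tr ',' '\n' | grep -Fxq "$ip"; then
        return 0
    fi
    # "&" re-inserts the matched line, so the new IP is added at the end.
    sed -i "s/^allowed_hosts=.*/&,${ip}/" "$file"
}

# Usage (placeholder address):
#   add_allowed_host /etc/nagios/nrpe.cfg 192.0.2.10
```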

Start the NRPE service and enable it at boot:
# service nrpe start

# chkconfig nrpe on

Next, add the following line to the iptables config just before the REJECT line at the bottom of the file. This allows your Nagios server to connect to the NRPE daemon on your XenServer (TCP port 5666).

# vi /etc/sysconfig/iptables

-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 5666 -j ACCEPT
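
The "just before the REJECT line" placement matters because iptables evaluates rules top to bottom. If you want to script the edit, here is a minimal sketch. insert_before_reject is a hypothetical helper of mine; it only prints the modified rules to stdout and leaves the original file alone, so you can review the result before replacing anything.

```shell
#!/bin/bash
# insert_before_reject FILE RULE
# Print FILE with RULE inserted just before the first line containing
# "-j REJECT". Hypothetical helper; FILE itself is not modified.
insert_before_reject() {
    awk -v rule="$2" '
        !inserted && /-j REJECT/ { print rule; inserted = 1 }
        { print }
    ' "$1"
}

# Usage: write the result to a scratch file and inspect it first.
#   insert_before_reject /etc/sysconfig/iptables \
#     "-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 5666 -j ACCEPT" \
#     > /tmp/iptables.new
```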

Restart the Firewall for the changes to take effect:

# service iptables restart

b) Configure Nagios Server
Well, once you're done with your client setup, there is actually nothing more to configure on the Nagios Server side. You can test NRPE by executing a check command through check_nrpe as shown in the following sections.

c) Configuring Plugins
This section shows you how to use NRPE along with a few Nagios plugins to monitor remote hosts and clients.

1) check_load

This plugin checks the system's current load average.

Syntax for use with nrpe: 

check_nrpe -H <HOSTADDRESS> -c check_load 

# ./check_nrpe -H -c check_load 

Since we have not provided any arguments to this example, it will use the default check_load arguments specified in the nrpe.cfg file. 

Those defaults generate a WARNING status if the load average exceeds the values 15 for Load1, 10 for Load5 and 5 for Load15. Similarly, they generate a CRITICAL status if the load average exceeds the values 30 for Load1, 25 for Load5 and 20 for Load15.
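
The threshold rule described above can be sketched as a small shell function. This is my own illustration of the rule (any one of the three averages crossing its limit is enough), not the check_load plugin's actual code, and load_state is a made-up name.

```shell
#!/bin/bash
# load_state "L1 L5 L15" "W1,W5,W15" "C1,C5,C15"
# Classify three load averages against warning and critical triples.
# Illustrative re-implementation of the rule, not the check_load plugin.
load_state() {
    awk -v loads="$1" -v warn="$2" -v crit="$3" 'BEGIN {
        split(loads, l, " "); split(warn, w, ","); split(crit, c, ",")
        state = "OK"
        for (i = 1; i <= 3; i++) {
            # "+ 0" forces a numeric comparison instead of a string one.
            if (l[i] + 0 > c[i] + 0) { state = "CRITICAL"; break }
            if (l[i] + 0 > w[i] + 0) state = "WARNING"
        }
        print state
    }'
}

# Usage on Linux, with the thresholds quoted above:
#   load_state "$(cut -d' ' -f1-3 /proc/loadavg)" 15,10,5 30,25,20
```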

To monitor the Client using the Nagios UI, simply add the Host Definition first to the clients.cfg file:

define host {
    use                     linux-server
    host_name               xenserver01
    alias                   xenserver01
    max_check_attempts      5
    check_period            24x7
    notification_interval   10
    notification_period     24x7
}

Define the corresponding Service as well:

define service {
    use                     generic-service
    host_name               xenserver01
    service_description     Current Load of System
    check_command           check_nrpe!check_load
}

Save and exit the editor.

Restart the Nagios service and check the Nagios UI. You should see the new service reporting its status for xenserver01.

2) check_disk
This plugin checks the Disk utilization for a particular host.

Syntax for use with nrpe: 

check_nrpe -H <HOSTADDRESS> -c check_disk

# ./check_nrpe -H -c check_disk

Now at this stage I got the following error: "NRPE: Command 'check_disk' not defined". A little bit of Googling and some head scratching later, the problem was evident.

Well, the error itself tells me what I am doing wrong or missing: in this case, a command definition.

The check_nrpe plugin on the Nagios Server asks the NRPE daemon on your client to run the command (check_disk), and the daemon looks that command up in its nrpe.cfg file. To my surprise, it's not defined there. In fact, there is a command called check_hda1 hardcoded instead!!
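
A quick way to catch this before Nagios does is to check the client's nrpe.cfg for the definition yourself. Here is a small sketch; nrpe_command_defined is a helper name I invented, and it only looks for uncommented lines of the form command[NAME]=... in the file you point it at.

```shell
#!/bin/bash
# nrpe_command_defined FILE NAME
# Succeed if FILE contains an uncommented "command[NAME]=" definition.
# Hypothetical helper that mimics the lookup described above.
nrpe_command_defined() {
    grep -q "^command\[$2]=" "$1"
}

# Usage on the client:
#   nrpe_command_defined /etc/nagios/nrpe.cfg check_disk || echo "check_disk is not defined"
# List every command the client currently exposes:
#   grep '^command\[' /etc/nagios/nrpe.cfg
```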

No issues.. all I did was comment out the check_hda1 command and write another one, this time defining check_disk. The command that I wrote raises a WARNING if the free space on the root partition (/) falls below 20% and a CRITICAL alert if it falls below 10%.
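
For reference, here is a sketch of what such a definition and its threshold logic look like. The nrpe.cfg line in the comment is my reconstruction using check_disk's standard -w, -c and -p options, so treat it as an assumption rather than the exact line from my file. disk_state is a made-up helper that only illustrates the rule.

```shell
#!/bin/bash
# A matching nrpe.cfg definition would look roughly like this (reconstruction):
#   command[check_disk]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /

# disk_state FREE_PCT WARN_PCT CRIT_PCT
# WARNING when free space drops below WARN_PCT, CRITICAL below CRIT_PCT.
# Hypothetical helper; it illustrates the rule, it is not the plugin.
disk_state() {
    local free="$1" warn="$2" crit="$3"
    if [ "$free" -lt "$crit" ]; then
        echo "CRITICAL"
    elif [ "$free" -lt "$warn" ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}

# Usage: free percentage of / is 100 minus the Use% column of df.
#   used=$(df -P / | awk 'NR==2 { sub("%", "", $5); print $5 }')
#   disk_state $((100 - used)) 20 10
```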

NOTE: You can similarly add the Command Definitions of various other Plugins here as per your choice. Just remember that the Plugins that you define here have to be present on the Client as well.

Save the nrpe.cfg file and exit the editor. 

Restart NRPE Service for the changes to take effect:

# service nrpe restart

Execute the check_disk command once again and voila!! This time it shows that my root partition (/) is OK.

# ./check_nrpe -H -c check_disk

You can even define the corresponding Service for the Plugin in the clients.cfg file as shown:

define service {
    use                     generic-service
    host_name               xenserver01
    service_description     Current root Partition
    check_command           check_nrpe!check_disk
}

Save the file and quit the editor. Also, always remember to restart the Nagios service after making any changes to its config files.

Restart the Nagios service and check the Nagios UI. You should see the new service reporting the state of the root partition.

2) Agentless Monitoring (SNMP)
If agents and NRPE are not your thing and you still want to monitor your remote hosts and clients, then SNMP is a very viable and simple option. Simply enable SNMP on your client, give it a unique SNMP string (a.k.a. community string) and there you have it!!

You can monitor your XenServer over SNMP using the following steps:

a) Configure Client
First, enable SNMP on XenServer and provide a suitable community string (here, I'm using my-xen-community):

# vi /etc/snmp/snmpd.conf
Replace the community with your current SNMP community string if you have one, or create a new one as shown:

# sec.name source community
com2sec notConfigUser default my-xen-community

Next, edit the firewall rules to allow SNMP:

# vi /etc/sysconfig/iptables

Add the following lines AFTER the line "-A RH-Firewall-1-INPUT -p udp --dport 5353 -d -j ACCEPT":

-A RH-Firewall-1-INPUT -p udp --dport 161 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp --dport 123 -j ACCEPT

Restart the Firewall:

# service iptables restart

Start the SNMP Service as shown:

# service snmpd start

# chkconfig snmpd on

OPTIONAL: You can use snmpwalk to test whether your SNMP is actually configured correctly or not. You will need to have snmpwalk installed on a remote system on the same network as that of your XenServer. Once installed, simply run the following command in the remote system:

Syntax: snmpwalk -v 2c -c <SNMP-STRING> <XENSERVER-IP>

# snmpwalk -v 2c -c my-xen-community

You should see a lot of SNMP messages pop up on your screen. If yes, then your SNMP is working well. If not, then it will probably throw you some error.
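
If you only want a yes/no answer rather than pages of output, you can wrap the same snmpwalk call and look at its exit status. A minimal sketch follows. snmp_reachable is a hypothetical wrapper of mine, it needs snmpwalk (from net-snmp) on the machine where you run it, and you pass your own community string and XenServer IP.

```shell
#!/bin/bash
# snmp_reachable COMMUNITY HOST
# Run the same snmpwalk as above, discard the output, and report the result.
# Hypothetical wrapper; requires snmpwalk on the machine running it.
snmp_reachable() {
    if snmpwalk -v 2c -c "$1" "$2" >/dev/null 2>&1; then
        echo "SNMP OK on $2"
    else
        echo "SNMP FAILED on $2"
        return 1
    fi
}

# Usage:
#   snmp_reachable my-xen-community <XENSERVER-IP>
```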

You can define SNMP Service in the clients.cfg file as shown:

define service {
    use                     generic-service
    host_name               xenserver01
    service_description     XenServer Uptime
    check_command           check_snmp! -C my-xen-community -o
}

Save the file and exit the editor.

Restart the Nagios service and check the Nagios UI. You should see the new service reporting the XenServer's uptime.

And there you have it, simple and easy steps to monitor your XenServers. So stay tuned as in my next post, I'll be covering Monitoring OpenFiler with Nagios!!

1 comment :

Mahesh Kavalla said...

Thank you for this useful post. Do you have a list of important OIDs that I can use for monitoring Citrix Xenserver.
