Monitoring Apache Cassandra Database Nodes with Nagios XI

As cloud services grow in popularity, so do the networks that provide them. Few distributed databases are as easy to install and configure as Apache Cassandra. Apache Cassandra is an open source distributed database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple data centers, with asynchronous masterless replication allowing low-latency operations for all clients.

Cassandra relies on the Java platform, and as those of you who have tried to configure Java application monitoring most likely know, the experience can be painful. There are a handful of plugins on the Nagios Exchange that attempt to simplify the configuration. Because these plugins rely on the Apache Cassandra utility “nodetool”, you either need to install Cassandra on the Nagios server (which is not recommended) or use an agent (like NRPE) to run the plugin script directly on the Cassandra server (which should already have the nodetool utility).

The Cluster Node Check verifies whether the number of live nodes has dropped to or below a specified threshold and, if so, triggers a warning or critical alert within Nagios.
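
The threshold logic described above can be sketched in a few lines of shell. This is an illustration only, not the plugin's actual code: the function name node_state is hypothetical, and the "at or below" comparison is assumed from the argument descriptions later in this article. The exit codes are the standard Nagios plugin return codes.

```shell
#!/bin/sh
# Hypothetical helper: map a live-node count onto a Nagios state.
# Standard Nagios plugin exit codes: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
node_state() {
    live=$1
    warn=$2
    crit=$3
    # Reject empty or non-numeric input before comparing.
    for n in "$live" "$warn" "$crit"; do
        case "$n" in
            ''|*[!0-9]*) echo "UNKNOWN - non-numeric input"; return 3 ;;
        esac
    done
    # Check the critical threshold first because it is the stricter one.
    if [ "$live" -le "$crit" ]; then
        echo "CRITICAL - Live Node:$live"
        return 2
    elif [ "$live" -le "$warn" ]; then
        echo "WARNING - Live Node:$live"
        return 1
    fi
    echo "OK - Live Node:$live"
    return 0
}

# Example (assumed values): node_state 2 1 0
```

With a warning threshold of 1 and a critical threshold of 0, a two-node cluster reports OK, a single live node reports WARNING, and zero live nodes reports CRITICAL.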

1. Download and install the NRPE agent on the Cassandra server. Follow our Linux agent installation document:
http://assets.nagios.com/downloads/nagiosxi/docs/Installing_The_XI_Linux_Agent.pdf

If you experience issues with the NRPE install, refer to the following troubleshooting document:
http://assets.nagios.com/downloads/nagiosxi/docs/NRPE_Troubleshooting_and_Common_Solutions.pdf

If you are not running CentOS or RHEL on the Cassandra server, you may need to compile NRPE from source:
http://assets.nagios.com/downloads/nagiosxi/docs/Source_Based_NRPE_Installation_and_XI.pdf

2. Download the Plugin:

Once NRPE is installed, you will need to run the following commands from your Cassandra server command line to download the check_cassandra_cluster.sh script.

cd /usr/local/nagios/libexec
wget https://raw.github.com/hashnao/nagios-plugins/master/check_cassandra_cluster.sh
chmod +x check_cassandra_cluster.sh
chown nagios:nagios check_cassandra_cluster.sh
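
Before moving on, it can help to confirm that the plugin is executable and that nodetool can be found. The following is a minimal sketch: plugin_ready is a hypothetical helper, and it assumes nodetool should be on the PATH of the user running the check, which may not match your Cassandra install.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the downloaded plugin.
# Assumption: the plugin finds nodetool through PATH.
plugin_ready() {
    plugin=$1
    if [ ! -x "$plugin" ]; then
        echo "plugin missing or not executable: $plugin" >&2
        return 1
    fi
    if ! command -v nodetool >/dev/null 2>&1; then
        echo "nodetool not found in PATH" >&2
        return 1
    fi
    return 0
}

# Example:
# plugin_ready /usr/local/nagios/libexec/check_cassandra_cluster.sh && echo "ready"
```

Running this as the nagios user gives the most realistic result, since that is the account NRPE normally uses to execute plugins.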

3. Verify check from Cassandra Server Command Line:

It is a good idea to run the plugin locally to verify that it works before moving on to test it from the Nagios server. To do so, execute the following command from the command line on the Cassandra server.

/usr/local/nagios/libexec/check_cassandra_cluster.sh -H localhost -P 7199 -w 1 -c 0

You should see output similar to:

OK - Live Node:2 - 127.0.0.1:Normal,70.97,KB100.00%,940153922094527000 |     Load_127.0.0.1=KB100.00% Owns_127.0.0.1=940153922094527000
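
If you want to reuse the live node count elsewhere (in a wrapper script, for example), it can be pulled out of the status line. This is a minimal sketch that assumes the output format matches the sample above; live_nodes is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: extract the number after "Live Node:" from a
# plugin status line. Prints nothing if the line does not match.
live_nodes() {
    printf '%s\n' "$1" | sed -n 's/^[A-Z]* - Live Node:\([0-9][0-9]*\).*/\1/p'
}

# Example:
# live_nodes "OK - Live Node:2 - 127.0.0.1:Normal,70.97,KB100.00%"
```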

4. Configuring the check in nrpe.cfg:

In order for Nagios to execute a command on a remote server, you need to add the plugin to the nrpe.cfg on the Cassandra server. Edit the /usr/local/nagios/etc/nrpe.cfg file with your favorite text editor by adding the following line at the bottom of the file.

command[check_cassandra_cluster]=/usr/local/nagios/libexec/check_cassandra_cluster.sh $ARG1$

Verify that “dont_blame_nrpe=1” is configured in the nrpe.cfg on the Cassandra server, as we are passing arguments to the server.
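
One way to check the setting from the command line is sketched below. The helper name nrpe_allows_args is hypothetical, and the default path is the one used in this article. Note that NRPE must also have been built with command argument support (--enable-command-args) for this setting to have any effect.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if an uncommented
# "dont_blame_nrpe=1" line is present in the given nrpe.cfg.
nrpe_allows_args() {
    cfg=${1:-/usr/local/nagios/etc/nrpe.cfg}
    grep -Eq '^dont_blame_nrpe=1[[:space:]]*$' "$cfg"
}

# Example:
# nrpe_allows_args && echo "arguments enabled" || echo "arguments disabled"
```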

Restart xinetd on the Cassandra Server (or the nrpe service if you compiled from source) by running the following command.

service xinetd restart

Test the check from the XI server command line. Make sure to replace <Cassandra server ip> with the IP address of your Cassandra server, and replace <ip of Cassandra node to check> with either the same IP address or the IP address of another Cassandra node.

/usr/local/nagios/libexec/check_nrpe -H <Cassandra server ip> -c check_cassandra_cluster -a '-H <ip of Cassandra node to check> -P 7199 -w 1 -c 0'
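
To make the argument passing less mysterious: the quoted string after -a arrives at the Cassandra server as a single argument, which NRPE substitutes for $ARG1$ in the command line defined in nrpe.cfg. The sketch below only simulates that substitution locally; expand_arg1 is a hypothetical helper, not part of NRPE, and the IP address in the example is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: replace the first literal "$ARG1$" in a command
# template with the given argument string, mimicking NRPE's macro expansion.
expand_arg1() {
    awk -v tpl="$1" -v arg="$2" 'BEGIN {
        token = "$ARG1$"
        i = index(tpl, token)              # position of the macro, 0 if absent
        if (i == 0) { print tpl; exit }
        print substr(tpl, 1, i - 1) arg substr(tpl, i + length(token))
    }'
}

# Example (placeholder IP):
# expand_arg1 '/usr/local/nagios/libexec/check_cassandra_cluster.sh $ARG1$' \
#             '-H 10.0.0.5 -P 7199 -w 1 -c 0'
```

The expanded line is the same command you ran locally in step 3, which is why a working local test is a good predictor of a working remote check.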

You should see output similar to:

OK - Live Node:2 - 127.0.0.1:Normal,71.54,KB100.00%,-9165324447555808428 127.0.0.1:Normal,71.54,KB100.00%

5. Add the check_cassandra_cluster command to XI:

In XI, go to Configure -> Core Config Manager -> Commands. Click “Add New”.

Enter “check_cassandra_cluster” for the Command Name.

For the Command Line enter:

$USER1$/check_nrpe -H $HOSTNAME$ -c check_cassandra_cluster -a '-H $ARG1$ -P $ARG2$ -w $ARG3$ -c $ARG4$'

Save your changes and click “Apply Configuration”.

6. Create a Host in XI for your Cassandra Server:

You will need to set up the Cassandra server as a Host in Nagios XI if you have not done so already. To do so, use the following steps.

In XI, go to Configure -> Run the Monitoring Wizard.

Select a Linux Server and enter the IP address of your Cassandra server and distribution. Select Next.

Select any services you wish to monitor and select Next. (Note: you do not need to download the agent, as that was already done in step 1 above.)

Set your Monitoring settings and click Finished.

7. Create a service check in XI:

In XI, go to Configure -> Core Config Manager -> Services and click “Add New”.

Enter a name for the check and select check_cassandra_cluster from the Check Command drop-down.

Configure the arguments:

$ARG1$: The IP address of the Cassandra node to check
$ARG2$: The Port that the Cassandra node is listening on (default is 7199)
$ARG3$: Warning threshold – Integer; WARNING is reported when the number of live nodes is at or below this value
$ARG4$: Critical threshold – Integer; CRITICAL is reported when the number of live nodes is at or below this value (must be less than $ARG3$)
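
Since the critical threshold must be lower than the warning threshold, it can be worth sanity-checking the four values before entering them in XI. The following is a minimal sketch; build_check_args is a hypothetical helper and the IP address in the example is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: validate the four service arguments and print the
# option string that the check command would pass to the plugin.
build_check_args() {
    host=$1
    port=$2
    warn=$3
    crit=$4
    if [ -z "$host" ]; then
        echo "missing host" >&2
        return 1
    fi
    # Port and both thresholds must be non-negative integers.
    for n in "$port" "$warn" "$crit"; do
        case "$n" in
            ''|*[!0-9]*) echo "not a non-negative integer: '$n'" >&2; return 1 ;;
        esac
    done
    # The critical threshold has to be strictly below the warning threshold.
    if [ "$crit" -ge "$warn" ]; then
        echo "critical ($crit) must be less than warning ($warn)" >&2
        return 1
    fi
    printf '%s\n' "-H $host -P $port -w $warn -c $crit"
}

# Example (placeholder IP):
# build_check_args 10.0.0.5 7199 1 0
```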

Add the Cassandra server to the check through the “Manage Hosts” button.

Continue configuring the service object as you normally would using templates, check and alert settings, etc.

Save and Apply Configuration.

The check should now be active and working.

The full documentation can be found below:

Monitoring Apache Cassandra Databases with Nagios XI

If you are unfamiliar with Nagios XI, you can download the fully functional Free 60 Day Trial.
