Archive for the 'Performance' Category

Nagios V-Shell 1.9 Released


Nagios V-Shell 1.9 includes major performance updates and a re-implementation of PHP caching that should decrease V-Shell page load times by 40-75%.  I ran some benchmarks on a test system (a dual-core desktop with 4GB of RAM) with 1800 hosts and 7200 services.  This system runs with an average CPU load of 2.0-6.0 throughout the day, so the hardware is already being pushed pretty hard by the check load.

V-Shell 1.8 produced page load times of 18-28 seconds throughout the interface without APC caching enabled.  Needless to say, this is problematic for many users with larger environments.  The Core CGIs loaded in 2-11 seconds, with the service status page taking around 9-11 seconds to load all of the data.

My goal for 1.9 was to minimize any unnecessary processing and to optimize any functions that were inefficient or used slower PHP built-in functions.  The differences in 1.9 are substantial.  Without any caching enabled at all, I was able to decrease the average page load time to 9-14 seconds, which is 40-50% faster by itself.

Once I had the code optimized, I reworked the APC caching functionality.  If a user has PHP’s APC caching packages installed and enabled on their web server, V-Shell will cache the objects.cache file until it detects a change in the file, while the data in the status.dat file will be cached based on a TTL (time to live) config option that is new in 1.9.  Once the data is cached in APC, page load times throughout the interface averaged 4-5 seconds for all pages, which is a 75% decrease in load time on average.
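To make the two caching rules concrete, here is a minimal sketch in Python rather than V-Shell's PHP. It is not the V-Shell code: a plain dictionary stands in for APC, and the class name `FileCache`, the `parse` callback, and the `ttl` argument are all hypothetical names chosen for illustration. One method reparses a file only when its modification time changes (the objects.cache rule), and the other reparses only after a time-to-live has elapsed (the status.dat rule).

```python
import os
import time


class FileCache:
    """In-memory stand-in for APC; an illustration, not V-Shell's implementation."""

    def __init__(self, clock=time.time):
        self._store = {}      # (rule, path) -> (stamp, parsed_data)
        self._clock = clock   # injectable so the TTL rule is easy to test

    def get_until_changed(self, path, parse):
        """objects.cache rule: reparse only when the file's mtime changes."""
        mtime = os.stat(path).st_mtime_ns
        entry = self._store.get(("mtime", path))
        if entry is not None and entry[0] == mtime:
            return entry[1]                      # file unchanged: cache hit
        data = parse(path)
        self._store[("mtime", path)] = (mtime, data)
        return data

    def get_with_ttl(self, path, parse, ttl):
        """status.dat rule: reparse only after `ttl` seconds have elapsed."""
        now = self._clock()
        entry = self._store.get(("ttl", path))
        if entry is not None and now - entry[0] < ttl:
            return entry[1]                      # still fresh: cache hit
        data = parse(path)
        self._store[("ttl", path)] = (now, data)
        return data
```

The split reflects how the two files behave: an object definition file changes rarely, so checking its timestamp is enough, while a status file is rewritten constantly, so a short TTL trades a little staleness for far fewer parses.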

My goal for the next version of V-Shell is to add support for MK Livestatus and NDOUtils as backend data sources, which will eliminate the need to parse the objects.cache and status.dat files on systems with those backends.  This should further improve performance for larger installations.

Download Nagios V-Shell 1.9

CHANGELOG

 

Nagios Performance Tuning – Tech Tips: Understanding Disk I/O

We often get questions about the hardware requirements for a particular Nagios installation.  As covered in a previous article, this is often a very difficult question to answer since monitoring environments differ so much.  Most people assume that for a large Nagios installation, it’s simply a matter of adding enough CPUs to the machine to handle its workload.  Although having enough CPU power is important, I’ve found that it’s ultimately not the biggest hardware limitation on the system.  A large Nagios installation creates an enormous amount of disk activity, and if the hard disk can’t keep up with the constant traffic, all of those precious CPUs are simply going to wait in line to do what they need to do.  I’ve talked to some users who have spent serious money on insanely fast disks to handle their workload, but I wanted to do some experiments in-house for users who need better performance on a budget.  I want to give special thanks to Nagios community members Dan Wittenberg and Max Schubert for documenting some of the tricks that they pioneered on this topic.


Nagios XI Graph Explorer Component Released

My brother (a fellow programmer) once told me, “the solution is easy once you know what it is.”  That’s been the case for the finishing touches needed to finally release a component that I’ve been excited about for a long time: The Nagios XI Graph Explorer.  This component uses a JavaScript visualization library and allows users to easily zoom graphs, select custom time frames, and even stack time periods on top of each other to compare performance from one time period to the next.  If you like data visualization, you’ll love this tool.  The component is currently available to Nagios XI customers only, from the Nagios XI Customer Downloads page.  I recommend using it with Firefox for maximum reliability.  Special thanks to Nicholas Scott for accidentally pointing out the solution to the problem that’s been in front of my face the whole time ; )

 

Helping MySQL Move Out And Find Its Own Server

Anybody keeping tabs on the performance of their Nagios XI server knows that mysqld, httpd and nagios all play an intense game of king-of-the-CPU.  The cool thing about Nagios XI is that it comes with NDOUtils out of the box, which is a great tool for offloading the MySQL server and comes in handy if you need to stack on more checks.  If you run a Nagios XI server that is completely loaded down and have another server that could host MySQL for it, this PDF is definitely worth a read.  The attached PDF is a step-by-step guide to migrating your existing MySQL server to a remote MySQL server, and it is an interesting look at just how extensible Nagios XI is.

Offloading MySQL to a Remote Server

Nagios Visualization Toolkit (Under Construction)

In the past months we’ve had several requests for better control and time specifications for Nagios performance graphs.  Being a big fan of fancy visualizations, I’ve been staring at the old PNP graphs for a while and wondering if there’s a way we can create graphs that look like they’re actually from this decade.  After reviewing several different visualization libraries, we decided to take a stab at developing some new tools with graphing libraries from Highcharts.  Although some of the fine details are still being polished, our first prototype has us pretty excited about where this project is headed.


jQuery Performance Graphs in XI

Our first prototype is a zoomable performance graph that allows you to specify start/stop times and then dynamically zoom the graph all the way down to a 5-minute interval for closer examination.  Although these graphs are rendered client-side, they can all be exported as PNG, PDF, JPG, or SVG images for use in external reporting or presentations.  Let us know what you think!

Distributed Monitoring Solutions For Nagios

Interested in scaling your Nagios deployment to monitor a large environment?  Distributed monitoring may be the solution you’re looking for.  We just created a document that describes different methods for configuring a distributed monitoring solution with Nagios Core and Nagios XI.

Distributed_Monitoring_Solutions.pdf

Analyze Nagios Performance With The Nagiostats Wizard

We came across an issue about a month ago where a user was losing data with a distributed/passive checks setup.  On closer investigation we found that all of the passive checks were being executed every 5 minutes from servers that were all synced to the same time server.  The result?  Hundreds of checks were all coming in within a few seconds, putting a heavy load on Nagios, while the other 4 minutes and 50 seconds were going virtually unused by the server.  After some discussion we decided to make use of a built-in Nagios tool, nagiostats, and create a wizard that could monitor Nagios itself to see how the checks were coming in and being processed.  Although multiple checks have been written for this in the past, our new wizard allows you to quickly create several checks against the nagiostats binary to monitor the monitoring environment itself.  We’ve just released a 1.0 version of this wizard and we’re curious to know what users think of it.  Feel free to give it a try and send us your feedback!
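The clustering problem described above can be illustrated with a short Python sketch. It does not read nagiostats output and is not part of the wizard: the arrival timestamps are made-up data, and the function name `busiest_bin_share` and its parameters are hypothetical. It simply divides each 5-minute cycle into fixed 10-second bins and reports what fraction of all check results landed in the fullest bin.

```python
from collections import Counter


def busiest_bin_share(arrivals, period=300, bin_width=10):
    """Return the fraction of arrivals that fall in the fullest bin of the cycle."""
    if not arrivals:
        return 0.0
    # Fold every timestamp into its offset within the cycle, then bin it.
    bins = Counter(int((t % period) // bin_width) for t in arrivals)
    return max(bins.values()) / len(arrivals)


# Hypothetical data: 200 results that all land in the first 8 seconds of each
# 5-minute cycle, versus the same 200 results spread evenly over the cycle.
clustered = [cycle * 300 + (i % 8) for cycle in range(2) for i in range(100)]
spread = [cycle * 300 + i * 3 for cycle in range(2) for i in range(100)]

print(busiest_bin_share(clustered))  # every result shares one 10-second bin
print(busiest_bin_share(spread))     # results are distributed across the cycle
```

A share close to 1.0 means nearly everything arrives at once, which is the pattern that overloaded the server in this case; a share close to `bin_width / period` means the load is spread evenly.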

Nagiostats Wizard on Exchange

Wizard Preview


Graphs from the Nagiostats Wizard