Interested in scaling your Nagios deployment to monitor a large environment? Distributed monitoring may be the solution you’re looking for. We just created a document that describes different methods for configuring a distributed monitoring solution with Nagios Core and Nagios XI.
Monthly Archive for February, 2011
We came across an issue about a month ago where a user was losing data with a distributed/passive checks setup. Upon closer investigation, we found that all of the passive checks were being executed every 5 minutes from servers that were all synced to the same time server. The result? Hundreds of check results were arriving within a few seconds of each other, putting a heavy load on Nagios, while the other 4 minutes and 50 seconds went virtually unused by the server. After some discussion, we decided to make use of a built-in Nagios tool, nagiostats, and create a wizard that could monitor Nagios itself to see how the checks were coming in and being processed. Although multiple checks of this kind have been written in the past, our new wizard lets you quickly create several checks against the nagiostats binary to monitor the monitoring environment itself. We've just released a 1.0 version of this wizard and we're curious to know what users think of it. Feel free to give it a try and send us your feedback!
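To illustrate the burst problem described above, here is a minimal Python sketch of one possible mitigation: giving each remote server a stable delay inside the 5-minute interval so that results do not all arrive in the same few seconds. This is not what the wizard does and not necessarily how the user's setup was fixed. The function name `splay_offset` and the host names are hypothetical, chosen only for illustration.

```python
import hashlib

INTERVAL_SECONDS = 300  # the 5-minute check interval from the scenario above


def splay_offset(host_name: str, interval: int = INTERVAL_SECONDS) -> int:
    """Return a stable delay in [0, interval) derived from the host name.

    Hashing the name (rather than picking a random number on each run)
    keeps every host on the same offset from one cycle to the next.
    """
    digest = hashlib.sha256(host_name.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % interval


if __name__ == "__main__":
    # Hypothetical host names; each one gets its own slot in the interval.
    for name in ("web01", "web02", "db01"):
        print(name, "would wait", splay_offset(name), "seconds before submitting")
```

Each remote server would sleep for its own offset before running and submitting its passive checks, which spreads the incoming results over the whole interval. Whether this fits a given environment depends on how the checks are scheduled there.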
One of the cool things we’ve been working on for Nagios XI 2011 is an alert heatmap that provides a visual representation of alerts over time.
Representing alerts in a visual manner can provide users with a quick understanding of when major events occurred, and which hosts and services have persistent problems.
We’re just about ready to launch Nagios Labs – the place where our team will show you all the cool stuff we’re working on. Stay tuned for more!