Historical Performance Data

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
dsdonut
Posts: 32
Joined: Mon Mar 21, 2011 1:21 pm

Historical Performance Data

Post by dsdonut »

I've noticed that some of the built-in (out of the box) checks will keep historical data, and display that data in the form of a graph. CPU stats, mem_info, etc, have this data.

Is this a function of the script itself, or is this a function of Nagios? For certain things (CPU, memory, disk space, and so on) I need this data and I need these graphs. I was unable to find a script in the Nagios library that monitors the usage of individual CPU cores. All of the ones I could find just take an average across all available cores, so I wrote my own script to check each core individually. This check does not provide a graph showing historical data. Is there a way I can get that?

Also, by default how far back does historical data go?
nscott
Posts: 1040
Joined: Wed May 11, 2011 8:54 am

Re: Historical Performance Data

Post by nscott »

dsdonut,

Nagios XI handles a lot of that logic. You'll simply need to make a command in Nagios XI that does what you want, and then you'll need to make a template for it (use the other templates in /usr/local/nagios/share/pnp/templates as a "template" :) ) and name it the same as the plugin you wrote. Nagios XI should create the graph for you from there.

Also, the default RRD size is 1 year.
Nicholas Scott
Former Nagios employee
dsdonut
Posts: 32
Joined: Mon Mar 21, 2011 1:21 pm

Re: Historical Performance Data

Post by dsdonut »

I'm a little unsure of what to do.

There are templates in both /usr/local/nagios/share/pnp/templates and /usr/local/nagios/share/pnp/templates.dist

None of those templates are named the same as any of the checks I'm running. (For example, I'm running a script called check_cpu_stats.sh, and it is giving me performance graphs, yet I can't find a template with that name.) The only template listed in any of my Nagios services is xiwizard_nrpe_service. I don't find a template by that name either, so I really don't know which template to copy.
agriffin
Posts: 876
Joined: Mon May 09, 2011 9:36 am

Re: Historical Performance Data

Post by agriffin »

Your plugin can provide graphing data in its output after a "|" character. Everything after that character is stripped out by Nagios for status checks and used for performance data. You can find more information in the documentation. Sourceforge.net seems to be down at the moment, which is where much of our online documentation is, but you should have a copy in /usr/local/nagios/share/docs/perfdata.html
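As a rough illustration of the output format described above, here is a minimal shell sketch that builds a status line followed by "|" and one performance data item per CPU core. The helper names (perfdata_item, plugin_line), the core labels, and the values and thresholds are all made up for the example; the field order label=value;warn;crit;min;max follows the standard plugin performance data layout.

```shell
#!/bin/sh
# Sketch only: helper names, labels, and numbers below are hypothetical.

# Format one perfdata item: 'label'=value;warn;crit;min;max
perfdata_item() {
    # $1 label, $2 value (optionally with a unit such as %),
    # $3 warn, $4 crit, $5 min, $6 max
    printf "'%s'=%s;%s;%s;%s;%s" "$1" "$2" "$3" "$4" "$5" "$6"
}

# Join the human-readable status text and the perfdata items with "|".
plugin_line() {
    text=$1
    shift
    # "$*" joins the remaining arguments with single spaces
    printf '%s|%s\n' "$text" "$*"
}

# Example values; a real check would measure each core here.
plugin_line "OK: per-core CPU usage" \
    "$(perfdata_item core0 12% 80 90 0 100)" \
    "$(perfdata_item core1 47% 80 90 0 100)"
```

Each item after the "|" should end up as its own data series, so a per-core check would emit one item per core.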
hhlodge
Posts: 206
Joined: Tue Mar 08, 2011 2:13 pm

Re: Historical Performance Data

Post by hhlodge »

I've been outputting performance data for some custom plugins and haven't done a thing with templates, and I have nice graphs with history. However, I can't find where this data is kept, including the RRD files directory. Where would this be?
- Kyle
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Historical Performance Data

Post by mguthrie »

PNP templates are a little bit tricky to work with. They are stored in the following directories:

/usr/local/nagios/share/pnp/templates
/usr/local/nagios/share/pnp/templates.dist #use this one for custom templates

Template names need to correspond to a defined command definition, not the plugin name. Documentation on custom templates is pretty sketchy, both from PNP and even rrdtool. Getting them to work correctly is mostly a matter of trial and error.
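To make the naming rule concrete, here is a small shell sketch that derives the template file name from a check_command string. It assumes the usual PNP convention that the template is the command name (the part before the first "!") plus a .php extension; the function name template_name and the check_cpu_cores argument are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes the template file is "<command name>.php".

template_name() {
    # $1: check_command string, e.g. "check_nrpe!check_cpu_cores"
    # Strip everything from the first "!" onward, then append .php
    printf '%s.php\n' "${1%%!*}"
}

template_name 'check_badness'               # prints check_badness.php
template_name 'check_nrpe!check_cpu_cores'  # prints check_nrpe.php
```

Note that in the second call the name comes from the command (check_nrpe), not from the plugin that command runs on the remote host.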

Here's the best breakdown I've seen on performance data syntax:
http://docs.pnp4nagios.org/pnp-0.4/abou ... quirements



Hope that helps.
hhlodge
Posts: 206
Joined: Tue Mar 08, 2011 2:13 pm

Re: Historical Performance Data

Post by hhlodge »

Sorry if I wasn't clear. I literally created a plugin and command named check_badness that just checks for a file and outputs the hour in perf data format and I get a graph that walks up the X axis each hour for every day with history going back since I made the service. I did nothing with a template by that name. I did find the rrd data. I was looking in /var. I see now the graphs say "Default_Template". Is that a catch-all template, meaning we don't have to make a template copy by name? Sorry, I'm confused on what's being conveyed above.

#!/bin/sh
# Report CRITICAL if the flag file exists; always emit the current hour as perfdata.

hour=$(date +%H)

if [ -f /tmp/nagbad ]
then
    echo "CRITICAL: Things are grim!|'Hour is'=${hour};;;0;24"
    exit 2
else
    echo "OK: All hunky dory.|'Hour is'=${hour};;;0;24"
    exit 0
fi
[Attachment: graph.png]
- Kyle
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Historical Performance Data

Post by mguthrie »

"Is that a catch-all template, meaning we don't have to make a template copy by name?"
Correct. If there's no custom template defined, the default template that you're seeing there will be used.
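The fallback behaviour can be sketched in shell as follows. This is an illustration, not PNP's actual code: the function name pick_template is made up, the search order of the two directories is an assumption, and the fallback is assumed to be a file called default.php.

```shell
#!/bin/sh
# Sketch only: mimics "use the named template if present, else the default".

pick_template() {
    # $1: command name; remaining args: directories to search, in order
    name=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$name.php" ]; then
            printf '%s\n' "$dir/$name.php"
            return 0
        fi
    done
    # No template matched the command name: fall back to the default
    printf '%s\n' "default.php"
}

pick_template check_badness \
    /usr/local/nagios/share/pnp/templates \
    /usr/local/nagios/share/pnp/templates.dist
```

On a system with no check_badness.php in either directory, this prints default.php, which matches the "Default_Template" label seen on the graphs.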