graphing just stopped!

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
rredmond
Posts: 48
Joined: Mon Feb 28, 2011 10:17 am
Location: New Hartford, NY

graphing just stopped!

Post by rredmond »

Good morning! Yup, I've had this problem before, and three different suggestions have solved it in the past, but none of them work now! LOL. At 11 PM last night graphing just stopped. I reset credentials, I restarted, I checked npcd, and I've rebooted the box... still no graphs. This one's got me stumped! Any logs I should be looking at? Anything else I can check? In the past it has been EMC that for some reason crashed performance monitoring, yet I've always been able to restart it. Any help would be greatly appreciated.

Randy
Last edited by rredmond on Mon Nov 07, 2011 1:14 pm, edited 1 time in total.

Re: graphing just stopped!

Post by rredmond »

Sorry, forgot...

NagiosXI 2009R1.2

Linux 2.6.18-164.9.1.el5
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: graphing just stopped!

Post by mguthrie »

Usually in 2009 the performance graph issues were permissions related. Try running:

/usr/local/nagiosxi/scripts/reset_config_perms

Also try the Admin -> Reset Security Credentials page, and reset the credentials for the subsystem components.

Re: graphing just stopped!

Post by rredmond »

As I pointed out above, that was the second thing I tried, with no effect. I'll give it another go...
lmiltchev
Former Nagios Staff
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: graphing just stopped!

Post by lmiltchev »

Do you see any errors in:

/usr/local/nagios/var/npcd.log

or

/usr/local/nagios/var/perfdata.log
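
One way to check those two logs quickly is a small script along these lines. This is only a sketch: the two paths are the ones quoted above, while the keyword pattern and the function name `find_error_lines` are assumptions made for illustration, not anything Nagios or PNP defines.

```python
import re
from pathlib import Path

# Paths quoted in the post above; either file may be absent on a given install.
LOG_PATHS = [
    Path("/usr/local/nagios/var/npcd.log"),
    Path("/usr/local/nagios/var/perfdata.log"),
]

# Hypothetical keyword list; adjust it to whatever the logs actually contain.
ERROR_PATTERN = re.compile(r"error|fail|denied|timeout", re.IGNORECASE)


def find_error_lines(path, pattern=ERROR_PATTERN, limit=20):
    """Return up to `limit` of the most recent lines in `path` matching `pattern`.

    Returns None when the file does not exist, so "no log file" and
    "log file with no errors" can be told apart.
    """
    path = Path(path)
    if not path.is_file():
        return None
    with path.open(errors="replace") as handle:
        hits = [line.rstrip("\n") for line in handle if pattern.search(line)]
    return hits[-limit:]


if __name__ == "__main__":
    for log in LOG_PATHS:
        result = find_error_lines(log)
        if result is None:
            print(f"{log}: not found")
        elif not result:
            print(f"{log}: no matching lines")
        else:
            print(f"{log}: {len(result)} matching line(s)")
            for line in result:
                print("  " + line)
```

A "not found" result is itself useful information, since a missing log may mean the component never wrote one or logs somewhere else.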

Re: graphing just stopped!

Post by rredmond »

I don't have either one of those logs. After doing the credential reset again (both), the Nagios performance graphs started to show up. Added devices don't seem to show up until I've scheduled an immediate check... although I COULD be impatient ;-) Very odd behavior...

Re: graphing just stopped!

Post by mguthrie »

"Added devices don't seem to show up until I've scheduled an immediate check..."
The RRD files for performance data are generated from the performance data received for a host or service, so if no results have ever come in for a service, there won't be an RRD file for it. RRD files have a static file size, so it's by design that new files aren't created until there's actually data to put into them.
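
To see which services have not produced an RRD file yet, something like the following could help. It is a rough sketch under stated assumptions: the perfdata directory, the `<host>/<service>.rrd` layout, and the replacement of spaces, colons, and slashes with underscores are assumed PNP-style conventions that should be checked against the actual install. The helper names are made up for this example.

```python
import re
from pathlib import Path

# Assumed default perfdata location; verify it on the actual system.
PERFDATA_DIR = Path("/usr/local/nagios/share/perfdata")


def clean_name(name):
    """Assumed PNP-style cleanup: spaces, colons and slashes become underscores."""
    return re.sub(r"[ :/\\]", "_", name)


def rrd_path(perfdata_dir, host, service):
    """Build the assumed path <perfdata_dir>/<host>/<service>.rrd."""
    return Path(perfdata_dir) / clean_name(host) / (clean_name(service) + ".rrd")


def services_without_rrd(perfdata_dir, services):
    """Return the (host, service) pairs that have no RRD file on disk yet."""
    return [
        (host, service)
        for host, service in services
        if not rrd_path(perfdata_dir, host, service).is_file()
    ]
```

Any pair this returns has either never delivered performance data or is stored under a different name than the sketch assumes.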

Re: graphing just stopped!

Post by rredmond »

Well, this is very bizarre! I am NOT getting graphs for anything that I don't schedule an immediate check for. In status detail for these hosts it says last check 2011-11-06 22:56:11 and next check 2011-11-07 23:00:00! WTH? If I schedule an immediate check, it starts to graph as normal. Settings are to check every 5 minutes. That last check time is right around when our mystery event occurred, per the graphs. Any ideas on how I can force a check on 694 services without going one by one? Should I cycle the box again?
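
On forcing checks in bulk: Nagios Core accepts external commands written to its command file, and `SCHEDULE_FORCED_SVC_CHECK` is one of them. The sketch below builds one such command per service. The command file path is an assumed default, the host and service names in the usage note are hypothetical, and external commands must be enabled for this to have any effect.

```python
import time

# Assumed default location of the external command file (a named pipe).
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def forced_check_commands(services, now=None):
    """Build one SCHEDULE_FORCED_SVC_CHECK line per (host, service) pair.

    Format: [timestamp] SCHEDULE_FORCED_SVC_CHECK;host;service;check_time
    """
    now = int(time.time()) if now is None else int(now)
    return [
        f"[{now}] SCHEDULE_FORCED_SVC_CHECK;{host};{service};{now}"
        for host, service in services
    ]


def submit(commands, command_file=COMMAND_FILE):
    """Write the commands to the pipe that the running Nagios daemon reads."""
    with open(command_file, "w") as pipe:
        for line in commands:
            pipe.write(line + "\n")
```

For example, `submit(forced_check_commands([("emc01", "CPU Load")]))` would queue a single forced check. Run it as a user allowed to write to the command file, and consider spreading a few hundred checks over several minutes rather than scheduling them all for the same second.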

Re: graphing just stopped!

Post by mguthrie »

I would look at restarting the monitoring server. Oddities like this can sometimes be caused by multiple instances of Nagios running and competing with each other. A restart would take care of that, along with any other process that was acting out of sorts.
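
Before restarting, it may be worth confirming whether more than one Nagios daemon really is running. A minimal sketch, assuming the binary lives at `/usr/local/nagios/bin/nagios` and that a top-level daemon has init (PID 1) as its parent; the function names are invented for this example. Nagios forks short-lived children to run checks, so only processes parented by init are counted.

```python
import subprocess

# Assumed install path of the Nagios daemon binary.
NAGIOS_BINARY = "/usr/local/nagios/bin/nagios"


def daemon_pids(ps_output, binary=NAGIOS_BINARY):
    """Return PIDs of top-level daemons found in `ps -eo pid,ppid,args` output.

    Forked check workers share the daemon's command line but have the daemon
    as their parent, so only processes whose parent is PID 1 are kept.
    """
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        pid, ppid, args = fields
        if ppid == "1" and args.split()[0] == binary:
            pids.append(int(pid))
    return pids


def running_daemon_pids():
    """Run ps and return the PIDs of top-level Nagios daemons."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,args"],
        capture_output=True, text=True, check=True,
    )
    return daemon_pids(result.stdout)
```

More than one PID in the result would support the competing-instances theory; exactly one suggests looking elsewhere.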

Re: graphing just stopped!

Post by rredmond »

I rebooted the server, and the Last Check and Next Check dates and times are exactly the same as before. Same results scheduling an immediate check. SOMETHING changed the Next Check date and time on all of these. Coincidentally, it is 24 hours after whatever incident caused this. The only thing that immediately started checking per config was Nagios itself. All of these devices are configured to check every 5 minutes, yet they aren't unless an immediate check is kicked off. I guess I'm just waiting for all of these to kick off at 11 PM tonight?
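
To list every service stuck with a far-off Next Check, the status file can be scanned for `servicestatus` blocks whose `next_check` is much later than `last_check`. This is a sketch, not a definitive tool: the 600-second threshold and the function names are assumptions chosen for a 5-minute check interval, and the status file location (commonly `/usr/local/nagios/var/status.dat`) should be confirmed in `nagios.cfg`.

```python
def parse_service_blocks(text):
    """Parse `servicestatus { ... }` blocks from status.dat text into dicts."""
    blocks, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "servicestatus {":
            current = {}
        elif line == "}":
            if current is not None:
                blocks.append(current)
            current = None
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key] = value
    return blocks


def stalled_services(blocks, max_gap_seconds=600):
    """Return (host, service) pairs whose next check is suspiciously far out.

    The gap is next_check - last_check in seconds; 600 is an assumed
    threshold of twice a 5-minute interval.
    """
    stalled = []
    for block in blocks:
        last_check = int(block.get("last_check", 0))
        next_check = int(block.get("next_check", 0))
        if next_check - last_check > max_gap_seconds:
            stalled.append((block.get("host_name"), block.get("service_description")))
    return stalled
```

The resulting list of pairs is also a convenient input for scheduling forced checks in bulk instead of clicking through each service.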