Some Performance graphs not graphing

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Some Performance graphs not graphing

Post by pbroste »

Hello @hbouma

Thanks for the updates,

Question: do you see a file count when you run this?

Code: Select all

watch 'ls /usr/local/nagios/var/spool/perfdata/ | wc -l'
You should see the count fluctuate above zero, then drop back to zero once the performance data has been processed. Next, stop the 'npcd' service:

Code: Select all

systemctl stop npcd
We should see the number of files increase in '/usr/local/nagios/var/spool/perfdata/' since the 'npcd' service has stopped. If that looks good, let's start the 'npcd' service again:

Code: Select all

systemctl start npcd
Now the file count in '/usr/local/nagios/var/spool/perfdata/' should be decreasing as it is processing perfdata. If the count is not decreasing then we should bump up values:

Code: Select all

vi /usr/local/nagios/etc/pnp/npcd.cfg
  • npcd.cfg: threshold value is changed from xx to 28, as it was reaching threshold breach with 10.
      load_threshold = xx.x to load_threshold = 28.0
  • process_perfdata.cfg: TIMEOUT value is changed from x to 40.
      TIMEOUT = x to TIMEOUT = 40
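
As a rough sketch of the edit described above, the helper below rewrites a single `KEY = value` line and keeps a backup copy. `set_cfg_value` is a hypothetical name, not something shipped with Nagios XI or PNP, and it assumes the key is already present in the file and that the value contains no `|` or `&` characters.

```shell
# Sketch only: set "KEY = VALUE" in a PNP-style config file.
# set_cfg_value is a hypothetical helper, not part of Nagios XI or PNP.
set_cfg_value() {
    file=$1 key=$2 value=$3
    # Refuse to touch a file that does not already define the key.
    grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file" || return 1
    # Rewrite the matching line in place; the original is saved as FILE.bak.
    sed -i.bak -E "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file"
}
```

It could be called as `set_cfg_value /usr/local/nagios/etc/pnp/npcd.cfg load_threshold 28.0`, and in the same way for TIMEOUT in process_perfdata.cfg (the location of that file on your system is an assumption to verify).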

Restart the npcd service:

Code: Select all

systemctl restart npcd
Give it some time and watch the counts and logs.
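
If watching the count by eye is awkward, a small sketch like the following samples the spool twice and reports whether it shrank. Both function names are hypothetical, and because Nagios keeps writing new files while npcd drains old ones, a single sample is only a hint, not proof.

```shell
# count_spool_files DIR: number of regular files directly inside DIR.
count_spool_files() {
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# spool_is_draining DIR [SECONDS]: succeed if the spool is empty or smaller
# after waiting SECONDS (default 30).
spool_is_draining() {
    before=$(count_spool_files "$1")
    sleep "${2:-30}"
    after=$(count_spool_files "$1")
    echo "before=$before after=$after"
    [ "$after" -eq 0 ] || [ "$after" -lt "$before" ]
}
```

Example: `spool_is_draining /usr/local/nagios/var/spool/perfdata/ 60 || echo "spool is not shrinking"`.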

We would also like a copy of 'process_perfdata.pl' and a listing of the directory structure so we can check ownership and permissions.

Code: Select all

tar -czvf /tmp/processperf.tar.gz /usr/local/nagios/libexec/process_perfdata.pl

Code: Select all

whereis rrdtool  >/tmp/info.txt
ls -al /usr/local >>/tmp/info.txt
ls -alR /usr/local/nagios >>/tmp/info.txt
ls -alR /usr/local/nagiosxi >>/tmp/info.txt
tail -100  /usr/local/nagios/var/npcd.log  >>/tmp/info.txt
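
For a quick local look at ownership before sending the listing, a sketch along these lines prints only the entries that do not belong to the expected user. `list_not_owned_by` is a hypothetical helper, and which paths should be owned by 'nagios' on your system is an assumption to confirm.

```shell
# list_not_owned_by OWNER PATH...: show entries under PATH not owned by OWNER.
# OWNER may be a user name or a numeric uid.
list_not_owned_by() {
    owner=$1
    shift
    find "$@" ! -user "$owner" -exec ls -ld {} +
}
```

Example: `list_not_owned_by nagios /usr/local/nagios/share/perfdata`. No output means everything under that path is owned by the given user.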
Thanks,
Perry
hbouma
Posts: 483
Joined: Tue Feb 27, 2018 9:31 am

Re: Some Performance graphs not graphing

Post by hbouma »

The file count in /usr/local/nagios/var/spool/perfdata was 0 for several minutes' worth of the while statement. I did verify that the time was incrementing during the while statement. The value continued to stay at 0 even after stopping npcd.
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Some Performance graphs not graphing

Post by pbroste »

Hello @hbouma

Looking through the logs, it appears that this environment is also using 'nagiosramdisk', so the spool is located at:

Code: Select all

'/var/nagiosramdisk/spool/perfdata/'
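
One way to confirm which spool directory npcd is actually configured to use is to read it from the config rather than guess. This is a sketch: `get_cfg_value` is a hypothetical helper, and `perfdata_spool_dir` is the key name I would expect in npcd.cfg, so check that it matches your file.

```shell
# get_cfg_value FILE KEY: print the value of the first uncommented
# "KEY = value" line in FILE, with surrounding whitespace trimmed.
get_cfg_value() {
    sed -n -E "s|^[[:space:]]*$2[[:space:]]*=[[:space:]]*(.*[^[:space:]])[[:space:]]*\$|\1|p" "$1" | head -n 1
}
```

Example: `get_cfg_value /usr/local/nagios/etc/pnp/npcd.cfg perfdata_spool_dir`.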
Please verify,
Perry
hbouma
Posts: 483
Joined: Tue Feb 27, 2018 9:31 am

Re: Some Performance graphs not graphing

Post by hbouma »

Perfdata is showing up in that folder.

Code: Select all

07:39 AM SERVER root [/var/nagiosramdisk/spool/perfdata]
$ ll
total 516K
drwxrwxr-x 2 nagios nagios  240 Nov 16 07:39 .
drwxrwxr-x 5 nagios nagios  100 Nov 13 19:37 ..
-rw-r--r-- 1 nagios nagios    0 Nov 16 07:38 1637066312.perfdata.host
-rw-r--r-- 1 nagios nagios  83K Nov 16 07:38 1637066312.perfdata.service
-rw-r--r-- 1 nagios nagios  91K Nov 16 07:38 1637066326.perfdata.service
-rw-r--r-- 1 nagios nagios  333 Nov 16 07:38 1637066327.perfdata.host
-rw-r--r-- 1 nagios nagios 137K Nov 16 07:39 1637066341.perfdata.service
-rw-r--r-- 1 nagios nagios 1023 Nov 16 07:38 1637066342.perfdata.host
-rw-r--r-- 1 nagios nagios  59K Nov 16 07:39 1637066356.perfdata.host
-rw-r--r-- 1 nagios nagios  56K Nov 16 07:39 1637066357.perfdata.service
-rw-r--r-- 1 nagios nagios  22K Nov 16 07:39 1637066371.perfdata.service
-rw-r--r-- 1 nagios nagios  51K Nov 16 07:39 1637066372.perfdata.host
07:39 AM SERVER root [/var/nagiosramdisk/spool/perfdata]
$ ll
total 516K
drwxrwxr-x 2 nagios nagios  240 Nov 16 07:39 .
drwxrwxr-x 5 nagios nagios  100 Nov 13 19:37 ..
-rw-r--r-- 1 nagios nagios    0 Nov 16 07:38 1637066312.perfdata.host
-rw-r--r-- 1 nagios nagios  83K Nov 16 07:38 1637066312.perfdata.service
-rw-r--r-- 1 nagios nagios  91K Nov 16 07:38 1637066326.perfdata.service
-rw-r--r-- 1 nagios nagios  333 Nov 16 07:38 1637066327.perfdata.host
-rw-r--r-- 1 nagios nagios 137K Nov 16 07:39 1637066341.perfdata.service
-rw-r--r-- 1 nagios nagios 1023 Nov 16 07:38 1637066342.perfdata.host
-rw-r--r-- 1 nagios nagios  59K Nov 16 07:39 1637066356.perfdata.host
-rw-r--r-- 1 nagios nagios  56K Nov 16 07:39 1637066357.perfdata.service
-rw-r--r-- 1 nagios nagios  22K Nov 16 07:39 1637066371.perfdata.service
-rw-r--r-- 1 nagios nagios  51K Nov 16 07:39 1637066372.perfdata.host
07:39 AM SERVER root [/var/nagiosramdisk/spool/perfdata]
$ ll
total 512K
drwxrwxr-x 2 nagios nagios  200 Nov 16 07:39 .
drwxrwxr-x 5 nagios nagios  100 Nov 13 19:37 ..
-rw-r--r-- 1 nagios nagios  83K Nov 16 07:38 1637066312.perfdata.service-PID-15973
-rw-r--r-- 1 nagios nagios  91K Nov 16 07:38 1637066326.perfdata.service-PID-15975
-rw-r--r-- 1 nagios nagios 137K Nov 16 07:39 1637066341.perfdata.service-PID-15976
-rw-r--r-- 1 nagios nagios 1023 Nov 16 07:38 1637066342.perfdata.host
-rw-r--r-- 1 nagios nagios  59K Nov 16 07:39 1637066356.perfdata.host
-rw-r--r-- 1 nagios nagios  56K Nov 16 07:39 1637066357.perfdata.service
-rw-r--r-- 1 nagios nagios  22K Nov 16 07:39 1637066371.perfdata.service
-rw-r--r-- 1 nagios nagios  51K Nov 16 07:39 1637066372.perfdata.host
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Some Performance graphs not graphing

Post by pbroste »

Hello @hbouma

Thanks for following up. It appears that one host perfdata file is zero bytes, so we know that one is not producing any graphs. The others look good, and as you stated they are being rotated through while the npcd service is running.
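
To spot zero-byte spool files like the one mentioned above without reading the whole listing, a sketch such as this can help. `list_empty_perfdata` is a hypothetical name, and the `*.perfdata.*` pattern is taken from the file names in your listing.

```shell
# list_empty_perfdata DIR: print zero-byte perfdata files directly inside DIR.
list_empty_perfdata() {
    find "$1" -maxdepth 1 -type f -name '*.perfdata.*' -size 0
}
```

Example: `list_empty_perfdata /var/nagiosramdisk/spool/perfdata`.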

Please temporarily stop the npcd service and run the Perl script against one of the perfdata files. Scroll through the output and let me know if you see anything that breaks the running script.

Code: Select all

systemctl stop npcd
Then run the script against one of the spooled perfdata files:

Code: Select all

perl -d:Trace /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata/xxxxxxxxxxxxxx.perfdata.service
Start the npcd service:

Code: Select all

systemctl start npcd
Thanks,
Perry
hbouma
Posts: 483
Joined: Tue Feb 27, 2018 9:31 am

Re: Some Performance graphs not graphing

Post by hbouma »

Unfortunately, we do not have Devel::Trace (Trace.pm) in our Perl install.

Can't locate Devel/Trace.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .).
BEGIN failed--compilation aborted.
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Some Performance graphs not graphing

Post by pbroste »

Hello @hbouma

Thanks for following up. I raised this case with our team during this morning's stand-up to get feedback, and these are the suggestions that came out of it.

Find out where 'rrdtool' is:

Code: Select all

which rrdtool

Code: Select all

/usr/local/nagios/etc/pnp/npcd.cfg

and change this:
RRDTOOL = /bin/rrdtool
To this:
RRDTOOL = /usr/bin/rrdtool
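
As a sketch of that check, the function below reads the RRDTOOL line from whichever config file you pass it and tests that the path is executable. `rrdtool_path_ok` is a hypothetical helper, and it assumes a single uncommented `RRDTOOL = /path` line with no spaces in the path.

```shell
# rrdtool_path_ok FILE: succeed if the RRDTOOL path configured in FILE
# points at an executable file.
rrdtool_path_ok() {
    path=$(sed -n -E 's|^[[:space:]]*RRDTOOL[[:space:]]*=[[:space:]]*([^[:space:]]+).*|\1|p' "$1" | head -n 1)
    echo "configured: ${path:-<none>}"
    [ -n "$path" ] && [ -x "$path" ]
}
```

Example: `rrdtool_path_ok /usr/local/nagios/etc/pnp/npcd.cfg || echo "compare with: $(which rrdtool)"`.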
Next, please install the Perl modules needed for debugging RRD:

Code: Select all

cpan -i RRD::Simple Devel::Trace
Then restart npcd:

Code: Select all

systemctl restart npcd
Then re-run the PNP setup:

Code: Select all

cd /tmp
wget https://assets.nagios.com/downloads/nagiosxi/5/xi-5.8.6.tar.gz
rm -rf /tmp/nagiosxi
tar zxf xi-5.8.6.tar.gz
cd nagiosxi
./init.sh
cd subcomponents/pnp
./install
Then re-implement ramdisk manually:
https://assets.nagios.com/downloads/nag ... giosXI.pdf

Thanks,
Perry
hbouma
Posts: 483
Joined: Tue Feb 27, 2018 9:31 am

Re: Some Performance graphs not graphing

Post by hbouma »

Odd, the RRDCached service is installed, but the /usr/local/nagios/etc/pnp/npcd.cfg file does not reference it.

Code: Select all

systemctl status rrdcached.service
● rrdcached.service - LSB: start and stop rrdtool caching daemon
   Loaded: loaded (/etc/rc.d/init.d/rrdcached; bad; vendor preset: disabled)
   Active: active (running) since Sat 2021-11-13 19:38:06 EST; 4 days ago
     Docs: man:systemd-sysv-generator(8)
   CGroup: /system.slice/rrdcached.service
           └─3175 /usr/bin/rrdcached -p /var/rrdtool/rrdcached/rrdcached.pid -s nagios -m 0660 -l unix:/var/rrdtool/rrdcached/rrdcached.sock -F -w 900 -z 90 -j /tmp/ -b /var/rrdtool/rrdcached -P FLUSH,PEND...

Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER3/Memory_Usage.rrd) failed with status -1. (/usr/local/nagios/s...m 1637263678)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER3/Paging_File_Usage.rrd) failed with status -1. (/usr/local/nag...m 1637263689)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER2/Paging_File_Usage.rrd) failed with status -1. (/usr/local/nag...m 1637263703)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER1/Memory_Usage.rrd) failed with status -1. (/usr/local/nagios/s...m 1637263702)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER1/Paging_File_Usage.rrd) failed with status -1. (/usr/local/nag...m 1637263655)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER1 /Paging_File_Usage.rrd) failed with status -1. (/usr/local/nag...m 1637263657)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER1/Paging_File_Usage.rrd) failed with status -1. (/usr/local/nag...m 1637263747)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER1/Memory_Usage.rrd) failed with status -1. (/usr/local/nagios/s...m 1637263709)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER1/Disk_Usage_on_Y__.rrd) failed with status -1. (/usr/local/nag...m 1637263726)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER1/Memory_Usage.rrd) failed with status -1. (/usr/local/nagios/s...m 1637263724)
Hint: Some lines were ellipsized, use -l to show in full.
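
To see which RRD files fail most often, the log lines above can be tallied per file. This is a sketch: `count_rrd_failures` is a hypothetical helper that reads log text on stdin and relies on the `rrd_update_r (PATH) failed` wording shown in this output.

```shell
# count_rrd_failures: read rrdcached log lines on stdin and print
# "COUNT PATH" for each RRD file with a failed update, most frequent first.
count_rrd_failures() {
    sed -n 's/.*rrd_update_r (\([^)]*\)) failed.*/\1/p' | sort | uniq -c | sort -rn
}
```

Example: `journalctl -u rrdcached --no-pager | count_rrd_failures`.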
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Some Performance graphs not graphing

Post by pbroste »

Hello @hbouma

Thanks for responding with the systemd status.

Check your system time. If NTP pushes the time back by a couple of seconds, you may see this issue. Additionally, if more than one check is configured to write to the same RRD, you may see this error as well. Please check that the system date, time, and timezone are in sync across all of the following:

Code: Select all

date
ls -l /etc/localtime
php -r 'echo date("D M j G:i:s T Y")."\n";'
grep "date.timezone =" /etc/php.ini
grep date.timezone /etc/php.ini
mysql -h 127.0.0.1 -uroot -pnagiosxi -e 'SELECT NOW(); SELECT @@GLOBAL.time_zone, @@SESSION.time_zone;'
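
Some background on why the clock and duplicate checks matter: an RRD file only accepts an update whose timestamp is strictly newer than its last update. The sketch below takes a list of update timestamps in arrival order and prints the ones that would be refused for that reason. `find_rejected_timestamps` is a hypothetical helper for illustration only.

```shell
# find_rejected_timestamps: read UNIX timestamps (one per line, in arrival
# order) on stdin and print each one that is not newer than the last
# accepted timestamp before it.
find_rejected_timestamps() {
    awk 'NR > 1 && $1 <= last { print; next } { last = $1 }'
}
```

The last accepted timestamp of a real file can be read with `rrdtool last /path/to/file.rrd` and compared against the timestamps in the spooled perfdata.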
Please send the following:

Code: Select all

tar -czvf /tmp/npcdcnf.tar.gz /usr/local/nagios/etc/pnp/npcd.cfg /var/rrdtool/ 
Thanks,
Perry
hbouma
Posts: 483
Joined: Tue Feb 27, 2018 9:31 am

Re: Some Performance graphs not graphing

Post by hbouma »

I will send you the tar file in a PM.

Code: Select all

$ date
Fri Nov 19 13:58:15 EST 2021
01:58 PM SERVER root [~]
$ ls -l /etc/localtime
lrwxrwxrwx. 1 root root 38 Mar  6  2017 /etc/localtime -> ../usr/share/zoneinfo/America/New_York
01:58 PM SERVER root [~]
$ php -r 'echo date("D M j G:i:s T Y")."\n";'
Fri Nov 19 13:58:15 EST 2021
01:58 PM SERVER root [~]
$ grep "date.timezone =" /etc/php.ini
date.timezone = America/New_York
01:58 PM SERVER root [~]
$ grep date.timezone /etc/php.ini
; http://php.net/date.timezone
date.timezone = America/New_York
01:58 PM SERVER root [~]
$ mysql -h OFFLOADED_DB_IP -uroot -pSPECIAL_PASSWORD' -e 'SELECT NOW(); SELECT @@GLOBAL.time_zone, @@SESSION.time_zone;'
+---------------------+
| NOW()               |
+---------------------+
| 2021-11-19 13:58:15 |
+---------------------+
+--------------------+---------------------+
| @@GLOBAL.time_zone | @@SESSION.time_zone |
+--------------------+---------------------+
| SYSTEM             | SYSTEM              |
+--------------------+---------------------+