We are running into issues with gaps in our performance graphs.
All services show green, but chunks of data go missing from the performance graphs for hours at a time, typically outside regular working hours. We followed the instructions in the documentation and made a few changes, but we still have not pinned down what is going on.
We took the following actions:
1. Upped the verbosity of both NPCD and perfdata
2. Confirmed the nagios account has not expired
3. Noted errors about the load threshold, so we set NPCD's load_threshold to 20 and restarted NPCD
Here is the spooled file count. It does not reach the 20k number cited in the article.
$ ls /usr/local/nagios/var/spool/perfdata/ | wc -l
2
$ ls /usr/local/nagios/var/spool/xidpe/ | wc -l
4707
From npcd.log, we are seeing the following for every check:
[10-09-2020 11:17:32] NPCD: ThreadCounter 0/5 File is 1599774829.perfdata.service-PID-15586
[10-09-2020 11:17:32] NPCD: File '1599774829.perfdata.service-PID-15586' is an already in process PNP file. Leaving it untouched.
[10-09-2020 11:17:32] NPCD: DEBUG: load 1.970000/20.000000
[10-09-2020 11:17:32] NPCD: ThreadCounter 0/5 File is 1600195788.perfdata.host-PID-20283
[10-09-2020 11:17:32] NPCD: File '1600195788.perfdata.host-PID-20283' is an already in process PNP file. Leaving it untouched.
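The leading number in those file names looks like a Unix timestamp, so the age of each "in process" file can be worked out from the name alone. Below is a minimal sketch of that check. The function names are our own, and the default directory is an assumption (the log does not say which spool directory these files sit in), so pass a different path as the first argument if needed.

```shell
#!/bin/sh
# Sketch: report how old each "in process" spool file is, based on the
# epoch prefix of its name (e.g. 1599774829.perfdata.service-PID-15586).

# Print the leading epoch seconds of a spool file name.
epoch_from_name() {
    basename "$1" | cut -d. -f1
}

# Print "<age in hours> <file name>" for each *-PID-* file in a directory.
list_stale_pid_files() {
    dir=$1
    now=$(date +%s)
    for f in "$dir"/*-PID-*; do
        [ -e "$f" ] || continue              # glob matched nothing
        ts=$(epoch_from_name "$f")
        case "$ts" in
            ''|*[!0-9]*) continue ;;         # prefix is not a plain number
        esac
        echo "$(( (now - ts) / 3600 )) $(basename "$f")"
    done
}

# Assumed default location; override with the first script argument.
list_stale_pid_files "${1:-/usr/local/nagios/var/spool/perfdata}"
```

A file whose age runs to hundreds of hours was probably left behind by an earlier interrupted run rather than being actively processed.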
We are also seeing the following warnings from nagios in syslog:
Oct 9 06:36:57 dltfanxi1 nagios: Warning: fork() in my_system_r() failed for command "/bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/1602239817.perfdata.host" - errno: Cannot allocate memory
Oct 9 06:37:11 dltfanxi1 nagios: Warning: fork() in my_system_r() failed for command "/bin/mv /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/1602239831.perfdata.service" - errno: Cannot allocate memory
Oct 9 06:37:12 dltfanxi1 nagios: Warning: fork() in my_system_r() failed for command "/bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/1602239831.perfdata.host" - errno: Cannot allocate memory
Oct 9 06:37:27 dltfanxi1 nagios: Warning: fork() in my_system_r() failed for command "/bin/mv /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/1602239847.perfdata.service" - errno: Cannot allocate memory
Oct 9 06:37:27 dltfanxi1 nagios: Warning: fork() in my_system_r() failed for command "/bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/1602239847.perfdata.host" - errno: Cannot allocate memory
However, we have confirmed that memory on the ESXi host was not under pressure. For reference, the VM has 4 CPUs and 8 GB of RAM.
Any ideas on where we should be looking to resolve this?
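Since the fork() failures are reported from inside the guest, it may be worth capturing the guest kernel's own view of memory around the time they occur, rather than relying on the hypervisor's numbers. The following is a rough sketch of such a snapshot, assuming a Linux guest with /proc and the procps version of ps. The helper name meminfo_kb is made up for illustration.

```shell
#!/bin/sh
# Sketch: snapshot the guest kernel's memory-commit state.

# Print the value (in kB) of one /proc/meminfo field, e.g. "CommitLimit".
# The optional second argument is the file to read (defaults to /proc/meminfo).
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

memory_snapshot() {
    echo "time: $(date '+%Y-%m-%d %H:%M:%S')"
    echo "overcommit_memory: $(cat /proc/sys/vm/overcommit_memory 2>/dev/null)"
    for field in MemAvailable SwapFree CommitLimit Committed_AS; do
        echo "$field: $(meminfo_kb "$field") kB"
    done
    # Five largest processes by resident memory (procps syntax).
    ps -eo pid,rss,vsz,comm --sort=-rss | head -n 6
}

# Only run where /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    memory_snapshot
fi
```

Running this from cron every few minutes during the off-hours window, with output appended to a file, would show whether Committed_AS approaches CommitLimit or available memory drops when the warnings appear.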
Thanks,