We updated the NCPA agent to the latest version, 2.4.0, last week and have found an issue: the agent is not fetching RAM data from the VMs.
All other data is being fetched correctly. Only the RAM data from the VMs is missing.
Is this a known issue?
We installed the older NCPA agent 2.3.1 and it started fetching the RAM data again, so there is some issue with the 2.4.0 agent.
NCPA 2.4.0 agent issue
-
- Posts: 247
- Joined: Tue Aug 31, 2021 3:25 pm
Re: NCPA 2.4.0 agent issue
Hi sahilrana,
Are the checks failing in their entirety, or are you seeing a performance graph issue similar to this one?
https://github.com/NagiosEnterprises/ncpa/issues/845
Also, could you provide the output from a "Run Check Command", with the token redacted as in the example below?
Navigate via Configure (top) -> Core Config Manager -> Services (left).
Find the "Memory Usage" service for one of the hosts with 2.4.0 installed and select Edit (wrench icon on the right side). Then click the run command button and its follow-up prompt. The output should look like this (note that the token and IP have been redacted from the string):
Code: Select all
[nagios@kf-centos-79 ~]$ /usr/local/nagios/libexec/check_ncpa.py -H REDACTED -t 'REDACTED' -P 5693 -M memory/virtual -u 'Gi' -w '50' -c '80'
CRITICAL: Memory usage was 88.90 % (Available: 0.40 GiB, Total: 3.65 GiB, Free: 0.13 GiB, Used: 2.85 GiB) | 'available'=0.40GiB;;; 'total'=3.65GiB;;; 'free'=0.13GiB;;; 'used'=2.85GiB;;;
Keith
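For anyone posting check output to a public thread, here is a minimal Python sketch of the redaction step described above. The function name `redact_check_command` is hypothetical and not part of NCPA or Nagios XI; it only assumes the `-H` (host) and `-t` (token) flags seen in the example command, and it leaves `-T` (timeout) untouched.

```python
import re

# Matches "-t <value>" where the value is single-quoted, double-quoted, or bare.
# The lookbehind keeps this from matching in the middle of another word.
TOKEN_RE = re.compile(r"""(?<!\S)-t\s+('[^']*'|"[^"]*"|\S+)""")
# Matches "-H <value>" (host name or IP address).
HOST_RE = re.compile(r"(?<!\S)-H\s+\S+")


def redact_check_command(command):
    """Return the command line with the host and token values replaced."""
    command = TOKEN_RE.sub("-t 'REDACTED'", command)
    command = HOST_RE.sub("-H REDACTED", command)
    return command


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address used for illustration.
    print(redact_check_command(
        "/usr/local/nagios/libexec/check_ncpa.py -H 192.0.2.10 "
        "-t 'my secret' -P 5693 -M memory/virtual -u 'Gi' -w '50' -c '80'"
    ))
```

The matching is case-sensitive on purpose, so the lowercase `-t` token flag is redacted while the uppercase `-T` timeout flag is kept.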
-
- Posts: 32
- Joined: Sat Feb 20, 2021 6:55 am
Re: NCPA 2.4.0 agent issue
Hi Keith,
Yes, it's the same issue as mentioned in the link. RAM data is not there in the performance graphs.
Here is the output.
[nagios@abc.domain.com ~]$ /usr/local/nagios/libexec/check_ncpa.py -H x.x.x.x -T 119 -t 'token' -P 5693 -M memory/virtual -u 'Gi' -w '80' -c '90'
OK: Memory usage was 26.50 % (Available: 23.53 GiB, Total: 32.00 GiB, Free: 23.53 GiB, Used: 8.47 GiB) | 'available'=23.53GiB;;; 'total'=32.00GiB;;; 'percent'=26.50%;80;90; 'free'=23.53GiB;;; 'used'=8.47GiB;;;
-
- Posts: 247
- Joined: Tue Aug 31, 2021 3:25 pm
Re: NCPA 2.4.0 agent issue
Hi sahilrana,
Thanks for confirming. I was able to replicate the difference in output, and we suspect the issue has to do with a mismatch in the number of inputs for the existing round robin database. Could you confirm which version of Nagios XI you are using?
Thanks and Best Regards,
Keith
-
- Posts: 32
- Joined: Sat Feb 20, 2021 6:55 am
Re: NCPA 2.4.0 agent issue
Hi Keith,
The Nagios XI version is 5.8.7. I think it's the latest one.
-
- Posts: 247
- Joined: Tue Aug 31, 2021 3:25 pm
Re: NCPA 2.4.0 agent issue
Hi sahilrana,
I'm filing a bug report on the issue. After discussing it with our developers, there are two options in the meantime:
1) Stay at NCPA version 2.3.1
2) Remove the rrd and xml files for the memory usage graphs; graphing should then start over with the updated number of data sources.
If you would like to use the second option, you can find the rrd and xml files in the host subdirectory of perfdata on your XI server (see below; change HOSTNAME to the hostname or IP of the remote system):
Code: Select all
/usr/local/nagios/share/perfdata/HOSTNAME
For example:
Code: Select all
rm /usr/local/nagios/share/perfdata/10.1.2.3/Memory_Usage.rrd
rm /usr/local/nagios/share/perfdata/10.1.2.3/Memory_Usage.xml
Hope this is useful.
Thanks and Best Regards,
Keith
-
- Posts: 32
- Joined: Sat Feb 20, 2021 6:55 am
Re: NCPA 2.4.0 agent issue
Hi Keith,
I used the second option and I am getting an error that the files do not exist. I am assuming the hostname or IP address in the command is that of the remote server where the agent is installed.
The server I tried has the 2.4.0 agent installed. Please see the attached error.
-
- Dreams In Code
- Posts: 7682
- Joined: Wed Feb 11, 2015 12:54 pm
Re: NCPA 2.4.0 agent issue
The directory name is based on the host name. Is the host name in XI for this server the IP address or something else?
- It's likely something else, so you'd need to use that in place of THEHOSTNAME in the commands below
Code: Select all
rm /usr/local/nagios/share/perfdata/THEHOSTNAME/Memory_Usage.rrd
rm /usr/local/nagios/share/perfdata/THEHOSTNAME/Memory_Usage.xml
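If the files have to be removed for many hosts, the two `rm` commands above could be scripted. The following is only a sketch under some assumptions: the perfdata root is the path given in this thread, every host directory uses the `Memory_Usage` file name shown above (it follows the service description, so yours may differ), and the helper names `memory_usage_files` and `remove_files` are made up for illustration. It defaults to a dry run that only prints what would be deleted.

```python
from pathlib import Path

# Path taken from the posts above; adjust if your XI install differs.
PERFDATA_ROOT = Path("/usr/local/nagios/share/perfdata")
# Assumed file names, based on a service called "Memory Usage".
TARGET_NAMES = ("Memory_Usage.rrd", "Memory_Usage.xml")


def memory_usage_files(root, hostnames=None):
    """List existing Memory_Usage files under each host directory.

    If hostnames is None, every subdirectory of root is scanned.
    """
    root = Path(root)
    if hostnames is None:
        host_dirs = sorted(p for p in root.iterdir() if p.is_dir())
    else:
        host_dirs = [root / name for name in hostnames]
    found = []
    for host_dir in host_dirs:
        for name in TARGET_NAMES:
            candidate = host_dir / name
            if candidate.is_file():
                found.append(candidate)
    return found


def remove_files(files, dry_run=True):
    """Delete the given files, or only print them when dry_run is True."""
    for path in files:
        if dry_run:
            print(f"would remove {path}")
        else:
            path.unlink()
            print(f"removed {path}")


# Example (dry run first, then repeat with dry_run=False once the list looks right):
# remove_files(memory_usage_files(PERFDATA_ROOT), dry_run=True)
```

Running the dry run first also shows which directory names XI actually uses for each host, which helps when the files "do not exist" under the IP address.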
-
- Posts: 32
- Joined: Sat Feb 20, 2021 6:55 am
Re: NCPA 2.4.0 agent issue
I tried with both the IP address and the hostname.
In any case, this is not a feasible alternative, since the command would have to be run for every host, right?
For now, I am rolling back to the previous version (2.3.1).
Is there an expected time for resolution of this bug in the 2.4.0 agent?
-
- Dreams In Code
- Posts: 7682
- Joined: Wed Feb 11, 2015 12:54 pm
Re: NCPA 2.4.0 agent issue
We're unable to give an ETA at this time, though development is aware of the issue.
Development would be alerted to the bug report updates as well:
https://github.com/NagiosEnterprises/ncpa/issues/845
This would technically fix it, but don't run this:
https://support.nagios.com/kb/article/n ... g-149.html
It would add a datasource to the end of the RRD, and all data would then be shifted over, so the new percent datasource would hold the old free data, which would mess up the data.
Because the ordering of the datasources is different between the two versions, the resulting data would not be correct:
Code: Select all
2.3.x: | 'available'=0.89GiB;;; 'total'=1.80GiB;;; 'free'=0.17GiB;;; 'used'=0.65GiB;;;
2.4.0: | 'available'=0.89GiB;;; 'total'=1.80GiB;;; 'percent'=50.50%;80;90; 'free'=0.17GiB;;; 'used'=0.65GiB;;;
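To make the shift concrete, here is a small Python sketch that lines up the two perfdata strings above by position. It assumes the RRD stores datasources purely by position, which is what the explanation in this post implies; the function names are hypothetical.

```python
import re

# Perfdata strings copied from the two lines above.
OLD_PERFDATA = "'available'=0.89GiB;;; 'total'=1.80GiB;;; 'free'=0.17GiB;;; 'used'=0.65GiB;;;"
NEW_PERFDATA = ("'available'=0.89GiB;;; 'total'=1.80GiB;;; 'percent'=50.50%;80;90; "
                "'free'=0.17GiB;;; 'used'=0.65GiB;;;")


def perfdata_labels(perfdata):
    """Return the quoted perfdata labels in the order they appear."""
    return re.findall(r"'([^']+)'=", perfdata)


def slot_history(old_labels, new_labels):
    """Pair what each position held before with what is written there now.

    Returns (position, old_label_or_None, new_label) tuples, 1-based.
    """
    rows = []
    for index, new_label in enumerate(new_labels):
        old_label = old_labels[index] if index < len(old_labels) else None
        rows.append((index + 1, old_label, new_label))
    return rows


if __name__ == "__main__":
    old = perfdata_labels(OLD_PERFDATA)
    new = perfdata_labels(NEW_PERFDATA)
    print(f"2.3.x has {len(old)} datasources, 2.4.0 has {len(new)}")
    for position, old_label, new_label in slot_history(old, new):
        note = "" if old_label == new_label else "  <-- mismatch"
        print(f"DS {position}: history={old_label} new={new_label}{note}")
```

Positions 3 to 5 no longer match, which is why appending a datasource to the existing RRD would leave old history under the wrong label.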
Usually, if there is an issue, this section of the .xml file will have an error in it:
Code: Select all
<RRD>
<RC>0</RC>
<TXT>successful updated</TXT>
</RRD>
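As a rough way to check that section without opening the file by hand, the sketch below parses the `<RRD>` block with Python's standard library. It assumes that a return code of 0 means the update succeeded, as in the sample above, and that any other value indicates a problem; the function name `rrd_update_status` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample copied from the section shown above.
SAMPLE = """<RRD>
<RC>0</RC>
<TXT>successful updated</TXT>
</RRD>"""


def rrd_update_status(xml_text):
    """Return (return_code, message) from the <RRD> section of the XML text."""
    root = ET.fromstring(xml_text)
    # The <RRD> block may be the root itself or nested inside a larger document.
    rrd = root if root.tag == "RRD" else root.find(".//RRD")
    if rrd is None:
        raise ValueError("no <RRD> section found")
    rc_text = rrd.findtext("RC")
    if rc_text is None or not rc_text.strip():
        raise ValueError("<RRD> section has no <RC> value")
    message = (rrd.findtext("TXT") or "").strip()
    return int(rc_text.strip()), message


if __name__ == "__main__":
    code, message = rrd_update_status(SAMPLE)
    # Assumption: 0 means the last RRD update worked.
    print("ok" if code == 0 else "error", "-", message)
```

To check a real file, read its contents first, for example with `Path(...).read_text()`, and pass the text to the function.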