Datastore monitoring failing in Nagios xi 5.8.3

bsanjay
Posts: 86
Joined: Mon Apr 29, 2019 9:38 am

Post by bsanjay »

Hello Team,
A few months back we configured vCenter monitoring through the Nagios XI wizard, and everything looked good. Now all other checks for that vCenter are working fine, but datastore monitoring takes too long to execute and fails with the error message below. Details of the SDK and Nagios XI versions are also provided below.

Error Message
nagios@nagiosxi:[~]: /usr/local/nagios/libexec/check_vmware_api.pl -H "100.20.55.76" -f "/usr/local/nagiosxi/etc/components/vmware/auvc1_ux_corp_local_auth.txt" -l "VMFS"
CHECK_VMWARE_API.PL CRITICAL - SOAP request error - possibly a protocol issue: SSL read timeout: at /usr/local/share/perl5/Net/HTTP/Methods.pm line 278.
at /usr/local/lib64/perl5/Net/SSL.pm line 222.
Net::SSL::die_with_error('LWP::Protocol::https::Socket=GLOB(0x3cadd78)', 'SSL read timeout') called at /usr/local/lib64/perl5/Net/SSL.pm line 230
Net::SSL::__ANON__('ALRM') called at /usr/local/lib64/perl5/Net/SSL.pm line 234
eval {...} called at /usr/local/lib64/perl5/Net/SSL.pm line 234
Net::SSL::read('LWP::Protocol::https::Socket=GLOB(0x3cadd78)', 'HTTP/1.1 200 OK\x{d}\x{a}Date: Fri, 4 Jun 2021 15:53:09 GMT\x{d}\x{a}Cache-Co...', 1024, 2048) called at /usr/local/share/perl5/Net/HTTP/Methods.pm line 278
Net::HTTP::Methods::my_readline('LWP::Protocol::https::Socket=GLOB(0x3cadd78)', 'Status') called at /usr/local/share/perl5/Net/HTTP/Methods.pm line 397
Net::HTTP::Methods::read_response_headers('LWP::Protocol::https::Socket=GLOB(0x3cadd78)', 'laxed', 1, 'junk_out', 'ARRAY(0x335f8a8)') called at /usr/local/share/perl5/LWP/Protocol/http.pm line 431
LWP::Protocol::http::request('LWP::Protocol::https=HASH(0x43e8170)', 'HTTP::Request=HASH(0x44bcc60)', undef, undef, undef, 180) called at /usr/local/share/perl5/LWP/UserAgent.pm line 203
LWP::UserAgent::try {...} () called at /usr/local/share/perl5/Try/Tiny.pm line 103
eval {...} called at /usr/local/share/perl5/Try/Tiny.pm line 94
Try::Tiny::try('CODE(0x44bce40)', 'Try::Tiny::Catch=REF(0x44bcb58)') called at /usr/local/share/perl5/LWP/UserAgent.pm line 218
LWP::UserAgent::send_request('LWP::UserAgent=HASH(0x42684a8)', 'HTTP::Request=HASH(0x44bcc60)', undef, undef) called at /usr/local/share/perl5/LWP/UserAgent.pm line 290
LWP::UserAgent::simple_request('LWP::UserAgent=HASH(0x42684a8)', 'HTTP::Request=HASH(0x44bcc60)', undef, undef) called at /usr/local/share/perl5/LWP/UserAgent.pm line 297
LWP::UserAgent::request('LWP::UserAgent=HASH(0x42684a8)', 'HTTP::Request=HASH(0x44bcc60)') called at /usr/share/perl5/VMware/VICommon.pm line 2372
SoapClient::request('SoapClient=HASH(0x4268c70)', 'RetrieveProperties', '<_this type="PropertyCollector">propertyCollector</_this>\x{a}<sp...', '"urn:vim25/7.0.0.0"') called at (eval 52) line 9175
VimService::RetrieveProperties('VimService=HASH(0x39afed8)', '_this', 'ManagedObjectReference=HASH(0x42d9778)', 'specSet', 'PropertyFilterSpec=HASH(0x440f820)') called at /usr/share/perl5/VMware/VICommon.pm line 1832
ViewBase::update_view_data('Datastore=HASH(0x441ed80)', 'ARRAY(0x4432ac8)') called at /usr/share/perl5/VMware/VICommon.pm line 1675
Vim::get_view('mo_ref', 'ManagedObjectReference=HASH(0x4411ed8)', 'properties', 'ARRAY(0x4432ac8)') called at /usr/local/nagios/libexec/check_vmware_api.pl line 1216


Details
Nagios xi Version = 5.8.3

nagios@nagiosxi:[~]: vmware-cmd --version
vSphere SDK for Perl version: 7.0.0
Script 'vmware-cmd' version: 7.0.0

nagios@nagiosxi:[~]: perl -MLWP -e 'print "$LWP::VERSION\n"';
6.26

nagios@nagiosxi:[~]: yum list installed | grep -i libwww-perl
perl-libwww-perl.noarch 6.05-2.el7 @rhel-x86_64-server-7

nagios@nagiosxi:[~]: cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.9 (Maipo)


Best Regards,
bsanjay
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Datastore monitoring failing in Nagios xi 5.8.3

Post by vtrac »

Hi,
How are you doing?
Were there any changes made to your system?
Were there any updates made to the Perl modules or the Perl SDK?

I found a similar forum thread with the exact same issue:
https://support.nagios.com/forum/viewto ... 1&start=30

It looks like it was resolved by installing the package below:

cpan -i GAAS/libwww-perl-6.05.tar.gz
Best Regards,
Vinh
bsanjay
Posts: 86
Joined: Mon Apr 29, 2019 9:38 am

Re: Datastore monitoring failing in Nagios xi 5.8.3

Post by bsanjay »

Hi vtrac,
I tried that, but I still get the same error. Please help: other checks on the same vCenter work from the same Nagios server, and only this datastore check fails.


Best Regards,
bsanjay
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Datastore monitoring failing in Nagios xi 5.8.3

Post by vtrac »

Hi bsanjay,
How are you doing?

There must have been changes made to your remote host "100.20.55.76", since it was working before.

Please find out from your Unix admin or network admin what changes were made.


Best Regards,
Vinh
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Datastore monitoring failing in Nagios xi 5.8.3

Post by vtrac »

Also, please make sure the authentication file you passed in has the correct info ... :-)

Regards,
Vinh
bsanjay
Posts: 86
Joined: Mon Apr 29, 2019 9:38 am

Re: Datastore monitoring failing in Nagios xi 5.8.3

Post by bsanjay »

Hi vtrac,
As I already mentioned, the other checks for the same vCenter from the same Nagios server are working fine; the only issue is with datastore monitoring on this vCenter.

If there had been changes on the vCenter, or if the password configuration had been updated, then none of the services monitored for that vCenter would work. Please see the attached screenshot for your reference.

Best Regards,
Sanjay Batkura
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Datastore monitoring failing in Nagios xi 5.8.3

Post by vtrac »

Hi Sanjay,
Sorry, my bad!! ... :-)

Did you just upgrade the vSphere SDK and vmware-cmd to version 7.0.0?

I checked with my teammate Sean and here is his suggestion:
I would have them increase this setting on their worker:
job_timeout=60
to 120 or higher, then have them restart the mod_gearman worker service on that worker.

Then have them add -t 120 to that service check.

Then make sure they have this or higher in /usr/local/nagios/etc/nagios.cfg:
service_check_timeout=120
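
Before changing anything, it may help to see what the two timeouts are currently set to. The following is only a minimal sketch: the helper name cfg_value is made up for illustration, and the worker config path is an assumption (a common mod_gearman location) that may differ on your install. The nagios.cfg path is the one given above.

```shell
#!/bin/sh
# Print the value of KEY from a simple key=value config file.
# Commented-out lines are skipped; if the key appears more than once,
# the last occurrence is printed.
cfg_value() {
    file=$1
    key=$2
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
}

# Path taken from the thread.
NAGIOS_CFG=/usr/local/nagios/etc/nagios.cfg
# ASSUMPTION: typical mod_gearman worker config location; adjust as needed.
WORKER_CFG=/etc/mod_gearman/worker.conf

# Only report on files that actually exist and are readable.
if [ -r "$NAGIOS_CFG" ]; then
    echo "service_check_timeout=$(cfg_value "$NAGIOS_CFG" service_check_timeout)"
fi
if [ -r "$WORKER_CFG" ]; then
    echo "job_timeout=$(cfg_value "$WORKER_CFG" job_timeout)"
fi
```

If either value prints below 120, that is the setting to raise before restarting the corresponding service.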
Best Regards,
Vinh
bsanjay
Posts: 86
Joined: Mon Apr 29, 2019 9:38 am

Re: Datastore monitoring failing in Nagios xi 5.8.3

Post by bsanjay »

Hi vtrac,
I think I need to explain it another way. When I run the check command, even from the CLI, it takes almost 30 minutes to return, and the output contains a lot of error messages, which I have attached in a text file. I can see the perf data at the end of this output, but not the status information (screenshot attached).
Apart from the datastore check, all checks are working fine; only the datastore check takes too long to complete and then finishes with an error.


Best Regards,
bsanjay
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Datastore monitoring failing in Nagios xi 5.8.3

Post by vtrac »

Hi bsanjay,
How are you doing? ... :-)

Did you just upgrade the vSphere SDK and vmware-cmd to version 7.0.0?

Regards,
Vinh
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Datastore monitoring failing in Nagios xi 5.8.3

Post by vtrac »

Hi Bsanjay,
Please share what changes were made to your system, as it is very difficult for me to know what steps to take otherwise.

Also, this script was not written by Nagios, so I am asking development for help.

I checked with development, QA, and support for help on this. Here is one suggestion I got from Tom (from support):
I have seen in the past that datastores require different permissions to be set. That was for a different plugin, but it may be the issue they are having.

They may have to do something like this.

The problem is with the permissions that have been applied in the vSphere Client. For some reason assigning
the permissions at the top level of “Inventory > Hosts and Clusters” is not enough. You need to also assign the
permissions at the top level of “Inventory > Datastores and Datastore Clusters”.

Plus, the check taking 30 minutes to run is never going to work in Nagios.
If they added a lot of datastores to VMware, that could be the issue as well.
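
To reproduce the slow check from the CLI without waiting 30 minutes, the command can be bounded with the coreutils timeout utility, which exits with status 124 when the limit is hit. This is a rough sketch only; run_limited is a made-up helper name, and the limit you pass should match whatever check timeout you settle on.

```shell
#!/bin/sh
# Run a command with a time limit and report how it ended.
# Usage: run_limited SECONDS COMMAND [ARGS...]
run_limited() {
    limit=$1
    shift
    start=$(date +%s)
    rc=0
    # timeout(1) exits with 124 if the command ran past the limit.
    timeout "$limit" "$@" || rc=$?
    elapsed=$(( $(date +%s) - start ))
    if [ "$rc" -eq 124 ]; then
        echo "TIMED OUT after ${limit}s"
    else
        echo "finished in ${elapsed}s with exit code ${rc}"
    fi
    return "$rc"
}

# Harmless demonstration call.
run_limited 5 true

# Against the real check (command taken from the first post, not run here):
# run_limited 120 /usr/local/nagios/libexec/check_vmware_api.pl \
#     -H "100.20.55.76" \
#     -f "/usr/local/nagiosxi/etc/components/vmware/auvc1_ux_corp_local_auth.txt" \
#     -l "VMFS"
```

If the real check still reports TIMED OUT with a limit of 120, raising the Nagios timeouts alone will not be enough and the number of datastores being queried has to come down.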

They may need to blacklist the datastores they do not have access to.
Something like this example can be added to the command.

-o blacklistregexp -x 'ISOs|ESX|data'
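
Before adding a pattern like this to the service, it can be useful to preview which datastores it would exclude. The sketch below uses grep -E as a stand-in for the plugin's own matching, on the assumption that the plugin applies the pattern as an unanchored, case-sensitive regular expression. The datastore names are hypothetical and for illustration only.

```shell
#!/bin/sh
# Split datastore names (one per line on stdin) by a blacklist pattern.
# grep -E stands in for the plugin's Perl regex; for a plain alternation
# such as 'ISOs|ESX|data' the two behave the same.
blacklisted()   { grep -E -- "$1"; }
still_checked() { grep -E -v -- "$1"; }

PATTERN='ISOs|ESX|data'   # pattern from the suggestion above

# HYPOTHETICAL datastore names, not taken from the thread.
names='ISOs-share
ESX01-local
prod-vmfs-01
userdata-02
backup-vmfs'

echo "Blacklisted:"
printf '%s\n' "$names" | blacklisted "$PATTERN"
echo "Still checked:"
printf '%s\n' "$names" | still_checked "$PATTERN"
```

Note that a short alternative such as data also matches names like userdata-02, so the pattern may exclude more than intended. Once the list looks right, the options would be appended to the existing check command, for example after -l "VMFS".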
Please follow the steps suggested by Tom (above) and see if that helps!

Best Regards,
Vinh