Good afternoon,
No, this thread cannot be locked, as this is not a fix for every host that goes down. We have a fleet of systems in the field that will be going on and off in the future and will generate a ton of noise that cannot be managed. We need to find out why these hosts are being reported down in XI, as we cannot consider a host-down alert to be reliable.
Nagios xi 5.8.5 update, clients not reporting
-
- Posts: 109
- Joined: Thu Apr 16, 2020 10:27 am
-
- Dreams In Code
- Posts: 7682
- Joined: Wed Feb 11, 2015 12:54 pm
Re: Nagios xi 5.8.5 update, clients not reporting
If you need to restart the ncpa_passive service after upgrading, the only thing that I can think of is that NRDP was unavailable during the upgrade and timed out X number of times. Restarting the passive service may cause it to try again; I'll need to lab this up.
I can't think of any other reason why it would start working after only restarting the ncpa_passive service if you didn't have to do anything to the XI server.
-
- Posts: 109
- Joined: Thu Apr 16, 2020 10:27 am
Re: Nagios xi 5.8.5 update, clients not reporting
Good evening all,
To clarify,
The hosts will periodically go down and will require the passive service on the endpoint to be restarted. We noticed this trend after the upgrade, which is why I had noted it. There seems to be no pattern to which host goes down and stays down even though the actual host is up. It does not trigger any type of warning, such as a network failure, around the time of the event. The hosts other than the one noted in the post are currently up. I have not yet restarted the service for the one in question, in case there was any additional troubleshooting we wanted to perform on the host prior to simply restarting the service. This obviously is not the host that is on a passive NCPA connection which is experiencing this issue.
-
- Dreams In Code
- Posts: 7682
- Joined: Wed Feb 11, 2015 12:54 pm
Re: Nagios xi 5.8.5 update, clients not reporting
It doesn't really sound like an XI server issue if you need to touch the sending device's service to get it to work again.
Didn't you have security software impacting the NRDP checks before? Could that be occurring here?
Set loglevel to debug in the ncpa.cfg on one of the affected systems and restart the services. Once it fails again, attach the ncpa_passive.log so we can see what it is showing.
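If it helps when doing this across several endpoints, here is a minimal sketch of flipping that setting with a script. It assumes ncpa.cfg is INI-style and that `loglevel` lives in a `[general]` section; check that against your own file. The `set_loglevel` helper and the sample text are hypothetical and only for illustration. It edits line by line so comments and other settings are left alone.

```python
import re


def set_loglevel(cfg_text: str, level: str = "debug", section: str = "general") -> str:
    """Return cfg_text with `loglevel` set to `level` inside [section].

    Assumption: the section name ("general") and key name ("loglevel")
    match what your ncpa.cfg actually uses.
    """
    out = []
    in_section = False
    replaced = False
    for line in cfg_text.splitlines():
        header = re.match(r"\s*\[(.+?)\]\s*$", line)
        if header:
            # Leaving the target section without finding the key: add it.
            if in_section and not replaced:
                out.append(f"loglevel = {level}")
                replaced = True
            in_section = header.group(1).strip() == section
        elif in_section and re.match(r"\s*loglevel\s*[=:]", line):
            # Overwrite the active (uncommented) loglevel line.
            line = f"loglevel = {level}"
            replaced = True
        out.append(line)
    # Target section was the last one and had no loglevel key.
    if in_section and not replaced:
        out.append(f"loglevel = {level}")
        replaced = True
    if not replaced:
        raise ValueError(f"section [{section}] not found")
    return "\n".join(out) + "\n"


# Illustrative input only, not a real ncpa.cfg.
sample = (
    "[general]\n"
    "# logging\n"
    "loglevel = warning\n"
    "\n"
    "[nrdp]\n"
    "parent = http://example/nrdp/\n"
)
print(set_loglevel(sample))
```

You would still need to restart the NCPA services afterwards for the new log level to take effect.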
Please create a ticket for this and include a link back to this forum thread so we can get a remote session setup:
https://support.nagios.com/tickets/
-
- Posts: 109
- Joined: Thu Apr 16, 2020 10:27 am
Re: Nagios xi 5.8.5 update, clients not reporting
Good morning Sean,
The debug log level has been set on the client and the service has been reset. The host is currently seen as up in XI. I created a ticket for the issue: #214061
Will we do the remote session the next time this occurs on this host? I'm not sure how often this occurs on this specific host, as we don't track the frequency.
-
- Dreams In Code
- Posts: 7682
- Joined: Wed Feb 11, 2015 12:54 pm
Re: Nagios xi 5.8.5 update, clients not reporting
Ahh okay, hopefully the debug logs will show us more. I wasn't able to replicate the issue.
If we're unable to replicate it on the remote session, we're pretty limited in what we can do to troubleshoot it.