Nagios XI Detecting flapping that isn't there

BanditBBS
Posts: 2474
Joined: Tue May 31, 2011 12:57 pm
Location: Scio, OH

Nagios XI Detecting flapping that isn't there

Post by BanditBBS »

We cover a large geographic area up and down the Ohio River Valley, but all sites are connected via MPLS and have great response times. We only have the one XI server, at the main office. At two of the sites we have devices that keep reporting as down and then show as up again on the subsequent check. However, we can ping them from our desktops in the same location as the XI server and no pings are ever dropped. We had to turn off alerting for flap detection because there are just too many hosts appearing as flapping. We only have a few hundred hosts and services, so the install isn't that large.

We are running the official VM image upgraded to 1.5.
2 of xi5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Nagios XI Detecting flapping that isn't there

Post by mguthrie »

BanditBBS wrote: At two of the sites we have devices that keep reporting as down and then show as up again on the subsequent check.
What do you have set for the "max_check_attempts" for these hosts?

Here's some good info on soft vs hard states:
http://nagios.sourceforge.net/docs/3_0/statetypes.html


The default "check-host-alive" command uses the /usr/local/nagios/libexec/check_icmp check plugin. You could try running it from the command line with a larger number of packets and see what it reports. There's a lot of flexibility in that plugin, so if you need to tweak the "check-host-alive" command to make the check a little more forgiving, you can do that as well.
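
As a rough illustration of the command-line test suggested above, here is a minimal Python sketch that assembles a check_icmp invocation with more packets than the plugin's default. The plugin path, the example address 192.0.2.10, the packet count, and the threshold values are all assumptions chosen for illustration; adjust them to match your own install and the existing "check-host-alive" definition. The helper name build_check_icmp_command is hypothetical.

```python
import shlex

# Assumed plugin location on a typical Nagios XI install; verify on your server.
DEFAULT_PLUGIN = "/usr/local/nagios/libexec/check_icmp"


def build_check_icmp_command(host, packets=20,
                             warn="3000.0,80%", crit="5000.0,100%",
                             plugin=DEFAULT_PLUGIN):
    """Return the argv list for a check_icmp test run.

    -H  target host
    -n  number of packets to send
    -w  warning threshold as "round-trip ms,packet loss %"
    -c  critical threshold in the same format
    The threshold values here are illustrative assumptions, not recommendations.
    """
    return [plugin, "-H", host, "-n", str(packets), "-w", warn, "-c", crit]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-range placeholder address.
    argv = build_check_icmp_command("192.0.2.10")
    # Print a copy-and-paste form of the command for a manual test.
    print(shlex.join(argv))
```

Running the printed command a few times against one of the problem devices should show whether packet loss or latency appears from the XI server itself, as opposed to from a desktop on the same network.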
BanditBBS
Posts: 2474
Joined: Tue May 31, 2011 12:57 pm
Location: Scio, OH

Re: Nagios XI Detecting flapping that isn't there

Post by BanditBBS »

mguthrie wrote: What do you have set for the "max_check_attempts" for these hosts?
max_check_attempts is set to 3, with check_interval set to 15 and retry_interval set to 5.
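
To make the effect of these settings concrete, here is a simplified Python model of the soft and hard state logic described in the state types documentation linked earlier in the thread. It is a sketch under assumptions, not Nagios's actual implementation: it tracks only UP and DOWN, ignores UNREACHABLE and check scheduling, and assumes intervals are in minutes (the default interval_length of 60 seconds). The names HostState, apply_check, and minutes_to_hard_down are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HostState:
    state: str = "UP"          # last observed state: "UP" or "DOWN"
    state_type: str = "HARD"   # "SOFT" while retrying, "HARD" once confirmed
    attempt: int = 1           # current check attempt within a problem


def apply_check(host, result, max_check_attempts=3):
    """Apply one check result; return True only on a HARD state change."""
    if result == "UP":
        was_hard_down = host.state == "DOWN" and host.state_type == "HARD"
        # Any UP result resets the host; a recovery from SOFT DOWN is silent.
        host.state, host.state_type, host.attempt = "UP", "HARD", 1
        return was_hard_down
    if host.state == "UP":
        # First failure: SOFT unless only one attempt is allowed.
        host.state, host.attempt = "DOWN", 1
        host.state_type = "HARD" if max_check_attempts <= 1 else "SOFT"
        return host.state_type == "HARD"
    if host.state_type == "SOFT":
        # Repeated failure while retrying: count it, go HARD at the limit.
        host.attempt += 1
        if host.attempt >= max_check_attempts:
            host.state_type = "HARD"
            return True
    return False


def minutes_to_hard_down(max_check_attempts, retry_interval):
    """Minutes from the first failed check to a HARD DOWN, if every retry fails."""
    return (max_check_attempts - 1) * retry_interval


if __name__ == "__main__":
    host = HostState()
    for result in ["UP", "DOWN", "UP", "DOWN", "DOWN", "DOWN", "UP"]:
        changed = apply_check(host, result)
        print(result, host.state_type, host.attempt, "hard change" if changed else "")
    print(minutes_to_hard_down(3, 5), "minutes to HARD DOWN")
```

In this model, a single failed check followed by a successful retry never reaches a HARD state, which matches the down-then-up pattern described in the first post. With max_check_attempts at 3 and retry_interval at 5, a host would have to fail three checks in a row, roughly 10 minutes, before it is treated as HARD DOWN.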

I'll mess around with check_icmp and see if I can make this any better.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Nagios XI Detecting flapping that isn't there

Post by mguthrie »

Sounds good, let us know if you have additional questions.