High CPU Load after upgrading from 4.3.4 to 4.4.5

aanderson
Posts: 12
Joined: Thu Aug 24, 2017 4:02 pm

Re: High CPU Load after upgrading from 4.3.4 to 4.4.5

Post by aanderson »

fleish wrote: FWIW, I experienced similar behavior when upgrading from 4.4.3 to 4.4.5. Downgrading back to 4.4.3 fixed it before I found this thread: https://i.imgur.com/SOUtJmX.jpg
The graph looks very similar to mine, even the peaks and troughs. As long as you set max_concurrent_checks to 15, or whatever value lets your checks spread out evenly, you should be fine on 4.4.5. I've had no problems with spikes since doing that. The problem should only come back if you stop Nagios for more than 5 minutes, which causes the checks to bunch up again.
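To illustrate why a cap helps, here is a minimal Python sketch (not Nagios code) that models a burst of checks all falling due at the same moment. The function name `simulate_burst` and the figures used (800 checks, 2 seconds each) are made up for illustration; the only detail borrowed from Nagios is that `max_concurrent_checks=0` means no limit.

```python
import heapq

def simulate_burst(num_checks, check_duration, max_concurrent):
    """Model num_checks that are all due at t=0, each running check_duration seconds.

    max_concurrent=0 means unlimited. Returns (peak_running, finish_time).
    """
    limit = num_checks if max_concurrent == 0 else max_concurrent
    running = []  # min-heap of finish times for checks still running
    now = 0.0
    peak = 0
    for _ in range(num_checks):
        if len(running) >= limit:
            # All slots are busy: advance time to the earliest finish.
            now = running[0]
        # Drop every check that has finished by the current time.
        while running and running[0] <= now:
            heapq.heappop(running)
        heapq.heappush(running, now + check_duration)
        peak = max(peak, len(running))
    finish = max(running) if running else 0.0
    return peak, finish

# Hypothetical burst: 800 checks bunched together, 2 seconds each.
print(simulate_burst(800, 2.0, 0))   # (800, 2.0): everything forks at once
print(simulate_burst(800, 2.0, 15))  # (15, 108.0): same work, flattened peak
```

The cap trades a short, very tall spike for a longer, flatter run, which is roughly what the load graph shows once the setting is applied.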

regards,
Aidan
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: High CPU Load after upgrading from 4.3.4 to 4.4.5

Post by scottwilkerson »

I see you have livestatus enabled. I'm not sure whether it could be causing an issue, but could you disable the livestatus module in nagios.cfg to see if the problem persists?
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart
aanderson
Posts: 12
Joined: Thu Aug 24, 2017 4:02 pm

Re: High CPU Load after upgrading from 4.3.4 to 4.4.5

Post by aanderson »

I disabled livestatus and tested again. Same issue: 80% of checks are rescheduled to run at the same time, with the remainder spaced over the 8 seconds after that. Very high load recorded, as usual.
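A rough way to put a number on this kind of bunching is to count how many checks share the same next-check second. The Python sketch below assumes the status file contains one `next_check=<epoch seconds>` line per host or service and that a value of 0 means the check is not scheduled; both are assumptions about the file layout, and `busiest_second` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed layout: lines such as "    next_check=1577836800" in the status file.
NEXT_CHECK = re.compile(r"^\s*next_check=(\d+)\s*$", re.MULTILINE)

def busiest_second(status_text):
    """Return (timestamp, count, share) for the second with the most checks due.

    Returns None when no scheduled checks are found.
    """
    stamps = [int(value) for value in NEXT_CHECK.findall(status_text)]
    stamps = [s for s in stamps if s > 0]  # assumption: 0 means "not scheduled"
    if not stamps:
        return None
    timestamp, count = Counter(stamps).most_common(1)[0]
    return timestamp, count, count / len(stamps)
```

It could be fed the contents of whatever file the `status_file` directive in nagios.cfg points to; a share close to 0.8 would match the behaviour described here.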

regards,
Aidan
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: High CPU Load after upgrading from 4.3.4 to 4.4.5

Post by scottwilkerson »

Could you try setting the following in nagios.cfg:

auto_reschedule_checks=1
auto_rescheduling_interval=30
auto_rescheduling_window=45
aanderson
Posts: 12
Joined: Thu Aug 24, 2017 4:02 pm

Re: High CPU Load after upgrading from 4.3.4 to 4.4.5

Post by aanderson »

I'm on the move at the moment due to the holidays but will get this tested tomorrow evening and let you know.

regards,
Aidan
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: High CPU Load after upgrading from 4.3.4 to 4.4.5

Post by scottwilkerson »

Sounds good.
aanderson
Posts: 12
Joined: Thu Aug 24, 2017 4:02 pm

Re: High CPU Load after upgrading from 4.3.4 to 4.4.5

Post by aanderson »

I've finally got round to testing the auto rescheduling options. I set them as per above and this has resolved the issue. I tested as before and stopped Nagios for over 5 minutes to let all the checks bunch up. After starting Nagios it was showing the usual 80% of checks scheduled to run in the same second. However, about 30-40 seconds before they were due to run, the auto rescheduling kicked in and spread them out evenly over the next 5 minutes avoiding the huge CPU spike.
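The effect described above can be sketched in a few lines of Python. This is only an illustration of spreading checks evenly across a window, not the algorithm Nagios itself uses; the names `spread_evenly` and `peak_per_second` and the sample figures (300 checks, 240 of them bunched into one second, a 300-second window) are hypothetical.

```python
from collections import Counter

def spread_evenly(due_times, window_start, window_seconds):
    """Return new due times, evenly spaced across the window.

    due_times maps a check name to its current due time (epoch seconds).
    Checks keep their relative order; ties are broken by name.
    """
    order = sorted(due_times, key=lambda name: (due_times[name], name))
    if not order:
        return {}
    step = window_seconds / len(order)
    return {name: window_start + i * step for i, name in enumerate(order)}

def peak_per_second(due_times):
    """Largest number of checks due within any single whole second."""
    if not due_times:
        return 0
    return max(Counter(int(t) for t in due_times.values()).values())

# Hypothetical state after a long stop: 240 of 300 checks due in the same second.
bunched = {f"svc{i:03d}": 1000 for i in range(240)}
bunched.update({f"svc{i:03d}": 1000 + (i - 239) for i in range(240, 300)})

spread = spread_evenly(bunched, window_start=1000, window_seconds=300)
print(peak_per_second(bunched), peak_per_second(spread))  # 240 1
```

With 300 checks over 300 seconds, each second ends up with a single check, which is why the large spike disappears.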

I have left Nagios running with the auto rescheduling options in place and will let you know if I notice any performance hit. Host and service check latency is low so it looks like it is working fine.

regards,
Aidan
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: High CPU Load after upgrading from 4.3.4 to 4.4.5

Post by scottwilkerson »

Awesome! Glad to help.