Problem Description
With the release of Nagios XI 2014, the core version on the back-end was updated to Nagios Core 4. This introduced an issue in certain environments where an extremely high system load can occur at intervals, most commonly between one and seven hours after the Nagios process starts.
Editing Files
In many steps of this article you will be required to edit files. This documentation uses the vi text editor. When using the vi editor:
- To make changes press i on the keyboard first to enter insert mode
- Press Esc to exit insert mode
- When you have finished, save the changes in vi by typing :wq and pressing Enter
Resolving The Problem
As a temporary workaround, if you have been experiencing this problem we recommend that you modify the following file:
/usr/local/nagiosxi/html/config.inc.php
By changing the following line:
"nom_checkpoint_interval" => 1440, // time (in minutes) between nom checkpoints
To:
"nom_checkpoint_interval" => 90, // time (in minutes) between nom checkpoints
You may want to adjust this interval based on when you experience the problem.
Ideally, the checkpoint should occur as close as possible to the high-load anomaly, to minimize system downtime and stress while we work towards a more permanent solution.
Each checkpoint forces the creation of a configuration snapshot, so you may want to archive any important config snapshots, as this change will increase the number of daily snapshots (possibly pushing needed snapshots out of the pool).
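For administrators who prefer to script the change rather than edit the file in vi, the following is a minimal sketch of the same edit. The function name set_nom_interval is hypothetical and not part of Nagios XI. The sketch assumes GNU sed (for the -i option) and that the setting appears on a single line in the form shown above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Nagios XI): set the value of
# "nom_checkpoint_interval" in a config file, keeping a backup copy.
# Assumes GNU sed (-i) and the single-line setting format shown above.
set_nom_interval() {
    config_file="$1"
    minutes="$2"

    # Refuse anything that is not a plain positive integer.
    case "$minutes" in
        ''|*[!0-9]*) echo "interval must be a whole number of minutes" >&2; return 1 ;;
    esac

    # Keep the original so the change can be rolled back.
    cp -p "$config_file" "$config_file.bak" || return 1

    # Only touch the line containing the key; replace the number after "=>".
    sed -i '/"nom_checkpoint_interval"/ s/=>[[:space:]]*[0-9][0-9]*/=> '"$minutes"'/' "$config_file"
}

# Example (run as root on the Nagios XI server):
# set_nom_interval /usr/local/nagiosxi/html/config.inc.php 90
```

The backup is written next to the original with a .bak suffix. To revert, copy it back over config.inc.php.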
Final Thoughts
For any support related questions please visit the Nagios Support Forums at: