check_capacity_planning.py not working correctly

optionstechnology · Post by **optionstechnology** » Mon Apr 26, 2021 6:14 am

I use the following command to check disk usage:

/check_nrpe -H HOSTNAME -t 59 -u -c Check_DriveSize -a "drive=E:" "empty-state=ok" "show-all" "warn=free<10%"  "crit=free<5%"

It is currently outputting:

OK E: - 11.495GB/39.999GB - 71% Free

I can also see in the perfdata that it is correctly outputting the warning/critical levels:

'E: free'=28.50363GB;3.99989;1.99994;0;39.99899 'E: free %'=71%;10;5;0;100
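
For reference, that perfdata line follows the standard Nagios plugin layout `'label'=value[UOM];warn;crit;min;max`. Below is a minimal Python sketch for pulling the fields apart, which can help when checking which thresholds a forecast has to work with. The name `parse_perfdata` is made up for this illustration and is not part of NSClient++ or check_capacity_planning.py. Thresholds are kept as raw strings because in general they may be ranges rather than plain numbers.

```python
import re

# Matches either 'quoted label'=data or bare_label=data.
_ITEM = re.compile(r"'([^']+)'=(\S+)|([^\s=']+)=(\S+)")
# Splits a value such as "28.50363GB" into number and unit of measure.
_VALUE = re.compile(r"(-?\d+(?:\.\d+)?)(.*)")


def parse_perfdata(text):
    """Return {label: {value, uom, warn, crit, min, max}} for a perfdata string."""
    parsed = {}
    for match in _ITEM.finditer(text):
        label = match.group(1) or match.group(3)
        fields = (match.group(2) or match.group(4)).split(";")
        fields += [""] * (5 - len(fields))  # pad missing trailing fields
        value = _VALUE.match(fields[0])
        if value is None:
            continue  # skip non-numeric values
        parsed[label] = {
            "value": float(value.group(1)),
            "uom": value.group(2),
            "warn": fields[1],
            "crit": fields[2],
            "min": fields[3],
            "max": fields[4],
        }
    return parsed


line = "'E: free'=28.50363GB;3.99989;1.99994;0;39.99899 'E: free %'=71%;10;5;0;100"
for label, item in parse_perfdata(line).items():
    print(label, item)
```

For the line above, the 'E: free %' entry carries a value of 71 with warn 10 and crit 5, both far below the current reading.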

I use this to check the capacity planning on the same server:

/check_capacity_planning.py -H HOSTNAME -S 'Disk_Usage_-_E__Drive' -d 'E__free_%' -m 'Holt-Winters' -l '8w' -w 7d -c 7d

CRITICAL: E__free_% will reach value 70.38 in 0.0 days

Why is it alerting at 70.38? Where is it even getting that number from?

gsmith · Post by **gsmith** » Mon Apr 26, 2021 4:34 pm

Hi,

It's getting the 71% from the perfdata 'E: free %'=71%;10;5;0;100. It anticipates hitting 70.38 in 0.0 days and is rounding up (worst-case scenario).

If you want to set a specific value for the warning, you can check out these options:
--critical-uses-custom
--warning-uses-custom
--custom-value=CUSTOM_VALUE

See the help output below:

[root@gs-cent8-23-82 libexec]# ./check_capacity_planning.py -h
Usage: check_capacity_planning.py: alert based on the time until a perfdata value is expected to become CRITICAL.
usage: check_capacity_planning.py -H <host-name> -d <perdata-name>[,<perfdata-name>...] --use-warning|--use-critical|--min <min>|--max <max> [options...]

Options:
  -h, --help            show this help message and exit
  -H HOST_NAME, --host-name=HOST_NAME
                        The name of the host which you're monitoring
                        (incompatible with hostgroup/servicegroup)
  -S SERVICE_DESCRIPTION, --service-description=SERVICE_DESCRIPTION
                        The name of the service which you're monitoring
                        (requires host-name, incompatible with
                        hostgroup/servicegroup)
  -t TIMEOUT, --timeout=TIMEOUT
                        Set the timeout duration in seconds. Defaults to never
                        timing out.
  -w WARNING, --warning=WARNING
                        How far in advance of the predicted threshold to start
                        returning WARNING. Valid units are d, w, m, y (for
                        days, weeks, months, and years respectively). If left
                        empty, plugin does not alert WARNING.
  -c CRITICAL, --critical=CRITICAL
                        How far in advance of the predicted threshold to start
                        returning CRITICAL. Valid units are d, w, m, y (for
                        days, weeks, months, and years respectively). If left
                        empty, plugin does not alert CRITICAL.
  -m METHOD, --method=METHOD
                        The extrapolation method used for prediction. Should
                        be one of 'holt-winters', 'linear', 'quadratic',
                        'cubic' (only the first character is checked).
                        Defaults to holt-winters
  -d DATA_SOURCE, --data-source=DATA_SOURCE
                        The perfdata name for which you are planning
  -l LOOKAHEAD, --lookahead=LOOKAHEAD
                        How far in advance to look ahead when calculating
                        predicted values. Defaults to 8w (8 weeks).
  -v, --verbose         Print more verbose error messages.
  -V, --version         Print the version number and exit.
  --debug               Prints additional text in place of service output.
  --warning-uses-critical
                        If this flag is set, this plugin will alert WARNING
                        based on the forecast against the performance data's
                        CRITICAL value
  --warning-uses-custom
                        If this is set to a number, this plugin will alert
                        WARNING based on the forecast against a custom
                        value.Requires that --custom-value is set.
  --warning-is-minimal  If this flag is set, this plugin will alert based on
                        when forecasted data goes below the desired value.
  --critical-uses-warning
                        If this flag is set, this plugin will alert CRITICAL
                        based on the forecast against the performance data's
                        WARNING value
  --critical-uses-custom
                        If this is set to a number, this plugin will alert
                        CRITICAL based on the forecast against a custom
                        value.Requires that --custom-value is set.
  --critical-is-minimal
                        If this flag is set, this plugin will alert based on
                        when forecasted data goes below the desired value.
  --custom-value=CUSTOM_VALUE
                        The value to use with --warning-uses-custom,
                        --critical-uses-custom.
[root@gs-cent8-23-82 libexec]#
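
The -w, -c and -l options above all take a number followed by d, w, m or y. Here is a rough Python sketch of turning such a value into days, for comparing windows by hand. The month and year lengths (30 and 365 days) are assumptions made for this illustration, and `duration_to_days` is a hypothetical helper; the plugin's own conversion may differ.

```python
# Days per unit. The m and y lengths are assumptions, not taken from the plugin.
UNIT_DAYS = {"d": 1, "w": 7, "m": 30, "y": 365}


def duration_to_days(text):
    """Convert a duration such as '8w' or '15d' into a number of days."""
    text = text.strip().lower()
    number, unit = text[:-1], text[-1:]
    if unit not in UNIT_DAYS or not number:
        raise ValueError("expected a number followed by d, w, m or y: %r" % text)
    return float(number) * UNIT_DAYS[unit]


for sample in ("8w", "15d", "10d"):
    print(sample, "=", duration_to_days(sample), "days")
```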

Thanks

optionstechnology · Post by **optionstechnology** » Tue Apr 27, 2021 5:19 am

Sorry I don't understand... so the disk is currently at 71%... and the check is designed to tell you when it hits the result that the check is currently at?
That would mean the check is always critical...

Surely it's supposed to alert based on the warning/critical level set in the perf data for the graph it is monitoring?

Regardless of that, I tried it with the custom value as suggested:

/usr/local/nagios/libexec/check_capacity_planning.py -H HOSTNAME -S 'Disk_Usage_-_E__Drive' -d 'E__free_%' -m 'Holt-Winters' -l '8w' --critical-uses-custom --custom-value=50

and I get this:

CRITICAL: E__free_% will reach value 70.96 in 0.01 days

ssax · Post by **ssax** » Tue Apr 27, 2021 6:50 pm

You need to define a higher threshold than what it currently is:

/usr/local/nagios/libexec/check_capacity_planning.py -H HOSTNAME -S 'Disk_Usage_-_E__Drive' -d 'E__free_%' -m 'Holt-Winters' -l '8w' -w 15d -c 10d --warning-uses-custom --critical-uses-custom --custom-value 90

That will look ahead 8 weeks and warn if the value is predicted to reach 90 within 15 days; it will go critical if the value is predicted to reach 90 within 10 days.
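
The rule described above can be sketched in a few lines of Python. This is only an illustration of the decision as explained in this thread, not the plugin's actual code: `classify` and its parameters are hypothetical names, and the predicted number of days until the threshold is reached is assumed to come from whatever forecast method is in use.

```python
def classify(days_until_threshold, warning_days=None, critical_days=None,
             lookahead_days=56):
    """Map a predicted time-to-threshold onto a Nagios state.

    days_until_threshold is None when the forecast never reaches the
    threshold. warning_days and critical_days mirror -w and -c, and
    lookahead_days mirrors -l (8 weeks = 56 days).
    """
    if days_until_threshold is None or days_until_threshold > lookahead_days:
        return "OK"  # no crossing inside the lookahead period
    if critical_days is not None and days_until_threshold <= critical_days:
        return "CRITICAL"
    if warning_days is not None and days_until_threshold <= warning_days:
        return "WARNING"
    return "OK"


# -w 15d -c 10d with a few hypothetical forecasts
for days in (5, 12, 20, None):
    print(days, classify(days, warning_days=15, critical_days=10))
```

Note that in this reading a larger -c value widens the critical window, so raising it makes CRITICAL more likely for the same forecast, not less.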

optionstechnology · Post by **optionstechnology** » Thu May 13, 2021 5:31 am

I did try it with the suggested level:

/usr/local/nagios/libexec/check_capacity_planning.py -H HOSTNAME -S 'Disk_Usage_-_E__Drive' -d 'E__free_%' -m 'Holt-Winters' -l '8w' --critical-uses-custom --custom-value=100

but it still doesn't work:

CRITICAL: E__free_% will reach value 70.96 in 0.01 days

It's like it's ignoring the custom value option.

I did some experimenting. First I inverted the disk check, so rather than reporting on "free" space it reports on "used":

/usr/local/nagios/libexec/check_capacity_planning.py -H HOSTNAME -S 'Disk_Usage_-_E__Drive_Used' -d 'E__used' -m 'Holt-Winters' -l '8w'

This does work fine:

OK: E__used will not cross either threshold in lookahead period

So it makes sense that the check is not working, because it's expecting the data to go higher than the threshold, not lower. BUT the check does have a specific switch to make it treat the threshold as a lower bound: "--critical-is-minimal".

But when I use that, it also ignores the switch:

/usr/local/nagios/libexec/check_capacity_planning.py -H HOSTNAME -S 'Disk_Usage_-_E__Drive' -d 'E__free' -m 'Holt-Winters' -l '8w' --critical-is-minimal

I think the main problem here is that I cannot get the script to accept the switches I am passing to it, because both your solution and mine *should* solve this problem.
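
To make the "higher than" versus "lower than" distinction concrete, here is a small Python sketch using plain linear extrapolation. That is a deliberate simplification: it is not Holt-Winters and not the plugin's code. `days_until_cross`, the `minimal` flag and the slope of -0.5 percentage points per day are all hypothetical, chosen only to illustrate the idea behind a switch like --critical-is-minimal.

```python
def days_until_cross(current, slope_per_day, threshold, minimal=False):
    """Days until a linearly extrapolated value crosses a threshold.

    minimal=False: the threshold is an upper bound (alert when value >= it).
    minimal=True:  the threshold is a lower bound (alert when value <= it).
    Returns 0.0 if already crossed, None if the trend never gets there.
    """
    if minimal:
        # Flip signs so the lower-bound case reuses the upper-bound logic.
        current, slope_per_day, threshold = -current, -slope_per_day, -threshold
    if current >= threshold:
        return 0.0
    if slope_per_day <= 0:
        return None
    return (threshold - current) / slope_per_day


# Free space at 71% and shrinking by a hypothetical 0.5 points per day.
print(days_until_cross(71, -0.5, 50))                # upper bound: already past 50
print(days_until_cross(71, -0.5, 50, minimal=True))  # lower bound: some days away
print(days_until_cross(71, -0.5, 100))               # upper bound: never reached
```

Under this simplified model, a custom value of 50 treated as an upper bound counts as already crossed while the reading is 71, which is one plausible reading of the "0.01 days" result earlier in the thread.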

ssax · Post by **ssax** » Thu May 13, 2021 5:44 pm

What does this output?

/usr/local/nagios/libexec/check_capacity_planning.py -H HOSTNAME -S 'Disk_Usage_-_E__Drive' -d 'E__free_%' -m 'Holt-Winters' -l '8w' -w 15d -c 10d --warning-uses-custom --critical-uses-custom --custom-value 100

optionstechnology · Post by **optionstechnology** » Mon May 17, 2021 7:34 am

I get:

CRITICAL: E__free_% will reach value 102.0 in 9.98 days

I also tried increasing the critical number to 15 days, so it should return green given the above response:

/usr/local/nagios/libexec/check_capacity_planning.py -H HOSTNAME -S 'Disk_Usage_-_E__Drive' -d 'E__free_%' -m 'Holt-Winters' -l '8w' -w 30d -c 15d --warning-uses-custom --critical-uses-custom --custom-value 100

But I get this:

CRITICAL: E__free_% will reach value 101.28 in 9.98 days

benjaminsmith · Post by **benjaminsmith** » Tue May 18, 2021 2:56 pm

Hi,

I believe that is correct. The check is supposed to alert when the forecast is within x days of exceeding the custom value. Here it is predicted to pass the custom value in 9.98 days, which is inside the 15-day critical window (-c 15d), so CRITICAL is the expected result.

Best Regards,
Benjamin

Nagios Support Forum
