Hi Vinh,
Thanks for the reply. I've tried to answer parenthetically below...
Looks like the issue started after you upgraded the OS.
- Yes. After upgrading one of my SLES 15 SP2 servers to SP3, the 2.2.0 agent started reporting "node does not exist".
- I then upgraded the agent to 2.3.1. The first error went away, but a new problem appeared: no services are listed.
- In testing, I took a different SLES 15 SP2, upgraded it to SP3, and then installed NCPA 2.3.1 directly. The no-services behavior persists.
Was the OS upgraded on the Nagios XI machine or the NCPA agent machine?
- The upgrade was on a server that is monitored by the NCPA agent. The Nagios server is the CentOS appliance 5.8.4.
Could you please show me the command used? ... and the errors?
-- On the first server: rpm -Uvh https://assets.nagios.com/downloads/ncp ... x86_64.rpm
--- I was told this was the wrong agent, so I started over with a second server...
-- On the second server, I downloaded and ran this: rpm -Uvh ncpa-2.3.1.sle15.x86_64.rpm
-- I also tried a third server, and found that the services error seems to begin at NCPA 2.2.1 on SLES 15 SP3.
UNKNOWN: No services found for service names: postgresql-10.service
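One thing I have not ruled out is whether the agent lists the unit under a slightly different name than the one the check asks for (for example with or without the ".service" suffix). That is only a guess on my part, not a confirmed cause. A minimal sketch of how I would compare the requested name against a listing; the sample listing and the helper name are hypothetical, not real output from my agent:

```python
def find_service_candidates(wanted, listed):
    """Return listed names that equal `wanted` once a trailing '.service' is ignored."""
    suffix = ".service"

    def base(name):
        # Strip the systemd unit suffix so both spellings compare equal.
        return name[:-len(suffix)] if name.endswith(suffix) else name

    target = base(wanted)
    return sorted(name for name in listed if base(name) == target)


# Hypothetical listing, made up for illustration only.
sample_listing = {"postgresql-10": "running", "sshd": "running", "cron": "stopped"}

print(find_service_candidates("postgresql-10.service", sample_listing))
```

If this returns a name that differs from the one in the check, that name would be worth trying in the service= argument.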
You can get the command from the "Run check command" button:
Open the Nagios XI GUI > Configure > Core Config Manager > Services
Select the service that you are having issue > click the "Run check command" button.
-- [nagios@scanner.osufpp.org ~]$ /usr/local/nagios/libexec/check_ncpa.py -H drupshop-db-replica-01.osufpp.org -t 'our-lovely-secret-hash' -P 5693 -M 'services' -q 'service=postgresql-10.service,status=running'
-- UNKNOWN: No services found for service names: postgresql-10.service
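For reference, my understanding is that check_ncpa.py turns those arguments into a request against the agent's API. The sketch below builds what I believe is the equivalent URL so it can be opened in a browser or with curl. The /api/services path and the token, service, status and check parameter names are my assumption about the API layout, and build_ncpa_services_url is just a helper name I made up:

```python
from urllib.parse import urlencode


def build_ncpa_services_url(host, token, port=5693, service=None, status=None, check=True):
    """Build an NCPA 'services' API URL (assumed layout, not taken from the NCPA source)."""
    params = [("token", token)]
    if service:
        params.append(("service", service))
    if status:
        params.append(("status", status))
    if check:
        # check=1 is assumed to ask for a Nagios-style check result instead of raw data.
        params.append(("check", "1"))
    return "https://{}:{}/api/services?{}".format(host, port, urlencode(params))


url = build_ncpa_services_url(
    "drupshop-db-replica-01.osufpp.org",
    "our-lovely-secret-hash",
    service="postgresql-10.service",
    status="running",
)
print(url)
```

Calling it with only the host and token and check=False should give a URL for the unfiltered listing, which would show whether the agent reports any services at all.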
Also, please upload the "ncpa.cfg" file as well.
drupshop-db-rep:/usr/local/ncpa/etc # cat ncpa.cfg
#
# NCPA Main Config File
# ---------------------
#
#
# -------------------------------
# General Configuration
# -------------------------------
#
[general]
#
# Check logging (in ncpa.db and the interface) is on by default, you can disable it
# if you do not want to record the check requests that are coming in or checks being
# sent over NRDP.
# Default: check_logging = 1
#
check_logging = 1
#
# Check logging time - how long in DAYS you'd like to keep checks in the database.
# Default: 30
#
check_logging_time = 30
#
# Display all mounted disk partitions
# (essentially setting all=True here: https://psutil.readthedocs.io/en/latest/#psutil.disk_partitions)
# Default: 1
#
all_partitions = 1
#
# Excluded file system types removes these fs types from the disk metrics
# (This is mostly only noteable on UNIX systems but also works on Windows if you need it)
# Default: aufs,autofs,binfmt_misc,cifs,cgroup,configfs,debugfs,devpts,devtmpfs,
# encryptfs,efivarfs,fuse,fusectl,hugetlbfs,mqueue,nfs,overlayfs,proc,pstore,
# rpc_pipefs,securityfs,selinuxfs,smb,sysfs,tmpfs,tracefs
#
exclude_fs_types = aufs,autofs,binfmt_misc,cifs,cgroup,configfs,debugfs,devpts,devtmpfs,encryptfs,efivarfs,fuse,fusectl,hugetlbfs,mqueue,nfs,overlayfs,proc,pstore,rpc_pipefs,securityfs,selinuxfs,smb,sysfs,tmpfs,tracefs
#
# The default unit to convert bytes (B) into if no unit is specified
# (Gi = 1024 MiB, G = 1000 MB)
#
default_units = Gi
#
# -------------------------------
# Listener Configuration (daemon)
# -------------------------------
#
[listener]
#
# User and group to run plugins as (recommended to use nagios:nagios)
# Default: uid = nagios
# Default: gid = nagios
#
# ** Note - The daemon runs as root, but forks a child process when running a plugin
# that is defined by the user, for security reasons. However, without the main daemon
# running as root, much of the system information would be missing. This is typical behavior. **
#
# This is for Unix only (Linux, Mac OS X, etc)
#
uid = root
#nagios
gid = root
#nagios
#
# IP address and port number for the Listener to use for the web GUI and API
#
# :: allows for dual stack (IPv4 and IPv6 on most linux systems) but will only allow
# for IPv6 connections on Windows
# 0.0.0.0 allows for IPv4 connections only on Windows and most linux systems
#
# Default: ip = ::
# Default (Windows): ip = 0.0.0.0
# Default: port = 5693
#
# ip =
ip=172.16.1.22
# port =
#
# SSL connection and certificate config (if an SSL option is not available on some older
# operating systems it will default back to TLSv1)
# ssl_version options: TLSv1, TLSv1_1, TLSv1_2
#
# ssl_ciphers = <list of ciphers>
#
ssl_version = TLSv1_2
certificate = adhoc
#
# Listener logging file level, location, and the PID location
# Default: loglevel = info (debug, info, warning, error)
# Default: logfile = var/log/ncpa_listener.log
# Default: pidfile = var/run/ncpa_listener.pid (leave listener in pid file name)
#
loglevel = info
logfile = var/log/ncpa_listener.log
pidfile = var/run/ncpa_listener.pid
#
# Delay the listener (API & web GUI) from starting in seconds
# Default: 0
#
# delay_start = 30
#
# Allow admin functionality in the web GUI. When this is set to 0, the admin section will not
# be displayed in the header and will not be available to be accessed.
# Default: 1
#
admin_gui_access = 1
#
# Admin password for the admin section in the web GUI, by default there is no admin
# password and the admin section of the GUI can be accessed by anyone if admin_gui_access is set to 1.
# Default: None
#
# Note: Setting this value to 'None' will automatically log you in, setting it empty will allow you to
# log in using a blank password.
#
admin_password = None
#
# Require admin password to access ALL of the web GUI.
# This does not affect API access via token (community_string).
# Default: 0
#
admin_auth_only = 0
#
# Comma separated list of allowed hosts that can access the API (and GUI)
# Supported types: IPv4, IPv4-mapped IPv6, IPv6, hostnames
# Hostname wildcards are not supported.
#
# Exmaple IPv4: 192.168.23.15
# Example IPv4 subnet: 192.168.0.0/28
# Example IPv4-mapped IPv6: ::ffff:192.168.1.15
# Example IPv6: 2001:0db8:85a3:0000:0000:8a2e:0370:7334
# Example hostname: asterisk.mydomain.com
# Example mixed types: 192.168.23.15, 192.168.0.0/28, ::ffff:192.168.1.15, 2001:0db8:85a3:0000:0000:8a2e:0370:7334, asterisk.mydomain.com
#
# allowed_hosts =
allowed_hosts =172.16.0.0/16,172.20.0.0/16
#
# Number of maximum concurrent connections to the NCPA server.
# Use "None" for unlimited. Default is 200.
# Example: 200
#
# max_connections =
#
# Set the URL to use in the X-Frame-Options and Content-Security-Policy headers
# in order to enable the NCPA GUI to be allowed to load into a frame
# Default: None
# Example: mycoolwebsite.com
# Example: *.mycoolwebsite.com
#
# allowed_sources =
#
# The max size allowed for a log file in megabytes.
# When the log becomes larger than this, the log will be rolled over
# and a new log will be started.
# Default: 5
#
# logmaxmb =
#
# The max number of log rollovers that will be kept.
# Default: 5
#
# logbackups =
#
# -------------------------------
# Listener Configuration (API)
# -------------------------------
#
[api]
#
# The token that will be used to log into the basic web GUI (API browser, graphs, top charts, etc)
# and to authenticate requests to the API and requests through check_ncpa.py
#
community_string = our-lovely-secret-hash
#
# -------------------------------
# Passive Configuration (daemon)
# -------------------------------
#
[passive]
#
# Handlers are a comma separated list of what you would like the passive agent to run
# Default: None
# Options:
# nrdp, kafkaproducer
#
# Example:
# handlers = nrdp,kafkaproducer
#
handlers = None
#
# User and group to run passive checks as (Recommended to use nagios:nagios)
# Default: uid = nagios
# Default: gid = nagios
#
# This is for Unix only (Linux, Mac OS X, etc)
#
uid = root
#nagios
gid = root
#nagios
#
# Passive check interval - the amount in seconds to wait between each passive check by default,
# this can be overwritten by adding on a "|<duration>" in seconds to the passive check config
# Default: 300 (5 minutes)
#
sleep = 300
#
# Passive logging file level, location, and the PID location
# Default: loglevel = info (debug, info, warning, error)
# Default: logfile = var/log/ncpa_passive.log
# Default: pidfile = var/run/ncpa_passive.pid (leave passive in pid file name)
#
loglevel = info
logfile = var/log/ncpa_passive.log
pidfile = var/run/ncpa_passive.pid
#
# Delay passive checks from starting in seconds
# Default: 0
#
# delay_start = 30
#
# The max size allowed for a log file in megabytes.
# When the log becomes larger than this, the log will be rolled over
# and a new log will be started.
# Default: 5
#
# logmaxmb =
#
# The max number of log rollovers that will be kept.
# Default: 5
#
# logbackups =
#
# -------------------------------
# Passive Configuration (NRDP)
# -------------------------------
#
[nrdp]
#
# Connection settings to the NRDP server
# parent = NRDP server location (ex: http://<address>/nrdp)
# token = NRDP server token used to send NRDP results
#
parent =
token =
#
# The hostname that will replace %HOSTNAME% in the check definitions and will be
# sent to NRDP with the check name as the service description (service name)
#
hostname = NCPA 2
#
# -------------------------------
# Passive Configuration (Kafka)
# -------------------------------
#
[kafkaproducer]
hostname = None
servers = localhost:9092
clientname = NCPA-Kafka
topic = ncpa
#
# -------------------------------
# Plugin Configuration
# -------------------------------
#
[plugin directives]
#
# Plugin path where all plugins will be ran from.
#
plugin_path = plugins/
#
# Follow symlinks located in the plugin path
#
# This is for Unix only (Linux, Mac OS X, etc)
#
follow_symlinks = 0
#
# Plugin execution timeout in seconds. Different than the check_ncpa.py timeout, which is
# normally for network connection issues. Will return a CRITICAL value and error when the plugin
# reaches the defined max execution timeout and kills the process.
# Default: 60
#
# plugin_timeout = 60
#
# Comma separated list of plugins to run through sudo. Note: You will need to update your sudoers
# configuration for these plugins to work when called with sudo.
#
# Example: check_special,check_root_files
# (Command line: sudo /<plugin_absolute_path>/check_special <arguments>)
#
# This is for Unix only (Linux, Mac OS X, etc)
#
# run_with_sudo =
#
# Extensions for plugins
# ----------------------
# The extension for the plugin denotes how NCPA will try to run the plugin. Use this
# for setting how you want to run the plugin in the command line.
#
# NOTE: Plugins without an extension will be ran in the cmdline as follows:
# $plugin_name $plugin_args
#
# Defaults:
# .sh = /bin/sh $plugin_name $plugin_args
# .py = python $plugin_name $plugin_args
# .ps1 = powershell -ExecutionPolicy Bypass -File $plugin_name $plugin_args
# .vbs = cscript $plugin_name $plugin_args //NoLogo
# .bat = cmd /c $plugin_name $plugin_args
#
# Since windows NCPA is 32-bit, if you need to use 64-bit powershell, try the following for
# the powershell plugin definition:
# .ps1 = c:\windows\sysnative\windowspowershell\v1.0\powershell.exe -ExecutionPolicy Unrestricted -File $plugin_name $plugin_args
#
# Linux / Mac OS X
.sh = /bin/sh $plugin_name $plugin_args
.py = python $plugin_name $plugin_args
# Windows
.ps1 = powershell -ExecutionPolicy Bypass -File $plugin_name $plugin_args
.vbs = cscript $plugin_name $plugin_args //NoLogo
.wsf = cscript $plugin_name $plugin_args //NoLogo
.bat = cmd /c $plugin_name $plugin_args
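Since I changed allowed_hosts from the default, here is a minimal sketch for confirming that a given address falls inside those ranges. The function name is my own, hostname entries are simply skipped rather than resolved, and the addresses tested are only examples (172.16.1.22 is the listener ip from the config above, 10.1.2.3 is made up):

```python
import ipaddress


def ip_in_allowed_hosts(ip, allowed_hosts):
    """Return True if `ip` matches any IP or subnet entry in a comma separated list."""
    addr = ipaddress.ip_address(ip)
    for entry in allowed_hosts.split(","):
        entry = entry.strip()
        if not entry:
            continue
        try:
            network = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            # Not an IP or subnet (probably a hostname); not resolved in this sketch.
            continue
        if addr.version == network.version and addr in network:
            return True
    return False


allowed = "172.16.0.0/16,172.20.0.0/16"
print(ip_in_allowed_hosts("172.16.1.22", allowed))
print(ip_in_allowed_hosts("10.1.2.3", allowed))
```

The real test would be to pass in the address the Nagios XI server connects from.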
If you could share a screenshot of the error displayed on the Nagios XI services page, that would help.
- I will attach it below.
What version of Python is used on both your Nagios XI server and the NCPA remote agent?
Nagios XI Server: Python 2.7.5
Server with NCPA agent: Python 3.6.13