Hello,
Since upgrading from 5.6.6 to 5.7.2 we've been experiencing issues with the monitoring engine on several of our Nagios xi instances.
The monitoring engine has crashed a few times. At other times, applying configurations causes problems to be detected with the monitoring engine, which then repairs itself after a while and reports as OK.
Is there any way to troubleshoot this?
Thanks!
Monitoring Engine crashing (Nagios xi 5.7.2)
-
- Posts: 101
- Joined: Tue Aug 06, 2019 7:49 am
-
- Posts: 235
- Joined: Wed Feb 05, 2020 2:50 pm
Re: Monitoring Engine crashing (Nagios xi 5.7.2)
You can look at the following logs, which may provide clues:
Code: Select all
/var/log/messages
/var/log/mariadb/mariadb.log
/usr/local/nagios/var/nagios.log (and other logs in that same directory)
Also, the ipcs command will show whether you have hundreds of messages queued up, which is a sign of a problem:
Code: Select all
------ Message Queues --------
key msqid owner perms used-bytes messages
0xef000040 34439168 nagios 600 705536 689
--Jeffrey
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
-
- Posts: 101
- Joined: Tue Aug 06, 2019 7:49 am
Re: Monitoring Engine crashing (Nagios xi 5.7.2)
Nothing is sticking out in the logs, but it does look like there are thousands of messages queued up. What problem could that be a sign of, and do you know how we can prevent the messages from queuing up?
Code: Select all
------ Message Queues --------
key msqid owner perms used-bytes messages
0xffffffff 0 root 600 0 0
0x000004d2 65537 root 666 0 0
0xdf000200 9371650 nagios 600 0 0
0x02000200 10256387 nagios 600 3590144 3506
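A rough way to flag a backed-up queue without reading the table by eye is sketched below. It assumes the Linux `ipcs -q` column layout shown above, with the message count in the sixth column. The function name `flag_queues` and the threshold of 100 are illustrative only and are not part of Nagios xi.

```shell
# flag_queues THRESHOLD
# Reads `ipcs -q` style output on stdin and prints "msqid owner messages"
# for every queue holding more than THRESHOLD messages.
flag_queues() {
    awk -v limit="$1" '
        # Data rows start with a hex key such as 0x02000200;
        # header and banner lines do not match and are skipped.
        $1 ~ /^0x[0-9a-fA-F]+$/ && ($6 + 0) > (limit + 0) { print $2, $3, $6 }
    '
}

# Typical use on a live system (the threshold is an arbitrary example):
#   ipcs -q | flag_queues 100
```

Run against the output above, this would print only the queue with msqid 10256387, since the other three queues are empty.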
-
- Posts: 101
- Joined: Tue Aug 06, 2019 7:49 am
Re: Monitoring Engine crashing (Nagios xi 5.7.2)
Also, I wanted to mention that the message queue results I posted above are from only one of our Nagios xi instances. The others that are also having monitoring engine issues do not have any queued-up messages.
-
- Dreams In Code
- Posts: 7682
- Joined: Wed Feb 11, 2015 12:54 pm
Re: Monitoring Engine crashing (Nagios xi 5.7.2)
Please PM me a copy of your profile from each xi server. You can download it from Admin > System Profile > Download Profile button.
xi 5.7+ should not use the kernel message queue unless you downgraded NDO3 back to NDO2DB to resolve an issue.
What is the output of this command?
- NOTE: You may need to adjust the -h 127.0.0.1, the -uroot, and the -pnagiosxi in the command if your DB is offloaded to another server and/or you've changed the root mysql password
Code: Select all
mysql -uroot -pnagiosxi -h 127.0.0.1 -P 3306 nagios -e "desc nagios_hoststatus;desc nagios_servicestatus;"
If you run this tail command for a few minutes, do you see any errors pop up? (PM me the output)
Code: Select all
tail -Fn0 /usr/local/nagios/var/nagios.log /usr/local/nagiosxi/var/cmdsubsys.log /usr/local/nagiosxi/var/eventman.log
Please PM your /usr/local/nagios/var/nagios.log as well.
-
- Posts: 101
- Joined: Tue Aug 06, 2019 7:49 am
-
- Posts: 101
- Joined: Tue Aug 06, 2019 7:49 am
Re: Monitoring Engine crashing (Nagios xi 5.7.2)
This morning I was able to watch the monitoring engine crash and it was at the exact time our backups are scheduled for (0400 PT).
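To see what the engine logged around the crash, one option is to filter nagios.log by timestamp. The sketch below assumes each log entry starts with a Unix epoch in square brackets (`[epoch] message`). The helper name `log_window` is hypothetical, and the usage line assumes GNU `date` is available.

```shell
# log_window START END
# Reads nagios.log style lines ("[epoch] text") on stdin and prints only
# those whose epoch falls within START..END inclusive.
log_window() {
    awk -v lo="$1" -v hi="$2" '
        # Only consider lines that begin with a bracketed run of digits.
        match($0, /^\[[0-9]+\]/) {
            ts = substr($0, 2, RLENGTH - 2) + 0   # digits between the brackets
            if (ts >= lo + 0 && ts <= hi + 0) print
        }
    '
}

# Example: entries from 03:55 to 04:10 today, in the server's local time zone:
#   log_window "$(date -d 'today 03:55' +%s)" "$(date -d 'today 04:10' +%s)" \
#       < /usr/local/nagios/var/nagios.log
```

Lines without a leading timestamp are dropped, so multi-line entries may appear truncated.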
-
- Dreams In Code
- Posts: 7682
- Joined: Wed Feb 11, 2015 12:54 pm
Re: Monitoring Engine crashing (Nagios xi 5.7.2)
I was literally going to ask you that. I didn't see anything in your profiles.
Are they xi scheduled backups or 3rd party backups?
If it's an xi backup, please send these files:
Code: Select all
/usr/local/nagiosxi/var/components/scheduledbackups.log
/etc/php.ini
Additionally, please send the output of this command so we can check your DB tables:
- NOTE: You may need to adjust the -h 127.0.0.1, the -uroot, and the -pnagiosxi in the command if your DB is offloaded to another server and/or you've changed the root mysql password
Code: Select all
echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
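If the table list is long, a small filter can pull out just the large tables. This is only a sketch: it assumes the query is changed to return schema, table name, and size in MB as tab-separated rows (via the mysql client's `--batch` and `--skip-column-names` options), and `big_tables` and the 100 MB cutoff are made-up illustrative choices.

```shell
# big_tables MIN_MB
# Reads "schema<TAB>table<TAB>size_mb" rows on stdin and prints the rows
# whose size is at least MIN_MB, largest first.
big_tables() {
    awk -F '\t' -v min="$1" '($3 + 0) >= (min + 0)' |
        sort -t "$(printf '\t')" -k3,3nr
}

# Example (same connection caveats as the NOTE above apply):
#   echo "SELECT table_schema, table_name, round((data_length + index_length) / 1024 / 1024, 2) FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" \
#       | mysql -h 127.0.0.1 -uroot -pnagiosxi --batch --skip-column-names \
#       | big_tables 100
```

Rows with a NULL size are treated as 0 MB and so fall below any positive cutoff.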
-
- Posts: 101
- Joined: Tue Aug 06, 2019 7:49 am
-
- Dreams In Code
- Posts: 7682
- Joined: Wed Feb 11, 2015 12:54 pm
Re: Monitoring Engine crashing (Nagios xi 5.7.2)
Reply sent.