Nagios xi Crash multiple time per week

bennyboy
Posts: 157
Joined: Thu Oct 29, 2015 9:42 am

Nagios xi Crash multiple time per week

Post by bennyboy »

Hi, we have to run /usr/local/nagiosxi/scripts/repair_databases.sh multiple times per week. Can you help us find out why and fix it, please?

Where can I send the profile information privately?

Thx!
bennyboy
Posts: 157
Joined: Thu Oct 29, 2015 9:42 am

Re: Nagios xi Crash multiple time per week

Post by bennyboy »

I will update our instance to the latest version and see.
bennyboy
Posts: 157
Joined: Thu Oct 29, 2015 9:42 am

Re: Nagios xi Crash multiple time per week

Post by bennyboy »

vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Nagios xi Crash multiple time per week

Post by vtrac »

Hi bennyboy,
Yes, you can follow the steps in the KB article to convert all of your database tables to the InnoDB storage engine:
https://support.nagios.com/kb/article/d ... i-896.html

There are benefits to using InnoDB:
https://dev.mysql.com/doc/refman/5.7/en ... efits.html
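
The KB article's exact steps are not reproduced in this thread. As a rough sketch only, one way to prepare such a conversion is to list the tables that still use MyISAM and generate one ALTER TABLE statement per table for review. The helper name gen_innodb_alters is hypothetical, and the mysql invocation in the comment assumes the same local root access used elsewhere in this thread.

```shell
# Sketch only: turn "schema<TAB>table" lines into ALTER TABLE statements.
# gen_innodb_alters is a hypothetical helper, not part of Nagios XI.
gen_innodb_alters() {
    awk -F '\t' 'NF == 2 {
        printf "ALTER TABLE `%s`.`%s` ENGINE=InnoDB;\n", $1, $2
    }'
}

# Example (assumed invocation; review the generated SQL and take a backup
# before running any of it):
#   mysql -h 127.0.0.1 -uroot -p -N -B -e "SELECT table_schema, table_name
#     FROM information_schema.TABLES
#     WHERE engine = 'MyISAM'
#       AND table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" \
#     | gen_innodb_alters > convert_to_innodb.sql
```

Changing the engine rebuilds each table, so very large tables can take a long time and need free disk space while the conversion runs.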


You are free to upgrade Nagios XI to the latest released version; instructions are below:
https://assets.nagios.com/downloads/nag ... ctions.pdf


Please run the command below and post its output in this thread:

Code: Select all

echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
Please upload or PM me the "profile.zip".

I'm not going to ask you to run the "/usr/local/nagiosxi/scripts/repair_databases.sh" script, since you said you have been running it a couple of times a week. However, please upload the output of that command if you happen to have it.


Best Regards,
Vinh
bennyboy
Posts: 157
Joined: Thu Oct 29, 2015 9:42 am

Re: Nagios xi Crash multiple time per week

Post by bennyboy »

This is the output of the SELECT you asked for:

Code: Select all

+--------------------------------------------+------------+
| Table                                      | Size in MB |
+--------------------------------------------+------------+
| nagios_acknowledgements                    |       2.91 |
| nagios_commands                            |       0.06 |
| nagios_commenthistory                      |    4692.00 |
| nagios_comments                            |       1.61 |
| nagios_configfiles                         |       0.03 |
| nagios_configfilevariables                 |       0.02 |
| nagios_conninfo                            |       1.52 |
| nagios_contact_addresses                   |       0.03 |
| nagios_contact_notificationcommands        |       0.11 |
| nagios_contactgroup_members                |       0.03 |
| nagios_contactgroups                       |       0.03 |
| nagios_contactnotificationmethods          |       6.55 |
| nagios_contactnotifications                |       9.06 |
| nagios_contacts                            |       0.03 |
| nagios_contactstatus                       |       0.03 |
| nagios_customvariables                     |       4.55 |
| nagios_customvariablestatus                |       5.55 |
| nagios_dbversion                           |       0.02 |
| nagios_downtimehistory                     |     225.31 |
| nagios_eventhandlers                       |       0.06 |
| nagios_externalcommands                    |       3.52 |
| nagios_flappinghistory                     |      13.52 |
| nagios_host_contactgroups                  |       0.64 |
| nagios_host_contacts                       |       0.30 |
| nagios_host_parenthosts                    |       0.19 |
| nagios_hostchecks                          |       0.03 |
| nagios_hostdependencies                    |       0.03 |
| nagios_hostescalation_contactgroups        |       0.03 |
| nagios_hostescalation_contacts             |       0.03 |
| nagios_hostescalations                     |       0.03 |
| nagios_hostgroup_members                   |       0.50 |
| nagios_hostgroups                          |       0.09 |
| nagios_hosts                               |       1.77 |
| nagios_hoststatus                          |       4.64 |
| nagios_instances                           |       0.02 |
| nagios_logentries                          |    1768.48 |
| nagios_notifications                       |       9.02 |
| nagios_objects                             |      11.58 |
| nagios_processevents                       |       1.52 |
| nagios_programstatus                       |       0.03 |
| nagios_runtimevariables                    |       0.03 |
| nagios_scheduleddowntime                   |       0.56 |
| nagios_service_contactgroups               |       3.03 |
| nagios_service_contacts                    |       1.86 |
| nagios_service_parentservices              |       0.03 |
| nagios_servicechecks                       |       0.06 |
| nagios_servicedependencies                 |       0.03 |
| nagios_serviceescalation_contactgroups     |       0.03 |
| nagios_serviceescalation_contacts          |       0.03 |
| nagios_serviceescalations                  |       0.03 |
| nagios_servicegroup_members                |       0.25 |
| nagios_servicegroups                       |       0.03 |
| nagios_services                            |       6.41 |
| nagios_servicestatus                       |      17.03 |
| nagios_statehistory                        |    1687.42 |
| nagios_systemcommands                      |       0.16 |
| nagios_timedeventqueue                     |       0.09 |
| nagios_timedevents                         |       0.09 |
| nagios_timeperiod_timeranges               |       0.03 |
| nagios_timeperiods                         |       0.03 |
| tbl_command                                |       0.08 |
| tbl_contact                                |       0.03 |
| tbl_contactgroup                           |       0.03 |
| tbl_contacttemplate                        |       0.03 |
| tbl_domain                                 |       0.03 |
| tbl_host                                   |       1.73 |
| tbl_hostdependency                         |       0.03 |
| tbl_hostescalation                         |       0.03 |
| tbl_hostextinfo                            |       0.03 |
| tbl_hostgroup                              |       0.11 |
| tbl_hosttemplate                           |       0.03 |
| tbl_info                                   |       0.17 |
| tbl_lnkContactToCommandHost                |       0.02 |
| tbl_lnkContactToCommandService             |       0.02 |
| tbl_lnkContactToContactgroup               |       0.02 |
| tbl_lnkContactToContacttemplate            |       0.02 |
| tbl_lnkContactToVariabledefinition         |       0.02 |
| tbl_lnkContactgroupToContact               |       0.02 |
| tbl_lnkContactgroupToContactgroup          |       0.02 |
| tbl_lnkContacttemplateToCommandHost        |       0.02 |
| tbl_lnkContacttemplateToCommandService     |       0.02 |
| tbl_lnkContacttemplateToContactgroup       |       0.02 |
| tbl_lnkContacttemplateToContacttemplate    |       0.02 |
| tbl_lnkContacttemplateToVariabledefinition |       0.02 |
| tbl_lnkHostToContact                       |       0.02 |
| tbl_lnkHostToContactgroup                  |       0.02 |
| tbl_lnkHostToHost                          |       0.14 |
| tbl_lnkHostToHostgroup                     |       0.19 |
| tbl_lnkHostToHosttemplate                  |       0.44 |
| tbl_lnkHostToVariabledefinition            |       0.02 |
| tbl_lnkHostdependencyToHost_DH             |       0.02 |
| tbl_lnkHostdependencyToHost_H              |       0.02 |
| tbl_lnkHostdependencyToHostgroup_DH        |       0.02 |
| tbl_lnkHostdependencyToHostgroup_H         |       0.02 |
| tbl_lnkHostescalationToContact             |       0.02 |
| tbl_lnkHostescalationToContactgroup        |       0.02 |
| tbl_lnkHostescalationToHost                |       0.02 |
| tbl_lnkHostescalationToHostgroup           |       0.02 |
| tbl_lnkHostgroupToHost                     |       0.17 |
| tbl_lnkHostgroupToHostgroup                |       0.02 |
| tbl_lnkHosttemplateToContact               |       0.02 |
| tbl_lnkHosttemplateToContactgroup          |       0.02 |
| tbl_lnkHosttemplateToHost                  |       0.02 |
| tbl_lnkHosttemplateToHostgroup             |       0.02 |
| tbl_lnkHosttemplateToHosttemplate          |       0.02 |
| tbl_lnkHosttemplateToVariabledefinition    |       0.02 |
| tbl_lnkServiceToContact                    |       0.02 |
| tbl_lnkServiceToContactgroup               |       0.05 |
| tbl_lnkServiceToHost                       |       0.52 |
| tbl_lnkServiceToHostgroup                  |       0.02 |
| tbl_lnkServiceToServicegroup               |       0.02 |
| tbl_lnkServiceToServicetemplate            |       1.52 |
| tbl_lnkServiceToVariabledefinition         |       0.02 |
| tbl_lnkServicedependencyToHost_DH          |       0.02 |
| tbl_lnkServicedependencyToHost_H           |       0.02 |
| tbl_lnkServicedependencyToHostgroup_DH     |       0.02 |
| tbl_lnkServicedependencyToHostgroup_H      |       0.02 |
| tbl_lnkServicedependencyToService_DS       |       0.02 |
| tbl_lnkServicedependencyToService_S        |       0.02 |
| tbl_lnkServicedependencyToServicegroup_DS  |       0.02 |
| tbl_lnkServicedependencyToServicegroup_S   |       0.02 |
| tbl_lnkServiceescalationToContact          |       0.02 |
| tbl_lnkServiceescalationToContactgroup     |       0.02 |
| tbl_lnkServiceescalationToHost             |       0.02 |
| tbl_lnkServiceescalationToHostgroup        |       0.02 |
| tbl_lnkServiceescalationToService          |       0.02 |
| tbl_lnkServiceescalationToServicegroup     |       0.02 |
| tbl_lnkServicegroupToService               |       0.02 |
| tbl_lnkServicegroupToServicegroup          |       0.02 |
| tbl_lnkServicetemplateToContact            |       0.02 |
| tbl_lnkServicetemplateToContactgroup       |       0.02 |
| tbl_lnkServicetemplateToHost               |       0.02 |
| tbl_lnkServicetemplateToHostgroup          |       0.02 |
| tbl_lnkServicetemplateToServicegroup       |       0.02 |
| tbl_lnkServicetemplateToServicetemplate    |       0.02 |
| tbl_lnkServicetemplateToVariabledefinition |       0.02 |
| tbl_lnkTimeperiodToTimeperiod              |       0.02 |
| tbl_logbook                                |       0.02 |
| tbl_mainmenu                               |       0.02 |
| tbl_permission                             |       0.02 |
| tbl_permission_inactive                    |       0.02 |
| tbl_service                                |       3.52 |
| tbl_servicedependency                      |       0.03 |
| tbl_serviceescalation                      |       0.03 |
| tbl_serviceextinfo                         |       0.03 |
| tbl_servicegroup                           |       0.03 |
| tbl_servicetemplate                        |       0.13 |
| tbl_session                                |       0.02 |
| tbl_session_locks                          |       0.02 |
| tbl_settings                               |       0.03 |
| tbl_submenu                                |       0.02 |
| tbl_timedefinition                         |       0.02 |
| tbl_timeperiod                             |       0.03 |
| tbl_user                                   |       0.03 |
| tbl_variabledefinition                     |       0.08 |
| xi_auditlog                                |       2.50 |
| xi_auth_tokens                             |       0.11 |
| xi_cmp_trapdata                            |       0.03 |
| xi_cmp_trapdata_log                        |       0.03 |
| xi_commands                                |       0.27 |
| xi_eventqueue                              |       0.03 |
| xi_events                                  |    1188.53 |
| xi_incidents                               |       0.02 |
| xi_meta                                    |   22259.98 |
| xi_mibs                                    |       0.05 |
| xi_options                                 |       0.08 |
| xi_sessions                                |       0.22 |
| xi_sysstat                                 |       0.03 |
| xi_usermeta                                |       4.92 |
| xi_users                                   |       0.06 |
+--------------------------------------------+------------+

vtrac wrote: Please upload or PM me the "profile.zip".
I also sent the profile file via PM.
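
In the size listing above, five tables exceed 1 GB: xi_meta (22259.98 MB), nagios_commenthistory (4692.00 MB), nagios_logentries (1768.48 MB), nagios_statehistory (1687.42 MB) and xi_events (1188.53 MB). A minimal sketch for pulling the largest tables out of that kind of output is shown here; top_tables is a hypothetical helper, and it assumes the two-column layout produced by mysql --table.

```shell
# top_tables is a hypothetical helper: it reads `mysql --table` output on
# stdin and prints the N largest tables as "size_mb table_name".
top_tables() {
    awk -F '|' 'NF >= 4 && $3 + 0 > 0 {
        gsub(/ /, "", $2)   # strip padding around the table name
        gsub(/ /, "", $3)   # strip padding around the size
        print $3, $2
    }' | LC_ALL=C sort -rn | head -n "${1:-5}"
}
```

The border and header rows are skipped because they do not carry a positive number in the size column. Piping the output of the SELECT command through top_tables 5 would list the five tables named above.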
bennyboy
Posts: 157
Joined: Thu Oct 29, 2015 9:42 am

Re: Nagios xi Crash multiple time per week

Post by bennyboy »

I converted all Nagios tables to InnoDB, and this morning the DB was not corrupted. However, Nagios XI got stuck, and I found that the script that backs up the DB

Code: Select all

# Backup MySQL & PostgreSQL Databases
0   7 * * * root   /root/scripts/automysqlbackup
has these options:

Code: Select all

OPT="--quote-names --opt"                       # OPT string for use with mysqldump ( see man mysqldump )
I found in the mysqldump manual that the option

Code: Select all

--opt
will lock the tables.

So I read the man page and found that I can use

Code: Select all

--single-transaction
I decided to run a test with that option instead of

Code: Select all

--opt
These are the options I added to the script:

Code: Select all

--single-transaction --skip-lock-tables --add-drop-table --add-locks --create-options --disable-keys --extended-insert --quick --set-charset --quote-names
I will update after the test...
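
According to the mysqldump manual, --opt is shorthand for --add-drop-table --add-locks --create-options --disable-keys --extended-insert --lock-tables --quick --set-charset, so the option list above is essentially --opt spelled out without --lock-tables, plus --single-transaction. The sketch below shows how the OPT line might look; it assumes the backup script passes $OPT to mysqldump unchanged, which this thread does not confirm.

```shell
# Sketch, assuming the backup script passes $OPT to mysqldump unchanged.
# --opt would pull in --lock-tables; the list below spells out the other
# members of --opt and uses --single-transaction instead.
OPT="--single-transaction --skip-lock-tables"
OPT="$OPT --add-drop-table --add-locks --create-options --disable-keys"
OPT="$OPT --extended-insert --quick --set-charset --quote-names"
```

Note that --single-transaction only gives a consistent dump for transactional tables such as InnoDB, which is why it fits together with the InnoDB conversion done earlier.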
bennyboy
Posts: 157
Joined: Thu Oct 29, 2015 9:42 am

Re: Nagios xi Crash multiple time per week

Post by bennyboy »

So we are now able to access Nagios XI during the backup.

Code: Select all

root     28891 16.5  0.0 126916  3476 pts/0    D+   10:39   1:55 mysqldump --user=root --password=x xxxxxxxxxxxxxxxxxxxxxxx --host=localhost --single-transaction --skip-lock-tables --add-drop-table --add-locks --create-options --disable-keys --extended-insert --quick --set-charset --quote-names --databases nagiosxi
The last time I ran the script with

Code: Select all

--quote-names --opt


Nagios XI was stuck.

I also changed the backup time from 7 AM to 2 AM, when there is less traffic.

Do you have any other advice before we do the upgrade next week?

Thx!
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Nagios xi Crash multiple time per week

Post by vtrac »

Hi,
I did not see the "profile.zip" in my inbox, but it looks like I don't need that file now, since you have resolved the DB backup issue. I will take note of the OPT (options) you used for future support cases.

Wonderful job!! ... :-)

If this is a VM, I would recommend that you shut down the instance and take a GOOD snapshot of the VM before doing the upgrade.

If taking a snapshot is not an option, then I would recommend that you do a full backup first:
https://assets.nagios.com/downloads/nag ... ios-xi.pdf

Good luck with the upgrade!!
https://assets.nagios.com/downloads/nag ... ctions.pdf


Best Regards,
Vinh
bennyboy
Posts: 157
Joined: Thu Oct 29, 2015 9:42 am

Re: Nagios xi Crash multiple time per week

Post by bennyboy »

I resent it. Can you please confirm that you received it?
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Nagios xi Crash multiple time per week

Post by vtrac »

Hi,
Yes, I did receive it.

There is a "warning" in your database's log:
This is related to your changes.

Code: Select all

210409 11:57:20 [Warning] options --log-slow-admin-statements, --log-queries-not-using-indexes and --log-slow-slave-statements have no effect if --log_slow_queries is not set
I noticed you have some "passive" checks that have no service defined:
You can go to "Admin > Monitoring Config > Unconfigured Objects" and configure them.

Code: Select all

Apr 10 09:49:12 slpmon0034 nagios: Error: Got check result for service 'DBA_Alerte_passive_oracle' on host 'sxqgbd0798'. Unable to find service

Apr 10 09:49:48 slpmon0034 nagios: Error: Got check result for service 'passive-check-script' on host 'ctelpsa246'. Unable to find service
The "nagios.txt" log file is empty.
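
To see every host and service affected by the "Unable to find service" errors quoted above, the log lines can be reduced to a de-duplicated list. This is a minimal sketch that assumes the message format shown in the two sample lines; missing_services is a hypothetical helper, and the log file to feed it is left to the reader, since its location is not stated in this thread.

```shell
# missing_services is a hypothetical helper: it reads Nagios log lines on
# stdin and prints one "host: service" line per unconfigured passive check.
missing_services() {
    sed -n "s/.*Got check result for service '\([^']*\)' on host '\([^']*\)'\. *Unable to find service.*/\2: \1/p" \
        | LC_ALL=C sort -u
}
```

Lines that do not match the error format are ignored, and repeated errors for the same host and service collapse into a single entry.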


Best Regards,
Vinh