Nagios xi problems

mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Nagios xi problems

Post by mguthrie »

"Only afterwards my co-worker changed some settings. I asked him about the dns settings and from what he said they are fine."
Ok, well, without knowing everything that was changed, we're not really able to support your situation. The more detail you can give us about what was modified after installation, the better we can help you. At this point, without knowing more, my suggestion would be to reinstall.
SDohmen
Posts: 240
Joined: Thu Jun 30, 2011 4:14 am

Re: Nagios xi problems

Post by SDohmen »

The only things he changed were settings related to the editor (VIM) and some (commented-out) changes to Apache so it's SSL-ready as soon as we get the certificate.

He didn't change a single thing related to Nagios.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Nagios xi problems

Post by mguthrie »

Ok, well, let's see if we can narrow this down...

Can you run the following and show me the output:

Code: Select all

/usr/local/nagiosxi/scripts/reset_config_perms
killall -9 nagios
service nagios start
and also show the output from the following:

Code: Select all

ll /usr/local/nagios/var
and

Code: Select all

ll /usr/local/nagios/etc
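
A minimal sketch for capturing this kind of output together with each command's exit status, assuming a POSIX shell. `run_logged` is a made-up helper name for illustration and is not part of Nagios XI:

```shell
#!/bin/sh
# run_logged CMD [ARGS...]: print the command, its combined
# stdout/stderr, and its exit status, so nothing is lost when the
# output is pasted into a forum post.
run_logged() {
    printf '$ %s\n' "$*"
    "$@" 2>&1
    status=$?
    printf '[exit status: %s]\n' "$status"
    return "$status"
}

# Intended use, as root (commands taken from the post above):
#   run_logged /usr/local/nagiosxi/scripts/reset_config_perms
#   run_logged service nagios start
#   run_logged ls -l /usr/local/nagios/var
```

The exit status is worth including because an init script can print "done" even when the daemon exits again shortly afterwards.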
SDohmen
Posts: 240
Joined: Thu Jun 30, 2011 4:14 am

Re: Nagios xi problems

Post by SDohmen »

Thank you for not giving up yet :)

Ok, here are the outputs:

Code: Select all

root@central.*[pts/0]:~ # /usr/local/nagiosxi/scripts/reset_config_perms                                                                                     15:17
SETUID ROOT OK
RESETTING PERMS
root@central.*[pts/0]:~ # killall -9 nagios                                                                                                                  15:17
root@central.*[pts/0]:~ # service nagios start                                                                                                               15:17
Starting nagios: done.
root@central.*[pts/0]:~ #                                                                                                                                    15:17

Code: Select all

root@central.*[pts/0]:~ # ll /usr/local/nagios/var                                                                                                           15:17
total 18M
drwxrwxr-x. 2 nagios nagios 4.0K Sep  2 00:00 archives/
-rw-rw-r--. 1 nagios nagios    0 Sep  2 09:51 host-perfdata
-rw-r--r--. 1 nagios nagios 2.1K Sep  2 09:51 nagios.log
-rw-r--r--  1 nagios nagios    5 Sep 14 08:48 ndo2db.lock
-rw-r--r--. 1 nagios nagios 7.4K Sep  2 09:51 ndomod.tmp
srwxr-xr-x  1 nagios nagios    0 Sep 14 08:48 ndo.sock=
-rw-r--r--  1 nagios nagios 7.9M Sep 22 15:19 npcd.log
-rw-r--r--  1 nagios nagios  11M Sep 19 00:00 npcd.log.old
-rw-r--r--. 1 nagios nagios  22K Aug 30 19:38 objects.cache
-rw-------. 1 nagios nagios  14K Sep  2 09:51 retention.dat
drwxrwsr-x. 2 nagios nagcmd 4.0K Aug 30 19:38 rw/
-rw-rw-r--. 1 nagios nagios    0 Sep  2 09:51 service-perfdata
drwxr-xr-x. 5 nagios nagios 4.0K Aug 30 19:34 spool/
drwxr-xr-x. 2 nagios nagios 4.0K Sep  2 09:52 stats/
root@central.*[pts/0]:~ #                                                                                                                                    15:19

Code: Select all

root@central.*[pts/0]:~ # ll /usr/local/nagios/etc                                                                                                           15:19
total 156K
-rwxrwxr-x  1 apache nagios  744 Aug 30 19:34 cgi.cfg*
-rw-rw-r--  1 apache nagios  18K Sep 16 09:23 commands.cfg
-rw-rw-r--  1 apache nagios  931 Sep 16 09:23 contactgroups.cfg
-rw-rw-r--  1 apache nagios 1.2K Sep 16 09:23 contacts.cfg
-rw-rw-r--  1 apache nagios 1.4K Sep 16 09:23 contacttemplates.cfg
-rw-rw-r--  1 apache nagios  642 Sep 16 09:23 hostdependencies.cfg
-rw-rw-r--  1 apache nagios  644 Sep 16 09:23 hostescalations.cfg
-rw-rw-r--  1 apache nagios  662 Sep 16 09:23 hostextinfo.cfg
-rw-rw-r--  1 apache nagios 1.7K Sep 16 09:23 hostgroups.cfg
drwsrwsr-x. 2 apache nagios 4.0K Sep 16 09:19 hosts/
-rw-rw-r--  1 apache nagios 5.3K Sep 16 09:23 hosttemplates.cfg
drwsrwsr-x. 2 apache nagios 4.0K Aug 30 19:38 import/
-rw-rw-r--  1 apache nagios 5.8K Sep 16 09:14 nagios.cfg
-rwxrwxr-x  1 apache nagios 2.2K Aug 30 19:36 ndo2db.cfg*
-rwxrwxr-x  1 apache nagios 4.7K Aug 30 19:36 ndomod.cfg*
-rw-rw-r--  1 apache nagios 7.1K Aug 30 19:37 nrpe.cfg
-rwxrwxr-x  1 apache nagios 5.3K Sep 13 11:52 nsca.cfg*
drwxrwsr-x. 4 apache nagios 4.0K Sep 13 09:55 pnp/
-rwxrwxr-x  1 apache nagios  210 Aug 30 19:34 resource.cfg*
-rwxrwxr-x  1 apache nagios 1.6K Aug 30 19:37 send_nsca.cfg*
-rw-rw-r--  1 apache nagios  648 Sep 16 09:23 servicedependencies.cfg
-rw-rw-r--  1 apache nagios  650 Sep 16 09:23 serviceescalations.cfg
-rw-rw-r--  1 apache nagios  668 Sep 16 09:23 serviceextinfo.cfg
-rw-rw-r--  1 apache nagios  638 Sep 16 09:23 servicegroups.cfg
drwsrwsr-x. 2 apache nagios 4.0K Sep 13 10:18 services/
-rw-rw-r--  1 apache nagios  11K Sep 16 09:23 servicetemplates.cfg
drwsrwsr-x. 2 apache nagios 4.0K Sep 13 09:55 static/
-rw-rw-r--  1 apache nagios 1.6K Sep 16 09:23 timeperiods.cfg
root@central.*[pts/0]:~ #                                                                                                                                    15:20
There we go. You can ignore the central.*, since I changed that part to hide the complete hostname. If you are wondering why the times are aligned on the right side, that's what my co-worker changed about the look of the machine. However, he made that change after the problems had already started.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Nagios xi problems

Post by mguthrie »

"If you are wondering why the times are aligned on the right side, thats what my co-worker changed on the looks of the machine. This was however later then the problems which are present."
I might drive your co-worker crazy with this, but at this point we need to know exactly what was changed. Installing vim is not a problem, but we're definitely seeing some oddities in the shell output that we've never come across before, and the errors don't seem to be consistent, so we need to be able to rule out anything that is outside of the fresh/default XI install.

Examples of odd output:

Code: Select all

srwxr-xr-x  1 nagios nagios    0 Sep 14 08:48 ndo.sock=
and

Code: Select all

-rwxrwxr-x  1 apache nagios 5.3K Sep 13 11:52 nsca.cfg*
and we saw this earlier:

Code: Select all

2011-09-11 10:03:14 (188 MB/s) - ânagiosql.loginâ

Should look like:

Code: Select all

-rwxrwxr-x 1 apache nagios  5352 Jun 23 10:55 nsca.cfg

Code: Select all

srwxr-xr-x 1 nagios nagios        0 Sep 20 11:59 ndo.sock

Code: Select all

2011-07-26 16:02:22 (332 MB/s) - `nagiosql.login' saved [5286/5286]
SDohmen
Posts: 240
Joined: Thu Jun 30, 2011 4:14 am

Re: Nagios xi problems

Post by SDohmen »

I will see tomorrow if he is registered here so he can follow and respond to this. He can certainly name the changes he made better than I can.

[edit]
In the meantime I will take a screenshot of the output, since that will explain more.
SDohmen
Posts: 240
Joined: Thu Jun 30, 2011 4:14 am

Re: Nagios xi problems

Post by SDohmen »

There we go. I think if you check out the screenshot with the two outputs from earlier, you will see why there is a * (and similar characters) after the names. If I remember correctly, the * means the file is executable.
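
For reference, `ls` appends these markers when it is run with `-F`: `*` for executables, `/` for directories, `=` for sockets and `@` for symlinks. An `ll` alias that includes `-F` would explain the trailing `*`, `/` and `=` in the listings above; that the alias on this machine is set up that way is an assumption. A small sketch on a scratch directory, with file names borrowed from the listing purely for illustration:

```shell
#!/bin/sh
# Build a scratch directory with one executable and one plain file.
demo=$(mktemp -d)
touch "$demo/nsca.cfg" "$demo/nrpe.cfg"
chmod +x "$demo/nsca.cfg"

# With -F, ls marks the executable file with a trailing "*".
# "command" bypasses any ls alias or shell function.
classified=$(command ls -F "$demo")

# Without -F the names come back unchanged.
plain=$(command ls "$demo")

printf 'with -F:\n%s\n\nwithout -F:\n%s\n' "$classified" "$plain"
rm -rf "$demo"
```

Running `command ls -l` (or `\ls -l`) instead of `ll` on the real directories should give a listing without the markers.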
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Nagios xi problems

Post by mguthrie »

Just to cover our bases some more, can you also run through the database repair procedure, to make sure there isn't any data corruption anywhere?

http://library.nagios.com/library/produ ... i-database
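
The linked document has the actual procedure. As a rough sketch only, a MyISAM repair pass generally amounts to stopping MySQL and running `myisamchk` over the table index files. The data directory in the usage comment and the helper name `repair_myisam` are assumptions for illustration, not taken from that document:

```shell
#!/bin/sh
# repair_myisam DIR [run]: print (or, with "run", execute) a myisamchk
# recovery command for every MyISAM index file in DIR.
# mysqld should be stopped first, because myisamchk works directly on
# the table files.
repair_myisam() {
    dir=$1
    mode=${2:-dry-run}
    for index in "$dir"/*.MYI; do
        [ -e "$index" ] || continue   # glob matched nothing
        if [ "$mode" = run ]; then
            myisamchk --recover --force "$index"
        else
            echo "myisamchk --recover --force $index"
        fi
    done
}

# Assumed usage; /var/lib/mysql/nagios is a guess at the data directory:
#   service mysqld stop
#   repair_myisam /var/lib/mysql/nagios run
#   service mysqld start
```

Without the `run` argument the function only prints what it would do, which makes it safe to check the file list first.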
SDohmen
Posts: 240
Joined: Thu Jun 30, 2011 4:14 am

Re: Nagios xi problems

Post by SDohmen »

I think we hit the sweet spot, if I am reading the output correctly. Part of the output is posted below.

Code: Select all

- Fixing index 5
- Fixing index 6
- Fixing index 7
- Fixing index 8
- Fixing index 9
- Fixing index 10
- Fixing index 11
- Fixing index 12
- Fixing index 13
- Fixing index 14
- Fixing index 15
- Fixing index 16
- Fixing index 17
- Fixing index 18
- Fixing index 19

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_instances.MYI'
Data records: 1
- Fixing index 1

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_logentries.MYI'
Data records: 101
- Fixing index 1

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_notifications.MYI'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_objects.MYI'
Data records: 94
- Fixing index 1
- Fixing index 2

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_processevents.MYI'
Data records: 6
- Fixing index 1

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_programstatus.MYI'
Data records: 1
- Fixing index 1
- Fixing index 2

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_runtimevariables.MYI'
Data records: 18
- Fixing index 1
- Fixing index 2

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_scheduleddowntime.MYI'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_servicechecks.MYI'
Data records: 0
- Fixing index 1
- Fixing index 2
- Fixing index 3
- Fixing index 4

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_service_contactgroups.MYI'
Data records: 8
- Fixing index 1
- Fixing index 2

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_service_contacts.MYI'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_servicedependencies.MYI'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_serviceescalation_contactgroups.MYI'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_serviceescalation_contacts.MYI'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_serviceescalations.MYI'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_servicegroup_members.MYI'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_servicegroups.MYI'
Data records: 0
- Fixing index 1
- Fixing index 2

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_services.MYI'
Data records: 8
- Fixing index 1
- Fixing index 2
- Fixing index 3

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_servicestatus.MYI'
Data records: 8
- Fixing index 1
- Fixing index 2
- Fixing index 3
- Fixing index 4
- Fixing index 5
- Fixing index 6
- Fixing index 7
- Fixing index 8
- Fixing index 9
- Fixing index 10
- Fixing index 11
- Fixing index 12
- Fixing index 13
- Fixing index 14
- Fixing index 15
- Fixing index 16
- Fixing index 17
- Fixing index 18
- Fixing index 19

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_statehistory.MYI'
Data records: 0
- Fixing index 1

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_systemcommands.MYI'
Data records: 0
- Fixing index 1
- Fixing index 2
- Fixing index 3

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_timedeventqueue.MYI'
Data records: 19
- Fixing index 1
- Fixing index 2
- Fixing index 3
- Fixing index 4
- Fixing index 5

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_timedevents.MYI'
Data records: 0
- Fixing index 1
- Fixing index 2
- Fixing index 3
- Fixing index 4
- Fixing index 5

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_timeperiods.MYI'
Data records: 8
- Fixing index 1
- Fixing index 2

---------

- check record delete-chain
- recovering (with sort) MyISAM-table 'nagios_timeperiod_timeranges.MYI'
Data records: 33
- Fixing index 1
- Fixing index 2
~

===============
REPAIR COMPLETE
===============
root@central.*[pts/0]:~ # service mysqld start                                                                                                               20:05
Starting mysqld:                                           [  OK  ]
root@central.*[pts/0]:~ #                                                                                                                                    20:06
Also, I just got word from my co-worker: the only real thing he did, besides setting VIM as the editor, was setting ZSH as the user shell.
SDohmen
Posts: 240
Joined: Thu Jun 30, 2011 4:14 am

Re: Nagios xi problems

Post by SDohmen »

Scratch what I said above. I still can't see the hosts/services, and the error that Nagios isn't running is still alive and kicking.

[EDIT]

Because of the above, I decided to build a clean new VM to see if it had the same problems. Everything seemed to install correctly and, from what I saw, Nagios was not installed through yum. With that in mind I logged in and reset the security credentials. Then I tried to restart Nagios from the Tools menu.

Here, however, it got interesting. I got the error "Restart failed - Nagios command file not found or no execute permissions."

After logging back into the shell, I ran the last commands you asked me for, and here are the results:

Code: Select all

[root@central2 nagiosxi]# ll /usr/local/nagios/var
total 136
drwxrwxr-x 2 nagios nagios  4096 Sep 23 08:46 archives
-rw-rw-r-- 1 nagios nagios   241 Sep 23 08:50 host-perfdata
-rw-r--r-- 1 nagios nagios     5 Sep 23 08:49 nagios.lock
-rw-r--r-- 1 nagios root    1212 Sep 23 08:49 nagios.log
-rw-r--r-- 1 nagios nagios     5 Sep 23 08:49 ndo2db.lock
-rw-r--r-- 1 nagios nagios     0 Sep 23 08:49 ndomod.tmp
srwxr-xr-x 1 nagios nagios     0 Sep 23 08:49 ndo.sock
-rw-r--r-- 1 nagios nagios   627 Sep 23 08:49 npcd.log
-rw-r--r-- 1 nagios nagios 21892 Sep 23 08:49 objects.cache
-rw-r--r-- 1 nagios root       0 Sep 23 08:49 retention.dat
drwxrwsr-x 2 nagios nagcmd  4096 Sep 23 08:49 rw
-rw-rw-r-- 1 nagios nagios     0 Sep 23 08:49 service-perfdata
drwxrwxr-x 5 nagios nagios  4096 Sep 23 08:46 spool
drwxr-xr-x 2 nagios nagios  4096 Sep 23 08:49 stats
-rw-rw-r-- 1 nagios nagios 13464 Sep 23 08:50 status.dat
[root@central2 nagiosxi]# ll /usr/local/nagios/etc
total 264
-rwxrwxr-x 1 apache nagios   744 Sep 23 08:46 cgi.cfg
-rw-rw-r-- 1 apache nagios 15262 Sep 23 08:49 commands.cfg
-rw-rw-r-- 1 apache nagios   931 Sep 23 08:49 contactgroups.cfg
-rw-rw-r-- 1 apache nagios  1192 Sep 23 08:49 contacts.cfg
-rw-rw-r-- 1 apache nagios  1396 Sep 23 08:49 contacttemplates.cfg
-rw-rw-r-- 1 apache nagios   642 Sep 23 08:49 hostdependencies.cfg
-rw-rw-r-- 1 apache nagios   644 Sep 23 08:49 hostescalations.cfg
-rw-rw-r-- 1 apache nagios   662 Sep 23 08:49 hostextinfo.cfg
-rw-rw-r-- 1 apache nagios   792 Sep 23 08:49 hostgroups.cfg
drwsrwsr-x 2 apache nagios  4096 Sep 23 08:49 hosts
-rw-rw-r-- 1 apache nagios  6108 Sep 23 08:49 hosttemplates.cfg
drwsrwsr-x 2 apache nagios  4096 Sep 23 08:49 import
-rwxrwxr-x 1 apache nagios  5764 Sep 23 08:46 nagios.cfg
-rwxrwxr-x 1 apache nagios  2229 Sep 23 08:48 ndo2db.cfg
-rwxrwxr-x 1 apache nagios  4723 Sep 23 08:48 ndomod.cfg
-rw-rw-r-- 1 apache nagios  7207 Sep 23 08:49 nrpe.cfg
-rwxrwxr-x 1 apache nagios  5345 Sep 23 08:49 nsca.cfg
drwxrwxr-x 4 apache nagios  4096 Sep 23 08:49 pnp
-rwxrwxr-x 1 apache nagios   210 Sep 23 08:46 resource.cfg
-rwxrwxr-x 1 apache nagios  1627 Sep 23 08:49 send_nsca.cfg
-rw-rw-r-- 1 apache nagios   648 Sep 23 08:49 servicedependencies.cfg
-rw-rw-r-- 1 apache nagios   650 Sep 23 08:49 serviceescalations.cfg
-rw-rw-r-- 1 apache nagios   668 Sep 23 08:49 serviceextinfo.cfg
-rw-rw-r-- 1 apache nagios   638 Sep 23 08:49 servicegroups.cfg
drwsrwsr-x 2 apache nagios  4096 Sep 23 08:49 services
-rw-rw-r-- 1 apache nagios  9955 Sep 23 08:49 servicetemplates.cfg
drwsrwsr-x 2 apache nagios  4096 Sep 23 08:46 static
-rw-rw-r-- 1 apache nagios  2884 Sep 23 08:49 timeperiods.cfg
[root@central2 nagiosxi]# updatedb
[root@central2 nagiosxi]# locate nagios.cmd
[root@central2 nagiosxi]# locate nagios.debug
[root@central2 nagiosxi]# updatedb
[root@central2 nagiosxi]# locate nagios.cmd
[root@central2 nagiosxi]# locate nagios.debug
[root@central2 nagiosxi]# locate nagios.cmd
[root@central2 nagiosxi]# cd /var/spool/nagios/cmd
-bash: cd: /var/spool/nagios/cmd: No such file or directory
[root@central2 nagiosxi]#
As you can see, the lock file is present this time, but now other files are missing. Unless I am going crazy, I really think something in the installer is going wrong. I used a CentOS 5.7 install, by the way.
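
A sketch of a more direct check than `locate`, assuming the usual Nagios Core behaviour: the command file is a named pipe whose path is set by the `command_file` directive in nagios.cfg, and the daemon creates it at startup. On an install like this one it would normally sit in the `rw` directory from the listing above, but that is an assumption. The helper names `cfg_value` and `check_command_file` are made up for this example:

```shell
#!/bin/sh
# cfg_value FILE KEY: print the value of the last "KEY=value" line in a
# Nagios-style config file, ignoring commented-out lines.
cfg_value() {
    sed -n "s/^[[:space:]]*$2=//p" "$1" | tail -n 1
}

# check_command_file FILE: report whether the configured command file
# exists as a named pipe (FIFO).
check_command_file() {
    pipe=$(cfg_value "$1" command_file)
    if [ -z "$pipe" ]; then
        echo "no command_file entry in $1"
        return 2
    elif [ -p "$pipe" ]; then
        echo "ok: $pipe is a named pipe"
    else
        echo "missing: $pipe is not a named pipe"
        return 1
    fi
}

# Example:
#   check_command_file /usr/local/nagios/etc/nagios.cfg
```

If the pipe is reported missing while the nagios process is running, the permissions on the directory that should hold it are the next thing to look at.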

To troubleshoot some more, I started our old test XI (which does work) to see how everything looked on there. First I ran the same commands as above, and all except the debug one worked just fine. Then I checked with ps aux | grep nagios to see which processes were running, and I noticed that the line for nagios itself was different from the one on the central. So I executed that line on our central as well to see what would happen. All of a sudden I was able to restart Nagios with the Tools menu option. However, I still don't see any of the hosts or groups I added earlier.
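
A small sketch of how that comparison can be made repeatable. It filters `ps` output down to the nagios daemon itself so the result can be saved on each machine and compared. The helper name `nagios_daemon_lines` is made up, and the pattern assumes the binary is simply called `nagios`:

```shell
#!/bin/sh
# nagios_daemon_lines: read process command lines on stdin and keep only
# those that run a binary named "nagios". Lines for ndo2db, npcd, or the
# grep itself are dropped because "nagios" must be the last path
# component and be followed by a space or end of line.
nagios_daemon_lines() {
    grep -E '(^|/)nagios( |$)'
}

# Example, run on each machine:
#   ps -eo args | nagios_daemon_lines > nagios-procs.txt
```

Running `diff` on the two saved files then shows exactly how the command lines differ.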

That older install is a plain CentOS 5.6 install without anything special. The central is a CentOS 6.0 install, and the other test install I made earlier in this post is a CentOS 5.7 install. I hope this information helps.