NagiosXI issues

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
SDohmen
Posts: 240
Joined: Thu Jun 30, 2011 4:14 am

NagiosXI issues

Post by SDohmen »

Hello,

Yesterday we decided to reinstall our central server on a new VM so we could change the partition scheme we were using. Our original Nagios XI install was 2011R2.0 on a 32-bit CentOS 6 machine.

The new one is a 64-bit CentOS 6 installation (because of XFS support).

I created a backup of the old central server, moved that file over to the new one, and restored it there. This caused several problems with NPCD and some other components. After some fiddling and reinstalling the individual components, I got everything working again (at least I don't see errors anymore). What I did notice, though, was that I had no performance graphs.

According to the wiki you have to reset the permissions and do several other things, but none of those worked for us. A post here on the forum gave a link you can open to see whether it returns anything (https://central/nagiosxi/includes/compo ... =localhost). What I get back is an error image:
(attachment: Knipsel2.PNG)
Is there a way to fix or remove the old graphs?

The other issue I seem to have is with the upgrade screen:
(attachment: Knipsel.PNG)
The weird thing is that I have version 2.0 installed and nothing else. Could this be because of the 64-bit install?

Re: NagiosXI issues

Post by SDohmen »

After some more fiddling around I decided to create a new backup. While running this backup I got the following error:

Backing up MySQL databases...
mysqldump: Got error: 144: Table './nagios/nagios_logentries' is marked as crashed and last (automatic?) repair failed when using LOCK TABLES
Error backing up MySQL database 'nagios' - check the password in this script!

Because of this I decided to force a repair and then run the dbmaint script again. During the dbmaint run I noticed the following errors:

OPTIMIZING NAGIOSXI TABLE: xi_notifications
SQL: VACUUM ANALYZE xi_notifications;
SQL: SQL Error [nagiosxi] : ERROR: relation "xi_notifications" does not exist
OPTIMIZING NAGIOSXI TABLE: xi_meta
SQL: VACUUM ANALYZE xi_meta;

and

SQL: VACUUM ANALYZE xi_users;
CLEANING nagiosql TABLE 'logbook'...
SQL: DELETE FROM tbl_logbook WHERE time < FROM_UNIXTIME(1329362961)
Repair Complete: FAILED TO REMOVE LOCK FILE

It seems that I can back up again, though. Since I am not sure whether these problems are related to the graph problem, I added them here.

Lastly, I noticed that the DB backup scripts are no longer added to the crontab. I guess this is an installer issue?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: NagiosXI issues

Post by scottwilkerson »

I think we may still have some permissions problems. Could you run through this document?
http://assets.nagios.com/downloads/nagi ... ios_XI.pdf

Also, the upgrade screen problem was an issue on our side; thanks for pointing it out.

After creating the backup, you mentioned moving files over. Which files did you move?
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart

Re: NagiosXI issues

Post by SDohmen »

I ran the permissions fix on the central server, but nothing changed. I have attached a picture of a service on the central server itself.
(attachment: Knipsel3.PNG)
The only file I moved over was the backup created with the backupxi script. This seemed to break the NPCD and NDO services, though. I fixed those by running the component installer again. Other than that, I don't see any errors on the dashboard or in the logs anymore.

While importing the backup I also got some MySQL errors, which I had to fix by forcing a repair. I somehow doubt this has anything to do with it, though.

I was just checking nagios.log and noticed these entries:
[1329414668] ndomod: Successfully reconnected to data sink! 6548 items lost, 5000 queued items to flush.
[1329414670] ndomod: Successfully flushed 5000 queued items to data sink.

Also, I thought MySQL was fixed, but it seems it's still broken:

Feb 16 19:00:21 central ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_logentries SET instance_id='1', logentry_time=FROM_UNIXTIME(1329415221), entry_time=FROM_UNIXTIME(1329415221), entry_time_usec='28793', logentry_type='2', logentry_data='Warning: The results of host \'-sw1\' are stale by 0d 0h 1m 0s \(threshold=0d 0h 15m 0s\)\.  I\'m forcing an immediate check of the host\.', realtime_data='1', inferred_data_extracted='1''
Feb 16 19:00:21 central ndo2db: mysql_error: 'Table './nagios/nagios_logentries' is marked as crashed and last (automatic?) repair failed'
Feb 16 19:01:21 central nagios: Warning: The check of host 'o4snoc-landesk' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the host...
Feb 16 19:01:21 central ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_logentries SET instance_id='1', logentry_time=FROM_UNIXTIME(1329415281), entry_time=FROM_UNIXTIME(1329415281), entry_time_usec='250292', logentry_type='2', logentry_data='Warning: The check of host \'landesk\' looks like it was orphaned \(results never came back\)\.  I\'m scheduling an immediate check of the host\.\.\.', realtime_data='1', inferred_data_extracted='1''
Feb 16 19:01:21 central ndo2db: mysql_error: 'Table './nagios/nagios_logentries' is marked as crashed and last (automatic?) repair failed'
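
To see which tables are affected across a whole log rather than a few pasted lines, the `mysql_error` entries above can be tallied with a short script. This is a minimal sketch in Python and not part of Nagios XI: the names `crashed_tables` and `report` are made up for illustration, and the default log location (`/var/log/messages`, as on a stock CentOS syslog setup) is an assumption.

```python
import re
from collections import Counter

# Matches the MySQL error text as it appears in the log lines above, e.g.
#   Table './nagios/nagios_logentries' is marked as crashed
CRASHED = re.compile(r"Table '\./(?P<db>[^/']+)/(?P<table>[^']+)' is marked as crashed")


def crashed_tables(lines):
    """Count 'marked as crashed' errors per (database, table) pair."""
    hits = Counter()
    for line in lines:
        match = CRASHED.search(line)
        if match:
            hits[(match.group("db"), match.group("table"))] += 1
    return hits


def report(log_path="/var/log/messages"):  # assumed syslog location
    """Print one summary line per crashed table found in the log file."""
    with open(log_path, errors="replace") as handle:
        for (db, table), count in crashed_tables(handle).most_common():
            print(f"{db}.{table}: {count} error(s)")
```

If only one table shows up, as in the excerpt above, the repair effort can stay focused on that table instead of the whole database.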

Re: NagiosXI issues

Post by scottwilkerson »

Yep, we need to fix that table.

Please run:

# /usr/local/nagiosxi/scripts/repairmysql.sh nagios

If you continue to have problems with the table, let's run:

mysql -u root -pnagiosxi nagios
REPAIR TABLE nagios_logentries USE_FRM;
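
Because a repair can report success while the table is still unusable, it may help to re-check the table afterwards. The Python sketch below shells out to the standard `mysql` client to run `CHECK TABLE` and inspects the rows it returns. The helper names are hypothetical, it assumes the client can log in without prompting (for example via credentials in `~/.my.cnf`), and the set of "healthy" status messages is an assumption based on typical MyISAM output.

```python
import re
import subprocess


def parse_check_table(output):
    """Split tab-separated CHECK TABLE rows into (table, op, msg_type, msg_text)."""
    rows = []
    for line in output.splitlines():
        parts = line.split("\t")
        if len(parts) == 4:
            rows.append(tuple(parts))
    return rows


def is_healthy(rows):
    """True if no row reports an error and a status row says the table is fine."""
    ok_texts = {"OK", "Table is already up to date"}  # assumed healthy messages
    has_error = any(msg_type == "error" for _, _, msg_type, _ in rows)
    has_ok = any(
        msg_type == "status" and text in ok_texts for _, _, msg_type, text in rows
    )
    return has_ok and not has_error


def check_table(database, table):
    """Run CHECK TABLE through the mysql client and return the parsed rows."""
    # Only plain identifiers are accepted, since the name is placed into SQL.
    if not re.fullmatch(r"\w+", table):
        raise ValueError(f"unexpected table name: {table!r}")
    result = subprocess.run(
        ["mysql", "--batch", "--skip-column-names",
         "-e", f"CHECK TABLE {table}", database],
        capture_output=True, text=True, check=True,
    )
    return parse_check_table(result.stdout)
```

For the table in this thread, that would be `is_healthy(check_table("nagios", "nagios_logentries"))`.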

Re: NagiosXI issues

Post by SDohmen »

The strange thing is that this afternoon I did the above twice, and both times it said the table was fixed. At the moment I am running a safe-recover to see if that fixes it. I suspect that NDO is what breaks the table.

Is there a way to get rid of the queued NDO data?

[EDIT]
Whatever I tried, nothing seems to fix the problem. I am trying the command above again from the MySQL shell. It is still running (around 1 hour so far) and I have no idea how far along it is.

Re: NagiosXI issues

Post by scottwilkerson »

Let us know if this fixes the table when it completes.

That can be a very large table.

Re: NagiosXI issues

Post by SDohmen »

It has been running for over 9 hours now and, by the looks of it, it still has not completed. The table itself isn't that big (just below 3 GB).

Is it normal for it to take this long?

Re: NagiosXI issues

Post by SDohmen »

We came to the conclusion that the database was somehow beyond repair, so we dropped the database and recreated it. This solved the MySQL problems I had. For the graphs, I googled around and found out that RRD files created on the 32-bit system aren't compatible with the 64-bit one. One option was to remove them and the other was to convert them. I chose the first of the two to save some time ;).
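
For readers who would rather keep the graph history, the conversion route is usually done with `rrdtool dump` on the old 32-bit host and `rrdtool restore` on the new 64-bit host, since the XML dump is architecture-independent. The Python sketch below only generates the shell commands for every `.rrd` file under a directory; it does not run them. The function names are hypothetical, and the perfdata location on a given Nagios XI install is something to verify rather than assume.

```python
import shlex
from pathlib import Path


def dump_commands(rrd_root, xml_root):
    """Commands for the OLD host: dump each .rrd under rrd_root to XML under xml_root."""
    rrd_root, xml_root = Path(rrd_root), Path(xml_root)
    commands = []
    for rrd in sorted(rrd_root.rglob("*.rrd")):
        # Mirror the directory layout, swapping the .rrd suffix for .xml.
        xml = xml_root / rrd.relative_to(rrd_root).with_suffix(".xml")
        commands.append(
            f"rrdtool dump {shlex.quote(str(rrd))} > {shlex.quote(str(xml))}"
        )
    return commands


def restore_commands(xml_root, rrd_root):
    """Commands for the NEW host: rebuild each .rrd under rrd_root from its XML dump."""
    xml_root, rrd_root = Path(xml_root), Path(rrd_root)
    commands = []
    for xml in sorted(xml_root.rglob("*.xml")):
        rrd = rrd_root / xml.relative_to(xml_root).with_suffix(".rrd")
        commands.append(
            f"rrdtool restore {shlex.quote(str(xml))} {shlex.quote(str(rrd))}"
        )
    return commands
```

The target subdirectories have to exist before the generated commands are run. Keeping the dumps in a separate directory is deliberate, because a perfdata directory may already contain unrelated `.xml` files, and `rrdtool restore` will not overwrite an existing `.rrd` unless told to.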

Finding the SQL script for the import was a bit of a pain, but I managed in the end :). Thanks for the help.

Re: NagiosXI issues

Post by scottwilkerson »

Glad you got it sorted out.

I never want someone to have to drop the database if there are other options, but sometimes that is all that is left.