Nagios xi - Import hangs without any diagnostic messages

burkemw
Posts: 7
Joined: Tue Nov 16, 2010 5:27 pm

Nagios xi - Import hangs without any diagnostic messages

Post by burkemw »

Issue:
Nagios xi - Import hangs without any diagnostic messages

Active Nagios Core user attempting INITIAL import of functioning Nagios Core configuration into Nagios xi.
Nagios Core configuration functional, but GUI is sluggish due to large configuration and corresponding processing of large flat files (see details below).
Attempting to replace the Core front-end with Nagios xi to improve GUI performance, allow for target user "role" implementation, and use other xi-specific features.
Nagios xi import process hangs without displaying root cause or progress details.
Can import process logging verbosity be enabled and/or increased to determine root cause?
Can any timers be adjusted or parameters altered?

Background (Nagios Core):

Currently running Nagios Core 3.2.3 and Plugins 1.4.15
Fully functional distributed environment with three distributed servers sending checks via nsca to a single central server.
Central server has roughly 15,000 hosts defined and 50,000 services defined.
Distributed servers each have a subset of the full configuration and checks are split between the three.
Central server successfully processes roughly 30,000 passive check results every 5 minutes.
Central server active checks restricted to nagios server/daemon healthchecks executed via nrpe, roughly 300 /5min (these could also be migrated to a distributed server if desired).
Central server objects.cache file consists of 2.5 million lines.
Central server status.dat file consists of 3.7 million lines.

Background (Nagios xi):

Nagiosxi
uname -a
Linux localhost.localdomain 2.6.18-164.9.1.el5 #1 SMP Tue Dec 15 21:04:57 EST 2009 i686 i686 i386 GNU/Linux

Nagiosxi is running as a VM under Oracle VM VirtualBox version 3.2.10 r66523

I am using the nagiosxi-2009r1.3g-vmware.zip file, converted to VirtualBox as described by the Nagiosxi instructions.

The VM failed to work initially, and the initrd had to be modified because VirtualBox uses SATA drives instead of IDE drives.
Following the instructions at http://support.nagios.com/forum/viewtop ... lbox#p3063 I was able to create the new initrd file:

mkinitrd --allow-missing --preload=ahci --force-scsi-probe /boot/initrd-`uname -r`-custom.img `uname -r`

edit /boot/grub.conf:

from:
initrd /initrd-2.6.18-164.9.1.el5.img

to:
initrd /initrd-2.6.18-164.9.1.el5-custom.img


At this point everything fires up and works with the defaults.

Using the instructions from http://library.nagios.com/library/produ ... -prep-tool I was able to import all the config files into the cfgprep directory. One file did generate some PHP notices, but from looking at the error I don't think this is an issue.

The issue appears when following the instructions from
http://library.nagios.com/library/produ ... es-into-xi

After selecting the files in cfgprep, with "overwrite database" checked, the page just times out and returns nothing. There are no messages indicating that anything is being done. The browser will either time out or return a blank frame after several minutes.

After giving it plenty of time I moved on to "Write monitoring data". Same results here: the page does not tell me it is doing anything and tends to time out or return a blank frame.
Moving on to "Write additional data", I do see output as indicated by the instructions, and this does finish successfully.

At this point, attempting a checkconfig fails because parts of the config appear to be missing, even though I verified the files were there. In this particular instance it was missing a host template.
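A quick way to narrow down a "missing host template" failure like the one above is to compare the templates that are referenced with the templates that are actually defined. The following is a minimal sketch, assuming standard Nagios object syntax where a template is declared with `name` and inherited with `use` (comma-separated for multiple templates). The function name `find_missing_templates` and the sample names are hypothetical, and the check ignores object type for simplicity.

```python
def find_missing_templates(config_text):
    """Return template names that are referenced via 'use' but never
    declared via 'name' in the given configuration text.

    Simplification (assumption): all object types share one namespace here.
    """
    defined = set()
    used = set()
    for raw_line in config_text.splitlines():
        # Drop inline ';' comments and surrounding whitespace.
        line = raw_line.split(";", 1)[0].strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # e.g. a closing "}" line
        directive, value = parts
        if directive == "name":
            defined.add(value.strip())
        elif directive == "use":
            used.update(t.strip() for t in value.split(",") if t.strip())
    return sorted(used - defined)
```

To use it, concatenate the text of the written config files and pass it in; any name returned is a template that some object inherits from but that was not found in the files checked.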

I also attempted to import the objects.cache file (2.5 million lines) from our central Nagios server into the database, with basically the same results: the import and write process through the web page just seems to time out.

If there are logs somewhere that we could look at to see what is going on while it is trying to process the data, that would be helpful. I am thinking our config is just too big for Nagios to handle. I really think the import process is hanging. The instructions state in red that they recommend importing in a certain order ("commands->timeperiods->contacttemplates etc…"), but with our files I don't think that is possible. Is this really necessary? If so, it could be why the import process is failing.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Nagios xi - Import hangs without any diagnostic messages

Post by mguthrie »

As you might imagine, our testing is limited with installations as large as yours. Let's see if we can narrow down the issue.

Is Nagios xi able to apply a new configuration on your system when there is nothing in the import directory? I'd like to know if the Nagios xi is able to Apply a new Configuration at all first.

If the Configuration can apply ok, let's look closer at the import.
Would your configuration allow you to import your files in smaller stages? I'm wondering if it's timing out because of the scale of the import.
burkemw
Posts: 7
Joined: Tue Nov 16, 2010 5:27 pm

Re: Nagios xi - Import hangs without any diagnostic messages

Post by burkemw »

Thank you for the response.

1) I'm afraid we don't understand the questions: "Is Nagios xi able to apply a new configuration on your system when there is nothing in the import directory? I'd like to know if the Nagios xi is able to Apply a new Configuration at all first." Are we being asked to import "nothing," to attempt an import operation with no input files? Can you be more specific?

2) We have made several import attempts--none of which have completed. Perhaps we are adding more and more configuration fragments to the xi database? Is there an option or procedure to zero-out the database contents?

3) We have a secondary stand-alone Nagios installation which is very small (Hosts+Services<100). We are going to attempt to import that small config. If that occurs without a problem, we'll know that our installation is functional and the problem probably does reside with the scale/size of our "main" configuration. Of course, we will then want to zero-out the database to start over for testing of our "main" configuration (see item #2).

4) We do have many, many configuration files in our "main" configuration that will not allow us to import in the Nagios xi breakdown/order suggested. However, it seems to me that we could perform some editing of our objects.cache or objects.precache file (created as the result of nagios core parsing all of our individual cfg files) and easily split THAT into the requested segments as an alternate approach to the Nagios xi import. Does this seem feasible? Do you see an inherent problem with this approach?

mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Nagios xi - Import hangs without any diagnostic messages

Post by mguthrie »

My apologies.

Try removing your import files from the import directory, and then try adding a single host or service through the Core Config Manager (CCM) and then attempt to Apply the Configuration. See if the configuration Applies correctly and the host or service is visible in the status tables.

If that's working correctly we'll take a closer look at the import.
burkemw
Posts: 7
Joined: Tue Nov 16, 2010 5:27 pm

Re: Nagios xi - Import hangs without any diagnostic messages

Post by burkemw »

Using the web interface to import a single host or a smaller config works OK. The import results are displayed as they should be.

Is there an easy way to clear out the database of all the old hosts and services I tried to import?
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Nagios xi - Import hangs without any diagnostic messages

Post by mguthrie »

The "safest" way I know would be to just delete all but the original services, and then all but the original "localhost." I'm sure there's an SQL query that could do it more easily, but I don't know for sure if that would cause unwanted issues elsewhere. That's probably something we should get documented since you're not the first person to ask about it.

In the Core Config Manager->Core Config Manager Admin->Config Manager Settings you can change the result limit so that you don't have to go through dozens of pages to do a mass delete.
burkemw
Posts: 7
Joined: Tue Nov 16, 2010 5:27 pm

Re: Nagios xi - Import hangs without any diagnostic messages

Post by burkemw »

I ended up reverting to a snapshot of the VM taken before trying to import the large config files, and was able to run the import, write the configuration data and additional data, then verify the config. The config did generate some duplicate entries (mainly in the _empty_hosts.cfg and _multiple_hosts.cfg files) which were not there prior to the import (import script), but I was able to get the small Nagios config to work OK.
Trying to import the large config now gives the same results as before.
Also, trying to delete 1000 hosts at a time through the web interface does not work; 500 or so hosts is OK but slow.

Again, why would the web page "time out" and not return results when importing and writing the large config, when it does return results with smaller files?
Our config files are not set up to allow us to import them in the manner the instructions describe, so would it be feasible to break down the current objects.cache file from the large main server and try to use that to import into the database?

Is there any additional data that you need?
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Nagios xi - Import hangs without any diagnostic messages

Post by mguthrie »

I'm guessing that the page times out because the time limit exceeds a threshold somewhere. It may be either a webserver configuration or a NagiosQL setting. Let me do some snooping around and I'll see what I can find.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Nagios xi - Import hangs without any diagnostic messages

Post by mguthrie »

It looks like the import process could be hitting a timeout limit in a few different places.

The /etc/httpd/conf/httpd.conf file has a time limit of 120 set for http requests.
The /etc/php.ini file has limits for both input and execution time on its scripts.

You could try cranking those numbers up and see if it imports.
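Before raising the limits it helps to see what they are currently set to. The following is a minimal sketch that pulls the relevant values out of the two files. It assumes the settings in question are Apache's `Timeout` directive and PHP's `max_execution_time` and `max_input_time`; the thread names only the files and the value 120, so those directive names are my reading of the reply, and `read_timeouts` is a hypothetical helper name.

```python
import re

def read_timeouts(httpd_conf_text, php_ini_text):
    """Return the timeout-related settings found in the two config texts.

    Commented-out lines ('#' in httpd.conf, ';' in php.ini) are ignored
    because each pattern must match from the start of a line.
    """
    found = {}
    # Apache: "Timeout 120" (directive names are case-insensitive).
    match = re.search(r"^\s*Timeout\s+(\d+)", httpd_conf_text,
                      re.MULTILINE | re.IGNORECASE)
    if match:
        found["Timeout"] = int(match.group(1))
    # PHP: "max_execution_time = 30", "max_input_time = 60".
    for key in ("max_execution_time", "max_input_time"):
        match = re.search(rf"^\s*{key}\s*=\s*(-?\d+)", php_ini_text,
                          re.MULTILINE)
        if match:
            found[key] = int(match.group(1))
    return found
```

Typical use would be to read /etc/httpd/conf/httpd.conf and /etc/php.ini into strings and pass them in. After editing either file, the web server needs to be restarted for the new values to take effect.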

I'm not sure anyone has ever tried breaking down the objects file from a previous server. However, the problem with this is that the template information will be lost, which I'm guessing would be used extensively on a larger environment.

Also, not sure if this is useful to know or not, but the nagios/etc/static directory allows for manual maintenance of config files. The files in that directory are read by Nagios xi but are not managed by NagiosQL. Some people use it as a staging area while they import.
burkemw
Posts: 7
Joined: Tue Nov 16, 2010 5:27 pm

Re: Nagios xi - Import hangs without any diagnostic messages

Post by burkemw »

We create/use the precache in order to speed up restart operations.

While we do make extensive use of templates in our user-edited configuration files, the act of creating the precache applies all of the templates and creates a "result" file. While the templates themselves are not transferred to the precache file, the template PREFERENCES have been applied to the host/service definitions in creating the precache file. Therefore, the lack of actual template definitions in the resulting objects.precache file doesn't seem as though it should present any issue.

We'll try breaking down the precache file into multiple sections for a more segmented import.
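The segmented approach could be sketched roughly as follows. This is a minimal sketch, not a tested migration tool: it assumes the precache file consists of `define <type> {` blocks, each closed by a `}` on a line of its own, and it groups them by object type so that each type can be imported separately. The names `split_by_type` and `write_segments` are hypothetical, and the import order itself is left to the operator.

```python
import os
import re
from collections import defaultdict

# Matches the opening line of an object definition, e.g. "define host {".
DEFINE_RE = re.compile(r"^\s*define\s+(\w+)\s*\{")

def split_by_type(lines):
    """Group 'define <type> { ... }' blocks by object type.

    'lines' is any iterable of strings (for example an open file), so a
    multi-million-line file can be streamed rather than read at once.
    """
    blocks = defaultdict(list)
    current_type = None
    current_lines = []
    for raw_line in lines:
        line = raw_line.rstrip("\n")
        if current_type is None:
            match = DEFINE_RE.match(line)
            if match:
                current_type = match.group(1)
                current_lines = [line]
        else:
            current_lines.append(line)
            if line.strip() == "}":  # end of the current block
                blocks[current_type].append("\n".join(current_lines))
                current_type = None
    return dict(blocks)

def write_segments(blocks, out_dir):
    """Write one <type>.cfg file per object type and return the paths."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for obj_type in sorted(blocks):
        path = os.path.join(out_dir, obj_type + ".cfg")
        with open(path, "w") as handle:
            handle.write("\n\n".join(blocks[obj_type]) + "\n")
        paths.append(path)
    return paths
```

With the file opened as `handle`, `write_segments(split_by_type(handle), "segments")` would produce files such as command.cfg and host.cfg. If a single type is still too large for one web request, each list in the returned dictionary can be sliced into smaller batches before writing.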