graph error after upgrade to R3.3


Post by nagiosadmin42 »

I just completed the upgrade to xi-2011r3.3 and now graphs don't display. We're getting these errors in /var/log/httpd/error_log whenever we attempt to view a graph:

Code: Select all

[Thu Aug 23 15:53:42 2012] [error] [client 172.16.1.62] PHP Warning:  simplexml_load_string(): Entity: line 2: parser error : XML declaration allowed only at the start of the document in /usr/local/nagiosxi/html/includes/utils-backend.inc.php on line 27, referer: http://nagios.mycompany.com/nagiosxi/index.php?
[Thu Aug 23 15:53:42 2012] [error] [client 172.16.1.62] PHP Warning:  simplexml_load_string(): <?xml version="1.0" encoding="utf-8"?> in /usr/local/nagiosxi/html/includes/utils-backend.inc.php on line 27, referer: http://nagios.mycompany.com/nagiosxi/index.php?
[Thu Aug 23 15:53:42 2012] [error] [client 172.16.1.62] PHP Warning:  simplexml_load_string():      ^ in /usr/local/nagiosxi/html/includes/utils-backend.inc.php on line 27, referer: http://nagios.mycompany.com/nagiosxi/index.php?
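
The "line 2" and "XML declaration allowed only at the start of the document" parts of this warning suggest that something, possibly a single newline, reached the parser ahead of the `<?xml ...?>` declaration. The following is a minimal Python sketch of that class of failure, not the Nagios XI code: the payloads are made up, and Python's parser (expat) is a different library from the one behind simplexml_load_string(), although both enforce the same rule.

```python
import xml.etree.ElementTree as ET

# Hypothetical payloads: the same document, with and without a stray leading newline.
CLEAN = b'<?xml version="1.0" encoding="utf-8"?><result>ok</result>'
DIRTY = b"\n" + CLEAN


def parses(payload: bytes) -> bool:
    """Return True if the payload is well-formed XML, False if the parser rejects it."""
    try:
        ET.fromstring(payload)
        return True
    except ET.ParseError:
        return False


print(parses(CLEAN))  # the declaration is the first thing in the document
print(parses(DIRTY))  # one newline precedes the declaration
```

If the backend response looks like DIRTY, the thing to look for is whatever emits output before the XML is generated.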

Now, there was a problem during the upgrade: the installer script aborted near the end, at the restart of the httpd service. It failed because it thought the service's port was already in use; apparently the restart command didn't succeed. Anyway, I simply ran "service httpd restart" manually, and at that point it succeeded. I saw that only two commands remained in the upgrade script, so I ran them manually at the console (all commands were run as root):

./install-templates
./install-sourceguardian-extension.sh

Should I try running the upgrade script again?
Last edited by nagiosadmin42 on Thu Aug 23, 2012 7:23 pm, edited 1 time in total.

Post by nagiosadmin42 »

I thought I'd troubleshoot this to get more detail on what is going wrong. I opened the Nagios XI admin GUI in Firefox with Firebug enabled, navigated to one of the service detail pages, and clicked the Performance Graphs tab. Watching the Firebug console, I saw this error scroll by among the many AJAX requests taking place:

Code: Select all

Image corrupt or truncated: http://nagios.mycompany.com/nagiosxi/includes/components/perfdata/graphApi.php?host=TEST-HOST&service=Test%2BService&source=1&view=1&start=&end=&rand=1345764817

I copied the URL into a separate Firefox tab, also with Firebug open. Firefox displays an image containing the text "The image 'http://...' cannot be displayed because it contains errors.", and the same "Image corrupt or truncated" error is logged to the Firebug console. Interestingly, the response status was "200 OK". Here are the request and response headers as displayed by Firebug:

Code: Select all

REQUEST Headers:
GET /nagiosxi/includes/components/perfdata/graphApi.php?host=TEST-HOST&service=Test%2BService&source=1&view=1&start=&end=&rand=1345764997 HTTP/1.1
Host: nagios.mycompany.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:14.0) Gecko/20100101 Firefox/14.0.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: nagiosxi=sj6ik84bda53mm234alnljl306
X-ClickOnceSupport: ( .NET CLR 3.5.30729; .NET4.0E)

RESPONSE Headers:
HTTP/1.1 200 OK
Date: Thu, 23 Aug 2012 23:37:27 GMT
Server: Apache/2.2.15 (CentOS)
X-Powered-By: PHP/5.3.3
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: nagiosxi=sj6ik84bda53mm234alnljl306; expires=Fri, 24-Aug-2012 00:07:27 GMT; path=/
Connection: close
Transfer-Encoding: chunked
Content-Type: image/png

To see what's going on in graphApi.php, I set $bool = 1 after the call to passthru() to force it to log the RRDTool command being executed for the graph. Then I watched /usr/local/nagios/var/graphapi.log and reloaded the URL.

The log showed:

Code: Select all

graph ERROR: 2012-08-23T16:57:07-07:00
/usr/bin/rrdtool graph -  --width=500 --height=100  --start=-24h --vertical-label "" --title "TEST-HOST / Test_Service" --lower-limit 0 DEF:var1=/usr/local/nagios/share/perfdata/TEST-HOST/Test_Service.rrd:1:MAX AREA:var1#EACC00:"Nbr_Users " LINE1:var1#000000:"" GPRINT:var1:LAST:"%3.4lf  LAST " GPRINT:var1:MAX:"%3.4lf  MAX " GPRINT:var1:AVERAGE:"%3.4lf  AVERAGE \n" COMMENT:"Check Command check_dummy\r"

I ran the rrdtool command manually at the console and redirected its output to /tmp/test.png. The file was 19 KB, matching the size Firebug reported in the browser. I copied the .png file to my PC and the graph displays just fine (Windows 7, with the default Windows Photo Viewer). It also displays fine when loaded into Firefox using a file: URL, e.g. file:///C:/Users/myuser/Desktop/test.png

At this point, I'm not sure what graphApi.php is doing to make the browser think the image is corrupt or truncated. I also don't know whether the XML errors reported in my initial post are related or a separate issue.

Any tips on how to continue troubleshooting this problem?
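
One possible next step, sketched here in Python under stated assumptions: since rrdtool itself produces a valid PNG, it may help to check whether the body served by graphApi.php starts with the 8-byte PNG signature or has extra bytes in front of it. The function name and the sample byte strings below are hypothetical stand-ins; in practice the input would be the raw response body saved to a file and read in binary mode.

```python
# The fixed 8-byte signature that every PNG file starts with.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def bytes_before_png(body: bytes) -> int:
    """Return how many bytes precede the PNG signature, or -1 if it is absent."""
    return body.find(PNG_SIGNATURE)


# Hypothetical stand-ins for a clean image and one with a stray leading newline.
good = PNG_SIGNATURE + b"...image data..."
bad = b"\n" + good

print(bytes_before_png(good))  # 0: the signature is at the very start
print(bytes_before_png(bad))   # 1: one stray byte precedes the image
```

A result greater than 0 would mean some other code wrote output before passthru() streamed the image, which would be consistent with a "200 OK" response that the browser still rejects.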
Last edited by nagiosadmin42 on Fri Aug 24, 2012 12:03 pm, edited 1 time in total.

Post by scottwilkerson »


Post by nagiosadmin42 »

Yeah, I saw that post. However, we're not using rrdcached.

Is there a problem with running the upgrade script a second time? I was thinking about doing that.

Post by nagiosadmin42 »

I ran the upgrade again, but it didn't resolve the problem: still no graphs.

Post by nagiosadmin42 »

I tracked the problem down to our custom PNP template script: /usr/local/nagios/share/pnp/templates.special/check_dummy.php

Ninety percent of our services are passive, and by default use the check_dummy.php template to display their graphs. I was looking through our services, trying to figure out what could be going on, and by chance happened to view one that doesn't use this template, and the graph worked!

We've been using a modified version of check_dummy.php for a while now. I restored the original version and the graphs are now working.

I need to review the custom template and determine why it stopped working after the R3.3 upgrade. I'll keep you posted.

Post by scottwilkerson »

Thanks for sharing your findings.

Post by nagiosadmin42 »

I found the problem. It is caused by our template having a blank line at the end of the file!

This file difference shows the problem:

Code: Select all

# diff check_dummy.php check_dummy.php.NEW
206a207
>

Removing that blank line at the end of the file fixed it.
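
For anyone who wants to check other templates for the same thing, here is a rough Python sketch. It assumes the template ends with a closing `?>` tag, which this thread does not actually show. PHP swallows a single newline directly after a closing tag, so only what follows that first newline is sent to the client as literal output. The function name and sample sources are hypothetical, and the check is simplified: it treats the last `?>` in the file as a real closing tag.

```python
def output_after_last_close_tag(source: bytes) -> bytes:
    """Return the literal bytes PHP would emit after the final '?>' in a file.

    Simplification: assumes the last '?>' is a real closing tag (not inside a
    string or comment) and that no PHP block is reopened after it.
    """
    pos = source.rfind(b"?>")
    if pos == -1:
        return b""  # no closing tag: the file ends inside PHP code
    tail = source[pos + 2:]
    # PHP treats one newline immediately after '?>' as part of the tag.
    for newline in (b"\r\n", b"\n", b"\r"):
        if tail.startswith(newline):
            return tail[len(newline):]
    return tail


# Hypothetical template sources: the second has one extra blank line at the end.
original = b"<?php\n$opt[1] = '';\n?>\n"
custom = original + b"\n"

print(output_after_last_close_tag(original))  # b''
print(output_after_last_close_tag(custom))    # b'\n'
```

A non-empty result means the file writes stray bytes into whatever response includes it. A common way to avoid this altogether is to omit the closing `?>` in files that contain only PHP code.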

Post by mguthrie »

Those whitespace bugs are killer. Nice catch, and thanks for posting what you found!