Perhaps the hardware requirements for 2012 are slightly higher and we've experienced the straw that broke the camel's back during the upgrade. Hope so.

4 cores, 4 GB RAM, 40 GB drive, currently 175 hosts / 2133 services, about 300 passive and the rest active. One issue we do have is that we are monitoring servers in Australia from our UK base, so latency has always been high.
lmiltchev wrote:Here's the official Nagios XI hardware requirements:
http://assets.nagios.com/downloads/nagi ... ements.pdf
What's your system like? Do you meet (or exceed) these requirements?
Code:
2012-10-24 21:48:42 [3852] [0] *** process_perfdata.pl terminated on signal ALRM
2012-10-24 21:50:19 [4893] [0] *** TIMEOUT: Timeout after 15 secs. ***
2012-10-24 21:50:19 [4893] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-10-24 21:50:19 [4893] [0] *** TIMEOUT: Please check your npcd.cfg
2012-10-24 21:50:19 [4893] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1351083622.perfdata.service-PID-4893 deleted
2012-10-24 21:50:19 [4893] [0] *** Timeout while processing Host: "asc-jadedev1.int.ascribe.com" Service: "Page_File_Usage"
2012-10-24 21:50:19 [4893] [0] *** process_perfdata.pl terminated on signal ALRM
2012-10-24 21:52:48 [6520] [0] *** TIMEOUT: Timeout after 15 secs. ***
2012-10-24 21:52:48 [6520] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-10-24 21:52:48 [6520] [0] *** TIMEOUT: Please check your npcd.cfg
2012-10-24 21:52:48 [6520] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1351083763.perfdata.service-PID-6520 deleted
2012-10-24 21:52:48 [6520] [0] *** Timeout while processing Host: "CORE-TFS" Service: "Drive_H__Disk_Usage"
2012-10-24 21:52:48 [6520] [0] *** process_perfdata.pl terminated on signal ALRM
2012-10-24 21:53:05 [6620] [0] *** TIMEOUT: Timeout after 15 secs. ***
2012-10-24 21:53:05 [6620] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-10-24 21:53:05 [6620] [0] *** TIMEOUT: Please check your npcd.cfg
2012-10-24 21:53:05 [6620] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1351083777.perfdata.service-PID-6620 deleted
2012-10-24 21:53:05 [6620] [0] *** Timeout while processing Host: "localhost" Service: "Avg_HostExecTime"
2012-10-24 21:53:05 [6620] [0] *** process_perfdata.pl terminated on signal ALRM
2012-10-24 21:53:44 [6889] [0] *** TIMEOUT: Timeout after 15 secs. ***
2012-10-24 21:53:44 [6889] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-10-24 21:53:44 [6889] [0] *** TIMEOUT: Please check your npcd.cfg
2012-10-24 21:53:44 [6889] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1351083807.perfdata.service-PID-6889 deleted
2012-10-24 21:53:44 [6889] [0] *** Timeout while processing Host: "cnllhrs5.cnl.cnw.co.nz" Service: "Check_Temp_-_ioBoard"
2012-10-24 21:53:44 [6889] [0] *** process_perfdata.pl terminated on signal ALRM
2012-10-24 21:53:44 [6896] [0] *** TIMEOUT: Timeout after 15 secs. ***
2012-10-24 21:53:44 [6896] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-10-24 21:53:44 [6896] [0] *** TIMEOUT: Please check your npcd.cfg
2012-10-24 21:53:44 [6896] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1351083824.perfdata.service-PID-6896 deleted
2012-10-24 21:53:44 [6896] [0] *** Timeout while processing Host: "cnllhrs4.cnl.cnw.co.nz" Service: "Memory_Usage"
2012-10-24 21:53:44 [6896] [0] *** process_perfdata.pl terminated on signal ALRM
2012-10-25 07:07:38 [5895] [0] *** TIMEOUT: Timeout after 15 secs. ***
2012-10-25 07:07:38 [5895] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-10-25 07:07:38 [5895] [0] *** TIMEOUT: Please check your npcd.cfg
2012-10-25 07:07:38 [5895] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1351107834.perfdata.service-PID-5895 deleted
2012-10-25 07:07:38 [5895] [0] *** Timeout while processing Host: "mel-jadedev2.int.ascribe.com" Service: "Drive_H__Disk_Usage"
2012-10-25 07:07:38 [5895] [0] *** process_perfdata.pl terminated on signal ALRM
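With this many timeouts it can help to see whether they cluster on particular hosts or services (for example the high-latency remote ones) or are spread evenly. Below is a minimal sketch, not anything shipped with Nagios XI, that tallies the "Timeout while processing" lines per host/service pair. The function name `count_timeouts` is made up for illustration, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches the 'Timeout while processing' lines seen in the log above.
TIMEOUT_RE = re.compile(
    r'Timeout while processing Host: "(?P<host>[^"]+)" Service: "(?P<service>[^"]+)"'
)


def count_timeouts(lines):
    """Return a Counter keyed by (host, service) for every timeout line."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match["host"], match["service"])] += 1
    return counts


# Sample lines taken from the log excerpt in this thread.
sample = [
    '2012-10-24 21:50:19 [4893] [0] *** Timeout while processing Host: "asc-jadedev1.int.ascribe.com" Service: "Page_File_Usage"',
    '2012-10-24 21:52:48 [6520] [0] *** Timeout while processing Host: "CORE-TFS" Service: "Drive_H__Disk_Usage"',
    '2012-10-24 21:52:48 [6520] [0] *** process_perfdata.pl terminated on signal ALRM',
]

for (host, service), n in count_timeouts(sample).most_common():
    print(f"{n:3d}  {host} / {service}")
```

To run it against the real file, pass an open file handle instead of `sample`, e.g. `count_timeouts(open("/usr/local/nagios/var/perfdata.log"))`. That path is an assumption based on the directory listing in this thread.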
Code:
2012-10-25 10:05:09 [1348] [1] Found Performance Data for core-email.int.ascribe.com / Drive_D__Disk_Usage (D:\ Used Space=0.12Gb;11.72;13.18;0.00;14.65)
2012-10-25 10:05:09 [1348] [1] Found Performance Data for gri_acu-fp-10.acute.xglasgow.scot.nhs.uk / CPU_Usage (5 min avg Load=1%;85;95;0;100)
2012-10-25 10:05:12 [1348] [1] Found Performance Data for cnllhrs2.cnl.cnw.co.nz / Drive_D__Ops_Tools (D:\ Used Space=28.04Gb;31.22;35.13;0.00;39.03)
2012-10-25 10:05:17 [1348] [0] *** TIMEOUT: Timeout after 5 secs. ***
2012-10-25 10:05:17 [1348] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-10-25 10:05:17 [1348] [0] *** TIMEOUT: Please check your npcd.cfg
2012-10-25 10:05:17 [1348] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1351123012.perfdata.service-PID-1348 deleted
2012-10-25 10:05:17 [1348] [0] *** Timeout while processing Host: "cnllhrs2.cnl.cnw.co.nz" Service: "Drive_D__Ops_Tools"
2012-10-25 10:05:17 [1348] [0] *** process_perfdata.pl terminated on signal ALRM
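One thing worth checking in these entries is how old a spool file already is by the time it gets processed. The sketch below assumes the leading number in the spool file name is the Unix epoch at which Nagios rotated the file. That is an assumption about how the perfdata move command is configured on this box, not something the log confirms. The helper name `spool_file_created` is hypothetical.

```python
import re
from datetime import datetime, timezone

# Assumption: spool files are named "<unix epoch>.perfdata.service" or
# "<unix epoch>.perfdata.host", optionally followed by "-PID-<n>".
SPOOL_RE = re.compile(r"(?P<epoch>\d{10})\.perfdata\.(?:service|host)")


def spool_file_created(path):
    """Return the UTC datetime encoded in a spool file name, or None."""
    match = SPOOL_RE.search(path)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match["epoch"]), tz=timezone.utc)


# Path copied from the log entry above.
created = spool_file_created(
    "/usr/local/nagios/var/spool/perfdata//1351123012.perfdata.service-PID-1348"
)
print(created.isoformat())
```

For the file above this prints `2012-10-24T23:56:52+00:00`, while the log entry that deleted it is stamped 10:05:17 on 25 Oct. If the assumption holds, that points to a backlog of several hours (the exact figure depends on the server's timezone), which would fit a spool directory that is filling faster than it drains.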
Code:
drwxrwxr-x 7 nagios nagios 4096 Oct 25 11:23 .
drwxr-xr-x 9 root root 4096 Oct 9 2011 ..
drwxrwxr-x 2 nagios nagios 20480 Oct 25 00:00 archives
drwxr-xr-x 2 nagios nagios 4096 Dec 22 2011 archives.old
-rw-r--r-- 1 apache apache 47880 Oct 13 11:07 graphapi.log
-rw-rw-r-- 1 nagios users 248 Oct 25 11:23 host-perfdata
-rw-r--r-- 1 root root 11060 May 17 08:54 nagios.debug
-rw-r--r-- 1 nagios users 5 Oct 25 11:16 nagios.lock
-rw-r--r-- 1 nagios nagios 5 Oct 25 09:10 ndo2db.lock
-rw-rw-r-- 1 nagios users 0 Oct 25 11:16 ndomod.tmp
srwxr-xr-x 1 nagios nagios 0 Oct 25 09:10 ndo.sock
-rw-r--r-- 1 nagios nagios 7346089 Oct 25 11:23 npcd.log
-rw-r--r-- 1 nagios nagios 10485783 Aug 31 08:11 npcd.log.old
-rw-r--r-- 1 nagios nagios 2145277 Oct 25 11:16 objects.cache
-rw-rw-rw- 1 nagios nagios 7459491 Oct 25 10:05 perfdata.log
-rw------- 1 nagios users 3485074 Oct 25 11:16 retention.dat
drwxrwsr-x 2 nagios nagcmd 4096 Jun 18 09:42 rw
-rw-rw-r-- 1 nagios users 4050 Oct 25 11:23 service-perfdata
drwxr-xr-x 5 nagios nagios 4096 Jan 26 2011 spool
drwxr-xr-x 2 nagios nagios 4096 Oct 25 10:05 stats
-rw-rw-r-- 1 nagios users 3402524 Oct 25 11:23 status.dat
Code:
2012-10-25 13:12:06 [1058] [1] Found Performance Data for test-esx.int.ascribe.com / VMware_Host_Current_Datastore_datastore1_Usage (datastore1-free=140877234176B;3;1;0;141465485312 datastore1=588251136B;141465485309;141465485311;0;141465485312)
2012-10-25 13:12:06 [1058] [1] Found Performance Data for ascribe-esx2.int.ascribe.com / VMware_Host_Current_Datastore_vmfs05_Usage (VMFS_05-free=20057161728B;3;1;0;549487378432 VMFS_05=529430216704B;549487378429;549487378431;0;549487378432)
2012-10-25 13:12:06 [1058] [1] Found Performance Data for localhost / PassiveServiceChecks_1mn (Passive_Checks_1mn=0;;;)
2012-10-25 13:12:06 [1058] [1] Found Performance Data for tstesting-web2.elt / IIS_Web_Server_Connections (CurrentConnections=0; _ConnectionAttemptsPersec=0;)
2012-10-25 13:12:06 [1058] [1] 127 lines processed
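For anyone reading the bracketed strings in these entries: each metric follows the usual plugin perfdata layout of `label=value[unit];warn;crit;min;max`, with multiple metrics separated by spaces. Here is a rough sketch of pulling one apart. It assumes labels contain no spaces or quotes, which is true for the VMware datastore lines but not for labels like `D:\ Used Space`, so treat it as illustrative only. `parse_perfdata` is a made-up name.

```python
import re

# One metric per match: label=value[unit] followed by up to four ';' fields.
# Assumption: labels have no spaces, quotes or '=' in them.
METRIC_RE = re.compile(
    r"(?P<label>[^\s=]+)=(?P<value>-?[\d.]+)(?P<unit>[^;\s]*)(?P<rest>(?:;[^;\s]*)*)"
)


def parse_perfdata(text):
    """Return {label: {value, unit, warn, crit, min, max}} for a perfdata string."""
    metrics = {}
    for m in METRIC_RE.finditer(text):
        fields = m["rest"].split(";")[1:]   # drop the empty piece before the first ';'
        fields += [""] * (4 - len(fields))  # pad thresholds that were left out
        warn, crit, lowest, highest = fields[:4]
        metrics[m["label"]] = {
            "value": float(m["value"]),
            "unit": m["unit"],
            "warn": warn,
            "crit": crit,
            "min": lowest,
            "max": highest,
        }
    return metrics


# Perfdata string copied from the ascribe-esx2 entry above.
line = (
    "VMFS_05-free=20057161728B;3;1;0;549487378432 "
    "VMFS_05=529430216704B;549487378429;549487378431;0;549487378432"
)
for label, metric in parse_perfdata(line).items():
    print(label, metric["value"], metric["unit"], "max", metric["max"])
```

In this sample the free and used values add up to the reported max, so the data itself looks sane; the timeouts are more likely about processing time than malformed perfdata.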
Code:
2012-10-25 15:43:55 [32332] [1] Found Performance Data for tstesting-db1.elt / CPU_Usage (5 min avg Load=3%;85;95;0;100)
2012-10-25 15:43:55 [32332] [1] Found Performance Data for core-email.int.ascribe.com / Memory_Usage (Memory usage=5770.10Mb;27013.89;30191.99;0.00;31781.05)
2012-10-25 15:43:55 [32332] [1] Found Performance Data for ascribesql.xchristie.nhs.uk / CPU_Usage (5 min avg Load=19%;85;95;0;100)
2012-10-25 15:43:55 [32332] [1] Found Performance Data for cnllhrs4.cnl.cnw.co.nz / CPU_Usage (5 min avg Load=0%;85;95;0;100)
2012-10-25 15:43:55 [32332] [1] Found Performance Data for gri_acu-fp-10.acute.xglasgow.scot.nhs.uk / CPU_Usage (5 min avg Load=1%;85;95;0;100)
2012-10-25 15:43:55 [32332] [1] Found Performance Data for dev-esx.int.ascribe.com / VMware_Host_Current_Datastore_DEV2_Usage (DEV2-free=1058624503808B;3;1;0;1466462896128 DEV2=407838392320B;1466462896125;1466462896127;0;1466462896128)
2012-10-25 15:43:57 [32332] [0] *** TIMEOUT: Timeout after 5 secs. ***
2012-10-25 15:43:57 [32332] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-10-25 15:43:57 [32332] [0] *** TIMEOUT: Please check your npcd.cfg
2012-10-25 15:43:57 [32332] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1351139712.perfdata.service-PID-32332 deleted
2012-10-25 15:43:57 [32332] [0] *** Timeout while processing Host: "dev-esx.int.ascribe.com" Service: "VMware_Host_Current_Datastore_DEV2_Usage"
2012-10-25 15:43:57 [32332] [0] *** process_perfdata.pl terminated on signal ALRM
Code:
cd /usr/local/nagios/var/spool/perfdata
ls -f | wc -l
cd /usr/local/nagios/var/spool/xidpe
ls -f | wc -l
JulianFDRacing wrote:
[root@NagiosXI perfdata]# cd /usr/local/nagios/var/spool/perfdata
[root@NagiosXI perfdata]# ls -f | wc -l
4687
[root@NagiosXI perfdata]#
[root@NagiosXI perfdata]# cd /usr/local/nagios/var/spool/xidpe
[root@NagiosXI xidpe]# ls -f | wc -l
4
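A small note on reading those numbers: `ls -f` lists every entry including `.` and `..`, so the real counts are two lower than what `wc -l` reports (roughly 4685 files waiting in perfdata, and 2 in xidpe). If you want to track the backlog over time without that offset, something like the following would do. It is only a sketch; `count_spool_entries` is a hypothetical helper, and the demo runs against a temporary directory rather than the real spool path.

```python
import os
import tempfile


def count_spool_entries(directory):
    """Count entries in a directory; os.scandir never yields '.' or '..'."""
    with os.scandir(directory) as entries:
        return sum(1 for _ in entries)


# Demonstration on a throwaway directory with two fake spool files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("1351083622.perfdata.service", "1351083763.perfdata.service"):
        open(os.path.join(tmp, name), "w").close()
    print(count_spool_entries(tmp))  # prints 2; `ls -f | wc -l` would report 4
```

On the server you would call it as `count_spool_entries("/usr/local/nagios/var/spool/perfdata")`. Running it a few minutes apart shows whether the backlog is growing or draining.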