service-perfdata.out got increasing huge size

grayloglearn · Post by **grayloglearn** » Mon Sep 09, 2019 3:44 am

Hi team,

We are seeing an issue where the service-perfdata.out file keeps growing and is using too much disk space. Could you please suggest how to reduce its size?
PNP4Nagios is also configured on our Nagios Core server.

[root@1 nagios]# cd var/
[root@ var]# ls
archives nagios.configtest nagios.log objects.precache rw spool
host-perfdata.out nagios.lock objects.cache retention.dat service-perfdata.out status.dat
[root@1 var]# du -sh *
0 archives
35G host-perfdata.out
4.0K nagios.configtest
4.0K nagios.lock
421M nagios.log
2.7M objects.cache
2.7M objects.precache
4.2M retention.dat
40K rw
598G service-perfdata.out
8.0K spool
4.2M status.dat
[root@1 var]# pwd
/usr/local/nagios/var

benjaminsmith · Post by **benjaminsmith** » Mon Sep 09, 2019 3:12 pm

Hello @grayloglearn,

How long has this been happening and what is the status of npcd?

Code: Select all

systemctl status npcd.service
# or
service npcd status

If it's not running, you can restart it, but with that many files backed up you'll need to remove them; otherwise the server will not be able to process the backlog and keep up with incoming files.

It's also possible that the load on the server is too high, hitting the max threshold settings causing the files to spool up.
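
To see whether perfdata files are piling up, one could count what is waiting in the spool directory. This is a minimal sketch, assuming the PNP4Nagios spool is at /usr/local/pnp4nagios/var/spool (the path that appears in the command definitions posted later in this thread). The function name count_spool_files is made up for illustration.

```shell
#!/bin/sh
# count_spool_files DIR
# Hypothetical helper: print the number of regular files under DIR.
count_spool_files() {
    # tr strips the padding that some wc implementations add around the number.
    find "$1" -type f | wc -l | tr -d ' '
}

# Example (path is an assumption taken from the thread):
# count_spool_files /usr/local/pnp4nagios/var/spool
```

A count that keeps climbing between runs would suggest that npcd is not keeping up with incoming files.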

grayloglearn · Post by **grayloglearn** » Mon Sep 09, 2019 9:00 pm

Hi team,
We have restarted the npcd service, but we still see the same disk usage.

[root@XXXX var]# tail +n 50 service-perfdata.out
tail: cannot open ‘+n’ for reading: No such file or directory
tail: cannot open ‘50’ for reading: No such file or directory
==> service-perfdata.out <==

service-perfdata.out is now 599G. We want to check with you how to trim the data in this file. When we look at the first 50 lines of service-perfdata.out we see only today's entries, so we do not understand how the file became so large.
1568080719 app_fraepkdb fs_/oracle/stage OK 1 HARD 0.000 0.740 OK - 0.0% used (0.00 of 9.8 GB), (levels at 80.0/90.0%), trend: 0.00B / 24 hours /oracle/stage=1.8515625MB;8000;9000;0;10000.0
1568080719 app_fraepkdb lparstat OK 1 HARD 0.000 0.742 OK: AIX lparstat, user=2.6% sys=2.2% wait=0.1% idle=95.1% physc=0.02 app=1246867123 user=2.6%;;;; sys=2.2%;;;; wait=0.1%;;;; idle=95.1%;;;; physc=0.02;;;; entc=10.5%;;;; lbusy=0.4;;;; app=1246867123;;;;
1568080719 de-ffm-iproxy02 Number of threads OK 1 HARD 0.000 0.727 OK - 224 threads threads=224;2000.0;4000.0;0;
1568080719 app_fraepkdb fs_/tmp OK 1 HARD 0.000 0.740 OK - 25.3% used (2.03 of 8.0 GB), (levels at 80.0/90.0%), trend: +907.81B / 24 hours /tmp=2074.5703125MB;6553;7372;0;8192.0
1568080719 app_fraepkdb fs_/var OK 1 HARD 0.000 0.742 OK - 70.2% used (0.88 of 1.2 GB), (levels at 80.0/90.0%), trend: +115.89KB / 24 hours /var=898.44921875MB;1024;1152;0;1280.0
1568080719 app_fraepkdb Check_MK OK 1 HARD 0.093 0.000 OK - Agent version 1.1.10, execution time 0.1 sec execution_time=0.064
1568080719 htpmsdbtest session-usage OK 1 HARD 0.136 0.000 OK - 11.69% of session resources usedsession_usage=11.69%;80;100
1568080719 app_fraepkdb fs_/usr OK 1 HARD 0.000 0.742 OK - 79.7% used (5.53 of 6.9 GB), (levels at 80.0/90.0%), trend: +544.53KB / 24 hours /usr=5661.95703125MB;5683;6393;0;7104.0
1568080719 de-ffm-iproxy02 fs_/usr OK 1 HARD 0.000 0.732 OK - 25.9% used (2.52 of 9.7 GB), (levels at 80.0/90.0%), trend: +5.08B / 24 hours /usr=2581.40625MB;7961;8956;0;9951.3046875
1568080719 PROVISDBPROD process-usage OK 1 HARD 0.140 0.000 OK - 49.33% of process resources usedprocess_usage=49.33%;80;100

Please suggest how we can resolve this.

scottwilkerson · Post by **scottwilkerson** » Tue Sep 10, 2019 6:50 am

You can truncate it by running

Code: Select all

cat /dev/null > service-perfdata.out

But finding out why it isn't getting reaped by your PNP installation is a different matter.
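
For reference, the same truncation can be wrapped in a small helper. This is only a sketch, and the function name truncate_in_place is hypothetical. Emptying the file in place, rather than deleting and recreating it, leaves the owner and permissions untouched, so the nagios user can keep appending to it.

```shell
#!/bin/sh
# truncate_in_place FILE
# Hypothetical helper: empty FILE without removing it.
truncate_in_place() {
    # ':' is the shell no-op; the redirection alone truncates the file to zero bytes.
    : > "$1"
}

# Example (path taken from the thread):
# truncate_in_place /usr/local/nagios/var/service-perfdata.out
```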

Can you show the output of

Code: Select all

grep perfdata /usr/local/nagios/etc/nagios.cfg
ls -al /usr/local/nagios/var|grep perfdata

Do you get graphs in PNP4Nagios? Is it set up correctly to read these files?

Have you considered Nagios XI, which has graphing already set up properly?
https://www.nagios.com/products/nagios-xi/

grayloglearn · Post by **grayloglearn** » Fri Sep 13, 2019 2:08 am

Hi Team,

Thanks for the reply. We need some help: we have observed that the file contains data going back to Aug 2016. Could you please suggest a command to remove the data from Aug 2016 to Dec 2016?

scottwilkerson · Post by **scottwilkerson** » Fri Sep 13, 2019 6:42 am

I don't even know if this file is being used by any systems. With it being 598G I doubt it, but if you want to just keep the last xxxx lines you could run something like this

Code: Select all

tail -n xxxx service-perfdata.out > service-perfdata.out_new
mv service-perfdata.out_new service-perfdata.out
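
A hedged variant of the same idea as a reusable function. The name keep_last_lines is hypothetical. It copies the kept lines back into the original file instead of using mv, on the assumption that you want the file to stay owned by the nagios user when the command is run as root. Lines appended by Nagios while the copy is in progress may be lost, and the temporary file needs enough free disk space to hold the retained lines.

```shell
#!/bin/sh
# keep_last_lines FILE COUNT
# Hypothetical helper: keep only the last COUNT lines of FILE.
keep_last_lines() {
    file=$1
    count=$2
    tmp="${file}_new"
    # Write the tail to a temporary file first; stop if that fails (e.g. disk full).
    tail -n "$count" "$file" > "$tmp" || return 1
    # Copy the contents back rather than renaming, so the original file
    # keeps its owner and permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example (the line count is an arbitrary placeholder):
# keep_last_lines /usr/local/nagios/var/service-perfdata.out 100000
```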

grayloglearn · Post by **grayloglearn** » Wed Sep 18, 2019 2:08 am

I want to provide some details; I hope they will help you figure out the problem.

# 'process-host-perfdata' command definition
define command{
command_name process-host-perfdata
command_line /usr/bin/printf "%b" "$LASTHOSTCHECK$\t$HOSTNAME$\t$HOSTSTATE$\t$HOSTATTEMPT$\t$HOSTSTATETYPE$\t$HOSTEXECUTIONTIME$\t$HOSTOUTPUT$\t$HOSTPERFDATA$\n" >> /usr/local/nagios/var/host-perfdata.out
}

# 'process-service-perfdata' command definition
define command{
command_name process-service-perfdata
command_line /usr/bin/printf "%b" "$LASTSERVICECHECK$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICESTATE$\t$SERVICEATTEMPT$\t$SERVICESTATETYPE$\t$SERVICEEXECUTIONTIME$\t$SERVICELATENCY$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$\n" >> /usr/local/nagios/var/service-perfdata.out
}

#
# Bulk with NPCD mode
#
define command {
command_name process-service-perfdata-file
command_line /bin/mv /usr/local/pnp4nagios/var/service-perfdata /usr/local/pnp4nagios/var/spool/service-perfdata.$TIMET$
}
define command {
command_name process-host-perfdata-file
command_line /bin/mv /usr/local/pnp4nagios/var/host-perfdata /usr/local/pnp4nagios/var/spool/host-perfdata.$TIMET$
}

The file highlighted above in red/bold (service-perfdata.out) is now about 600GB. We don't know how to resolve this.
We plan to delete the 2016 data from service-perfdata.out and then check how it behaves. For this we need a command that removes the data from Jan 2016 to Dec 2016. Could you please help with such a command? We have tried but could not find the right one.

You asked for some details earlier that we did not provide; we are providing them now. Please check.

[root@XXXXX ~]# grep perfdata /usr/local/nagios/etc/nagios.cfg
perfdata_timeout=5
# host_perfdata_command (defined below) and service performance
# data will be processed using the service_perfdata_command (also
host_perfdata_command=process-host-perfdata
service_perfdata_command=process-service-perfdata
#host_perfdata_file=/usr/local/nagios/var/host-perfdata
#service_perfdata_file=/usr/local/nagios/var/service-perfdata
#host_perfdata_file_template=[HOSTPERFDATA]\t$TIMET$\t$HOSTNAME$\t$HOSTEXECUTIONTIME$\t$HOSTOUTPUT$\t$HOSTPERFDATA$
#service_perfdata_file_template=[SERVICEPERFDATA]\t$TIMET$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICEEXECUTIONTIME$\t$SERVICELATENCY$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$
#host_perfdata_file_mode=a
#service_perfdata_file_mode=a
#host_perfdata_file_processing_interval=0
#service_perfdata_file_processing_interval=0
#host_perfdata_file_processing_command=process-host-perfdata-file
#service_perfdata_file_processing_command=process-service-perfdata-file
# These options determine wether the core will process empty perfdata
# If you don't require empty perfdata - saving some cpu cycles
#host_perfdata_process_empty_results=1
#service_perfdata_process_empty_results=1
service_perfdata_file=/usr/local/pnp4nagios/var/service-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file
host_perfdata_file=/usr/local/pnp4nagios/var/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file

[root@XXXX~]# ls -al /usr/local/nagios/var|grep perfdata
-rw-r--r-- 1 nagios nagios 37084081895 Sep 18 09:12 host-perfdata.out
-rw-r--r-- 1 nagios nagios 647629058601 Sep 18 09:12 service-perfdata.out

scottwilkerson · Post by **scottwilkerson** » Wed Sep 18, 2019 6:18 am

Do you have something on your system that is doing anything with the files you are appending to?
/usr/local/nagios/var/host-perfdata.out
/usr/local/nagios/var/service-perfdata.out
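
Going by the configuration posted above, these two .out files are written by the per-check commands process-host-perfdata and process-service-perfdata, which append one line per check result, while the PNP4Nagios bulk settings work on separate files under /usr/local/pnp4nagios/var. Below is a small sketch for listing the per-check perfdata commands that are active in nagios.cfg. The function name active_perfdata_commands is hypothetical. Whether those commands can safely be disabled depends on whether anything else reads the .out files, which this thread does not establish.

```shell
#!/bin/sh
# active_perfdata_commands CFG
# Hypothetical helper: print the uncommented host_perfdata_command and
# service_perfdata_command lines from a nagios.cfg-style file.
active_perfdata_commands() {
    # Anchoring at the start of the line skips commented-out entries, and the
    # exact directive names exclude the *_file_processing_command settings.
    grep -E '^[[:space:]]*(host|service)_perfdata_command=' "$1"
}

# Example (path taken from the thread):
# active_perfdata_commands /usr/local/nagios/etc/nagios.cfg
```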

grayloglearn · Post by **grayloglearn** » Wed Sep 18, 2019 11:49 pm

We were also surprised when we saw the size of service-perfdata.out. We are really not sure why the file keeps growing.
We are not aware of anything else appending data to it either.
When we checked the timestamps in the file, we could see that it contains data going back to 2016, so we decided to remove the 2016 data to free up some space.

Could you give us a command that removes only the Jan 2016 to Dec 2016 data from service-perfdata.out?

scottwilkerson · Post by **scottwilkerson** » Thu Sep 19, 2019 6:35 am

I don't have a command to do that, but you would somehow need to process the file and remove the lines where the first field is less than 1483189199.
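
One way this could be done is with awk, filtering on the first field. This is a minimal sketch, not a tested procedure for a 600GB file. The function name drop_older_than is hypothetical, and the cutoff is simply the epoch value quoted above. The filtered copy needs enough free disk space to hold every retained line.

```shell
#!/bin/sh
# drop_older_than CUTOFF SRC DST
# Hypothetical helper: copy to DST only the lines of SRC whose first
# whitespace-separated field (the epoch timestamp) is >= CUTOFF.
drop_older_than() {
    cutoff=$1
    src=$2
    dst=$3
    # $1 + 0 forces a numeric comparison; lines whose first field is not a
    # number evaluate to 0 and are dropped along with the old entries.
    awk -v cutoff="$cutoff" '($1 + 0) >= cutoff' "$src" > "$dst"
}

# Example (paths taken from the thread, output name is an assumption):
# drop_older_than 1483189199 service-perfdata.out service-perfdata.out_new
```

After checking the new file, it could replace the original in the same way as the tail example earlier in the thread.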

Nagios Support Forum
