Mod gearman process died

elinagios
Posts: 146
Joined: Thu Feb 16, 2017 3:45 am

Mod gearman process died

Post by elinagios »

Hello

Nagios XI 5.7.3
CentOS 7
mod_gearman 3

I accidentally stumbled on my Nagios load graph today, and it showed over 2 hours of doing nothing. When I started to look into it, I discovered the mod_gearman process dying and then starting again 2 hours later.

From nagios event log:
Information 2020-11-12 15:14:42 Event broker module '/usr/lib64/mod_gearman/mod_gearman_nagios4.o' initialized successfully.
Information 2020-11-12 15:14:42 mod_gearman: initialized version 3.0.7 (libgearman 0.33)
Information 2020-11-12 15:14:42 Event broker module '/usr/local/nagios/bin/ndo.so' initialized successfully.
Process Information 2020-11-12 13:01:33 Caught SIGSEGV, shutting down...
Information 2020-11-12 12:59:50 Successfully launched command file worker with pid 68617
Runtime Warning 2020-11-12 12:59:48 WARNING: RLIMIT_NPROC is 127602, total max estimated processes is 172158! You should increase your limits (ulimit -u, or limits.conf)
Information 2020-11-12 12:59:50 Successfully launched command file worker with pid 68617

Information 2020-11-12 12:59:47 Event broker module '/usr/lib64/mod_gearman/mod_gearman_nagios4.o' initialized successfully.
Information 2020-11-12 12:59:47 mod_gearman: initialized version 3.0.7 (libgearman 0.33)
Information 2020-11-12 12:59:47 Event broker module '/usr/local/nagios/bin/ndo.so' initialized successfully.
Process Information 2020-11-12 12:59:41 Successfully shutdown... (PID=92225)
Process Information 2020-11-12 12:59:41 Caught SIGTERM, shutting down...


It seems the SIGTERMs are happening regularly and the process starts right back up after each one, but after the SIGSEGV it stayed down for 2 hours.

Any suggestions?
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Mod gearman process died

Post by tgriep »

Can you restart the nagios process and let it run for about 10 minutes?

Then get the /usr/local/nagios/var/nagios.log file and attach it to your post.

Also, I will need you to run the following commands as root and post the output to the ticket.

ps -ef --cols=300
yum list installed |grep gear
One thing I do see in the data you provided is that the time looks like it went backwards by an hour and 45 minutes. Make sure the system's time is stable and that it is synchronized with a reliable time source.
Time changes could be causing the issue you are having.
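To tell a real outage apart from a clock problem, it can help to look for large gaps between consecutive entries in the log itself. The following is only a rough sketch: `find_log_gaps` is a hypothetical helper name, not part of Nagios or mod_gearman, and it assumes the log lines begin with an epoch timestamp in square brackets, as in a standard nagios.log. The 600-second default threshold is an arbitrary choice.

```shell
# Hypothetical helper (not shipped with Nagios): print every gap longer than
# a threshold, in seconds, between consecutive log entries.
# Assumes each entry starts with "[epoch] ", e.g. "[1605186093] Caught SIGSEGV".
find_log_gaps() {
    local logfile="$1"
    local threshold="${2:-600}"   # default threshold of 10 minutes is an assumption
    awk -v limit="$threshold" '
        # Only consider lines that start with a bracketed numeric timestamp.
        match($0, /^\[[0-9]+\]/) {
            ts = substr($0, 2, RLENGTH - 2) + 0
            if (prev && ts - prev > limit)
                printf "gap of %d s ending at %d (previous entry at %d)\n", ts - prev, ts, prev
            prev = ts
        }
    ' "$logfile"
}
```

For example, `find_log_gaps /usr/local/nagios/var/nagios.log 600` would list any period longer than 10 minutes with no log entries. On a system with GNU date, an epoch value from the output can be converted to local time with `date -d @EPOCH`.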
Be sure to check out our Knowledgebase for helpful articles and solutions!
elinagios
Posts: 146
Joined: Thu Feb 16, 2017 3:45 am

Re: Mod gearman process died

Post by elinagios »

The time is correct; it is synced from 2 different NTP servers. The logs were taken from the event log in the Nagios Home view. The timestamps jump because the process died and nothing happened until it restarted itself.
Luckily the problem hasn't occurred again, and I have scheduled a CentOS upgrade (7.9 came out recently) along with a Nagios XI upgrade to the latest version.
Hopefully those solve the issue and it will not occur again.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Mod gearman process died

Post by tgriep »

Thanks for the update. If you have any further questions, post them here and we'll get back to you.