Re: [Nagios-devel] Test Please: Buffer Slots Variable CVS Code


Ton Voon wrote:
>
> On 22 Dec 2006, at 01:50, Ethan Galstad wrote:
>
>> Based on the recent thread about hanging Nagios processes, I have
>> moved the COMMAND_BUFFER_SLOTS and SERVICE_BUFFER_SLOTS definitions
>> out to config file variables:
>>
>> external_command_buffer_slots=4096
>> check_result_buffer_slots=4096
>>
>> I have also updated nagiostats to report the avail/used number of slots
>> for graphing in MRTG. Could folks try out the latest 2.x CVS code and
>> give it some testing?
>
> Ethan,
>
> Thanks for applying to CVS. Several comments:
>
> - external_command_buffer_slots and check_result_buffer_slots only need
> to be ints, as the circular_buffer struct only uses an int for items
>
> - in xsddefault.c, when you print out external_command_buffer.items, I
> think this is not thread-safe. My thread knowledge is pretty limited, so
> please correct me if I am wrong. The main nagios process writes the
> status data via xsddefault_save_status_data, which needs to read the
> external_command_buffer variable. However, this variable is written to
> by the command_file_worker_thread. So I think the
> xsddefault_save_status_data routine needs a thread lock on
> external_command_buffer before it can read the items data, otherwise
> there is the potential for corrupt data. Note, there is a cost to that,
> especially if the status data is being written with
> aggregate_status_updates = 0.
>
> - your output to status.dat is different from mine. You are outputting
> max_external_command_buffer_slots (the value defined in nagios.cfg) and
> used_external_command_buffer_slots (the current number of items in the
> buffer). In my patch, I had a different definition:
> max_command_buffer_items meant the "maximum number of items that has
> been in the buffer".
>
> (I would prefer used_external_command_buffer_slots be changed to
> current_external_command_buffer_slots because it more accurately
> describes "this is the number I have now".)
>
> From now on, I'll call it high_external_command_buffer_items, as it can
> also be the "high water mark of the number of items in the buffer". This
> is a useful statistic as it tells you what the
> max_external_command_buffer_slots should be to get no holdups.
>
> Also, it probably makes sense to put the high water mark within the
> circular_buffer struct.
>
> Please find a patch attached with these changes.
>
> On my small test system, used_check_result_buffer_slots is usually
> 0. When I introduce 1 fake slave (128 results per 10 seconds),
> used_check_result_buffer_slots fluctuates between 0 and the 20s or 30s.
> Introducing a 2nd fake slave moves the high mark up to the 100s. A 3rd
> slave moves the high mark to 192.
>
> If I introduce NDO into the system, I get a large iowait time (in the
> 80% range), presumably from database writes. The status file is not
> updated as regularly (one instance of 60 seconds between writes), but
> when it is, the high_* values jump up to the 200-300s. This is a poorly
> configured database, so I'm guessing that there are delays due to the
> main nagios process passing data to the broker module.
>
> At the moment with 2 slaves sending 128 packets per 10 seconds, I'm
> getting high values of 983 for external commands and 1405 for check results.
>
> I think these recent changes help with seeing if there are bottlenecks
> at the reading of the command pipe, but I think there are possibly other
> slowdowns further down the chain (which Nagios 3 may aid with).
>
> Ton


Good suggestions. I have applied almost identical patches to CVS based
on your comments. Thanks!
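
The patches themselves are not reproduced in this post. For anyone following the thread, here is a minimal sketch of the two ideas discussed above: keeping a high water mark inside the buffer struct, and taking a lock before the status writer reads the counters that the worker thread updates. It is not the Nagios source. The struct cb_sketch, its field names, and the cb_* functions are all hypothetical stand-ins for the circular_buffer struct mentioned in the quoted message, and the locking scheme is only one plausible way to do it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a circular buffer; names are illustrative only. */
typedef struct {
    void **buffer;          /* slot storage */
    int slots;              /* capacity, e.g. the configured number of slots */
    int head;               /* index of the next slot to write */
    int tail;               /* index of the next slot to read */
    int items;              /* current number of queued items */
    int high;               /* high water mark: most items ever queued */
    pthread_mutex_t lock;   /* guards every field above */
} cb_sketch;

/* Allocate the slots and initialise the lock. Returns 0 on success, -1 on error. */
int cb_init(cb_sketch *cb, int slots)
{
    assert(cb != NULL);
    if (slots <= 0)
        return -1;
    cb->buffer = calloc((size_t)slots, sizeof(void *));
    if (cb->buffer == NULL)
        return -1;
    cb->slots = slots;
    cb->head = cb->tail = cb->items = cb->high = 0;
    if (pthread_mutex_init(&cb->lock, NULL) != 0) {
        free(cb->buffer);
        cb->buffer = NULL;
        return -1;
    }
    return 0;
}

/* Writer side (e.g. a worker thread). Returns 0 on success, -1 if the buffer is full. */
int cb_push(cb_sketch *cb, void *item)
{
    int rc = -1;
    pthread_mutex_lock(&cb->lock);
    if (cb->items < cb->slots) {
        cb->buffer[cb->head] = item;
        cb->head = (cb->head + 1) % cb->slots;
        cb->items++;
        if (cb->items > cb->high)
            cb->high = cb->items;   /* record the new high water mark */
        rc = 0;
    }
    pthread_mutex_unlock(&cb->lock);
    return rc;
}

/* Consumer side. Returns the oldest item, or NULL if the buffer is empty. */
void *cb_pop(cb_sketch *cb)
{
    void *item = NULL;
    pthread_mutex_lock(&cb->lock);
    if (cb->items > 0) {
        item = cb->buffer[cb->tail];
        cb->tail = (cb->tail + 1) % cb->slots;
        cb->items--;
    }
    pthread_mutex_unlock(&cb->lock);
    return item;
}

/* Status-writer side: copy both counters while holding the lock, so the
 * pair is consistent even if another thread is pushing at the same time. */
void cb_snapshot(cb_sketch *cb, int *current, int *high)
{
    pthread_mutex_lock(&cb->lock);
    *current = cb->items;
    *high = cb->high;
    pthread_mutex_unlock(&cb->lock);
}

/* Release the slot storage and the lock. */
void cb_destroy(cb_sketch *cb)
{
    pthread_mutex_destroy(&cb->lock);
    free(cb->buffer);
    cb->buffer = NULL;
}
```

In this sketch the status writer would call cb_snapshot() once per status update and print the two copied values, so the lock is held only for two integer reads. That keeps the locking cost small even when status data is written frequently, which was the concern raised about aggregate_status_updates = 0.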



Ethan Galstad,
Nagios Developer
---
Email: nagios@nagios.org
Website: http://www.nagios.org





This post was automatically imported from historical nagios-devel mailing list archives
Original poster: nagios@nagios.org