CPU Usage when running reports/queries

This support forum board is for support questions relating to Nagios Network Analyzer, our network traffic and bandwidth analysis solution.
simon.keelan
Posts: 2
Joined: Thu Nov 21, 2013 8:57 am

CPU Usage when running reports/queries

Post by simon.keelan »

We have just started a trial of Network Analyzer (NA) in our environment on a VM. The VM is configured as follows: CentOS 6.x, 4 GB RAM, 2 vCPUs, 10 GB disk.

Only one source is configured in Network Analyzer, saving 24 hours of flow data, totalling ~1.8 GB of used disk for this single router. If the source is selected and simple reports or queries are run against the flow data, the CPU on the VM spikes to 100% and stays that way for 30 seconds to 1 minute. This is cause for concern, as we have only added one source and it is at this point a single-user install, since credentials have not yet been shared with other staff.

I'm certainly hopeful this is an anomaly and easily rectified through configuration. Otherwise it does not bode well for adding ~100 routers and multiple users to the system if reporting Top 5 on one source consumes all available CPU.

Is there anything I can take a look at as a starting point?
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: CPU Usage when running reports/queries

Post by abrist »

How fast is the disk? When running a report/query, run top and check the I/O wait (the wa value). Report back with your findings.
Do you see any issues in the Apache log:

tail -50 /var/log/httpd/error_log
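
If watching top interactively is awkward while a report is running, the same wa figure can be approximated from two samples of /proc/stat. This is only a rough sketch, not anything that ships with Network Analyzer: the function name iowait_pct is made up for illustration, and it assumes a Linux kernel that exposes the usual user/nice/system/idle/iowait/irq/softirq/steal columns.

```shell
# Compute the iowait share (percent) between two "cpu ..." lines from /proc/stat.
# Columns after the "cpu" label: user nice system idle iowait irq softirq steal ...
# iowait_pct is a hypothetical helper name, not part of any Nagios tool.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | LC_ALL=C awk '
        # Sum fields 2-9 (user through steal); guest time is already inside user.
        {
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            t[NR] = total
            w[NR] = $6
        }
        END {
            dt = t[2] - t[1]
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (w[2] - w[1]) / dt
        }'
}

# Live usage: sample the aggregate CPU line twice, one second apart.
if [ -r /proc/stat ]; then
    before=$(grep '^cpu ' /proc/stat)
    sleep 1
    after=$(grep '^cpu ' /proc/stat)
    echo "iowait over the last second: $(iowait_pct "$before" "$after")%"
fi
```

Running it in a loop while a report is loading gives a per-second trace that can be pasted into a reply.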
Former Nagios employee
simon.keelan
Posts: 2
Joined: Thu Nov 21, 2013 8:57 am

Re: CPU Usage when running reports/queries

Post by simon.keelan »

VM storage is on 600 GB 15K SAS.
Hosts: 256GB RAM and 2 x 8 core Dell R720s

iowait sits quite low. I've seen it spike up to 3% or so for only a second; otherwise it sits at or very near 0% throughout the high CPU utilization.

tail -50 /var/log/httpd/error_log produces the following:

[Sun Oct 19 03:08:02 2014] [notice] Digest: generating secret for digest authentication ...
[Sun Oct 19 03:08:02 2014] [notice] Digest: done
[Sun Oct 19 03:08:02 2014] [notice] Apache/2.2.15 (Unix) DAV/2 PHP/5.3.3 configured -- resuming normal operations
[Thu Oct 23 09:30:56 2014] [error] [client 10.245.8.26] File does not exist: /var/www/html/favicon.ico

Other than when a query/report is run, CPU sits at or near 0%. The offending process is nfdump, and it would seem that it is not multi-threaded: overall system CPU is at 50%, but a single vCPU is pegged at 100%. I should also note that it takes over 1 minute just to load the landing page for the one source I have added to the database. It takes about 15-20 seconds to display the bandwidth graph and about one minute to populate the 'Top 5' talkers lists at the bottom of the page. Overall host CPU/RAM resources are not taxed or overprovisioned, and we experience no performance issues on any of the other 6 VMs on the host.

I just now rebooted the server, and again nfdump is consuming 100% of one vCPU. I haven't tried loading the web interface at this point; I just bounced the server and am watching top.
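
For anyone wanting to confirm the "one vCPU pegged, 50% overall" pattern without toggling the per-CPU view in top, something like the following could work. It is a sketch under assumptions: per_core_busy is a hypothetical helper name, it relies on the Linux /proc/stat layout, and the ps -C option assumes the procps version of ps found on CentOS.

```shell
# Print per-core busy percent from two full snapshots of /proc/stat.
# per_core_busy is a hypothetical helper name used only for this illustration.
per_core_busy() {
    printf '%s\n---\n%s\n' "$1" "$2" | LC_ALL=C awk '
        $0 == "---" { second = 1; next }
        /^cpu[0-9]+ / {
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            idle = $5 + $6    # idle plus iowait both count as not busy
            if (!second) { t0[$1] = total; i0[$1] = idle; order[++n] = $1 }
            else         { t1[$1] = total; i1[$1] = idle }
        }
        END {
            for (k = 1; k <= n; k++) {
                c = order[k]
                dt = t1[c] - t0[c]
                busy = (dt > 0) ? 100 * (dt - (i1[c] - i0[c])) / dt : 0
                printf "%s %.0f\n", c, busy
            }
        }'
}

# Live usage: take two snapshots one second apart, then list nfdump processes.
if [ -r /proc/stat ]; then
    snap1=$(cat /proc/stat)
    sleep 1
    snap2=$(cat /proc/stat)
    per_core_busy "$snap1" "$snap2"
    # Show any running nfdump processes with CPU share and elapsed time.
    ps -C nfdump -o pid,pcpu,etime,args || echo "no nfdump process found"
fi
```

On a 2 vCPU guest, one core near 100 and the other near 0 would match a single-threaded process, which top reports as roughly 50% overall.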
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: CPU Usage when running reports/queries

Post by sreinhardt »

This certainly does not seem like normal load for an NNA server. What kind of device is pushing to NNA? Do you know roughly how much total traffic this device should be seeing, in terms of GB/s? I run several sources and store for 1 month with no real issues. You are correct that nfcapd is not threaded: one daemon is run per source, and they are only used for collecting flows and putting them into RRDs. Once that is completed, rrdtool and a mix of PHP and Python handle the rest. If nfcapd were overloaded, I could see it causing some issues like this, but it seems like a fair amount more wait than I would expect. Also, is the 15K disk/array local or accessed through a SAN?
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
matt.lilek
Posts: 137
Joined: Wed Aug 07, 2013 11:53 am

Re: CPU Usage when running reports/queries

Post by matt.lilek »

It's running off a DAS, and I would say it sees about 5 GB a minute on average. It is now consuming about 2.6 GB of disk for a 24-hour period.
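
As a back-of-the-envelope check on the figures in this thread (about 2.6 GB of flow data per 24 hours for one router, ~100 routers planned, a 10 GB disk), the arithmetic can be sketched as below. It assumes every router produces a similar volume and that storage grows linearly with sources and retention days, which may well not hold; project_disk_gb is just an illustrative name.

```shell
# Rough flow-storage projection: GB per source per day x sources x days retained.
# Assumes linear scaling and similar volume per router (an assumption, not measured).
# project_disk_gb is a hypothetical helper name.
project_disk_gb() {
    LC_ALL=C awk -v g="$1" -v s="$2" -v d="$3" 'BEGIN { printf "%.1f\n", g * s * d }'
}

project_disk_gb 2.6 1 1      # the single router today, 24 hours of data
project_disk_gb 2.6 100 1    # ~100 routers, 24 hours of data
project_disk_gb 2.6 100 30   # ~100 routers, one month of data
```

Even the 24-hour case for ~100 routers comes out well above the 10 GB disk on the trial VM, so disk sizing would need revisiting alongside the CPU question.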
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: CPU Usage when running reports/queries

Post by sreinhardt »

Just a note: we are looking at possible I/O issues with your storage and doing internal testing on our end with different sizes and types of flows to see where the issue may lie.