Nagios graphing - what to use?

veehexx
Posts: 43
Joined: Mon Jan 09, 2017 9:17 am

Nagios graphing - what to use?

Post by veehexx »

Until this week we've been happily running Nagios Core to monitor our systems. Graphing has been on my to-do list for ages, but after downtime earlier this week, when excessive disk space consumption took out our SAN (~850GB in 12 hours, which is about 50% of our entire main fileserver), we need to add additional monitoring and trending. While I have rough estimates of usage over time, it would be good to visualise it and report on it when it meets certain criteria.

So, what I'm thinking of is something that can handle monitoring along the lines of '15% change of disk over a 3 hour period = alert'.
It's possible I'm trying to reinvent the wheel and there's a solution already in place that I haven't found. It seems like one of those features that could have existed for a long time, but well hidden.

We've looked at Nagios XI, and at tying it into other graphing products like Graphite or Cacti, but we have next to no experience with these products, so we don't know if they will suit our needs.

As you can tell, we have no intention of moving away from Nagios but would like to extend its abilities beyond what we have now...

Any help/advice would be appreciated - thanks!
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Nagios graphing - what to use?

Post by mcapra »

Nagios Core/XI aren't super duper geared towards time-series monitoring. Nagios XI can do the collection/analysis out of the box, but in terms of alerting on trends, there isn't much.

For simple graphing and recording time-series data, PNP4Nagios is pretty popular.

I can think of a few ways to go about this in Nagios XI that all involve custom plugins, but if I were to be tackling this problem from a Nagios Core perspective I'd just feed the perfdata from every check into something like Prometheus by leveraging an existing nagios perfdata exporter. You would need to know some Prometheus things to get the ball rolling, though.

Though if you work under the assumption that the perfdata is being collected in a consistent predictable fashion (like it is in Nagios XI or PNP4Nagios), a generic plugin could probably be written to read a given perfdata file over X hours and do some math.
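
As a rough illustration of that last idea, here is a minimal Python sketch of such a plugin. It is not an existing Nagios plugin. It assumes the disk-usage samples have already been extracted from the perfdata into a simple `epoch_seconds<TAB>percent_used` text file (a hypothetical format; real perfdata file layouts depend on your templates), and it treats the "15% over 3 hours" rule as a change in percentage points of capacity. The function names and the 10% warning threshold are made up for the example; only the 0/1/2/3 exit codes follow the usual Nagios plugin convention.

```python
import sys
import time

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # usual Nagios plugin exit codes


def parse_samples(lines):
    """Parse 'epoch_seconds<TAB>percent_used' lines (hypothetical format)."""
    samples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        ts, pct = line.split("\t")
        samples.append((int(ts), float(pct)))
    return sorted(samples)


def change_over_window(samples, now, window_seconds):
    """Change in percent used between the oldest and newest samples that
    fall inside the window, or None if fewer than two samples are in it."""
    recent = [(ts, pct) for ts, pct in samples
              if now - window_seconds <= ts <= now]
    if len(recent) < 2:
        return None
    return recent[-1][1] - recent[0][1]


def check(samples, now, window_seconds=3 * 3600, warn=10.0, crit=15.0):
    """Return (exit_code, status_line) for the growth over the window."""
    delta = change_over_window(samples, now, window_seconds)
    if delta is None:
        return UNKNOWN, "UNKNOWN - not enough samples in window"
    hours = window_seconds / 3600
    msg = f"disk usage changed {delta:+.1f}% in {hours:g}h | delta_pct={delta:.1f}"
    if delta >= crit:
        return CRITICAL, "CRITICAL - " + msg
    if delta >= warn:
        return WARNING, "WARNING - " + msg
    return OK, "OK - " + msg


# Only runs when a samples file is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as fh:
        code, text = check(parse_samples(fh), now=int(time.time()))
    print(text)
    sys.exit(code)
```

Comparing only the first and last sample in the window keeps the math simple; a noisy volume might be better served by a fitted slope or an average over several samples.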
Former Nagios employee
https://www.mcapra.com/
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Nagios graphing - what to use?

Post by dwhitfield »

mcapra wrote: Though if you work under the assumption that the perfdata is being collected in a consistent predictable fashion (like it is in Nagios XI or PNP4Nagios), a generic plugin could probably be written to read a given perfdata file over X hours and do some math.
Not the only way, but possibly the easiest.


Apologies if this is too basic, and I certainly don't want to steer you away from XI, but I don't want to overlook the obvious. Why not just check more often, or lower the warning and critical thresholds?
veehexx
Posts: 43
Joined: Mon Jan 09, 2017 9:17 am

Re: Nagios graphing - what to use?

Post by veehexx »

dwhitfield wrote: Apologies if this is too basic, and I certainly don't want to steer you away from XI, but I don't want to overlook the obvious. Why not just check more often, or lower the warning and critical thresholds?
The problem actually occurred after hours; as mentioned, our SAN ran out of disk space. It triggered an SNMP alert at 90%, which I saw by email. It then chewed through another 5% (= 820GB) in 2-3 hours and caused the SAN to drop iSCSI connections when it hit 95% full. Those kinds of figures are unheard of for us, let alone in a 2-3 hour period. And yes, that means we still had 5%/820GB remaining free. Obviously, with an alert at 90%, it started at less than that. I can just about account for 400GB of data churn, but nowhere near the figures we've seen.
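
To put those figures in perspective, here is a back-of-the-envelope Python sketch that uses only the numbers above (5% = 820GB, consumed in 2-3 hours) and assumes growth stays linear. The function name is made up for the example.

```python
def hours_until_full(free_gb, consumed_gb, elapsed_hours):
    """Hours until free space runs out if growth stays linear."""
    rate = consumed_gb / elapsed_hours  # GB per hour
    if rate <= 0:
        return float("inf")
    return free_gb / rate


free_gb = 820.0      # the 5% still free at 95% full
consumed_gb = 820.0  # the 5% consumed between the 90% alert and 95%

# If 5% of the SAN is 820GB, total capacity is roughly 820 / 0.05 GB.
total_gb = free_gb / 0.05
print(f"implied total capacity: ~{total_gb:.0f} GB")

for elapsed in (2.0, 3.0):
    rate = consumed_gb / elapsed
    eta = hours_until_full(free_gb, consumed_gb, elapsed)
    print(f"{elapsed:g}h window: ~{rate:.0f} GB/h, ~{eta:.1f}h until full")
```

In other words, at that rate a 90% threshold alert leaves only a few hours of warning, which is why a rate-of-change check is appealing alongside fixed thresholds.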

Anyway, I've opted for XI. It's running on the 60-day trial VM image they've built, and so far it's going well. All our old Core monitors have been moved across.

Either way, moving to XI seems to be the way forward. With a SQL backend, it can now tie into various dedicated graphing tools if we find XI isn't quite what we're looking for. So far it's doing things well.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Nagios graphing - what to use?

Post by tmcdonald »

Great to hear it! If you need any help with XI, feel free to post in our Nagios XI forum and we'd be glad to assist.

Are we all good to consider this thread resolved?
Former Nagios employee