NLS and XI perfdata

benhank · Post by **benhank** » Thu Sep 06, 2018 1:22 pm

I want to create a dashboard in NLS that can graph incoming logs as well as the perfdata recorded by Nagios XI.
I have already implemented Grafana in my environment, but the current version of Grafana is incompatible with the version of the ELK stack that powers NLS.
By allowing NLS to graph perfdata, I get the best visibility into my environment for my end users.

It seems that my options would be:
1. Have Nagios, via the script that sends syslogs into NLS, also send the perfdata to NLS.
2. Through some API magic, have NLS read the perfdata directly from Nagios.
3. Something else.

Is what I am thinking possible, and if so, how can I make it happen?

scottwilkerson · Post by **scottwilkerson** » Fri Sep 07, 2018 9:06 am

There isn't an easy built-in way to do this, but sending check results and performance data to Log Server is on our internal to-do list.

To be done correctly, it really needs to be added to the perfdata processing subsystem as a component. I see no other way.

I suppose it could be possible to use the http_poller Logstash input:
https://www.elastic.co/guide/en/logstas ... oller.html

then poll these API endpoints:
objects/hoststatus
objects/servicestatus

But you would need to create a good grok filter that can parse the performance data field properly, which would be a challenge.
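To give a sense of what that parsing involves, here is a minimal Python sketch. It is not a grok filter and not anything shipped with XI or NLS. It assumes the perfdata string follows the usual plugin format `'label'=value[UOM];[warn];[crit];[min];[max]`, with items separated by spaces. The names `PERF_ITEM` and `parse_perfdata` are made up for illustration, and items with a non-numeric value (such as `U` for unknown) are simply skipped.

```python
import re

# One perfdata item: 'label'=value[UOM];[warn];[crit];[min];[max]
# Labels that contain spaces are wrapped in single quotes.
PERF_ITEM = re.compile(
    r"(?:'(?P<quoted>[^']+)'|(?P<bare>[^\s=']+))"
    r"=(?P<value>-?\d+(?:\.\d+)?)"
    r"(?P<uom>[^;\s\d]*)"
    r"(?:;(?P<warn>[^;\s]*))?"
    r"(?:;(?P<crit>[^;\s]*))?"
    r"(?:;(?P<min>[^;\s]*))?"
    r"(?:;(?P<max>[^;\s]*))?"
)


def parse_perfdata(perfdata):
    """Return one dict per metric found in a perfdata string."""
    metrics = []
    for match in PERF_ITEM.finditer(perfdata):
        metrics.append({
            "label": match.group("quoted") or match.group("bare"),
            "value": float(match.group("value")),
            "uom": match.group("uom"),
            # Thresholds stay as strings because they may be ranges like "10:20".
            # Empty or missing fields become None.
            "warn": match.group("warn") or None,
            "crit": match.group("crit") or None,
            "min": match.group("min") or None,
            "max": match.group("max") or None,
        })
    return metrics


if __name__ == "__main__":
    for metric in parse_perfdata("rta=0.512ms;100.000;500.000;0; pl=0%;20;60;;"):
        print(metric)
```

A grok pattern would have to cover the same cases (quoted labels, optional units, empty threshold fields), which is where most of the difficulty lies.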

Post by **mcapra** » Fri Sep 07, 2018 1:59 pm

A neat article:
https://www.elastic.co/blog/integrating ... h-logstash

Unfortunately:

Note that this input plugin requires Logstash 6.2.3 at minimum.

scottwilkerson · Post by **scottwilkerson** » Fri Sep 07, 2018 2:15 pm

mcapra wrote:A neat article:
https://www.elastic.co/blog/integrating ... h-logstash

Unfortunately:
Note that this input plugin requires Logstash 6.2.3 at minimum.

Well, yeah, it is kind of neat, but who would want to manage all of their checks that way? The configuration management needed to configure Nagios checks would be a nightmare. The article even mentions it:

This use case begs the question, what if I want to programatically add thousands or more checks into Logstash?

benhank · Post by **benhank** » Fri Sep 07, 2018 4:12 pm

For the following reason:
Users start to report that they can't work remotely; after connecting to the network via the VPN, the "internet" is slooooooowwwww.
Network admins are perplexed: the datacenter is not showing anything being down in Nagios.
Now the network is so slow no one can even log in.
Verizon is called, but Verizon is saying everything is OK on their end. Coffee is drunk, pills are popped (headache meds, we hope), phones are called.
The big wallets never sprung for the development cycles for NNA, so that's out.

Benhank says waitaminute! and cracks open an ice cold 12 pack of NLS to track syslogs for the network gear being affected. Because he has extensive training in creating queries and regex technomagical miracles (but in real life he hasn't, and will be making a post in the future about it to you guys, heh heh), he quickly creates a dashboard that shows the throughput for the slow, flaccid, drooping network devices and BAM! There it is! All golden and shiny and sparkling: a dashboard that clearly shows when the drop in throughput occurred, extrapolated from Nagios perfdata, and a graph of the syslog errors on said devices at the time of the slowdown.
Presenting the info to the quivering, red-eyed big wallet boys in charge, he helps them make their case to Verizon or, on a darker side, HR (aka Heads will Roll), pointing out what happened and who is responsible for the resolution to the problem.
It boils down to "it's better to have it and not need it than to need it, really badly, right now, idontknowwhatimgonnatellmyboss, and not have it."

scottwilkerson · Post by **scottwilkerson** » Fri Sep 07, 2018 4:59 pm

lol

yep

benhank · Post by **benhank** » Mon Sep 10, 2018 7:31 am

So, fellas, can this be done with the current version of NLS? =D

Post by **mcapra** » Mon Sep 10, 2018 8:42 am

You could maybe rig up something like a Logstash HTTP poller to hit the various Nagios XI API perfdata endpoints, but I haven't looked at the API docs in a while. What you'd really want is an API endpoint that exposes the "last measurement" for each of your checks. I'd think scraping that every 5/10/30 minutes would be valuable by itself. Combine it with a simple filter to break up each XI check into an individual message.
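As a rough illustration of the "break up each check into an individual message" step, here is a small Python sketch. It assumes the poller has already fetched and decoded a list of service status records, and that each record carries `host_name`, `name`, and `perfdata` fields. Those field names, the `split_checks` helper, and the sample records are all assumptions for illustration, so check them against what your XI API actually returns.

```python
import json


def split_checks(records):
    """Turn a list of status records into one JSON message per check."""
    messages = []
    for record in records:
        perfdata = (record.get("perfdata") or "").strip()
        if not perfdata:
            continue  # a check without perfdata has nothing to graph
        messages.append(json.dumps({
            "host": record.get("host_name"),    # assumed field name
            "service": record.get("name"),      # assumed field name
            "perfdata": perfdata,
        }, sort_keys=True))
    return messages


# Hypothetical sample standing in for a decoded API response.
sample = [
    {"host_name": "core-sw1", "name": "Port 1 Bandwidth",
     "perfdata": "in=12.5Mb/s out=3.1Mb/s"},
    {"host_name": "core-sw1", "name": "Ping", "perfdata": ""},
]

if __name__ == "__main__":
    for line in split_checks(sample):
        print(line)
```

Each output line is a standalone JSON document, so it could be handed to a JSON-aware input on the Log Server side. Inside Logstash itself, a split filter on the polled response would play the same role.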

Here's a custom API endpoint I rigged up a while ago:
https://support.nagios.com/forum/viewto ... 93#p214393

All of that still sounds reasonably expensive (in time and performance) to me, though. The most performant solution I can think of would be some sort of custom perfdata handler, which is unlikely to cooperate well with upgrades.

Depending on the database chosen for the perfdata rework scheduled for XI 6, this could all become much easier in a year or so.

scottwilkerson wrote:who would want to manage all of their checks that way

¯\_(ツ)_/¯

It's totally, absolutely not applicable to this particular situation, but if you already had your check definitions held by Puppet/Ansible/Chef/etc., achieving parity between the two systems doesn't sound *toooo* terrible?

scottwilkerson · Post by **scottwilkerson** » Mon Sep 10, 2018 11:36 am

This would be possible, but you would still need to figure out some type of grok filter to extract the perfdata.

benhank · Post by **benhank** » Mon Sep 10, 2018 12:52 pm

That's what I thought as well. I wish there was an .rrd plugin or something, but I can't find one. Lord knows I don't wanna bring Graphite into my environment...
You can lock this for now, but if there was any way to make this a feature request or something ....
