NLS and XI perfdata

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

NLS and XI perfdata

Post by benhank »

I want to create a dashboard in NLS that can graph incoming logs as well as the perfdata recorded by Nagios XI.
I have already implemented Grafana in my environment, but the current version of Grafana is incompatible with the version of the ELK stack that powers NLS.
By allowing NLS to graph perfdata, I can give my end users the best visibility into my environment.

It seems that my options would be to either:
1. Have Nagios, via the script that sends syslogs into NLS, also send the perfdata to NLS.
2. Through some API magic, have NLS read the perfdata directly from Nagios.
3. Something else.

Is what I am thinking possible, and if so, how can I make it happen?
Proudly running:
NagiosXI 5.4.12 2 node Prod Env 2500 hosts, 13,000 services
Nagiosxi 5.5.7(test env) 2500 hosts, 13,000 services
Nagios Logserver 2 node Prod Env 500 objects sending
Nagios Network Analyser
Nagios Fusion
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: NLS and XI perfdata

Post by scottwilkerson »

There isn't an easy built-in way to do this, but sending check results and performance data to Log Server is on our internal to-do list.

To be done correctly, it really needs to be added as a component in the perfdata processing subsystem; I see no other way.

I suppose it could be possible to use the http_poller Logstash input
https://www.elastic.co/guide/en/logstas ... oller.html

and then poll these APIs:
objects/hoststatus
objects/servicestatus

but you would need to create a good grok filter that can parse the performance data field properly, which would be a challenge.
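
For a sense of what that grok filter has to cope with, here is a minimal Python sketch of parsing a perfdata string, assuming it follows the standard Nagios plugin format 'label'=value[UOM];[warn];[crit];[min];[max] with items separated by spaces. The function name parse_perfdata and the output field names are illustrative only, not part of any Nagios or Logstash API, and the regex has not been checked against every plugin's output.

```python
import re
from typing import Dict, List, Optional

# One perfdata item: 'label'=value[UOM];[warn];[crit];[min];[max]
_PERF_ITEM = re.compile(
    r"(?:'(?P<quoted>[^']+)'|(?P<bare>[^\s=']+))"  # label, optionally single-quoted
    r"=(?P<value>-?\d*\.?\d+|U)"                   # number, or U when the value is unknown
    r"(?P<uom>[^\s;]*)"                            # unit of measurement, may be empty
    r"(?:;(?P<warn>[^\s;]*))?"                     # thresholds may be ranges, kept as text
    r"(?:;(?P<crit>[^\s;]*))?"
    r"(?:;(?P<min>[^\s;]*))?"
    r"(?:;(?P<max>[^\s;]*))?"
)


def _num(text: Optional[str]) -> Optional[float]:
    """Convert a numeric field to float; empty, missing, or 'U' becomes None."""
    try:
        return float(text) if text else None
    except ValueError:
        return None


def parse_perfdata(perfdata: str) -> List[Dict[str, object]]:
    """Split a raw perfdata string into one dict per metric."""
    items: List[Dict[str, object]] = []
    for m in _PERF_ITEM.finditer(perfdata):
        items.append({
            "label": m.group("quoted") or m.group("bare"),
            "value": _num(m.group("value")),
            "uom": m.group("uom"),
            "warn": m.group("warn") or None,
            "crit": m.group("crit") or None,
            "min": _num(m.group("min")),
            "max": _num(m.group("max")),
        })
    return items
```

Calling parse_perfdata("rta=0.5ms;100.0;500.0;0 pl=0%;20;60;;") yields two dicts, one for rta and one for pl. Quoted labels with spaces, empty threshold slots, and range-style thresholds are the parts that tend to make a single grok pattern awkward.
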
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: NLS and XI perfdata

Post by mcapra »

A neat article:
https://www.elastic.co/blog/integrating ... h-logstash

Unfortunately:
Note that this input plugin requires Logstash 6.2.3 at minimum.
Former Nagios employee
https://www.mcapra.com/
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: NLS and XI perfdata

Post by scottwilkerson »

mcapra wrote: A neat article:
https://www.elastic.co/blog/integrating ... h-logstash

Unfortunately:
Note that this input plugin requires Logstash 6.2.3 at minimum.
Well, yes, it is kind of neat, but who would want to manage all of their checks that way? The configuration management needed to configure Nagios checks would be a nightmare. The article even mentions it:
This use case begs the question, what if I want to programatically add thousands or more checks into Logstash?
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: NLS and XI perfdata

Post by benhank »

For the following reason:
Users start to report that they can't work remotely: after connecting to the network via the VPN, the "internet" is slooooooowwwww.
Network admins are perplexed; the datacenter is not showing anything as down in Nagios.
Now the network is so slow no one can even log in.
Verizon is called, but Verizon says everything is OK on their end. Coffee is drunk, pills are popped (headache meds, we hope), phones are called.
The big wallets never sprung for the development cycles for NNA, so that's out.

Benhank says waitaminute! and cracks open an ice-cold 12-pack of NLS to track syslogs for the network gear being affected. Because he has extensive training in creating queries and regex technomagical miracles (but in real life he hasn't, and will be making a post in the future about it to you guys, heh heh), he quickly creates a dashboard that shows the throughput for the slow, flaccid, drooping network devices and BAM! There it is! All golden and shiny and sparkling: a dashboard that clearly shows when the drop in throughput occurred, extrapolated from Nagios perfdata, and a graph of the syslog errors on said devices at the time of the slowdown.
Presenting the info to the quivering, red-eyed big wallet boys in charge, he helps them make their case to Verizon, or on the darker side, to HR (aka Heads will Roll), pointing out what happened and who is responsible for resolving the problem.
It boils down to "it's better to have it and not need it than to need it, really badly, right now, idontknowwhatimgonnatellmyboss, and not have it."
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: NLS and XI perfdata

Post by scottwilkerson »

lol

yep
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: NLS and XI perfdata

Post by benhank »

So fellas can this be done with the current version of NLS? =D
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: NLS and XI perfdata

Post by mcapra »

You could maybe rig up something like a Logstash HTTP poller to hit the various Nagios XI API perfdata endpoints, but I haven't looked at the API docs in a while. What you'd really want is an API endpoint that exposes the "last measurement" for each of your checks. I'd think scraping that every 5/10/30 minutes would be valuable by itself. Combine it with a simple filter to break up each XI check into an individual message.
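
A rough Python sketch of that "break up each XI check into an individual message" step, for illustration only. It assumes the objects/servicestatus response is JSON with a top-level "servicestatus" list, and that each record carries fields named host_name, service_description, current_state, last_check, and performance_data. Those key names are assumptions and should be checked against your own XI version's API output. The function names split_status and to_ndjson are made up for this example.

```python
import json
from typing import Any, Dict, Iterator


def split_status(payload: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat message per service record in an API response.

    The key names used here are assumptions about the response shape,
    not a documented contract.
    """
    records = payload.get("servicestatus", [])
    if isinstance(records, dict):
        # Guard for a lone record arriving as an object instead of a list.
        records = [records]
    for rec in records:
        yield {
            "type": "xi_perfdata",
            "host": rec.get("host_name"),
            "service": rec.get("service_description"),
            "state": rec.get("current_state"),
            "last_check": rec.get("last_check"),
            "perfdata": rec.get("performance_data", ""),
        }


def to_ndjson(payload: Dict[str, Any]) -> str:
    """Render the per-check messages as newline-delimited JSON."""
    return "\n".join(json.dumps(msg, sort_keys=True) for msg in split_status(payload))
```

Each output line could then be shipped to whatever JSON-capable input Log Server is listening on, with the raw perfdata string still needing its own parsing step.
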

Here's a custom API endpoint I rigged up a while ago:
https://support.nagios.com/forum/viewto ... 93#p214393

All of that still sounds reasonably expensive (in time and performance) to me though. The most performant solution I can think of would be some sort of custom perfdata handler which is unlikely to cooperate well with upgrades.

Depending on the database chosen for the perfdata rework scheduled for XI 6, this could all become much easier in a year or so.
scottwilkerson wrote: who would want to manage all of their checks that way
¯\_(ツ)_/¯

It's totally absolutely not applicable to this particular situation, but if you already had your check definitions held by Puppet/Ansible/Chef/etc achieving parity between the two systems doesn't sound *toooo* terrible?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: NLS and XI perfdata

Post by scottwilkerson »

This would be possible, but you would still need to figure out some type of grok filter to extract the perfdata.
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: NLS and XI perfdata

Post by benhank »

That's what I thought as well. I wish there was an .rrd plugin or something, but I can't find one. Lord knows I don't wanna bring Graphite into my environment...
You can lock this for now, but if there was any way to make this a feature request or something ....