Hi Support Team,
I am using Nagios Core 4.4.5 on CentOS 7.7 and am currently monitoring 25 servers. Is there a way to find the current resource utilization of all 25 servers? For example: Server 1 -> CPU, memory and disk space utilization, and so on for the remaining servers. This is a capacity planning exercise in our organisation to find out whether the current infrastructure (25 servers) is sufficient for the next 6 months to a year.
Thanks in Advance and I look forward to hearing from you.
Best Regards,
Kaushal
CPU, Memory and Disk Space resource utilization
-
- Posts: 124
- Joined: Fri May 22, 2015 7:12 am
-
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
Re: CPU, Memory and Disk Space resource utilization
Nagios XI comes with a lot of this built in, as it comprises a lot of different technologies.
Most likely you'll extrapolate that data from performance data. I would suggest you look at implementing InfluxDB/Grafana. Here's some documentation on how to do that:
https://support.nagios.com/kb/article/n ... u-802.html
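For context, performance data is the part of a plugin's output after the `|` character, written as `label=value[UOM];warn;crit;min;max`. Below is a minimal Python sketch of pulling the values out of one such string. It assumes simple space-separated metrics with unquoted labels, and the function name and sample string are made up for illustration:

```python
import re

# Matches one metric: label=value[unit]; thresholds after ';' are ignored here.
# Assumption: labels are unquoted and contain no spaces or '='.
METRIC = re.compile(r"^(?P<label>[^=\s]+)=(?P<value>-?\d+(?:\.\d+)?)(?P<unit>[^;\s]*)")

def parse_perfdata(perfdata):
    """Return {label: (value, unit)} for a space-separated perfdata string."""
    metrics = {}
    for token in perfdata.split():
        match = METRIC.match(token)
        if match:  # skip tokens that do not look like a metric
            metrics[match.group("label")] = (
                float(match.group("value")),
                match.group("unit"),
            )
    return metrics

# Hypothetical perfdata from CPU load, memory, and disk checks.
sample = "load1=0.42;5;10;0 mem_used=61%;80;90 /=4120MB;8000;9000;0;10000"
print(parse_perfdata(sample))
```

Values collected this way over weeks or months are what a tool such as Grafana plots, and what a capacity estimate would be extrapolated from.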
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
-
- Posts: 124
- Joined: Fri May 22, 2015 7:12 am
Re: CPU, Memory and Disk Space resource utilization
Thanks, Troy Lea, for the reply. I will keep you posted as it progresses. Your help is much appreciated.
-
- Support Tech
- Posts: 5045
- Joined: Tue Feb 07, 2017 11:26 am
Re: CPU, Memory and Disk Space resource utilization
Sounds good!
-
- Posts: 124
- Joined: Fri May 22, 2015 7:12 am
Re: CPU, Memory and Disk Space resource utilization
Hi
Functionally it is working. I am attaching the screenshot. This is the pre-flight check I am running:
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Nagios Core 4.4.5
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2019-08-20
License: GPL
Website: https://www.nagios.org
Reading configuration data...
Read main config file okay...
Read object config files okay...
Running pre-flight check on configuration data...
Checking objects...
Checked 311 services.
Checked 34 hosts.
Checked 1 host groups.
Checked 0 service groups.
Checked 28 contacts.
Checked 9 contact groups.
Checked 39 commands.
Checked 5 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 34 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
Versions: influxdb-1.7.9-1.x86_64, grafana-6.5.3-1.x86_64, histou v0.4.3, Nagflux v0.4.1, CentOS Linux release 7.7.1908 (Core), Nagios Core 4.4.5
Please let me know if you need any additional information.
Best Regards,
Kaushal
-
- Posts: 124
- Joined: Fri May 22, 2015 7:12 am
Re: CPU, Memory and Disk Space resource utilization
Hi Troy,
I see breaks in the graph as per the screenshot attached.
Best Regards,
Kaushal
-
- Posts: 124
- Joined: Fri May 22, 2015 7:12 am
Re: CPU, Memory and Disk Space resource utilization
Hi Troy,
I am attaching the screenshot again for your reference.
Best Regards,
Kaushal
-
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
Re: CPU, Memory and Disk Space resource utilization
The breaks in your performance data generally mean you are not receiving valid data back from the plugin during these intervals, or perhaps for some reason the performance data is being deleted before it is being processed.
You may want to look at the influxdb logs to see if it is reporting any errors.
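One way to narrow this down is to compare the timestamps of the points that actually reached the database against the expected check interval. A rough Python sketch follows. It assumes you have already exported the timestamps (epoch seconds) for one service, and the function name, 60-second interval, and sample values are illustrative only:

```python
def find_gaps(timestamps, interval, tolerance=1.5):
    """Return (start, end) pairs where consecutive points are further apart
    than interval * tolerance, i.e. where at least one result is missing."""
    ordered = sorted(timestamps)
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > interval * tolerance:
            gaps.append((previous, current))
    return gaps

# Hypothetical data: a check every 60 s with results missing between 180 and 420.
points = [0, 60, 120, 180, 420, 480]
print(find_gaps(points, interval=60))
```

If the gaps line up with times when the check itself returned no performance data, the plugin is the place to look. If the checks ran but points are still missing, the spool files or InfluxDB are the more likely suspects.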
-
- Posts: 124
- Joined: Fri May 22, 2015 7:12 am
Re: CPU, Memory and Disk Space resource utilization
Hi Troy,
Please find the below details after investigating further.
Code: Select all
systemctl status nagflux.service
● nagflux.service - A connector which transforms performancedata from Nagios/Icinga(2)/Naemon to InfluxDB/Elasticsearch
Loaded: loaded (/usr/lib/systemd/system/nagflux.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2020-01-31 04:10:26 UTC; 12h ago
Docs: https://github.com/Griesbacher/nagflux
Main PID: 28895 (nagflux)
CGroup: /system.slice/nagflux.service
└─28895 /opt/nagflux/nagflux -configPath /opt/nagflux/config.gcfg
Jan 31 17:05:59 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:05:59 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:28 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:28 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:28 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:28 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:28 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:28 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:29 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:29 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:29 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:29 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:29 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:29 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:59 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:59 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:59 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:59 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:59 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:59 Critical: Connection type is unknown, options are: tcp, file. Input:
cat /opt/nagflux/config.gcfg
Code: Select all
[main]
NagiosSpoolfileFolder = "/usr/local/nagios/var/spool/nagfluxperfdata"
NagiosSpoolfileWorker = 1
InfluxWorker = 2
MaxInfluxWorker = 5
DumpFile = "nagflux.dump"
NagfluxSpoolfileFolder = "/usr/local/nagios/var/nagflux"
FieldSeparator = "&"
BufferSize = 10000
FileBufferSize = 65536
DefaultTarget = "all"
[Log]
LogFile = ""
MinSeverity = "INFO"
[Livestatus]
# # tcp or file
Type = "file"
# # tcp: 127.0.0.1:6557 or file /var/run/live
file /usr/local/nagios/var/live.sock
# #Address = "127.0.0.1:6557"
# # The amount to minutes to wait for livestatus to come up, if set to 0 the detection is disabled
MinutesToWait = 2
# # Set the Version of Livestatus. Allowed are Nagios, Icinga2, Naemon.
# # If left empty Nagflux will try to detect it on it's own, which will not always work.
Version = ""
[InfluxDBGlobal]
CreateDatabaseIfNotExists = true
NastyString = ""
NastyStringToReplace = ""
HostcheckAlias = "hostcheck"
[InfluxDB "nagflux"]
Enabled = true
Version = 1.0
Address = "http://127.0.0.1:8086"
Arguments = "precision=ms&u=root&p=root&db=nagflux"
StopPullingDataIfDown = true
[InfluxDB "fast"]
Enabled = false
Version = 1.0
Address = "http://127.0.0.1:8086"
Arguments = "precision=ms&u=root&p=root&db=fast"
StopPullingDataIfDown = false
The Livestatus live socket file is /usr/local/nagios/var/live.sock:
srw-rw----. 1 nagios nagios 0 Jan 29 07:20 /usr/local/nagios/var/live.sock
I have enabled the below in /opt/nagflux/config.gcfg, and the nagflux service does not start at all.
Code: Select all
[Livestatus]
# # tcp or file
Type = "file"
# # tcp: 127.0.0.1:6557 or file /var/run/live
file /usr/local/nagios/var/live.sock
# #Address = "127.0.0.1:6557"
# # The amount to minutes to wait for livestatus to come up, if set to 0 the detection is disabled
MinutesToWait = 2
# # Set the Version of Livestatus. Allowed are Nagios, Icinga2, Naemon.
# # If left empty Nagflux will try to detect it on it's own, which will not always work.
Version = ""
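One thing that stands out in this section is the line `file /usr/local/nagios/var/live.sock`, which is not in the `Key = value` form used by every other setting. Judging by the commented-out `Address = "127.0.0.1:6557"` line right below it, the socket path was probably meant to be given as `Address = "/usr/local/nagios/var/live.sock"`. That is an assumption drawn from the template comments, not something confirmed in this thread. A small Python sketch (a hypothetical helper, not part of Nagflux) that flags such lines:

```python
import re

SECTION = re.compile(r"^\[.+\]$")              # e.g. [Livestatus] or [InfluxDB "nagflux"]
SETTING = re.compile(r"^[A-Za-z][\w-]*\s*=")   # e.g. Type = "file"

def suspicious_lines(config_text):
    """Return (line_number, text) for lines that are neither blank, a comment,
    a section header, nor a Key = value setting."""
    flagged = []
    for number, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if SECTION.match(line) or SETTING.match(line):
            continue
        flagged.append((number, line))
    return flagged

# The [Livestatus] section as posted, shortened to the relevant lines.
livestatus = '''[Livestatus]
# # tcp or file
Type = "file"
file /usr/local/nagios/var/live.sock
# #Address = "127.0.0.1:6557"
MinutesToWait = 2
Version = ""
'''
print(suspicious_lines(livestatus))
```

Only the `file ...` line is reported. A line the config parser cannot read would be consistent with the service exiting immediately at startup, though the full error message is not shown in the status output.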
Nagios cfg file:
Code: Select all
process_performance_data=1
host_perfdata_file=/usr/local/nagios/var/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file-nagflux
#
service_perfdata_file=/usr/local/nagios/var/host-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file-nagflux
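It may be worth double-checking that `service_perfdata_file` above points to the same path as `host_perfdata_file`. Whether that is intended is not clear from the post. With separate processing commands for host and service data, a separate file for each is the more usual setup. A Python sketch that reports directives sharing a path (the helper name is hypothetical, and it only understands plain `key=value` lines):

```python
from collections import defaultdict

def shared_perfdata_files(cfg_text):
    """Map each path to the *_perfdata_file directives that use it,
    keeping only paths used by more than one directive."""
    users = defaultdict(list)
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip().endswith("_perfdata_file"):
            users[value.strip()].append(key.strip())
    return {path: keys for path, keys in users.items() if len(keys) > 1}

# The relevant directives as posted.
cfg = """host_perfdata_file=/usr/local/nagios/var/host-perfdata
host_perfdata_file_mode=a
service_perfdata_file=/usr/local/nagios/var/host-perfdata
service_perfdata_file_mode=a
"""
print(shared_perfdata_files(cfg))
```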
# systemctl status nagflux.service
● nagflux.service - A connector which transforms performancedata from Nagios/Icinga(2)/Naemon to InfluxDB/Elasticsearch
Loaded: loaded (/usr/lib/systemd/system/nagflux.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Fri 2020-01-31 17:10:48 UTC; 3s ago
Docs: https://github.com/Griesbacher/nagflux
Process: 10845 ExecStart=/opt/nagflux/nagflux -configPath /opt/nagflux/config.gcfg (code=exited, status=2)
Main PID: 10845 (code=exited, status=2)
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal nagflux[10845]: main.main()
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal nagflux[10845]: /root/gorepo/src/github.com/griesbacher/nagflux/main.go:68 +0x22e
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: Unit nagflux.service entered failed state.
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: nagflux.service failed.
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: nagflux.service holdoff time over, scheduling restart.
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: Stopped A connector which transforms performancedata from Nagios/Icinga(2)/Naemon to InfluxDB/Elasticsearch.
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: start request repeated too quickly for nagflux.service
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: Failed to start A connector which transforms performancedata from Nagios/Icinga(2)/Naemon to InfluxDB/Elasticsearch.
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: Unit nagflux.service entered failed state.
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: nagflux.service failed.
Please suggest further and correct me if I am missing anything. I look forward to hearing from you. Thanks in advance.
Best Regards,
Kaushal
-
- Posts: 124
- Joined: Fri May 22, 2015 7:12 am
Re: CPU, Memory and Disk Space resource utilization
Hi Troy,
Checking in again: have you had a chance to look at my last post in this thread?
Best Regards,
Kaushal