I had more issues today. Dashboard loading was extremely slow. It seems that since I added 5 CPUs, performance has only gone down.
I had to install lsof first. This command was executed after an Elasticsearch restart, because the NLS site was frozen again.
Node01:
Node02:
I'm sorry to say this, but we have had nothing but problems since we started using Nagios Log Server. I've spent multiple days troubleshooting and trying to make NLS stable. On 23/02 I have to give a presentation about the NLS server to our management. If I don't manage to make it stable, I will have to postpone the presentation, because NLS is just too slow, starts hanging or freezing, or even stops processing logs completely. I have no idea how to explain the time I invested in NLS, or why it is not stable enough to process the logs of 34 ESX servers, 1 Infoblox device, 1 Windows server and 3 Linux servers (Nagios XI + 2 NLS) on 2 NLS servers with 6 CPUs, 4 GB RAM and SSD storage.
I'm not doing any exotic configuration, and I would think our NLS is receiving fewer logs than it should be able to handle.
A tail of the Logstash log:
Code:
tail -f /var/log/logstash/logstash.log
{:timestamp=>"2015-02-12T11:37:51.987000+0100", :message=>"Failed to flush outgoing items", :outgoing_count=>634, :exception=>#<Errno::EBADF: Bad file descriptor - Bad file descriptor>, :backtrace=>["org/jruby/RubyIO.java:2097:in `close'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:173:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:168:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:156:in `connect'", "org/jruby/RubyArray.java:1613:in `each'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:139:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:406:in `connect'", "org/jruby/RubyProc.java:271:in `call'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/pool.rb:48:in `fetch'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:403:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:319:in `execute'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:217:in `post!'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:106:in `bulk_ftw'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:80:in `bulk'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch.rb:315:in `flush'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:219:in `buffer_flush'", "org/jruby/RubyHash.java:1339:in `each'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:216:in `buffer_flush'", 
"/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:193:in `buffer_flush'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:112:in `buffer_initialize'", "org/jruby/RubyKernel.java:1521:in `loop'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:110:in `buffer_initialize'"], :level=>:warn}
{:timestamp=>"2015-02-12T11:37:52.153000+0100", :message=>"Failed to flush outgoing items", :outgoing_count=>1145, :exception=>#<Errno::EBADF: Bad file descriptor - Bad file descriptor>, :backtrace=>["org/jruby/RubyIO.java:2097:in `close'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:173:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:168:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:156:in `connect'", "org/jruby/RubyArray.java:1613:in `each'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:139:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:406:in `connect'", "org/jruby/RubyProc.java:271:in `call'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/pool.rb:48:in `fetch'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:403:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:319:in `execute'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:217:in `post!'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:106:in `bulk_ftw'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:80:in `bulk'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch.rb:315:in `flush'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:219:in `buffer_flush'", "org/jruby/RubyHash.java:1339:in `each'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:216:in `buffer_flush'", 
"/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:193:in `buffer_flush'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:112:in `buffer_initialize'", "org/jruby/RubyKernel.java:1521:in `loop'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:110:in `buffer_initialize'"], :level=>:warn}
{:timestamp=>"2015-02-12T11:37:52.244000+0100", :message=>"Failed to flush outgoing items", :outgoing_count=>5000, :exception=>#<Errno::EBADF: Bad file descriptor - Bad file descriptor>, :backtrace=>["org/jruby/RubyIO.java:2097:in `close'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:173:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:168:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:156:in `connect'", "org/jruby/RubyArray.java:1613:in `each'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:139:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:406:in `connect'", "org/jruby/RubyProc.java:271:in `call'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/pool.rb:48:in `fetch'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:403:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:319:in `execute'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:217:in `post!'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:106:in `bulk_ftw'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:80:in `bulk'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch.rb:315:in `flush'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:219:in `buffer_flush'", "org/jruby/RubyHash.java:1339:in `each'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:216:in `buffer_flush'", 
"/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:193:in `buffer_flush'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:159:in `buffer_receive'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch.rb:311:in `receive'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/base.rb:86:in `handle'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/base.rb:78:in `worker_setup'"], :level=>:warn}
{:timestamp=>"2015-02-12T11:37:52.266000+0100", :message=>"Failed to flush outgoing items", :outgoing_count=>1588, :exception=>#<Errno::EBADF: Bad file descriptor - Bad file descriptor>, :backtrace=>["org/jruby/RubyIO.java:2097:in `close'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:173:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:168:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:156:in `connect'", "org/jruby/RubyArray.java:1613:in `each'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:139:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:406:in `connect'", "org/jruby/RubyProc.java:271:in `call'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/pool.rb:48:in `fetch'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:403:in `connect'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:319:in `execute'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:217:in `post!'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:106:in `bulk_ftw'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:80:in `bulk'", "/usr/local/nagioslogserver/logstash/lib/logstash/outputs/elasticsearch.rb:315:in `flush'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:219:in `buffer_flush'", "org/jruby/RubyHash.java:1339:in `each'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:216:in `buffer_flush'", 
"/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:193:in `buffer_flush'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:112:in `buffer_initialize'", "org/jruby/RubyKernel.java:1521:in `loop'", "/usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:110:in `buffer_initialize'"], :level=>:warn}
{:timestamp=>"2015-02-12T11:43:31.399000+0100", :message=>"Using milestone 2 input plugin 'tcp'. This plugin should be stable, but if you see strange behavior, please let us know! For more information on plugin milestones, see http://logstash.net/docs/1.4.2/plugin-milestones", :level=>:warn}
{:timestamp=>"2015-02-12T11:43:31.602000+0100", :message=>"Using milestone 1 input plugin 'syslog'. This plugin should work, but would benefit from use by folks like you. Please let us know if you find bugs or have suggestions on how to improve this plugin. For more information on plugin milestones, see http://logstash.net/docs/1.4.2/plugin-milestones", :level=>:warn}
{:timestamp=>"2015-02-12T12:00:20.674000+0100", :message=>"syslog udp listener died", :address=>"0.0.0.0:5546", :exception=>#<SocketError: recvfrom: name or service not known>, :backtrace=>["/usr/local/nagioslogserver/logstash/lib/logstash/inputs/syslog.rb:119:in `udp_listener'", "org/jruby/RubyKernel.java:1521:in `loop'", "/usr/local/nagioslogserver/logstash/lib/logstash/inputs/syslog.rb:118:in `udp_listener'", "/usr/local/nagioslogserver/logstash/lib/logstash/inputs/syslog.rb:76:in `run'"], :level=>:warn}
{:timestamp=>"2015-02-12T12:00:30.270000+0100", :message=>"Using milestone 2 input plugin 'tcp'. This plugin should be stable, but if you see strange behavior, please let us know! For more information on plugin milestones, see http://logstash.net/docs/1.4.2/plugin-milestones", :level=>:warn}
{:timestamp=>"2015-02-12T12:00:30.454000+0100", :message=>"Using milestone 1 input plugin 'syslog'. This plugin should work, but would benefit from use by folks like you. Please let us know if you find bugs or have suggestions on how to improve this plugin. For more information on plugin milestones, see http://logstash.net/docs/1.4.2/plugin-milestones", :level=>:warn}
{:timestamp=>"2015-02-12T12:02:42.256000+0100", :message=>"syslog udp listener died", :address=>"0.0.0.0:5545", :exception=>#<SocketError: recvfrom: name or service not known>, :backtrace=>["/usr/local/nagioslogserver/logstash/lib/logstash/inputs/syslog.rb:119:in `udp_listener'", "org/jruby/RubyKernel.java:1521:in `loop'", "/usr/local/nagioslogserver/logstash/lib/logstash/inputs/syslog.rb:118:in `udp_listener'", "/usr/local/nagioslogserver/logstash/lib/logstash/inputs/syslog.rb:76:in `run'"], :level=>:warn}
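To see at a glance which warnings dominate a log like the one above, the events can be grouped by their :message field. This is only a rough sketch: the function name is made up for illustration, and it assumes every event starts on its own line with the :message=>"..." field, as in the excerpt. Wrapped backtrace lines carry no :message field and are skipped.

```shell
# Sketch: count Logstash log events per :message text.
# summarise_logstash_messages is a hypothetical helper name, not part of NLS.
summarise_logstash_messages() {
    # $1: path to the Logstash log, e.g. /var/log/logstash/logstash.log
    grep -o ':message=>"[^"]*"' "$1" \
        | sed -e 's/^:message=>"//' -e 's/"$//' \
        | cut -c1-60 \
        | sort | uniq -c | sort -rn
}

# Usage (not run here):
#   summarise_logstash_messages /var/log/logstash/logstash.log
```

Each output line is a count followed by the first 60 characters of the message, most frequent first, which makes it easier to tell whether the "Failed to flush outgoing items" warnings or the "syslog udp listener died" warnings are the more common problem.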
EDIT 1: I just had to restart the Elasticsearch service again. Afterwards I tried applying the configuration, and now the website is completely frozen.
EDIT 2: After another restart of the Elasticsearch service, I can log into the website again, but it seems logs are no longer being processed. I can't just keep restarting the Elasticsearch service and hoping it will suddenly work. When applying the configuration, I get "The apply command hasn't started yet. The instance may not be online or is unreachable."
EDIT 3: After running service elasticsearch restart on the node that could not apply the configuration, and then re-applying the configuration, logs are coming in again.
EDIT 4: I just realised I'm monitoring the NLS servers with Nagios, so I attached a graph of open files from the moment I installed them. I hope it helps.
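For a spot check of open files between graph samples, the count for a single process can be read from /proc and compared with that process's limit. This is a minimal sketch for Linux only. Both function names and the optional proc_root argument are hypothetical (the latter exists only so the functions can be exercised against a fake directory), and reading another user's /proc entries generally requires root.

```shell
# Sketch: compare a process's open file descriptors with its soft limit.
# count_open_fds and max_open_files are hypothetical helper names.
count_open_fds() {
    pid=$1
    proc_root=${2:-/proc}
    # Each entry in <proc_root>/<pid>/fd is one open descriptor.
    ls "$proc_root/$pid/fd" 2>/dev/null | wc -l | tr -d ' '
}

max_open_files() {
    pid=$1
    proc_root=${2:-/proc}
    # In the "Max open files" row of the limits file, field 4 is the soft limit.
    awk '/^Max open files/ {print $4}' "$proc_root/$pid/limits"
}

# Usage (not run here; assumes pgrep is installed and the process
# command line contains "elasticsearch"):
#   pid=$(pgrep -f elasticsearch | head -n 1)
#   echo "open: $(count_open_fds "$pid")  limit: $(max_open_files "$pid")"
```

If the open count sits close to the limit around the time of the "Bad file descriptor" warnings, that would point towards descriptor exhaustion; if it is far below, the cause is probably elsewhere.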
EDIT 5: When I read posts on GitHub, e.g. https://github.com/elasticsearch/logstash/issues/1896, from people with "Errno::EBADF: Bad file descriptor - Bad file descriptor" errors, they are talking about a misconfiguration:
"My issue turned out to be a misconfiguration. I had 127.0.0.1 for the elasticsearch output host on my remote nodes, when I should have targeted the proper elasticsearch server in my organization."
Could I have the same misconfiguration? Where can I check this?
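One way to look for this is to list every host setting in the Logstash configuration files and see where the elasticsearch output points. This is a hedged sketch: the function name is made up, and the configuration directory is passed as an argument because its exact location under /usr/local/nagioslogserver/logstash is not shown in this post. Whether a local address is actually wrong depends on whether Elasticsearch runs on the same node as Logstash.

```shell
# Sketch: print every "host =>" setting found under a Logstash config directory.
# list_output_hosts is a hypothetical helper name.
list_output_hosts() {
    # $1: directory holding the Logstash *.conf files (location is an assumption
    #     you need to fill in for your own install)
    grep -rnE '(^|[^[:alnum:]_])host[[:space:]]*=>' "$1"
}

# Usage (not run here):
#   list_output_hosts /path/to/logstash/conf
```

Once the host is known, requesting http://HOST:9200/ from the Logstash node (for example with curl, assuming Elasticsearch listens on its default HTTP port 9200) shows whether that address answers at all.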
/etc/rsyslog.d/nagioslogserver.conf on nls01:
Code:
# ### begin forwarding rule ###
#
# NAGIOS LOG SERVER
#
$WorkDirectory /var/lib/rsyslog # where to place spool files
$ActionQueueFileName fwdRule1 # unique name prefix for spool files
$ActionQueueMaxDiskSpace 1g # 1gb space limit (use as much as possible)
$ActionQueueSaveOnShutdown on # save messages to disk on shutdown
$ActionQueueType LinkedList # run asynchronously
$ActionResumeRetryCount -1 # infinite retries if host is down
*.* @@localhost:5546
#
# ### end of the forwarding rule ###
/etc/rsyslog.d/nagioslogserver.conf on nls02:
Code:
# ### begin forwarding rule ###
#
# NAGIOS LOG SERVER
#
$WorkDirectory /var/lib/rsyslog # where to place spool files
$ActionQueueFileName fwdRule1 # unique name prefix for spool files
$ActionQueueMaxDiskSpace 1g # 1gb space limit (use as much as possible)
$ActionQueueSaveOnShutdown on # save messages to disk on shutdown
$ActionQueueType LinkedList # run asynchronously
$ActionResumeRetryCount -1 # infinite retries if host is down
*.* @@localhost:5546
#
# ### end of the forwarding rule ###
Could it have something to do with the port I changed to 5546, as discussed in ticket 2015012810000141?
This is the syslog input I had to create for all my Linux servers, as otherwise I was experiencing date parsing errors:
Code:
syslog {
    type => 'syslog-linux'
    port => 5546
}