Recv-Q buildup

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
Envera IT
Posts: 159
Joined: Wed Jun 19, 2013 10:21 am

Recv-Q buildup

Post by Envera IT »

LS setup:
3 instances, 2 vCPU's and 32GB RAM each. 62 open indexes, optimization after 2 days.
System Profile is attached.

For the past several months we've had an issue with a certain log file not keeping up (shipped via NXLog on Windows), as a workaround method we began restarting the log-sending service daily as this appeared to resolve it short term. In the past week this problem has intensified with no real change to log input levels, it is now affecting multiple logs from different source types. In looking into this I found that the Recv-Q in a

Code: Select all

netstat -ntp
command was showing high numbers that did not seem to go down for certain connections; these connections were the ones we were not receiving logs for.

For the purpose of trying to keep this simple I'll show what I've found in regards to the original log type coming in from NXLog as a Windows Event log on TCP 3515 with the default input. Here are 4 successive

Code: Select all

netstat -ntp | grep 3515
commands issued approximately 5 seconds apart:

Code: Select all

[root@srq-nagios-ls3 ~]# netstat -ntp | grep 3515
tcp      405      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.130:51058     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.35:54592      ESTABLISHED 21681/java
tcp   1274342      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.24:52260      ESTABLISHED 21681/java
tcp   3489066      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.49:49198      ESTABLISHED 21681/java
tcp      219      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.123:54340     ESTABLISHED 21681/java
tcp   3269416      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.55:49193      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.27:59551      ESTABLISHED 21681/java
tcp      219      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.126:58418     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.16:49417      ESTABLISHED 21681/java
tcp   3255556      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.13:49196      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.126:49201     ESTABLISHED 21681/java
tcp   3483670      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.65:50832      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.3.201:43098     ESTABLISHED 21681/java
tcp   3489693      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.69:56011      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.75:51676      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.212:49193     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.126:49223     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.126:49182     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.71:49877      ESTABLISHED 21681/java
tcp   3490424      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.48:49176      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.107:52745     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.38:50131      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.85:63105      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.79:49185      ESTABLISHED 21681/java
tcp   1106055      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.32:65310      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.59:49176      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.33:49197      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.115:37564     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.129:57378     ESTABLISHED 21681/java
tcp   3489706      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.204:63318     ESTABLISHED 21681/java
tcp   3484202      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.31:53547      ESTABLISHED 21681/java
tcp   3489762      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.72:51689      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.23:49194      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.20:49864      ESTABLISHED 21681/java

[root@srq-nagios-ls3 ~]# netstat -ntp | grep 3515
tcp     2225      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.130:51058     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.35:54592      ESTABLISHED 21681/java
tcp   1274342      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.24:52260      ESTABLISHED 21681/java
tcp   3489066      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.49:49198      ESTABLISHED 21681/java
tcp     2960      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.123:54340     ESTABLISHED 21681/java
tcp   3269416      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.55:49193      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.27:59551      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.126:58418     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.16:49417      ESTABLISHED 21681/java
tcp   3258760      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.13:49196      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.126:49201     ESTABLISHED 21681/java
tcp   3483670      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.65:50832      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.3.201:43098     ESTABLISHED 21681/java
tcp   3489693      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.69:56011      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.75:51676      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.212:49193     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.126:49223     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.126:49182     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.71:49877      ESTABLISHED 21681/java
tcp   3490424      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.48:49176      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.107:52745     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.38:50131      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.85:63105      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.79:49185      ESTABLISHED 21681/java
tcp   1125781      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.32:65310      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.59:49176      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.33:49197      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.115:37564     ESTABLISHED 21681/java
tcp     1824      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.129:57378     ESTABLISHED 21681/java
tcp   3489706      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.204:63318     ESTABLISHED 21681/java
tcp   3484202      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.31:53547      ESTABLISHED 21681/java
tcp   3489762      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.72:51689      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.23:49194      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.20:49864      ESTABLISHED 21681/java

[root@srq-nagios-ls3 ~]# netstat -ntp | grep 3515
tcp      673      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.130:51058     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.35:54592      ESTABLISHED 21681/java
tcp   1257958      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.24:52260      ESTABLISHED 21681/java
tcp   3488958      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.49:49198      ESTABLISHED 21681/java
tcp     4098      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.123:54340     ESTABLISHED 21681/java
tcp   3275230      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.55:49193      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.27:59551      ESTABLISHED 21681/java
tcp      675      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.126:58418     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.16:49417      ESTABLISHED 21681/java
tcp   3260886      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.13:49196      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.126:49201     ESTABLISHED 21681/java
tcp   3488190      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.65:50832      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.3.201:43098     ESTABLISHED 21681/java
tcp   3489651      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.69:56011      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.75:51676      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.212:49193     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.126:49223     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.126:49182     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.71:49877      ESTABLISHED 21681/java
tcp   3489639      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.48:49176      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.107:52745     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.38:50131      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.85:63105      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.79:49185      ESTABLISHED 21681/java
tcp   1135040      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.32:65310      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.59:49176      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.33:49197      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.115:37564     ESTABLISHED 21681/java
tcp     1318      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.129:57378     ESTABLISHED 21681/java
tcp   3489541      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.204:63318     ESTABLISHED 21681/java
tcp   3489598      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.31:53547      ESTABLISHED 21681/java
tcp   3483702      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.72:51689      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.23:49194      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.20:49864      ESTABLISHED 21681/java

[root@srq-nagios-ls3 ~]# netstat -ntp | grep 3515
tcp      860      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.130:51058     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.35:54592      ESTABLISHED 21681/java
tcp   1241574      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.24:52260      ESTABLISHED 21681/java
tcp   3484046      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.49:49198      ESTABLISHED 21681/java
tcp     3412      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.123:54340     ESTABLISHED 21681/java
tcp   3275230      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.55:49193      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.27:59551      ESTABLISHED 21681/java
tcp     1153      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.126:58418     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.16:49417      ESTABLISHED 21681/java
tcp   3283528      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.13:49196      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.126:49201     ESTABLISHED 21681/java
tcp   3489926      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.65:50832      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.3.201:43098     ESTABLISHED 21681/java
tcp   3489651      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.69:56011      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.75:51676      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.212:49193     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.126:49223     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.126:49182     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.71:49877      ESTABLISHED 21681/java
tcp   3489639      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.48:49176      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.107:52745     ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.38:50131      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.85:63105      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.79:49185      ESTABLISHED 21681/java
tcp   1148263      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.32:65310      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.59:49176      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.33:49197      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.1.115:37564     ESTABLISHED 21681/java
tcp      456      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.129:57378     ESTABLISHED 21681/java
tcp   3489541      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.204:63318     ESTABLISHED 21681/java
tcp   3489708      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.31:53547      ESTABLISHED 21681/java
tcp   3483702      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.72:51689      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.23:49194      ESTABLISHED 21681/java
tcp        0      0 ::ffff:10.0.1.173:3515      ::ffff:10.0.2.20:49864      ESTABLISHED 21681/java
As you can see in this example, 11 of the 34 connections to 3515 have a very high Recv-Q and they don't appear to be going down. All of these connections are being switched on the local network, nothing is coming in over WAN. top command and instance stats corroborate memory sits around 58%, CPU for the instances sits around 20-50% with peaks up to approx 80%. The process 21681 that is listening for events sits around 3-5% and peaks at around 20-25%.

Here is a result of a tcpdump for a minute from a bad source:

Code: Select all

 tcpdump -n dst port 3515 and host 10.0.2.49
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
12:17:58.096613 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 1364353879:1364353880, ack 3367158999, win 256, length 1
12:18:00.496735 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 0:1, ack 1, win 256, length 1
12:18:05.297087 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 0:1, ack 1, win 256, length 1
12:18:07.303163 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 0:1460, ack 1, win 256, length 1460
12:18:07.303235 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 1460:2920, ack 1, win 256, length 1460
12:18:07.303242 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 2920:4380, ack 1, win 256, length 1460
12:18:07.303247 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 4380:5840, ack 1, win 256, length 1460
12:18:07.303477 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 5840:7300, ack 1, win 256, length 1460
12:18:07.303488 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 7300:7637, ack 1, win 256, length 337
12:18:07.342863 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 7637:8275, ack 1, win 256, length 638
12:18:07.343670 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 8275:9735, ack 1, win 256, length 1460
12:18:12.352642 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 9735:10375, ack 1, win 256, length 640
12:18:12.354744 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 10375:11835, ack 1, win 256, length 1460
12:18:12.354762 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 11835:13295, ack 1, win 256, length 1460
12:18:12.354767 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 13295:14755, ack 1, win 256, length 1460
12:18:12.354772 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 14755:16215, ack 1, win 256, length 1460
12:18:17.363021 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 16215:16343, ack 1, win 256, length 128
12:18:17.672932 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 16343:16344, ack 1, win 256, length 1
12:18:18.272988 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 16343:16344, ack 1, win 256, length 1
12:18:19.483090 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 16343:16344, ack 1, win 256, length 1
12:18:21.883252 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 16343:16344, ack 1, win 256, length 1
12:18:23.073437 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 16343:17803, ack 1, win 256, length 1460
12:18:23.073470 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 17803:19263, ack 1, win 256, length 1460
12:18:23.073486 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 19263:20723, ack 1, win 256, length 1460
12:18:23.073492 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 20723:22183, ack 1, win 256, length 1460
12:18:23.073739 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 22183:23643, ack 1, win 256, length 1460
12:18:23.073751 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 23643:24042, ack 1, win 256, length 399
12:18:23.074805 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 24042:25502, ack 1, win 256, length 1460
12:18:23.074819 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 25502:25566, ack 1, win 256, length 64
12:18:23.075439 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 25566:27026, ack 1, win 256, length 1460
12:18:23.075451 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 27026:27085, ack 1, win 256, length 59
12:18:23.112892 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 27085:27521, ack 1, win 256, length 436
12:18:23.113117 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 27521:28012, ack 1, win 256, length 491
12:18:23.113349 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 28012:29472, ack 1, win 256, length 1460
12:18:23.113369 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 29472:30902, ack 1, win 256, length 1430
12:18:23.113827 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 30902:32362, ack 1, win 256, length 1460
12:18:23.113841 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 32362:32459, ack 1, win 256, length 97
12:18:23.114228 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 32459:32897, ack 1, win 256, length 438
12:18:23.453337 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 32897:32898, ack 1, win 256, length 1
12:18:24.053470 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 32897:32898, ack 1, win 256, length 1
12:18:25.253507 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 32897:32898, ack 1, win 256, length 1
12:18:27.653647 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 32897:32898, ack 1, win 256, length 1
12:18:32.452953 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 32897:32898, ack 1, win 256, length 1
12:18:37.154671 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 32897:34357, ack 1, win 256, length 1460
12:18:37.154702 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 34357:35817, ack 1, win 256, length 1460
12:18:37.154993 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 35817:37277, ack 1, win 256, length 1460
12:18:37.155003 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 37277:38737, ack 1, win 256, length 1460
12:18:37.155006 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 38737:40197, ack 1, win 256, length 1460
12:18:37.155011 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 40197:41049, ack 1, win 256, length 852
12:18:37.155698 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 41049:42509, ack 1, win 256, length 1460
12:18:37.155709 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 42509:42628, ack 1, win 256, length 119
12:18:37.195858 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 42628:43957, ack 1, win 256, length 1329
12:18:37.196301 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 43957:44676, ack 1, win 256, length 719
12:18:37.544277 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 44676:44677, ack 1, win 256, length 1
12:18:38.144349 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 44676:44677, ack 1, win 256, length 1
12:18:39.344442 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 44676:44677, ack 1, win 256, length 1
12:18:41.744611 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 44676:44677, ack 1, win 256, length 1
12:18:46.545001 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 44676:44677, ack 1, win 256, length 1
12:18:56.141592 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 44676:44677, ack 1, win 256, length 1
12:18:58.340561 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 44676:46136, ack 1, win 256, length 1460
12:18:58.340595 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 46136:47596, ack 1, win 256, length 1460
12:18:58.340602 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 47596:49056, ack 1, win 256, length 1460
12:18:58.340608 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 49056:50516, ack 1, win 256, length 1460
12:18:58.340612 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 50516:51976, ack 1, win 256, length 1460
12:18:58.340617 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 51976:52869, ack 1, win 256, length 893
12:18:58.379993 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 52869:54237, ack 1, win 256, length 1368
12:18:58.380645 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 54237:55697, ack 1, win 256, length 1460
12:18:58.380659 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 55697:55787, ack 1, win 256, length 90
12:18:58.380882 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 55787:57247, ack 1, win 256, length 1460
12:18:58.380893 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 57247:57380, ack 1, win 256, length 133
12:18:58.382846 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 57380:57989, ack 1, win 256, length 609
12:18:58.383256 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 57989:59449, ack 1, win 256, length 1460
12:18:58.383267 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 59449:60909, ack 1, win 256, length 1460
12:18:58.383273 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 60909:61430, ack 1, win 256, length 521
12:18:58.383968 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 61430:62890, ack 1, win 256, length 1460
12:18:58.383980 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 62890:62924, ack 1, win 256, length 34
12:18:58.384658 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 62924:64384, ack 1, win 256, length 1460
12:18:58.384671 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 64384:64469, ack 1, win 256, length 85
12:18:58.385147 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 64469:65925, ack 1, win 256, length 1456
12:18:58.423135 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 65925:66132, ack 1, win 256, length 207
12:18:58.423647 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 66132:67592, ack 1, win 256, length 1460
12:18:58.423661 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 67592:67726, ack 1, win 256, length 134
12:18:58.424082 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [P.], seq 67726:68229, ack 1, win 256, length 503
12:18:58.759749 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 68229:68230, ack 1, win 256, length 1
12:18:59.359800 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 68229:68230, ack 1, win 256, length 1
12:19:00.559895 IP 10.0.2.49.49198 > 10.0.1.173.must-backplane: Flags [.], seq 68229:68230, ack 1, win 256, length 1
And a good source:

Code: Select all

tcpdump -n dst port 3515 and host 10.0.2.130
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
12:20:33.226117 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 1948922560:1948922797, ack 3957586243, win 256, length 237
12:20:33.226385 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 237:907, ack 1, win 256, length 670
12:20:36.226354 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 907:1144, ack 1, win 256, length 237
12:20:36.226670 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 1144:1363, ack 1, win 256, length 219
12:20:38.225669 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 1363:1550, ack 1, win 256, length 187
12:20:39.225664 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 1550:1787, ack 1, win 256, length 237
12:20:39.225958 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 1787:2006, ack 1, win 256, length 219
12:20:42.225382 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 2006:2243, ack 1, win 256, length 237
12:20:42.225709 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 2243:2462, ack 1, win 256, length 219
12:20:45.227665 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 2462:2699, ack 1, win 256, length 237
12:20:45.227942 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 2699:2918, ack 1, win 256, length 219
12:20:48.227221 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 2918:3155, ack 1, win 256, length 237
12:20:48.227553 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 3155:3374, ack 1, win 256, length 219
12:20:51.226130 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 3374:3611, ack 1, win 256, length 237
12:20:51.226471 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 3611:3830, ack 1, win 256, length 219
12:20:54.225371 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 3830:4067, ack 1, win 256, length 237
12:20:54.226011 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 4067:4286, ack 1, win 256, length 219
12:20:57.226002 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 4286:4523, ack 1, win 256, length 237
12:20:57.226291 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 4523:4742, ack 1, win 256, length 219
12:21:00.227997 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 4742:4979, ack 1, win 256, length 237
12:21:00.228305 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 4979:5198, ack 1, win 256, length 219
12:21:03.227713 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 5198:5435, ack 1, win 256, length 237
12:21:03.228025 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 5435:6105, ack 1, win 256, length 670
12:21:06.228337 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 6105:6342, ack 1, win 256, length 237
12:21:06.228704 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 6342:6561, ack 1, win 256, length 219
12:21:08.228426 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 6561:6748, ack 1, win 256, length 187
12:21:09.229078 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 6748:6985, ack 1, win 256, length 237
12:21:09.229373 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 6985:7204, ack 1, win 256, length 219
12:21:12.228258 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 7204:7441, ack 1, win 256, length 237
12:21:12.228585 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 7441:7660, ack 1, win 256, length 219
12:21:15.229242 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 7660:7897, ack 1, win 256, length 237
12:21:15.229584 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 7897:8116, ack 1, win 256, length 219
12:21:18.228770 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 8116:8353, ack 1, win 256, length 237
12:21:18.229152 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 8353:8572, ack 1, win 256, length 219
12:21:21.228870 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 8572:8809, ack 1, win 256, length 237
12:21:21.229189 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 8809:9028, ack 1, win 256, length 219
12:21:24.228714 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 9028:9265, ack 1, win 256, length 237
12:21:24.229033 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 9265:9484, ack 1, win 256, length 219
12:21:27.229013 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 9484:9721, ack 1, win 256, length 237
12:21:27.229287 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 9721:9940, ack 1, win 256, length 219
12:21:30.228435 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 9940:10177, ack 1, win 256, length 237
12:21:30.228743 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 10177:10396, ack 1, win 256, length 219
12:21:33.229511 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 10396:10633, ack 1, win 256, length 237
12:21:33.229841 IP 10.0.2.130.51058 > 10.0.1.173.must-backplane: Flags [P.], seq 10633:11303, ack 1, win 256, length 670
For reference here are the flags for tcpdump:
TCP Flag tcpdump Flag Meaning
SYN S Syn packet, a session establishment request.
ACK A Ack packet, acknowledge sender’s data.
FIN F Finish flag, indication of termination.
RESET R Reset, indication of immediate abort of conn.
PUSH P Push, immediate push of data from sender.
URGENT U Urgent, takes precedence over other data.
NONE A dot . Placeholder, usually used for ACK.
You do not have the required permissions to view the files attached to this post.
I like graphs...
User avatar
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Recv-Q buildup

Post by cdienger »

The bad machine definitely is sending a lot more than the good one. The logstash process is likely not able to keep up with everything and I would suggest following https://support.nagios.com/kb/article/n ... g-576.html to increase the memory allocated to logstash. You can also double the LS_WORKER_THREADS value found in the file it has you edit. Make sure to restart the service after making any changes to the file:

service logstash restart
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Envera IT
Posts: 159
Joined: Wed Jun 19, 2013 10:21 am

Re: Recv-Q buildup

Post by Envera IT »

Thanks for the suggestion, unfortunately it didn't yield the anticipated results. I'm not sure resource contention (or anything based on overloading/configuration) is the issue here given that it seems as though specific connections seem to wind up being "ignored" while other good connections will have traffic flow in and read freely.

I believe the reason the "bad" connections are pushing more data is because they are becoming backlogged at NXLog so NXLog always is trying to push a maximum amount of logs every 20 second interval, compared to the good connections where they are only trying to send however many new logs in the last 20 seconds have been gathered.

We know that these logs are not technically lost and NXLog is aware of its progress successfully delivering logs. "bad" connections do technically still write some logs, for example one such site is writing logs at a rate of 15 minutes worth of logs are taking approximately an hour to be digested. Currently that log source is at 8:19am this morning (compared to 11:33am at time of posting).

Finally despite our original LS instance having 150 connections on that port vs 35 on the one sampled for the OP, it only has 17 bad connections at time of writing. This presents a very difficult pattern as all of our instances are equally provisioned for resources, however 11:34 and 17:150 ratios is not consistent with either resource contention nor end device configuration, to my understanding.
I like graphs...
User avatar
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Recv-Q buildup

Post by cdienger »

What version of nxlog is installed?

Does the machine send just one type of log or multiple? You may find better performance if you send the large log to it's own output. For example, on a test machine I use one output for most logs while another log I was testing with goes to another:

Code: Select all

<Output out>
    Module om_tcp
    Host 192.168.3.42
    Port 3515
	
    Exec  $tmpmessage = $Message; delete($Message); rename_field("tmpmessage","message");
    Exec  $raw_event = to_json();
	
	# Uncomment for debug output
	#Exec file_write('%ROOT%\data\nxlog_output.log', $raw_event + "\n");
</Output>
 
<Output out2>
    Module om_tcp
    Host 192.168.3.42
    Port 2056
	
    Exec  $tmpmessage = $Message; delete($Message); rename_field("tmpmessage","message");
    Exec  $raw_event = to_json();
	
	# Uncomment for debug output
	#Exec file_write('%ROOT%\data\nxlog_output.log', $raw_event + "\n");
	
</Output>

<Route 1>
    Path internal, eventlog, Important_File => out
</Route>

<Route 2>
    Path file3 => out2
</Route>
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Envera IT
Posts: 159
Joined: Wed Jun 19, 2013 10:21 am

Re: Recv-Q buildup

Post by Envera IT »

Version of NXLog installed is 2.9.1716.

We have a similar setup in that we send multiple log files per machine to a single location, however I did already break out one log sender to send to 3516 as part of a test, (duplicate input setup just +1 port in log server) and rather than showing any signs of improving it seemed to just immediately start building up a Recv-Q on that new connection.

Last Friday we identified a potential cause and enacted the corresponding possible fix, and as of this morning there are no Recv-Q buildups however Friday evening logging from our main server stopped. Here are NXLog's debug log this morning, there is a whole mess of "2018-03-18 10:01:41 INFO connecting to srq-syslog.envera.local:3515

Code: Select all

2018-03-18 10:01:41 INFO reconnecting in 200 seconds
2018-03-18 10:01:41 ERROR couldn't connect to tcp socket on srq-syslog.envera.local:3515; An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full.  ".
I completely stopped and restarted NXLog on that machine and it appears to have recovered.

I'll leave this as-is for a few days to see if the problem represents itself. If the issue appears to have been resolved at that time I'll provide an explanation FWIW to future visitors.
I like graphs...
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Recv-Q buildup

Post by tmcdonald »

We'll keep this thread open pending your reply.
Former Nagios employee
Envera IT
Posts: 159
Joined: Wed Jun 19, 2013 10:21 am

Re: Recv-Q buildup

Post by Envera IT »

tmcdonald wrote:We'll keep this thread open pending your reply.
Thanks,

So we were able to resolve this by turning off a reverse dns filter. We noticed that the error log was filled with reverse dns failures which generally speaking would be expected, not everything should resolve after all. I had searched through the log for relevant data (using Notepad++ I removed all lines regarding reverse dns) and found nothing. While looking again after doing logstash restarts I decided to turn off the reverse dns temporarily so I wouldn't have to parse through the logs and I noticed immediately afterwards all receive queues stayed clear and low and behold they still are.

The issue with NXLog last Monday appears to have been unrelated and a one off. I will make a new thread and reference this one if the RecvQ issue reoccurs but it looks like we're out of the woods on this one. Thank you for your assistance in working to resolve this!
I like graphs...
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Recv-Q buildup

Post by scottwilkerson »

Now that you mention it, I do remember a case about 2 years ago that had similar results and do remember reading on some logstash websites that the dns filter can be a massive bottleneck.

I'm glad it is resolved. Locking
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart