Linux Networking – Troubleshooting Packet Drops

packetloss

I've run dropwatch and this is the result I got:

dropwatch> start
Enabling monitoring...
Kernel monitoring activated.
Issue Ctrl-C to stop monitoring
39 drops at tcp_rcv_state_process+1b6 (0xffffffff815eeda6)
36 drops at sk_stream_kill_queues+50 (0xffffffff81583970)
3 drops at skb_release_data+10e (0xffffffff8157bf3e)
2 drops at tcp_v4_do_rcv+80 (0xffffffff815f8f70)
2 drops at tcp_v4_rcv+87 (0xffffffff815fa087)
30 drops at tcp_rcv_state_process+1b6 (0xffffffff815eeda6)
31 drops at sk_stream_kill_queues+50 (0xffffffff81583970)
20 drops at unix_dgram_sendmsg+4f8 (0xffffffff81646a38)
5 drops at tcp_v4_rcv+87 (0xffffffff815fa087)
2 drops at tcp_v4_do_rcv+80 (0xffffffff815f8f70)
19 drops at tcp_rcv_state_process+1b6 (0xffffffff815eeda6)
23 drops at sk_stream_kill_queues+50 (0xffffffff81583970)
2 drops at tcp_v4_rcv+87 (0xffffffff815fa087)
2 drops at skb_release_data+10e (0xffffffff8157bf3e)
11 drops at unix_dgram_sendmsg+4f8 (0xffffffff81646a38)
57 drops at tcp_rcv_state_process+1b6 (0xffffffff815eeda6)
49 drops at sk_stream_kill_queues+50 (0xffffffff81583970)
5 drops at skb_release_data+10e (0xffffffff8157bf3e)
5 drops at tcp_v4_rcv+87 (0xffffffff815fa087)
1 drops at skb_queue_purge+18 (0xffffffff8157c0a8)
4 drops at tcp_rcv_state_process+1b6 (0xffffffff815eeda6)
4 drops at sk_stream_kill_queues+50 (0xffffffff81583970)
3 drops at tcp_v4_do_rcv+80 (0xffffffff815f8f70)
3 drops at tcp_v4_rcv+87 (0xffffffff815fa087)
10 drops at skb_release_data+10e (0xffffffff8157bf3e)
38 drops at unix_dgram_sendmsg+4f8 (0xffffffff81646a38)
29 drops at sk_stream_kill_queues+50 (0xffffffff81583970)
28 drops at tcp_rcv_state_process+1b6 (0xffffffff815eeda6)
1 drops at tcp_v6_rcv+87 (0xffffffff81677ff7)
2 drops at tcp_v4_rcv+87 (0xffffffff815fa087)
1 drops at tcp_v4_do_rcv+80 (0xffffffff815f8f70)
1 drops at skb_release_data+10e (0xffffffff8157bf3e)
17 drops at tcp_rcv_state_process+1b6 (0xffffffff815eeda6)
14 drops at sk_stream_kill_queues+50 (0xffffffff81583970)
1 drops at skb_release_data+10e (0xffffffff8157bf3e)
1 drops at tcp_v4_rcv+87 (0xffffffff815fa087)
5 drops at tcp_rcv_state_process+1b6 (0xffffffff815eeda6)
10 drops at skb_release_data+10e (0xffffffff8157bf3e)
2 drops at unix_dgram_sendmsg+4f8 (0xffffffff81646a38)
4 drops at sk_stream_kill_queues+50 (0xffffffff81583970)
20 drops at tcp_rcv_state_process+1b6 (0xffffffff815eeda6)
22 drops at sk_stream_kill_queues+50 (0xffffffff81583970)
2 drops at skb_release_data+10e (0xffffffff8157bf3e)
48 drops at tcp_rcv_state_process+1b6 (0xffffffff815eeda6)
53 drops at sk_stream_kill_queues+50 (0xffffffff81583970)

From now on I'm stuck. I've checked that tcp_rcv_state_process and sk_stream_kill_queues are functions in Linux Kernel but I don't know by what they are controlled. I've jumped on this issue because in my node some application timeout in a expected manner.

Any advice how could I go on?

Best Answer

In order to progress you need to install kernel debug modules and 'elfutils' package. On Centos 7:

#debuginfo-install kernel
#yum install elfutils

After that you can find source code position in kernel, that corresponds to the address from dropwatch. For example you have "57 drops at tcp_rcv_state_process+1b6 (0xffffffff815eeda6)"

#eu-addr2line -f -k 0xffffffff815eeda6
tcp_rcv_state_process
net/ipv4/tcp_input.c:5834

In this case it will drop data from tcp packet if it is a SYN packet.