Linux – Latency in TCP/IP-over-Ethernet networks

Tags: ethernet, latency, linux, networking, tcpip

What resources (books, Web pages etc) would you recommend that:

  • explain the causes of latency in TCP/IP-over-Ethernet networks;
  • mention tools for looking out for things that cause latency (e.g. certain entries in netstat -s);
  • suggest ways to tweak the Linux TCP stack to reduce TCP latency (Nagle, socket buffers etc).

The closest I am aware of is this document, but it's rather brief.

Alternatively, you're welcome to answer the above questions directly.

Edit: To be clear, the question isn't just about "abnormal" latency, but about latency in general. Additionally, it is specifically about TCP/IP-over-Ethernet and not about other protocols (even if they have better latency characteristics).

Best Answer

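With regard to kernel tunables for latency, one comes to mind (though newer kernels, 4.14 and later as far as I know, document it as a legacy option that no longer has any effect):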

echo 1 > /proc/sys/net/ipv4/tcp_low_latency

From the documentation:

If set, the TCP stack makes decisions that prefer lower latency as opposed to higher throughput. By default, this option is not set meaning that higher throughput is preferred. An example of an application where this default should be changed would be a Beowulf compute cluster. Default: 0
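To check the current setting from a program rather than the shell, something like the following could be used. This is only a minimal sketch: the helper name read_int_tunable is hypothetical, and it assumes the tunable file holds a single integer, as /proc/sys/net/ipv4/tcp_low_latency does.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one integer from a /proc/sys-style file.
 * Returns 0 and stores the value in *out on success, -1 on any failure
 * (file missing, not readable, or not starting with an integer). */
static int read_int_tunable(const char *path, int *out) {
    assert(path != NULL && out != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL) {
        return -1;
    }

    int ok = (fscanf(f, "%d", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

Calling read_int_tunable("/proc/sys/net/ipv4/tcp_low_latency", &value) and checking for a 0 return gives the same information as cat on that file. Writing the value still needs root, so the echo above remains the simplest way to change it.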

You can also disable Nagle's algorithm in your application (Nagle buffers small writes until previously sent data has been acknowledged or a full maximum-size segment is ready) with something like:

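#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>       /* close() */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_NODELAY */

/* Print the error for the failing call and exit. */
static void errmsg(const char *msg) {
    perror(msg);
    exit(1);
}

int main(void) {
    int optval = 1;
    int mysock;

    if ((mysock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0) {
        errmsg("socket failed");
    }

    /* TCP_NODELAY is a TCP-level option, so the level must be
       IPPROTO_TCP rather than SOL_SOCKET. */
    if (setsockopt(mysock, IPPROTO_TCP, TCP_NODELAY, &optval, sizeof(optval)) < 0) {
        errmsg("setsockopt failed");
    }

    /* Some more code here ... */

    close(mysock);
    return 0;
}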

The "opposite" of this option is TCP_CORK, which holds back partial frames so that output is coalesced into full-sized segments until the cork is removed. Beware, however: TCP_NODELAY might not always do what you expect, and in some cases it can hurt performance. For example, if you are sending bulk data, you will want to maximize the payload carried per packet, so set TCP_CORK. If you have an application that requires immediate interactivity (or where the response is much larger than the request, negating the overhead), use TCP_NODELAY. On another note, this behavior (TCP_CORK in particular) is Linux-specific and BSD is likely different, so caveat administrator.
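For the bulk-data case, corking might look roughly like the sketch below. The helper names set_cork and send_corked are hypothetical. The sketch assumes Linux and a connected TCP socket, and it treats short writes as errors to stay small.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_CORK (Linux-specific) */

/* Hypothetical helper: turn TCP_CORK on (1) or off (0) for a TCP socket.
 * Returns 0 on success, -1 on failure with errno set by setsockopt(). */
static int set_cork(int fd, int on) {
    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Hypothetical helper: send a header and a body under one cork.
 * While the cork is set the kernel holds back partial frames; clearing
 * it flushes whatever is still queued. Returns 0 on success, -1 if
 * corking, either send, or uncorking fails. */
static int send_corked(int fd, const void *hdr, size_t hdr_len,
                       const void *body, size_t body_len) {
    if (set_cork(fd, 1) < 0) {
        return -1;
    }

    int rc = 0;
    /* MSG_NOSIGNAL: report a broken connection as an error, not SIGPIPE. */
    if (send(fd, hdr, hdr_len, MSG_NOSIGNAL) != (ssize_t)hdr_len ||
        send(fd, body, body_len, MSG_NOSIGNAL) != (ssize_t)body_len) {
        rc = -1;
    }

    /* Always remove the cork, even if a send failed. */
    if (set_cork(fd, 0) < 0) {
        rc = -1;
    }
    return rc;
}
```

A caller would use send_corked(sock, header, header_len, payload, payload_len) in place of two separate send() calls. Whether this actually beats plain Nagle or TCP_NODELAY depends on the workload, so measure it.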

Make sure you do thorough testing with your application and infrastructure.