Sockets – Why is it assumed that send may return with less than requested data transmitted on a blocking socket

ccoding-stylesendsockets

The standard method to send data on a stream socket has always been to call send with a chunk of data to write, check the return value to see if all data was sent and then keep calling send again until the whole message has been accepted.

For example this is a simple example of a common scheme:

int send_all(int sock, unsigned char *buffer, int len) {
  int nsent;

  while(len > 0) {
    nsent = send(sock, buffer, len, 0);
    if(nsent == -1) // error
      return -1;

    buffer += nsent;
    len -= nsent;
  }
  return 0; // ok, all data sent
}

Even the BSD manpage mentions that

…If no messages space is available at the socket to hold the message to be transmitted, then send() normally blocks…

Which indicates that we should assume that send may return without sending all data. Now I find this rather broken but even W. Richard Stevens assumes this in his standard reference book about network programming, not in the beginning chapters, but the more advanced examples uses his own writen (write all data) function instead of calling write.

Now I consider this still to be more or less broken, since if send is not able to transmit all data or accept the data in the underlying buffer and the socket is blocking, then send should block and return when the whole send request has been accepted.

I mean, in the code example above, what will happen if send returns with less data sent is that it will be called right again with a new request. What has changed since last call? At max a few hundred CPU cycles have passed so the buffer is still full. If send now accepts the data why could'nt it accept it before?

Otherwise we will end upp with an inefficient loop where we are trying to send data on a socket that cannot accept data and keep trying, or else?

So it seems like the workaround, if needed, results in heavily inefficient code and in those circumstances blocking sockets should be avoided at all an non blocking sockets together with select should be used instead.

Best Answer

The thing that is missing in above description is, in Unix, system calls might get interrupted with signals. That's exactly the reason blocking send(2) might return a short count.

Related Solutions

Linux – Case when blocking recv() returns less than requested bytes

recv will return when there is data in the internal buffers to return. It will not wait until there is 100 bytes if you request 100 bytes.

If you're sending 100 byte "messages", remember that TCP does not provide messages, it is just a stream. If you're dealing with application messages, you need to handle that at the application layer as TCP will not do it.

There are many, many conditions where a send() call of 100 bytes might not be read fully on the other end with only one recv call when calling recv(..., 100); here's just a few examples:

The sending TCP stack decided to bundle together 15 write calls, and the MTU happened to be 1460, which - depending on timing of the arrived data might cause the clients first 14 calls to fetch 100 bytes and the 15. call to fetch 60 bytes - the last 40 bytes will come the next time you call recv() . (But if you call recv with a buffer of 100 , you might get the last 40 bytes of the prior application "message" and the first 60 bytes of the next message)
The sender buffers are full, maybe the reader is slow, or the network is congested. At some point, data might get through and while emptying the buffers the last chunk of data wasn't a multiple of 100.
The receiver buffers are full, while your app recv() that data, the last chunk it pulls up is just partial since the whole 100 bytes of that message didn't fit the buffers.

Many of these scenarios are rather hard to test, especially on a lan where you might not have a lot of congestion or packet loss - things might differ as you ramp up and down the speed at which messages are sent/produced.

Anyway. If you want to read 100 bytes from a socket, use something like

int
readn(int f, void *av, int n)
{
    char *a;
    int m, t;

    a = av;
    t = 0;
    while(t < n){
        m = read(f, a+t, n-t);
        if(m <= 0){
            if(t == 0)
                return m;
            break;
        }
        t += m;
    }
    return t;
}

...

if(readn(mysocket,buffer,BUFFER_SZ) != BUFFER_SZ) {
  //something really bad is going on.

}

Sockets – Blocking sockets: when, exactly, does “send()” return

Does this mean that the send() call will always return immediately if there is room in the kernel send buffer?

Yes. As long as immediately means after the memory you provided it has been copied to the kernel's buffer. Which, in some edge cases, may not be so immediate. For instance if the pointer you pass in triggers a page fault that needs to pull the buffer in from either a memory mapped file or the swap, that would add significant delay to the call returning.

Is the behavior and performance of the send() call identical for TCP and UDP? If not, why not?

Not quite. Possible performance differences depends on the OS' implementation of the TCP/IP stack. In theory the UDP socket could be slightly cheaper, since the OS needs to do fewer things with it.

EDIT: On the other hand, since you can send much more data per system call with TCP, typically the cost per byte can be a lot lower with TCP. This can be mitigated with sendmmsg() in recent linux kernels.

As for the behavior, it's nearly identical.

For blocking sockets, both TCP and UDP will block until there's space in the kernel buffer. The distinction however is that the UDP socket will wait until your entire buffer can be stored in the kernel buffer, whereas the TCP socket may decide to only copy a single byte into the kernel buffer (typically it's more than one byte though).

If you try to send packets that are larger than 64kiB, a UDP socket will likely consistently fail with EMSGSIZE. This is because UDP, being a datagram socket, guarantees to send your entire buffer as a single IP packet (or train of IP packet fragments) or not send it at all.

Non blocking sockets behave identical to the blocking versions with the single exception that instead of blocking (in case there's not enough space in the kernel buffer), the calls fail with EAGAIN (or EWOULDBLOCK). When this happens, it's time to put the socket back into epoll/kqueue/select (or whatever you're using) to wait for it to become writable again.

As usual when working on POSIX, keep in mind that your call may fail with EINTR (if the call was interrupted by a signal). In this case you most likely want to call send() again.

Best Answer

Related Solutions

Linux – Case when blocking recv() returns less than requested bytes

Sockets – Blocking sockets: when, exactly, does “send()” return

Related Topic