So, as you've found out, TCP congestion control is a pretty complicated area.
For this particular case, because the requests are small, you'll want to keep connections open as much as possible. One connection per request is going to take five packets each, whereas you can get the average down to a little more than two packets per request if you keep connections around.
TCP_NODELAY is the right thing for a game server: you want your 256 bytes delivered right away, and since that's less than a full segment, Nagle's algorithm will hold it back while earlier data is still unacknowledged unless you set TCP_NODELAY.
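For illustration, here's a minimal Python sketch of turning Nagle off on a socket. The helper name make_nodelay_socket is made up for this example; the option itself is the standard TCP_NODELAY socket option, and your server will set it through whatever language or framework it is written in.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 asks the stack to send small writes immediately
    # instead of waiting to coalesce them into a full segment.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# Read the option back to confirm it took effect.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```

On the server side the option has to be set on each accepted connection socket too, unless your platform inherits it from the listening socket.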
If your servers have loads of memory, the memory options are no big deal; recent kernels get them right by default.
As for congestion control algorithms, you spotted Westwood. The other option is CUBIC. You can just go with one, or you can do some research and benchmark them. That could be quite a bit of work, but for 10M clients it's worth it. So I'd look into running a simulation: a traffic generator on a Mac or three (since they have the same TCP implementation as the phone), a Linux box in between acting as a router (more about this shortly), and one of your servers, to see how it goes.
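If you do benchmark, it may help that Linux lets you pick the algorithm per socket, so one test server can compare both without changing the system-wide setting. A rough Python sketch, assuming a Linux host where Python exposes socket.TCP_CONGESTION; the helper name set_congestion_control is made up, and it returns None when the platform or kernel refuses the request (for example, when the algorithm's module isn't loaded or isn't permitted for unprivileged processes):

```python
import socket


def set_congestion_control(sock: socket.socket, name: str):
    """Try to select a congestion control algorithm for one socket.

    Returns the algorithm name the kernel reports afterwards, or None if
    the option is unavailable or the kernel rejects the request.
    """
    opt = getattr(socket, "TCP_CONGESTION", None)  # Linux-only constant
    if opt is None:
        return None
    try:
        sock.setsockopt(socket.IPPROTO_TCP, opt, name.encode())
        raw = sock.getsockopt(socket.IPPROTO_TCP, opt, 16)
    except OSError:
        return None
    # The kernel returns a NUL-padded name.
    return raw.split(b"\x00", 1)[0].decode()


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
chosen = set_congestion_control(sock, "cubic")
sock.close()
print("congestion control in use:", chosen)
```

Swapping "cubic" for "westwood" lets the same harness exercise the other candidate, provided that algorithm is available on the box.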
Now, that middle Linux box should run ns-3 so you can simulate a more complicated path than just an Ethernet switch. You then capture some packet traces on the sending end of the TCP connections, and analyse them with tcptrace or the tcptrace graphing modes of Wireshark. The tcptrace documentation is a good introduction to analysing TCP congestion behaviour.
You wouldn't normally see a TCP RST. I suppose an application at layer 7 aborting might generate an RST, but I think you'll find that an RST is most often generated by a firewall between the two hosts. Here's a list of possible reasons from the TCP/IP Guide:
Receipt of any TCP segment from any device with which the device receiving the segment does not currently have a connection (other than a SYN requesting a new connection).
Receipt of a message with an invalid or incorrect Sequence Number or Acknowledgment Number field, indicating the message may belong to a prior connection or is spurious in some other way.
Receipt of a SYN message on a port where there is no process listening for connections.
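The last case in that list is easy to reproduce on loopback. Here's a small Python sketch, assuming a typical Linux or macOS stack: it reserves a port without listening on it, then tries to connect. The stack answers the SYN with an RST, which the client sees as "connection refused". The function name is made up for the example.

```python
import socket


def connect_to_unlistened_port() -> str:
    """Connect to a port that is bound but has no listener."""
    # Bind to port 0 so the OS picks a free port. We never call listen(),
    # so nothing accepts connections there, and keeping the socket bound
    # stops another process from grabbing the port mid-test.
    holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    holder.bind(("127.0.0.1", 0))
    port = holder.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(2.0)
    try:
        client.connect(("127.0.0.1", port))
        return "connected"
    except ConnectionRefusedError:
        # The peer answered our SYN with an RST.
        return "refused"
    except socket.timeout:
        # No answer at all, e.g. a firewall silently dropping the SYN.
        return "timeout"
    finally:
        client.close()
        holder.close()


outcome = connect_to_unlistened_port()
print(outcome)
```

A firewall that rejects rather than drops looks the same as "refused" from the client's side, which is why a packet trace is the only way to tell who actually sent the RST.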
My understanding is that a TCP connection is identified by the IP address and port number at each end, so changing the IP breaks that connection. nc has no way of knowing the IP changed, so it keeps sending data to the original IP until the session times out.
See RFC 793 (Transmission Control Protocol), specifically section 2.7:
2.7. Connection Establishment and Clearing
To identify the separate data streams that a TCP may handle, the TCP provides a port identifier. Since port identifiers are selected independently by each TCP they might not be unique. To provide for unique addresses within each TCP, we concatenate an internet address identifying the TCP with a port identifier to create a socket which will be unique throughout all networks connected together.
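You can see those address-plus-port pairs from code as well. A minimal Python sketch over loopback (the variable names are just for this example): each end of a connection reports its own socket address and its peer's, and the two views mirror each other.

```python
import socket

# Listening socket on a free loopback port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)

# Open one connection to it and accept it on the server side.
client = socket.create_connection(server.getsockname())
accepted, _ = server.accept()

# Each view is (local (ip, port), remote (ip, port)).
client_view = (client.getsockname(), client.getpeername())
server_view = (accepted.getsockname(), accepted.getpeername())
print("client sees:", client_view)
print("server sees:", server_view)

for s in (client, accepted, server):
    s.close()
```

Both ends are pinned to those addresses for the life of the connection, so if one host's IP changes, segments sent to the old address no longer match anything at the other end.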
I suggest using Wireshark or another packet sniffer to watch the traffic for yourself and see it in action.