So, this is a neat question.
Initially, I was surprised that you saw any connections in SYN_RECV state with SYN cookies enabled. The beauty of SYN cookies is that a server can statelessly participate in the TCP 3-way handshake using cryptography, so I would expect the server not to track half-open connections at all, because that is the very state it isn't keeping.
In fact, a quick peek at the source (tcp_ipv4.c) shows interesting information about how the kernel implements SYN cookies. Essentially, despite turning them on, the kernel behaves as it would normally until its queue of pending connections is full. This explains your existing list of connections in SYN_RECV state.
Only when the queue of pending connections is full, AND another SYN packet (connection attempt) is received, AND it has been more than a minute since the last warning message, does the kernel send the warning message you have seen ("sending cookies"). SYN cookies are sent even when the warning message isn't; the warning message is just to give you a heads up that the issue hasn't gone away.
Put another way, if you turn off SYN cookies, the message will go away. That is only going to work out for you if you are no longer being SYN flooded.
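If you want to confirm this behaviour on your box, a couple of quick checks (a sketch; `tcp_max_syn_backlog` is the knob that sets the pending-queue depth on typical kernels, but verify on yours):

```
# Is the syncookies knob actually on, and how deep may the queue of
# pending connections grow before cookies kick in?
sysctl net.ipv4.tcp_syncookies net.ipv4.tcp_max_syn_backlog

# How many half-open connections are sitting in SYN_RECV right now?
netstat -ant | grep -c SYN_RECV
```

If the SYN_RECV count sits near the backlog limit, you are right on the edge of cookie territory.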
To address some of the other things you've done:
`net.ipv4.tcp_synack_retries`:
- Increasing this won't have any positive effect for those incoming connections that are spoofed, nor for any that receive a SYN cookie instead of server-side state (no retries for them either).
- For incoming spoofed connections, increasing this increases the number of packets you send to a fake address, and possibly the amount of time that the spoofed address stays in your connection table (this could be a significant negative effect).
- Under normal load / number of incoming connections, the higher this is, the more likely you are to quickly / successfully complete connections over links that drop packets. There are diminishing returns for increasing this.
`net.ipv4.tcp_syn_retries`: Changing this cannot have any effect on inbound connections; it only affects outbound connections.
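For reference, here is how you would inspect and adjust these knobs (a sketch; the value 3 below is purely illustrative, not a recommendation):

```
# Show the current retry counts (typical Linux defaults are 5 and 6,
# but check your own kernel)
sysctl net.ipv4.tcp_synack_retries net.ipv4.tcp_syn_retries

# Illustrative only: lower SYN-ACK retries so spoofed half-open entries
# age out of the connection table sooner
sysctl -w net.ipv4.tcp_synack_retries=3
```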
The other variables you mention I haven't researched, but I suspect the answers to your question are pretty much right here.
If you aren't being SYN flooded and the machine is responsive to non-HTTP connections (e.g. SSH), I think there is probably a network problem, and you should have a network engineer help you look at it. If the machine is generally unresponsive even when you aren't being SYN flooded, it sounds like a serious load problem if it affects the creation of TCP connections (which is pretty low-level and not resource-intensive).
What I would do in this situation is run `strace -f -p <PID> -tt -T -s 500 -o trace.txt` on one of your Apache processes during the `ab` test until you capture one of the slow responses, then have a look through `trace.txt`.
The `-tt` and `-T` options give you timestamps of the start and duration of each system call, to help identify the slow ones.
You might find a single slow system call such as `open()` or `stat()`, or you might find a quick call with (possibly multiple) `poll()` calls directly after it. If you find one that's operating on a file or network connection (quite likely), look backwards through the trace until you find that file or connection handle. The earlier calls on that same handle should give you an idea of what the `poll()` was waiting for.
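To make the slow calls jump out rather than scanning by eye, you can filter on the duration that `-T` appends to each line (a rough sketch; the 0.5-second threshold is arbitrary):

```
# Print every syscall in trace.txt that took longer than half a second.
# strace -T puts the elapsed time in <...> at the end of each line, so
# with < and > as field separators the duration is the next-to-last field.
awk -F'[<>]' '$(NF-1) + 0 > 0.5' trace.txt
```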
Good idea looking at the `-c` option. Did you ensure that the Apache child you were tracing served at least one of the slow requests during that time? (I'm not even sure how you would do this apart from running `strace` simultaneously on all children.)
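For what it's worth, attaching a single `strace` to every child at once is doable (a sketch; the process name is an assumption and may be `httpd` on your distribution):

```
# -c prints the per-syscall summary table, -f follows forks, and pgrep
# builds a comma-separated list of all apache2 PIDs to attach to
strace -c -f -p "$(pgrep -d, -x apache2)"
```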
Unfortunately, `strace` doesn't give us the complete picture of what a running program is doing. It only tracks system calls. A lot can happen inside a program that doesn't require asking the kernel for anything. To figure out if this is happening, you can look at the timestamps of the start of each system call. If you see significant gaps, that's where the time is going. This isn't easily greppable, and there are always small gaps between the system calls anyway.
Since you said the CPU usage stays low, it's probably not excessive things happening in between system calls, but it's worth checking.
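If you do want to check for gaps, here is a rough sketch that works on a `trace.txt` produced by the command above (with `-f -o`, each line starts with the child's PID followed by the `-tt` timestamp; the 0.5-second threshold is arbitrary):

```
# Flag gaps of more than 0.5s between consecutive syscall start times,
# tracked per child PID because -f interleaves the children's output
awk '{ split($2, t, ":"); now = t[1]*3600 + t[2]*60 + t[3]
       if (seen[$1] && now - last[$1] > 0.5) print "gap before:", $0
       last[$1] = now; seen[$1] = 1 }' trace.txt
```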
Looking more closely at the output from `ab`:
The sudden jump in the response times (it looks like there are no response times anywhere between 150ms and 3000ms) suggests that there is a specific timeout happening somewhere that gets triggered above around 256 simultaneous connections. A smoother degradation would be expected if you were running out of RAM, CPU cycles, or normal IO capacity.
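One cheap check for a queue limit being hit around that connection count (a sketch; the exact wording of the counters varies between kernel versions):

```
# If a listen or SYN queue is overflowing around 256 connections,
# these counters will climb while the ab run is in progress
netstat -s | grep -i -E 'listen|syn'
```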
Secondly, the slow `ab` responses show that the 3000ms were spent in the `connect` phase. Nearly all of the connections took around 30ms, but 5% took 3000ms, which is suspicious in itself: 3 seconds matches the classic initial TCP retransmission timeout, i.e. a SYN or SYN-ACK being lost and retried. This suggests that the network is the problem.
Where are you running `ab` from? Can you try it from the same network as the Apache machine?
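To compare vantage points without a full `ab` run, you can sample just the connect times (a sketch; the URL is a placeholder):

```
# Print the five slowest TCP connect times out of 100 requests
for i in $(seq 1 100); do
  curl -o /dev/null -s -w '%{time_connect}\n' http://your-server/
done | sort -n | tail -5
```

If the 3-second connects show up from one network but not the other, that points at the path rather than the server.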
For more data, try running `tcpdump` at both ends of the connection (preferably with `ntp` running at both ends so you can sync the two captures up) and look for any TCP retransmissions. Wireshark is particularly good for analysing the dumps because it highlights TCP retransmissions in a different colour, making them easy to find.
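A minimal capture for each end might look like this (the interface name and port are assumptions for your setup):

```
# Capture full packets on TCP port 80 and write them to a file that
# Wireshark can open; run the equivalent capture on the client too
tcpdump -i eth0 -s 0 -w server.pcap 'tcp port 80'
```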
It might also be worth looking at the logs of any network devices you have access to. I recently ran into a problem with one of our firewalls where it could handle the bandwidth in terms of kb/s, but it couldn't handle the number of packets per second it was receiving. It topped out at 140,000 packets per second. Some quick maths on your `ab` run leads me to believe you would have been seeing around 13,000 packets per second (ignoring the 5% of slow requests). Maybe this is the bottleneck you have reached. The fact that this happens around 256 might be purely a coincidence.
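To see the packet rate you are actually generating while the test runs, per-interface counters are enough (a sketch; `sar` comes with the sysstat package):

```
# The rxpck/s and txpck/s columns show packets per second for each
# interface, sampled every second
sar -n DEV 1
```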
I would use a firewall at the network perimeter to prevent/remediate SYN flood attacks (as well as DoS, DDoS, spoofing, port probes, address space probes, etc.). I don't want this type of stuff getting into my internal network, where I'd have to deal with it on a machine-by-machine basis.