Iptables: match only the first packet of an established TCP connection


In my Apache log files I find a lot of entries that contain "GET /w00tw00t.at.ISC.SANS.DFind:) HTTP/1.1" 400 or similar crap. They come from connections that violate RFC 2616 (HTTP/1.1 requests without a hostname).

I don't want my log files spammed with these messages, so I want to reject those connections using iptables. To do that, I want to search the packet's payload for the string "HTTP/1.1" followed by two consecutive CR/LF pairs (CR/LF/CR/LF), which gives in total the hex string 485454502f312e310d0a0d0a.
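As a quick sanity check, the hex string can be derived on the command line. This is only a sketch: it assumes a POSIX shell with `printf`, `od` and `tr` available, and the variable name `hex` is made up for illustration.

```shell
#!/bin/sh
# Encode "HTTP/1.1" followed by CR LF CR LF as a continuous hex string.
# od prints each byte as a two-digit hex pair; tr strips spaces and newlines.
hex=$(printf 'HTTP/1.1\r\n\r\n' | od -An -tx1 | tr -d ' \n')
echo "$hex"
```

The result should be the 24 hex digits used in the `--hex-string` option of the script further down.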

But it's stupid to waste CPU cycles searching for this string in all TCP packets when I know it is in the very first packet. It would even be wrong, because "HTTP/1.1" followed by two consecutive CR/LF pairs might be a legitimate part of the transmission inside the payload of an HTTP request.

A solution for this problem is described at http://spamcleaner.org/en/misc/w00tw00t.html, but I don't understand the part that identifies the first packet of an established TCP connection.

What I don't understand is how all three packets of the initial TCP handshake (SYN, SYN+ACK, ACK) can be seen in the INPUT chain or in a chain that can only be reached from INPUT. As far as I understand iptables and its chains, the second packet (SYN+ACK) never goes through INPUT. I think it passes through OUTPUT because it's me (i.e. the server) who is sending it.

This is the script from spamcleaner.org. I changed only some comments in the first part of the script and left all commands unchanged:

#!/bin/bash

# allow loopback
iptables -A INPUT -i lo -j ACCEPT

# DROP any IP that is in the blacklist "w00tlist" and set the
# blacklist timeout to 6 hours
iptables -A INPUT -p tcp -m recent --name w00tlist --update --seconds 21600 -j DROP

# create the chain "w00tchain"
iptables -N w00tchain

# this chain will add the IP to the blacklist "w00tlist"
# and will reset the connection:
iptables -A w00tchain -m recent --set --name w00tlist -p tcp \
-j REJECT --reject-with tcp-reset

# create another chain named "w00t". Its purpose is to identify the first packet
# of a newly established TCP connection and to search for a string in it:
iptables -N w00t

# redirect all incoming (no outgoing!) TCP packets to the chain "w00t":
iptables -A INPUT -p tcp -j w00t

# all remaining rules are part of the chain "w00t"
#---------------------------------------------------------------
# all following comments in lowercase are unchanged from spamcleaner.org
# COMMENTS IN UPPERCASE ARE FROM ME
#---------------------------------------------------------------

# look for the SYN packet and create the list :
iptables -A w00t -m recent -p tcp --syn --dport 80 --set  

# look for the SYN,ACK packet and update the list :
iptables -A w00t -m recent -p tcp --tcp-flags PSH,SYN,ACK SYN,ACK --sport 80 --update
#---------------------------------------------------------------------------------
# THIS IS WHAT I DON'T UNDERSTAND:
# THE CHAIN w00t CAN ONLY BE REACHED FROM THE CHAIN INPUT. SO WE ARE DEALING HERE
# WITH PACKETS THAT THE CLIENT IS SENDING AND THAT THE SERVER IS RECEIVING. BUT IN
# STEP 2 OF THE TCP HANDSHAKE IT'S THE SERVER WHO IS SENDING AND THE CLIENT WHO IS
# RECEIVING. SO THE PACKET WITH SYN AND ACK SET AND WITH sport 80 GOES THROUGH THE
# CHAIN "OUTPUT", NOT "INPUT". SO HOW CAN IT BE DETECTED IN CHAIN w00t?
#---------------------------------------------------------------------------------

# look for the ACK packet and update the list :
iptables -A w00t -m recent -p tcp --tcp-flags PSH,SYN,ACK ACK --dport 80 --update

# look for the hexadecimal string in the first PSH+ACK.
# If found, redirect to w00tchain in order to blacklist the IP and
# to close the connection.
# Delete our list, we do not want to filter any further packet from that connection :
iptables -A w00t -m recent -p tcp --tcp-flags PSH,ACK PSH,ACK --dport 80 --remove \
-m string --to 80 --algo bm --hex-string '|485454502f312e310d0a0d0a|' -j w00tchain

And there is a second thing that I don't understand:

The last rule searches for the hex string in a packet that has its PSH and ACK flags set. But how can I be sure that PSH is set for my packet? I am not sure, but I think it is possible and legal to send TCP packets that have the PSH flag unset.
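To make the concern concrete, here is a small sketch of how a `--tcp-flags MASK COMP` test behaves, written as plain shell arithmetic on the TCP flag bits. The helper `tcp_flags_match` and the variable names are hypothetical and exist only for this illustration. It models the flag comparison only, not the rest of the rule.

```shell
#!/bin/sh
# TCP flag bit values as laid out in the TCP header.
FIN=1; SYN=2; RST=4; PSH=8; ACK=16

# tcp_flags_match FLAGS MASK COMP   (hypothetical helper, not an iptables command)
# Mirrors "--tcp-flags MASK COMP": only the bits in MASK are examined,
# and of those exactly the bits in COMP must be set.
tcp_flags_match() {
    [ $(( $1 & $2 )) -eq "$3" ]
}

data_with_psh=$(( PSH | ACK ))   # data segment with PSH set
data_without_psh=$ACK            # data segment with PSH not set

# Flag test of the last rule in the script: --tcp-flags PSH,ACK PSH,ACK
tcp_flags_match "$data_with_psh" $(( PSH | ACK )) $(( PSH | ACK )) \
    && echo "PSH+ACK segment: string rule would inspect it"
tcp_flags_match "$data_without_psh" $(( PSH | ACK )) $(( PSH | ACK )) \
    || echo "ACK-only segment: string rule would skip it"
```

Under this model, a request delivered in a segment without PSH would pass the last rule without being searched, which is exactly what I am worried about.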

EDIT:
There's a third question: what if the server receives two or more HTTP requests over TCP from the same IP address at the same time (each request with its own port number)?

Best Answer

Forget iptables. You can simply use mod_security with the nolog action. Something like this (untested):

SecRule REQUEST_URI "^/w00tw00t\.at\.ISC\.SANS\.DFind" phase:1,nolog,deny,id:1000

Or you can create a dummy virtual host with separate logs that just denies all requests, and configure it as the first virtual host. Clients that provide no hostname or an unknown hostname will always end up there.
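A rough sketch of such a catch-all virtual host follows (untested as well). It writes the fragment to a scratch file for review instead of touching the live configuration. Everything specific in it is an assumption: Apache 2.4 syntax (`Require all denied`), the placeholder name `catchall.invalid`, the log paths under `/var/log/apache2`, and a `combined` log format being defined in the main configuration.

```shell
#!/bin/sh
# Write a catch-all virtual host to a temporary file for inspection.
# Names and paths are placeholders; adapt them to your installation.
conf_file=$(mktemp)
cat > "$conf_file" <<'EOF'
<VirtualHost *:80>
    # Placeholder name; requests with no or an unknown hostname land here
    ServerName catchall.invalid

    # Separate logs keep the scanner noise out of the real sites' logs
    ErrorLog /var/log/apache2/catchall_error.log
    CustomLog /var/log/apache2/catchall_access.log combined

    # Deny every request
    <Location "/">
        Require all denied
    </Location>
</VirtualHost>
EOF
cat "$conf_file"
```

The fragment has to be loaded before any other virtual host on the same address and port so that Apache treats it as the default one.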