Duplicate TCP Traffic for Benchmarking – How to Guide

Tags: benchmark, debian, iptables, proxy, route

Infrastructure: Servers in Datacenter, OS – Debian Squeeze, Webserver – Apache 2.2.16


Situation:

The live server is used by our customers every day, which makes it impossible to test adjustments and improvements on it.
Therefore we would like to duplicate the inbound HTTP traffic on the live server to one or more remote servers in real time. The traffic has to be passed to the local web server (in this case Apache) AND to the remote server(s). That way we can adjust configurations and run different or updated code on the remote server(s) for benchmarking and comparison with the current live server.
Because of our client structure, the web server currently listens on approx. 60 additional ports besides 80 and 443.


Question:
How can this duplication to one or multiple remote servers be implemented?

We have already tried:

  • agnoster duplicator – this would require one open session per port, which is not feasible for us. (https://github.com/agnoster/duplicator)
  • kklis proxy – only forwards traffic to the remote server, but does not pass it to the local web server. (https://github.com/kklis/proxy)
  • iptables – DNAT only forwards the traffic, but does not pass it to the local web server.
  • iptables – TEE only duplicates to servers in the local network; our servers are not located in the same network due to the structure of the datacenter.
  • The alternatives suggested for the question "duplicate tcp traffic with a proxy" at stackoverflow (https://stackoverflow.com/questions/7247668/duplicate-tcp-traffic-with-a-proxy) were unsuccessful. As mentioned, TEE does not work with remote servers outside the local network. teeproxy is no longer available (https://github.com/chrislusf/tee-proxy) and we could not find it anywhere else.
  • We have added a second IP address (which is in the same network) and assigned it to eth0:0 (the primary IP address is assigned to eth0). We had no success combining this new IP or the virtual interface eth0:0 with the iptables TEE function or with routes.
  • The alternatives suggested for the question "Duplicate incoming TCP traffic on Debian Squeeze" were unsuccessful. The cat|nc sessions (cat /tmp/prodpipe | nc 127.0.0.1 12345 and cat /tmp/testpipe | nc 127.0.0.1 23456) are interrupted after every request/connect by a client without any notice or log entry. Keepalive did not change this. TCP packets were not transported to the remote system.
  • Additional attempts with different options of socat (HowTo: http://www.cyberciti.biz/faq/linux-unix-tcp-port-forwarding/ , https://stackoverflow.com/questions/9024227/duplicate-input-unix-stream-to-multiple-tcp-clients-using-socat) and similar tools were unsuccessful, because the provided TEE function writes to the file system only.
  • Of course, googling and searching for this "problem" or setup was unsuccessful as well.

We are running out of options here.

Is there a method to disable the enforcement of "server in local network" of the TEE function when using IPTABLES?

Can our goal be achieved by different usage of IPTABLES or Routes?

Do you know a different tool for this purpose which has been tested and works for these specific circumstances?

Is there a different source for tee-proxy (which would fit our requirements perfectly, AFAIK)?


Thanks in advance for your replies.

———-

Edit: 05.02.2014

Here is a Python script that would work the way we need it:

import socket
import sys, thread, time

def main(config, errorlog):
    sys.stderr = file(errorlog, 'a')

    for settings in parse(config):
        thread.start_new_thread(server, settings)

    while True:
        time.sleep(60)

def parse(configline):
    settings = list()
    for line in file(configline):
        parts = line.split()
        settings.append((int(parts[0]), int(parts[1]), parts[2], int(parts[3])))
    return settings

def server(*settings):
    # settings = (listen port, local forward port, remote host, remote port)
    try:
        dock_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        dock_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        dock_socket.bind(('', settings[0]))
        dock_socket.listen(5)

        while True:
            # Connections are handled one at a time per listening port.
            client_socket = dock_socket.accept()[0]

            # A single recv() returns at most 1024 bytes of the request.
            client_data = client_socket.recv(1024)
            sys.stderr.write("[OK] Data received:\n %s \n" % client_data)

            print "Forward data to local port: %s" % (settings[1])
            local_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            local_socket.connect(('127.0.0.1', settings[1]))
            local_socket.sendall(client_data)

            # Only the first 1024 bytes of the local response are relayed.
            print "Get response from local socket"
            client_response = local_socket.recv(1024)
            local_socket.close()

            print "Send response to client"
            client_socket.sendall(client_response)
            print "Close client socket"
            client_socket.close()

            # The remote copy is sent after the client was answered;
            # the remote server's response is never read.
            print "Forward data to remote server: %s:%s" % (settings[2], settings[3])
            remote_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            remote_socket.connect((settings[2], settings[3]))
            remote_socket.sendall(client_data)

            print "Close remote sockets"
            remote_socket.close()
    except:
        print "[ERROR]: ",
        print sys.exc_info()
        raise

if __name__ == '__main__':
    main('multiforwarder.config', 'error.log')
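
One limitation of the script above is that it reads each request with a single recv(1024) call, so a request whose headers are longer than 1024 bytes, or that arrives in several TCP segments, is only partly forwarded. The following is a minimal sketch of reading until the blank line that ends the HTTP headers. It is written for Python 3 (the script above is Python 2), the name recv_http_head and the max_bytes limit are hypothetical, and it does not handle request bodies such as POST data.

```python
import socket


def recv_http_head(sock, max_bytes=65536):
    """Read from sock until the end of the HTTP headers (CRLF CRLF).

    Returns whatever was received if the peer closes the connection
    or if max_bytes is reached before the header terminator is seen.
    """
    received = bytearray()
    while b"\r\n\r\n" not in received and len(received) < max_bytes:
        chunk = sock.recv(1024)
        if not chunk:  # peer closed the connection
            break
        received += chunk
    return bytes(received)
```

In the script this would stand in for the line `client_data = client_socket.recv(1024)`; the same loop idea applies to reading the local server's response, although a response also needs its body, which this sketch does not cover.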

Usage notes for the forwarding script:
This script forwards a number of configured local ports to another local socket server and to a remote socket server.

Configuration:
Add one line per forwarded port to the config file (named port-forward.config here; the script as posted opens multiforwarder.config), with four space-separated fields as listed below.

Error messages are stored in the file 'error.log'.

The script parses the config file as follows:
Each config line is split on whitespace into
0: local port to listen on
1: local port to forward to
2: remote IP address of the destination server
3: remote port of the destination server
and the resulting settings are returned.
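
To make the four-field config format concrete, here is a small sketch of the same parsing step in Python 3. The Forward tuple, the parse_config name, the sample addresses (192.0.2.10 is a documentation address) and the skipping of blank and "#" lines are assumptions for illustration; the original script does none of that validation.

```python
from typing import List, NamedTuple


class Forward(NamedTuple):
    listen_port: int   # field 0: local port to listen on
    local_port: int    # field 1: local port to forward to
    remote_host: str   # field 2: remote IP address
    remote_port: int   # field 3: remote port


def parse_config(text: str) -> List[Forward]:
    """Parse config text with one 'listen local remote_ip remote_port' per line."""
    forwards = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comments are allowed and skipped.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) != 4:
            raise ValueError("expected 4 fields, got %d: %r" % (len(parts), line))
        forwards.append(
            Forward(int(parts[0]), int(parts[1]), parts[2], int(parts[3]))
        )
    return forwards
```

A config line such as `80 8080 192.0.2.10 80` would then mean: listen on port 80, pass the data to the local server on port 8080, and send a copy to 192.0.2.10 port 80.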

Best Answer

It is impossible. TCP is a stateful protocol. The end user's computer is involved in every step of the connection, and it will never answer two separate servers trying to communicate with it. All you can do is collect all HTTP requests on the web server or on some proxy and replay them. But that will not give you the exact concurrency or traffic conditions of a live server.
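
As a rough illustration of the "collect and replay" approach, the sketch below pulls the method and path out of access-log lines and sends them one after another to a test server. It assumes Python 3 and a log in Apache's common or combined format, where the request line appears in double quotes; the names parse_requests and replay are hypothetical. An access log records neither request headers nor bodies, so only simple requests can be reproduced this way, and the timing of the original traffic is lost.

```python
import http.client
import re

# Matches the quoted request line, e.g. "GET /index.html HTTP/1.1"
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"')


def parse_requests(log_lines):
    """Yield (method, path) for every log line with a parsable request line."""
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            yield match.group("method"), match.group("path")


def replay(requests, host, port=80, timeout=5.0):
    """Send each recorded request to host:port and return the status codes."""
    statuses = []
    for method, path in requests:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        try:
            conn.request(method, path)  # no headers or body are replayed
            response = conn.getresponse()
            response.read()  # drain the body before closing
            statuses.append(response.status)
        finally:
            conn.close()
    return statuses
```

Calling `replay(parse_requests(open("access.log")), "test-server.example")` would replay a recorded log sequentially; both the file name and the host name are placeholders.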
