Docker – Publish Docker Swarm services on specific IP addresses

docker, docker-swarm, haproxy, ipvs

On CentOS 7.4 I am setting up a swarm in which I want to run multiple routers, all reachable on ports 80/443.
The purpose is to host multiple environments (test/staging…) on a single swarm, all symmetrically.

I am using Docker 17.12.0-ce and Traefik v1.4.6 as router.

The basic idea is to have a virtual IP address (VIP) per environment and publish the Traefik ports only on that address. This is impossible with Docker swarm, so I have to resort to having the Traefik instances listen on ports 81/82 etc. and somehow bring the traffic from VIP:80 to :81/:82.

Virtual IP addresses for all the environments across the swarm managers are handled by Keepalived.

Relevant docker service config for Traefik:

"Ports": [
          {
           "Protocol": "tcp",
           "TargetPort": 80,
           "PublishedPort": 81,
           "PublishMode": "ingress"
          },

# netstat -anp|grep 81
tcp6       7      0 :::81                   :::*                    LISTEN      4578/dockerd        

firewalld is set up to allow traffic to ports 80, 81, 82, etc

Accessing the backend services exposed by Traefik directly on port 81 on the VIP works.

Accessing port 80 on the VIP when nothing is configured on it correctly leads to connection refused.

The Traefik docker instance is running on the same host I'm using for the following tests.

I first tried with basic DNAT:

firewall-cmd --add-forward-port=port=80:proto=tcp:toport=81:toaddr=127.0.0.1

This led to timeouts: no connection appeared as established on the server, and tcpdump showed that the SYNs were being ignored.

Next I tried a slightly more specific DNAT:

firewall-cmd --add-rich-rule='rule family=ipv4 forward-port port=80 protocol=tcp to-port=81 to-addr=127.0.0.1'

with the same results.

I discovered GORB which seems tailored to my use case, and provisioned it with

Service:

{
  "host": "<VIP>",
  "port": 80,
  "protocol": "tcp",
  "method": "rr",
  "persistent": true,
  "flags": "sh-port"
}

Backend for said service:

{
  "host": "<VIP>",
  "port": 81,
  "method": "nat",
  "weight": 100,
  "pulse": {
     "type": "tcp",
     "interval": "30s",
     "args": null
  }
}

I verified the setup using ipvsadm and it seems correct:

# ipvsadm -l -n 
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP         <VIP>:80 rr (flag-2)
  ->        <VIP>:81              Masq    100    0          0     

In this case, while no connection appeared on the server, tcpdump showed SYN, SYN-ACK and ACK being exchanged, followed by the HTTP request and its ACK.
No other traffic passed and the request ultimately timed out on the client side.
ipvsadm registered the connection as active.

If I set up HAProxy to listen on VIP:80 and proxy the requests via HTTP to 127.0.0.1:81, everything works, but I'd like to avoid that: it requires all data to pass through HAProxy, which wastes resources for nothing and needs local configuration.

I'm out of ideas and I don't know how to further troubleshoot.

EDIT for clarification. My question is:
Is it possible to route traffic from VIP:80 to :81/:82 etc without using HAProxy or another process that would simply pump data to the real router (Traefik)?

Best Answer

We had a need to publish separate docker swarm services on the same ports, but on separate specific IP addresses. Here's how we did it.

Docker adds rules to the DOCKER-INGRESS chain of the nat table for each published port. The rules it adds are not IP-specific, hence normally any published port will be accessible on all host IP addresses. Here's an example of the rule Docker will add for a service published on port 80:

iptables -t nat -A DOCKER-INGRESS -p tcp -m tcp --dport 80 -j DNAT --to-destination 172.18.0.2:80

(You can view these by running iptables-save -t nat | grep DOCKER-INGRESS).

Our solution is to publish our services on different ports, and use a script that intercepts dockerd's iptables commands to rewrite them so they match the correct IP address and public port pair.

For example:

  • service #1 is published on port 1080, but should listen on 1.2.3.4:80
  • service #2 is published on port 2080, but should listen on 1.2.3.5:80

We then configure our script accordingly:

# cat /usr/local/sbin/iptables
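#!/bin/bash
# Wrapper around the real iptables binary: rewrites the DNAT rules that
# dockerd adds to DOCKER-INGRESS so they match a specific destination IP.

# Captures: (1) everything up to "-p tcp", (2) the published port match,
# (3) the DNAT target.
REGEX_INGRESS="^(.*DOCKER-INGRESS -p tcp) (--dport [0-9]+) (-j DNAT --to-destination .*)"
IPTABLES=/usr/sbin/iptables

SRV_1_IP=1.2.3.4
SRV_2_IP=1.2.3.5

# Log the final command, then run the real iptables.
ipt() {
  echo "EXECUTING: $*" >>/tmp/iptables.log
  $IPTABLES "$@"
}

if [[ "$*" =~ $REGEX_INGRESS ]]; then
  START=${BASH_REMATCH[1]}
  PORT=${BASH_REMATCH[2]}
  END=${BASH_REMATCH[3]}

  echo "REQUESTED: $*" >>/tmp/iptables.log

  # Map each published port to the IP address and public port it should serve.
  # START and END are left unquoted on purpose so they split back into arguments.
  case "$PORT" in
     '--dport 1080') ipt $START --dport 80 -d $SRV_1_IP $END; exit $?; ;;
     '--dport 2080') ipt $START --dport 80 -d $SRV_2_IP $END; exit $?; ;;
                  *) ipt "$@"; exit $?; ;;
  esac
fi

# Not a DOCKER-INGRESS DNAT rule: pass through unchanged.
echo "PASSING-THROUGH: $*" >>/tmp/iptables.log

$IPTABLES "$@"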

N.B. The script must be installed in dockerd's PATH ahead of your distribution's iptables command. On Debian Buster, iptables is installed to /usr/sbin/iptables, and dockerd's PATH has /usr/local/sbin ahead of /usr/sbin, so it makes sense to install the script at /usr/local/sbin/iptables. (You can check dockerd's PATH by running cat /proc/$(pgrep dockerd)/environ | tr '\0' '\012' | grep ^PATH).
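
One way to double-check this before relying on the wrapper is to resolve iptables against a given PATH string by hand. The following is a minimal sketch; first_in_path is a hypothetical helper name, not something Docker or iptables provides.

```shell
#!/bin/bash
# first_in_path (hypothetical helper): print the first executable file named $2
# found in the colon-separated PATH string $1; return 1 if none is found.
first_in_path() {
  local rest dir
  rest=$1:
  while [ -n "$rest" ]; do
    dir=${rest%%:*}     # next PATH entry
    rest=${rest#*:}     # remainder after that entry
    [ -n "$dir" ] || dir=.   # an empty PATH entry means the current directory
    if [ -f "$dir/$2" ] && [ -x "$dir/$2" ]; then
      printf '%s\n' "$dir/$2"
      return 0
    fi
  done
  return 1
}

# Usage on a Docker host (as root, so /proc/<pid>/environ is readable):
#   dockerd_path=$(tr '\0' '\n' < "/proc/$(pgrep -o dockerd)/environ" | sed -n 's/^PATH=//p')
#   first_in_path "$dockerd_path" iptables
```

If the result is the wrapper's path, dockerd should pick it up; if it is the distribution's binary, the wrapper is missing, not executable, or too late in the PATH.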

Now, when these docker services are launched, the iptables rules will be rewritten as follows:

iptables -t nat -A DOCKER-INGRESS -d 1.2.3.4/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 172.18.0.2:1080
iptables -t nat -A DOCKER-INGRESS -d 1.2.3.5/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 172.18.0.2:2080

The result is that requests for http://1.2.3.4/ go to service #1, while requests for http://1.2.3.5/ go to service #2.
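
To confirm the rewritten rules on a node, the iptables-save output can be condensed to one line per DNAT rule. This is only a sketch: list_ingress_dnat is a hypothetical helper, and it assumes the rules have the shape shown above (any extra match options are simply ignored).

```shell
#!/bin/bash
# list_ingress_dnat (hypothetical helper): read `iptables-save -t nat` output on
# stdin and print "<matched address> <matched port> <DNAT target>" for every
# DNAT rule in the DOCKER-INGRESS chain. Rules without -d are shown as "any".
list_ingress_dnat() {
  awk '
    $1 == "-A" && $2 == "DOCKER-INGRESS" {
      dst = "any"; port = "-"; target = "-"
      for (i = 3; i < NF; i++) {
        if ($i == "-d") dst = $(i + 1)
        else if ($i == "--dport") port = $(i + 1)
        else if ($i == "--to-destination") target = $(i + 1)
      }
      # Skip non-DNAT rules such as the final "-j RETURN".
      if (target != "-") print dst, port, target
    }
  '
}

# Usage on a swarm node (as root):
#   iptables-save -t nat | list_ingress_dnat
```

A rule that still shows "any" in the first column for one of the remapped ports suggests the wrapper was not in effect when dockerd created that rule.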

The script can be customised and extended according to your needs. It must be installed on every node to which you will be directing requests, and customised to each node's public IP addresses.
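
When customising the port-to-IP mapping, it can help to try the rewrite logic without touching iptables at all. Below is a dry-run sketch that reuses the regex from the wrapper and only prints the arguments it would pass on. rewrite_ingress is a hypothetical name, and the sample arguments assume dockerd calls iptables in the shape the regex expects.

```shell
#!/bin/bash
# Dry run of the wrapper's rewrite step: nothing is executed, the resulting
# iptables arguments are only printed. rewrite_ingress is a hypothetical name.
REGEX_INGRESS="^(.*DOCKER-INGRESS -p tcp) (--dport [0-9]+) (-j DNAT --to-destination .*)"

rewrite_ingress() {
  if [[ "$*" =~ $REGEX_INGRESS ]]; then
    local start=${BASH_REMATCH[1]} port=${BASH_REMATCH[2]} end=${BASH_REMATCH[3]}
    case "$port" in
      # Same example mapping as in the wrapper: published port -> IP:80.
      '--dport 1080') printf '%s\n' "$start --dport 80 -d 1.2.3.4 $end"; return ;;
      '--dport 2080') printf '%s\n' "$start --dport 80 -d 1.2.3.5 $end"; return ;;
    esac
  fi
  # Unmapped ports and unrelated commands are passed through unchanged.
  printf '%s\n' "$*"
}

# Example call with assumed dockerd-style arguments:
rewrite_ingress -t nat -A DOCKER-INGRESS -p tcp --dport 1080 \
  -j DNAT --to-destination 172.18.0.2:1080
```

Because the regex does not care whether the command is -A or -D, the same rewrite applies when dockerd later removes the rule, which is what lets the rewritten rule be deleted again.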
