How to set up Docker Swarm + Traefik 2.4 + domain-based routing on bare metal with CLI

docker-swarm reverse-proxy

I would like to scale my little Docker webapp and make it highly available. I have been using Docker for many years and K8s seems overly complicated, therefore I am looking into Docker Swarm.

Colorful IT architecture diagram

The idea is simple: a highly available load balancer is the first contact and forwards all TCP/IP traffic to 3 Docker Swarm master nodes, where Traefik 2.4 listens directly on the server's port. Traefik uses the HTTP host name configured on the service to forward each request to an appropriate container on one of the workers over the Docker network.

For simplicity we leave out https for now, as even plain http is not working for me. The load balancer is configured correctly, the Docker Swarm is up and running. This is how I start the services:

sudo docker network create --driver=overlay traefik-public

# reverse proxy service
sudo docker service create \
  --name traefik \
  -p 80:80 \
  --mount type=bind,source=/var/run/docker.sock,destination=/var/run/docker.sock \
  --mode=global \
  --constraint node.role==manager \
  --network traefik-public \
  traefik:2.4 \
    --providers.docker.swarmMode=true \
    --providers.docker.endpoint=unix:///var/run/docker.sock \
    --providers.docker.exposedbydefault=false \
    --providers.docker.watch=true \
    --providers.docker.network=traefik-public \
    --entryPoints.web.address=:80

# webapp A service
sudo docker service create \
  --replicas 5 \
  --name hostname \
  --constraint node.role!=manager \
  --network traefik-public \
  --publish published=8080,target=80 \
  --label  traefik.enabled=true \
  --label 'traefik.http.routers.hostname.rule=Host(`a.domain.tld`)' \
  --label  traefik.http.routers.hostname.entrypoints=http \
  --label  traefik.http.services.hostname.loadbalancer.server.scheme=http \
  --label  traefik.http.services.hostname.loadbalancer.server.port=8080 \
  nginxdemos/hello

For some reason there seems to be an error in the configuration. I have been trying to tweak it, but I either get an empty response or 404 page not found when using curl http://a.domain.tld. Latest error is level=error msg="Skip container : field not found, node: enabled" providerName=docker.

Assumptions:

  1. Traefik is running on Swarm master nodes to get Docker event
    notifications
  2. Traefik is listening directly on external port 80 of master nodes
  3. Traefik will recognize new services and route to containers based on domain name
  4. Multiple webapp containers of the same service can run on the same worker node

Main Question: how do I get the basic version up and running? What's wrong?

Further questions:

  1. Can I use env variables with services like with containers (for DB connection string)?

  2. How do I access Traefik dashboard? I assume every dashboard will show different data.

  3. How to add own SSL certificates to Traefik? Do Swarm services support local storage?
    (I am for easy solutions, happy to copy my .pem on all 3 nodes, once every year)

  4. How do I enable SSL and http redirect to https?

  5. Can I add paths to domains so http://a.domain.tld/api uses a different service?

  6. How to collect container logs? Will Elastic Filebeat work with worker containers?

Otherwise I am happy for any kind of feedback about the planned IT architecture.

Thanks,
bluepuma

Best Answer

Basic template for Docker Swarm with Traefik 2.4, domain-based routing, regular SSL and scalable web-app, all on bare metal servers.

Traefik runs on all master nodes and listens directly on the host's ports 0.0.0.0:80 and 0.0.0.0:443. http is upgraded to https, web-apps are started on worker nodes and are automatically registered with their domain. Traefik then load balances all incoming requests and forwards them to the matching worker containers.

Note that this is NOT a failover solution. You need to have a load balancer in front of this setup or a floating IP which you can switch over if a server fails.

Requirements: Set up a Docker Swarm, which is out of scope here. Every Docker Swarm master node that Traefik runs on needs a local folder with the config.yml and the SSL certificate. Alternatively you can use a Docker volume, which can be a remote NFS mount.

traefik.yml

version: '3.8'
services:
    traefik:
        image: traefik:v2.4
        ports:
          - target: 80
            published: 80
            protocol: tcp
            mode: host
          - target: 443
            published: 443
            protocol: tcp
            mode: host
        command:
          - --providers.docker.swarmMode=true
          - --providers.docker.exposedByDefault=false
          - --providers.docker.network=proxy
          - --providers.file.filename=/data/traefik/config.yml
          - --providers.file.watch=true
          - --entrypoints.web.address=:80
          - --entrypoints.web.http.redirections.entryPoint.to=websecure
          - --entrypoints.web.http.redirections.entryPoint.scheme=https
          - --entrypoints.websecure.address=:443
          - --accesslog
          - --log.level=info
        environment:
          - TZ=Europe/Berlin
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock:ro
          - /data/traefik:/data/traefik
        networks:
          - proxy
        deploy:
            mode: global
            placement:
                constraints:
                    - node.role == manager
networks:
    proxy:
        external: true

config.yml, mounted from the local folder. The SSL certificate settings NEED to be in a separate file that is loaded through the file provider.

tls:
    certificates:
      - certFile: /data/traefik/certs/wildcard.crt
        keyFile: /data/traefik/certs/wildcard.key
      - certFile: /data/traefik/certs/another-certificate.crt
        keyFile: /data/traefik/certs/another-certificate.key

    stores:
        default:
            defaultCertificate:
                certFile: /data/traefik/certs/wildcard.crt
                keyFile: /data/traefik/certs/wildcard.key
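Because /data/traefik is mounted at the same path inside the container, the certificate paths in config.yml can be checked directly on each master node before deploying. The following is only a minimal sketch: `check_tls_files` is a hypothetical helper name, not part of Traefik or Docker, and it assumes one path per line and no spaces in the paths.

```shell
#!/bin/sh
# Hypothetical helper (not part of Traefik): read every certFile/keyFile
# path from a Traefik dynamic config and report the ones that are not readable.
# Assumes one "certFile: /path" or "keyFile: /path" entry per line, no spaces in paths.
check_tls_files() {
  config="$1"
  missing=0
  # Strip indentation, an optional list dash and the key, leaving only the path.
  for path in $(sed -nE 's/^[[:space:]-]*(certFile|keyFile):[[:space:]]*//p' "$config" | sort -u); do
    if [ ! -r "$path" ]; then
      echo "missing or unreadable: $path" >&2
      missing=1
    fi
  done
  # Return 0 when every referenced file is readable, 1 otherwise.
  return "$missing"
}
```

Run on a master node it could be called as `check_tls_files /data/traefik/config.yml && echo "certificates ok"`. It only checks that the files exist and are readable, not that certificate and key belong together.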

Command line, start your engines :-)

# create network (just once)
docker network create --driver=overlay proxy

# start traefik via traefik.yml
docker stack deploy --compose-file traefik.yml traefik

# start a web-app with its domain name
docker service create \
  --replicas 15 \
  --name web-app \
  --constraint node.role!=manager \
  --network proxy \
  --label  traefik.enable=true \
  --label 'traefik.http.routers.traefik.rule=Host(`app.doma.in`)' \
  --label  traefik.http.routers.traefik.entrypoints=websecure \
  --label  traefik.http.routers.traefik.tls=true \
  --label  traefik.http.services.hostname.loadbalancer.server.port=80 \
  nginxdemos/hello
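In the command above the router is named traefik and the service is named hostname. As far as I know Traefik tolerates this as long as the container defines exactly one router and one service, but with several web-apps it is easier to give every app its own name and use it for both. A minimal sketch, assuming a hypothetical helper `traefik_labels` that only prints the flags and never calls Docker:

```shell
#!/bin/sh
# Hypothetical helper: print the --label flags for one web-app, one per line.
# Arguments: name (used for router AND service), domain, container port.
traefik_labels() {
  name="$1"
  domain="$2"
  port="$3"
  # The \` sequences produce the literal backticks Traefik expects in Host(`...`).
  printf '%s\n' \
    "--label traefik.enable=true" \
    "--label traefik.http.routers.${name}.rule=Host(\`${domain}\`)" \
    "--label traefik.http.routers.${name}.entrypoints=websecure" \
    "--label traefik.http.routers.${name}.tls=true" \
    "--label traefik.http.services.${name}.loadbalancer.server.port=${port}"
}
```

Calling `traefik_labels web-app app.doma.in 80` prints five flags that can be reviewed and then used in the docker service create call. Note that the label is traefik.enable, not traefik.enabled; the latter produces the "field not found, node: enabled" error.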

You can lower the verbosity set by log.level (or remove the option completely), and the accesslog can be removed as well. Alternatively, the two kinds of log can be written to two different files. The Traefik dashboard is still missing in this config.

For better security you can use docker-socket-proxy which @webjocky describes in his pastebin in this discussion.
