Docker-Swarm with RabbitMQ Autocluster

Tags: docker, docker-swarm, rabbitmq

We are facing the following problem: how to run RabbitMQ in a Docker Swarm with persistent data.

Currently we have the following setup in place:

  • Docker Swarm with 3 nodes
  • GlusterFS as a replicated filesystem between all nodes
  • RabbitMQ with Consul, image: gavinmroy/alpine-rabbitmq-autocluster

This works fine most of the time, but now we have to use durable queues to persist data.

We have tried using --hostname or setting RABBITMQ_NODENAME; then we get a subdirectory for every started node, like "rabbit@CONTAINERID". The problem: when the containers are restarted, a new folder is used to persist the data (new container ID). Any suggestions on how to get a working setup that still uses the Docker Swarm features?

Best Answer

Although this has been dormant for a while, this is what we use.

docker-compose.yaml:

version: "3.6"

services:
  rabbitmq-01:
    image: rabbitmq:3.8-management
    hostname: rabbitmq-01
    environment:
      - RABBITMQ_ERLANG_COOKIE=<erlang-cookie-value>
    secrets:
      - source: main-config
        target: /etc/rabbitmq/rabbitmq.conf
    configs:
      - source: definitions
        target: /etc/rabbitmq/definitions.json
    networks:
      - main
      - consul_main
    volumes:
      - rabbitmq-01-data:/var/lib/rabbitmq
    deploy:
      mode: global
      placement:
        constraints: [node.labels.rabbitmq1 == true]

  rabbitmq-02:
    image: rabbitmq:3.8-management
    hostname: rabbitmq-02
    environment:
      - RABBITMQ_ERLANG_COOKIE=<erlang-cookie-value>
    secrets:
      - source: main-config
        target: /etc/rabbitmq/rabbitmq.conf
    configs:
      - source: definitions
        target: /etc/rabbitmq/definitions.json
    networks:
      - main
      - consul_main
    volumes:
      - rabbitmq-02-data:/var/lib/rabbitmq
    deploy:
      mode: global
      placement:
        constraints: [node.labels.rabbitmq2 == true]

  rabbitmq-03:
    image: rabbitmq:3.8-management
    hostname: rabbitmq-03
    environment:
      - RABBITMQ_ERLANG_COOKIE=<erlang-cookie-value>
    secrets:
      - source: main-config
        target: /etc/rabbitmq/rabbitmq.conf
    configs:
      - source: definitions
        target: /etc/rabbitmq/definitions.json
    networks:
      - main
      - consul_main
    volumes:
      - rabbitmq-03-data:/var/lib/rabbitmq
    deploy:
      mode: global
      placement:
        constraints: [node.labels.rabbitmq3 == true]

secrets:
  main-config:
    file: secret.rabbitmq.conf

configs:
  definitions:
    file: definitions.json

networks:
  main:
  consul_main:
    external: true

volumes:
  rabbitmq-01-data:
  rabbitmq-02-data:
  rabbitmq-03-data:

rabbitmq.conf:

# The standard config that each cluster member gets
default_user = rmqadmin
# TODO Might wanna encrypt or use Docker secret into an env var
default_pass = <password>

default_vhost = /

loopback_users.admin = false

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul
cluster_formation.consul.host = consul.server
cluster_formation.node_cleanup.only_log_warning = true
cluster_formation.consul.svc_addr_auto = true
cluster_formation.consul.svc = rabbitmq-config-file

cluster_partition_handling = autoheal

# Flow control is triggered if memory usage is above 80%.
vm_memory_high_watermark.relative = 0.8

# Flow control is triggered if free disk space is below 5GB.
disk_free_limit.absolute = 5GB

management.load_definitions = /etc/rabbitmq/definitions.json

definitions.json: as exported from https://your-rabbitmq-admin-ui/api/definitions
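In case it helps, here is a rough sketch of how that export could be scripted against the management HTTP API. The function name export_definitions and the CURL override are just illustrative helpers of mine, and the URL, user and password are placeholders, so adjust them to your setup.

```shell
#!/bin/sh
# Sketch: download the broker definitions from the management HTTP API.
# Assumptions: the management plugin is reachable at the given base URL and
# the user is allowed to read /api/definitions. CURL can be overridden for a
# dry run; it defaults to the real curl binary.
export_definitions() {
  base_url="$1"                   # e.g. https://your-rabbitmq-admin-ui
  user="$2"                       # e.g. rmqadmin
  pass="$3"                       # placeholder, never commit a real password
  out="${4:-definitions.json}"    # file referenced by the compose "configs" entry
  "${CURL:-curl}" --fail --silent --show-error \
    --user "${user}:${pass}" \
    --output "$out" \
    "${base_url}/api/definitions"
}

# Usage (not executed here):
#   export_definitions https://your-rabbitmq-admin-ui rmqadmin '<password>'
```

The resulting file is the one the compose file mounts at /etc/rabbitmq/definitions.json.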

Notice that the RabbitMQ services are deployed in "global" mode and then each constrained to a node label, so every service runs on exactly one node.

We have three nodes in our cluster which bear the "rabbitmqX" label like so:

$ docker node inspect node01 | grep -i -C 20 label
[
    {
        ...
        "Spec": {
            "Labels": {
                "rabbitmq1": "true"
            },
        },
        ...
    }

One node has "rabbitmq1": "true", another has "rabbitmq2": "true", and the last has "rabbitmq3": "true".
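For completeness, a minimal sketch of how such labels could be set with docker node update. Only node01 appears above; the other node names and the helper label_rabbitmq_nodes are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: give the N-th node passed in the label rabbitmqN=true, matching the
# placement constraints in the compose file. Must run on a Swarm manager.
# DOCKER can be overridden for a dry run; it defaults to the docker CLI.
label_rabbitmq_nodes() {
  i=1
  for node in "$@"; do
    "${DOCKER:-docker}" node update --label-add "rabbitmq${i}=true" "$node"
    i=$((i + 1))
  done
}

# Usage (not executed here; node02 and node03 are assumed names):
#   label_rabbitmq_nodes node01 node02 node03
```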

Thus a single instance of RMQ is deployed onto each of those three nodes, so we end up with a 3 node RMQ cluster with one node on each of the three Docker Swarm members.
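Deploying and checking the result could look roughly like this. The stack name "rabbitmq" and both function names are my own assumptions; cluster_status has to run on the Swarm node that hosts the rabbitmq-01 container.

```shell
#!/bin/sh
# Sketch: deploy the compose file as a stack and list its services.
# DOCKER can be overridden for a dry run; it defaults to the docker CLI.
deploy_rabbitmq_stack() {
  stack="${1:-rabbitmq}"                    # assumed stack name
  compose_file="${2:-docker-compose.yaml}"
  "${DOCKER:-docker}" stack deploy --compose-file "$compose_file" "$stack" &&
    "${DOCKER:-docker}" stack services "$stack"
}

# Sketch: ask the local rabbitmq-01 container for its view of the cluster.
cluster_status() {
  stack="${1:-rabbitmq}"
  container="$("${DOCKER:-docker}" ps --quiet --filter "name=${stack}_rabbitmq-01")"
  if [ -z "$container" ]; then
    echo "no rabbitmq-01 container found on this node" >&2
    return 1
  fi
  "${DOCKER:-docker}" exec "$container" rabbitmqctl cluster_status
}

# Usage (not executed here):
#   deploy_rabbitmq_stack rabbitmq docker-compose.yaml
#   cluster_status rabbitmq
```

If clustering worked, the status output should list all three rabbit@rabbitmq-0X nodes as running.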

A Consul instance is in place, reachable at "consul.server", for the RMQ nodes to use as the clustering backend.
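To see whether the nodes actually registered themselves, something along these lines might help. It queries Consul's health endpoint for the service name set in rabbitmq.conf; port 8500 (Consul's default HTTP port) and the helper name consul_rabbitmq_nodes are assumptions on my part.

```shell
#!/bin/sh
# Sketch: list the instances Consul knows for the RabbitMQ service.
# Assumptions: Consul's HTTP API listens on port 8500 without ACL tokens and
# "consul.server" resolves from where this runs (e.g. the consul_main network).
# CURL can be overridden for a dry run; it defaults to the real curl binary.
consul_rabbitmq_nodes() {
  consul="${1:-http://consul.server:8500}"
  service="${2:-rabbitmq-config-file}"   # cluster_formation.consul.svc
  "${CURL:-curl}" --fail --silent --show-error \
    "${consul}/v1/health/service/${service}"
}

# Usage (not executed here):
#   consul_rabbitmq_nodes
```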

This fixes the problem of the RMQ nodes using their container hostnames to create their data directories: the hostnames are fixed as "rabbitmq-0X" in the docker-compose file, so each node keeps the same node name, and therefore the same data directory, across restarts.
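To illustrate why the fixed hostname matters, here is a tiny sketch of how the data directory name follows from the hostname. It assumes the default node name prefix "rabbit" and the default base directory /var/lib/rabbitmq/mnesia; the helper rabbitmq_data_dir is purely illustrative.

```shell
#!/bin/sh
# Sketch: derive the data directory a node would use from its hostname,
# assuming the default base directory and the default "rabbit@" node prefix.
rabbitmq_data_dir() {
  host="$1"
  base="${2:-/var/lib/rabbitmq/mnesia}"
  printf '%s/rabbit@%s\n' "$base" "$host"
}

rabbitmq_data_dir rabbitmq-01   # prints /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-01
```

With a random container ID as hostname, this path changes on every restart; with hostname: rabbitmq-01 it stays stable inside the rabbitmq-01-data volume.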

Hope this helps whoever comes by!

Find this at https://github.com/olgac/rabbitmq