Docker Swarm – What Certificate is Used for Join?

docker docker-swarm ssl ssl-certificate

I am trying to set up a Docker swarm.

I need my nodes to communicate via TLS.

I have created a cert for the manager node with extendedKeyUsage = serverAuth

I have configured the manager node with the following daemon.json:

{
    "hosts": ["unix:///var/run/docker.sock", "tcp://0.0.0.0:2376"],
    "tlscacert": "/var/docker/ca.pem",
    "tlscert": "/var/docker/server-cert.pem",
    "tlskey": "/var/docker/server-key.pem",
    "tlsverify": true
}

To test this, I created a client cert and used it to connect to the Docker API from my laptop, and I am able to connect successfully.

Now I need to add one worker node to the swarm.

I have set it up in the same way as the manager node, with a similar daemon.json. I have used an SSL key with extendedKeyUsage = serverAuth and verified the client connection in the same way as on the manager node.

Then, on the manager, I ran docker swarm init.

To join the worker node to the swarm I use the following command:
docker swarm join --token XXX dockman.myhost.com:2376

But I get an error:

Error response from daemon: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate"

I thought I could test it further by trying to connect to the docker API on the manager node from the worker node:

sudo docker --tlsverify --tlscacert=/var/docker/ca.pem --tlscert=./server-cert.pem --tlskey=./server-key.pem -H=127.0.0.1:2376 version

The result is:

Client: Docker Engine - Community
 Version:           19.03.5
 API version:       1.40
 Go version:        go1.12.12
 Git commit:        633a0ea838
 Built:             Wed Nov 13 07:29:52 2019
 OS/Arch:           linux/amd64
 Experimental:      false
The server probably has client authentication (--tlsverify) enabled. Please check your TLS client certification settings: Get https://127.0.0.1:2376/v1.40/version: remote error: tls: bad certificate

This second test has given me a lot more to think about.
Of course it fails, because I am trying to connect with a server certificate and not a client certificate. But isn't that exactly what docker swarm join is trying to do? It doesn't make sense to me to put the client certificate into daemon.json. I googled making a single certificate valid for both server and client authentication; it is possible, but it seems to be bad practice. I would have thought the tutorial would have covered it if it were required.

I have been stuck at this point. I can't work out what certificate setup is required.

I have been following
https://github.com/docker/docker.github.io/blob/master/swarm/configure-tls.md
This describes the creation of certificates but doesn't mention client or server auth at all.

Update 1

I found a document saying that the certs need to be valid for both client and server authentication:

https://hub.docker.com/_/swarm/

So I remade the node certificate to be both client and server. Now the docker version command works when run from the node, but the swarm join still fails.

Best Answer

You're mixing up Swarm Mode (docker swarm and similar CLIs) with the classic container-based Swarm (hosted as a container on Docker Hub). These are two different tools.

For the two sets of documentation, see:

There's no need to do any manual TLS configuration with Swarm Mode. It's all built in, and the ports for Swarm Mode are different from the ports for the docker API socket. You do not want to expose the docker API on the network without a good reason (this is a common source of hacks), and Swarm Mode is not a reason.

Therefore, you should remove the -H option from the dockerd command (the "hosts" entry in your daemon.json), along with any TLS options there. Then run docker swarm init on the first manager, which generates the TLS credentials and gives you a token that includes a hash of the self-signed certificates. The other managers and workers then run docker swarm join, which generates their client certificates, connects to the manager, validates the hash of the manager certificates from the token, and authenticates the node to the manager with the secret part of the join token.
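
As a rough sketch of that flow (not an official procedure), the script below only prints the commands unless DRY_RUN=0 is set. The run wrapper, the function names, and the DRY_RUN and MANAGER_HOST variables are made up for illustration. dockman.myhost.com is taken from the question, and 2377 is the default Swarm Mode management port, as opposed to 2376, which is the Docker API TLS port used in the question.

```shell
#!/usr/bin/env bash
# Sketch of a Swarm Mode setup with no manual TLS configuration.
# Assumes the "hosts" and "tls*" entries were already removed from daemon.json
# and the daemon was restarted.
set -eu

DRY_RUN="${DRY_RUN:-1}"                            # 1 = print commands only (hypothetical switch)
MANAGER_HOST="${MANAGER_HOST:-dockman.myhost.com}" # manager hostname from the question
SWARM_PORT=2377                                    # default Swarm Mode management port

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Run on the first manager: creates the swarm and its built-in CA,
# then prints only the worker join token.
init_manager() {
  run docker swarm init
  run docker swarm join-token -q worker
}

# Run on each worker: $1 is the token printed on the manager.
join_worker() {
  run docker swarm join --token "$1" "$MANAGER_HOST:$SWARM_PORT"
}
```

After sourcing the script, init_manager would be called on the manager and join_worker "<token>" on each worker. Note that the join targets port 2377, not the API port.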

The above will encrypt the management plane between the managers and workers. To encrypt the data transmitted on overlay networks between workers, you need to enable IPSec on the overlay networks you create:

docker network create --opt encrypted --driver overlay app-overlay-net

Documentation on this feature is at: https://docs.docker.com/v17.09/engine/userguide/networking/overlay-security-model/
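
To check that an overlay network was actually created with encryption, one option is to inspect its driver options afterwards. This is a minimal sketch: the run wrapper, the function names, and the DRY_RUN and NETWORK_NAME variables are hypothetical, the network name is the one from the command above, and by default the script only prints the commands. I'd expect the inspected options to contain an "encrypted" key, but verify that against your Docker version.

```shell
#!/usr/bin/env bash
# Sketch: create an encrypted overlay network and show its driver options.
set -eu

DRY_RUN="${DRY_RUN:-1}"                          # 1 = print commands only (hypothetical switch)
NETWORK_NAME="${NETWORK_NAME:-app-overlay-net}"  # network name from the answer

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Run on a manager: creates the overlay network with the encrypted option.
create_encrypted_overlay() {
  run docker network create --opt encrypted --driver overlay "$NETWORK_NAME"
}

# Prints the network's driver options as JSON for a manual check.
show_network_options() {
  run docker network inspect --format '{{json .Options}}' "$NETWORK_NAME"
}
```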
