Ubuntu – Client SSL fails with GlusterFS 3.5

glusterfssslUbuntuubuntu-12.04

I'm running GlusterFS 3.5 from the official Ubuntu packages on Ubuntu 12.04 and when I enable client.ssl, mounting starts to fail. The error I'm getting is:

[2014-06-13 10:33:52.690770] E [socket.c:297:ssl_setup_connection] 0-public_uploads-client-0: SSL connect error

I enabled SSL by following this guide: http://www.gluster.org/author/zbyszek/ except that I have the keys and cert in both gluster and glusterfs files as the guide seemed to use one at the beginning and the other one afterwards:

# ls /etc/ssl/gluster*
/etc/ssl/gluster.ca  
/etc/ssl/glusterfs.ca
/etc/ssl/glusterfs.key
/etc/ssl/glusterfs.pem
/etc/ssl/gluster.key
/etc/ssl/gluster.pem

The volume looks like this:

Volume Name: public_uploads
Type: Distribute
Volume ID: 52aa6d85-f4ea-4c39-a2b3-d20d34ab5916
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: koraga.int.example.com:/var/lib/glusterfs/brick01/public_uploads
Options Reconfigured:
auth.allow: 127.0.0.1
client.ssl: on
server.ssl: on
nfs.disable: on

If I set client.ssl to off then it works just fine. The full logs are:

2014-06-13 10:33:52.673417] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.0 (/usr/sbin/glusterfs --volfile-server=koraga.int.example.com --volfile-id=/public_uploads /var/www/shared/public/uploads)
[2014-06-13 10:33:52.681140] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-06-13 10:33:52.681200] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2014-06-13 10:33:52.685101] I [dht-shared.c:311:dht_init_regex] 0-public_uploads-dht: using regex rsync-hash-regex = ^\.(.+)\.[^.]+$
[2014-06-13 10:33:52.686887] I [socket.c:3561:socket_init] 0-public_uploads-client-0: SSL support is ENABLED
[2014-06-13 10:33:52.686910] I [socket.c:3576:socket_init] 0-public_uploads-client-0: using private polling thread
[2014-06-13 10:33:52.689055] I [client.c:2273:notify] 0-public_uploads-client-0: parent translators are ready, attempting connect on transport
Final graph:
+------------------------------------------------------------------------------+
  1: volume public_uploads-client-0
  2:     type protocol/client
  3:     option remote-host koraga.int.example.com
  4:     option remote-subvolume /var/lib/glusterfs/brick01/public_uploads
  5:     option transport-type socket
  6:     option username 51275c7d-33b4-46cc-b8e9-9c06b5dfcda5
  7:     option password 36401ce2-18e7-427e-b126-30d2d9351480
  8:     option transport.socket.ssl-enabled on
  9: end-volume
 10:
 11: volume public_uploads-dht
 12:     type cluster/distribute
 13:     subvolumes public_uploads-client-0
 14: end-volume
 15:
 16: volume public_uploads-write-behind
 17:     type performance/write-behind
 18:     subvolumes public_uploads-dht
 19: end-volume
 20:
 21: volume public_uploads-read-ahead
 22:     type performance/read-ahead
 23:     subvolumes public_uploads-write-behind
 24: end-volume
 25:
 26: volume public_uploads-io-cache
 27:     type performance/io-cache
 28:     subvolumes public_uploads-read-ahead
 29: end-volume
 30:
 31: volume public_uploads-quick-read
 32:     type performance/quick-read
 33:     subvolumes public_uploads-io-cache
 34: end-volume
 35:
 36: volume public_uploads-open-behind
 37:     type performance/open-behind
 38:     subvolumes public_uploads-quick-read
 39: end-volume
 40:
 41: volume public_uploads-md-cache
 42:     type performance/md-cache
 43:     subvolumes public_uploads-open-behind
 44: end-volume
 45:
 46: volume public_uploads
 47:     type debug/io-stats
 48:     option latency-measurement off
 49:     option count-fop-hits off
 50:     subvolumes public_uploads-md-cache
 51: end-volume
 52:
+------------------------------------------------------------------------------+
[2014-06-13 10:33:52.689913] I [rpc-clnt.c:1685:rpc_clnt_reconfig] 0-public_uploads-client-0: changing port to 49155 (from 0)
[2014-06-13 10:33:52.690770] E [socket.c:297:ssl_setup_connection] 0-public_uploads-client-0: SSL connect error
[2014-06-13 10:33:52.690799] E [socket.c:2263:socket_poller] 0-public_uploads-client-0: client setup failed
[2014-06-13 10:33:52.698166] I [fuse-bridge.c:4946:fuse_graph_setup] 0-fuse: switched to graph 0
[2014-06-13 10:33:52.698402] I [fuse-bridge.c:3883:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.22
[2014-06-13 10:33:52.698671] W [fuse-bridge.c:739:fuse_attr_cbk] 0-glusterfs-fuse: 2: LOOKUP() / => -1 (Transport endpoint is not connected)
[2014-06-13 10:33:52.717268] I [fuse-bridge.c:4787:fuse_thread_proc] 0-fuse: unmounting /var/www/shared/public/uploads
[2014-06-13 10:33:52.717597] W [glusterfsd.c:1095:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f63a253d3fd] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7f63a2810e9a] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xc5) [0x7f63a330a1b5]))) 0-: received signum (15), shutting down
[2014-06-13 10:33:52.717616] I [fuse-bridge.c:5444:fini] 0-fuse: Unmounting '/var/www/shared/public/uploads'.

Any ideas why is it failing? any way to get more information about the failure?

Best Answer

I had the same problem, and it turned out that I wasn't unmounting the volume before turning on SSL and then remounting it afterwards. This is needed for the glusterfs client to pick up the config changes. It isn't enough to start and stop the glusterd server daemon, as most guides seem to suggest.

There's quite a bit about Gluster SSL scattered around the web. The best detailed guide I've found so far is: https://kshlm.in/network-encryption-in-glusterfs. However, it doesn't provide specific commands to create the SSL certificates. For that, please see: http://opensource-storage.blogspot.com/2015/03/using-ssl-with-glusterfs.html.