GlusterFS not running on correct port! (peer disconnected / brick not starting)

Tags: centos7, glusterfs

On CentOS 7, with two bricks on srv1 and srv2.

I've upgraded Gluster from 3.13 to 6 using yum. I then rebooted srv1, and started and mounted the volume successfully.

This is my mount command:
/usr/sbin/mount.glusterfs 127.0.0.1:/RepVol /home -o direct-io-mode=enable

I then restarted srv2, and now I cannot mount:

[2019-08-29 14:16:01.354362] I [MSGID: 101190] [event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0
[2019-08-29 14:16:01.354402] I [glusterfsd-mgmt.c:2443:mgmt_rpc_notify] 0-glusterfsd-mgmt: disconnected from remote-host: srv2
[2019-08-29 14:16:01.354409] I [glusterfsd-mgmt.c:2463:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2019-08-29 14:16:01.354600] W [glusterfsd.c:1570:cleanup_and_exit] (-->/lib64/libgfrpc.so.0(+0xf1d3) [0x7f477284f1d3] -->/usr/sbin/glusterfsd(+0x12fef) [0x564e35a67fef] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x6b) [0x564e35a6001b] ) 0-: received signum (1), shutting down
[2019-08-29 14:16:01.357036] I [socket.c:3754:socket_submit_outgoing_msg] 0-glusterfs: not connected (priv->connected = 0)
[2019-08-29 14:16:01.357050] W [rpc-clnt.c:1704:rpc_clnt_submit] 0-glusterfs: failed to submit rpc-request (unique: 0, XID: 0x2 Program: Gluster Portmap, ProgVers: 1, Proc: 5) to rpc-transport (glusterfs)

The error message is "Exhausted all volfile servers", which as I understand it means the mount helper couldn't fetch the volume definition from any volfile server (here just the local glusterd). At least, that's the only thing that looks like an error to me.

Output of gluster volume status on srv1:

Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick srv1:/datafold                        49152     0          Y       16291
Self-heal Daemon on localhost               N/A       N/A        Y       16313

Task Status of Volume RepVol
------------------------------------------------------------------------------
There are no active volume tasks

Output of gluster volume status on srv2:

Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick srv1:/datafold                        49152     0          Y       16291
Brick srv2:/datafold                        N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        N       N/A
Self-heal Daemon on srv1                    N/A       N/A        Y       16313

Task Status of Volume RepVol
------------------------------------------------------------------------------
There are no active volume tasks

So it makes sense that the mount fails while the brick is offline. However, I have no clue how to start this brick, even after searching for hours. It would be nice to find a solution.
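For anyone who lands here: the standard suggestions for an offline brick are to restart glusterd on the affected node and to force-start the volume (which respawns dead brick processes). Sketching them here for completeness; in my case the brick stayed offline regardless:

# On srv2: restart the management daemon first
systemctl restart glusterd

# Force-start the volume; this restarts any brick processes that are down
gluster volume start RepVol force

# Check whether the brick came back
gluster volume status RepVol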

I tried removing the volume so I could recreate it, but it complains that not all bricks are connected.
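(For clarity, by "removing the volume" I mean roughly this, assuming the brick data can be recreated afterwards:)

gluster volume stop RepVol
gluster volume delete RepVol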

I also read that Gluster uses IPv6 by default since version 5, but I'm not sure how that would affect my setup, since srv1 seems to be up and running.
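For anyone chasing the IPv6 angle: the address family can reportedly be pinned back to IPv4 in /etc/glusterfs/glusterd.vol (the stock config path), followed by a glusterd restart. Noting it here even though srv1 works fine without it:

# /etc/glusterfs/glusterd.vol, inside the "volume management" block:
option transport.address-family inet

# then, on that node:
systemctl restart glusterd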

EDIT:

glusterd is not listening on the right port! It should be 24007, but netstat shows:
netstat -tulpn | grep gluster
tcp 0 0 0.0.0.0:34678 0.0.0.0:* LISTEN 28743/glusterd

What the hell? How do I fix this? Restarting does nothing except assign another random port:
tcp 0 0 0.0.0.0:43914 0.0.0.0:* LISTEN 17134/glusterd

Why is it not running on 24007?
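(Noting for later readers: newer glusterfs-server packages reportedly pin the management port via an option in /etc/glusterfs/glusterd.vol, and an upgrade that leaves an old config behind can lose it, so it's worth checking whether the line is still there:)

grep listen-port /etc/glusterfs/glusterd.vol
# a stock config should contain:
#   option transport.socket.listen-port 24007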

Best Answer

I removed glusterfs-server (yum remove glusterfs-server -y) and installed it again:

yum install glusterfs-server -y
systemctl enable glusterd.service
systemctl start glusterd.service

It then started at port 24007 and everything worked again.
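For completeness, a quick way to confirm everything is back (a sketch; RepVol is my volume name, adjust as needed):

netstat -tulpn | grep glusterd     # should now show 0.0.0.0:24007
gluster volume status RepVol       # both bricks should be Online (Y)
gluster volume heal RepVol info    # check for pending self-heals after the outage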

I just wasted a couple of hours because glusterd decided a random port would be fine while 24007 wasn't even in use, great!
