Gluster and “failed to fetch volume file”

I recently upgraded one of my Gluster clients to a Debian Stretch-based system, and it can no longer mount any Gluster volumes. My Gluster server runs 3.4.2 on Ubuntu 14.04; the Stretch client runs some flavor of 3.8.x. The error I get is: 0-mgmt: failed to fetch volume file (key:/sata_temp)

Is this due to version incompatibility?

After reinstalling, the client still cannot mount the volume ssd_temp. This looks like it could be a blocked port, as @Spooler mentioned:
(on client)

# mount -t glusterfs 172.22.24.5:/ssd_temp ssd_temp/
Mount failed. Please check the log file for more details.

(on server)

# gluster volume status ssd_temp
Status of volume: ssd_temp
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick 172.22.24.5:/mnt/ssd_temp/brick                   49163   Y       2936
NFS Server on localhost                                 2049    Y       2949

There are no active volume tasks


# tail /var/log/glusterfs/bricks/mnt-ssd_temp-brick.log
[2018-06-14 18:22:29.691196] E [rpcsvc.c:195:rpcsvc_program_actor] 0-rpc-service: RPC Program procedure not available for procedure 45 in GlusterFS 3.3
[2018-06-14 18:22:29.691236] E [rpcsvc.c:450:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully

# tail /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
[2018-06-14 18:32:12.197131] E [rpcsvc.c:521:rpcsvc_handle_rpc_call] 0-glusterd: Request received from non-privileged port. Failing request

# gluster volume status test-volume
Status of volume: test-volume
Gluster process                                         Port    Online  Pid
------------------------------------------------------------
Brick arch0:/export/rep1                                24010   Y       18474
Brick arch1:/export/rep2                                24011   Y       18479
NFS Server on localhost                                 38467   Y       18486
Self-heal Daemon on localhost                           N/A     Y       18491

Best Answer

It could be. However, the client is generally pretty good about connecting to older server versions (but not the other way around). Even so, you should make an effort to keep your server and client versions matched.

The best way to figure this out is to read the volume logs for that resource, on both the client and the server. They can be found in the following locations (I'm assuming you're using the FUSE mounter, because it seems that way):

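FUSE client log: /var/log/glusterfs/<mount point path, with slashes replaced by dashes>.log
glusterd server log: /var/log/glusterfs/glusterd.log (on the 3.4.2 server in the question, the file is named etc-glusterfs-glusterd.vol.log)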

You'll probably get the most data from your client log.
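To find the client log quickly, the naming rule can be sketched in a few lines of Python. This is only an illustration under an assumed convention (leading and trailing slashes dropped, remaining slashes turned into dashes); the helper name fuse_log_path and the mount point /mnt/ssd_temp are made up for the example, and the log file a given Gluster release actually writes may differ.

```python
from pathlib import PurePosixPath

LOG_DIR = "/var/log/glusterfs"  # log directory named in the answer


def fuse_log_path(mountpoint: str) -> str:
    """Guess the FUSE client log file for an absolute mount point.

    Assumed convention (not confirmed for every Gluster release):
    drop the leading and trailing slashes and join the remaining
    path components with dashes.
    """
    parts = [p for p in PurePosixPath(mountpoint).parts if p != "/"]
    return f"{LOG_DIR}/{'-'.join(parts)}.log"


# Hypothetical mount point, used only for illustration.
print(fuse_log_path("/mnt/ssd_temp"))
```

If the guessed file does not exist, listing /var/log/glusterfs right after a failed mount and picking the most recently modified file is a reasonable fallback.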

This kind of issue is also typically caused by the client being unable to contact a Gluster server for your volume data. Make sure that you can reach these servers over the network using whatever name appears in the volume details. You can see those details on the server by running:

# gluster volume status <volume_name>

Which will print output similar to this:

On the "Brick" lines, you'll see a hostname in this case (arch0 and arch1). Whatever is listed as the brick address is what the client will use to connect to Gluster, and in many cases DNS is involved so that Gluster can use a different IP internally than the one clients connect to. No matter what, make sure the clients can contact the server via that brick address on that port.
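As a rough sketch, the brick addresses and ports can be pulled out of saved `gluster volume status` output with a small Python helper. The function name brick_endpoints is made up for this example, and the regular expression assumes the column layout shown in the question (Port, Online, Pid); other Gluster releases may print different columns, so treat it as a starting point rather than a robust parser.

```python
import re

# Matches lines such as:
#   Brick 172.22.24.5:/mnt/ssd_temp/brick    49163   Y   2936
# Assumes the Port/Online/Pid layout shown in the question.
BRICK_LINE = re.compile(
    r"^Brick\s+(?P<host>[^:\s]+):(?P<path>\S+)\s+"
    r"(?P<port>\d+)\s+(?P<online>[YN])\s+(?P<pid>\S+)\s*$"
)


def brick_endpoints(status_output):
    """Return (host, port) pairs for every brick line with a numeric port."""
    endpoints = []
    for line in status_output.splitlines():
        match = BRICK_LINE.match(line.strip())
        if match:  # bricks whose port is shown as N/A are skipped
            endpoints.append((match["host"], int(match["port"])))
    return endpoints


# Sample text copied from the question's server output.
SAMPLE = """\
Status of volume: ssd_temp
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick 172.22.24.5:/mnt/ssd_temp/brick                   49163   Y       2936
NFS Server on localhost                                 2049    Y       2949
"""

print(brick_endpoints(SAMPLE))
```

Each returned pair is an address and port that the client has to be able to reach.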

You upgraded an entire OS, so maybe a firewall was turned on/reset in some way.
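One quick way to test the firewall theory is to try a plain TCP connection from the client to the server. The sketch below uses only Python's standard socket module; the function name can_connect and the 3-second timeout are choices made for this example. A successful connection shows only that the port is reachable, not that Gluster will accept the mount.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run on the client, can_connect("172.22.24.5", 49163) checks the brick port from the question's status output. It is also worth checking port 24007, which is glusterd's usual management port (an assumption here, since the question does not show it). If either call returns False while the server processes are online, a firewall or routing problem between client and server is a likely suspect.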
