How to monitor glusterfs volumes

glusterfs monitoring

GlusterFS, while being a nice distributed filesystem, provides almost no way to monitor its integrity. Servers can come and go, bricks might get stale or fail, and I am afraid I will only find out about it when it is probably too late.

Recently we had a strange failure where everything appeared to be working, but one brick had dropped out of the volume (found by pure coincidence).

Is there a simple and reliable way (a cron script?) to let me know about the health status of my GlusterFS 3.2 volume?

Best Answer

This has been a request to the GlusterFS developers for a while now, and there is no out-of-the-box solution you can use. However, with a few scripts it's not impossible.

Pretty much the entire Gluster system is managed by a single gluster command, and with a few of its options you can write your own health monitoring scripts. See here for listing info on bricks and volumes -- http://gluster.org/community/documentation/index.php/Gluster_3.2:_Displaying_Volume_Information
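As a rough starting point, here is a minimal sketch of such a script in Python. It is not an official tool. It assumes that `gluster volume info` prints `Volume Name:` and `Status:` lines, and that `gluster peer status` prints `Hostname:` and `State:` lines with `(Connected)` for a healthy peer; check this against the output on your own version before relying on it. The function names are made up for illustration.

```python
import subprocess
import sys


def parse_pairs(text, start_key, value_key):
    """Map each `start_key` value to the `value_key` value that follows it."""
    result = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        key, value = key.strip(), value.strip()
        if key == start_key:
            current = value
        elif key == value_key and current is not None:
            result[current] = value
    return result


def parse_volume_info(text):
    """Return {volume name: status} from `gluster volume info` output."""
    return parse_pairs(text, "Volume Name", "Status")


def parse_peer_status(text):
    """Return {hostname: state} from `gluster peer status` output."""
    return parse_pairs(text, "Hostname", "State")


def find_problems(volume_text, peer_text):
    """Return a list of human-readable problems (empty if all looks fine)."""
    problems = []
    for name, status in sorted(parse_volume_info(volume_text).items()):
        # Assumption: a healthy volume reports "Status: Started".
        if status != "Started":
            problems.append("volume %s is %s" % (name, status))
    for host, state in sorted(parse_peer_status(peer_text).items()):
        # Assumption: a healthy peer's state contains "(Connected)".
        if "(Connected)" not in state:
            problems.append("peer %s is %s" % (host, state))
    return problems


def run_gluster(args):
    """Run the gluster CLI and return its output as text."""
    return subprocess.check_output(["gluster"] + args).decode()


def main():
    problems = find_problems(run_gluster(["volume", "info"]),
                             run_gluster(["peer", "status"]))
    if problems:
        print("\n".join(problems))
        return 2  # non-zero exit code, usable as a Nagios-style CRITICAL
    return 0

# When saved as a standalone script, finish with: sys.exit(main())
```

Run from cron, this prints nothing when everything looks healthy, so cron only mails you (if MAILTO is set) when a volume is not started or a peer is not connected. Note the limitation: it only sees what these two commands report, so a brick whose process has died while its server is still connected may go unnoticed. As far as I know, 3.3 adds `gluster volume status` with per-brick information, which would be a better thing to check there.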

To monitor performance, look at this link -- http://gluster.org/community/documentation/index.php/Gluster_3.2:_Monitoring_your_GlusterFS_Workload

UPDATE: Do consider upgrading to GlusterFS 3.3 -- http://gluster.org/community/documentation/index.php/About_GlusterFS_3.3

You are generally better off on the latest release, since it tends to have more bug fixes and is better supported. Of course, run your own tests before moving to a newer release -- http://vbellur.wordpress.com/2012/05/31/upgrading-to-glusterfs-3-3/ :)

There is an admin guide with a specific section on monitoring your GlusterFS 3.3 installation, in Chapter 10 -- http://www.gluster.org/wp-content/uploads/2012/05/Gluster_File_System-3.3.0-Administration_Guide-en-US.pdf

See here for a Nagios script -- http://code.google.com/p/glusterfs-status/