Linux – Check RAID software: the status

linuxraid

I have an Ubuntu dedicated server and I got a message from my provider saying that one of my disks has an error and that I must "check whether my RAID software is functioning properly" before they replace the disk. Here is what I typed in the shell and the report I got:

root@Ubuntu-1204-precise-64-minimal # cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md3 : active raid1 sda4[0] sdb4[1]
      1839089920 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sda2[0] sdb2[1]
      523968 blocks super 1.2 [2/2] [UU]

md2 : active raid1 sda3[0] sdb3[1]
      1073610560 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sdb1[1] sda1[0]
      16768896 blocks super 1.2 [2/2] [UU]

unused devices: <none>

root@Ubuntu-1204-precise-64-minimal # mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Wed Nov  6 08:02:41 2013
     Raid Level : raid1
     Array Size : 16768896 (15.99 GiB 17.17 GB)
  Used Dev Size : 16768896 (15.99 GiB 17.17 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Thu Sep 10 04:02:26 2015
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : rescue:0
           UUID : 872ad258:c42ccb36:e9e19c96:98b55ee9
         Events : 156

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1

Does this mean RAID is functioning and all my drives are synchronized? If not, how can I check whether the drives are synchronized and whether it is safe to replace the disk?

Thanks.

Best Answer

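Yup. The output of cat /proc/mdstat and mdadm -D both show that things are OK: /proc/mdstat reports [UU] for all four arrays (md0 to md3), and mdadm -D reports md0 as clean with no failed devices.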

State: clean

and

[UU]

are the important indicators that things are working as intended with your array.

You can double-check against the kernel documentation on md:

clean - no pending writes, but otherwise active.
    When written to inactive array, starts without resync
    If a write request arrives then
    if metadata is known, mark 'dirty' and switch to 'active'.
    if not known, block and switch to write-pending
    If written to an active array that has pending writes, then fails.

and the Linux Kernel Wiki on mdstat

Paraphrasing from the wiki entry:

The [UU] field shows the status of each member device in order: U for up, _ for down. A two-disk mirror with one failed member would show [U_] or [_U] instead.
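With four arrays it is easy to miss a single underscore, so the same test can be scripted. This is a minimal sketch, not something from the wiki: the function name degraded_arrays is made up for illustration, and the optional file argument exists only so the function can be tried on saved output instead of the live /proc/mdstat.

```shell
#!/bin/sh
# List md arrays whose status field in an mdstat-style file contains "_",
# i.e. at least one member device is down. Prints nothing when every
# member of every array is up.
# Hypothetical helper; the file defaults to /proc/mdstat.
degraded_arrays() {
    awk '
        /^md[^ ]* :/ { name = $1 }                # remember the current array name
        /blocks/ && match($0, /\[[U_]+\]/) {      # the [UU] / [U_] status field
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling degraded_arrays with no argument on the server from the question should print nothing, since all four arrays show [UU].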

If you wanted to set up email to alert you if there was an issue with your software raid array, then you could use this post from the Ubuntu forums: http://ubuntuforums.org/showthread.php?t=1185134 which should walk you through a process to set up emails to a remote account.
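Whatever guide you follow, mdadm's own monitor takes the alert address from a MAILADDR line in its config file. Here is a small sketch for checking what is currently configured. The function name mdadm_mailaddr is hypothetical, and the default path /etc/mdadm/mdadm.conf is an assumption based on where Debian and Ubuntu normally keep the file.

```shell
#!/bin/sh
# Print the address from the first MAILADDR line of an mdadm.conf-style file.
# Hypothetical helper; the default path is an assumption about a
# Debian/Ubuntu layout. Prints nothing if no MAILADDR line is present.
# Keywords are matched case-insensitively; abbreviated keywords are not handled.
mdadm_mailaddr() {
    awk 'toupper($1) == "MAILADDR" { print $2; exit }' "${1:-/etc/mdadm/mdadm.conf}"
}
```

Once an address is set, running mdadm --monitor --scan --oneshot --test as root should generate a test alert for each array, which is a quick way to confirm that mail actually leaves the server.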

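If you want to double-check the array is OK, you can run a consistency check: /usr/share/mdadm/checkarray /dev/mdX for a single array, or /usr/share/mdadm/checkarray -a for all of them (with -a the script checks every assembled array, so no device name is needed). The check runs in the background and its progress shows up in /proc/mdstat. On Debian and Ubuntu this command should also be in /etc/cron.d/mdadm and run monthly.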
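The result of a check ends up in sysfs: each array has a sync_action file (idle when nothing is running, check while a check is in progress) and a mismatch_cnt file with the count from the last check. A rough sketch for reading both follows. The function name md_check_status is made up, and the directory argument is only there so it can be pointed somewhere other than /sys/block.

```shell
#!/bin/sh
# Print "<array> <sync_action> <mismatch_cnt>" for every md array found
# under a sysfs block directory.
# Hypothetical helper; the directory defaults to /sys/block.
# Arrays without a readable sync_action file are skipped.
md_check_status() {
    root="${1:-/sys/block}"
    for dir in "$root"/md*/md; do
        [ -r "$dir/sync_action" ] || continue
        array=$(basename "$(dirname "$dir")")
        printf '%s %s %s\n' "$array" \
            "$(cat "$dir/sync_action")" "$(cat "$dir/mismatch_cnt")"
    done
}
```

After a check has finished, a line such as "md0 idle 0" means the check is done and found no mismatches. A non-zero count is worth a closer look before pulling a disk.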

Aside from that, running smartctl is reasonable if you suspect impending hardware failure that isn't triggering failures in your array yet. Examples can be found here: SMART checks with smartctl
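As a starting point, smartctl -H /dev/sda prints the drive's overall health assessment and smartctl -a /dev/sda prints everything, including the attribute table. The sketch below only interprets the -H output. The function name smart_health_ok is hypothetical, and the two strings it looks for are the usual ATA and SCSI wordings, which may differ on other drive types.

```shell
#!/bin/sh
# Read `smartctl -H` output on stdin and succeed only if it reports a
# passing overall health assessment.
# Hypothetical helper. ATA drives usually print "... test result: PASSED",
# SCSI/SAS drives usually print "SMART Health Status: OK".
smart_health_ok() {
    grep -Eq 'test result: PASSED|SMART Health Status: OK'
}
```

Usage would look like smartctl -H /dev/sda | smart_health_ok || echo "sda needs attention", repeated for sdb. A drive can pass this assessment and still be on its way out, so it is worth reading the attribute table from smartctl -a as well.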

And finally, because this can never be said too much: Make sure you have good, tested backups! =D RAID is very nice, but it is not a substitute for backups, and messages like that from your provider are less scary when you know you have good backups. =)

Hope that helps. =)