RAID1 degraded after every reboot


After setup, the output of cat /proc/mdstat looks like this:

proxmox:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdc2[1] sdb2[0]
      293024832 blocks [2/2] [UU]

unused devices: <none>

Also, after I set up the RAID1 array fresh, I got the following:

proxmox:~# mdadm --examine --scan
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=fbda4051:61cbc27f:7f2b1f39:e153e83f

But after a reboot, cat /proc/mdstat shows:

proxmox:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active (auto-read-only) raid1 sdc[1]
      293024832 blocks [2/1] [_U]

unused devices: <none>

Why is it now using the whole disk sdc instead of sdc2?

Also, now I get:

proxmox:~# mdadm --examine --scan
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=fbda4051:61cbc27f:7f2b1f39:e153e83f
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=fbda4051:61cbc27f:9822ee23:9b948649


proxmox:~# dmesg | grep md0
md/raid1:md0: active with 1 out of 2 mirrors
md0: detected capacity change from 0 to 300057427968
 md0: p1 p2
md0: p2 size 586049840 exceeds device capacity, limited to end of disk

Where did the two partitions on /dev/md0 come from? I never created them.
Also, sdc1 and sdc2 are no longer listed in the /dev tree.

Here is the fdisk output:

proxmox:~# fdisk -l /dev/sdb

Disk /dev/sdb: 300.0 GB, 300069052416 bytes
255 heads, 63 sectors/track, 36481 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x3bd84a48

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1           2       10240   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sdb2               2       36482   293024920   fd  Linux raid autodetect
Partition 2 does not end on cylinder boundary.

proxmox:~# fdisk -l /dev/sdc

Disk /dev/sdc: 300.0 GB, 300069052416 bytes
255 heads, 63 sectors/track, 36481 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x371c8012

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1           2       10240   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sdc2               2       36482   293024920   fd  Linux raid autodetect
Partition 2 does not end on cylinder boundary.

A bit of info: the server is running Proxmox v1.9, which is Debian Lenny 64-bit. sda is the system drive (hardware RAID);
sdb and sdc are brand-new 300 GB Raptor drives.

Best Answer

First off, check the physical hardware: connections, cables, and correctly seated cards. Then check the SMART data on /dev/sdb to make sure the disk itself isn't periodically dropping out. Western Digital Raptors are fast but prone to failure; I've had one fail out of nowhere (not even the SMART data predicted it). Use smartctl to read the SMART data and run self-tests; it comes in the smartmontools package:

apt-get install smartmontools

Pull the data and look for anomalies or errors logged:

smartctl -a /dev/sdb

Finally, run a manual self-test; the short test takes about two minutes. You can substitute long for short for a more thorough test, but it takes far longer (tens of minutes):

smartctl -t short /dev/sdb

Once the test is done, review the results:

smartctl -l selftest /dev/sdb

If it all comes back clean, you can move on to debugging the mdadm stack.
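
A reasonable starting point for that, using the device names from your output, is to compare what the running array looks like against the superblocks on each candidate member; mismatched or duplicate UUIDs there would explain the two conflicting ARRAY lines in your scan:

# What the kernel currently thinks the array contains
mdadm --detail /dev/md0

# What RAID superblocks exist on the whole disks and the partitions
mdadm --examine /dev/sdb /dev/sdc /dev/sdb2 /dev/sdc2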

The partition arrangement on your RAID disks is a bit strange. If those disks will be dedicated to RAID, you don't need a partition table at all. Assuming there's no data on them yet, you'd be well advised to keep it simple and use the whole block devices directly. In fdisk they would then show up like this:

Disk /dev/sdb: 300.0 GB, 300069052416 bytes
255 heads, 63 sectors/track, 36481 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000
Disk /dev/sdb doesn't contain a valid partition table

To eliminate any issues with partitioning and start from scratch, dd some zeros over the beginning of each disk:

dd if=/dev/zero of=/dev/sdb count=128 bs=4096k

Repeat for /dev/sdc. Create the array using those two devices:

mdadm --create /dev/md0 --raid-devices=2 --level=raid1 --bitmap=internal --assume-clean --name=RAID1 /dev/sdb /dev/sdc
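
Once the array is created, make sure exactly one definition of it ends up in mdadm's config, so boot-time assembly is unambiguous. A sketch, assuming the stock Debian config location and that you don't need to preserve other ARRAY lines in that file:

# Append the new array definition; then edit the file so only one ARRAY line for md0 remains
mdadm --detail --scan >> /etc/mdadm/mdadm.conf

# Rebuild the initramfs so early-boot assembly picks up the updated config
update-initramfs -u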

Don't forget to check dmesg for any disk-related output!
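
For example, something along these lines will pull out the kernel messages for the two member disks and the array:

dmesg | grep -iE 'sdb|sdc|md0'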
