Debian – RAID 1 hard disks errors, file system is not mounting

debian, hard drive, raid

I have a server with two HDDs in software RAID 1, running Debian Wheezy. After some database testing, the file system switched to read-only mode, so I rebooted the machine. The server no longer boots, so I started a rescue system to inspect the HDDs.

Now the problems start: fdisk -l gives no output, and fdisk /dev/sda says Unable to read /dev/sda, while smartctl -a reports SMART overall-health self-assessment test result: PASSED. The same problems occur with /dev/sdb.

mdadm isn't helpful either:

mdadm: no recogniseable superblock on /dev/sda
mdadm: /dev/sda has no superblock - assembly aborted

Output from dmesg:

ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata5.00: irq_stat 0x40000001
ata5.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
res 61/04:08:00:00:00/00:00:00:00:00/e0 Emask 0x1 (device error)
ata5.00: SB600 AHCI: limiting to 255 sectors per cmd
ata5.00: SB600 AHCI: limiting to 255 sectors per cmd
ata5.00: configured for UDMA/133
ata5: EH complete

testdisk lists the drives with the correct size, but when I try to test a disk, every block returns a read error:

...
file_pread(4,2,buffer,34(0/0/35)) read err: Input/output error
file_pread(4,8,buffer,32(0/0/33)) read err: Input/output error
file_pread(4,8,buffer,40(0/0/41)) read err: Input/output error
file_pread(4,3,buffer,48(0/0/49)) read err: Input/output error
...

I am a bit puzzled: I doubt that both disks died at the same time, and I suspect that the SATA controller may be at fault. How can I test this? And what else can I check?

Best Answer

First of all, I would recommend bringing the array back up with only the second hard disk and making a backup if you don't have one. Probably something like this:

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 missing
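A less invasive variant, sketched here only as an assumption and not as what the answer prescribes: --create writes fresh RAID metadata, so if the md superblock on /dev/sdb1 is still readable (for instance once the disk is attached to a working controller), the existing array can be assembled in degraded mode instead. The helper name, the dry-run switch and the mount point /mnt/rescue are made up for illustration; by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: start an existing RAID 1 from its one surviving member.
# Assumptions: /dev/sdb1 is the surviving member, /dev/md0 the array,
# /mnt/rescue a hypothetical mount point.
# Commands are only printed unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

assemble_degraded() {
    member="$1"; array="$2"; mountpoint="$3"
    # Inspect the md superblock first; this does not write anything.
    run mdadm --examine "$member"
    # --run starts the array although one half of the mirror is missing.
    run mdadm --assemble --run "$array" "$member"
    # Mount read-only so that taking the backup cannot alter the data.
    run mount -o ro "$array" "$mountpoint"
}

assemble_degraded /dev/sdb1 /dev/md0 /mnt/rescue
```

If mdadm --examine reports no superblock on /dev/sdb1 either, this path does not apply and the problem is more likely below the RAID layer.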

Then, after replacing the first hard disk, you should be able to copy the partition table to it:

sfdisk -d /dev/sdb | sfdisk /dev/sda
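Since this pipe overwrites the partition table of the target disk, it may be worth saving the dump to a file first and double-checking which disk is the source. A minimal sketch, assuming /dev/sdb is the healthy disk and /dev/sda the replacement; the helper name and the dump file name are arbitrary, and by default the commands are only printed.

```shell
#!/bin/sh
# Sketch: copy a partition table via an intermediate dump file.
# Assumptions: /dev/sdb is the healthy source, /dev/sda the new disk.
# Commands are only printed unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

clone_partition_table() {
    src="$1"; dst="$2"; dump="$3"
    if [ "$DRY_RUN" = "1" ]; then
        echo "sfdisk -d $src > $dump"
        echo "sfdisk $dst < $dump"
        return 0
    fi
    # Dump the source layout, then write it to the target only if the
    # dump succeeded.
    sfdisk -d "$src" > "$dump" && sfdisk "$dst" < "$dump"
}

clone_partition_table /dev/sdb /dev/sda sdb-partition-table.dump
```

The dump file also serves as a backup of the layout in case the remaining disk fails later.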

and add the new hard drive to the array again:

mdadm --manage /dev/md0 --fail /dev/sda1
mdadm --manage /dev/md0 --remove /dev/sda1
mdadm --manage /dev/md0 --add /dev/sda1

If you have more than one partition, repeat this for each of them (marking the old member faulty, removing it, and adding the new one).
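The per-partition repetition could be scripted roughly as follows. This is only a sketch: the mapping of /dev/md0 to /dev/sda1 and /dev/md1 to /dev/sda2 is a hypothetical layout, the helper names are invented, and by default the commands are only printed.

```shell
#!/bin/sh
# Sketch: fail, remove and re-add one member partition per array.
# Assumption: hypothetical layout with md0 on sda1 and md1 on sda2.
# Commands are only printed unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

readd_member() {
    array="$1"; part="$2"
    # --fail and --remove only matter while the old member is still listed.
    run mdadm --manage "$array" --fail "$part"
    run mdadm --manage "$array" --remove "$part"
    run mdadm --manage "$array" --add "$part"
}

# Each entry is array:partition; adjust to the real layout.
for pair in /dev/md0:/dev/sda1 /dev/md1:/dev/sda2; do
    readd_member "${pair%%:*}" "${pair#*:}"
done
```

Once the members are added, the resync progress can be followed with cat /proc/mdstat.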
