I have a server with two HDDs in software RAID 1, running Debian Wheezy. After some database testing, the file system switched to read-only mode, so I rebooted the machine. The server no longer boots, so I started a rescue system to examine the HDDs.
Now the problems start: fdisk -l gives no output, and fdisk /dev/sda says "Unable to read /dev/sda", while smartctl -a reports "SMART overall-health self-assessment test result: PASSED". All of these problems occur with /dev/sdb as well.
mdadm isn't helpful either:
mdadm: no recogniseable superblock on /dev/sda
mdadm: /dev/sda has no superblock - assembly aborted
Output from dmesg:
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata5.00: irq_stat 0x40000001
ata5.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
res 61/04:08:00:00:00/00:00:00:00:00/e0 Emask 0x1 (device error)
ata5.00: SB600 AHCI: limiting to 255 sectors per cmd
ata5.00: SB600 AHCI: limiting to 255 sectors per cmd
ata5.00: configured for UDMA/133
ata5: EH complete
testdisk lists the drives with the correct size, but when I try to analyse a disk, every block returns a read error:
...
file_pread(4,2,buffer,34(0/0/35)) read err: Input/output error
file_pread(4,8,buffer,32(0/0/33)) read err: Input/output error
file_pread(4,8,buffer,40(0/0/41)) read err: Input/output error
file_pread(4,3,buffer,48(0/0/49)) read err: Input/output error
...
I am a bit puzzled, because I doubt that both disks died at the same time, and I suspect that the SATA controller may be at fault. How can I test this? And what else can I check?
Best Answer
First of all, I would recommend rebuilding the array with the second hard disk and making a backup if you don't have one. Probably something like this:
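As a rough sketch of that first step from a rescue system, the array could be started in degraded mode from the second disk alone and mounted read-only for the backup. The names /dev/md0, /dev/sdb1 and /mnt are assumptions for illustration; substitute your own. The run helper is hypothetical and only prints each command unless RUN=1 is set, so you can review the commands first.

```shell
# Print commands by default; set RUN=1 to actually execute them.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Start the array from a single surviving member and mount it read-only.
# $1 = md device (assumed /dev/md0), $2 = surviving member (assumed /dev/sdb1)
start_degraded() {
    # --run starts the array even though a member is missing
    run mdadm --assemble --run "$1" "$2"
    # Show the state of the array
    run cat /proc/mdstat
    # Mount read-only so the backup cannot change the data (/mnt is assumed)
    run mount -o ro "$1" /mnt
}

start_degraded /dev/md0 /dev/sdb1
```

This only helps if the second disk can actually be read; with the I/O errors shown in the question, the assemble step may fail in the same way.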
Then, after replacing the first hard disk, you should be able to copy the partition table to the new disk
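Copying the partition table might look like the sketch below. It assumes /dev/sdb is the healthy disk, /dev/sda is the new empty one, and both use an MBR partition table (the sfdisk shipped with Wheezy does not handle GPT). The function name is hypothetical, and the command is only printed unless RUN=1 is set, because swapping source and target would overwrite the good disk's table.

```shell
# Copy the partition layout from a source disk to a target disk.
# $1 = source (assumed /dev/sdb), $2 = target (assumed /dev/sda)
copy_partition_table() {
    src=$1
    dst=$2
    if [ "${RUN:-0}" = "1" ]; then
        # Dump the source layout and write it to the target disk
        sfdisk -d "$src" | sfdisk "$dst"
    else
        # Dry run: only show what would be executed
        printf '%s\n' "sfdisk -d $src | sfdisk $dst"
    fi
}

copy_partition_table /dev/sdb /dev/sda
```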
and add the new hard drive to the array again.
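Adding the new partition back could be sketched like this, again assuming /dev/md0 as the array and /dev/sda1 as the matching partition on the new disk. The helper names are hypothetical and the commands are only printed unless RUN=1 is set.

```shell
# Print commands by default; set RUN=1 to actually execute them.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Add a partition to an existing array and show the resync progress.
# $1 = md device (assumed /dev/md0), $2 = new member (assumed /dev/sda1)
add_member() {
    run mdadm --manage "$1" --add "$2"
    # The resync progress shows up in /proc/mdstat
    run cat /proc/mdstat
}

add_member /dev/md0 /dev/sda1
```

The resync runs in the background; the array is redundant again once /proc/mdstat no longer shows a recovery in progress.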
If you have more than one partition, repeat this for each of them (marking it faulty and removing it).
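For several partitions, the marking-faulty-and-removing step can be looped. The sketch below assumes two arrays, /dev/md0 on /dev/sda1 and /dev/md1 on /dev/sda2; these pairs are placeholders, so check /proc/mdstat for your real layout. As before, the helper names are hypothetical and commands are only printed unless RUN=1 is set.

```shell
# Print commands by default; set RUN=1 to actually execute them.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Mark one member as faulty, then remove it from its array.
# $1 = md device, $2 = member partition on the failing disk
fail_and_remove() {
    run mdadm --manage "$1" --fail "$2"
    run mdadm --manage "$1" --remove "$2"
}

# Assumed array:partition pairs; adjust to your layout
for pair in /dev/md0:/dev/sda1 /dev/md1:/dev/sda2; do
    fail_and_remove "${pair%%:*}" "${pair#*:}"
done
```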