Inexplicably slow Western Digital WD15EARS disk

hard-drive smart

I have four disks in a server, two of which are WD15EARS drives. I'm trying to put the two WD15EARS drives into a Linux mdadm RAID array, but for some reason the array's performance is very slow (it syncs at about 15 MB/s). At first I thought it was an alignment issue, since they're Advanced Format drives, but I no longer think so. This is how I aligned them. I also have two of these drives in my desktop PC, aligned painstakingly with LVM and RAID, and they're running fine.
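For anyone wanting to double-check alignment on such drives, here is a minimal sketch. It assumes 512-byte logical sectors and 4 KiB physical sectors, so a partition is aligned when its start sector is a multiple of 8. The function name and the partition name sdb1 are only illustrative; the start sector itself comes from sysfs.

```shell
# Report whether a partition start sector (counted in 512-byte units)
# falls on a 4 KiB boundary.
# Assumption: 512-byte logical / 4096-byte physical sectors.
check_alignment() {
    if [ $(( $1 % 8 )) -eq 0 ]; then
        echo "aligned"
    else
        echo "misaligned"
    fi
}

# Example (hypothetical partition name):
#   check_alignment "$(cat /sys/block/sdb/sdb1/start)"
```

A start sector of 2048 passes this check, while the old DOS default of 63 does not.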

I ran speed tests on the individual drives (sdb and sdd). hdparm -t shows 80 MB/s for sdb and only 30 MB/s for sdd (the two other drives, both Samsung, measure about 100 MB/s). These results are repeatable. This also suggests that it's not an alignment problem, because then hdparm -t would be slow on both drives.
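The comparison can be repeated with something like the sketch below. It is only a convenience wrapper around hdparm -t; the function name is made up, and the three-run count is an arbitrary choice to make outliers visible. It needs root and skips anything that is not a block device.

```shell
# Run the buffered-read benchmark several times per drive.
# Needs root and the hdparm package.
bench_reads() {
    for dev in "$@"; do
        if [ ! -b "$dev" ]; then
            echo "skip: $dev is not a block device"
            continue
        fi
        for run in 1 2 3; do
            hdparm -t "$dev"
        done
    done
}

# Example: bench_reads /dev/sdb /dev/sdd
```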

I have been unable to discern any differences that might explain why one of these drives is slower, except that SMART reports the following on the good drive:

Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.

And it reports this on the bad drive:

Offline data collection status:  (0x85) Offline data collection activity
                                    was aborted by an interrupting command from host.
                                    Auto Offline Data Collection: Enabled.

The auto offline data collection should run every four hours, but the status message on the bad drive never changes.
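The two status bytes can be pulled apart as sketched here. It assumes the usual ATA SMART layout, in which bit 7 flags automatic offline collection and the low seven bits hold the status code. Only the two codes that appear in the output above are mapped; the function name is made up for illustration.

```shell
# Decode an "Offline data collection status" byte such as 0x82 or 0x85.
# Assumption: bit 7 = auto offline collection enabled, bits 0-6 = status code.
decode_offline_status() {
    code=$(( $1 & 0x7f ))
    auto=$(( ($1 >> 7) & 1 ))
    case $code in
        2) text="completed without error" ;;
        5) text="aborted by an interrupting command from host" ;;
        *) text="other (see smartctl output)" ;;
    esac
    if [ "$auto" -eq 1 ]; then state=enabled; else state=disabled; fi
    printf 'code=0x%02x (%s), auto offline collection %s\n' "$code" "$text" "$state"
}

decode_offline_status 0x82
decode_offline_status 0x85
```

Under that assumption both drives have automatic collection enabled, and the only difference is the status code (0x02 versus 0x05).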

I hypothesized that this offline collection is what causes the slowdown, but I am unable to abort it. smartctl -X doesn't do anything, which makes sense, because according to smartctl -c the drives do not have the "Abort Offline collection upon new command" capability.
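The capability check can be scripted roughly as follows. This is a sketch that classifies the text of smartctl -c read from stdin. The matched phrases are assumed from typical smartctl output and may differ between versions, and the function name is hypothetical.

```shell
# Classify how a drive reacts to a new command during offline data collection,
# based on `smartctl -c` output read from stdin.
# Assumption: the capability is printed as "Abort Offline collection ..." or
# "Suspend Offline collection ...".
offline_interrupt_mode() {
    caps=$(cat)
    case $caps in
        *"Abort Offline collection"*)   echo "abort" ;;
        *"Suspend Offline collection"*) echo "suspend" ;;
        *)                              echo "unknown" ;;
    esac
}

# Example (needs root): smartctl -c /dev/sdd | offline_interrupt_mode
```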

I'm currently running a long self-test, which will hopefully yield something, but in the meantime I was hoping somebody knows what might be going on.

Edit: the self-test finished and reports that the drive is OK. Turning off automatic offline data collection also didn't help.

I also just ran dd write tests. dd if=/dev/zero of=/dev/sdX bs=10M yielded 65 MB/s on the good disk and about 15 MB/s on the bad one. There's definitely something wrong.
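Note that writing to /dev/sdX directly overwrites whatever is on the disk. A non-destructive variant, sketched below, writes a bounded amount to a scratch file on a filesystem that lives on the drive. The function name and the example path are hypothetical. conv=fsync makes dd flush before it reports, so the page cache does not inflate the figure.

```shell
# Bounded sequential write test against a scratch file rather than the raw device.
# $1 = scratch file path, $2 = number of 1 MiB blocks to write.
# Prints dd's final statistics line (dd reports on stderr, hence 2>&1).
write_test() {
    dd if=/dev/zero of="$1" bs=1M count="$2" conv=fsync 2>&1 | tail -n 1
}

# Example (hypothetical mount point): write_test /mnt/scratch/ddtest.bin 1024
```

Because this goes through the filesystem, the numbers are not directly comparable to a raw-device run, but a slow drive should still stand out.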

Edit2: I picked up the drives from the datacenter and connected them to my PC with a USB-to-SATA converter. Now they work fine…

Best Answer

One possible cause of the significantly decreased performance is Automatic Acoustic Management (AAM). Check its status on both drives using hdparm -M, and rule it out by setting the value to 254, the fastest (and loudest) setting.

Enabling the write cache on the drives using hdparm -W1 is also worth trying.
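Both settings can be inspected before changing anything. A minimal sketch, assuming root and the hdparm package (the function name is made up): when -M or -W is given without a value, hdparm only queries the current setting. Not every drive supports AAM, so the -M query may simply report that it is unsupported.

```shell
# Print the current AAM level and write-cache state for each drive.
show_drive_settings() {
    for dev in "$@"; do
        if [ ! -b "$dev" ]; then
            echo "skip: $dev is not a block device"
            continue
        fi
        hdparm -M "$dev"   # acoustic management level
        hdparm -W "$dev"   # write-caching on/off
    done
}

# Example: show_drive_settings /dev/sdb /dev/sdd
# To change the settings afterwards (not run here):
#   hdparm -M 254 /dev/sdX
#   hdparm -W1 /dev/sdX
```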

Since you are using Caviar Green drives (which, honestly, you should not be doing for a Linux RAID setup), make sure to disable the power-saving features while you are at it, especially IntelliPark.

If nothing helps, it might indeed be a hardware problem. In that case, open an RMA and return the drive to WD.