Linux – Slow disk performance on PowerEdge R720

centos, dell-poweredge, linux, ssd

Every few months, a PowerEdge with an SSD (INTEL SSDSA2BW60) produces results like these:

hdparm -tT /dev/sda

/dev/sda:
 Timing cached reads:   690 MB in  2.00 seconds = 344.84 MB/sec
 Timing buffered disk reads: 170 MB in  3.00 seconds =  56.59 MB/sec
100000+0 records in
100000+0 records out
819200000 bytes (819 MB) copied, 66.0552 s, 12.4 MB/s

These were taken on a system with all services stopped so that nothing would interfere with hdparm. After a few hours, disk speeds return to normal and the behaviour cannot be reproduced.
Is there a known problem with Dell machines or Intel SSD drives?

Relevant parts from dmesg:

Linux version 2.6.32-358.11.1.el6.x86_64 (mockbuild@c6b7.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Wed Jun 12 03:34:52 UTC 2013
DMI: Dell Inc. PowerEdge R720/0XH7F2, BIOS 1.6.0 03/07/2013
scsi0 : LSI SAS based MegaRAID driver
scsi 0:0:0:0: Direct-Access     DELL     PERC H710        3.13 PQ: 0 ANSI: 5
scsi 0:2:0:0: Direct-Access     DELL     PERC H710        3.13 PQ: 0 ANSI: 5
sd 0:2:0:0: [sda] 1170997248 512-byte logical blocks: (599 GB/558 GiB)
sd 0:2:0:0: [sda] Write Protect is off
sd 0:2:0:0: [sda] Mode Sense: 1f 00 00 08
sd 0:2:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2 sda3
sd 0:2:0:0: [sda] Attached SCSI disk

Best Answer

You should check the SMART information of the disk. What you describe could be caused by a die failure that the SSD's internal RAID structure then compensates for, a process that can take a couple of hours and degrades performance while it runs. If the cause is indeed inside the SSD, there is no way to know for certain what it is, but you can compare the SMART reallocation counters before and after a slowdown: if they have changed, a reallocation did happen.
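One way to make that before/after comparison is to save the attribute table printed by `smartctl -A` while the disk is healthy and again after a slowdown, then diff the raw values. The following is a minimal sketch, not a tested monitoring tool. It assumes the usual smartmontools attribute table layout (ID, name, then eight more columns ending in the raw value), and the helper names `parse_smart_attributes` and `changed_attributes` are made up for this illustration. Because the disk here sits behind a MegaRAID-based PERC H710, smartctl will probably need the `-d megaraid,N` device type, where N is the physical device ID on the controller; that ID is not shown in the question and has to be looked up.

```python
def parse_smart_attributes(text):
    """Map attribute name -> raw value from `smartctl -A` output.

    Assumes the standard table layout: the first column is a numeric ID,
    the second the attribute name, and the tenth the raw value.
    """
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                attrs[fields[1]] = int(fields[9])
            except ValueError:
                # Skip raw values that are not plain integers.
                continue
    return attrs


def changed_attributes(before, after):
    """Return {name: (old, new)} for attributes whose raw value differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }
```

To use it, read two saved snapshots, parse each with `parse_smart_attributes`, and pass the results to `changed_attributes`. A change in `Reallocated_Sector_Ct` between the two would support the reallocation explanation; counters such as power-on hours are expected to change regardless and can be ignored.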