Linux – Slow md RAID-1 array

linuxmdadm

We have some problems with very slow disk response on our server. I checked iostat (iostat -d -x 30) and have some problems with its interpretation:

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               1.04   396.31    6.60   57.44   382.47  3649.21    62.95    10.31  160.87   8.64  55.36
sda               6.26   391.15   16.16   62.75  1810.79  3649.22    69.19     2.97   37.66   1.79  14.13
md0               0.00     0.00    0.55    0.01    16.88     0.08    30.11     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.02    0.07     1.10     0.54    18.31     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.02    0.04     0.13     0.34     8.00     0.00    0.00   0.00   0.00
md3               0.00     0.00   29.48  453.28  2175.15  3643.46    12.05     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00    56.15    0.70   81.34    12.00  1110.03    13.68    47.56  600.17   5.23  42.89
sda               0.00    51.02    0.47   81.37     4.53  1059.38    13.00     0.32    3.95   0.69   5.64
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    1.17   47.45    16.53   379.61     8.15     0.00    0.00   0.00   0.00

First one is initial (history) statistics iostat displays, the second one is after 30 seconds.

Why is await for sdb so higher than for sda? OK, because svctm is also higher (svctm is part of await but also influences the queue length). But why, if there are in mirror? They are both exactly the same disks, smartctl does not report any problems or significant counters difference:

SDA:

Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   145   145   021    Pre-fail  Always       -       9716
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       71
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   080   080   000    Old_age   Always       -       14623
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       69
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       68
193 Load_Cycle_Count        0x0032   113   113   000    Old_age   Always       -       262965
194 Temperature_Celsius     0x0022   126   114   000    Old_age   Always       -       26
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SDB:

Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   145   145   021    Pre-fail  Always       -       9708
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       67
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   080   080   000    Old_age   Always       -       14622
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       65
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       63
193 Load_Cycle_Count        0x0032   113   113   000    Old_age   Always       -       261839
194 Temperature_Celsius     0x0022   128   115   000    Old_age   Always       -       24
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

/etc/fstab:

proc            /proc           proc    defaults        0       0
/dev/md0        /               ext3    relatime,errors=remount-ro 0       1
/dev/md1        /var            ext3    relatime        0       2
/dev/md2        none            swap    sw              0       0
/dev/md3        /vz             ext3    relatime        0       3
/dev/hda        /media/cdrom0   udf,iso9660 user,noauto 0       0

Some measurements with iostat -d -x 2 (every two seconds) under heavy load. You can see that both disks can have longer queue and waiting time, but sda successfully reduces this, while sdb keeps having the longer waiting time. This is strange as disks are the same and it is RAID-1 (mirror).

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     0.00    0.00    6.00     0.00  6144.00  1024.00    21.40 4545.00 166.67 100.00
sda               0.00     0.00    2.00    1.00    16.00     8.00     8.00     0.49  390.00  75.33  22.60
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    1.50    0.50    12.00     4.00     8.00     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00  1405.00    0.50   23.00     4.00 10632.00   452.60    18.96 1889.62  41.62  97.80
sda               0.50  1401.50    1.50   37.50   120.00 11512.00   298.26     4.29  110.00   3.13  12.20
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    2.50 1439.00   124.00 11512.00     8.07     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00  1995.50    0.00   29.00     0.00  5304.00   182.90    13.64  873.31  34.34  99.60
sda               0.50  1986.50    6.50   28.50   512.00  1664.00    62.17     0.57    7.14   1.89   6.60
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    7.00 2046.00   512.00 16368.00     8.22     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00   930.00    0.00   18.50     0.00  1192.00    64.43    92.52  859.68  54.05 100.00
sda               0.00   928.50    0.00   35.50     0.00 18192.00   512.45    51.52  701.97  28.17 100.00
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    0.00  946.50     0.00  7572.00     8.00     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     0.00    0.00   16.00     0.00  8976.00   561.00    56.14 2710.38  62.50 100.00
sda               0.00     0.00    0.00   13.50     0.00  4084.00   302.52     6.26 2457.63  47.56  64.20
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     0.00    0.00   10.00     0.00 10240.00  1024.00    33.75 4877.20 100.00 100.00
sda               0.00     0.00    0.50    0.00     4.00     0.00     8.00     0.01   16.00  16.00   0.80
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    0.50    0.00     4.00     0.00     8.00     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00  3245.50    1.50   31.50   208.00 12756.00   392.85    64.57 2644.30  30.24  99.80
sda               0.00  3245.00    2.00   60.50   108.00 26444.00   424.83    17.03  272.42   4.61  28.80
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    3.50 3305.50   316.00 26444.00     8.09     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     0.00    0.00    8.00     0.00  8192.00  1024.00    74.48 2241.50 125.00 100.00
sda               0.00     0.00    0.00    1.00     0.00     8.00     8.00     0.00    0.00   0.00   0.00
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    0.00    1.00     0.00     8.00     8.00     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     3.00    0.00   22.50     0.00  5192.00   230.76    58.21 3204.18  44.44 100.00
sda               0.00     3.00    3.50    6.50    48.00    76.00    12.40     0.09    8.00   5.60   5.60
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    4.00   10.00    52.00    80.00     9.43     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.50  4098.50    1.50   31.50   324.00  4160.00   135.88    78.08 3401.39  30.24  99.80
sda               0.50  4084.00    2.00   32.00   216.00  8200.00   247.53    57.79   27.53  15.35  52.20
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    4.00 4173.00   536.00 33384.00     8.12     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     0.00    0.00   20.50     0.00  9228.00   450.15    97.71 1776.78  48.78 100.00
sda               0.00     0.00    0.00   32.00     0.00 13536.00   423.00    72.55 1675.31  31.25 100.00
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     0.00    0.00   13.00     0.00  7220.00   555.38    67.20 3830.46  76.92 100.00
sda               0.00     0.00    0.00   25.50     0.00 11652.00   456.94    38.91 4491.14  39.22 100.00
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    0.50    0.50     4.00     4.00     8.00     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     0.00    0.00   11.00     0.00  5548.00   504.36    50.62 6367.45  90.91 100.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     2.37    0.00   0.00 100.00
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     0.00    0.00    7.50     0.00  6648.00   886.40    28.48 7513.07 133.33 100.00
sda               0.00     0.00    1.50    3.50    12.00    28.00     8.00     0.24  560.80  20.80  10.40
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    1.50    2.00    12.00    16.00     8.00     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     0.00    0.00   10.00     0.00  4036.00   403.60    12.15 9193.00 100.00 100.00
sda               0.00     0.00    1.00    0.50     8.00     4.00     8.00     0.02   14.67  14.67   2.20
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    1.00    0.50     8.00     4.00     8.00     0.00    0.00   0.00   0.00

Best Answer

There's an option -W or --write-mostly which is described very similar to what you get: «…

This is valid for RAID1 only and means that the 'md' driver will avoid reading from these devices if at all possible. This can be useful if mirroring over a slow link.

…» — man mdadm

Check it out. This could be the issue.