Linux – High await in sar

Tags: io, linux, performance, sar

My database server has the following sar output for the data device:

[postgres@dbsrv07 ~]$ LC_ALL=POSIX sar -d | egrep "await|dev253-2"
00:00:01          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
00:10:01     dev253-2   2721.27  18357.23  20291.52     14.20    613.68    225.51      0.15     40.60
00:20:01     dev253-2   1345.04    574.92  10685.38      8.37    290.65    215.99      0.06      8.61
00:30:01     dev253-2    801.39    193.53   6364.92      8.18     87.49    109.34      0.07      5.95
00:40:01     dev253-2    832.95    195.70   6617.82      8.18     89.30    107.20      0.07      5.87
00:50:01     dev253-2    835.58    162.90   6644.64      8.15     85.35    102.14      0.06      5.24
01:00:01     dev253-2    847.99    232.36   6722.90      8.20     89.91    106.03      0.07      5.64
01:10:01     dev253-2   2240.78   2295.28  17543.52      8.85    163.37     72.91      0.10     23.06
01:20:01     dev253-2   2706.18   1358.97  21482.68      8.44    175.98     65.00      0.08     20.73
01:30:01     dev253-2   5839.31   3292.69  45960.39      8.43    520.98     89.19      0.07     42.24
01:40:01     dev253-2   5221.88   1945.32  41384.97      8.30    553.92    106.05      0.06     33.85

The high await persists throughout the day.

Am I right in assuming that this indicates an I/O bottleneck?

Thanks

Best Answer

svctm is a measure of how long the storage took to respond once a command left the IO scheduler and was no longer under the kernel's control. You're seeing less than 1 ms here, which is excellent.

await is a measure of how long a given IO spent in total: time queued in the scheduler plus the service time on the device. You're seeing hundreds of milliseconds here, which is pretty bad. Different people and vendors have different ideas about what counts as "good"; I'd say under 50 ms is good.

If your physical storage were slow, you'd see a large svctm and a large await. If the IOs are instead piling up in the kernel's queue, you'll see a large await but a small svctm, which is exactly your pattern.
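If you want to watch this live rather than waiting for the next sar interval, iostat reports the same per-device counters. A minimal sketch, assuming dev253-2 shows up as dm-2 (major 253 is device-mapper) and that your sysstat version lists dm devices:

# Extended device stats every 5 seconds; watch the await and svctm columns.
# "dm-2" is an assumption based on the dev253-2 name; verify with: ls -l /dev/dm-*
iostat -dx dm-2 5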

What IO scheduler are you using for this device? Given the small request size (avgrq-sz is reported in 512-byte sectors, so ~8 sectors is roughly 4 KB per request), you care more about the latency of individual requests than about bulk throughput. You'd probably be best off using the deadline scheduler rather than the default cfq scheduler.
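You can check what is currently in use through sysfs; the active scheduler is the one shown in brackets. Note that dev253-2 is a device-mapper volume, which has no elevator of its own, so the schedulers that matter are the ones on its backing disks. A sketch, with sda as a stand-in for one of those disks:

# List the disks backing the dm volume (dm-2 assumed from dev253-2):
ls /sys/block/dm-2/slaves
# Show the scheduler on a backing disk; the bracketed entry is active:
cat /sys/block/sda/queue/scheduler
# example output: noop anticipatory deadline [cfq]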

This is done by putting elevator=deadline on the kernel line in grub.conf and rebooting.
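Roughly, the grub.conf change looks like the line below (the kernel version and root device are placeholders, not your system's); you can also flip the scheduler at runtime through sysfs to try it out before committing to the boot parameter:

# grub.conf: append elevator=deadline to the existing kernel line, e.g.
kernel /vmlinuz-2.6.32-431.el6.x86_64 ro root=/dev/mapper/vg00-root elevator=deadline

# Or switch a backing disk at runtime, no reboot needed (sda assumed):
echo deadline > /sys/block/sda/queue/scheduler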

Also, you have hundreds of IOs backed up in the queue (avgqu-sz) and you're pushing thousands of IOPS (tps). Assuming these are database IOs issued with direct IO, they can't be merged into larger requests or take advantage of the page cache, so you may simply be expecting too much from the storage subsystem.
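Incidentally, those columns are internally consistent: by Little's law, avgqu-sz should be roughly tps multiplied by await (converted to seconds). Checking the first sample row:

# 2721.27 IO/s x 225.51 ms of await each ~= 613.7 IOs in flight,
# which matches the reported avgqu-sz of 613.68 almost exactly.
awk 'BEGIN { printf "%.1f\n", 2721.27 * 225.51 / 1000 }'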
