Linux Performance – Relation Between Disk IOPS and SAR TPS

Tags: hard drive, iops, linux, performance, sar

I'm trying to estimate the IOPS requirements of my application running on 32-bit CentOS 6.2. I started taking some measurements on a machine with SATA disks, and I'm quite confused by the difference between IOPS and the tps measured by sar.

According to Wikipedia, a SATA disk should deliver 75-100 IOPS. The ioping utility seems to confirm this in a random-access test:

# ./ioping -R /dev/sda
--- /dev/sda (device 931.0 Gb) ioping statistics ---
279 requests completed in 3.0 s, 92 iops, 371.3 kb/s
min/avg/max/mdev = 2.7 ms / 10.8 ms / 130.8 ms / 7.9 ms

But the tps values produced by sar are much higher (see /dev/sda in this iostat output):

# iostat 1
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
       0.17    0.00    2.02   14.86    0.00   82.96

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             559.00         0.00    142600.00          0     142600
dm-0          18433.00         0.00    147464.00          0     147464
dm-1              0.00         0.00         0.00          0          0
dm-2              0.00         0.00         0.00          0          0

It does not really matter whether the load is sequential (dd with various block sizes) or random access (ioping); the value stays the same. I thought tps actually was IOPS, and I would expect it to go down as larger chunks are transferred.

So what exactly does the tps value mean? And how does it relate to IOPS?

Best Answer

Transactions are single I/O commands (fetch block, write block) that are issued to the raw device (in your example, dm-0). The Linux kernel tries to reorder those commands into a better sequence or to merge them into more efficient commands (for example, fetching two blocks at once instead of fetching one block and then another right after it). The merged commands are the transactions that go out to the disk controller (tps for sda).
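The effect of this merging can be read off the iostat sample in the question by dividing blocks per second by tps. The following is a minimal sketch, not part of the original answer; it assumes that one iostat "block" is a 512-byte sector, and the names `SAMPLE` and `avg_request_kib` are made up for illustration.

```python
# Figures copied from the iostat sample in the question: device -> (tps, Blk_wrtn/s).
SAMPLE = {
    "sda": (559.00, 142600.00),
    "dm-0": (18433.00, 147464.00),
}

# Assumption: one iostat "block" is a 512-byte sector.
BLOCK_BYTES = 512


def avg_request_kib(tps, blocks_per_s, block_bytes=BLOCK_BYTES):
    """Average size of one transfer in KiB; 0.0 for an idle device."""
    if tps == 0:
        return 0.0
    return blocks_per_s / tps * block_bytes / 1024


for device, (tps, blocks_per_s) in SAMPLE.items():
    print(f"{device}: {avg_request_kib(tps, blocks_per_s):.1f} KiB per transfer")
```

Under that assumption, dm-0 sees transfers of about 4 KiB each, while sda sees transfers of roughly 128 KiB each. That is consistent with many small requests being merged into fewer, larger ones before they reach the disk.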

Good controllers might have logic of their own that reduces the real number of transactions further.

A transaction might be the SCSI command "write 2 GB to controller 1, target 2, LUN 3, starting from sector 22". As you can see, tps cannot be directly correlated with throughput numbers.
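To see why tps alone says little about throughput, consider that throughput is roughly the transaction rate multiplied by the average request size. The sketch below is only an illustration: the rate of 100 tps and the three request sizes are made-up values, and `throughput_mib_s` is a hypothetical helper name.

```python
def throughput_mib_s(tps, request_kib):
    """Throughput in MiB/s for a given transaction rate and request size in KiB."""
    return tps * request_kib / 1024


# Illustrative values only: the same 100 tps at three different request sizes.
for request_kib in (4, 128, 1024):
    rate = throughput_mib_s(100, request_kib)
    print(f"100 tps x {request_kib:>4} KiB = {rate:.2f} MiB/s")
```

The same tps figure can therefore correspond to well under 1 MiB/s or to 100 MiB/s, depending entirely on how large each transaction is.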

What you are after is the sustained write-rate. You have a couple of limiting factors here:

  • Client connection: if the network is Gigabit, you will never get more than about 100 MB/s of input
  • Disk controller: if this is a 3 Gb/s controller, you will never get more than 300 MB/s of throughput
  • Disk: look up the manufacturer's value for sustained write performance
  • Filesystem: there is a little overhead, since the OS needs to process the data; test that on a RAM disk
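The sustained rate is bounded by the smallest of these ceilings. A rough sketch of that reasoning follows; the network and controller figures are the ones quoted in the list, while the disk figure is a hypothetical placeholder to be replaced with the manufacturer's value. Filesystem overhead is not modelled.

```python
# Rough ceilings in MB/s. The disk value is a hypothetical placeholder;
# substitute the manufacturer's sustained write rate for the real drive.
LIMITS_MB_S = {
    "network (Gigabit)": 100,
    "controller (3 Gb/s)": 300,
    "disk (hypothetical)": 120,
}


def bottleneck(limits):
    """Return (name, value) of the smallest ceiling in the mapping."""
    name = min(limits, key=limits.get)
    return name, limits[name]


name, ceiling = bottleneck(LIMITS_MB_S)
print(f"Sustained write rate is capped at about {ceiling} MB/s by: {name}")
```

With these placeholder numbers the network is the limiting factor, so a faster disk or controller would not raise the sustained rate.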

My guess for your system is: get a good hardware RAID controller that is capable of RAID 10 or RAID 5, and get at least six fast (15k rpm) disks.

For professional use, choose SAS instead of SATA.
