Linux – iostat – dm-0 device shows much higher latencies than sdX


We are running Linux (CentOS 5.x) VMs on top of VMware vSphere 5.5. I am monitoring disk latency with iostat, specifically the await column, but I am seeing strange results for the device mapper/LVM device compared with the "physical" disks backing LVM.

Below is one set of output from iostat -x 5 on one of our fairly active VMs. The VM in question has two disks: sda, with one partition holding /boot, and sdb, our main disk, with / on sdb2. While iostat shows ~20-40ms await for the sdb2 device (the only device/partition backing my volgroup / dm-0), iostat for dm-0 shows 100+ms await.

My question is: which statistic is "correct" here, as far as the real latencies the system is seeing? Is it seeing the ~20ms shown for the "physical" disk sdb, or is it really seeing 100+ms from dm-0, maybe due to alignment or similar issues that arise when LVM gets involved? It is strange because sometimes the stats match up pretty well and at other times they are way off. For example, in the block of iostat output below, sdb2 shows 419 write IOPS, while dm-0 shows 39k write IOPS.

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.78    0.00    8.42   39.07    0.00   46.73

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb              15.67 39301.00 745.33 419.67 64146.67 317765.33   327.82    53.55   45.89   0.86 100.07
sdb1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb2             15.67 39301.00 745.33 419.67 64146.67 317765.33   327.82    53.55   45.89   0.86 100.07
dm-0              0.00     0.00 761.33 39720.67 64120.00 317765.33     9.43  4933.92  121.88   0.02 100.07
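
One way to sanity-check the gap in w/s is to add the merged write requests (wrqm/s) on sdb2 to its w/s and compare the total with dm-0, and to recompute the average request size for both devices. This is only a minimal Python sketch using the figures from the output above; the dictionary and function names are made up for illustration, and the reading that requests are merged below dm-0 is an assumption, not something the output proves.

```python
# Figures copied from the iostat sample above (all are per-second rates).
sdb2 = {"r": 745.33, "w": 419.67, "wrqm": 39301.00, "rsec": 64146.67, "wsec": 317765.33}
dm0 = {"r": 761.33, "w": 39720.67, "wrqm": 0.00, "rsec": 64120.00, "wsec": 317765.33}


def writes_before_merge(dev):
    """Completed write requests plus merged write requests, per second."""
    return dev["w"] + dev["wrqm"]


def avg_request_sectors(dev):
    """Average request size in sectors: total sectors / total requests."""
    return (dev["rsec"] + dev["wsec"]) / (dev["r"] + dev["w"])


# Compare sdb2's pre-merge write rate with the w/s column of dm-0.
print("sdb2 w/s + wrqm/s:", round(writes_before_merge(sdb2), 2))
print("dm-0 w/s:         ", dm0["w"])

# Recompute avgrq-sz for both devices from the other columns.
print("sdb2 avg request (sectors):", round(avg_request_sectors(sdb2), 2))
print("dm-0 avg request (sectors):", round(avg_request_sectors(dm0), 2))
```

With these inputs, w/s + wrqm/s on sdb2 comes to 39720.67, the same as w/s on dm-0, and the recomputed request sizes (about 327.82 and 9.43 sectors) match the avgrq-sz column. So the two devices seem to be counting the same write traffic in differently sized requests, which would mean their await values are averages over different request counts.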

I did some further reading, including the links in Gene's answer below. I know there are a lot of variables involved (virtualization, block file system, etc.), but that portion of it seems sorted, per our vendors' and VMware's best practices, and performance is actually very good. I really am just looking at this from the "within the VM" perspective here.

On that note, I suspect there is an issue with our partition + LVM alignment:

GNU Parted 1.8.1
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) unit s
(parted) print

Model: VMware Virtual disk (scsi)
Disk /dev/sdb: 2147483647s
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start     End          Size         Type     File system  Flags
 1      63s       4192964s     4192902s     primary  linux-swap   boot
 2      4192965s  2097151999s  2092959035s  primary               lvm

~]# pvdisplay
  --- Physical volume ---
  PV Name               /dev/sdb2
  VG Name               VolGroup00
  PV Size               998.00 GB / not usable 477.50 KB
  Allocatable           yes (but full)
  PE Size (KByte)       32768
  Total PE              31936
  Free PE               0
  Allocated PE          31936
  PV UUID               tk873g-uSZA-JaWV-R8yD-swXg-lPvM-dgwPQv

Reading up on alignment, it looks like your start sector should be divisible by 8, so that with the standard 512-byte sector size you align on a 4 KB boundary. It looks like LVM is able to align automatically when you apply it to an entire disk, but since we're partitioning the disk first and then making a partition (i.e. /dev/sdb2) a physical volume for LVM to use, I'm not sure it's able to calculate an offset in that case. Per the description of the parameter data_alignment_offset_detection: "If set to 1, and your kernel provides topology information in sysfs for the Physical Volume, the start of the aligned data area of the Physical Volume will be shifted by the alignment_offset exposed in sysfs." This is CentOS 5, and I don't see any of that info exposed in sysfs, only on our CentOS 6 and newer VMs, so it might not be able to align correctly on a physical volume.
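
To make the divisible-by-8 check concrete, here is a minimal Python sketch that applies it to the two start sectors from the parted output above. The function name and constants are just for illustration. It only looks at where each partition starts; the offset of the LVM data area inside the PV isn't shown in the pvdisplay output above, so it is not accounted for here.

```python
SECTOR_BYTES = 512      # logical sector size reported by parted
BOUNDARY_BYTES = 4096   # the 4 KB boundary discussed above


def misalignment_bytes(start_sector, sector_bytes=SECTOR_BYTES,
                       boundary_bytes=BOUNDARY_BYTES):
    """Bytes by which a partition start overshoots the previous boundary.

    A result of 0 means the start sector sits exactly on a boundary.
    """
    return (start_sector * sector_bytes) % boundary_bytes


# Start sectors taken from the parted print output for /dev/sdb.
partition_starts = {"sdb1": 63, "sdb2": 4192965}

for name, start in partition_starts.items():
    offset = misalignment_bytes(start)
    status = "aligned" if offset == 0 else "misaligned by %d bytes" % offset
    print("%s: start sector %d, sector %% 8 = %d, %s"
          % (name, start, start % 8, status))
```

Neither start sector is divisible by 8 (63 leaves a remainder of 7 and 4192965 leaves a remainder of 5), so both partitions start off a 4 KB boundary.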

I found this NetApp whitepaper on VM partition alignment.
Specifically, there's good info in section 4.5, page 29, about partitioning a VM for proper alignment with LVM. I'll follow that so our new VMs are aligned correctly.

This seems like it could cause this behavior. Can anyone with more knowledge/experience confirm that?

Best Answer

There's no easy answer since virtualisation is involved. You have a virtual disk sitting on top of a file system on top of a block device presented to a virtual guest which has its own driver presenting a block device to LVM. I don't know for sure if that would necessarily cause such a huge difference, but it may be possible.

Beyond that...

LVM adds overhead so there will be a difference. If your LVM and block devices aren't aligned properly that can also be a contributing factor.

Alignment isn't a simple subject that can be covered in a setting such as this. The best I can do is refer you to a couple of documents; maybe you'll find more answers in them: