Linux LVM – Best Scheduler for Virtual Machines


With LVM, you have a scheduler entry in /sys/block for your physical volumes, but also for each individual logical volume and for the raw device.
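
For illustration, these entries can be inspected like this (sdc, dm-1 and dm-2 are example names from this kind of setup; an LV also shows up as its own dm-N node in /sys/block):

cat /sys/block/sdc/queue/scheduler    # raw device
cat /sys/block/dm-1/queue/scheduler   # physical volume (a device-mapper node here)
cat /sys/block/dm-2/queue/scheduler   # a logical volume (also a dm-N node)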

We have a Debian 6 LTS x64 system, kernel 2.6.32, running Xen hypervisor 4.0 (3Ware 9650 SE hardware RAID1). When running virtual machines on individual logical volumes, on which one do you need to set the scheduler if you want to influence how they get scheduled by the OS? If you set the logical volume to deadline, will that even do anything while the physical volume is set to cfq? And if you do set the logical volume to deadline, will those deadlines be honoured even when the disk is slowing down because of IO on other LVs that are set to cfq?

The question relates to IO on one VM slowing down the other VMs too much. All guests use noop as their scheduler internally.
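
For reference, noop inside a guest can be set per device at runtime, or permanently via the guest's kernel command line (xvda is just an example Xen guest device name):

echo noop > /sys/block/xvda/queue/scheduler   # at runtime, inside the guest
# or on the guest's kernel command line: elevator=noop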

Edit: according to this, in a multipath environment, only the DM's scheduler takes effect. So if I want to handle IO between virtual machines in a deadline manner, I have to set the DM path of the physical volume (dm-1 in my case) to deadline. Is that right? There is also a scheduler for sdc, the original block device behind my dm-1. Why shouldn't it be done on that one instead?
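
If that is right, the change would look roughly like this (a sketch, assuming dm-1 is indeed the device whose scheduler takes effect):

echo deadline > /sys/block/dm-1/queue/scheduler
cat /sys/block/dm-1/queue/scheduler   # the active scheduler is shown in brackets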

Edit 2: but then someone says in the comments that dm-0/dm-1 don't have a scheduler in newer kernels:

famzah@VBox:~$ cat /sys/block/dm-0/queue/scheduler
none

On my system (Debian 6, kernel 2.6.32), I have:

cat /sys/block/dm-1/queue/scheduler 
noop anticipatory [deadline] cfq

Another question is: do I have a multipath setup at all? pvs shows:

# pvs
PV         VG                 Fmt  Attr PSize PFree
/dev/dm-0  universe           lvm2 a-   5,41t 3,98t
/dev/dm-1  alternate-universe lvm2 a-   1,82t 1,18t

But they were created with /dev/sd[bc]. Does that mean I have multipath, even though it's a standard LVM setup?
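
One way to find out what dm-0 and dm-1 actually are (tool availability varies; multipath -ll needs multipath-tools installed):

dmsetup ls --tree   # shows the device-mapper stack: multipath maps, LVs, etc.
multipath -ll       # lists active multipath maps, if any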

The main question, I guess, is do I have to set the scheduler on sdc or dm-1? If I do iostat, I see a lot of access on both:

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdc               0,00     0,00   13,02   25,36   902,71   735,56    42,68     0,08    2,17   0,73   2,79
dm-1             82,25    57,26   12,97   25,36   902,31   735,56    42,72     0,18    4,73   0,84   3,23

So, what is what and who is the boss? If it's sdc, I can tell you that setting it to deadline doesn't do a thing for the performance of my VMs. Looking at the difference in the 'requests merged' columns (first two), I'd say it's dm-1 that controls the scheduling.
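
For completeness, the numbers above come from iostat's extended statistics, collected with something like this (interval and device names as examples):

iostat -x 5 sdc dm-1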

Best Answer

So, the answer turned out to be simply: the underlying device. Newer kernels only have 'none' in /sys/block/*/queue/scheduler when there is no scheduler to configure.

However, for a reason unknown to me, the devices on this server were created as multipath devices, so my fiddling with the scheduler on /dev/sd[bc] never did anything in the past. I have now set dm-0 and dm-1 to deadline with read_expire=100 and write_expire=1500 (much more stringent than normal), and the results look very good.
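
Concretely, the change amounts to something like the following sketch (not necessarily the exact commands used; read_expire and write_expire are the standard deadline tunables, with defaults of 500 ms and 5000 ms respectively):

for dev in dm-0 dm-1; do
    echo deadline > /sys/block/$dev/queue/scheduler
    echo 100  > /sys/block/$dev/queue/iosched/read_expire    # default 500 ms
    echo 1500 > /sys/block/$dev/queue/iosched/write_expire   # default 5000 ms
done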

This graph shows the effect on disk latency in a virtual machine, caused by another virtual machine with an hourly task:

Disk latency over 24h in ms

You can clearly see the moment where I changed the scheduler parameters.
