Linux – Best practices for thin-provisioning Linux servers (on VMware)

linux, lvm

I have a setup of about 20 Linux machines, each with about 30–150 gigabytes of customer data. The amount of data will probably grow significantly faster on some machines than on others. These are virtual machines on a VMware vSphere cluster, and the disk images are stored on a SAN system.

I'm trying to find a solution that would use disk space sparingly, while still allowing for easy growing of individual machines.

In theory, I would just create big disks for each machine and use thin provisioning, letting each disk grow as needed. However, it seems that a 500 GB ext3 filesystem with only 50 GB of data and quite a low number of writes still easily grows the disk image to e.g. 250 GB over time. Or maybe I'm doing something wrong here? (I was surprised by how little I found on the subject with Google. By the way, there isn't even a thin-provisioning tag on serverfault.com.)

Currently I'm planning to create big, thin-provisioned disks – but with only a small LVM volume on each. For example: a 100 GB volume on a 500 GB disk. That way I could more easily grow the LVM volume and the filesystem online as needed.
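To make that concrete, the plan would look roughly like this (assuming the 500 GB thin disk shows up in the guest as /dev/sdb; device, volume group and volume names are just placeholders):

    # Use the whole disk as a PV here for brevity -- see the bonus
    # question below about whether to partition first.
    pvcreate /dev/sdb
    vgcreate vg_data /dev/sdb
    lvcreate -L 100G -n lv_data vg_data   # 100 GB volume on the 500 GB disk
    mkfs.ext3 /dev/vg_data/lv_data

    # Later, grow the volume and the filesystem online as needed:
    lvextend -L +50G /dev/vg_data/lv_data
    resize2fs /dev/vg_data/lv_data        # ext3 supports online growth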

Now for the actual question:

Are there better ways to do this? (that is, to grow data size as needed without downtime.)

Possible solutions include:

  • Using a thin-provisioning-friendly filesystem that tries to reuse the same blocks over and over again, thus not growing the image size.

  • Finding an easy method of reclaiming free space on the partition (re-thinning? – see the sketch after this list).

  • Something else?
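One semi-manual approach to reclaiming space that I've seen described is to zero out the free space inside the guest and then have VMware deallocate the zeroed blocks. I haven't verified this on every ESXi version, the vmkfstools step requires the VM to be powered off, and all paths below are placeholders:

    # Inside the guest: overwrite the free space with zeros, then delete
    # the file. dd will stop with "No space left on device" -- expected,
    # but the filesystem is briefly full, so be careful on a live system.
    dd if=/dev/zero of=/mnt/data/zerofill bs=1M
    rm /mnt/data/zerofill
    sync

    # On the ESXi host, with the VM powered off: punch out the zeroed
    # blocks (vmkfstools -K / --punchzero; availability depends on the
    # ESXi release).
    vmkfstools -K /vmfs/volumes/datastore1/myvm/myvm.vmdk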

A bonus question: if I go with my current plan, would you recommend creating partitions on the disks (pvcreate /dev/sdX1 vs. pvcreate /dev/sdX)? I think it's against convention to use raw disks without partitions, but skipping the partition would make it a bit easier to grow the disks, if that is ever needed. This is all just a matter of taste, right?
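For reference, this is the growth difference I mean (device names are placeholders): with a whole-disk PV there is no partition table to adjust after enlarging the VMDK.

    # After enlarging the VMDK in vSphere, have the guest re-read the size:
    echo 1 > /sys/class/block/sdb/device/rescan

    # Whole-disk PV (pvcreate /dev/sdb): one step.
    pvresize /dev/sdb

    # Partition-based PV (pvcreate /dev/sdb1): grow the partition first
    # (fdisk/parted), and only then:
    pvresize /dev/sdb1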

Best Answer

If I understand thin provisioning correctly, it can really cause problems if you aren't closely monitoring the growth of your VMFS volumes and you let your VMDKs fill them up. You've seen in your testing that thin-provisioned disks tend to grow quickly toward their full provisioned size, and that they cannot reclaim space that is freed inside the guest OS.
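If you want a quick check of how much a thin disk has actually grown, you can compare its apparent size with the blocks it really occupies from the ESXi shell (the paths here are just examples):

    # Provisioned (apparent) size of the thin disk:
    ls -lh /vmfs/volumes/datastore1/myvm/myvm-flat.vmdk

    # Space actually allocated on the VMFS volume:
    du -h /vmfs/volumes/datastore1/myvm/myvm-flat.vmdk

    # Remaining free space on the datastore:
    df -h /vmfs/volumes/datastore1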

The other option is to create VMDK files sized for your current usage plus expected spikes in growth, and simply add more VMDK files as your application's data usage grows. New VMDK files can be added to a VM live; you just have to rescan the SCSI bus inside the guest (echo "- - -" > /sys/class/scsi_host/host?/scan). You can then partition the new disk, add it to your LVM volume group, and extend the filesystem, all online. This way you always know exactly how much space is allocated to each VM, and you can't accidentally run your VMFS out of space from inside a guest.
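Spelled out, the hot-add procedure looks something like this, assuming the new disk appears as /dev/sdc and the existing volume group is called vg_data (adjust names to your setup):

    # Rescan all SCSI hosts so the guest sees the newly added VMDK:
    for scan in /sys/class/scsi_host/host*/scan; do
        echo "- - -" > "$scan"
    done

    # Partition the new disk (see the alignment example below), then fold
    # it into the existing volume group and grow everything online:
    pvcreate /dev/sdc1
    vgextend vg_data /dev/sdc1
    lvextend -L +100G /dev/vg_data/lv_data
    resize2fs /dev/vg_data/lv_data    # ext3 supports online growth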

As for whether to partition or not when the disk is only going to be used by LVM: I always partition. Partitioning the disk prevents warnings about a bogus partition table from coming up when the machine boots, and it makes it clear that the disk is allocated. It's a bit of voodoo, but I also make sure to start the partition at sector 64 to help ensure the partition and filesystem are block-aligned with the underlying storage. Misalignment is hard to detect and quantify, since you usually don't have anything to easily compare against, but if the OS filesystem isn't aligned properly with the underlying storage you can end up needing extra IOPS to service requests that cross block boundaries on the storage.
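For example, with parted (the device name is a placeholder; 64 sectors = 32 KiB with 512-byte sectors, so parted may print an "not properly aligned for best performance" warning, which is expected here):

    # Create a partition starting at sector 64 and flag it for LVM:
    parted -s /dev/sdb mklabel msdos
    parted -s /dev/sdb unit s mkpart primary 64 100%
    parted -s /dev/sdb set 1 lvm on

    # Verify the start sector (the Start column should read 64s):
    parted -s /dev/sdb unit s print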
