Is running permanently in a VMware snapshot bad for performance?

performance snapshot storage virtualization vmware-esxi

I understand that the VMware KB frowns upon long-running snapshots, mainly for two reasons (in my opinion):

  • Taking tons of snapshots can fill up the datastore. Snapshots are simply delta files. Say you have a 50 GB VMDK, nearly full, and you take a snapshot. If you then flip every single bit, the delta file will also grow to about 50 GB. Snapshot again, flip the bits again, and you get another 50 GB delta file. These can get out of control fast.

  • Committing large snapshots carries risk. When consolidating snapshots you are writing the delta changes to the original VMDK. This takes time and carries the risk that if something happens you just nuked your VMDK.

Their warnings seem to make logical sense.

With that being said, is it inherently bad to run my machine permanently off of a snapshot VMDK? I want to make my tree the following:

  • Base
    • Snap1
      • Snap2
      • You are here

Snap1 and Snap2 will be taken immediately after installing and provisioning the base system. These are machines I plan to refresh frequently, so to refresh one I will simply move back to Snap1, which makes my tree look like the following:

  • Base
    • Snap1
      • You are here
      • Snap2

Then I delete Snap2 and recreate it.

I cannot see how this could have any negative implications, for the following reasons:

  • Since I simply installed a base image and took my deltas immediately after there is no way I could possibly fill up the data store. Assuming my base image is only 10 GB (on a 50 GB thin provisioned disk), even if my delta flipped every single bit the max my total usage could be is 60 GB (10 GB base VMDK which is locked + 50 GB of delta in the snapshot VMDK file). This assumes I do not create any further snapshots.

  • Since my use case does not call for consolidating the snapshots I do not risk errors upon consolidating my deltas. When I move back to Snap1 and delete Snap2, all of the delta that resided in Snap2 simply gets deleted.

  • The storage load is exactly the same, so I should be getting the same IOPS. I understand that some files (mainly system files) will exist on the original VMDK and others (everything after the base) will reside in the delta, but I don't see why ESXi would care. All the files are on the same physical datastore, so the performance should be equivalent to referencing everything in the original VMDK without snapshots.
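As a sanity check on the storage arithmetic in the first bullet, here is a minimal Python sketch. The function name and parameters are my own illustration, not anything from VMware tooling, and it assumes each writable delta can grow to at most the provisioned disk size while ignoring snapshot metadata overhead.

```python
def worst_case_usage_gb(base_used_gb, provisioned_gb, writable_deltas):
    """Rough upper bound on datastore usage for one virtual disk.

    Assumption (mine): each delta that receives writes can grow to the full
    provisioned size of the disk; metadata overhead is ignored.
    """
    return base_used_gb + writable_deltas * provisioned_gb


# Planned tree: 10 GB base on a 50 GB thin disk, with one delta taking all writes.
planned = worst_case_usage_gb(base_used_gb=10, provisioned_gb=50, writable_deltas=1)

# The "out of control" case: a nearly full 50 GB disk, snapshotted twice,
# with every bit rewritten after each snapshot.
stacked = worst_case_usage_gb(base_used_gb=50, provisioned_gb=50, writable_deltas=2)

print(planned, stacked)
```

The planned tree stays bounded only because the number of deltas that ever receive significant writes is fixed at one.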

Any thoughts? This is ESXi 5.5 with the datastore on RAID'd DAS.

I do not have a vCenter license, so templating and cloning are off the table.

RESULTS OF TEST

I got in early today to run some tests. Here are the results. There is a performance penalty, but I'm not sure why.

Before snapshotting:
[image: test results before snapshotting]

After snapshotting:
[image: test results after snapshotting]

Best Answer

Yes, there are performance implications for long-running snapshots. There are even greater implications for consolidating delta VMDKs back to the original disk file. This can cause unresponsiveness in your VM's operating system or other undesirable behavior.
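One way to picture where the overhead can come from is a toy copy-on-write chain. The sketch below is a deliberately simplified model, not VMware's actual on-disk format; the `Layer` class and `consolidate` function are hypothetical names made up for illustration. It shows that a read for a block that was never rewritten has to fall through every delta before reaching the base, and that consolidation rewrites blocks in the parent disk.

```python
class Layer:
    """Toy disk layer: a base disk or a snapshot delta (hypothetical model)."""

    def __init__(self, parent=None):
        self.parent = parent   # None for the base disk
        self.blocks = {}       # block number -> data written in this layer

    def write(self, block_no, data):
        # Writes always land in the layer the VM is currently running on.
        self.blocks[block_no] = data

    def read(self, block_no):
        """Return (data, layers_checked), walking from this layer down to the base."""
        layer, checked = self, 0
        while layer is not None:
            checked += 1
            if block_no in layer.blocks:
                return layer.blocks[block_no], checked
            layer = layer.parent
        return None, checked


def consolidate(child):
    """Merge a delta into its parent; returns how many blocks were rewritten."""
    child.parent.blocks.update(child.blocks)
    return len(child.blocks)


base = Layer()
base.write(0, "os files")
snap1 = Layer(parent=base)
current = Layer(parent=snap1)    # the VM runs on the newest delta
current.write(1, "new data")

fresh = current.read(1)          # found in the top delta
old = current.read(0)            # falls through both deltas to the base
merged = consolidate(current)    # copies the delta's blocks into snap1
print(fresh, old, merged)
```

In this model the cost of a read grows with chain depth, and the cost of consolidation grows with the amount of data in the delta. Real hypervisors add their own metadata lookups and caching, so treat it as intuition only.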

VMware has templating and cloning functionality built into vCenter. You need a $600 vSphere Essentials license to enable this.

You can create a VM to your taste, then clone it to a template. That template can then be used to generate new virtual machines from a "Golden Master" image.


This allows you to have a "clean state" but also create long-running or permanent VMs from that master image. No snapshots needed.