Hyper-V VMs sometimes freeze after backup

hyper-v-server-2012-r2

I'm having issues with a Hyper-V host (2012 R2 Datacenter, patched to June 2017 Rollup level). Here's a bit of general information on the configuration:

Host

  • It's a standalone host, domain joined (DC is not a guest).
  • It is a full 2012 R2 installation, neither the standalone Hyper-V Server nor Core.
    • Hyper-V is the only role, apart from "File Server". The host is only used as a Hyper-V host, with no network shares or other activity.
  • The VM files are stored on a SAN, connected via iSCSI.
  • The volumes that hold the VM files (one volume per VM) are formatted as NTFS.
  • There are no VSS locations configured for those volumes (which, from what I've read, should be fine for VSS-based backups; what matters is that no volume INSIDE the guests is configured to store VSS copies on a volume other than itself).
  • I use Windows Server Backup to run nightly backups of all VMs.
    • Included are Bare-Metal-Recovery, System State, Hyper-V Host Component and VMs, plus EFI and system volume.
    • Backup is set as VSS copy backup.
    • Backup is written to internal storage.
  • The BPA for Hyper-V shows no relevant warnings or errors.

If there are any misconfigurations, I've missed them so far (though my 2012 R2 course is still pending, so I'm not an expert here). Now for the guests.

VMs

  • Three Windows Server 2012 R2 guests, same patch level as the host.
  • Guests are mostly low to very low load levels. The only VM that sees regular use is our file server (around 1.5 TB in one .vhdx volume).
  • 1 Debian guest.
  • Guest integration services are enabled.
  • VSS settings inside the Windows guests: All volumes have themselves configured as VSS location.

The problem:
Sometimes (I have not found a time pattern so far) the backup either stops with a Hyper-V VSS error, or the backup completes but some VMs (sometimes one, sometimes all, Windows or Linux) are frozen.

When I check the Virtual Disk folder, there's an .avhdx still present, so the merge after the backup must have failed. My guess is that the VM is pointed back at the main disk file without the .avhdx being merged, so I get a frozen machine because it can't find its system drive. Shutting down the guest (hard power-off) lets these files merge without any further interaction from me, and all is well. Occasionally the Linux guest no longer boots afterwards, but that's what backups are for.

On guests with multiple mounted volumes, another variant is that one or more of the volumes go missing (still attached in the Hyper-V guest settings, but nowhere to be found inside the guest). Shutting down the guest, then removing and re-adding the volume, fixes this.
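
To spot the leftover .avhdx files without browsing each folder by hand, something like the following could be run on the host once the backup job has finished. This is only a rough sketch: the function name `find_leftover_avhdx` and the idea of scanning a single root folder are my own assumptions, and it assumes no checkpoints are kept on purpose (those would also show up as .avhdx files).

```python
from pathlib import Path


def find_leftover_avhdx(vm_root):
    """Return (path, size_in_bytes) for every .avhdx file below vm_root.

    Hypothetical helper: an .avhdx that is still present after the backup
    has finished suggests the post-backup merge did not complete.
    """
    leftovers = []
    for path in Path(vm_root).rglob("*"):
        # Compare the extension case-insensitively, as Windows file names are.
        if path.is_file() and path.suffix.lower() == ".avhdx":
            leftovers.append((path, path.stat().st_size))
    # Sort by path so the result is stable between runs.
    return sorted(leftovers)
```

Calling `find_leftover_avhdx(...)` with the folder that holds the VM disks and printing the result after each nightly run would at least show which VMs are affected and how often, which might help narrow down a pattern.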

Is this some kind of known quirk of Hyper-V that I'll have to live with, or can I do something about it? Am I running into some kind of VSS timeout due to the large .vhdx of the file server, or is something else misconfigured? Constantly taking guests offline doesn't seem like best practice.

Thanks for any help.

Best Answer

Have you got integration services configured correctly on all the VMs? I had a similar problem with our system, where a Debian VM kept hanging overnight. Getting up-to-date integration services for Debian wasn't easy (I switched to CentOS). The Windows VMs should be fine, though. Check that the integration services are enabled in each VM's settings.
