How to keep time on resumed KVM guest with libvirt

kvm-virtualization libvirt time-synchronization virtualization

On my host I am using libvirt with a KVM guest. When the host shuts down, libvirt suspends the guest. When the host starts up, libvirt resumes the guest. The problem is that if the guest is suspended and then resumed 24 hours later, for example, the guest's clock is 24 hours behind.

I thought that maybe the problem is with the clocksource, but it is set to "kvm-clock" already.

$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
kvm-clock tsc hpet acpi_pm 

$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
kvm-clock

Best Answer

The Problem

I've got the same problem and I haven't found a good solution. Here's what I found:

The problem is that after resume, the system and hardware clock times on the guest are different:

root@guest:~# date; hwclock
Sat Oct 11 13:09:38 UTC 2014
Sat Oct 11 13:10:42 2014  -0.454380 seconds

On the host, they agree:

root@four:~# date; hwclock
Sat Oct 11 13:11:35 UTC 2014
Sat Oct 11 13:11:36 2014  -1.000372 seconds

The solution would be to run hwclock --hctosys on the guest after it's been resumed. However, I haven't found a way to do this with changes on the guest system only, as the guest doesn't notice that it is suspended and resumed.
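To make the date/hwclock comparison above scriptable, here is a minimal sketch that computes the difference between the two clocks and decides whether a resync is worthwhile. The function names clock_drift and needs_resync are made up for illustration. The usage comment assumes that the guest exposes its hardware clock as /sys/class/rtc/rtc0/since_epoch and that the hardware clock is kept in UTC; neither is confirmed for the setup described here.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any package.

# Print the difference (RTC minus system clock) in seconds.
# A positive value means the system clock is behind the hardware clock.
clock_drift() {
    rtc_epoch=$1
    sys_epoch=$2
    echo $((rtc_epoch - sys_epoch))
}

# Succeed if the absolute drift is larger than the threshold (in seconds).
needs_resync() {
    drift=$1
    threshold=$2
    [ "${drift#-}" -gt "$threshold" ]
}

# Possible use on the guest, as root (assumes rtc0 exists and runs in UTC):
#   drift=$(clock_drift "$(cat /sys/class/rtc/rtc0/since_epoch)" "$(date +%s)")
#   if needs_resync "$drift" 5; then hwclock --hctosys; fi
```

With the guest output shown above (13:09:38 versus 13:10:42), the drift would be 64 seconds. Note that running such a check periodically only works around the fact that the guest is not told about the resume, and it steps the clock rather than slewing it.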

QEMU Guest Agent

One option is to run the QEMU Guest Agent on the guest and have the host tell it to set the guest system clock from the guest hardware clock. However, the page mentions that the guest agent makes the host and guest vulnerable to attacks from each other because of issues with a JSON parser (I believe the affected code also runs on the host, but I'm not sure about that). Anyway, here's how to set that up:

  1. Set up a virtio serial channel for the agent as described in the libvirt wiki (see also the libvirt domain format documentation).

  2. Once the serial channel is available, install and start the QEMU Guest Agent on the guest. (Debian: apt-get install --no-install-recommends qemu-guest-agent.)

  3. Trigger the clock offset by suspending, waiting and resuming. Then run the following command on the host to correct it ("backup" is the name of the guest domain here):

     virsh qemu-agent-command backup '{"execute":"guest-set-time"}'

     The wiki page says that using virsh qemu-agent-command is unsupported, but I haven't found any other command that does the job.
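The host-side call from step 3 can be wrapped in a small script, for example to run it right after resuming the guest. This is only a sketch: guest_set_time_payload and sync_guest_clock are hypothetical names. As far as I understand the agent protocol, guest-set-time without arguments sets the guest system clock from the guest hardware clock, and an optional "time" argument gives the time in nanoseconds since the epoch.

```shell
#!/bin/sh
# Hypothetical helpers wrapping the command from step 3.

# Print the JSON payload for guest-set-time.
# Without an argument the agent is asked to use the guest hardware clock;
# with an argument (nanoseconds since the epoch) that time is passed explicitly.
guest_set_time_payload() {
    if [ -n "${1:-}" ]; then
        printf '{"execute":"guest-set-time","arguments":{"time":%s}}\n' "$1"
    else
        printf '{"execute":"guest-set-time"}\n'
    fi
}

# Send the payload to the agent of the given libvirt domain.
sync_guest_clock() {
    domain=$1
    virsh qemu-agent-command "$domain" "$(guest_set_time_payload)"
}

# Possible use on the host, with the guest from this answer:
#   sync_guest_clock backup
```

Newer libvirt releases also ship a virsh domtime command with a --sync option, which may be a supported alternative; I haven't tested it, so check whether your version has it.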

I found two discussions on automating the call to guest-set-time within libvirt on resume from suspend. However, nothing has been implemented yet as far as I could see.

I found information on how to submit commands to the guest agent on the wiki of stoney-cloud.org.

I've also tried setting tickpolicy="catchup" in the libvirt timer configuration but this didn't solve the problem.

NTP

An alternative to using the agent would be to run an NTP daemon or to call ntpdate periodically from a cron job. I wouldn't recommend the latter, as it can cause the time to go backwards, which can confuse programs (for example, the Dovecot IMAP server doesn't try to handle time going backwards and can terminate).

I tried the following NTP daemons:

  • openntpd: Corrects time very slowly, at a rate of about 2 seconds per 60 minutes in my test with a time offset of 120 seconds. Also, openntpd reports an error if the time offset is too large and, in my test, completely fails to correct the time in that case. Advantage of openntpd: it can run as a regular user in a chroot.

  • chrony: Corrects a time offset of 120 seconds in 30 minutes in my test. chrony can be configured to run as regular user. chroot support is not implemented. NTP server polling interval can be configured for each NTP server.

  • systemd-timesyncd: Corrects a time offset of 120 seconds in 30 seconds in my test. Runs as regular user by default. However, the polling interval of NTP servers increases up to 2048 seconds, so in the worst case a suspend/resume wouldn't be detected until 34 minutes after the resume. This does not seem to be configurable. Also, I've observed timesyncd step the time backwards, which causes the same problems as calling ntpdate from a cron job (see above).

chrony solves the problem. openntpd isn't suitable because its correction rate is too low and doesn't seem to be configurable. systemd-timesyncd doesn't entirely solve the problem either, because its polling interval is not configurable.
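To put the measured rates side by side, here is a small arithmetic sketch using only the numbers from my tests above. The helper name minutes_to_correct is made up, and it assumes the correction rate stays constant over the whole offset, which the daemons don't guarantee.

```shell
#!/bin/sh
# Hypothetical helper: minutes needed to correct OFFSET seconds when a
# daemon corrects SECONDS seconds every PER_MINUTES minutes.
# Assumes a constant correction rate; uses integer arithmetic.
minutes_to_correct() {
    offset=$1
    seconds=$2
    per_minutes=$3
    echo $((offset * per_minutes / seconds))
}

minutes_to_correct 120 2 60     # openntpd: prints 3600 (minutes, i.e. 60 hours)
minutes_to_correct 120 120 30   # chrony: prints 30 (minutes)
echo $((2048 / 60))             # timesyncd maximum poll interval: prints 34 (minutes)
```

So at the observed rate, openntpd would need about 60 hours for a 120 second offset, and a 24 hour offset is out of reach for it even if it accepted offsets that large.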

I tested the following Debian versions of the NTP daemons: openntpd 20080406p-10, chrony 1.30-1 and systemd 215-5+b1.