Need a tweak to force VMware ESXi 5 to sync time with Windows NTP server

ntp vmware-esxi

I'm plagued by a seemingly simple but actually troublesome problem, so I ask for help here.

Symptom

I installed an instance of VMware ESXi 5.1 inside VMware Workstation to learn about various ESXi features. Let me call it the vESXi machine. My problem is that every time I suspend the vESXi at night and resume it the next day, its clock lags behind. That is, if I suspend the vESXi at 22:00 and resume it the next morning, it reports that it is still 22:00 of the previous night. I know this is normal, because a PC keeps time by counting hardware timer tick interrupts, and no ticks are counted while the VM is suspended. So I decided to have the vESXi use NTP to sync with an NTP server. My NTP server is a Windows Server 2003 domain controller on the same LAN segment.

I quickly noticed the problem: my vESXi never syncs its time with the Windows NTP server, no matter how long I wait.

You may suggest that I shut down the vESXi instead of suspending it, but I really like the suspend feature of VMware Workstation, which is quick and smooth.

snap0327-esxi-time-drift.png

Questions

I've read http://kb.vmware.com/kb/1035833, which provides a workaround (a tweak to make ESXi work with a Windows NTP server), but that KB article raised more questions than it answered on first reading. What's more, it is cumbersome to apply the tweak to every vESXi if I have many of them.

Q1 (primary): What on earth do these statements mean?

By default, an unsynced Windows server chooses a 10-second dispersion
and adds to the dispersion on each poll interval that it remains in
sync. An ESXi/ESX host, by default, does not accept any NTP reply with
a root dispersion greater than 1.5 seconds.
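A rough way to read those two statements: the client computes a synchronization distance for each server and rejects the server if that distance exceeds a threshold. The sketch below is a simplification, assuming the distance is just half the root delay plus the root dispersion (real ntpd adds further terms), and using the 1.5-second and 10-second figures quoted from the KB. The function names and the 50 ms root delay are made up for illustration.

```python
def root_distance(root_delay_s, root_dispersion_s):
    """Simplified synchronization distance: half the root delay plus the root dispersion."""
    return root_delay_s / 2.0 + root_dispersion_s


def is_acceptable(root_delay_s, root_dispersion_s, maxdist_s=1.5):
    """Return True if the server's distance is below the client's threshold.

    maxdist_s defaults to the 1.5 s limit mentioned in the KB article.
    """
    return root_distance(root_delay_s, root_dispersion_s) < maxdist_s


# Hypothetical reply from a Windows server advertising a 10 s root dispersion.
windows_like = {"root_delay_s": 0.05, "root_dispersion_s": 10.0}

print(is_acceptable(**windows_like))                  # checked against the 1.5 s default
print(is_acceptable(**windows_like, maxdist_s=30.0))  # checked against "tos maxdist 30"
```

Under these assumptions the reply fails the default check and passes once the threshold is raised to 30 seconds, which would be consistent with what the KB workaround is meant to achieve.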

Q2: The KB tells me to add

tos maxdist 30

to /etc/ntp.conf. What does that line mean? The ntp.conf man page http://linux.die.net/man/5/ntp.conf seems to say nothing about tos or maxdist.
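Since editing the file by hand on many hosts is tedious, the edit itself could be scripted. The following is only a sketch of the text transformation, not a tested ESXi procedure: the function name is hypothetical, the sample configuration (including the 192.168.1.10 server address) is invented, and it only recognizes a line that consists of exactly "tos maxdist <value>". How the file is fetched from and written back to each host is left out.

```python
def ensure_tos_maxdist(conf_text, maxdist=30):
    """Return conf_text with exactly one 'tos maxdist <maxdist>' line at the end.

    Only handles the simple form where 'tos maxdist' starts a line of its own;
    a 'tos' line carrying other options as well is left untouched.
    """
    wanted = "tos maxdist {}".format(maxdist)
    kept = []
    for line in conf_text.splitlines():
        tokens = line.split("#", 1)[0].split()  # ignore trailing comments
        if len(tokens) == 3 and tokens[:2] == ["tos", "maxdist"]:
            continue  # drop any previous setting so the result is idempotent
        kept.append(line)
    kept.append(wanted)
    return "\n".join(kept) + "\n"


# Invented example of an ntp.conf; the real file on a host may differ.
sample = (
    "restrict default kod nomodify notrap\n"
    "server 192.168.1.10\n"
    "driftfile /etc/ntp.drift\n"
)
print(ensure_tos_maxdist(sample))
```

Running the function twice on the same text gives the same result, so it is safe to reapply. The NTP service still has to be restarted afterwards for the new setting to take effect.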

Q3: Since the ESXi NTP client (daemon) does not sync successfully with the assigned NTP server, does ESXi generate any log messages explaining the details?
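Independently of whatever ESXi logs, one way to see what the server itself advertises is to send it a client request and decode the reply header. This is a minimal sketch meant to be run with Python 3 from any machine on the LAN; the function names are my own, and it only decodes the first 12 bytes of the standard 48-byte NTP header (leap indicator, version, mode, stratum, root delay, root dispersion).

```python
import socket
import struct


def short_to_seconds(raw):
    """Convert an NTP short-format value (16.16 fixed point) to seconds."""
    return raw / 65536.0


def parse_header(packet):
    """Decode the leading fields of an NTP packet into a dict."""
    if len(packet) < 48:
        raise ValueError("NTP packet must be at least 48 bytes")
    flags, stratum, _poll, _precision, root_delay, root_disp = struct.unpack(
        "!BBbbII", packet[:12]
    )
    return {
        "leap": flags >> 6,
        "version": (flags >> 3) & 0x7,
        "mode": flags & 0x7,
        "stratum": stratum,
        "root_delay_s": short_to_seconds(root_delay),
        "root_dispersion_s": short_to_seconds(root_disp),
    }


def query(server, timeout=5.0):
    """Send one client-mode request to server:123/udp and parse the reply."""
    request = b"\x1b" + 47 * b"\0"  # LI=0, VN=3, Mode=3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (server, 123))
        reply, _ = sock.recvfrom(512)
    return parse_header(reply)
```

Calling query() with the address of the Windows domain controller and then with a Linux ntpd server would let you compare the root_dispersion_s values side by side against the 1.5-second limit mentioned in the KB.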

Q4: Is there any way to force ESXi's ntp client to immediately sync with a specific NTP server? Even a manual operation is welcome. I don't know whether the bundled ntpd command can do this.

Q5: Without a manual /etc/ntp.conf tweak on ESXi, what kind of NTP server can I set up as a time source for my vESXi box? Linux ntpd, or something else? Is any special configuration required on the server side?

Thank you in advance.

UPDATE WITH EXPERIMENTS

After reading the NTP FAQ and running a number of experiments, I have confirmed the following facts:

  1. With the KB1035833 tweak, the vESXi really does sync with my Windows NTP server, even if the vESXi's time is 7 days behind the server (after only a 5-to-10-minute wait).
  2. When I use viclient to assign a Linux machine on my LAN (openSUSE 11.4 with ntp package 4.2.6 in my case) as the NTP server, the vESXi without the KB1035833 tweak also syncs after only a 5-to-10-minute delay, even when it is 7 days behind. One thing I have to keep in mind, though, is that I need to tick "Restart NTP service to apply changes" in viclient to get the time synced that quickly (in 5 to 10 minutes). In other words, if I don't restart ESXi's NTP service, it is hard to predict how long the sync will take after resuming the vESXi with an overnight time lag (sometimes an hour or more), probably because the resumed NTP client considers the server time's jitter unacceptable during that period.

So: my practical time-keeping problem for the vESXi has been solved (I prefer to sync with a Linux NTP server).

Now the focus is on Q1. What's wrong with the Windows NTP server? I hope someone can explain the two statements in Q1 from the NTP protocol's perspective.

My thanks to quadruplebucky and Reality Extractor for their useful information, although their answers to my specific problem are not quite accurate.

Best Answer

If the time difference is too great, NTP will not sync. That's expected behavior.

Normally this is not an issue, as the VMware Tools installed in a guest OS will sync the time to the host regardless of how far the time has drifted. Since you are using the experimental and unsupported option of running ESXi as a VM, you don't get the VMware Tools functionality, and hence it doesn't sync.

For tos maxdist, you can review Automatic NTP Configuration Options, but those options won't solve your issue because the time is just too far off.