RedHat ntp offset becomes bigger over time

ntp redhat

I have a strange issue with a server. It is configured to use 2 servers, just like all the other servers (the config is the same on several servers, and they all work except this one). It had a huge offset (over 1000s), but every 1h20min it corrects itself and is back on time for a couple of minutes. So far I have done the following:

Stopped the ntpd daemon
Issued the following command:
```
ntpdate -b xxx.xxx.xxx.xxx
```
Started the ntpd daemon again

But with no result.

My ntp.conf file looks like this:

listen-on xxx.xxx.xxx.xxx accept
server xxx.xxx.xxx.xxx burst iburst minpoll 4 maxpoll 4
restrict xxx.xxx.xxx.xxx
driftfile /var/lib/ntp.drift
logfile /var/lib/ntpd.log
server xxx.xxx.xxx.xxx burst iburst minpoll 4 maxpoll 4
restrict xxx.xxx.xxx.xxx
driftfile /var/lib/ntp.drift
logfile /var/lib/ntpd.lo

Any advice on what steps to take next? Or a way to fix this?

Best regards

Update
Output of ntpq -p -crv:

[root@xxxxxxxx ~]# ntpq -p -crv
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+xxxxxxxxxxxxxxx xxx.xxx.xxx.xxx  3 u   10   16  377    1.992  410.988  10.517
*xxxxxxxxxxxxxxx xxx.xxx.xxx.xxx  2 u    6   16  377    2.758  420.365  12.230
status=0614 leap_none, sync_ntp, 1 event, event_peer/strat_chg,
version="ntpd 4.2.6p1@1.2158-o Fri Aug 24 16:13:49 UTC 2012 (3)",
processor="i686", system="Linux/2.6.22.9-61.NS5", leap=00, stratum=3,
precision=-21, rootdelay=8.343, rootdisp=457.558,
refid=xxxxxxxxxxxx,
reftime=d7d674da.8041b0f8  Wed, Oct  1 2014 14:40:58.501,
clock=d7d67500.7904168c  Wed, Oct  1 2014 14:41:36.472, peer=59935,
tc=4, mintc=3, offset=223.125, frequency=0.000, sys_jitter=19.824,
clk_jitter=123.945, clk_wander=0.000
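
For anyone comparing several machines, peer rows like the ones above can be filtered so that only large offsets stand out. This is a minimal sketch, not part of the original post: the function name `flag_large_offsets` is made up for illustration, it reads `ntpq -p` output on stdin, and the default limit of 128 ms is an assumption based on ntpd's default step threshold.

```shell
# Hypothetical helper: print peers from `ntpq -p` output (read on stdin)
# whose absolute offset in milliseconds exceeds a limit.
flag_large_offsets() {
    threshold_ms="${1:-128}"   # assumed default: ntpd's 128 ms step threshold
    awk -v limit="$threshold_ms" '
        # Peer rows have 10 whitespace-separated fields; field 9 is the offset.
        # The header row is skipped because its field 9 is not numeric.
        NF == 10 && $9 ~ /^-?[0-9]+(\.[0-9]+)?$/ {
            off = $9 + 0
            if (off < 0) off = -off
            if (off > limit) printf "%s offset=%s ms\n", $1, $9
        }
    '
}

# Usage (assumes ntpq is installed):
#   ntpq -p | flag_large_offsets 100
```

The tally character stays attached to the peer name in the output, so it is still visible which peer ntpd currently selects.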

Best Answer

I had a similar problem. I once had a working NTP server. Then, of course, I changed something, and eventually I noticed that it wasn't working anymore.

I discovered that I had changed my BIOS settings and had disabled the IOMMU on a machine that was acting as a VM hypervisor. That hypervisor was also my NTP server.

I can't believe it was able to operate as a hypervisor at all. So check for IOMMU messages in kern.log (or dmesg).

Another clue may be in the output of timedatectl status, if that command works on your system.
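
The kernel log check suggested above could be sketched roughly as follows. The function name `find_iommu_lines` is hypothetical, and the patterns (IOMMU, DMAR, AMD-Vi) are my assumption about common kernel message prefixes rather than an exhaustive list. The log path in the usage note is also an assumption and differs between distributions.

```shell
# Hypothetical helper: filter kernel log text on stdin for IOMMU-related lines.
# The patterns are assumed common prefixes, not a complete list.
find_iommu_lines() {
    grep -i -E 'iommu|dmar|amd-vi'
}

# Usage:
#   dmesg | find_iommu_lines
#   find_iommu_lines < /var/log/kern.log   # path is an assumption
```

No output means none of the assumed patterns matched, which is not proof that the IOMMU is disabled.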

Related Solutions

Linux – Single NTP server on isolate network

NTP should work fine. Look at some of the options for fast synchronization on start-up. Look at the burst and iburst options for system B. Look at the true option for the GPS clock source.

Consider using the hardware clock as a backup time source on both systems. Set a higher stratum on system B. Something like the following should work:

server  127.127.1.0
fudge   127.127.1.0 stratum 8

Watch the output of ntpq -c peers to see when you get a trusted clock source. Normally ntp wants a number of responses from a trusted time source before it trusts it. This is indicated by the first character on each line.
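
To make that first character easier to read, here is a rough sketch that annotates each peer row. The function name `explain_tally` is made up, and the short descriptions are my paraphrase of the tally codes in the ntpq documentation, so treat them as a reading aid rather than an authoritative reference.

```shell
# Hypothetical helper: annotate each peer row of `ntpq -c peers` output
# (read on stdin) with the meaning of its leading tally character.
explain_tally() {
    awk '
        # Peer rows have 10 fields and a numeric stratum in field 3.
        NF == 10 && $3 ~ /^[0-9]+$/ {
            c = substr($0, 1, 1)          # tally column is the first character
            name = $1
            if (c != " ") name = substr($1, 2)
            if      (c == "*") m = "system peer (the source ntpd is synced to)"
            else if (c == "+") m = "candidate (acceptable, may be combined)"
            else if (c == "#") m = "selected, kept as backup"
            else if (c == "o") m = "PPS peer"
            else if (c == "-") m = "outlier (discarded by clustering)"
            else if (c == "x") m = "falseticker (discarded as inconsistent)"
            else if (c == ".") m = "excess (discarded, too many sources)"
            else               m = "rejected or not yet usable"
            printf "%s: %s\n", name, m
        }
    '
}

# Usage (assumes ntpq is installed):
#   ntpq -c peers | explain_tally
```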

While NTP likes more sources, any odd number of time sources within one stratum level should work well. As you only have two servers and a GPS clock, the stratum of the sources should increase in this order: GPS, clock on server A, clock on server B. Increasing the stratum between each by three or four levels will ensure priorities are respected.

EDIT: If you have the busybox NTP server on server A, it may be worthwhile installing the full ntp server package. Understanding what is happening with server A should go a long way to solving your problem. You will need at least one trusted time source there before server B should trust it. If ntpq -c peers doesn't work, then you can try ntpdc -c peers. Both these commands allow you to query other hosts. A peerstats log could also be useful.

On server B, use ntpclient as documented in the busybox ntp howto to log what is happening on it.

The clocks should be reasonably close to the correct time if the servers haven't been down for long. If you need to sync the two systems, that should be sufficient. The GPS will bring the time into sync with the real world eventually.

'ntpd -q' synchronizes quickly, but exits (ntpdate behaviour). It needs to be followed by an ntpd command without the quit option to have continuous synchronization.

EDIT2: I checked my servers and found one of them was off by a second. While fixing this I played with the settings. iburst gets a server trusted very quickly. true ensured the clock driver was trusted if there weren't multiple other trusted sources. The clock took a little more than a minute before it was locally trusted and could be trusted remotely.

When testing you should be able to restart the ntpd process once it is synchronized and test how fast settings work. In the above case Server B may need to be restarted to test how fast it synchronizes. When monitoring ntpd changes I use a line like:

while ntpq -c peers localhost; do sleep 10; done

The hostname and sleep time are adjusted as required. In some cases I chain two or more ntpq command lines in the loop. When doing so I use an echo and/or date command to provide an indication of where sets of data change.
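
A chained version of that loop might look something like the sketch below. The function name `watch_ntp` and its three arguments (host, sleep seconds, iteration count) are illustrative choices, not something from the original setup. The second query, `ntpq -c rv`, is just one example of an extra command to chain.

```shell
# Hypothetical helper: poll ntpd on a host, printing a timestamped separator
# before each set of output. A count of 0 means loop until a query fails
# or the loop is interrupted.
watch_ntp() {
    host="${1:-localhost}"
    interval="${2:-10}"
    count="${3:-0}"
    i=0
    while :; do
        echo "=== $(date) ==="              # marks where each data set starts
        ntpq -c peers "$host" || break      # stop if the query fails
        ntpq -c rv "$host" || break         # system variables (offset, stratum, ...)
        i=$((i + 1))
        [ "$count" -gt 0 ] && [ "$i" -ge "$count" ] && break
        sleep "$interval"
    done
}

# Usage (assumes ntpq is installed):
#   watch_ntp localhost 10       # run until interrupted
#   watch_ntp serverB 5 12       # twelve samples, five seconds apart
```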

Linux – NTP: ntpdate to sync time between the PCs on a private network

ntpdate does not read the ntp.conf file.

To synchronize one-time, pass the IP address of the server on the command line:

ntpdate 169.254.10.10
