Setting up NTP servers

I have a problem setting up NTP to maintain time on a stand-alone network. This will be an island time-zone. The problem is that the servers' clocks drift apart, even after they have been initially synchronised.

There are two redundant NTP servers running RHEL 5.4 and several Windows XP clients. The requirements are that the network syncs to server A whilst server B acts as a backup. We do have a GPS that acts as a time server controlling both server A and server B, but it is not always available. When the GPS is present, both servers sync to the GPS.

The XP clients seem to divide into two groups once the servers drift apart, with some following server A and others following server B.

How can I prevent my two servers from drifting apart?

Can I control which server the XP clients follow?

The two ntp.conf files are as follows:

ntp.conf for Server A (10.203.224.13)

# Tweak NTP's behavior
tinker panic 0 step 0.01 stepout 64

# GPS
server 10.203.220.12 burst iburst minpoll 4 maxpoll 6

# Server A
server 10.203.224.13 burst iburst minpoll 4 maxpoll 6

# Server B
server 10.203.224.14 burst iburst minpoll 4 maxpoll 6

# Configure the local clock to serve from
server 127.127.1.1
fudge 127.127.1.1 stratum 11

# Establish the drift file location
driftfile /etc/ntp.drift 

ntp.conf for Server B (10.203.224.14)

# Tweak NTP's behavior
tinker panic 0 step 0.01 stepout 64

# GPS
server 10.203.220.12 burst iburst minpoll 4 maxpoll 6

# Server A
server 10.203.224.13 burst iburst minpoll 4 maxpoll 6

# Server B
server 10.203.224.14 burst iburst minpoll 4 maxpoll 6

# Configure the local clock to serve from
server 127.127.1.1
fudge 127.127.1.1 stratum 13

# Establish the drift file location
driftfile /etc/ntp.drift

On Server A

[root@serverA]# ntpq -p

     remote           refid          st t when poll reach   delay   offset  jitter
==============================================================================
 10.203.220.12   .INIT.          16 u    -   64    0    0.000    0.000   0.000
 10.203.224.13   .INIT.          16 u    -   64    0    0.000    0.000   0.000
 10.203.224.14   LOCAL(1)        14 u   27   64  377    0.312  359.753   0.289
*LOCAL(1)       .LOCL.          11 l   55   64  377    0.000    0.000   0.001

On Server B

[root@serverB]# ntpq -p

     remote           refid          st t when poll reach   delay   offset  jitter
==============================================================================
 10.203.220.12   .INIT.          16 u    -   64    0    0.000    0.000   0.000
 10.203.224.13   LOCAL(1)        12 u   55   64  377    0.346  -359.56   0.107
 10.203.224.14   .INIT.          16 u    -   64    0    0.000    0.000   0.000
*LOCAL(1)       .LOCL.          13 l   54   64  377    0.000    0.000   0.001

Best Answer

On server A, remove the "server" lines pointing to itself and to server B, leaving only the GPS line and the local clock lines ("server 127.127.1.1" and its "fudge" line). On server B, remove the local clock lines and the "server" line pointing to itself, leaving only the server A line and the GPS line.

The idea is that server A should use the GPS when it is available and otherwise trust its own clock. Server B should use the GPS or server A, however server A is getting its time. If server B is allowed to trust itself, it will advertise what looks like a reliable time source to its clients, even though that time differs from server A's, which is what you're seeing.
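
As a rough sketch of what that change could look like, here are the two files with only those lines removed. Every remaining line is taken unchanged from the configs in the question (including the tinker and poll settings, which I have not reviewed), and I have not tested this on your network, so treat it as a starting point rather than a verified setup.

ntp.conf for Server A (10.203.224.13):

```conf
# Tweak NTP's behavior (unchanged from the question)
tinker panic 0 step 0.01 stepout 64

# GPS: preferred source whenever it is reachable
server 10.203.220.12 burst iburst minpoll 4 maxpoll 6

# Local clock: fallback so server A keeps serving time without the GPS
server 127.127.1.1
fudge 127.127.1.1 stratum 11

# Establish the drift file location
driftfile /etc/ntp.drift
```

ntp.conf for Server B (10.203.224.14):

```conf
# Tweak NTP's behavior (unchanged from the question)
tinker panic 0 step 0.01 stepout 64

# GPS
server 10.203.220.12 burst iburst minpoll 4 maxpoll 6

# Server A
server 10.203.224.13 burst iburst minpoll 4 maxpoll 6

# No local clock entry here: server B must not trust itself

# Establish the drift file location
driftfile /etc/ntp.drift
```

After restarting ntpd on both machines, "ntpq -p" on server B should, if this works as intended, eventually mark 10.203.224.13 (or the GPS, when present) as the selected peer instead of LOCAL(1).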