Linux system time temporally jumps

linuxtimetime-synchronizationtimestamp

I saw a strange system time changing behavior in some (hardware) servers: in /var/logs/syslog, the date time preceding each log message sometimes change to a random one and gets back to normal in the next message, like the following:

Feb 22 2018 09:09:30 ...
Feb 22 2018 09:09:32 ...
Jan 13 2610 15:37:42 ...
Feb 22 2018 09:09:33 ...
Feb 22 2018 09:09:34 ...

As in the example, the sudden change of date time can be as far as hundreds of years away.

I can confirm that the log messages having the strange time stamps does not come from any specific process – it just can happen randomly for every one.

And duration between 2 abnormal time changes varies between a few minutes to a few hours (however, I suspect the abnormal time changes could happen more frequently but many of them are not revealed in the syslog, since it is not writing logs every second).

Also, since it happens on more than one server, I assume it is not a hardware problem.

More info about the severs: they are an openstack installation with one controller and a few compute nodes. Each server has ntp service running. The controller is configured to take time from its own hardware clock, and the compute node servers sync time from the controller. Note that each server have abnormal time changes at its own pace – looks like the "wrong time" is not synchronized from the controller through ntp.

I was suspecting the guest systems (virtual machines) on compute nodes could affect their host system time. But this can not explain why the controller has the same problem while not running any virtual machine.

I need a method to detect: who changed the system time and how does it happen?

Best Answer

This script will tell you when a time drift occurs and the difference in the process tree, and this should help identify this if it is caused by a process changing the system time. It will print to the terminal as well as log into timedrift.log inside the current working directory.

#!/bin/bash

oldTime="$(date +%s)"
oldPsOutput="$(ps faux)"
while true; do
  sleep 1;
  currentTime="$(date +%s)"
  oldTimeplusfive="$((($oldTime+5)))"
  currentPsOutput="$(ps faux)"
  if [[ "$currentTime" -lt "$oldTime" ||  "$currentTime" -gt "$oldTimeplusfive"  ]]
  then
    (
        echo -e '\n\n======================='
        echo "currentTime=$currentTime oldTime=$oldTime oldTimeplusfive=$oldTimeplusfive"
        echo '-----------------------'
        echo "$oldPsOutput"
        echo '::::::::::::::::::::::::::'
        echo "$currentPsOutput"
    ) | tee -a timedrift.log
  fi
  oldPsOutput=$currentPsOutput
  oldTime=$currentTime
done

Credit to the original script in the Unexplainable time jumps in CRON bug that Stone mentioned as a comment.

Can you also comment as if you are using rsyslog and if so what version? Do you see it outside the realm of rsyslog (ie. apache logs, etc). This bug looks simmlar, and it would nice to confirm it or rule it out either way.