Windows Time unstable at synchronising?

TechoMad

Some systems I manage are showing wild behaviour when synchronising their
clocks using NTP against a known accurate time server. Synchronising once per
week, the difference between the time server and the computer (similar on all
computers) AFTER the synch event looks a bit like this:

seconds
+34, +1, -85, +1, +122, -125, +2, +120

The synch events are at the standard 7 days interval, and in between, the
clock on the computer(s) drifts at about -2 to -3 seconds per day.

The step changes at each attempt to "correct" the clock are getting larger
and larger, with no indication that the clock time is going to be set
correctly. The W32time applet shows that the "synchronisation" event was
"successful".
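
To put rough numbers on the step changes, here is a small Python sketch. It takes the post-synch offsets listed above, assumes a drift of -2.5 seconds per day (the midpoint of the -2 to -3 range, my assumption) with the same sign convention as the offsets, and works out the step that each synch event must have applied. The function name is just for illustration.

```python
# Post-synch offsets in seconds, as listed in the post.
post_sync = [34, 1, -85, 1, 122, -125, 2, 120]

DRIFT_PER_DAY = -2.5   # assumption: midpoint of the quoted -2 to -3 s/day
INTERVAL_DAYS = 7      # the standard weekly synch interval


def implied_steps(post_sync_offsets, drift_per_day, interval_days):
    """Return the step each synch must have applied to the clock.

    The offset just before a synch is taken to be the previous
    post-synch offset plus the drift accumulated over the interval.
    """
    drift = drift_per_day * interval_days
    steps = []
    for previous, current in zip(post_sync_offsets, post_sync_offsets[1:]):
        pre_sync = previous + drift
        steps.append(current - pre_sync)
    return steps


if __name__ == "__main__":
    for step in implied_steps(post_sync, DRIFT_PER_DAY, INTERVAL_DAYS):
        print(f"{step:+.1f} s")
```

Under those assumptions a well-behaved synch would only ever need to remove about 17.5 seconds of accumulated drift, so any step much larger than that is being introduced by the synch itself.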

These differences were recorded by comparing the computer clock with an
atomic time receiver, so it is not as if the "reference" clock is inaccurate
here.

So, what is happening? What causes this wild oscillation and how can it be
tamed? Is there any way of getting Windows to actually set a correct time for
once?
 
TechoMad

Yes they are, but the latency is under 5 milliseconds. A similar effect is
seen for machines on the same sub-net. What difference would different
sub-nets make, then?
 
Paul

Do any of the computers use Nforce2 chipsets?

http://nforcershq.com/forum/20-vt19...torder=asc&highlight=nforce2+clock&&start=190

Is there a common hardware characteristic on all the affected machines?

Paul
 
TechoMad

No to the chipset, but the systems checked are common hardware (but that is a
cop-out surely); however, similar, lesser effects have been seen on radically
different hardware. On these units the step changes induced by W32time are
just enormous. Why?

Looking at the data, the clock was close to atomic time for two synch
periods, then started to diverge at each "synch" event, with larger and
larger step changes.

So,
1) Why did W32time start to move the clock away from a settled, close time?
2) What parameters in W32time control the convergence behaviour?
3) Can I adjust its "pid" parameters to bring about more "damped" and
convergent behaviour?
4) Why, when it has just received a perfectly accurate time from an NTP
source, does W32time decide to ignore that and set a time at a massive (100
SECONDS) offset either ahead or behind this time, and not just the once
either?

In between adjustments, a typical PC clock drift of 0.25 secs/hour is seen;
there is no compensation for this from W32time, presumably because it is more
involved with whatever is causing the huge offset jumps.
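
For scale, a quick Python sketch of what that drift rate amounts to. The helper names are made up for illustration, and the only input is the 0.25 secs/hour figure above.

```python
def drift_ppm(seconds, interval_seconds):
    """Express a drift of `seconds` per `interval_seconds` in parts per million."""
    return seconds / interval_seconds * 1_000_000


def drift_over(seconds, interval_seconds, span_seconds):
    """Scale a measured drift to a longer span, assuming a constant rate."""
    return seconds * (span_seconds / interval_seconds)


if __name__ == "__main__":
    hour = 3600
    week = 7 * 24 * hour
    print(f"{drift_ppm(0.25, hour):.1f} ppm")
    print(f"{drift_over(0.25, hour, week):.1f} s per week")
```

So a clock left alone for a week at that rate would be out by well under a minute, which is less than the jumps being applied at the synch events.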

There is no user access to clock adjustment, so no "finger trouble" is
involved; the machines are in a control system, so have no-one continually
monitoring or customising them each day. The time server is a guaranteed
accurate source, which is independently monitored. The network latency is
small.

I've been monitoring my test-bench unit, and when synching that, W32time
decided to change a clock at +0 offset, to +2, then +9, then +1.5, then 0
seconds difference from atomic time. Again, when it was accurate at first,
why the "walk-around" to an inaccurate clock for 4 synch periods? I tried
this using hourly synch periods, so I wouldn't grow too old waiting for the
results :)

Well, any ideas? I may have to expend significant resource applying any
fix, so I want to be absolutely sure at the first try that this can be fixed.
 
Paul

In terms of traceability to crystals on the motherboard, the computer
uses the RTC (and a 32.768 kHz quartz crystal near the Southbridge) when the
OS isn't running. When Windows is running, it uses clock tick interrupts
and counts those to determine the time. There are also various timer functions
supported by the chipset or processor, which also might be used.
Their drift would be traceable to the clock generator chip, rather than
the RTC. As far as I know, the RTC is not relied on for
telling time when the OS is running.

Maybe you need to test using a third party NTP client, and see
if the results are similar? If a third party client could be run
to simply record the drift, that might give you some additional data.
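
If no ready-made client is to hand, something along these lines might do for logging. It is only a minimal SNTP-style sketch in Python using the standard library: it sends one client request, reads the server's receive and transmit timestamps, and applies the usual NTP offset and delay formulas. It never sets the clock. The function names are my own, and the resolution is limited by whatever time.time() provides on the machine.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def ntp_to_unix(seconds, fraction):
    """Convert a 32.32 fixed-point NTP timestamp to Unix seconds."""
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def offset_and_delay(t1, t2, t3, t4):
    """Standard NTP arithmetic.

    t1 = client send, t2 = server receive, t3 = server transmit,
    t4 = client receive. A positive offset means the local clock is slow.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def parse_reply(packet):
    """Pull the receive and transmit timestamps out of a 48-byte NTP header."""
    fields = struct.unpack("!12I", packet[:48])
    t2 = ntp_to_unix(fields[8], fields[9])     # receive timestamp, bytes 32-39
    t3 = ntp_to_unix(fields[10], fields[11])   # transmit timestamp, bytes 40-47
    return t2, t3


def query(server, timeout=5.0):
    """Send one client-mode request and return (offset, delay) in seconds."""
    request = b"\x1b" + 47 * b"\0"   # LI=0, version 3, mode 3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(request, (server, 123))
        packet, _ = sock.recvfrom(512)
        t4 = time.time()
    t2, t3 = parse_reply(packet)
    return offset_and_delay(t1, t2, t3, t4)
```

Calling query() with the name of your own time server every few minutes and appending the result to a file would give a drift record that is independent of W32time.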

The reason I mention the Nforce2 is that there seemed to be
a bug related to interrupts, and disabling APIC and using
PIC seemed to help. The bug seemed to occur when the
CPU input clock was offset from the nominal value (i.e.
overclocking or underclocking).

There is a little background here.

http://people.ucsc.edu/~ashawn/tech/W32Time.htm

Something else you might consider is to install a copy
of Wireshark, a packet sniffer, and keep a log of what the
machine is doing with respect to the network.

http://en.wikipedia.org/wiki/Wireshark

Paul
 
TechoMad

I understand very well about RTC crystals etc.; I spent a while developing
off-air time standards for data logging, etc. Let's get past one thing: this
is NOT to do with the PC clock hardware, its accuracy or stability. The
measured drift on the hardware when it is NOT synchronising is 0.2 secs/hour
- perfectly normal for a PC clock (sadly). The logging of the time source
stability, and use of alternative time sources, have all been explored
satisfactorily. Neither is this to do with network issues, as the effect
occurs on several independent, different LANs.

This is to do with W32time and its quite insane actions of taking a clock
that is less than one second away from atomic time and "jumping" it +/- 100
or more seconds away when it checks and sets that clock.

Clearly, it can get a precise time from an NTP source, it can compare that
with the clock on the computer and simple maths can indicate that it is as
close as seconds, so why does it "go off on one" and decide to ignore the NTP
time and set something over a minute and a half away, and not just once, but
several times in a row?

This sort of behaviour is reminiscent of undamped PID controllers which
become susceptible to the tiniest noise pulses and consequently oscillate
wildly, regardless of how close to the "set-point" the system is currently
placed.
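
To show the kind of behaviour I mean, here is a toy Python sketch of a proportional-only correction loop. It is purely an illustration of the analogy and NOT a description of what W32time actually does internally; the gain values and function name are invented.

```python
def simulate(initial_offset, gain, drift_per_interval, steps):
    """Toy proportional correction loop (illustrative only).

    Each interval the clock drifts, then the controller removes
    gain * (measured offset). A gain of 1 removes the error exactly;
    a gain above 2 overshoots by more than the original error, so the
    offset flips sign and grows at every correction.
    """
    offsets = []
    offset = initial_offset
    for _ in range(steps):
        offset += drift_per_interval   # drift accumulated between synchs
        offset -= gain * offset        # correction applied at the synch
        offsets.append(offset)
    return offsets


if __name__ == "__main__":
    for gain in (0.5, 1.0, 2.2):
        print(gain, [round(x, 3) for x in simulate(1.0, gain, 0.0, 6)])
```

With a gain of 0.5 the offset halves each time, with a gain of 1.0 it is removed at once, and with a gain of 2.2 it alternates in sign and gets larger at every step, which is the shape of what I am seeing.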

Is there a PID controller in W32time? Does it have settable parameters and
what are they? What is the best synchronisation interval to achieve least
drift, maximum stability and RTC drift compensation, within the shortest
number of attempts? Is there a way to FORCE W32time to use the obtained NTP
time, regardless of its internal "adjustables", so avoiding it going off on
its own, which it quite clearly does?

Now many might think I'm a bit of a nerd, worried about a few seconds of
clock shift here or there, but there are two issues here: (1) if you fit a
clock to a device, why waste the effort by making it so useless that a $1
digital watch from the market can do a better job, and (2) for remote,
distributed data capture (of which there are hundreds of applications),
precision of the clock is the only means of comparing and combining the data
into a single set. Time-stamps comparable to less than 1/100th second can
matter in some circumstances.

Quite clearly W32time doesn't even match the mechanical clocks of the 1850s
at the moment, so what is the point of "progress"?

Is there anyone from Microsoft reading this that could answer some of the
technical questions here? Practical advice will be welcomed.

