baffled: recurring Time Sync error on PDC FSMO domain controller in forest root domain

Pshorthorn

Hi all,

I hope somebody can help, because I have pretty much run out of options.

Situation:

fyi: I use fictional names for the servers and the rootdomain.

4 new domain controllers, all new Dell 1750 servers;

dc01 = pdc & ridmaster
dc02 = domain name master & GC & DNS server for the rootdomain
dc03 = schema master & GC
dc04 = infrastructure master & Certificate Authority

dc01 is set up as a reliable time source, syncing time with internet
SNTP time servers (see end of posting). dc01 syncs fine with the SNTP
servers, and all other servers, members and DCs, sync fine with the
PDC & RID master dc01 (as shown at the end of the posting).

2 old domain controllers (only ones left to be demoted)

dc05 = GC
dc06 = DNS server

Replication topology with the 21 child domains is hub-and-spoke;
replication is fine.

Problem:

On the PDC & RID master there is a time error that makes it impossible to
log on via the GUI. Access via telnet still works. The event logs
(viewed via "Manage Computer" on another domain controller) show in the
System log that, because of a time error, dc01 is unable to do DC
lookups, gets Kerberos errors and cannot find a GC. All of this is
because of the time difference between dc01 and the rest.
All servers are indeed in the same time zone.

Now, the baffling thing is: when I transfer the PDC & RID FSMO roles
to another server, for example dc02, and configure that server as a
reliable time source, exactly the same thing happens after an hour or
two.
Conclusion: it is related either to the server hardware or to these
FSMO roles. I haven't heard anything from Dell yet. I asked them
whether it could be a real-time clock problem.

Rebooting does not help.
Syncing all other servers with dc01 (via w32tm -s) does not help.
DNS settings are OK. Clearing the DNS client cache (net stop netlogon,
ipconfig /flushdns, delete netlogon.dns / netlogon.dnb) does not help either.

Anybody?

Thanks a thousand times in advance !!!

Output showing that syncing is indeed fine:

dc06.mydomain.local [10.254.1.20]:
ICMP: 66ms delay.
NTP: -0.0018164s offset from dc01.mydomain.local
RefID: dc01.mydomain.local [10.254.1.30]

dc05.mydomain.local [10.254.1.14]:
ICMP: 67ms delay.
NTP: -0.0158164s offset from dc01.mydomain.local
RefID: dc01.mydomain.local [10.254.1.30]

dc04.mydomain.local [10.254.1.33]:
ICMP: 69ms delay.
NTP: -0.0068164s offset from dc01.mydomain.local
RefID: dc01.mydomain.local [10.254.1.30]

dc02.mydomain.local [10.254.1.31]:
ICMP: 54ms delay.
NTP: +0.0370000s offset from dc01.mydomain.local
RefID: dc01.mydomain.local [10.254.1.30]

dc01.mydomain.local *** PDC *** [10.254.1.30]:
ICMP: 54ms delay.
NTP: +0.0000000s offset from dc01.mydomain.local
RefID: time.windows.com [207.46.130.100]

dc03.mydomain.local [10.254.1.32]:
ICMP: 53ms delay.
NTP: -0.0219847s offset from dc01.mydomain.local
RefID: dc01.mydomain.local [10.254.1.30]
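
For anyone who wants to check a listing like the one above automatically, here is a minimal sketch in Python. It assumes the text layout shown above (a host line, then ICMP, NTP and RefID lines); the function names `parse_offsets` and `hosts_over` are made up for illustration and are not part of any Windows tool. The sample values are copied from the listing.

```python
import re

# Two host blocks copied from the listing above, in the same layout.
SAMPLE = """\
dc06.mydomain.local [10.254.1.20]:
ICMP: 66ms delay.
NTP: -0.0018164s offset from dc01.mydomain.local
RefID: dc01.mydomain.local [10.254.1.30]

dc02.mydomain.local [10.254.1.31]:
ICMP: 54ms delay.
NTP: +0.0370000s offset from dc01.mydomain.local
RefID: dc01.mydomain.local [10.254.1.30]
"""

# Host line, e.g. "dc06.mydomain.local [10.254.1.20]:" (optionally marked as PDC).
HOST_RE = re.compile(r"^(\S+)(?: \*\*\* PDC \*\*\*)? \[([\d.]+)\]:$")
# Offset line, e.g. "NTP: -0.0018164s offset from dc01.mydomain.local".
OFFSET_RE = re.compile(r"^NTP: ([+-]\d+\.\d+)s offset from (\S+)$")


def parse_offsets(text):
    """Return {host: offset_in_seconds} for every host block in the listing."""
    offsets = {}
    host = None
    for raw in text.splitlines():
        line = raw.strip()
        match = HOST_RE.match(line)
        if match:
            host = match.group(1)
            continue
        match = OFFSET_RE.match(line)
        if match and host is not None:
            offsets[host] = float(match.group(1))
    return offsets


def hosts_over(offsets, limit_seconds):
    """Return the sorted hosts whose absolute offset exceeds limit_seconds."""
    return sorted(h for h, off in offsets.items() if abs(off) > limit_seconds)


if __name__ == "__main__":
    parsed = parse_offsets(SAMPLE)
    print(parsed)
    # 300 seconds is the 5-minute Kerberos tolerance; nothing here comes close.
    print(hosts_over(parsed, 300))
```

With offsets in the millisecond range, as in the listing, no host comes anywhere near the 5-minute limit, which is what made the symptoms so puzzling.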

registry w32time settings for dc01:

Default reg_sz (value not set)
LocalNTP reg_dword 0x0000001 (1)
Ntpserver reg_sz time.windows.com swisstime.ethz.ch
Period reg_sz 24
ReliableTimeSource reg_dword 0x0000001 (1)
Type reg_sz NTP
 
Ace Fekay [MVP]

Kerberos authentication is time sensitive and only allows a +/- 5 minute time
skew; beyond that, authentication fails. Are all the servers in the same time
zone, and do they have exactly the same time? Keep in mind that the default time
server in each domain is the one that holds the PDC Emulator role. Each domain
has one, so the time service needs to be set up on the PDC Emulator in each
domain.
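
To make the 5-minute rule concrete, here is a minimal sketch in Python of the kind of comparison involved. It is only an illustration: the name `within_skew` is made up, and the real check happens inside the Kerberos implementation on UTC timestamps, so a time zone difference on its own does not count as skew.

```python
from datetime import datetime, timedelta

# Default Kerberos tolerance mentioned above: 5 minutes either way.
MAX_SKEW = timedelta(minutes=5)


def within_skew(client_time, server_time, max_skew=MAX_SKEW):
    """Return True if the two clocks differ by no more than max_skew."""
    return abs(client_time - server_time) <= max_skew


if __name__ == "__main__":
    server = datetime(2004, 1, 1, 12, 0, 0)
    # Client 4 minutes 59 seconds behind: still acceptable.
    print(within_skew(server - timedelta(minutes=4, seconds=59), server))
    # Client 5 minutes 1 second ahead: authentication would be refused.
    print(within_skew(server + timedelta(minutes=5, seconds=1), server))
```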


--
Regards,
Ace

G O E A G L E S !!!
Please direct all replies ONLY to the Microsoft public newsgroups
so all can benefit.

This posting is provided "AS-IS" with no warranties or guarantees
and confers no rights.

Ace Fekay, MCSE 2003 & 2000, MCSA 2003 & 2000, MCSE+I, MCT, MVP
Microsoft Windows MVP - Windows Server - Directory Services

Security Is Like An Onion, It Has Layers
HAM AND EGGS: A day's work for a chicken;
A lifetime commitment for a pig.
--
=================================


 
Pingus

Hi Ace,

First I have to be cocky: (cocky mode) you could have read about the servers
being in the same time zone and having the same time. (/cocky mode)

The problem was caused by buggy Dell / Broadcom NIC load balancing /
failover software and drivers.

This crappy software was causing network delays, resulting in time sync
errors EVEN when according to all root domain controllers everything
was okidoki (I had enabled time sync logging on all root DCs and all root
member servers).

Thank you Dell and Broadcom for delaying a rootdomain migration to new
servers for 2 weeks :>(

regards,

Pingus.
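
As a rough illustration of how network delay can turn into a time error, here is a minimal sketch in Python of the standard SNTP/NTP offset and round-trip delay calculation from the four timestamps of one request/response exchange. The simulation helper and its numbers are invented for illustration; whether the delays in this case were actually asymmetric is an assumption, not something established in this thread.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard (S)NTP estimate from one exchange.

    t1: client sends request   (client clock)
    t2: server receives it     (server clock)
    t3: server sends reply     (server clock)
    t4: client receives reply  (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def simulate_exchange(true_offset, delay_out, delay_back, server_proc=0.0, t1=0.0):
    """Hypothetical helper: build the four timestamps for given path delays.

    true_offset is (server clock - client clock); all values are in seconds.
    """
    t2 = t1 + delay_out + true_offset
    t3 = t2 + server_proc
    t4 = t1 + delay_out + server_proc + delay_back
    return t1, t2, t3, t4


if __name__ == "__main__":
    # Clocks agree and the path is symmetric: the estimate is correct.
    print(ntp_offset_and_delay(*simulate_exchange(0.0, 0.05, 0.05)))
    # Clocks still agree, but the request is held up for 2 s on the way out:
    # the client now believes it is off by half the asymmetry.
    print(ntp_offset_and_delay(*simulate_exchange(0.0, 2.0, 0.0)))
```

The calculation assumes the outbound and return delays are equal. When they are not, the estimated offset is wrong by half the difference, so a client can "correct" a clock that was right in the first place.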
 
Ace Fekay [MVP]

Well, at least you found the problem. I just made a mental note to be wary
of Dell servers with Broadcom NICs.

Ace
 
