Thermal shutdown initiated by Windows?

Lucvdv

Has anyone ever heard of a thermal reboot being initiated by Windows?

During tests with a new version of our XPe target on a new mainboard, the
system spontaneously reboots when the ambient temperature reaches 40 C.
Processor core temperature at that time is approx. 55 C.

The mainboard specs say it can operate up to 60 C ambient, and the
processor is a Pentium M, max temperature 100 C.


It doesn't reboot when it is running DOS or just sitting in the BIOS setup
screen; it only does so under XPe (at least that's what the hardware people
who did the test say; I suspect that by "DOS" they really mean the BIOS screen).
 
Lucvdv said:
Has anyone ever heard of a thermal reboot being initiated by Windows?

During tests with a new version of our XPe target on a new mainboard, the
system spontaneously reboots when the ambient temperature reaches 40 C.
Processor core temperature at that time is approx. 55 C.

The mainboard specs say it can operate up to 60 C ambient, and the
processor is a Pentium M, max temperature 100 C.


It doesn't reboot when it is running DOS or just sitting in the BIOS setup
screen; it only does so under XPe (at least that's what the hardware people
who did the test say; I suspect that by "DOS" they really mean the BIOS screen).

My first thought is that the system may be bug-checking (Blue screen),
and you have it set to automatically restart on bug-check.

You may want to verify the setting by going to
My Computer -> Properties
Advanced Tab
Startup and Recovery -> Settings
System Failure

and verify that "Automatically restart" is not checked.
 
My first thought is that the system may be bug-checking (Blue screen),
and you have it set to automatically restart on bug-check.

Checked (in HKLM\SYSTEM\ControlSet001\Control\CrashControl because the
system control panel isn't included in the target), that's not it.
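
For anyone who wants to repeat that registry check without the control panel, here is a minimal sketch. It assumes Python 3 is available on a Windows machine (it may well not be on an XPe target), and that the "Automatically restart" checkbox is stored as the AutoReboot value under the CrashControl key. The function names are made up for illustration.

```python
def auto_restart_enabled(crash_control_values):
    """Return True if an AutoReboot value is present and non-zero.

    crash_control_values is a dict of value name -> data read from
    the CrashControl key. A missing value is reported as False here;
    that is a choice made for this sketch, not documented behaviour.
    """
    return int(crash_control_values.get("AutoReboot", 0)) != 0


def read_crash_control(control_set="CurrentControlSet"):
    """Read every value under ...\\Control\\CrashControl (Windows only)."""
    import winreg  # standard library, only importable on Windows

    path = "\\".join(["SYSTEM", control_set, "Control", "CrashControl"])
    values = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        value_count = winreg.QueryInfoKey(key)[1]  # (subkeys, values, mtime)
        for index in range(value_count):
            name, data, _value_type = winreg.EnumValue(key, index)
            values[name] = data
    return values


# Usage on the machine under test (not executed here):
#   print(auto_restart_enabled(read_crash_control("ControlSet001")))
```

Passing "ControlSet001" matches the key quoted above; "CurrentControlSet" is normally the one the running system uses.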
 
Lucvdv said:
Checked (in HKLM\SYSTEM\ControlSet001\Control\CrashControl because the
system control panel isn't included in the target), that's not it.

It could be a power supply failure at that temp, which could cause a
reboot. Is the power supply external or internal to your solution?

The Pentium M is rated for a max of 100 C, but that is the max die temp.
Depending upon what cooling solution you have and what load the CPU is
seeing, the die could reach that at much lower ambient temps. My own thermal
benchmarks showed that with a passive heatsink attached this occurred at
~50 C when the CPU was running the Prime95 benchmark.
 
It could be a power supply failure at that temp, which could cause a
reboot. Is the power supply external or internal to your solution?

The power supply is external, it wasn't included in the heat test.
The test was performed by putting the mainboard in its case and blowing in
air that got gradually warmer. A probe measured the air temperature in the
case, the reboot occurred when that indicated slightly above 40 C
(sometimes 41, sometimes 43).

The Pentium M is rated for a max of 100 C, but that is the max die temp.
Depending upon what cooling solution you have and what load the CPU is
seeing, the die could reach that at much lower ambient temps. My own thermal
benchmarks showed that with a passive heatsink attached this occurred at
~50 C when the CPU was running the Prime95 benchmark.

Here it's an active heatsink, and the temperature I said the CPU reached is
the die temperature, shown by a monitor app that came with the mainboard.

I've witnessed some tests now, and I noticed something the others missed:
right *after* the reboot, the BIOS indicates a "system area" temperature as
high as 75 C. I think that's the culprit, and that must be kept below the
60 C the manufacturer specifies as "operating range" (which we took for
"ambient").

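To make the comparison explicit, here is a small sketch that checks the readings quoted in this thread against the limits quoted in this thread. The sensor names are labels invented for the example, and treating the 60 C "operating range" as the limit for the "system area" sensor is only my working hypothesis, not something the board documentation confirms.

```python
# Limits in degrees C as quoted in this thread; the mapping of the 60 C
# "operating range" onto the system-area sensor is an assumption.
LIMITS_C = {"cpu_die": 100, "system_area": 60}

# Readings in degrees C: die temperature from the monitor app, system-area
# temperature from the BIOS screen right after a reboot.
READINGS_C = {"cpu_die": 55, "system_area": 75}


def over_limit(readings, limits):
    """Return {sensor: degrees above its limit} for sensors over their limit."""
    return {
        name: value - limits[name]
        for name, value in readings.items()
        if name in limits and value > limits[name]
    }


print(over_limit(READINGS_C, LIMITS_C))
```

With these numbers only the system-area sensor is flagged, at 15 C over; the CPU die still has 45 C of margin.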

I don't even know where that sensor is located: the docs we got with the
board are preliminary and incomplete.

It's a DFI G5M200, used as closest replacement for a G5M300-P our supplier
couldn't get us anymore. Neither of the boards is listed on DFI's website
(I can't find anything that small there, nor anything with a CompactFlash
socket on board connected directly to the primary IDE channel).

Through Google I can find references to the G5M300 (at one place described
as "an industrial version of the DFI 855GME-MGF"), but not (yet?) the 200.
 
First, any error detected by XP will be recorded in the system (event)
logs. You don't need to catch a BSOD on screen to see those errors. Review
the event history to learn what happened previously: when the failures
occurred AND what was happening before each failure.
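
As a rough sketch of that kind of review: assuming the System log has been exported into (timestamp, source, event ID) records by whatever means the target allows, the function below pulls out the events logged shortly before each "unexpected shutdown" record. The record layout and the function name are assumptions for illustration. Event ID 6008 is what XP normally writes at the next boot after an unclean shutdown, but verify that on your own image.

```python
from datetime import timedelta

UNEXPECTED_SHUTDOWN_ID = 6008  # "previous system shutdown was unexpected"


def events_before_failures(records, window=timedelta(minutes=10)):
    """Pair each unexpected-shutdown record with the events preceding it.

    records: iterable of (timestamp, source, event_id) tuples, where
    timestamp is a datetime. Returns a list of (failure_record,
    earlier_records) with earlier_records inside `window` before it.
    """
    ordered = sorted(records, key=lambda record: record[0])
    report = []
    for failure in ordered:
        when, _source, event_id = failure
        if event_id != UNEXPECTED_SHUTDOWN_ID:
            continue
        earlier = [r for r in ordered if when - window <= r[0] < when]
        report.append((failure, earlier))
    return report
```

Note that the 6008 record is stamped with the time of the following boot, so the window may need widening if the system stayed off for a while.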

Second, heat is how defective hardware is found. Put a computer in a
100 degree F (about 38 C) room. It must work just fine. If heat creates
failure in a 70 degree F (about 21 C) room, then selectively heat components
with a hairdryer on its highest setting to find the 100% defective part.
Components that are uncomfortable to touch but don't burn skin are at
perfectly normal operating temperatures and must not fail. Don't cure a
heat symptom. Use heat to find a 100% defective part.

Third, the summary only speculates about heat; it does not first learn
from facts. Numerous other suspects must be considered. For example, a
marginal power supply (that a silly power supply tester says is good)
would not be loaded sufficiently by DOS but would start failing with
Windows graphics. A few minutes with a 3.5 digit multimeter would
identify or eliminate that reason for failure definitively (and without
changing or disconnecting anything).
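
A sketch of how such multimeter readings could be judged, measured while the system is under load. The 5% tolerance is an assumption borrowed from the figure ATX design guides give for the positive rails; a board fed from an external supply may specify something different, so its own documentation takes precedence. The function name is made up for the example.

```python
def rail_in_tolerance(nominal_v, measured_v, tolerance=0.05):
    """Return True if measured_v is within +/- tolerance of nominal_v.

    tolerance is a fraction of the nominal voltage (0.05 means 5%).
    """
    return abs(measured_v - nominal_v) <= abs(nominal_v) * tolerance


# Usage with readings taken from the running system:
#   rail_in_tolerance(12.0, reading_on_12v_rail)
#   rail_in_tolerance(5.0, reading_on_5v_rail)
```

A rail that passes in DOS but drops out of tolerance once Windows loads the system is the kind of marginal supply described above.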

Of course, better manufacturers and computer system providers supply
comprehensive diagnostics for free. Run the diagnostics at room
temperature, then repeat them with the components warmed.

Which components can cause an XP crash? The list is shorter with
XP: sound card, memory, power supply, video controller, and CPU. That
is the hardware suspect list.

Too often, heat is the number one suspect only because concepts such
as 'following the evidence' are not grasped. Too many want to fix a
heat problem rather than recognize heat as a diagnostic tool.
 