help! over temperature error / alert

zark

hello-
i am using the p5gd2 premium mb with intel p4 3.2 cpu.

in the past week i have been getting "cpu over temperature" followed by
"press f1 to continue"

this occurs on boot up, and i press f1 and it continues into windows.
the computer works fine: i did benchmarking tests and the system runs
as designed. i reattached the cpu cooler (thermaltake heat pipe
cooler). all the fans are running. i have placed a temperature probe
against the side of the cpu chip. its temperature varies between 34 and 45
degrees C, whereas the asus on-board temperature sensor indicates that
it is 124 degrees C!

before getting this error message i had been putting my computer in
hibernation mode, though i don't think this led to the temperature
error.

i also went into bios setup and didn't see anything out of the usual.

any advice? is the over temperature error false considering the other
temperature sensor readings? what to do? ignore it? would the computer
shut down/SLOW way down if it really were too hot?
 
Paul

"zark" said:
i have placed a temperature probe
against the side of the cpu chip. its temperature varies between 34 and 45
degrees C, whereas the asus on-board temperature sensor indicates that
it is 124 degrees C!
[...]
would the computer
shut down/SLOW way down if it really were too hot?

Would that be 124 degrees F, perhaps?

Intel processors have thermal throttling at 70C die temperature.
This slows the effective clock rate of the processor until
the processor temperature drops below 70C. People who find
that their CPU "only rises to 69-70C" are in fact throttling
and don't know it. I think there is a program called Throttlewatch
that can tell you when that happens.

On the newest Intel processors, if throttling does not control
the CPU temperature (say the heatsink fell off), the CPU also
has overheat protection that shuts the computer off at 90C.
(I think the docs say 20C more than the throttle temp.) Some
older processors have the trip point set at 135C, but you
aren't likely to be using one of those.

In other words, if the processor was purchased recently,
and if the silicon die really were at 124C, the computer would
already have shut off.

124F = 51C
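
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the conversion and of the sanity check described above. The 70C and 90C thresholds are simply the figures quoted in this post, not values read from any particular processor, and the function names are made up for illustration.

```python
# Thresholds as quoted in this post; real values vary by processor model.
THROTTLE_C = 70.0
SHUTDOWN_C = 90.0


def f_to_c(deg_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


def c_to_f(deg_c: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return deg_c * 9.0 / 5.0 + 32.0


def describe_reading(deg_c: float) -> str:
    """Say what a die temperature would imply, given the quoted thresholds."""
    if deg_c >= SHUTDOWN_C:
        return "above shutdown point: a running system implies a bad reading"
    if deg_c >= THROTTLE_C:
        return "at or above throttle point: expect reduced performance"
    return "below throttle point"


if __name__ == "__main__":
    print(round(f_to_c(124)))      # 124F expressed in C
    print(round(c_to_f(124), 1))   # 124C expressed in F
    print(describe_reading(124))
```

A reading of 124 only makes sense on a working machine if it is in Fahrenheit, or if the sensor path is faulty.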

Paul
 
Ian Boys

before getting this error message i had been putting my computer in
hibernation mode-though i dont think this lead to the temperature
error.

I've had the same thing coming out of standby on my ECS PF21 motherboard.
Motherboard Monitor says the CPU fan is off, although it's not, and the temp
alarm comes on. Restart fixes all this but it means I don't trust standby
anymore.

Ian Boys
DTE
 
zark

Paul said:
Would that be 124 degrees F perhaps ?
[...]
In other words, if the processor was purchased recently,
and if the silicon die was at 124C, the computer has
already shut off.

according to the program "asus probe" the temperature of the cpu is
124C, which is OVER 250 F! i reattached the cpu cooler a second
time, reapplying silver conductive grease. there is NO throttling or
shut down. i ran an arithmetic benchmarking test with SANDRA from
sisoftware and it came out just fine.

is there any way to reset the temperature sensor? do a diagnostic on
it? the external cpu temp sensor is now reading 40C.

should i just ignore the over temp alert? wouldn't you expect to be
smelling smoke/ hearing alarms/ dead computer if the cpu was really at
250 F?
 
Paul

"zark" said:
according to the program "asus probe" the temperature of the cpu is
124C, which is OVER 250 F!
[...]
is there any way to reset the temperature sensor? do a diagnostic on
it? the external cpu temp sensor is now reading 40C.

The CPU temperature is measured by using a diode inside the CPU
die. The monitor chip measures the voltage across the diode, and
then a conversion formula is used to convert the voltage reading
into a temperature. (Asus boards have used a thermistor in the
socket area in the past, and I don't know of a simple way to
prove which method is used on a given board. The diode method
saves Asus a few pennies, so it would be a preferred method.)

If the two pins on the CPU package that connect the diode to the
board are not making good contact, that could account for some
bad readings. If the monitor chip is not set up properly (set
in the wrong mode), then I suppose that could mess things up too.

PDF page 31 shows how the Winbond W83627EHF chip measures
CPU die temp:

http://www.winbond-usa.com/products/winbond_products/pdfs/PCIC/W83627EHF_EHG.pdf

In that diagram, if the diode doesn't make good contact
with the circuit, the voltage at the monitor pin climbs
to Vref, and that should yield a high temperature reading
from the conversion formula. Conversely, if the diode fails
shorted, the measured temperature will be very low.
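
The open/shorted reasoning above can be written down as a small check. This is only a sketch in Python: the -55C to +125C limits are the readout range quoted from the Winbond spec in this thread, while the margin and gap values and the function name are arbitrary assumptions chosen for illustration.

```python
# Readout range quoted from the Winbond spec in this thread.
FULL_SCALE_C = 125.0
FLOOR_C = -55.0


def diagnose(reported_c: float, probe_c: float,
             margin_c: float = 2.0, max_gap_c: float = 40.0) -> str:
    """Compare the on-board reading with an independent probe reading.

    margin_c and max_gap_c are hypothetical tolerances, not datasheet values.
    """
    # Pinned near the top of the range while the probe says otherwise:
    # consistent with an open diode circuit (voltage climbs to Vref).
    if reported_c >= FULL_SCALE_C - margin_c and reported_c - probe_c > max_gap_c:
        return "open"
    # Pinned near the bottom of the range: consistent with a shorted diode.
    if reported_c <= FLOOR_C + margin_c and probe_c - reported_c > max_gap_c:
        return "short"
    return "plausible"


if __name__ == "__main__":
    print(diagnose(124, 40))  # on-board sensor vs. external probe from this thread
```

With the numbers reported here (124C on board, about 40C on the external probe), the check lands on the open-circuit case.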

If Speedfan (http://www.almico.com/sfdownload.php) is able to
get good temperature readings from your board, then the hardware
is fine and the BIOS is probably immature, so consider doing a
BIOS upgrade. If Speedfan also reports a bad value, then replacing
the motherboard or the CPU might be the only solution.

If the motherboard socket was dirty, I suppose that could do it.
You cannot touch the contacts in the LGA775 socket, or touch
the contacts on the bottom of your processor, without
contaminating them. What would be especially bad, is getting
thermal grease in the socket. (I have never read of an
approved cleaning method for the socket, so don't even think
about spraying solvents in there! For an end user, reinsertion
of the component, to allow the contacts to scrape a clean
connection, is the safest alternative.)

Good luck,
Paul
 
Paul

"zark" said:
is there any way to reset the temperature sensor? do a diagnostic on
it? the external cpu temp sensor is now reading 40C.

In terms of a diagnostic, in theory you could take the processor,
flip it over, and probe the two diode signals with an ohmmeter.
But don't ask me to do the math to convert the voltage
reading to a temperature :)

Ifw = Is * (e^(q*Vd/(n*k*T)) - 1)

Ifw would be the applied ohmmeter current, and Intel recommends a
small forward bias current between 11 and 187 microamps. (Selecting
an ohmmeter range that uses 100 microamps might work.) Vd is the
voltage developed across the diode and shown on the ohmmeter
display. T is the temperature in kelvin (above absolute zero).
Is is the diode saturation current, n is the diode ideality factor,
q is the electron charge and k is Boltzmann's constant.

See PDF page 83.
http://download.intel.com/design/Pentium4/datashts/30235104.pdf
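
As a rough sketch of the math in the diode equation above (not a calibrated procedure), the Python below solves it for Vd and for T. The saturation current `i_s` and ideality factor `n` used in the example are illustrative assumptions, not figures from the Intel datasheet. In a real diode Is itself changes strongly with temperature, so a single ohmmeter reading only gives a ballpark.

```python
import math

Q = 1.602176634e-19  # electron charge, coulombs
K = 1.380649e-23     # Boltzmann constant, joules per kelvin


def diode_voltage(i_fw: float, temp_k: float, i_s: float, n: float = 1.0) -> float:
    """Forward voltage Vd from Ifw = Is * (exp(q*Vd/(n*k*T)) - 1)."""
    return n * K * temp_k / Q * math.log(i_fw / i_s + 1.0)


def diode_temperature(v_d: float, i_fw: float, i_s: float, n: float = 1.0) -> float:
    """Temperature in kelvin from the same equation, solved for T."""
    return Q * v_d / (n * K * math.log(i_fw / i_s + 1.0))


if __name__ == "__main__":
    i_fw = 100e-6   # 100 microamps, inside the 11-187 microamp window
    i_s = 1e-14     # assumed saturation current, for illustration only
    v_d = diode_voltage(i_fw, 313.15, i_s)  # 313.15 K is 40C
    print(f"Vd = {v_d:.3f} V")
    print(f"T  = {diode_temperature(v_d, i_fw, i_s) - 273.15:.1f} C")
```

The second function just inverts the first, so the example recovers the 40C it started from. With a measured Vd you would still need trustworthy values for Is and n before the result means much.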

On PDF page 30 of the Winbond chip spec, there is a table showing
temperature register readout values from +125C to -55C. So,
in fact, you don't have to do any math on the Winbond (whew!).
The Winbond chip is using an internal lookup table, to convert
voltage value into a temperature reading in degrees C. The only
really puzzling thing, is why your readout is 124C instead of
the max 125C.

The implication here, is that your diode is open circuit.
One of the two (CPU internal) diode pins is not making contact
with the socket, or a connection is broken elsewhere on the
motherboard. The diode could also be open inside the processor
itself, but I would think Intel checked that before the processor
left the factory.

Paul
 
zark

so it seems that the temperature sensor is in error, but what i still
can't figure out is why it will not slow/shut down the computer?
 
Paul

"zark" said:
so it seems that the temperature sensor is in error, but what i still
cant figure out is why it will not slow/shut down the computer?

There are two diodes involved on the processor die:

diode #1 -----> processor internal throttle function
         -----> processor internal overheat & shutdown

diode #2 -----> feeds Winbond monitor interface

The BIOS, or a utility running under the OS, only reads and
reports this information; it does not have to act on it.
After all, diode #1 and its associated circuitry are
autonomous and can protect the hardware well on their own.

So the fact that diode #2 or its associated hardware is
faulty or misconfigured has no impact on protection in this case.

*******

To consider another case, there is an eight pin chip on
an A7N8X family motherboard, that connects to a diode
on an AMD S462 processor. If that chip is misconfigured
(say the threshold is set too low), the motherboard
will shut off. The reason it is implemented that way,
is the old S462 processors didn't have any way to
defend against overheating, so an external chip was used
for CPU overheat prevention. It is more important not
to fool with that 8 pin chip, unless you want a lot
of nuisance shutdowns.

In your current situation, the processor is already
protected, and a faulty monitor chip or diode #2 is
merely annoying. And diode #1 should be fully tested
at the Intel factory.

Paul
 
