New system stability

Bill

Greetings...

I just bought a new system with an AMD dual core and it runs great,
until I try to run Prime95 as a "burn-in" test.

The entire system is stock, no overclocking, no fancy settings, just
default BIOS settings right now until I confirm the hardware is good.
Here's the list of items:

Asus A8N-E motherboard
AMD 64x2 3800+ at default 2.0 GHz
OCZ gold 2x512MB DDR at AUTO setting gives 2.5-3-3-5 timings
BFG 6800 GT video card (PCI-Express)
Antec Sonata II case with Antec 450w Smart power supply
And the usual other goodies like a serial ATA drive, floppy, DVD burner
on parallel ATA. The case has a 120mm exhaust fan and I added a 92mm
intake fan that blows cool air on the motherboard over the CPU and RAM.
The BIOS reports 33'C and 34'C for the board and CPU temps respectively.

Windows XP and all of my programs run fine. MemTest86 shows no RAM
problems, yet Prime95 loads up and starts to run, but then it gives
SUMOUT errors after about 1-3 hours, no rounding errors or crashes
though.

I think the RAM is fine because it works great in my old P4 system that
runs Prime95 100%, but it may not be compatible with Asus. I also tried
dropping the CPU speed to 1.0 GHz to see if the CPU was acting up, and
it still failed.

Does anyone know of any compatibility issues with AMD and Asus?
Or with OCZ memory and Asus boards?
 
Friedrich Wuelfing

I had a similar thing with my A7M266-D.
I modified two XPs to MPs and used unbuffered memory.
Everything incl. memtest86 etc. ran fine except prime95.
They say something must be wrong if it doesn't run without errors.
So I replaced the RAM with buffered ECC Samsung, still the same.
I replaced the CPUs with true MPs, still the same.
Finally I gave up, as everything incl. SETI etc. worked fine.
It has been running this way for two years without other problems.

I know, it's a different setup, but the same problem.
 
Egil Solberg

Friedrich said:
I had a similar thing with my A7M266-D.
I modified two XPs to MPs and used unbuffered memory.
Everything incl. memtest86 etc. ran fine except prime95.
They say something must be wrong if it doesn't run without
errors. So I replaced the RAM with buffered ECC Samsung, still the
same. I replaced the CPUs with true MPs, still the same.
Finally I gave up, as everything incl. SETI etc. worked fine.
It has been running this way for two years without other problems.

I know, it's a different setup, but the same problem.

If your rig is not Prime95 stable, it is not calculating correctly, and there
is reason to believe that your SETI data might be worthless. "SETI etc.
worked fine" does not mean anything. The only way to find out whether a
computer is calculating correctly is to compare its result to a known
correct result.
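
To make "compare against a known correct result" concrete, here is a
rough Python sketch of the idea. It runs the Lucas-Lehmer test (the test
Prime95 is built around) on small Mersenne numbers whose status is well
established and reports any mismatch. The function names and the choice
of exponents are mine, for illustration only; this is not Prime95's code
and it will not stress a machine the way Prime95 does.

```python
# Exponents p for which 2**p - 1 is a known Mersenne prime (small ones only).
KNOWN_PRIME_EXPONENTS = {3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127}


def lucas_lehmer(p: int) -> bool:
    """Return True if 2**p - 1 is prime. Valid for odd prime p."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


def self_check(exponents=(3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 61, 89, 107, 127)):
    """Return the exponents whose computed result disagrees with the known answer."""
    return [p for p in exponents
            if lucas_lehmer(p) != (p in KNOWN_PRIME_EXPONENTS)]


if __name__ == "__main__":
    bad = self_check()
    print("all results match" if not bad else f"mismatch at exponents {bad}")
```

On healthy hardware the mismatch list is empty. A non-empty list means
the machine produced a wrong answer to a question whose answer is known,
which is exactly the kind of failure that "it runs my programs fine"
cannot reveal.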
 
Egil Solberg

Bill said:
Greetings...

I just bought a new system with an AMD dual core and it runs great,
until I try to run Prime95 as a "burn-in" test.

Are you using the latest BIOS, 1010? And the latest drivers?
 
Paul

Bill said:
Greetings...

I just bought a new system with an AMD dual core and it runs great,
until I try to run Prime95 as a "burn-in" test.

The entire system is stock, no overclocking, no fancy settings, just
default BIOS settings right now until I confirm the hardware is good.
Here's the list of items:

Asus A8N-E motherboard
AMD 64x2 3800+ at default 2.0 GHz
OCZ gold 2x512MB DDR at AUTO setting gives 2.5-3-3-5 timings
BFG 6800 GT video card (PCI-Express)
Antec Sonata II case with Antec 450w Smart power supply
And the usual other goodies like a serial ATA drive, floppy, DVD burner
on parallel ATA. The case has a 120mm exhaust fan and I added a 92mm
intake fan that blows cool air on the motherboard over the CPU and RAM.
The BIOS reports 33'C and 34'C for the board and CPU temps respectively.

Windows XP and all of my programs run fine. MemTest86 shows no RAM
problems, yet Prime95 loads up and starts to run, but then it gives
SUMOUT errors after about 1-3 hours, no rounding errors or crashes
though.

I think the RAM is fine because it works great in my old P4 system that
runs Prime95 100%, but it may not be compatible with Asus. I also tried
dropping the CPU speed to 1.0 GHz to see if the CPU was acting up, and
it still failed.

Does anyone know of any compatibility issues with AMD and Asus?
Or with OCZ memory and Asus boards?

Your tRAS could be a bit higher. Use CPUZ (cpuid.com) to verify
what the Auto settings are doing. Then, go back into the BIOS
and make your changes. This Anandtech article recommends tRAS
6-8, and I'd try 8 as there is no loss with that setting.

http://www.anandtech.com/mb/showdoc.aspx?i=2465&p=3

You could also play with CAS, and see if CAS 3 is any better
than CAS 2.5 in terms of passing Prime95.
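
CAS is counted in memory clock cycles, so what a change from 2.5 to 3
costs in absolute time depends on the clock. A small Python sketch of
the arithmetic, assuming the memory runs at DDR400 (a 200 MHz clock);
the function name is made up for illustration.

```python
def cas_latency_ns(cas_cycles: float, memory_clock_mhz: float) -> float:
    """Convert a CAS latency in clock cycles to nanoseconds."""
    cycle_ns = 1000.0 / memory_clock_mhz  # one clock period in ns
    return cas_cycles * cycle_ns


if __name__ == "__main__":
    # Assumed clock: 200 MHz, i.e. DDR400.
    for cas in (2.0, 2.5, 3.0):
        print(f"CAS {cas}: {cas_latency_ns(cas, 200.0):.1f} ns")
```

At that clock each half cycle of CAS is 2.5 ns, so relaxing CAS gives
the chips a little more time per access at a small cost in latency.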

I would disable Cool N' Quiet, until you get a handle on what
is happening. At least then, the Vcore VID pins should stick
with one value - Vcore is allowed to "droop", it is normally
covered in processor specifications (which I cannot find on
the AMD site!), and if you see the voltage drop by 0.1V when
using Asus Probe, that is not out of the ordinary. If you
saw much larger droop, that might be more of a concern.
But to really blame droop, we'd need to compare to someone
else's results with the same hardware config.

Some of the early dual cores seemed to have problems with the
thermal contact between the cores and the IHS (integrated heat
spreader). The symptoms were that one core overclocked much
better than the other - in extreme cases, the dual core was
barely able to pass at stock speeds. Some people removed the
heat spreader using a razor blade, and found placing their
heatsink right on the cores gave stability to higher clocks.
I was reading a thread last night, and the latest dual cores
seem to ship with a better quality thermal interface material,
and the opinion is that the new stuff is acceptable. There are
still people stripping off the IHS, but just to impress their
friends.

To test the cores, try assigning Prime95 first to one core and
then to the other core, and see if one core is responsible for
the problems. If both cores do it, then you'll have to look
elsewhere (like maybe a motherboard PCB fabrication problem).

Another option to look at: Prime95 has a "small FFT" and
a "large FFT" option. "small FFT" will tend to run in cache,
and if that increases the failure rate, then either the processor
is at fault, part of a processor core is overheating, motherboard
Vcore is out of spec, or your power supply +12V could be dipping
too low. Using a "large FFT" option means system memory gets
more of a workout. Trying the two methods may hint at whether
memory or the processor is the more likely culprit.
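
The small FFT / large FFT reasoning above can be written down as a
simple lookup. This is only a sketch of that rule of thumb in Python;
the function and list names are invented here, and the output is a hint
about where to look first, not a diagnosis.

```python
from typing import List

# Suspects when the cache-resident "small FFT" test fails.
CPU_SIDE = [
    "processor core",
    "core overheating",
    "motherboard Vcore out of spec",
    "power supply +12V dipping",
]

# Suspects when the memory-heavy "large FFT" test fails.
MEMORY_SIDE = ["system memory (modules, timings, or Vdimm)"]


def likely_culprits(small_fft_fails: bool, large_fft_fails: bool) -> List[str]:
    """Map the two torture-test outcomes to the areas worth checking first."""
    suspects: List[str] = []
    if small_fft_fails:
        suspects.extend(CPU_SIDE)
    if large_fft_fails:
        suspects.extend(MEMORY_SIDE)
    return suspects


if __name__ == "__main__":
    print(likely_culprits(small_fft_fails=False, large_fft_fails=True))
```

If both tests fail, the two lists are simply combined, which is the
honest answer: the pair of tests cannot separate the causes in that case.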

HTH,
Paul
 
Bill

Egil said:
Are you using the latest BIOS, 1010? And the latest drivers?

Yes...sorry I should have included that in my post.

In fact, I've tried running the original 1008 and now 1010. For drivers,
I started with the ones on the Asus CD and video CD, then upgraded to
the latest. No change.

However, today I came across the OCZ discussion forums on Google and
found that OCZ recommends running the DDR at 2.90v (!), even though the
standard is 2.60v and they quote 2.80v as the enhanced rating. I bumped
it up and it's been running for over an hour without errors.

I'll keep my fingers crossed.

:)
 
Bill

Paul said:
Your tRAS could be a bit higher. Use CPUZ (cpuid.com) to verify
what the Auto settings are doing. Then, go back into the BIOS
and make your changes. This Anandtech article recommends tRAS
6-8, and I'd try 8 as there is no loss with that setting.

http://www.anandtech.com/mb/showdoc.aspx?i=2465&p=3

I'll give that a look...thanks for the link.

I would disable Cool N' Quiet, until you get a handle on what

That's disabled by default in the BIOS. I wouldn't use it anyway unless
I was running it on a laptop.

To test the cores, try assigning Prime95 first to one core and
then to the other core, and see if one core is responsible for
the problems. If both cores do it, then you'll have to look
elsewhere (like maybe a motherboard PCB fabrication problem).

That was something I missed in the documentation for Prime95, the
"affinity" setting. I tried it, and both cores failed around the same
point.

But I think I stumbled upon the answer today while googling at work. The
memory company OCZ has a forum, and in there I found that even though
the memory is rated at 2.60v, with enhanced settings at 2.8v, the techs
strongly recommend running my memory at 2.90v or higher. I was surprised
by this, but after bumping up the DDR voltage to 2.90v everything is
running great so far - well over an hour and no errors!

I'll run the tests again overnight and see what happens. Thanks for the
info.
 
Paul

That's disabled by default in the BIOS. I wouldn't use it anyway unless
I was running it on a laptop.

Have a look at section 3.6 of your user manual. You can use it,
but there are a couple of settings and some software to install.
You don't have to use it, if you don't want to - the benefit
of CNQ is it'll use a little less electricity when the CPU is idle.

I'll run the tests again overnight and see what happens. Thanks for the
info.

Yes, post back how it turns out. There might be some other
potential OCZ memory users out there who need to know this.
I'm kinda surprised it needs 2.9V at DDR400, if that is
the speed setting you are using. (Verify with CPUZ from
cpuid.com, to see what settings the BIOS is using.) If this
was my RAM, I'd dial Vdimm down until it throws errors in
Prime, then lift it one more notch - all in the name of
keeping the temp down on the memory. If the memory has a
lifetime warranty, then it probably doesn't matter :)
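
The dial-down procedure can be sketched in Python as follows. The
callback `passes_prime95` is hypothetical: in practice it stands for
setting the voltage in the BIOS by hand and running Prime95 for several
hours, and nothing here talks to real hardware. The threshold used in
the example run is invented purely to show the logic.

```python
def lowest_stable_vdimm(steps, passes_prime95):
    """Walk down from a known-good voltage and return the lowest one that passes.

    steps: available BIOS voltage steps, sorted from high to low.
    passes_prime95: callable taking a voltage and returning True if the
        torture test ran clean at that setting (hypothetical stand-in).
    """
    last_good = None
    for volts in steps:
        if passes_prime95(volts):
            last_good = volts
        else:
            break  # first failure: go back up one notch
    return last_good


if __name__ == "__main__":
    # Simulated result only: pretend anything at or above 2.85 V is stable.
    simulated = lambda volts: volts >= 2.85
    print(lowest_stable_vdimm([2.90, 2.85, 2.80, 2.75], simulated))
```

A return value of None means even the starting voltage failed, so the
search has to begin from a higher setting (within what the memory
maker allows).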

Paul
 
Bill

Paul said:
Have a look at section 3.6 of your user manual. You can use it,
but there are a couple of settings and some software to install.
You don't have to use it, if you don't want to - the benefit
of CNQ is it'll use a little less electricity when the CPU is idle.

I realize that, hence the reason I would use it on a laptop to conserve
battery power. But since I leave my desktop computer on 24/7 and it's
always processing something, there is little advantage to using CnQ.

Yes, post back how it turns out. There might be some other
potential OCZ memory users out there who need to know this.

Good news!
Prime95 runs for hours without any errors at 2.90 volts.

What confuses me about all this is that the memory tests never showed
any kind of errors at all, yet Prime95 really stresses the CPU more than
the memory. Go figure.

I'm kinda surprised it needs 2.9V at DDR400, if that is
the speed setting you are using. (Verify with CPUZ from
cpuid.com, to see what settings the BIOS is using.)

It's confirmed, I run it 2-2-2-7 1T at 2.90v at DDR400 (200 MHz memory
clock). This is the gold series extreme memory and it's designed to run
this fast, which is the reason I opted to stick with DDR instead of making
the move to DDR2. If DDR2
was as fast as this stuff at the same price, I would have bought a
Pentium dual-core instead.

The added bonus is that fast DDR2 memory costs a fair bit more, and I
saved about $100 sticking with fast DDR memory.

If this
was my RAM, I'd dial Vdimm down until it throws errors in
Prime, then lift it one more notch - all in the name of
keeping the temp down on the memory. If the memory has a
lifetime warranty, then it probably doesn't matter :)

At the default "Auto" settings, the voltage is 2.60 and I can't come
close to 2-2-2 timings. At the recommended 2.80v I can get 2-2-2, but I
still get errors in Prime95. I can slow it down to about 3-6-6-10 2T and
it will usually pass the test at 2.80v.

But then it's noticeably slower in certain applications and I didn't
fork over twice as much money over cheaper memory just to run it at slow
timings.

OCZ themselves recommend running this specific memory at 2.90-3.00v. And
the memory is still guaranteed for life at or below 3.00v. Many of their
other modules are NOT covered at 3v though.

Anywho, I'm a happy camper now!
 
Paul

I realize that, hence the reason I would use it on a laptop to conserve
battery power. But since I leave my desktop computer on 24/7 and it's
always processing something, there is little advantage to using CnQ.


Good news!
Prime95 runs for hours without any errors at 2.90 volts.

What confuses me about all this is that the memory tests never showed
any kind of errors at all, yet Prime95 really stresses the CPU more than
the memory. Go figure.


It's confirmed, I run it 2-2-2-7 1T at 2.90v at DDR400 (200 MHz memory
clock). This is the gold series extreme memory and it's designed to run
this fast, which is the reason I opted to stick with DDR instead of making
the move to DDR2. If DDR2
was as fast as this stuff at the same price, I would have bought a
Pentium dual-core instead.

The added bonus is that fast DDR2 memory costs a fair bit more, and I
saved about $100 sticking with fast DDR memory.


At the default "Auto" settings, the voltage is 2.60 and I can't come
close to 2-2-2 timings. At the recommended 2.80v I can get 2-2-2, but I
still get errors in Prime95. I can slow it down to about 3-6-6-10 2T and
it will usually pass the test at 2.80v.

But then it's noticeably slower in certain applications and I didn't
fork over twice as much money over cheaper memory just to run it at slow
timings.

OCZ themselves recommend running this specific memory at 2.90-3.00v. And
the memory is still guaranteed for life at or below 3.00v. Many of their
other modules are NOT covered at 3v though.

Anywho, I'm a happy camper now!

So, you've tested and found that 2.9V is necessary to make it
stable, and that is all I'm recommending (just no excess
voltage that doesn't give performance in return).

Some overclockers actually like to position a fan over
their DIMMs, when applying the higher voltages. Now, your
voltages really aren't that high, so there is no reason to
get excited. My rule of thumb with components, is if you
burn yourself on them (cannot hold a finger on them for
2 seconds), then you should consider adding some air flow.

I don't know why, but some memories seem to be able to take
a lot more voltage than others. If you applied 2.9V to some
Crucial Ballistix, with Micron chips on it, you might discover
fairly quickly, that a DIMM died on you. Winbond BH5 chips
on the other hand, have no trouble running with 3.3V. And there
are both DFI motherboards and the OCZ DIMM booster that can apply
more voltage than that, if an enthusiast needs it.

In terms of the voltages being used these days, I believe the
AMD spec sheet lists 2.9V as the highest shared voltage between
the memory controller on the processor and the I/O power being
used on the memory (see VDDIO spec in doc 31411 from AMD). But
as has been demonstrated many times now, the AMD processors
can take more than that. I did run into a thread on one of the
private forums a couple of days ago, by "Bigtoe" the OCZ rep,
and he commented that the Athlon64 family is sensitive to the
differential between Vcore and Vdimm. In other words, if a
person was to use extreme voltages for Vdimm, then Vcore
should be lifted up a bit as well. So his feeling is, the
Athlon64 family is not necessarily sensitive to the absolute
voltage applied to the shared Vdimm supply, but to the voltage
difference between Vdimm and Vcore. That would make some of the
more modern, lower Vcore processors, more prone to failure due
to excessive VDimm. At your 2.9V setting, you have nothing to
worry about. This warning is reserved for people using the
DFI board or the OCZ DIMM booster. There have been a few processors
lost to overvolting, in the overclocker community. I don't
know if AMD would comment on an issue like this or not, as
their spec sheet says what they believe is a safe set of
conditions. Helping overclockers torture warrantied products
would not likely be good for AMD's business :)

As for your observation that the memory can pass memtest86
but fail Prime95, that is the whole reason for running Prime :)
Someone else came up with this combo, and I didn't believe
it was necessary at first, but I believe it now.

It seems, for whatever reason, that Prime95 or gaming for
a few hours, can expose more problems than you can see in
memtest86. Even SuperPI is recommended as a way to test.
The value of memtest86, is that it tests all the memory, due
to the fact that the executable is "moved out of the way",
and the memory underneath is tested. It allows stuck-at type
faults to be tested with no OS running, which is better
than any other test you can run from Windows. But Prime95
and other tools, must still be run as a final "qualification"
test.

Paul
 
Paul

As usual, you've posted some very interesting and pertinent stuff.
This makes me wonder about those of us using relatively high-voltage
DIMMs (like my low-latency Corsair that likes 2.9v) and CnQ which can
lead to a Vcore-Vdimm delta of about 0.9V. Do you see any problem
with that situation?

The AMD spec sheet lists 2.9V as being the maximum value. That
means you can use any Vcore that will run, while supplying 2.9V
to the VDDIO pins (memory). You have nothing to worry about.

An overclocker might not be running CNQ, and Bigtoe's
recommendation, was to bump the fixed Vcore above its normal
value, if higher memory voltages are being used. If the memory
cannot handle the voltage, it will be killed, but the processor
might still survive.

Some processor datasheets actually specify a value for the
max differential between the various supplies. Motorola
processors have specs for this. There is no reason for AMD
to be speccing this, as a 2.9V operating range was supposed
to be generous in the first place :) Commercial JEDEC
spec is only 2.6V, and 2.9V was supposed to give plenty of
room for supply variation. No silicon manufacturer wants
to design headroom into parts, so a few overclockers can
have fun :) If you go above 2.9V on an Athlon64 family
part, it is basically Russian Roulette as far as AMD is
concerned. But many people have got away with it.

Paul
 
Bill

Paul said:
So, you've tested and found that 2.9V is necessary to make it
stable, and that is all I'm recommending (just no excess
voltage that doesn't give performance in return).

Actually I can't take much credit for any testing - OCZ information
recommends running their XTC memory (and others) at 2.90v for optimal
performance. I just tried their suggestions and it worked.

I did play with various voltage settings, but only after I learned that
OCZ recommended higher than normal voltage.

Some overclockers actually like to position a fan over
their DIMMs, when applying the higher voltages. Now, your
voltages really aren't that high, so there is no reason to
get excited. My rule of thumb with components, is if you
burn yourself on them (cannot hold a finger on them for
2 seconds), then you should consider adding some air flow.

I don't have a fan over the DIMM's but the CPU heatsink happens to blow
in the direction of the memory, and the special intake duct dumps air on
the CPU area. It's not all room temp, but since my CPU often runs fairly
cool, it's close, and the intake dumps cool air.

There is also a 120mm fan circulating air in the case to add to the
overall cooling - the graphics card likes it. And the case comes with an
installed 120mm fan to exhaust air as well.

Overall, case cooling is good.

As for your observation that the memory can pass memtest86
but fail Prime95, that is the whole reason for running Prime :)
Someone else came up with this combo, and I didn't believe
it was necessary at first, but I believe it now.

I'm a believer too!

I thought my system was stable running Windows, lots of apps, Boinc,
etc., but now I see it really wasn't.

The good news is that my system is now rock solid.

It seems, for whatever reason, that Prime95 or gaming for
a few hours, can expose more problems than you can see in
memtest86.

Actually, gaming was never an issue.

I could play HL2 or Q4 without issue. The only thing that failed was
Prime95, and to be honest, I started to wonder about a conflict with the
setup since I couldn't get anything else to fail.

The value of memtest86, is that it tests all the memory, due
to the fact that the executable is "moved out of the way",
and the memory underneath is tested. It allows stuck-at type

I used to consider Memtest a valuable tool...but now it's just another
confirmation as far as I'm concerned. Prime95 has to be one of the major
tests or the system isn't stable.

Thanks for the edification.

:)
 
