Would vidcard problems stop with a change to a newer series of the same card?

mechphisto

I've a friend who has an ATI Radeon HD 2600XT 512MB by PowerColor.
In his PC, it constantly BSOD crashes his system whenever he exits
from games or high graphics programs (sometimes during). The card
works fine in another system.

We sent it back to PowerColor for another of the same thing, thinking
it might be defective in some way--but the new card does the exact
same thing.
His RAM's been checked, his WinXP Pro has been reinstalled, we've used
supplied drivers, WinXP drivers, Catalyst drivers...nothing's fixed
the problem.

PowerColor won't refund the card, but they will offer an upgrade for
nominal price.

Question is: Could this be a problem inherent in Radeon HD cards that
might not be playing nice with his mobo or something, and an upgraded
card would do the same thing?
Unfortunately there are so many of these 2600XT's on eBay, being sold
by actual stores, that he has no real chance of selling the card. He's
either going to have to eat the loss, or take the upgrade.
Opinions?

Thanks,
Liam
 
Dave

It sounds like one system you are using the video card in has a defective
power supply. -Dave
 
mechphisto

Really? How so? (It's a 550 watt.)
How would a defective PSU cause a system to BSOD with a driver/
win32k.sys error as the indicator, as opposed to not working at all?
I'm eager to hear, as I think he'd be VERY pleased to know a $40 PSU
could solve his problems--but I wouldn't want it to be a wasted $40.
Thanks for any feedback!
 
Dave

mechphisto said:
Really? How so? (It's a 550 watt.)

So? 550W crap can be weaker than a good name-brand 300W power supply. The
550W rating by itself doesn't mean anything, in other words.

mechphisto said:
How would a defective PSU cause a system to BSOD with a driver/
win32k.sys error as the indicator, as opposed to not work at all?

Quite easily. All it takes is a momentary dip in voltage, often too quick
to register on a multimeter. The error message displayed by windoze often
points to the driver/app that was in use at the time that the HARDWARE
experienced a failure.

mechphisto said:
I'm eager to hear, as I think he'd be VERY pleased to know a $40 PSU
could solve his problems--

IF the power supply is the problem, then a $40 PSU is only a band-aid. Buy
a $40 PSU and the problem will reappear sooner or later (probably sooner).

mechphisto said:
but I wouldn't want it to be a wasted $40.

Neither would I. Fact is, you've got a hardware problem somewhere: in the
video card (ruled out already), RAM (virtually ruled out already),
motherboard (not unheard of) or power supply.

Of those components, the likelihood of failure would be power supply 99.9%,
with all three other components comprising .1% combined.

About the closest you could come to a -decent- power supply for 40 bucks is
the following, if you are willing to deal with a $20 mail-in rebate.
http://www.deals2buy.com/show/45008043/deals.htm

This 500W one listed above is likely MUCH better quality, and has more real
power, than the 550W power supply currently in use! Having a good power
supply on hand as a back-up is a good idea anyway. But I strongly suspect
that the system in question needs its power supply replaced anyway. That's
what the symptom you write about is pointing to.

As a test, you could borrow a power supply from another system for testing.
If the power supply from the other system you used the video card in is
compatible, swap THAT power supply temporarily into the system that has the
problem. That will tell you if the power supply is the problem (I suspect
it is) -Dave
 
PhxGrunge

Sounds more like the VPU Recover option in CCC is checked. If so, uncheck
it, then see if the errors return.
 
Paul

How much RAM is in the computer?

Do the symptoms change when you drop down to a minimum amount of
RAM for the system? (Say, a single 1GB stick.)

Have you tried taking a brand new hard drive, disconnecting the
old one, and installing Windows, the drivers, and then a test
game on it? Are the symptoms repeatable on that clean install?
If the game works and doesn't crash, then you'd suspect a
problem of some kind with the Windows install.

Paul
 
mechphisto

Two 1GB sticks. Yeah, we've tried with one out and then the other--
same thing.
I have tried a different HD, same thing.
I've had to reinstall WinXP from a format three times, no change.
The only 3 pieces of hardware that haven't been swapped out are the
CPU, mobo, and PSU.
 
Paul

There is a thread here that mentions VPU recoveries and win32k.sys
errors, mostly Vista related, but with some WinXP at the end of the
thread. All driver problems of one sort or another.

http://www.rage3d.com/board/showthread.php?t=33886850&page=5

In this thread, there is some explanation of what win32k.sys (written as
Win32.sys in the quote below) might do.

http://it.slashdot.org/article.pl?sid=05/12/30/1310243&tid=220

"The graphics rendering engine is divided between the Win32 subsystem
which is a user process (csrss.exe), and the Win32 executive (Win32.sys)
which actually runs in kernel space. The portion of the graphics system
in the executive is limitted almost exclusively to the actual displaying
of images and direct interaction with the drivers that interface with
the display hardware."

So that means part of Windows tossed its cookies, presumably while
talking to the graphics driver. But it doesn't explain why.

Imagine for a moment that a driver leaked memory, or corrupted
a chained memory structure. It could be that the fault occurs
during the "unwinding" of the mess and the recovery of allocated
memory. A driver bug could do it, a hardware issue, any number
of things.

If it were my problem, and I'd already done the basics (memtest86+,
Prime95 for memory stability), I'd carefully collect crash
symptoms and exact error messages, and Google them, trying to
find a match. Presumably the various versions of Catalyst didn't
all behave the same, and each driver set you test adds extra
data to your results.
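
One way to keep those results straight is a simple tally. The sketch below is only an illustration: `CrashRecord`, `tally` and `drivers_without_crashes` are names made up here, not anything that comes with Catalyst or Windows, and the records would be whatever you copy down by hand from each blue screen.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CrashRecord:
    """One observed crash; every field is typed in by hand from the BSOD."""
    driver_version: str  # the driver release that was installed at the time
    stop_code: str       # the STOP code shown on the blue screen
    module: str          # the file the blue screen blames, if any
    trigger: str         # what was happening, e.g. "exit game" or "alt-tab"


def tally(records):
    """Count crashes per (driver_version, module) pair."""
    return Counter((r.driver_version, r.module) for r in records)


def drivers_without_crashes(tested_drivers, records):
    """Return the tested driver versions that never appear in a crash record."""
    crashed = {r.driver_version for r in records}
    return [d for d in tested_drivers if d not in crashed]
```

If every driver set blames the same module under the same trigger, that leans toward hardware or the OS install; if only some driver sets crash, that leans toward a driver bug.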

On my ATI video card, I remember how happy I was that I got the new
card installed, booted Windows, and put the shiny ATI installer
CD in the drive. Moments later I was looking at a system crash.
The drivers on the CD crashed my system. It took me about three
downloads of different drivers before the card was completely
happy, and it has been happy to this day. But the first attempt
was a spectacular flop. (This was right around the time that
SmartGART was introduced.)

Paul
 
mechphisto

Thanks for the feedback.
I've been dealing with this problem for some time:
http://groups.google.com/group/comp...=author:[email protected]#bc48e737751e46aa
http://groups.google.com/group/micr...=author:[email protected]#ea7f14db4c09c63b
The fact that this card worked fine in another machine, and the
problem would happen in the machine in question despite how many times
WinXP is reinstalled or what version or provider of the driver is
used, leads me to think it's hardware.
Since all hardware has been swapped out and tested, except for the
CPU, mobo, and PSU... this is what I have left. Either the PSU is a
problem, or this series of video card is simply incompatible with this
mobo/CPU--which led me to ask whether switching from a 2600 to a 2900
would even fix the problem.

Thanks for the reply,
Liam
 
mechphisto

Dangit!
I was certain the PSU was the answer, but I swapped it out for a much
higher quality PSU:
Antec putting out +3.3@30A, +5@40A, +12@18A
(The thing weighs a ton.)
And it still crashed, with win32k.sys named as the culprit, when I
alt-tabbed out of a game.
Looks like it's not the PSU at fault.
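
For reference, the label figures above convert to per-rail wattage as plain volts times amps. A minimal sketch; the `RAILS` table and `rail_watts` helper are names invented for this example, and the only inputs are the three ratings quoted above.

```python
# Label ratings quoted above: rail name -> (volts, max amps)
RAILS = {
    "+3.3V": (3.3, 30.0),
    "+5V": (5.0, 40.0),
    "+12V": (12.0, 18.0),
}


def rail_watts(rails):
    """Return the maximum wattage of each rail (volts x amps), rounded to 0.1 W."""
    return {name: round(volts * amps, 1) for name, (volts, amps) in rails.items()}


if __name__ == "__main__":
    for name, watts in rail_watts(RAILS).items():
        print(f"{name}: {watts} W")
```

That works out to 99 W, 200 W and 216 W. Adding the three together usually overstates what a supply can deliver, since the rails generally can't all be run at their maximum at once, and a PCI Express graphics card leans mostly on the +12V rail, so that is the figure worth comparing between supplies.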
 
Dave

Sheesh, only thing left is the mainboard. -Dave
 
ShadowTek

Have you tried something like Memtest to test the hardware for
stability while the graphics card is removed?
 
mechphisto

yeah, 'fraid so.
memtest86+ with several passes. No errors. =/
 
Paul

mechphisto said:
Thanks for the feedback.
I've been dealing with this problem for some time:
http://groups.google.com/group/comp..._frm/thread/a04bc864b91a2eb5/bc48e737751e46aa
http://groups.google.com/group/micr..._frm/thread/e3e6e149089da1af/ea7f14db4c09c63b
The fact that this card worked fine in another machine, and the
problem would happen in the machine in question despite how many times
WinXP is reinstalled or what version or provider of the driver is
used, leads me to think it's hardware.
Since all hardware has been swapped out and tested, except for the
CPU, mobo, and PSU... this is what I have left. Either the PSU is a
problem, or this series of video card is simply incompatible with this
mobo/CPU--which led me to ask whether switching from a 2600 to a 2900
would even fix the problem.

Thanks for the reply,
Liam

OK, in addition to using memtest86+ for testing system memory, you
also need to run a program like Prime95. What Prime95 does is a
math calculation with a known answer. The "torture test" option is
used to test the adequacy of CPU cooling (as it makes the processor
hot), as well as to push a lot of data through system memory. You can
download the "official release" from the mersenne.org main page,
or use this multicore version. This one starts a thread per core,
and for dual or quad cores, gives them a better workout. When it
asks "Join GIMPS?", say no, and just use the defaults in the custom
setup dialog. Allow it to run for at least four hours. If the system
isn't very stable, it will fail in 10 seconds, so for badly set up
hardware, the feedback is immediate.

http://www.mersenne.org/gimps/p95v255a.zip

In one of your threads, you mention the chipset is getting hot,
and I'd experiment with adding cooling, either by mounting a
small fan on or next to it, or using a larger fan held outside
the box. And then see if stability improves.

It is also possible, when a manufacturer fits a heatsink to a
hot chipset, that they do a poor job of installation. Sometimes,
a large quantity of thermal interface material is used, and
either the TIM makes poor contact, or acts as an insulator,
making the chipset temp higher than it should be. That
puts the chipset, and memory access, on the edge of stability.
Some chipsets even come with no TIM used, and an air gap,
guaranteeing trouble.

In those threads, you also mention that game crashes happen
"in-game", as well as just as you exit from the game. Is the
error still reported as win32k.sys? Or is the ATI driver named
at that point?

In terms of hardware tests, in increasing stress on the hardware,
the tests to run are:

1) memtest86+ -- Main value is detecting RAM with stuck-at faults.
   Not generally known for exceptional stress to hardware.
2) Prime95 -- There are a number of "burn" type test programs.
   The benefit of Prime95 is that it compares the computation
   results to expected results. So not only does it
   stress CPU and memory, but it also announces when a
   consistency check has failed. The program cannot
   test all memory, which is a weakness. On my machine,
   it can test up to about 760MB of my 1GB total - the
   rest of memory is used by Windows.
3) 3D Gaming -- I have used 3DMark2001SE build 330, and left the
   demo loop running overnight. The main benefit was not
   having to do anything while it runs. That is no longer the
   most stressful test possible. Crysis is now a candidate
   for being stressful. In any case, selecting a DirectX or
   OpenGL test program that doesn't have a lot of known
   driver issues allows stress testing the hardware without
   triggering game-related bugs.

So I'd say at the current time, you should go back and do (2). Note that
in (2), there is no stress at all on the video card or video slot. So
if you see errors detected at that point, then it could be motherboard
or memory. Sometimes bumping up Vdimm can help, or a bit of
BIOS tuning for the motherboard will put the system on the right
path.
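
The "known answer" idea in (2) can be sketched in a few lines. This is not Prime95's code, just a toy version of the same principle: it runs the Lucas-Lehmer test on small Mersenne exponents whose results are long established and flags any disagreement. The function names and the choice of exponents are made up for illustration, and the real torture test works on far larger numbers.

```python
# Exponent p -> whether 2**p - 1 is prime (well-known small cases).
KNOWN = {
    3: True, 5: True, 7: True, 11: False, 13: True, 17: True, 19: True,
    23: False, 29: False, 31: True, 127: True, 521: True, 607: True,
    1279: True,
}


def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime. p must be an odd prime."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


def torture_pass(known=KNOWN):
    """Run one pass; return the exponents whose result disagrees with KNOWN."""
    return [p for p, expected in known.items() if lucas_lehmer(p) != expected]


def run(passes=100):
    """Repeat the pass and count how many passes had a mismatch."""
    failed = 0
    for i in range(passes):
        bad = torture_pass()
        if bad:
            failed += 1
            print(f"pass {i}: mismatch for exponents {bad}")
    return failed
```

On healthy hardware every pass agrees with the table. A mismatch means the CPU or memory produced a wrong result, which is exactly what a "burn" program without result checking would never tell you.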

Paul
 
mechphisto

Wow! What a lot of great information and advice! I really appreciate
your doing that.
About the overheating mobo--that was a board ago. The Northbridge was
getting as hot as 105C when using SiS benchmark. We RMA'ed it and got
this entirely different mobo.

I've run memtest86+ for several passes with no errors.
I'm running Prime95 now. (Hadn't known about that program before.)
It's made it through 1024K, 8K, 10K, 896K, and 768K with all "passes".
I'll let it go on for another couple of hours and see what happens.

According to Speedfan, the Core temp hasn't gotten above 55C, which
isn't bad. The Northbridge has topped out at 74C. That's kind of hot;
I may need to see about putting a fan on that sink. But that's still
not too bad, especially running at max for an hour or so now.

I've run 3DMark on this system before and it got through it, twice,
just fine. I'll see about running it overnight. Does it specifically
test DirectX and OpenGL as you recommended I test for? Or do I need to
look for something else?

Anything changes, or after I complete the rest of the tests, I'll post
an update.
Thanks for the advice!!
-Liam
 
mechphisto

Hard drive??
Is that possible?
As you can see in the thread, I've detailed how I'm trying to narrow
the problem down.
The only hardware that has not been swapped out are the CPU, the mobo,
the video card which I thought/think is the root of the problem, and
the hard drive.
I've just finished testing the CPU, and it seems to pass all the
tests. Technically, the video card shouldn't be a problem: it's been
replaced by the manufacturer and works fine in another PC.
I ruled out the HD because I've performed badblocks (Linux low level
block checker) and chkdsk and scandisk and Partition Magic's error
check--and it passed them all. (Except half the time, after a BSOD
win32k.sys crash, chkdsk ends up converting so many lost chains that
system files are damaged and I end up having to reformat the drive.)
Just to see, I installed a different HD (IDE vs. the SATA in question)
and installed the same OS and drivers and patches and game... and I
can't seem to get it to crash!

I'm doing everything I normally do that causes the system to BSOD,
which it does without fail normally, but with this HD, it won't crash.

I figured the HD couldn't cause a BSOD crash because it passes all
those tests, and mainly just stores and transfers data--not process
and "run" it.

Is this likely the source of the problem? Or is the fact that it's a
different drive, or IDE instead of SATA, masking/bypassing a
still-existing root problem?

Thanks for any suggestions!
-Liam
 
Carl

Hello, I have been following this thread with interest. Why don't you go to
your hard drive manufacturer's website and download their zero fill utility?
It might take a long time to complete, but it returns the hard drive to
being totally clean; some say it returns the drive to the same state it
was in sitting on the shop shelf waiting to be bought. I have zero filled a
few drives before. They had all passed the different tests you have done
before the zero fill, but I was getting random BSODs and had reformatted
and reinstalled several times.
 
ShadowTek

Maybe your SATA host controller is flaky? Or maybe the SATA port that
you were using is bad?

How many SATA ports do you have? Try a different one and see what
happens.

If that doesn't work and your IDE drive continues to work well, then
try a different SATA hard drive and see what happens.
 
Paul

ShadowTek said:
Maybe your SATA host controller is flaky? Or maybe the SATA port that
you were using is bad?

How many SATA ports do you have? Try a different one and see what
happens.

If that doesn't work and your IDE drive continues to work well, then
try a different SATA hard drive and see what happens.

The motherboard is a M61P-S3.

http://www.newegg.com/Product/Product.aspx?Item=N82E16813128034

The chipset is a single chip, combining Northbridge and Southbridge functions.
Geforce 6100/430.

http://c1.neweggimages.com/NeweggImage/productimage/13-128-034-04.jpg

It could be that the SATA interface only throws errors when the
chip temp hits 74C. If so, testing in Linux, when not gaming,
does not stress the chipset quite the same way.

By comparison, an IDE interface would be much more tolerant of
temperature.

The chipset runs the SATA interface at 1.5Gbit/sec or 3.0Gbit/sec,
and the I/O pad on the chip probably sees some performance degradation
when operated at high temperature. It may take disk testing, while
a game is running, to detect a problem. If there is a way to do
something like that...
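
One rough way to do that kind of disk testing is to write blocks with known checksums to a scratch file on the SATA drive, then read them back and compare, repeating while the game runs. A minimal sketch with made-up function names. Note that the operating system may answer the reads from its cache instead of the disk, so the scratch file would need to be larger than installed RAM (or the verify done after a reboot) for the reads to really exercise the drive.

```python
import hashlib
import os
import tempfile

BLOCK = 1024 * 1024  # 1 MiB per block


def write_blocks(path, n_blocks, block_size=BLOCK):
    """Write n_blocks of random data to path; return the SHA-256 of each block."""
    digests = []
    with open(path, "wb") as f:
        for _ in range(n_blocks):
            data = os.urandom(block_size)
            digests.append(hashlib.sha256(data).hexdigest())
            f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out to the drive
    return digests


def verify_blocks(path, digests, block_size=BLOCK):
    """Re-read path and return the indexes of blocks whose checksum changed."""
    bad = []
    with open(path, "rb") as f:
        for i, expected in enumerate(digests):
            if hashlib.sha256(f.read(block_size)).hexdigest() != expected:
                bad.append(i)
    return bad


def demo(n_blocks=4, block_size=4096):
    """Tiny self-check in a temporary directory; returns the bad block list."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "scratch.bin")
        digests = write_blocks(path, n_blocks, block_size)
        return verify_blocks(path, digests, block_size)
```

For a real run, point `write_blocks` at a path on the suspect SATA drive, start the game, and call `verify_blocks` in a loop. Any non-empty result while the chipset is hot, with an empty result at idle, would point at the SATA path rather than the drive itself.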

Paul
 
