Random reboot / STOP 0xD1 during Firewire HD spin-up(?)

DS

Hi all,

My machine has a problem that is driving me absolutely nuts: It
randomly reboots an average of once per day (51 times in the last two
months, almost always when I'm working on it of course; it's all I can
do these days not to bash it into tiny pieces, frankly), with the
following error saved in the event log every time (as in, same
parameters and everything):

The computer has rebooted from a bugcheck. The bugcheck was:
0x000000d1 (0x00000010, 0x00000002, 0x00000001, 0xeb05071d). Microsoft
Windows 2000 [v15.2195]. A dump was saved in:
C:\WINNT\Minidump\MiniXXXXXX-XX.dmp.

I am already aware of most of the usual topics related to STOP 0xD1
errors, as listed here for instance:

http://aumha.org/win5/kbestop.php

...and the way in which Windows reports parameters:

Parameter 1 - An address that was referenced improperly
Parameter 2 - An IRQL that was required to access the memory
Parameter 3 - The type of access, where 0 is a read operation and 1 is
a write operation
Parameter 4 - The address of the instruction that referenced memory in
parameter 1
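
For reference, the four values in the event log entry above can be decoded mechanically. This is only a minimal Python sketch, not part of any Microsoft tooling; the function name and the small IRQL name table are illustrative choices of mine (on x86, IRQL values 0 to 2 are PASSIVE_LEVEL, APC_LEVEL, and DISPATCH_LEVEL). Note that parameter 2 is an IRQL, a software priority level the kernel was running at, rather than a hardware IRQ line number.

```python
# Decode the parameters of a STOP 0xD1 (DRIVER_IRQL_NOT_LESS_OR_EQUAL) bugcheck.
# decode_stop_d1 and IRQL_NAMES are illustrative names, not a Windows API.

IRQL_NAMES = {0: "PASSIVE_LEVEL", 1: "APC_LEVEL", 2: "DISPATCH_LEVEL"}

def decode_stop_d1(p1, p2, p3, p4):
    """Return a human-readable view of the four bugcheck parameters."""
    return {
        "referenced_address": f"0x{p1:08x}",
        "irql": IRQL_NAMES.get(p2, f"IRQL {p2}"),
        "access": "write" if p3 & 1 else "read",
        "faulting_instruction": f"0x{p4:08x}",
    }

# Values copied from the event log entry quoted above.
info = decode_stop_d1(0x00000010, 0x00000002, 0x00000001, 0xEB05071D)
for key, value in info.items():
    print(f"{key}: {value}")
```

A referenced address as low as 0x10 usually suggests a driver dereferencing a null pointer plus a small offset, and the fourth parameter is the address a debugger would try to map to a specific driver from the minidump.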

This is a DRIVER_IRQL_NOT_LESS_OR_EQUAL error. According to the Device
Manager, the memory address accessed is owned by the System Board
device, and IRQ 2 is not owned by anything - It also appears this
occurred during a write operation. If memory serves, though, way back
when you had to care about IRQs, 2 and 9 were generally linked, and I
*do* have devices on 9:

Microsoft ACPI-Compliant System
SCSI/RAID Host Controller
VAXSCSI Controller

The first runs more or less everything on the motherboard (including
the Firewire hardware, we'll get to that), while to the best of my
knowledge the second and third are virtual device drivers, not actual
hardware (I have no SCSI hardware and my on-board SATA RAID controller
is disabled).

While I *do* overclock, my machine only began having these problems
when I installed a new Western Digital Firewire / USB2.0 drive (320GB
media center); in fact, if I turn off the Firewire drive, this stops
happening (though WD assures me it's a software problem - of course).

I run Win2K fully updated to SP4 plus all updates since. I have tried
closing all applications and killing all but the most basic processes,
and this still occurs. Generally, it seems to occur when the computer
has sat for a bit without too much disk access, and I try to access the
Firewire drive.

Sometimes I hear a drive spinning up and there's no problem; other times I
hear a funny noise like the drive has been reset or something
(something like a clicking noise) and the computer reboots. What's
also problematic is that I can't really tell whether the noise is
coming from the Firewire drive (which sits on top of my case, under my
desk) or one of the internal drives (also WD; one of them has a
tendency to go through start/stop cycles more often than normal and
also sounds like this. Initially I was worried, but it's run fine
otherwise for many years).

I have run Western Digital's diagnostic tests on all of my drives from
a boot CD, to no avail - Nothing appears to be wrong with them, and all
the SMART parameters are in order (well, except for the Firewire drive,
which unfortunately does not report SMART parameters).

The only remaining options I see, other than RMAing the drive (a huge
inconvenience, plus I risk that it isn't actually bad),
are to run in safe mode for some time and see if that helps and / or to
run the CPU at stock speed and see if that helps. Since this only
happens when the Firewire drive is on, though, I have trouble
convincing myself that the overclocking makes a difference; I really
think it might be driver related. The other piece of evidence for this
is another recurrent error (which never coincides with my reboots,
incidentally, and seems benign otherwise):

The description for Event ID ( 25 ) in Source ( sbp2port ) cannot be
found. The local computer may not have the necessary registry
information or message DLL files to display messages from a remote
computer. The following information is part of the event:

I reported this to WD as well, since this too only appeared after I
installed their Firewire drive, but they had no idea what it was, and
since it seemed benign they told me not to worry about it.

Any help would be most appreciated.

DS
 
Glen

What motherboard is this?

 
DS

Hi Glen,

Thanks for writing! Selected system specs are as follows:

* ABit IS-7 motherboard (ID string is
03/30/2005-i865PE-W83627-6A79AA1BC-24, newest BIOS rev. as of today)
* Intel Pentium 4 2.4C (Northwood core, HT / PSB800 support) @ 3.0 GHz
(12 x 250 FSB, AGP/PCI fixed @ 66/33)
* 4 x 256MB PC3200 Crucial DDR-SDRAM DIMMs (SPD timing, 5:4 FSB:Mem,
"Game Accelerator" BIOS setting off)

All voltages are default except CPU, which is at 1.550 instead of 1.525
(spec). Thermal spec for this CPU is 74°C; mine generally runs at
43-44°C and never tops 47°C, thanks to a massive copper heatsink
(Zalman) and multiple case fans. PS is an Antec 400W supply, not
generic, and all voltages are within 2% of spec. I am aware of the
problem this MB shows with respect to the chipset cooling fan sometimes
coming off or its thermal compound cracking and becoming ineffective,
and have previously reworked that thermal contact with Arctic Silver 5,
same as my CPU heatsink / fan. For reference, this setup is about
three years old, though two of the four DIMMs were added maybe a year
ago.

A bit of searching indicates that this MB has a TSB43AB23 1394a-2000
OHCI PHY/link-layer Controller as the on-board Firewire hardware; specs
from Texas Instruments are here:

http://focus.ti.com/docs/prod/folders/print/tsb43ab23.html

I can in any case confirm since my last posting that running at stock
speed does *not* solve the problem - I reset everything to CPU default
and still the damn thing rebooted during use - In fact while I was
typing my first reply.

That's about all I can think to add for now; if you need further
information, though, please don't hesitate to ask. Thanks again,

DS
 
Glen

Without being there to troubleshoot, I can only offer a few general
comments. Are any other devices sharing an IRQ with the Firewire
port? Also, if you go into Task Manager and list your active processes
it would help to narrow down the possible causes.

 
DS

Hello again,

The short answer (to the IRQ question) is yes. The Firewire device:

Texas Instruments OHCI Compliant IEEE 1394 Host Controller

...is listed in the Device Manager as using IRQ 17, which my Realtek
AC'97 audio hardware also uses.

The thing is, by my understanding all IRQs above 15 are "virtual"
according to the ACPI standard, and are eventually redirected to IRQ 2.
This is consistent with the STOP code, which identifies IRQ 2 as the
offending IRQ, but any device with an IRQ above 15 could potentially
produce such a STOP code. This also implies that all devices with IRQs
above 15 are effectively "sharing" IRQs, though generally the APIC is
supposed to be up to the task...

Are you thinking that maybe assigning the Firewire controller to an IRQ
by itself might help? For it to really be "alone" though, I suppose
it'd have to be IRQ 15 or below... I've seen reports of problems with
older, not entirely ACPI-compliant hardware and stand-alone Firewire
boards not being happy in Win2K with ACPI-HAL but tolerating "Standard"
HAL, but the IS-7 is fully ACPI compliant.
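
To make the sharing question concrete, the IRQ assignments reported so far in this thread can be grouped to show which devices sit on the same line. This is a small Python sketch over hand-entered data; the device labels are paraphrased from Device Manager as quoted above, and `shared_irqs` is a hypothetical helper name.

```python
from collections import defaultdict

# (device, IRQ) pairs as reported by Device Manager earlier in this thread.
assignments = [
    ("Microsoft ACPI-Compliant System", 9),
    ("SCSI/RAID Host Controller", 9),
    ("VAXSCSI Controller", 9),
    ("Texas Instruments OHCI Compliant IEEE 1394 Host Controller", 17),
    ("Realtek AC'97 Audio", 17),
]

def shared_irqs(pairs):
    """Return {irq: [devices]} for every IRQ used by more than one device."""
    by_irq = defaultdict(list)
    for device, irq in pairs:
        by_irq[irq].append(device)
    return {irq: devs for irq, devs in by_irq.items() if len(devs) > 1}

for irq, devices in sorted(shared_irqs(assignments).items()):
    print(f"IRQ {irq}: " + ", ".join(devices))
```

Sharing by itself is normal for PCI devices, so a listing like this only narrows down which drivers are worth testing in isolation.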

As for your other request, my current process list (as I type this, in
order of increasing PID) is below. Note that even when I close all of
my "optional" background apps and kill every process / stop every
unnecessary service I can, I still experience this problem - Though
with that said I'm certainly open to suggestions.

System Idle Process
System
smss.exe
csrss.exe
winlogon.exe
services.exe
lsass.exe
svchost.exe
ccSetMgr.exe
ccEvtMgr.exe
spoolsv.exe
DefWatch.exe
dhcore.exe
svchost.exe
nvsvc32.exe
stisvc.exe
Rtvscan.exe
uphclean.exe
Explorer.EXE
UAService7.exe
vsmon.exe
MsPMSPSv.exe
svchost.exe
MsMpEng.exe
zlclient.exe
point32.exe
Acrotray.exe
KEMailLb.exe
ccApp.exe
jusched.exe
VPTray.exe
SOUNDMAN.EXE
TeaTimer.exe
MSASCui.exe
ctfmon.exe
NoAds.exe
iexplore.exe
taskmgr.exe
WinMgmt.exe
OUTLOOK.EXE

I just checked the ABit forums, and there are some threads along these
lines, but no one with exactly my system, and the culprits are all
over the place - Zone Alarm (I have it, but was not seeing the Event
Log errors associated with this issue), issues with on-board NIC (mine
works fine), underpowered PSU (that occurred to me, but Firewire drive
is externally powered; am I overloading my household circuits??),
unidentified conflict with other PCI devices (possible), etc., but
nothing definite, and nothing clearly associated with a STOP error like
the one I'm getting - Usually if it's power related the users reported
nothing whatsoever in their event logs.

Thanks,

DS
 
Glen

Just for a test, try uninstalling the Realtek audio drivers and disable
the onboard audio on the IS7. If the reboots stop, you'll need to
contact either Abit or Realtek for updated audio drivers.
 
DS

Hi Glen,

I did as you suggested earlier today - So far no problem, but I'd like
to let it run for a few days before I declare victory; this wouldn't be
the first time I'd thought I'd dealt with the issue, only to have it
come back after a short hiatus...

Something of interest, by the way - Regardless of what the Device
Manager says in Win2K, as you probably know most BIOSs report
(hardware) IRQ assignments on boot-up. These days that screen often
goes by so fast that you can't read it, but before I disabled the
on-board audio the Firewire controller was assigned to IRQ 9; now, it's
on IRQ 5 - At least, according to the boot screen. Win2K still claims
it's on IRQ 17 (must be an ACPI virtual assignment), but does show that
it's alone on that IRQ now.

With respect to my audio drivers, I'm generally pretty good about
keeping them up to date via the Realtek site, since they release their
own drivers for this particular audio chipset and are very good about
updating them frequently (ABit is definitely slower in this regard).
In this case I had v3.86, and the newest is v3.87 (which I DLed for
future use). With that in mind, if this *is* a driver issue I'm
guessing it'll come back with the newer drivers, since I doubt they
fixed this problem in such a minor revision.

Having said that, if it's really the case that the Firewire hardware
just needs its own interrupt, I can always go in and manually assign
them in the BIOS to ensure that this is the case; I shouldn't need to
do anything else, in that situation, to get things working properly
again, so I hope your guess is correct, since I have an obvious way to
address that problem... If it's something else, on the other hand,
well, back to square one... We'll see.

Best wishes,

DS
 
DS

Hi Glen,

As an update, this worked! Actually I can even leave the drivers
installed and just disable the audio hardware in the BIOS and this
problem disappears.

Since I wanted to get things working with the audio hardware enabled, I
then went into the BIOS to manually remap my IRQs, thinking that it was
just an IRQ sharing issue. What I discovered is that my MB will not
let me isolate the Firewire and sound hardware - They are automatically
assigned to the same PCI IRQ! Not very helpful.

I gave it some more thought though, and one thing I noticed was that
the various USB controllers were using a total of five interrupts(!) -
Four for the USB 1.0/1.1 hardware and one for the USB2.0 hardware. It
occurred to me that perhaps the issue was not really one of IRQ
*sharing* - This should work, after all, if the ACPI standards are to
be believed. Rather, perhaps it was an issue of having too *many*
shared PCI IRQs at once - Perhaps the APIC was not able to handle this.
Alternatively, the external drive connects via a USB2.0 connection as
well as an IEEE1394 connection, so perhaps simultaneous requests were
going out via the USB2.0 and Firewire interrupts and causing
problems...

In any case, what I did next was to disable "Assign IRQ for USB" - Now
I have sound, and not a single crash in several days!! The problem
appears to be solved, which is wonderful - So nice not to have to worry
about that again. With that in mind, to anyone who is reading, if your
computer randomly reboots and you don't know why, TRY THIS. I still
have no idea why it fixed the problem, but it did. As I've said so
many times before, computers are weird.

Glen, thanks for the advice that got me started on this track!

DS
 
DS

Well, I'm back. Turns out the problem isn't totally gone after all;
freeing up IRQs seems to have limited its severity somewhat, but it
still occurs from time to time and is still driving me nuts. So far,
the only situation where I have not yet observed a single crash is when
the sound hardware is disabled - Highly annoying. I'm going to contact
ABit and Realtek about this (my sound hardware is based on the Realtek
ALC650 chipset), but in the meantime, I would very much appreciate
additional suggestions.

Thanks,

DS
 
Glen

If the problem is a poorly written driver (a good bet in Realtek's
case), unfortunately no other suggestions will help.
 
DS

That's what I was afraid of - Realtek has not responded to my support
request as of yet, and ABit sent me the usual "Idiot's Guide to Random
Reboots", indicating that they didn't really read my posting very
carefully, since I can already rule out the usual causes (i.e.
overheating, bad memory, bad PS, etc.). What about generic /
third-party / open-source sound drivers, any chance there?

Thanks again,

DS
 
Glen

For Windows? Doubtful. A better solution is to just forget about
the onboard sound (leave it completely disabled) and get a PCI
audio card. For what they cost these days it's worth the expense.
 
DS

Good point, and not a moment too soon - The damn thing rebooted again
just now, and it's done it enough times already today that I'm about
ready to resort to physical violence. I'll have to grab one at Best
Buy and see if it helps; guess I can always return it for a refund if
not. What a POS this Realtek sound hardware is, my god...

DS
 
Gary Chanson

I think you have something wrong with your hardware, possibly bad memory.
A-Bit motherboards have been my choice for both my own use and my customers
for a number of years and I've never seen a problem such as yours. The
Realtek drivers might not be perfect but they generally work reliably.

One thing worth trying is installing the latest BIOS, even if it's the
same version already installed. Flash memory has been known to get corrupted.
Also, make sure you have installed the latest chipset drivers.

If you get to the point where you are running the latest version of the
BIOS and all of the drivers for the motherboard devices, tested memory, tried
a different Video card, tried removing any other cards, etc., and still have a
problem, send the motherboard back as defective.
 
DS

Hi Gary,

Thanks for writing. My memory passes all tests in MemTest86+ 1.65
without a single error; between that and the fact that the problem
occurs in the absence of any add-in cards other than my AGP video board
but goes away when the Firewire drive is powered down, I'm confident
that memory and add-in cards are not the issue. I will definitely give
the BIOS flashing option a try, though.

I generally keep all drivers completely up to date, but in this case I
must admit I have not installed the newest chipset drivers - The last
update I applied was maybe 6 months ago. With such a mature chipset as
my Intel i865, any updates Intel makes to their unified Intel Chipset
drivers package are highly unlikely to affect anything for my hardware,
and unlike the VIA 4-in-1 package the Intel "drivers" are mainly just
INF files for Windows, but in any case I have grabbed the file and will
give this a try as well.

We agree, ABit hardware is generally very good - I've built a number of
machines based on their MBs and have always been happy with the
quality. Realtek I do not know as much about, except that they update
the drivers for the ALC650 every few weeks it seems like, for better or
for worse. I have had software / driver issues related to this chipset
in the past, but nothing this severe.

In the meantime I have purchased a new sound card, and discovered, much
to my dismay, that the crashes still occur even when the on-board sound
hardware is disabled. The new sound card is much better than what I
had before, and not too expensive either (it's an SB Audigy SE), so I'm
glad I bought it in any case, but it's disappointing with respect to
this problem. For the on-board sound I must not have waited long
enough when testing this before. In any case I can now confirm that
the only time I have never seen this happen to date is when the
Firewire drive is off.

More recently I have been pursuing trying to find a way to manually
control my drives and induce stop-start cycles and other operations
just to see if I can reliably determine what exactly causes this, since
at this point I'm having trouble seeing what software issues could
cause this - The Firewire drive requires no software other than what is
already built in to Win2K SP4, after all, and this does seem to happen
most often when I hear drives spinning up.

I am tempted to revisit the power issue; having done additional
research, it has become increasingly clear that the overall wattage of
a PS, while certainly relevant, is not the whole story, since not all
rails have access to 100% of that wattage, and that age is definitely a
factor as well. I'm wondering if perhaps one rail, the one the
Firewire drive pulls power from, is getting overloaded *sometimes* when
multiple drives spin up at once, for instance.
From my research on what my components *should* draw, I'm clearly
in-spec with respect to the peak power draws for my hardware, with the
voltages confirming this (I would expect some to go low if they were
really being overloaded). Estimates of peak power draw were made using
this website:

http://www.extreme.outervision.com/psucalculator.jsp

...which indicates, with the most conservative settings I can think of,
that 350W should be enough. Of course the need for surge compensation
and the results of capacitor aging will increase this number, as they
note. With that said, this is a very good Antec PS (True Power 480W),
so significant surge compensation should not be necessary, and putting
in the recommended 20% for capacitor aging, I still only need 420W; I
do not top the 460W minimum it actually delivers unless I go to 15%
surge compensation in addition to this - See here for specs:

http://www.antec.com/specs/true480_spe.html
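
The headroom arithmetic above can be written out as a short Python sketch. One assumption to flag: it treats the aging and surge margins as compounding multiplicatively, which may not be how the calculator site combines them, and `required_watts` is just an illustrative helper name.

```python
# Worked version of the PSU headroom estimate described above.
# Assumption: aging and surge margins multiply; the calculator may differ.

def required_watts(base_w, aging=0.0, surge=0.0):
    """Scale a base power estimate by aging and surge safety margins."""
    return base_w * (1 + aging) * (1 + surge)

BASE_W = 350        # calculator estimate with conservative settings
DELIVERED_W = 460   # minimum output quoted above for this PSU

aging_only = required_watts(BASE_W, aging=0.20)
with_surge = required_watts(BASE_W, aging=0.20, surge=0.15)

print(f"20% aging margin: {aging_only:.0f} W")
print(f"20% aging + 15% surge: {with_surge:.0f} W")
print("exceeds PSU minimum" if with_surge > DELIVERED_W else "within PSU minimum")
```

Under that assumption the 20% aging figure comes to the 420W mentioned above, and adding 15% surge on top pushes the estimate past 460W.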

Still, anything is possible, so with that in mind I have grabbed
Speedfan, a freeware utility that will monitor and log voltages, in the
hopes of detecting changes in voltages prior to a crash. Everything
seems fine when all of my hardware is powered up, though, and since it
updates only once per second, a short transient could cause the problem
without my ever seeing it - So far nothing, we shall see.
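
Once voltage readings have been logged, scanning them for excursions is simple. The Python sketch below is only an illustration: the readings are made-up placeholders rather than measurements from this machine, the 5% tolerance is an assumed threshold, and the tuple layout is not Speedfan's actual log format.

```python
# Flag logged rail voltages that stray too far from nominal.
# The samples are hypothetical placeholders, not real measurements, and the
# 5% tolerance is an assumed threshold chosen for illustration.

NOMINAL = {"+12V": 12.0, "+5V": 5.0, "+3.3V": 3.3}

def out_of_tolerance(samples, tolerance=0.05):
    """Return (time_s, rail, volts) tuples deviating more than `tolerance` from nominal."""
    flagged = []
    for time_s, rail, volts in samples:
        nominal = NOMINAL[rail]
        if abs(volts - nominal) / nominal > tolerance:
            flagged.append((time_s, rail, volts))
    return flagged

samples = [
    (0, "+12V", 12.10),
    (1, "+12V", 11.95),
    (2, "+12V", 11.20),  # hypothetical dip during a drive spin-up
    (2, "+5V", 5.02),
]

for time_s, rail, volts in out_of_tolerance(samples):
    print(f"t={time_s}s {rail} read {volts:.2f} V")
```

With one sample per second, a dip lasting only milliseconds would most likely never appear in a log like this, which is the limitation described above.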

In any case, if the PSU is the issue, replacing the MB will not help.
If I can find and borrow a good PSU for a week, I'll give that a try
and see if that clears things up - I just wish there were a (simple)
way to test the actual power draw / power output of the PSU I have. I
really hope the PSU manufacturers start putting in monitoring chips
that report this sort of information, as it would be invaluable to
tracking down these sorts of problems.

Suggestions are welcome, as always. Thanks in advance,

DS
 
DS

As an update, I have done everything you suggested except swapping out
the video board, and for the last few days the problem seemed to have
disappeared as a result! I was of course very excited at the
possibility of a real solution, even if I didn't really understand why
it worked.

I can only assume that this feeling of hope is exactly what my computer
was waiting for, since it crashed today, in the same fashion as before.
While my belief in resistentialism has been greatly strengthened by
these experiences and, minus the color change, I am beginning to
identify more and more with Bruce Banner when angry, unfortunately I am
no closer to a solution. Since my MB is no longer under warranty, and
between the two of us we suspect the power supply and video card,
respectively (or at least want to replace them to see if the problem
stops), at that point I could almost build a new PC.

I did get one good suggestion recently from folks on the A-Bit forums,
and that was to buy an add-on Firewire card, disable the on-board one,
and see if that helps. At this point that's a bit cheaper and easier
to accomplish than swapping the PS, MB, or video card, plus I could get
Firewire 800 support this way, meaning it would give me some added
functionality even if it turned out not to fix this problem.

With that in mind, I'm wondering what folks think of this idea, and if
anyone has other things to try (preferably simple and cheap) as well.
I'm considering this card in particular:

http://www.newegg.com/Product/Product.asp?Item=N82E16815150029

I know SIIG, I like the lifetime warranty, and this has Firewire 400
and 800 support, not to mention a 4-pin power connector that might
address power issues depending on how the MB-integrated Firewire
hardware is wired.

Comments welcome.

DS
 
DS

Another update:

The new SIIG Firewire board didn't solve the problem, which I must
admit is disappointing; however, in the past week I believe I have
nailed the problem.

First, I did a fresh Windows 2000 SP4 install on a different HD; the
problem remained, indicating that this is *not* just software, as WD
has always argued, unless their drives are totally incompatible with
Win2K anyway.

Then I thought about something I should've checked before; I can't
believe WD didn't point this out, but for some reason (I thought I had
disabled this), I found in my power saving settings that my HDs were
told to shut down after 30 minutes. I set them to never shut down, and
the problem is now gone.
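
For drives that spin down on their own timer regardless of the Windows setting, a different workaround (not the fix used here) is to generate a little disk activity at regular intervals. The Python sketch below shows the idea; the function names are hypothetical and the file path is a placeholder for a file on the external drive.

```python
import os
import time

def touch_keepalive(path):
    """Write a timestamp to `path` and force it to disk so the drive sees real I/O."""
    with open(path, "w") as handle:
        handle.write(f"{time.time()}\n")
        handle.flush()
        os.fsync(handle.fileno())

def keep_awake(path, interval_s, iterations):
    """Touch `path` `iterations` times, sleeping `interval_s` seconds between writes."""
    for i in range(iterations):
        touch_keepalive(path)
        if i < iterations - 1:
            time.sleep(interval_s)
```

For example, `keep_awake(r"F:\keepalive.txt", 600, 144)` would write every ten minutes for roughly a day; the drive letter is a placeholder, and the interval just needs to be shorter than the spin-down timeout.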

It's been four days since the last reboot, very atypical given the
frequency of this problem. I was reluctant to call it too soon and then
watch it reboot again, but it really does look like it's fixed.

The fact that I cannot allow this drive to power save without my PC
rebooting is totally ridiculous, I might add, but it certainly doesn't
appear to have anything to do with ABit - I just wanted to get this on
the record here so that if anyone else has these problems they can try
some other things.

In the meantime, I have (of course) encountered some different
problems, however. To make a long story short, it looks like the drive
may be failing - I'm getting a lot of Event Log errors during paging
operations / delayed write failures for this thing (as well as Event ID
9 errors, which I understand indicate a firmware update may be needed,
and sometimes Event ID 25 as well), and occasional freezes / slow
performance when accessing certain files on the drive.

I'd originally thought this might be due to the 1394 bridge chipset
used on the drive (a WDXF3200JBRNN), having read this:

http://www.bustrace.com/delayedwrite/

Unfortunately, while there are firmware updates available for some 1394
bridges (notably Oxford Semiconductor products), finding out which one
my drive uses has proven impossible, and in any case running the 1394Test
utility available via the above link doesn't indicate any errors, nor
does this registry hack I found:

http://www.enigmativity.com/blog/CommentView,guid,2db26076-8369-47d6-99d9-469eb0fa3788.aspx

Then I got another bright idea: Disconnect the Firewire cable and run
on USB 2.0 only. Something else I should've thought of a long time ago.
When I do that, the event log errors become much more understandable -
They all simplify to bad block errors, nothing else.

With that in mind, it looks like I have a number of problems coming
together to cause this; I'm running the drive diagnostics right now
("only" takes four hours for an extended test!), and may have to RMA
this thing. I still can't understand how it is that they can sell a
product that is such a POS that it fails following power saving, and
then offer such useless support once this happens - From my searches
there are *many* people with Firewire drives having similar issues -
But whatever, I will take that up with them.

Thanks to all for your help - Let's hope this is the end of the issue.

DS
 
